This is only an article draft
Open Flash is a project to create an open-source media player for SWF files. Its goal is to keep Flash animations accessible for archival purposes. It aims to achieve it by learning from related projects and building an ecosystem of reusable components.
I see garbage collection as an extension of memory management offered by smart pointers.
The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and free it for reuse when no longer needed.
Memory management is a hard problem. Rust's borrow-checker, one of its main innovation selling point, was actually created to deal with memory management. The main challenge is to free memory only once it is no longer needed. It sounds easy but it is easy to make a mistake when doing it manually.
By default, Rust's compiler prevents memory errors. This is achieved by adding constraints to your program (such as
lifetimes) so Rust is able to reason about your code and prove that it is safe with regard to memory errors. The
borrow-checker and abstractions in the standard library let you create most programs, but it is sometimes not enough.
If you want manual control over memory, you can use Rust's
unsafe blocks but you are on your own then: the compiler
can no longer catch memory errors. Garbage collection is a memory-safe abstraction that lets you represent arbitrary
data. It is a form of automatic memory management. The downside is that it has a higher runtime cost.
The different abstractions to handle memory are a bit like sets of numbers. You can start with simple ones and build more general ones: integers, rationals, real numbers, complex numbers, quaternions, and so on. The more general you are, the more complex (no pun intended) your model is. Using a more general model comes with a cost (that's why people prefer to avoid garbage collection in performance-critical code).
Static values are the simplest one. They just exist for the whole duration of your program. These are usually constants or global variables. Obviously their main downside is that you can't create them at runtime.
Stack-allocated values are the next simplest values. These are the values in the local variables of your functions.
They are allocated when the function is called and freed when the function returns. They are also the most limited:
their size must be known at compilation (they must have the
Sized trait). Values that live entirely on the stack are
integers, pointers and structs or enums of stack-allocated values. This restriction is why you cannot have local
variables of type
str: the compiler cannot predict the size of the string slice. Instead, the type of your local
variable is a reference
&str. The reference is based on a pointer of known of known size (8 bytes on a 64-bits
You need heap memory to implement dynamically sized collections such as vectors. The values on the stack are
allocated for the lifetime of a function call. The heap is more general: you can allocate and free memory on the heap
whenever you want. Heap-allocated values are accessed through pointers (memory addresses). For example you can
allocate memory in one function, return the corresponding pointer and free its heap-memory in another function. In the
case of vectors,
Vec is small fixed-size structure with a pointer to heap-allocated data. When you push values to the
vector, it automatically ensures that its heap memory has enough space and stores the value there.
This situation is very common when dealing with dynamic data. The dynamic part is allocated on the heap and a small fixed-size struct represents it on the stack. When this struct acts as a simple pointer but automatically controls when its heap-memory should be freed, we call it a "smart pointer".
Rust's Box smart pointers have ownership of their value. They are implemented as raw pointers that free their memory when they are dropped. A Box pointer is the only one pointing to its heap-data so both the pointer and the data live as long. It lets you represent recursive tree structure.
Rc provides shared ownership of a heap value. There can be many Rc pointers for the same value. The value holds a counter of active references ("rc" stands for reference-counted) and is dropped once this number reaches zero. It lets you represent directed acyclic graphs
Weak pointers are an extension to reference-counted values. A
Weak pointer can point to a shared value (with other
Rc) but is not tracked with the counter keeping the value alive. The memory is reclaimed when the last
is dropped even if there are still some
Weak pointing to it. Memory safety is achieved by requiring you to manually
Weak pointer to an
Rc before accessing its value. You are also required to deal with possible failure if the
value was freed.
Weak pointers let you represent directed acyclic graphs with weak back-references.
With all these pointers, the only thing you can't represent without leaks or invalid pointers are arbitrary object
graphs: graphs with cycles. The role of
Gc is to support these kinds of object graphs. The memory management
becomes more complex: objects are freed once they are no longer reachable. An object is reachable if you can move from
pointer to pointer by starting at a value on the stack.