A Wrong Name: Fifteen Years of TransitiveClosure in MMTk
by Kunshan Wang
The TransitiveClosure
interface in MMTk is confusing. It should have been
split into two different interfaces, but not… until now. What’s more
interesting is how we ended up having an interface like that 15 years ago, and
why it stayed that way since then.
MMTk?
I have been contributing to the Memory Management Toolkit (MMTk) project since I left Huawei. MMTk is a framework for garbage collection. It was part of the JikesRVM, and has been a good platform for GC research. Many state-of-the-art garbage collection algorithms have been developed on it. Now we are re-implementing MMTk in Rust so that it can be integrated into many different languages and VMs, such as OpenJDK, V8, Julia, GHC, PyPy and, of course Ruby which I am working on.
As I started working on MMTk, one part of the code, that is, the
TransitiveClosure
interface/trait has always confused me.
TransitiveClosure?
In the core MMTk repository, the mmtk-core, you can find the
TransitiveClosure trait and its implementation for all
ProcessEdgesWork
instances. (Let’s not worry about what ProcessEdgesWork
does for now.)
pub trait TransitiveClosure {
fn process_edge(&mut self, slot: Address);
fn process_node(&mut self, object: ObjectReference);
}
impl<T: ProcessEdgesWork> TransitiveClosure for T {
fn process_edge(&mut self, _slot: Address) {
unreachable!();
}
#[inline]
fn process_node(&mut self, object: ObjectReference) {
ProcessEdgesWork::process_node(self, object);
}
}
The presence of unreachable!();
startled me. The code implemented one method
(process_node
), but declared the other method (process_edge
) unreachable.
This is not how we usually use traits. When we define a trait with two methods,
we expect both methods to be callable on all instances. Otherwise, why do we
even have the process_edge
method in TransitiveClosure
in the first place?
I guess some types must have implemented the TransitiveClosure
trait and
provided a proper process_edge
implementation. And… I am right. It is
ObjectsClosure.
impl<'a, E: ProcessEdgesWork> TransitiveClosure for ObjectsClosure<'a, E> {
#[inline(always)]
fn process_edge(&mut self, slot: Address) {
if self.buffer.is_empty() {
self.buffer.reserve(E::CAPACITY);
}
self.buffer.push(slot);
// ... more code omitted.
}
fn process_node(&mut self, _object: ObjectReference) {
unreachable!()
}
}
unreachable!()
again? What the …!
Clearly ProcessEdgesWork
and ObjectsClosure
are implementing two different
interfaces.
ProcessEdgesWork
only implementsprocess_node
, whileObjectsClosure
only implementsprocess_edge
.
Should we split TransitiveClosure?
Maybe we should split it into two different traits, one contains
process_node
and the other contains process_edge
. In this way, a type may
only implement the trait it needs, and not the unreachable!()
stub.
Or, should we?
To confirm this, let’s find their call sites, and see whether we ever use both methods at the same time. The short answer is, no.
The process_node method
process_node
is only called by XxxSpace::trace_object
, where XxxSpace
is a
concrete space. It can be CopySpace, MallocSpace,
LargeObjectSpace and so on.
ImmixSpace
even has two flavours of
trace_object
, both call process_node
.
The trace_object
method of a space visits an object during tracing. It marks
or copies the object and, if it is the first time it visits an object, it
enqueues the object into the marking queue by calling the process_node
method which does the actual enqueuing.
The process_edge method
process_edge
is only called by VM::Scanning::scan_object
.
scan_object
is implemented by a VM binding (such as the OpenJDK
binding) because it is VM-specific. MMTk calls scan_object
when it needs the VM to locate all reference fields in an object.
pub trait Scanning<VM: VMBinding> {
fn scan_object<T: TransitiveClosure>(
trace: &mut T,
object: ObjectReference,
tls: VMWorkerThread,
);
// ... more code omitted
}
The trait
parameter is a call-back. When scan_object(trace, object, tls)
is
called, it scans object
, and calls trace.process_edge
on each edge, i.e.
each reference field, of object
.
Yes! They are different!
We confirmed that each of the two methods is used in a different scenario.
process_node
is only used bytrace_object
, andprocess_edge
is only used byscan_object
.
And nothing calls both process_node
and process_edge
at the same time.
So let’s split them into two traits.
But my colleague Yi Lin reminded me that there were other classes that extends
TransitiveClosure
in the original MMTk in JikesRVM. To be safe, I looked into
the original JikesRVM MMTk before making further decisions.
Back in JikesRVM
Now we temporarily move away from Rust MMTk, and go back to JikesRVM MMTk. If
the TransitiveClosure
trait is designed like this in Rust, it must have its
roots back in JikesRVM.
TransitiveClosure in JikesRVM MMTk
In JikesRVM, there is a class named TransitiveClosure. Yes. It is a class, not an interface.
@Uninterruptible
public abstract class TransitiveClosure {
// Other methods omitted...
public void processEdge(ObjectReference source, Address slot) {
VM.assertions.fail("processEdge not implemented.");
}
public void processNode(ObjectReference object) {
VM.assertions.fail("processNode not implemented.");
}
}
Defining TransitiveClosure
as a class allows it to provide failing default
implementations of processEdge
and processNode
. This allows its subclasses
to override one method while leaving the other failing.
(Note that the JikesRVM code was written before Java 8, so interface methods could not have default implementations.)
TransitiveClosure subclasses in JikesRVM MMTk
In JikesRVM, many classes inherit from TransitiveClosure
.
- TransitiveClosure
- TraceLocal
- NoGCTraceLocal
- MSTraceLocal
- SSTraceLocal
- MCMarkTraceLocal
- MCForwardTraceLocal
- ImmixTraceLocal
- ImmixDefragTraceLocal
- …
- RCZero
- RCModifiedProcessor
- GenRCModifiedProcessor
- ObjectReferenceBuffer
- RCDecBuffer
- TraceWriteBuffer
- TraceLocal
TraceLocal
Back in JikesRVM MMTk, there was no concept of “work packet”. The abstraction
of tracing is the Trace
class and its thread-local counterpart, TraceLocal
.
ThreadLocal
was the local context of a GC thread during tracing GC.
public abstract class TraceLocal extends TransitiveClosure {
@Override
@Inline
public final void processNode(ObjectReference object) {
values.push(object);
}
@Override
@Inline
public final void processEdge(ObjectReference source, Address slot) {
ObjectReference object = VM.activePlan.global().loadObjectReference(slot);
ObjectReference newObject = traceObject(object, false);
if (overwriteReferenceDuringTrace()) {
VM.activePlan.global().storeObjectReference(slot, newObject);
}
}
// ... more code omitted
}
In JikesRVM, TraceLocal
is the counterpart of both ProcessEdgesWork
and
ObjectsClosure
of the current Rust MMTk.
Space.traceObject
callsTraceLocal.processNode
to enqueues newly visited objects,- just like
XxxSpace::trace_object
callingProcessEdgesWork.process_node
in Rust MMTk.
- just like
Scanning.scanObject
callsTraceLocal.processEdge
to process each reference field (edge),- just like
Scanning::scan_object
callingObjectsClosure.process_edge
in Rust MMTk.
- just like
In JikesRVM MMTk, each plan (GC algorithm) defines its own TraceLocal
subclass.
(Some GC algorithms. such as MarkCompact and Immix, even have more than
one TraceLocal
for different kinds of traces.)
It looks like the TransitiveClosure
is a proper interface for TraceLocal
.
It implements both processNode
and processEdge
.
However, TraceLocal
is the only class that implements both processNode
and
processEdge
. Other classes don’t.
processEdge: Field visitors
Some subclasses of TransitiveClosure
are related to reference counting (RC).
They are RCZero
, RCModifiedProcessor
, etc. They only override the
processEdge
method, assuming processNode
is never called.
What do they do?
They are field visitors. For example, RCZero
visits edges and stores
null
to each field.
public final class RCZero extends TransitiveClosure {
// Does not override processNode, leaving it failing
public void processEdge(ObjectReference source, Address slot) {
slot.store(ObjectReference.nullReference());
}
}
During RC collection, RCZero
is passed to Scanning.scanObject
as the
callback to set all reference fields of an object to null
.
public abstract class RCBaseCollector extends StopTheWorldCollector {
private final RCZero zero; // This implements TransitiveClosure
// ... more code omitted
@Override
public void collectionPhase(short phaseId, boolean primary) {
// ... more code omitted
if (phaseId == RCBase.PROCESS_DECBUFFER) {
// ... more code omitted
while (!(current = decBuffer.pop()).isNull()) {
// ... more code omitted
VM.scanning.scanObject(zero, current); // Passing zero as callback
}
}
}
}
Other RC-related classes are similar. They record the fields or apply decrements to the objects pointed by each field.
processNode: to enqueue objects
The TraceWriteBuffer
class only implements processNode
.
(In fact, TraceWriteBuffer
is the only class that overrides
processNode
but note processEdge
throughout the history of
JikesRVM.)
public final class TraceWriteBuffer extends TransitiveClosure {
private final WriteBuffer buffer;
// ... more code omitted
@Inline
public void processNode(ObjectReference object) {
buffer.insert(object.toAddress());
}
}
TraceWriteBuffer
is only used by CMSMutator
(concurrent mark-sweep mutator).
public class CMSMutator extends ConcurrentMutator {
private final TraceWriteBuffer remset;
// ... more code omitted
@Override
protected void checkAndEnqueueReference(ObjectReference ref) {
if (ref.isNull()) return;
if (barrierActive) {
if (!ref.isNull()) {
if (Space.isInSpace(CMS.MARK_SWEEP, ref)) CMS.msSpace.traceObject(remset, ref); // here
else if (Space.isInSpace(CMS.IMMORTAL, ref)) CMS.immortalSpace.traceObject(remset, ref); // here
else if (Space.isInSpace(CMS.LOS, ref)) CMS.loSpace.traceObject(remset, ref); // here
else if (Space.isInSpace(CMS.NON_MOVING, ref)) CMS.nonMovingSpace.traceObject(remset, ref); // here
else if (Space.isInSpace(CMS.SMALL_CODE, ref)) CMS.smallCodeSpace.traceObject(remset, ref); // here
else if (Space.isInSpace(CMS.LARGE_CODE, ref)) CMS.largeCodeSpace.traceObject(remset, ref); // here
}
}
// ... more code omitted
}
}
In simple terms, the code above means “Trace the object ref
in the space it is
in, and, if it is the first time the object is traced, enqueue it in remset
.”
What’s in common between TraceWriteBuffer
and TraceLocal
is that both of
them contain a buffer (a remember set, or the tracing queue) where the
traceObject
method can enqueue newly visited objects to. This is what
processNode
is for, i.e. enqueuing objects.
So why not introducing a dedicated interface?
There comes an interesting question:
If some classes are just field visitors, why don’t we have a dedicated interface for it, and name it
FieldVisitor
?
and
If some classes are just places to enqueue objects, why don’t we have a dedicated interface for it, and name it
ObjectBuffer
?
TransitiveClosure
has been there for 15 years. Many developers have made
contributions to MMTk, and some of them must have noticed the issues I talked
about. Why have TransitiveClosure
remained this way till today?
And why did we end up having this TransitiveClosure
amalgamation in the first
place?
Through the history
To answer these questions, I dug into the Git revision history of the JikesRVM repository.
The git blame
command can show me in which commit any line in any source file
is last modified. I use vim-fugitive, and it even allows me to follow a line
of code from one commit to another, and see every single change to a line of
code in history.
Early days of object scanning
The history of Scanning.scanObject
is the history of object scanning interface
and implementation.
Back in 2003, MMTk (was JMTk back then) and JikesRVM were
more tightly coupled than they are today. Unlike the modern Scanning
interface, the ScanObject
class back then contained concrete implementations
directly. The ScanObject.scan
method enumerates
reference fields, and directly calls the Plan.traceObjectLocation
static
method, which does the load/traceObject/store sequence like our modern
ProcessEdgesWork::process_edge
method. Everything was hard-wired. The
operation for visiting field was fixed.
A commit in 2003 introduced the
ScanObject.enumeratePointers
method which calls back to
Plan.enumeratePointerLocation
which can be customised. This allows a certain
degree of freedom of what to do with each field, instead of
load/traceObject/store.
Another commit in 2003 introduced the Enumerate
class
which was subsequently renamed to Enumerator
and made fully
abstract.
abstract public class Enumerator implements Uninterruptible {
abstract public void enumeratePointerLocation(Address location);
}
The ScanObject.enumeratePointers
method then used Enumerator
as the call
back instead calling into Plan
directly, allowing the behavior of visiting
each edge to be fully customised.
As I conjectured, some developers did notice that the call-back for
scanObjects
should be customisable, and Enumerator
was introduced just for
that.
In 2006, just before that important change
Both the Scanning.scanObject
and Scanning.enumeratePointers
existed in the
Scanning class before late 2006.
public abstract class Scanning implements Constants, Uninterruptible {
public abstract void scanObject(TraceLocal trace, ObjectReference object);
public abstract void enumeratePointers(ObjectReference object, Enumerator e);
// ... more code omitted
}
Both scanObject
and enumeratePointers
enumerate reference fields in an
object. However, they are used in totally different places.
Scanning.scanObject
The Scanning.scanObject
method was used for tracing, as it took TraceLocal
as parameter, and called TraceLocal.traceObjectLocation
for each reference
filed.
Note that at that time, TraceLocal was a root
class. There was no superclasses or interfaces
like TransitiveClosure
for it to extend/implement. (Constants
is an
all-static interface, and Uninterruptible
is just a marker.) This means
scanObject
was only applicable to subclasses of TraceLocal
, and nothing
else.
public abstract class TraceLocal implements Constants, Uninterruptible { // no superclass
public final void enqueue(ObjectReference object) throws InlinePragma { // traceObject calls this
values.push(object);
}
public final void traceObjectLocation(Address objLoc, boolean root) // scanObject calls this
throws InlinePragma {
ObjectReference object = objLoc.loadObjectReference();
ObjectReference newObject = traceObject(object, root);
objLoc.store(newObject);
}
// ... more code omitted
}
Scanning.enumeratePointers
On the other hand, the Scanning.enumeratePointers
could in theory be used by
any code that needs to enumerate reference fields. At that time, it was used
for (deferred) reference counting. The following was the “mark grey” operation
in trial-deletion for cycle collection in reference counting.
public final class TrialDeletion extends CycleDetector
implements Constants, Uninterruptible {
private TDGreyEnumerator greyEnum; // extends Enumerator
// ... more code omitted
private final boolean markGrey(ObjectReference object, long timeCap)
throws InlinePragma {
// ... more code omitted
while (!object.isNull()) {
// ... more code omitted
if (!abort && !RefCountSpace.isGrey(object)) {
RefCountSpace.makeGrey(object);
Scan.enumeratePointers(object, greyEnum); // pay attention to this call site
}
object = workQueue.pop();
}
return !abort;
}
// ... more code omitted
}
The call site of Scan.enumeratePointers
passed a TDGreyEnumerator
instance
which customised the behaviour of visiting fields. It just forward the call to
TrialDeletion.enumerateGrey
.
public class TDGreyEnumerator extends Enumerator implements Uninterruptible {
private TrialDeletion td;
// ... more code omitted
public void enumeratePointerLocation(Address objLoc) throws InlinePragma {
td.enumerateGrey(objLoc.loadObjectReference());
}
}
The obvious problem
We then had Scanning.scanObject
for tracing, and Scanning.enumeratePointers
for RC.
But why did we need both methods? scanObject
was basically a special case
of enumeratePointers
that called TraceObject.enumeratePointerLocation
.
Could we unify them? Apparently someone noticed that, and he did a refactoring.
Unifying scanObject and enumeratePointers with “TraceStep”
In late 2006, someone created a commit which introduced a new version of reference counting collector, and at the same time did “a huge refactoring” (see the commit message).
This commit created a class named TraceStep
. Scanning.scanObject
then took
TraceStep
as parameter instead of TraceLocal
. The enumeratePointers
method was removed.
public abstract class TraceStep implements Constants, Uninterruptible {
public abstract void traceObjectLocation(Address objLoc);
}
public abstract class Scanning implements Constants, Uninterruptible {
public abstract void scanObject(TraceStep trace, ObjectReference object);
}
Obviously, TraceStep
was intended to replaced Enumerator
as the call-back
interface for scanObject
to enumerating fields. Both tracing and RC started
using scanObject
from then on.
For tracing, the TraceLocal
started to extend TraceStep
.
public abstract class TraceLocal extends TraceStep // Now extends TraceStep
implements Constants, Uninterruptible {
public final void traceObjectLocation(Address objLoc) // called by scanObject
throws InlinePragma {
traceObjectLocation(objLoc, false);
}
public final void traceObjectLocation(Address objLoc, boolean root)
throws InlinePragma {
// ... just like before
}
}
RC operations, such as “mark grey”, started extending TraceStep
instead of
Enumerator
.
public final class TrialDeletionGreyStep extends TraceStep // Now extends TraceStep instead of Enumerator
implements Uninterruptible {
public void traceObjectLocation(Address objLoc) {
ObjectReference object = objLoc.loadObjectReference();
((TrialDeletionCollector)CDCollector.current()).enumerateGrey(object);
}
}
public final class TrialDeletionCollector extends CDCollector
implements Uninterruptible, Constants {
public TrialDeletionGreyStep greyStep; // A TraceStep instead of Enumerator
// ... more code omitted
private final void markGrey(ObjectReference object) throws InlinePragma {
// ... more code omitted
Scan.scanObject(greyStep, object); // now passes a TraceStep as arg
// ... more code omitted
}
}
This commit successfully unified the object scanning interface.
Scanning.scanObject
became the only object-scanning method. It became
applicable to all TraceStep
instances alike, and was no longer coupled with
TraceLocal
. Good! Nicely done!
More over, this commit cleverly found a concept that describes both
TraceLocal
and the various operations in reference counting, such as “mark
grey”, “scan black”, etc. It was “TraceStep”. Intuitively, all of them
were steps of tracing.
But really? We will soon find that it is not really that clever.
Here comes TransitiveClosure
In 2007, someone made another commit to “reorganise the core of transitive closure”, and the motivations were “concurrent collection” and “implementing prefetching during trace”.
In this commit, TraceStep
was renamed to our familiar TransitiveClosure
.
- The
TraceStep.traceObjectLocation
method was renamed toTransitiveClosure.processEdge
, and - a new method
processNode
was added.
@Uninterruptible
public abstract class TransitiveClosure {
/**
* Trace an edge during GC.
*
* @param objLoc The location containing the object reference.
*/
public void processEdge(Address objLoc) {
VM.assertions.fail("processEdge not implemented.");
}
/**
* Trace a node during GC.
*
* @param object The object to be processed.
*/
public void processNode(ObjectReference object) {
VM.assertions.fail("processNode not implemented.");
}
}
Note that I didn’t add any // ... more code omitted
comment because back in
2007, that was the entire class body of TransitiveClosure
.
TraceLocal
now extends TransitiveClosure
instead of TraceStep
, and
- the
traceObjectLocation
method was renamed toprocessEdge
, and - the
enqueue
method was renamed toprocessNode
.
public abstract class TraceLocal extends TransitiveClosure {
public final void processNode(ObjectReference object) {
// ... like before
}
public final void processEdge(ObjectReference source, Address slot) {
// ... like before
}
// ... more code omitted
}
A few days later, a subsequent commit introduced a
TraceWriteBuffer
class that extended TransitiveClosure
and overrode
processNode
alone, and not processEdge
.
public final class TraceWriteBuffer extends TransitiveClosure {
// ... more code omitted
@Inline
public void processNode(ObjectReference object) {
// .. as you have seen in previous sections.
}
}
It remains the only class that overrides processNode
but note processEdge
in the history of JikesRVM, even today.
With this change, TransitiveClosure
started to serve two distinct purposes.
- the callback for
scanObject
, and - the place to enqueue an object after tracing it.
If a class was only used in one of the two cases, it would override only one of the two methods.
10 years passed…
TransitiveClosure
remained this way in JikesRVM MMTk since then.
And the TransitiveClosure
class grew a little bit. Some static fields and
methods about specialised scanning were added to it, as if
it were a good place to hold information for specialised scanning.
…and there was Rust MMTk.
In 2017, we started porting MMTk to Rust.
Porting MMTk to Rust
The TransitiveClosure
was copied to the Rust version, except this time it
was represented as a Rust trait.
pub trait TransitiveClosure {
fn process_edge(&mut self, slot: Address);
fn process_node(&mut self, object: ObjectReference);
}
And TraceLocal
was a trait that requires TransitiveClosure
.
pub trait TraceLocal: TransitiveClosure {
// ... other methods
}
The Scanning.scan_object
method took TransitiveClosure
as parameter, just
like JikesRVM MMTk.
pub trait Scanning<VM: VMBinding> {
fn scan_object<T: TransitiveClosure>(
trace: &mut T,
object: ObjectReference,
tls: OpaquePointer,
);
// ... more code omitted
}
And the XxxSpace::trace_object
methods still took TransitiveClosure
as
parameter, like this:
impl<VM: VMBinding> CopySpace<VM> {
pub fn trace_object<T: TransitiveClosure>(
&self,
trace: &mut T,
object: ObjectReference,
allocator: Allocator,
tls: OpaquePointer,
) -> ObjectReference {
// ... more code omitted
}
And there was still TraceLocal. (Not any more now.)
pub trait TraceLocal: TransitiveClosure {
// ... omitted. There are many methods, but none is interesting here.
}
Like in JikesRVM MMTk,
- the
scan_object
method still calledtrace.process_edge
for each visited edge, and - the
trace_object
method still calledtrace.process_node
to enqueue the object on first visit.
Initially, we did not address the fact that TransitiveClosure served two different purposes. By that time, we had just begun porting MMTk to Rust. We prioritised making Rust MMTk working, and ported from JikesRVM MMTk in a style closely resembled the original Java code.
And MMTk worked. Not just worked, but worked for OpenJDK, JikesRVM and several other VMs, too.
Introducing work packets
We later removed TraceLocal
and introduced the work packet system.
The work packet system represents each unit of work as a “packet” that can be
scheduled on any GC worker thread. The TraceLocal
class was replaced by two
work packets:
- The
ProcessEdgesWork
work packet represents a list of edges to be traced. - The
ScanObjects
work packet represents a list of objects to be scanned.
A ProcessEdgesWork
work packet also carries an object queue inside it. It
needs the process_node
method so that XxxSpace.trace_object
can call it and
queue newly visited objects.
Therefore, ProcessEdgesWork
implements TransitiveClosure because
trace_object
expects it. And… remember? That implementation startled me…
impl<T: ProcessEdgesWork> TransitiveClosure for T {
fn process_edge(&mut self, _slot: Address) {
unreachable!();
}
#[inline]
fn process_node(&mut self, object: ObjectReference) {
ProcessEdgesWork::process_node(self, object);
}
}
Then what visits edges when scanning an object? It is now the ObjectsClosure
object. It needs to provide process_edge
, but Scanning::scan_object
expects
TransitiveClosure
.
So we have to do what the … we need to satisfy that requirement, as we have seen before:
impl<'a, E: ProcessEdgesWork> TransitiveClosure for ObjectsClosure<'a, E> {
#[inline(always)]
fn process_edge(&mut self, slot: Address) {
if self.buffer.is_empty() {
self.buffer.reserve(E::CAPACITY);
}
self.buffer.push(slot);
// ... more code omitted.
}
fn process_node(&mut self, _object: ObjectReference) {
unreachable!()
}
}
As you can see, even after we migrated to the work packet system, and even
though we no longer have any type that overrides both process_edge
and
process_node
, we still kept both process_edge
and process_node
in the
TransitiveClosure
trait.
Why do we end up having TransitiveClosure like this?
We have been startled by smelly code. We have expressed our anger, impatience, surprise, etc., without explicit vulgarity. We have looked into JikesRVM for the old MMTk. We have gone through the history to see the change of the object scanning interface, and read the commits from developers with the intention of improving MMTk.
But why do we end up having a TransitiveClosure
trait like this?
Have we noticed that object scanning is not necessarily part of tracing?
Yes. The Enumerator
interface was introduced just for
that. When called back from scan_object
, it allows us to do anything to
reference fields.
Have we refactored scan_object so it takes a simple callback instead of TraceLocal?
Yes. When TraceStep
was introduced, We unified
Scanning.scanObject
and Scanning.enumeratePointers
. Scanning.scanObject
was refactored to only depend on TraceStep
, and TraceStep
was an abstract
class with only an abstract method traceObjectLocation(Address objLoc)
.
public abstract class TraceStep implements Constants, Uninterruptible {
public abstract void traceObjectLocation(Address objLoc);
}
public abstract class Scanning implements Constants, Uninterruptible {
public abstract void scanObject(TraceStep trace, ObjectReference object);
}
This should have been the ideal interface for Scanning.scan_object
for MMTk in
Java. (Rust could use closure to make it more concise.)
But why did we migrate away from it?
Probably only the author of this commit knows the exact reason.
To my understanding, I think it was because TraceStep
was such a good, but
a wrong, name.
-
We named it “TraceStep”.
-
We made it the superclass of
TraceLocal
. -
We then introduced TraceWriteBuffer.
-
We noticed
TraceWriteBuffer
was another place to enqueue objects in addition toTraceLocal
. -
Then we naturally thought that both
TraceWriteBuffer
andTraceLocal
should have a common superclass that had a method namedenqueue
.But
TraceStep
doesn’t haveenqueue
. -
Then we extended
TraceStep
intoTransitiveClosure
, and addedenqueue
.And we even renamed
enqueue
toprocessNode
to make it consistent withprocessEdge
. -
Then we have
TransitiveClosure
which hadprocessEdge
andprocessNode
.
public abstract class TraceLocal extends TransitiveClosure {
public final void processNode(ObjectReference object) { ... }
}
public final class TraceWriteBuffer extends TransitiveClosure {
public void processNode(ObjectReference object) { ... }
}
Wow! “TransitiveClosure” was an even better name! That was what TraceLocal
really was, i.e. computing the transitive closure of an object graph! A
“transitive closure” is a graph! A graph has nodes and edges!
public abstract class TransitiveClosure {
public void processEdge(Address objLoc) { ... }
public void processNode(ObjectReference object) { ... }
}
But TraceWriteBuffer
doesn’t override processEdge
, and RCZero
doesn’t
override processNode
!
No worries. We leave them “unreachable”.
“Unreachable”? That doesn’t sound right.
But it worked… for 15 years.
The name “TransitiveClosure” made so much sense that we stuck to
TransitiveClosure
forever, even after we ported MMTk to Rust.
But TraceStep is a wrong name.
TraceLocal
, “mark grey”, “scan black”,RCZero
and so on are all steps in tracing,- and those trace steps process edges,
- hence
Scanning.scanObject
should acceptTraceStep
as a call-back argument, soScanning.scanObject
can giveTraceStep
edges to process.
Wrong.
Scanning.scanObject
accepts a callback argument not because the callback is a
step of tracing, but simply because the callback visits edges.
I don’t really know what counts as a “trace step”.
- Does “assigning
null
to each reference field” count as a “trace step”? - Does “applying decrement operations to the reference counts of all neighbor objects” count as a “trace step” even if it is used in reference counting, only?
- Is trial-deletion considered as a kind of tracing at all?
No matter what it is, isn’t it much easier to just say
“
Scanning.scanObject
accepts it as a callback argument because it visits edges.”
The nature of interfaces.
This is the nature of interfaces. A reuseable component should not make assumptions about its neighbours more than necessary. This is the separation of concern.
It is just like when we do the following in rust:
vec![1,2,3].iter().foreach(|e| {
println!("{}", e);
});
The Iterator::foreach
function accepts this closure not because it prints
things, but because it visits elements. foreach
is intended for visiting
elements. The closure receives the object. That is the contract of the
foreach
method. Whether it prints the element or how it prints the element is
not part of the contract.
So “Enumerator” was a right name. It correctly describes the role of the object
in a scanObject
invocation, that is, “it enumerates fields”, nothing more.
That’s all what Scanning.scanObject
need to care about. It passes each field
to the callback, and that’s it. It should not assume what that callback object
does to each edge. Whether it is a TraceLocal
or an RC operation is beyond
its obligation.
“TraceStep” was wrong. “TransitiveClosure” was also wrong. Neither of them
is what Scanning.scanObject
cares about.
Finding the way out
We have seen the history, and know why it ended up like this.
We know what was wrong, and what would be right.
“Enumerator” was right. “TraceStep” was wrong. “TransitiveClosure” was wrong, too, but it just sounded so good.
No matter how good it sounds, we need to fix it.
From our analysis in the beginning of this article, we should split
TransitiveClosure
into two traits,
- one as the callback of
Scanning::scan_object
, and - the other to be used by
XxxSpace::trace_object
to enqueue object.
I have opened an issue, and detailed the steps of
splitting and removing TransitiveClosure
from mmtk-core
.
Refactoring Scanning::scan_object
Scanning::scan_object
takes an object and a callback as parameters. It will
find all reference fields in the object, and invoke the callback for each
reference field.
We need a proper name for the callback of Scanning::scan_object
.
I name it EdgeVisitor
, and its only method is, as you can imagine,
visit_edge
.
pub trait EdgeVisitor {
fn visit_edge(&mut self, edge: Address);
}
And it replaces TransitiveClosure
as the parameter type of
Scanning::scan_object
:
pub trait Scanning<VM: VMBinding> {
fn scan_object<EV: EdgeVisitor>(
tls: VMWorkerThread,
object: ObjectReference,
edge_visitor: &mut EV,
);
}
Then the only callback that have ever been passed to Scanning::scan_object
,
i.e. ObjectsClosure
, now implements EdgeVisitor
, instead.
impl<'a, E: ProcessEdgesWork> EdgeVisitor for ObjectsClosure<'a, E> {
fn visit_edge(&mut self, slot: Address) {
// ... code omitted
}
}
And this change has been merged into the master branch of
mmtk-core
.
Meanwhile in Australia…
While I was refactoring mmtk-core
and working on mmtk-ruby
, my colleague
Wenyu Zhao was busy with his paper about the LXC GC algorithm targetting PLDI 2022.
Wenyu independently introduced the
EdgeIterator
struct. It is a wrapper over
Scanning::scan_object
and TransitiveClosure
, and the unreachable!()
statement, too! :P
pub struct EdgeIterator<'a, VM: VMBinding> {
f: Box<dyn FnMut(Address) + 'a>,
_p: PhantomData<VM>,
}
impl<'a, VM: VMBinding> EdgeIterator<'a, VM> {
pub fn iterate(o: ObjectReference, f: impl FnMut(Address) + 'a) {
let mut x = Self { f: box f, _p: PhantomData };
<VM::VMScanning as Scanning<VM>>::scan_object(&mut x, o, VMWorkerThread(VMThread::UNINITIALIZED));
}
}
impl<'a, VM: VMBinding> TransitiveClosure for EdgeIterator<'a, VM> {
#[inline(always)]
fn process_edge(&mut self, slot: Address) {
(self.f)(slot);
}
fn process_node(&mut self, _object: ObjectReference) {
unreachable!()
}
}
With this struct, it allows a closure to be used in the place of a
TransitiveClosure
. The following code applies the inc
RC operation to all
adjacent objects:
EdgeIterator::<E::VM>::iterate(src, |edge| {
self.inc(unsafe { edge.load() });
})
And the following applies dec
, and optionally frees the object:
EdgeIterator::<VM>::iterate(o, |edge| {
let t = unsafe { edge.load::<ObjectReference>() };
if !t.is_null() {
if Ok(1) == super::rc::dec(t) {
debug_assert!(super::rc::is_dead(t));
f(t);
}
}
});
Pretty neat, isn’t it? It is so neat that I want to steal the code and add an
EdgeVisitor::from_closure
factory method for my EdgeVisitor
.
One interesting thing is, Wenyu introduced this for reference counting,
according to the commit message. What a coincidence! Daniel
introduced TraceStep
in 2006 for exactly the same
reason: reference counting, according to its commit message, too.
Understandably, reference counting is very different from tracing. RC needs to
scan objects, but not for tracing, so passing TraceLocal
to scan_object
doesn’t make sense. Therefore, those additional operations, be it mark-grey,
scan-black or just freeing objects, all define their own callbacks to be called
by scan_object
. This necessitates the creation of a better interface for the
callback of scan_object
.
Refactoring Space::trace_object
The XxxSpace::trace_object
method usually has this form:
impl<VM: VMBinding> CopySpace<VM> {
#[inline]
pub fn trace_object<T: TransitiveClosure>(
&self,
trace: &mut T,
object: ObjectReference,
semantics: Option<CopySemantics>,
worker: &mut GCWorker<VM>,
) -> ObjectReference {
// ... more code omitted
if /* is first visited */ {
let new_object = object_forwarding::forward_object::<VM>(...);
trace.process_node(new_object); // enqueue object
new_object // return the forwarded obj ref
}
}
// ... more code omitted
}
If an object is first visited, it enqueues the object in trace
, and returns
the forwarded object reference.
This method is polymorphic w.r.t. T
. T
is the ProcessEdgesWork
(a
sub-trait of TransitiveClosure
) type used by the plan. We used to have one
different ProcessEdgesWork
type for each plan, which made this generic type
parameter necessary.
We already noticed that we may safely remove this
only use case of the TransitiveClosure
trait (i.e. trace_object
) once we
remove plan-specific ProcessEdgesWork
implementations. And the good thing is,
we have recently just removed all plan-specific
ProcessEdgesWork
implementations. Although we are not sure whether we will
have plan-specific ProcessEdgesWork
for complex GC algorithms in the future, I
think the code is much cleaner for a refactoring.
However, I believe the trace
parameter is
completely unnecessary. We just need a return value to indicate whether it is
the first time the object is visited, so that the caller of trace_object
(which is only ProcessEdgesWork::process_edge
at this time) can enqueue the
object. So instead of
fn process_edge(&mut self, slot: Address) {
let object = unsafe { slot.load::<ObjectReference>() };
let new_object = self.trace_object(object);
if Self::OVERWRITE_REFERENCE {
unsafe { slot.store(new_object) };
}
}
we shall have
fn process_edge(&mut self, slot: Address) {
let object = unsafe { slot.load::<ObjectReference>() };
let (new_object, first_visit) = self.trace_object(object);
if Self::OVERWRITE_REFERENCE {
unsafe { slot.store(new_object) };
}
if first_visit {
self.enqueue(new_object);
}
However, the #[inline]
above trace_object
indicates that it is very
performance-critical. We’d better measure before making the decision to change.
Epilogue
After fifteen years, the TransitiveClosure
trait is finally going to change.
The Rust MMTk is under active development now. As we proceed, we may see more things like this, things that have remained in its current state for years, or even decades. But this doesn’t mean they are always right. We have to rethink about the code again and again, and fix the problems whenever we can.
Update
We eventually removed TransitiveClosure
as well as the stale
TraceLocal
trait from mmtk-core
. However, unlike we discussed in previous
sections, we did not remove the trace
parameter from trace_object
. We think
“enqueuing objects” is still a responsibility of the trace_object
method, so
we should still enqueue objects in trace_object
instead of in process_edges
using the return value. Therefore, we simply renamed TransitiveClosure
to
ObjectQueue
, and renamed process_node
to enqueue
, so they indicate exactly
what they do. And now it is the actual object queue instead of
ProcessEdgesWork
that implements ObjectQueue
.
The signature of CopySpace::trace_object
now looks like this:
#[inline(always)]
pub fn trace_object<Q: ObjectQueue>(
&self,
queue: &mut Q,
object: ObjectReference,
semantics: Option<CopySemantics>,
worker: &mut GCWorker<VM>,
) -> ObjectReference {
Note that ObjectQueue
is still a trait. It does not have to be a physical
queue. If a GC algorithm does not hold objects in a queue during tracing, it
can implement this trait and process the “enqueued” object immediatey.
See also
The tracking issue: https://github.com/mmtk/mmtk-core/issues/559
tags: mmtk