This package contains code for generating an Ovm virtual machine
from a set of configuration parameters. Parameters are processed by
the {@link ovm.core.stitcher.InvisibleStitcher InvisibleStitcher},
and are used to select implementations of Ovm's key components.
These components are created as part of the VM generator, but
they persist as part of the generated virtual machine. We preserve
these objects by writing them to a bootimage file that will
form a segment of the running VM's memory.
A virtual machine is generated in six phases:
- First, the VM's core classes must be
bootstrapped so that code can be
loaded.
- Next, VM and application code is loaded and
compiled by an interprocedural compiler.
- After all code has been compiled, we generate a number of
auxiliary files that will be used in
a subsequent run of {@code make}.
- At this point, the state of all objects in the VM-generator that
will persist in the generated VM must be
dumped to a bootimage file.
- Then, we optionally print a number of useful
statistics we've gathered over the
course of execution.
- Finally, after {@link s3.services.bootimage.Driver Driver} exits, the
{@code gen-ovm} script invokes {@code make} to create the final
executable.
Bootstrapping
When {@link s3.services.bootimage.Driver Driver} initially starts
up, most Ovm code simply will not run. Ovm's bootstrapping
procedure ensures that the basic types for {@link
ovm.core.services.memory.VM_Address arbitrary pointers} and
references to {@link ovm.core.domain.Oop object headers} are
useable, and that VM implementation code can reason about itself.
For example, suppose that you need the {@link
ovm.core.domain.Blueprint Blueprint} for the vm-internal {@code
int[]} type. The easiest way to accomplish this is to grab an
{@code int[]}, and read the {@link ovm.core.domain.Blueprint} from
its header, like so:
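{@code
// Wrap a fresh int[] in an address proxy and view it as an object header.
Oop oop = VM_Address.fromObject(new int[0]).asOop();
// The blueprint stored in that header describes the vm-internal int[] type.
intArrayBP = (Blueprint.Array) oop.getBlueprint();
}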
Ovm's bootstrapping procedure ensures that once {@link
ovm.core.stitcher.InvisibleStitcher#bootstrapComplete} is called and
we begin to load VM components, the example above will work.
Several key methods are involved in the bootstrapping process.
- {@link ovm.core.stitcher.InvisibleStitcher#init
InvisibleStitcher.init()} must be called before invisibly
stitched components can be used.
- {@link ovm.core.domain.DomainDirectory#getExecutiveDomain
DomainDirectory.getExecutiveDomain()} constructs a key
component: the namespace in which VM implementation classes
live.
- {@link ovm.core.services.memory.VM_Address#initialize
VM_Address.initialize()} must be called before operations on
addresses and object headers can be simulated in the host
virtual machine. However, calling {@code initialize()} is not
sufficient to enable proper {@code VM_Address} behavior.
- A {@link s3.services.bootimage.DomainSprout DomainSprout}
must be constructed to initialize the
{@link s3.core.domain.S3ExecutiveDomain ExecutiveDomain}. The
sprout first associates a {@link
s3.services.bootimage.BuildTimeLoader BuildTimeLoader} with the
domain, then calls {@link ovm.core.domain.Domain#bootstrap
bootstrap()} to load core classes. After the executive domain has
been bootstrapped, basic operations on {@link
ovm.core.services.memory.VM_Address} and {@link
ovm.core.domain.Oop} work as advertised.
We reach this point about halfway through {@link
s3.services.bootimage.Driver#bootstrap Driver.bootstrap()}. The
remainder of this method bootstraps the application domain, and
prepares to compile the VM.
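The ordering constraints among these steps can be pictured as a small state machine. The sketch below is illustrative only: {@code BootstrapStage} and {@code BootstrapTracker} are hypothetical names that do not exist in Ovm, and the sketch assumes the four steps run strictly in the order listed above.

```java
import java.util.EnumSet;

/** Hypothetical stages mirroring the bootstrapping steps described above. */
enum BootstrapStage {
    STITCHER_INIT,        // stands in for InvisibleStitcher.init()
    EXECUTIVE_DOMAIN,     // stands in for DomainDirectory.getExecutiveDomain()
    ADDRESS_INIT,         // stands in for VM_Address.initialize()
    DOMAIN_BOOTSTRAPPED   // stands in for DomainSprout construction and Domain.bootstrap()
}

/** Illustrative tracker that rejects out-of-order steps; not part of Ovm. */
final class BootstrapTracker {
    private final EnumSet<BootstrapStage> done = EnumSet.noneOf(BootstrapStage.class);

    /** Records a stage, insisting that every earlier stage has already run. */
    void complete(BootstrapStage stage) {
        for (BootstrapStage earlier : BootstrapStage.values()) {
            if (earlier.ordinal() < stage.ordinal() && !done.contains(earlier)) {
                throw new IllegalStateException(stage + " requires " + earlier);
            }
        }
        done.add(stage);
    }

    /** Address and header operations are only trusted after the final stage. */
    boolean addressesUsable() {
        return done.contains(BootstrapStage.DOMAIN_BOOTSTRAPPED);
    }
}
```

Note that {@code addressesUsable()} stays false after {@code ADDRESS_INIT} alone, which matches the observation that calling {@code initialize()} is necessary but not sufficient.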
Compilation
The generated virtual machine will contain all the code and objects
that implement the configured VM features. It will also contain
meta-objects such as {@link ovm.core.domain.Type Types} and
{@link ovm.core.domain.Method Methods}, and at least a minimal core
of the user-domain runtime library. As with VM components,
meta-objects are instantiated at VM-generation time, and their code
is statically compiled.
The user domain is treated somewhat differently. With the exception
of {@link ovm.core.domain.Type.Class#getSingleton "shared-states"}, there
will be no user-domain objects embedded in the VM
executable. However, some user-domain code must be pre-compiled to
make runtime class loading possible, and entire applications can be
statically compiled into the VM for performance reasons.
The core steps to generate a VM (which are implemented in
{@link s3.services.bootimage.DomainSprout#importCode
DomainSprout.importCode()}) are:
- Determine the set of live methods and objects using a
combination of {@link s3.services.bootimage.Analysis
static analysis} and
{@link s3.services.bootimage.GC garbage collection techniques}.
- Perform interprocedural optimizations such as
{@link s3.services.bootimage.Inliner inlining}.
- Hand off the intermediate representation of all
runtime code to the backend compiler (which extends {@link
s3.services.bootimage.ImageObserver ImageObserver}).
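To make the first step concrete, here is a minimal sketch of liveness as plain reachability over a call graph. This is a deliberately simplified stand-in, not the algorithm used by {@code Analysis}: the class {@code CallGraphLiveness}, the string method names, and the map-based call graph are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy liveness computation over a call graph keyed by method name. */
final class CallGraphLiveness {
    /** Returns every method reachable from the given roots. */
    static Set<String> liveMethods(Map<String, List<String>> callees, List<String> roots) {
        Set<String> live = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            String method = worklist.pop();
            if (!live.add(method)) {
                continue; // already processed
            }
            // Methods without an entry are treated as having no callees.
            List<String> next = callees.getOrDefault(method, Collections.<String>emptyList());
            for (String callee : next) {
                worklist.push(callee);
            }
        }
        return live;
    }
}
```

Anything not reached from a root is simply never added to the result, which is the sense in which dead methods can be left out of the generated VM.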
Our {@link s3.services.bootimage.GC} is structured in an unusual
way. It inherits much of its behavior from {@link s3.util.Walkabout
Walkabout}, which uses Java reflection to do a depth-first traversal
of arbitrary object graphs. {@link s3.util.Walkabout#start This
traversal} serves as the collector's mark phase. Each object's mark
bit is maintained in the {@link ovm.core.services.memory.VM_Address
VM_Address} proxy, and the heap that will be dumped to the bootimage
file is represented by the same map used to implement {@link
ovm.core.services.memory.VM_Address#fromObject(Object)
VM_Address.fromObject()}.
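The idea of a reflective, depth-first mark phase can be sketched in a few lines. The code below is not {@code Walkabout} or {@code GC}: {@code ReflectiveMarker} and {@code Node} are hypothetical, an identity-based set stands in for the mark bits kept in the {@code VM_Address} proxies, and the sketch assumes that classes in {@code java.*} packages can be treated as opaque leaves.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Minimal object-graph node used to exercise the marker. */
final class Node {
    Node next;
    Object[] extras;
}

/** Toy reflective marker; illustrates the idea only. */
final class ReflectiveMarker {
    // Identity-based set: equal-but-distinct objects are marked separately.
    private final Set<Object> marked =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    /** Depth-first traversal from root; returns the set of marked objects. */
    Set<Object> mark(Object root) {
        visit(root);
        return marked;
    }

    private void visit(Object o) {
        if (o == null || !marked.add(o)) {
            return; // nothing to do, or already marked (handles cycles)
        }
        Class<?> c = o.getClass();
        if (c.isArray()) {
            if (!c.getComponentType().isPrimitive()) {
                for (int i = 0; i < Array.getLength(o); i++) {
                    visit(Array.get(o, i));
                }
            }
            return;
        }
        // Assumption: stop at java.* classes instead of reflecting into the JDK.
        for (; c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                    continue; // only instance reference fields lead to other objects
                }
                f.setAccessible(true);
                try {
                    visit(f.get(o));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}
```

Because the marked set doubles as the record of everything reachable, it plays the same dual role the text describes for the map behind {@code VM_Address.fromObject()}: it is both the mark state and the contents of the heap to be dumped.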
Soundness Of Analysis
This section describes how the compile phase carefully works around
problems with the soundness of static analysis within the executive
domain. It describes the workaround, describes the problem more
generally, and explores how things change in a Just-In-Time compiled
setting.
We compile the user domain before the executive domain as a way to
work around some of the complexity that arises when a virtual
machine compiles itself. The inliner and compiler backend may
generate objects (such as {@link ovm.core.domain.InlinedAttribute
inlined method attributes} and {@link ovm.core.domain.Code compiled
code objects}) that are used, but never allocated, at runtime.
Because we've finished all compilation passes in the user domain
before we start analyzing the executive domain, we are able to
assume that examples of every type allocated within the compilation
process will be present in the bootimage once we begin static
analysis in the executive domain.
It is important to note that we cannot hope to encounter every
object that will finally appear in the bootimage while analyzing the
executive domain. This fact all but precludes the use of CFA or
points-to analysis to compile the executive domain. It also points
to a fundamental weakness in Ovm's static compilation procedure: the
compiler cannot explicitly reason about its own future actions.
Ovm's image dumping technique is quite popular. Very similar
techniques are used in IBM's JikesRVM and J9. Other variations on
the theme can be found in a wide variety of applications including
not only language processors, but applications like Emacs and TeX.
However, issues of soundness do not seem to arise in other
contexts.
The difference appears to be the purely ahead-of-time compilation
strategy used on most Ovm configurations. If the compiler were
part of the virtual machine, it certainly could reason about its own
future behavior. We would see the effects of later code in the
{@code gen-ovm} program simply because this code would also be
called at runtime. However, we systematically exclude both this
package and {@link s3.services.j2c} from static analysis.
Generating Auxiliary Files
The next major step is to dump the objects that will reside in the
bootimage to a file. However, before we proceed to that step, we
generate several auxiliary files that will be required by {@code
make}, and give the backend compiler a chance to invoke external
tools and tweak {@link ovm.core.services.memory.VM_Address
VM_Address} bindings. {@link
s3.services.bootimage.Driver#genAuxFiles Driver.genAuxFiles()} is
responsible for this small execution stage, and produces the
following:
- OVMMakefile
- Is similar to Ovm's configure-generated makefiles, and is
generated by a similar pattern substitution. It includes make
rules from the {@link s3.services.bootimage.ImageObserver
ImageObserver's} home directory, and is generated by {@link
s3.services.bootimage.ImageObserver#compilationComplete
compilationComplete()}.
- .gdbinit
- Is generated in the same way, and provides debugger commands.
- gen-ovm.c
- This file is a random dumping-ground for C code. Currently,
it is used by the
{@link s3.services.memory.mostlyCopying.Manager mostly-copying GC}
to record additional information about the bootimage layout.
- structs.h
- Declares C structs corresponding to executive domain types in
the generated VM. {@link s3.services.bootimage.StructGenerator}
creates this file.
- structs.hh
- Declares C++ structs corresponding to every Java type
in the generated VM. {@link
s3.services.bootimage.CxxStructGenerator} creates this file.
- img.ovmir.ascii
- When {@code -dumpovmirascci} is provided, the final
intermediate representation of every method is disassembled to
this file.
- native_calls.gen
- Is the body of a C switch statement that implements Ovm's
low level native interface. This file is created by
{@link s3.services.bytecode.ovmify.NativeCallGenerator
NativeCallGenerator} and is included by the backend compiler's C
code.
A backend compiler typically generates a slew of additional files
that are referenced by its make rules.
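The pattern substitution used for {@code OVMMakefile} and {@code .gdbinit} can be illustrated with a small template expander. This is a sketch under assumptions: it uses the {@code @NAME@} placeholder convention familiar from configure-generated files, and {@code TemplateExpander} and the variable names in the usage note are hypothetical rather than taken from Ovm.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy @NAME@ substitution in the style of configure-generated files. */
final class TemplateExpander {
    private static final Pattern PLACEHOLDER =
        Pattern.compile("@([A-Za-z_][A-Za-z0-9_]*)@");

    /** Replaces each @NAME@ with its value; unknown names are left untouched. */
    static String expand(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, expanding the template {@code "CC=@CC@"} with a map that binds the hypothetical name {@code CC} to {@code gcc} yields {@code "CC=gcc"}. Leaving unknown placeholders in place makes a missing binding easy to spot in the generated file.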
Bootimage Dumping
When compilation is finished, we can be reasonably sure that the
state of objects that will reside in the bootimage is stable, and
begin dumping these objects in the generated VM's binary
representation. As with garbage collection, the task of image
dumping is shared between {@link
ovm.core.services.memory.VM_Address VM_Address}, this package, and
{@link s3.util.Walkabout Walkabout}.
When the concrete address of an object is taken (with {@link
ovm.core.services.memory.VM_Address#asInt VM_Address.asInt()}),
{@code VM_Address} allocates space for the object in the bootimage's
bytebuffer, and writes a header whose format is determined by the
configuration's {@link ovm.core.domain.ObjectModel ObjectModel}.
In addition, {@code VM_Address} memory operations are simulated by
updating the bytebuffer. We exploit this fact when copying the
values of fields and array elements into the bootimage.
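The following sketch shows the general shape of this lazy allocation scheme. It is not Ovm's {@code VM_Address}: {@code ToyImage} is hypothetical, the header is assumed to be a single {@code int} type id (the real format depends on the configured {@code ObjectModel}), and the byte order and base address are arbitrary choices.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy bootimage: space is reserved the first time an object's address is taken. */
final class ToyImage {
    static final int HEADER_BYTES = 4; // assumed header: one int holding a type id

    private final ByteBuffer buffer;
    private final int baseAddress;
    private final Map<Object, Integer> offsets = new IdentityHashMap<>();
    private int nextFree = 0;

    ToyImage(int capacity, int baseAddress) {
        this.buffer = ByteBuffer.allocate(capacity).order(ByteOrder.LITTLE_ENDIAN);
        this.baseAddress = baseAddress;
    }

    /** Returns the object's image address, allocating and writing its header on first use. */
    int addressOf(Object o, int typeId, int payloadBytes) {
        Integer offset = offsets.get(o);
        if (offset == null) {
            int end = nextFree + HEADER_BYTES + payloadBytes;
            if (end > buffer.capacity()) {
                throw new IllegalStateException("image full");
            }
            offset = nextFree;
            nextFree = end;
            buffer.putInt(offset, typeId); // write the header in place
            offsets.put(o, offset);
        }
        return baseAddress + offset;
    }

    /** Simulated store: writes an int at a byte offset past the object's header. */
    void setInt(int address, int fieldOffset, int value) {
        buffer.putInt(address - baseAddress + HEADER_BYTES + fieldOffset, value);
    }

    /** Simulated load, the mirror image of setInt. */
    int getInt(int address, int fieldOffset) {
        return buffer.getInt(address - baseAddress + HEADER_BYTES + fieldOffset);
    }

    int headerOf(int address) {
        return buffer.getInt(address - baseAddress);
    }

    int bytesUsed() {
        return nextFree;
    }
}
```

Asking for the same object's address twice returns the same value, so an object occupies exactly one slot in the image no matter how many references to it are written.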
{@link s3.services.bootimage.ISerializer} is responsible for dumping
object state. Like {@code GC}, it extends {@code Walkabout}, but
unlike {@code GC}, {@code ISerializer} does not perform a
depth-first search. Instead, it iterates over the {@code GC}'s
final heap, and writes the state of every object using methods
defined in {@link ovm.core.domain.Field Field} and {@link
ovm.core.domain.Blueprint.Array Blueprint.Array}.
The behavior of both {@code GC} and {@code ISerializer} is
controlled by sets of {@link s3.util.Walkabout.Advice
Walkabout.Advice} objects. For the most part, these sets are
identical, but each defines a couple of core pieces of advice to
perform its core activity. Great care must be taken to ensure that
the behavior of these two classes remains in sync.
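One way to picture the relationship is two walkers built from a common advice list, each adding a single piece of core advice. The sketch below is hypothetical throughout: {@code ToyAdvice}, {@code ToyWalker} and {@code ToyAdviceDemo} do not correspond to the actual {@code Walkabout.Advice} API, and the rule that strings are skipped is invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical advice: decides what to do when a walker meets an object. */
interface ToyAdvice {
    /** Returns true if this advice handled the object, ending the lookup. */
    boolean apply(Object o, List<String> log);
}

/** Walker that consults shared advice first, then its own core advice. */
final class ToyWalker {
    private final List<ToyAdvice> advice;

    ToyWalker(List<ToyAdvice> shared, ToyAdvice core) {
        List<ToyAdvice> all = new ArrayList<>(shared);
        all.add(core);
        this.advice = Collections.unmodifiableList(all);
    }

    List<String> walk(List<Object> heap) {
        List<String> log = new ArrayList<>();
        for (Object o : heap) {
            for (ToyAdvice a : advice) {
                if (a.apply(o, log)) {
                    break; // first matching advice wins
                }
            }
        }
        return log;
    }
}

final class ToyAdviceDemo {
    /** Shared rule: both walkers skip strings, so they agree on what is in the heap. */
    static List<ToyAdvice> shared() {
        List<ToyAdvice> s = new ArrayList<>();
        s.add((o, log) -> o instanceof String);
        return s;
    }

    /** Core advice for the marking walker. */
    static ToyWalker marker() {
        return new ToyWalker(shared(), (o, log) -> log.add("mark " + o));
    }

    /** Core advice for the dumping walker. */
    static ToyWalker dumper() {
        return new ToyWalker(shared(), (o, log) -> log.add("dump " + o));
    }
}
```

Building both walkers from the same {@code shared()} list is what keeps them in agreement; if one walker skipped an object that the other processed, the dumped image would no longer match the marked heap.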
Printing Statistics
When {@code gen-ovm} is invoked with {@code -stats}, {@link
s3.services.bootimage.Driver Driver} outputs a slew of statistics
via {@link s3.services.bootimage.Driver#printStats Driver.printStats()}.
Beware of package docs: they often fall out of date. This one was
last updated as follows: {@code $Id: package.html,v 1.9 2007/06/25 20:27:20 baker29 Exp $}.