(Many thanks are due to Andrew Gaylard and David Lewis for putting this together.)
- What is a 32-bit image?
- A 32-bit image is an image in which the object memory uses a 32-bit word
size for object pointers, limiting its total size to a maximum of 4GB
of memory. The formats of object memory and object pointers are defined in
class ObjectMemory (see the class comment for a basic explanation). As of this
writing, all Squeak images of practical interest are 32-bit images.
- What is a 64-bit image?
- A 64-bit image is an image in which the object memory uses a 64-bit word size
for object pointers, allowing the size of the image to grow beyond 4GB of
memory. Squeak now supports a 64-bit image format that is sufficient to
produce a working system, but which is intentionally simple. It is expected
to be modified and extended to take advantage of additional 64-bit
capabilities in the future.
- Can I run a 32-bit image on my 64-bit computer?
- Yes. A 32-bit image can be run on either a 32-bit VM or a 64-bit VM. Some
computer platforms (e.g. 64-bit Linux) can run both the 32-bit VM and 64-bit
VM on the same system.
- Can I run a 64-bit image on my 32-bit computer?
- Yes. If you build a VM with the "64-bit VM?" check box selected, you will
create a VM that runs 64-bit images. This will work on 32-bit host systems
as well as on 64-bit host systems.
- Can a single VM run both 32-bit and 64-bit images?
- No. For any given computer platform, two different VMs are required to run
32-bit and 64-bit images. The type of VM that you build is governed by the
"64-bit VM?" check box in VMMaker, and is independent of the word size of
your computer. While it would be possible to create a VM that is
"smart" enough to run both 32-bit and 64-bit images, this is currently of
little practical value due to Squeak's reliance on plugins that are linked to
32-bit or 64-bit external library code.
Any combination of 32/64-bit VM and 32/64-bit image is possible, but note
that all currently available Squeak images are still in 32-bit format, and
most (perhaps all) pre-built VMs are 32-bit applications.
- What is a 64-bit VM?
- A 64-bit VM is one which is compiled with the LP64 or ILP64 data model. This
means, in C terms, that pointers and longs are 64 bits wide.
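As a rough illustration of what a data model means in practice, the following Python sketch reports the C type sizes of the process it runs in. It describes the Python interpreter's own build, not a Squeak VM, and the function name data_model is made up for this example. The LLP64 entry (64-bit pointers but 32-bit longs) is not discussed above; it is included only because 64-bit Windows compilers use it.

```python
import ctypes

# (sizeof int, sizeof long, sizeof pointer) in bytes -> conventional name.
DATA_MODELS = {
    (4, 4, 4): "ILP32",
    (4, 8, 8): "LP64",
    (8, 8, 8): "ILP64",
    (4, 4, 8): "LLP64",
}

def data_model():
    """Classify the C data model of the running process.

    Returns a (name, sizes) pair, where sizes is the tuple
    (sizeof(int), sizeof(long), sizeof(void *)).
    """
    sizes = (
        ctypes.sizeof(ctypes.c_int),
        ctypes.sizeof(ctypes.c_long),
        ctypes.sizeof(ctypes.c_void_p),
    )
    return DATA_MODELS.get(sizes, "unknown"), sizes

if __name__ == "__main__":
    name, sizes = data_model()
    print("data model:", name, "int/long/pointer bytes:", sizes)
```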
- Can I run a 32- or 64-bit VM on my computer?
- It depends. Some current architectures, such as the x86-64 and the
UltraSPARC, can run 32-bit as well as 64-bit applications; these are known
as "bi-arch" systems. However, some systems, such as the Alpha, can only
run 64-bit applications. For bi-arch systems, you can choose whether to run
a 32-bit or 64-bit VM. For 64-bit-only systems, you don't have that choice;
you can only run a 64-bit VM, since there's no way of compiling a 32-bit
application.
- For which hardware/OS combinations is there a 64-bit VM?
- Linux on 64-bit architectures: x86-64, SPARC64, Alpha, Power64, etc.
- Solaris on x86-64 and SPARC64
- MacOS on Power64
- Windows on x86-64
- Does my 64-bit VM run both 32-bit and 64-bit images?
- No. Any VM will run either 32-bit or 64-bit images, but not both. You can
select one or the other when you generate sources with VMMaker, and you can
install both flavors of VM on your system (one each for 32-bit images and
64-bit images).
If you try to run a 64-bit image with a VM built for 32-bit images, you will get
an error message such as this:
This interpreter (vers. 6502) cannot read image file (vers. 68000).
If you try to run a 32-bit image using a VM built for 64-bit images, you will
get an error message such as this:
This interpreter (vers. 68000) cannot read image file (vers. 6502).
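The check behind these messages can be pictured roughly as follows. This is only a sketch of the two-version case described in this FAQ, not the VM's actual code; the function name check_image is hypothetical, and a real VM may accept more than one format version.

```python
def check_image(vm_version, image_version):
    """Return None if the versions match, else an error message.

    Models only the simple case in this FAQ: a VM built for one image
    format version (6502 or 68000) refuses an image of the other.
    """
    if vm_version == image_version:
        return None
    return ("This interpreter (vers. %d) cannot read image file (vers. %d)."
            % (vm_version, image_version))

if __name__ == "__main__":
    # A VM built for 32-bit images given a 64-bit image.
    print(check_image(6502, 68000))
```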
- I have a 64-bit computer; should I use a 64-bit VM to run my 32-bit Squeak images?
- It depends. Either one will work, but if your image depends on plugins that
are only available for 32-bit systems, use the 32-bit VM. Otherwise, if
you are building your own VM, go ahead and use the 64-bit version.
- What are the advantages of using 64-bits?
- The first advantage is that your image size can be enormous. If you need the
size of your VM code plus in-memory image to exceed 4 GB, then a 64-bit image
running on a 64-bit VM is for you. Note that it will take ages to write out
an image that's this big to disk. The sort of applications that need this
are those which load a small(ish) image, and run code that creates millions
of objects, but don't save them back to disk in the image. Keep in mind that
the garbage collector is probably not up to the task of collecting multiple
gigabytes.
Another advantage is that certain architectures (e.g. the Alpha) don't offer
a 32-bit mode; they are 64-bit only. For such machines, a 64-bit VM is
required; the image may be 32- or 64-bit.
Another advantage is that when the 64-bit VM is built, the C compiler knows
the ABI is different from the 32-bit ABI. The x86-64 case is an interesting
example: the old i386 ABI offered few registers, used i387 floating-point,
and passed parameters on the stack (remember, memory writes are slower than
register moves). The x86-64 ABI and architecture, on the other hand, have
many more registers, have SSE, SSE2, etc. for FP, and pass parameters in
registers where possible. They also offer additional instructions (MMX et al).
All of these CPU and ABI features may make for a VM that runs faster, but
only if (a) the compiler is able to make use of them and (b) is told to do so
at compile-time. However, the gains are unlikely to be much, and will also
be offset by the cost of large pointers (see below). If you're looking for
performance, it's important to measure a 32-bit VM with a 32-bit image versus
a 64-bit VM with a 64-bit image before assuming anything.
- What are the disadvantages to using 64-bits?
- A disadvantage to 64-bit code is that pointers are 8 bytes instead of 4;
they are also aligned on 8-byte boundaries, meaning that some space around
them, known as `padding', is wasted. This means that pointers (a) take more
space in RAM, (b) take more memory bandwidth when the CPU loads and
stores them, (c) take up valuable space in on-chip caches, and (d) have
greater wastage due to padding compared to 32-bit pointers aligned on 4-byte
boundaries. For most users, the upper 32 bits will be zero, so it makes
little sense to load, process and store pointers that are double the size
but only half-used. So for these users, a 32-bit VM and image is a good
choice.
Another disadvantage is that most users use the 32-bit VM and a 32-bit image.
This combination is therefore the most tested, and therefore most stable,
combination.
A third disadvantage is that code for many of the plugins is not yet ported
to a 64-bit VM.
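The pointer size and padding cost mentioned in the first point can be observed with a small Python sketch. The record layout here (a 32-bit tag followed by a pointer) is purely hypothetical and is not a Squeak object format; the numbers reported are those of the machine and Python build the sketch runs on.

```python
import ctypes

class Node(ctypes.Structure):
    # Hypothetical record: a 32-bit tag followed by one pointer.
    _fields_ = [("tag", ctypes.c_int32), ("next", ctypes.c_void_p)]

def padding_report():
    """Report how much of Node is payload and how much is padding."""
    pointer_size = ctypes.sizeof(ctypes.c_void_p)
    payload = ctypes.sizeof(ctypes.c_int32) + pointer_size
    total = ctypes.sizeof(Node)
    return {
        "pointer_size": pointer_size,
        "payload": payload,
        "total": total,
        "padding": total - payload,
    }

if __name__ == "__main__":
    for key, value in padding_report().items():
        print(key, value)
```

On a typical build with 8-byte pointers, the pointer must start on an 8-byte boundary, so some bytes after the 4-byte tag go unused; with 4-byte pointers the same record usually needs no padding at all.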
- My OS allows a single process to grow to 3.75GB; if I need a lot of objects can I use a 32-bit image / VM, or must I use a 64-bit one?
- There have in the past been problems related to the so-called "2-GB
limit". These were caused by conversions between signed 32-bit integers and
32-bit pointers in the VM code. The problems appeared when the operating system
loaded the image into memory at addresses above the 2GB mark, and could occur
with normal-sized images, not only images larger than 2GB. These issues
should be a thing of the past. Use the most recently-released VM for your
platform, and report any problems that you see. You should only *need* a
64-bit VM and 64-bit image if your image size will exceed 4GB.
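The class of bug behind the "2-GB limit" can be sketched as follows. This is an illustration of signed versus unsigned interpretation of a 32-bit address, not a reproduction of any actual VM code; the helper as_signed32 and the two example addresses are made up for the purpose.

```python
def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value - 0x100000000
    return value

# Two hypothetical addresses on either side of the 2 GB mark.
LOW_ADDRESS = 0x7FFFF000
HIGH_ADDRESS = 0x80001000

def ordering_is_preserved(a, b):
    """True if a < b gives the same answer before and after conversion."""
    return (a < b) == (as_signed32(a) < as_signed32(b))

if __name__ == "__main__":
    print(as_signed32(LOW_ADDRESS), as_signed32(HIGH_ADDRESS))
    print(ordering_is_preserved(LOW_ADDRESS, HIGH_ADDRESS))
```

Once an address above 2 GB is stored in a signed 32-bit integer it becomes negative, so comparisons such as "is this address below the end of memory?" can give the wrong answer even for a small image.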
- Where can I find a 64-bit image?
- These are scarce. The original 64-bit port project (from Ian and Dan)
includes a 64-bit image that worked with the VM distributed at that time. A
current VM cannot execute the original 64-bit image due to changes in the
interpreter since that time. It is possible to update that original image
using a modified VM, and the resulting image is executable using a current
unmodified VM. However, there are no official or supported releases of 64-bit
images at this time.
- Why isn't there an officially-released 64-bit image?
- Lack of interest: most people don't need a 64-bit image.
There may yet be some changes to the 64-bit image format to take advantage of
features of 64-bit CPUs. For instance, 63-bit tagged integers might be
possible.
- How can I make a 64-bit image from a 32-bit image?
- Use the SystemTracer (SystemTracerV2 on SqueakMap). The original 64-bit
Squeak image was created using this tool, and a sufficiently motivated
person should be able to reproduce the job. However, the SystemTracer does
not currently work on little-endian computers (including Intel), so some
work should be expected in order to enhance SystemTracer before a successful
conversion will be possible.
- How does a 32-bit VM manage to run a 64-bit image if pointers are 32-bits?
- The short answer: It relies on the image size being smaller than 4GB.
The long answer: Object pointers within the object memory are not pointers in
the C sense of the word. The VM needs to be able to convert the object
pointers into C pointers, and this can be done on either a 32-bit host or a
64-bit host. The only caveat is that if a 64-bit image grew to a size
large enough to use object pointers larger than the 32-bit limit (i.e. an
image approaching 4GB in size), then a 64-bit VM would be required.
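The caveat can be made concrete with a small sketch. It assumes, purely for illustration, that converting an object pointer to a host address amounts to checking that the value fits in the host's pointer width; the function name and parameters are hypothetical, and the real conversions live in the VM's C sources.

```python
def host_address_for_oop(oop, host_pointer_bits=32):
    """Convert a 64-bit object pointer value to a host address.

    Hypothetical model: the conversion succeeds only if the value
    fits in a C pointer of host_pointer_bits bits.
    """
    if oop < 0 or oop >= 1 << 64:
        raise ValueError("object pointers are 64-bit unsigned words")
    if oop >= 1 << host_pointer_bits:
        raise OverflowError(
            "object pointer %#x does not fit in a %d-bit host pointer"
            % (oop, host_pointer_bits))
    return oop

if __name__ == "__main__":
    # Fits comfortably in a 32-bit host pointer.
    print(hex(host_address_for_oop(0x00123456)))
    # The same call with a value of 4 GB or more needs a 64-bit host.
    print(hex(host_address_for_oop(1 << 32, host_pointer_bits=64)))
```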
- What sizes and alignment does the new 64-bit image format use for pointers and integers?
- Object pointers are 64 bits wide, allowing for memory up to 2^64 bytes to be
directly addressable. They are aligned on 8-byte boundaries. Integers are
still implemented as tagged 31-bit values, but are sign-extended to use the
full 64-bit object word size and therefore are also aligned on 8-byte
boundaries. These alignments were chosen because most 64-bit CPUs require
them. Future enhancements to 64-bit Squeak will probably make use of the
larger word size to increase the range of SmallInteger values, which will
require further changes to both the VM and the image in order to be effective.
The object header and pointer formats for both 32-bit and 64-bit images are
documented in the class comment of ObjectMemory (in the VMMaker package).
The conversions to and from host data types are done in
platforms/Cross/vm/sqMemoryAccess.h using either macros or inline functions.
The actual conversions vary from host to host, and are controlled by macros
such as SQ_HOST64 and SQ_IMAGE32 which must be set for that host. In the
case of a Unix VM, the configure utility is used to specify the
characteristics of the host platform.
The word size to be used in the object memory is specified by the
SQ_VI_BYTES_PER_WORD macro in src/vm/interp.h. This file is created in the
VMMaker code generation process, and the value of SQ_VI_BYTES_PER_WORD is
determined by the "64-bit VM?" check box in the VMMaker tool.
In summary, the object word format is described in class ObjectMemory, the
host data type conversions are specified in sqMemoryAccess.h, and the image
word size is specified in interp.h.
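The tagged 31-bit integers mentioned in this answer can be sketched as follows. The sketch assumes that a SmallInteger is marked by a low tag bit of 1 with the value in the bits above it; the class comment of ObjectMemory is the authoritative description, and the function names here are made up for illustration.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1

# Range of a 31-bit signed value.
SMALL_MIN = -(1 << 30)
SMALL_MAX = (1 << 30) - 1

def encode_small_integer(value):
    """Tag a 31-bit integer and sign-extend it into a 64-bit object word."""
    if not SMALL_MIN <= value <= SMALL_MAX:
        raise OverflowError("value does not fit in 31 bits")
    # Assumed layout: value shifted left by one, tag bit 1 in the low bit.
    return ((value << 1) | 1) & WORD_MASK

def is_small_integer(word):
    """True if the object word carries the (assumed) integer tag bit."""
    return word & 1 == 1

def decode_small_integer(word):
    """Recover the integer value from a tagged 64-bit object word."""
    if word & (1 << (WORD_BITS - 1)):
        word -= 1 << WORD_BITS  # reinterpret as a signed 64-bit value
    return word >> 1

if __name__ == "__main__":
    for n in (0, 1, -1, SMALL_MAX, SMALL_MIN):
        word = encode_small_integer(n)
        print(n, hex(word), decode_small_integer(word))
```

Negative values fill the upper 32 bits of the word with ones, which is the sign extension described above; the usable range is still only 31 bits.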
- How do I tell if a given image file is 32-bit or 64-bit?
- The image data begins with a four-byte "magic" value that indicates the
image word size. This is specified in Interpreter>>imageFormatVersion. For
most images, these are the first four bytes of the image file, although
in some cases the image data may be offset by 512 bytes in order to permit an
image file to be treated as an executable program on Unix platforms (see
http://en.wikipedia.org/wiki/Shebang_(Unix)).
For instance, if I load VMMaker-3.8b6 into a stock Squeak3.9a-7024.image
file, I see this:
imageFormatVersion
    "Return a magic constant that changes when the image format
    changes. Since the image reading code uses this to detect byte
    ordering, one must avoid version numbers that are invariant
    under byte reversal."
    BytesPerWord == 4
        ifTrue: [^6502]
        ifFalse: [^68000]
Examining the file itself gives this:
apg@breakfast: ~/squeak xxd Squeak3.9a-7024.image | head -1
0000000: 0000 1966 0000 0040 011c 7ee0 0427 b000  ...f...@..~..'..
Looking at the first four bytes gives this:
apg@breakfast: ~/squeak perl -e 'print 0x1966'
6502
(Or evaluate "16r1966" in a workspace; it also returns 6502.)
So Squeak3.9a-7024.image is a 32-bit image file (since BytesPerWord == 4).
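The manual inspection above can be automated with a short Python sketch. It relies only on what this FAQ states: the magic value is four bytes, 6502 means a 32-bit image and 68000 a 64-bit image, the value may appear in either byte order, and the image data may start at offset 512. The function names are hypothetical, and other format versions may exist that this sketch does not recognise.

```python
import struct

# Format versions quoted in this FAQ, mapped to image word size in bits.
WORD_SIZE_FOR_VERSION = {6502: 32, 68000: 64}

def image_word_size(data):
    """Inspect the leading bytes of an image file.

    Returns (word_size_bits, byte_order, offset), or None if no known
    magic value is found at offset 0 or offset 512.
    """
    for offset in (0, 512):
        chunk = data[offset:offset + 4]
        if len(chunk) < 4:
            continue
        # Try both byte orders, since the image may have been saved
        # on a machine with either endianness.
        for fmt, byte_order in ((">I", "big"), ("<I", "little")):
            (version,) = struct.unpack(fmt, chunk)
            if version in WORD_SIZE_FOR_VERSION:
                return WORD_SIZE_FOR_VERSION[version], byte_order, offset
    return None

def image_file_word_size(path):
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as f:
        return image_word_size(f.read(516))

if __name__ == "__main__":
    # The first eight bytes from the xxd output shown above.
    print(image_word_size(bytes.fromhex("0000196600000040")))
```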