Acorn's binary format is used as an efficient way to transmit serialized data of any level of complexity. It is used in several scenarios:
- Binary-encoded Acorn program or data resources. Acorn can load programs and data from either human-readable character (.acn) files or binary-encoded (.avm) files. Both formats are platform-independent and mostly interchangeable. In general, use the character format for world-building and testing, then convert .acn files to .avm for deployment: the binary files are likely to be smaller and faster to load, so your worlds will start up more quickly. Binary encoding also makes it harder (but not impossible) for others to decipher your logic, which may be an advantage or a disadvantage depending on your needs.
- Saved configuration or state data. The format is used wherever the resource's file system allows data to be changed and preserved for later use, such as user configuration information (and keyboard maps), an avatar's current state, etc.
- Client-server synchronization. The format is used to propagate object properties from one machine to others.
This section describes how Acorn data is serialized in a binary encoding. It is optimized for rapid, sequential loading of the entire data structure into an existing virtual machine. The resulting single value from the loaded data is then used by whatever program requested and received it.
The Header
All .avm files begin with this header block.
- "AVMB" - Identifies the file as an Acorn Virtual Machine Binary file
- length (16-bit uint) - size of the header block (not including "AVMB")
- Version (16-bit uint): version of .avm standard used (starts with 0). Use the lowest number that applies to the encoding, for maximal compatibility.
- Flags (16-bit)
- 0x8000 -
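As a rough illustration of the header layout above, the following sketch reads the signature and the three 16-bit fields. The byte order is not stated in this section, so big-endian is an assumption here, and `parse_avm_header` is a hypothetical helper name rather than part of Acorn.

```python
import struct

def parse_avm_header(data: bytes) -> dict:
    """Read the fixed header fields. Big-endian byte order is an assumption."""
    if data[:4] != b"AVMB":
        raise ValueError("not an .avm file: missing 'AVMB' signature")
    # Three 16-bit unsigned fields follow the signature: length, version, flags.
    length, version, flags = struct.unpack_from(">HHH", data, 4)
    return {"length": length, "version": version, "flags": flags}

# Build a sample header in memory and parse it back.
sample = b"AVMB" + struct.pack(">HHH", 6, 0, 0x8000)
header = parse_avm_header(sample)
```

The sketch only reads the fields; it does not interpret the flag bits or decide whether `length` counts its own two bytes, since neither is specified here.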
Blocks
The rest of the file is a sequential collection of variable-sized blocks. Each block implicitly has a number (starting with 0), representing its position in the sequence. While loading, Acorn maps this index to its internal value (typically a pointer). The blocks are effectively sequenced bottom-up: each member of a list appears as its own block before the block for the list itself.
Every block begins with:
- Encoding type and sizing (8-bit). The bottom 4 bits represent an encoding type:
- 0x00 - null, true or false
- 0x01 - signed integer
- 0x02 - floating-point number
- 0x03 - small symbol (up to 16 characters)
- 0x04 - large symbol (more than 16 characters)
- 0x05 - bytes
- 0x06 - Integer list
- 0x07 - Float list
- 0x08 - Value list
- 0x09 - Property list
- 0x0a - Function call
The meaning of the top 4 bits varies depending on the encoding type. In most cases, it provides byte-sizing information for the numbers or indices in the block:
- 0x0 - 1 byte
- 0x1 - 2 bytes
- 0x2 - 3 bytes
- 0x3 - 4 bytes
- 0x4 - 6 bytes
- 0x5 - 8 bytes
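A minimal sketch of how a block's first byte could be split into its two halves, using the sizing table above. The names `SIZE_BYTES` and `split_tag` are hypothetical, chosen for illustration only.

```python
# Sizing code (top 4 bits) -> field width in bytes, per the table above.
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def split_tag(tag: int) -> tuple[int, int]:
    """Return (encoding_type, sizing_code) from a block's first byte."""
    return tag & 0x0F, (tag >> 4) & 0x0F

# 0x31: bottom nibble 0x1 (signed integer), top nibble 0x3 (4-byte field).
encoding_type, sizing = split_tag(0x31)
width = SIZE_BYTES[sizing]
```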
Null, true, false block
- 0x00 = null, true, false value, according to the top "sizing" bits:
- 0x0 - null
- 0x1 - false
- 0x2 - true
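Because this block carries its whole value in the first byte, decoding it is a single lookup. The sketch below assumes nothing beyond the table above; `decode_constant` is a hypothetical name.

```python
# Top "sizing" bits -> value, for blocks whose bottom 4 bits are 0x0.
CONSTANTS = {0x0: None, 0x1: False, 0x2: True}

def decode_constant(tag: int):
    """Decode a one-byte null/false/true block."""
    if tag & 0x0F != 0x00:
        raise ValueError("not a null/true/false block")
    return CONSTANTS[tag >> 4]

values = [decode_constant(t) for t in (0x00, 0x10, 0x20)]
```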
Integer block
- 0x01 with sizing bits up top, indicating size of the data that follows.
- integer (varying size) - a signed integer.
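The integer block could be decoded roughly as follows. Big-endian byte order and two's-complement representation are assumptions, as this section does not state either; `decode_integer_block` is a hypothetical helper.

```python
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def decode_integer_block(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_pos). Big-endian byte order is an assumption."""
    tag = buf[pos]
    if tag & 0x0F != 0x01:
        raise ValueError("not an integer block")
    width = SIZE_BYTES[tag >> 4]
    payload = buf[pos + 1 : pos + 1 + width]
    if len(payload) != width:
        raise ValueError("truncated integer block")
    return int.from_bytes(payload, "big", signed=True), pos + 1 + width

# Tag 0x11: integer with a 2-byte payload; 0xFFFE is -2 in two's complement.
value, next_pos = decode_integer_block(bytes([0x11, 0xFF, 0xFE]))
```

Returning the next position lets a loader walk the file block by block without knowing block sizes in advance.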
Float block
- 0x02 with sizing bits up top, indicating size of the data that follows (typically 4 or 8 bytes).
- float (varying size) - an IEEE 754 floating-point number.
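A sketch of decoding a float block, handling only the typical 4-byte and 8-byte widths (single and double precision). Big-endian byte order is an assumption, and `decode_float_block` is a hypothetical name.

```python
import struct

SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}
# Only the typical widths are handled in this sketch.
FLOAT_FORMATS = {4: ">f", 8: ">d"}

def decode_float_block(buf: bytes, pos: int = 0) -> tuple[float, int]:
    """Return (value, next_pos). Big-endian byte order is an assumption."""
    tag = buf[pos]
    if tag & 0x0F != 0x02:
        raise ValueError("not a float block")
    width = SIZE_BYTES.get(tag >> 4)
    fmt = FLOAT_FORMATS.get(width)
    if fmt is None:
        raise ValueError("unsupported float width in this sketch")
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    return value, pos + 1 + width

# Tag 0x52: float with an 8-byte (double precision) payload.
block = bytes([0x52]) + struct.pack(">d", 1.5)
value, next_pos = decode_float_block(block)
```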
Small symbol block
- 0x03 with length bits up top indicating number of characters in the symbol, where 0-15 represents 1-16 characters.
- characters (varying number of characters) - the symbol's characters.
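The off-by-one length encoding (0-15 meaning 1-16 characters) is easy to get wrong, so here is a small sketch. It assumes each character occupies one byte and that the text is UTF-8 compatible, neither of which is stated in this section; `decode_small_symbol` is a hypothetical name.

```python
def decode_small_symbol(buf: bytes, pos: int = 0) -> tuple[str, int]:
    """Return (symbol, next_pos). One byte per character is an assumption."""
    tag = buf[pos]
    if tag & 0x0F != 0x03:
        raise ValueError("not a small symbol block")
    count = (tag >> 4) + 1  # stored 0-15 means 1-16 characters
    raw = buf[pos + 1 : pos + 1 + count]
    if len(raw) != count:
        raise ValueError("truncated symbol block")
    return raw.decode("utf-8"), pos + 1 + count

# Tag 0x23: small symbol whose length bits (2) mean 3 characters.
symbol, next_pos = decode_small_symbol(bytes([0x23]) + b"foo")
```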
Large symbol block
- 0x04 with sizing bits up top indicating size of length field.
- length (varying size) - an unsigned integer specifying number of characters in the symbol
- characters (varying number of characters) - the symbol's characters.
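For comparison with the small symbol, a large symbol carries an explicit length field. The sketch below assumes big-endian byte order, one byte per character, and UTF-8 compatible text; `decode_large_symbol` is a hypothetical name.

```python
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def decode_large_symbol(buf: bytes, pos: int = 0) -> tuple[str, int]:
    """Return (symbol, next_pos). Byte order and text encoding are assumptions."""
    tag = buf[pos]
    if tag & 0x0F != 0x04:
        raise ValueError("not a large symbol block")
    width = SIZE_BYTES[tag >> 4]  # size of the length field
    start = pos + 1 + width
    if len(buf) < start:
        raise ValueError("truncated length field")
    count = int.from_bytes(buf[pos + 1 : start], "big")
    raw = buf[start : start + count]
    if len(raw) != count:
        raise ValueError("truncated symbol block")
    return raw.decode("utf-8"), start + count

# Tag 0x04: large symbol with a 1-byte length field.
name = b"averylongsymbolname"
symbol, next_pos = decode_large_symbol(bytes([0x04, len(name)]) + name)
```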
Byte block
- 0x05 with sizing bits up top indicating size of type and length fields below.
- type (varying size) - an index pointing to the symbol for the type applied to the byte data stream.
- length (varying size) - an unsigned integer specifying the number of bytes in the data stream.
- bytes (varying number of bytes) - the data stream (no pointers or numbers).
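One way to read a byte block is sketched below. The same sizing width is applied to both the type index and the length, as described above. Big-endian byte order is an assumption, and `read_uint` and `decode_byte_block` are hypothetical helpers.

```python
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def read_uint(buf: bytes, pos: int, width: int) -> tuple[int, int]:
    """Read an unsigned integer of the given width (big-endian assumed)."""
    chunk = buf[pos : pos + width]
    if len(chunk) != width:
        raise ValueError("truncated block")
    return int.from_bytes(chunk, "big"), pos + width

def decode_byte_block(buf: bytes, pos: int = 0) -> tuple[int, bytes, int]:
    """Return (type_index, data, next_pos)."""
    tag = buf[pos]
    if tag & 0x0F != 0x05:
        raise ValueError("not a byte block")
    width = SIZE_BYTES[tag >> 4]
    type_index, pos = read_uint(buf, pos + 1, width)
    length, pos = read_uint(buf, pos, width)
    data = buf[pos : pos + length]
    if len(data) != length:
        raise ValueError("truncated byte data")
    return type_index, data, pos + length

# Tag 0x05 (1-byte fields): type is block 2, followed by 3 data bytes.
type_index, data, next_pos = decode_byte_block(bytes([0x05, 0x02, 0x03, 0xDE, 0xAD, 0xBE]))
```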
Integer List block
- 0x06 with sizing bits up top indicating size of type, length and each integer below.
- type (varying size) - an index pointing to the symbol for the type applied to the integer list.
- length (varying size) - an unsigned integer specifying number of integers in the list.
- integers (varying size and number) - the signed integers.
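The integer list shares one width across the type index, the length, and every element. A sketch under the assumptions of big-endian byte order and two's-complement elements follows; `read_int` and `decode_integer_list` are hypothetical names.

```python
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def read_int(buf: bytes, pos: int, width: int, signed: bool) -> tuple[int, int]:
    """Read one fixed-width integer (big-endian assumed)."""
    chunk = buf[pos : pos + width]
    if len(chunk) != width:
        raise ValueError("truncated block")
    return int.from_bytes(chunk, "big", signed=signed), pos + width

def decode_integer_list(buf: bytes, pos: int = 0) -> tuple[int, list[int], int]:
    """Return (type_index, integers, next_pos)."""
    tag = buf[pos]
    if tag & 0x0F != 0x06:
        raise ValueError("not an integer list block")
    width = SIZE_BYTES[tag >> 4]
    type_index, pos = read_int(buf, pos + 1, width, signed=False)
    count, pos = read_int(buf, pos, width, signed=False)
    items = []
    for _ in range(count):
        item, pos = read_int(buf, pos, width, signed=True)
        items.append(item)
    return type_index, items, pos

# Tag 0x16 (2-byte fields): type block 0, three integers 1, -1, 300.
block = bytes([0x16, 0x00, 0x00, 0x00, 0x03, 0x00, 0x01, 0xFF, 0xFF, 0x01, 0x2C])
type_index, items, next_pos = decode_integer_list(block)
```

Because a single width covers every field, an encoder would have to pick the smallest sizing code that fits the largest of the type index, the count, and all elements.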
Float List block
- 0x07 with sizing bits up top indicating size of type, length and each float below.
- type (varying size) - an index pointing to the symbol for the type applied to the float list.
- length (varying size) - an unsigned integer specifying number of floats in the list.
- floats (varying size and number) - the floating point numbers.
Value List block
- 0x08 with sizing bits up top indicating size of type, length and each value below.
- type (varying size) - an index pointing to the symbol for the type applied to the value list.
- length (varying size) - an unsigned integer specifying number of values in the list.
- values (varying size and number) - the values.
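The sketch below shows how bottom-up sequencing could make a value list resolvable in a single pass. It assumes that each value is stored as the index of an earlier block, which this section implies but does not state outright, and that indices are big-endian. The `loaded` list stands in for Acorn's internal index-to-value map, and `decode_value_list` is a hypothetical name.

```python
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def read_uint(buf: bytes, pos: int, width: int) -> tuple[int, int]:
    """Read an unsigned integer of the given width (big-endian assumed)."""
    chunk = buf[pos : pos + width]
    if len(chunk) != width:
        raise ValueError("truncated block")
    return int.from_bytes(chunk, "big"), pos + width

def decode_value_list(buf: bytes, loaded: list, pos: int = 0) -> tuple[int, list, int]:
    """Return (type_index, values, next_pos), resolving block indices via `loaded`."""
    tag = buf[pos]
    if tag & 0x0F != 0x08:
        raise ValueError("not a value list block")
    width = SIZE_BYTES[tag >> 4]
    type_index, pos = read_uint(buf, pos + 1, width)
    count, pos = read_uint(buf, pos, width)
    values = []
    for _ in range(count):
        index, pos = read_uint(buf, pos, width)
        # Assumption: members were loaded earlier, so the index already resolves.
        values.append(loaded[index])
    return type_index, values, pos

# Blocks 0-3 were loaded first; the list block then refers to blocks 1, 2 and 3.
loaded = ["List", 10, 2.5, None]
type_index, values, next_pos = decode_value_list(bytes([0x08, 0x00, 0x03, 0x01, 0x02, 0x03]), loaded)
```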
Property List block
- 0x09 with sizing bits up top indicating size of type, length and each value below.
- type (varying size) - an index pointing to the symbol for the type applied to the property list.
- length (varying size) - an unsigned integer specifying number of value pairs in the list.
- value pairs (varying size and number) - pairs of values.
Function Call block
- 0x0A with sizing bits up top indicating size of length and each value below.
- length (varying size) - an unsigned integer specifying number of values in the list.
- values (varying size and number) - The first value is the function to call. The remaining values are parameters to pass.
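A function call block has no type field, only a length and the values. The sketch below separates the function from its parameters. As with value lists, it assumes that values are big-endian indices of previously loaded blocks; `loaded` and `decode_function_call` are hypothetical names.

```python
SIZE_BYTES = {0x0: 1, 0x1: 2, 0x2: 3, 0x3: 4, 0x4: 6, 0x5: 8}

def read_uint(buf: bytes, pos: int, width: int) -> tuple[int, int]:
    """Read an unsigned integer of the given width (big-endian assumed)."""
    chunk = buf[pos : pos + width]
    if len(chunk) != width:
        raise ValueError("truncated block")
    return int.from_bytes(chunk, "big"), pos + width

def decode_function_call(buf: bytes, loaded: list, pos: int = 0):
    """Return (function, parameters, next_pos)."""
    tag = buf[pos]
    if tag & 0x0F != 0x0A:
        raise ValueError("not a function call block")
    width = SIZE_BYTES[tag >> 4]
    count, pos = read_uint(buf, pos + 1, width)
    if count == 0:
        raise ValueError("function call needs at least the function value")
    values = []
    for _ in range(count):
        index, pos = read_uint(buf, pos, width)
        values.append(loaded[index])  # assumption: values are block indices
    # The first value is the function to call; the rest are its parameters.
    return values[0], values[1:], pos

loaded = ["print", "hello", 42]
function, params, next_pos = decode_function_call(bytes([0x0A, 0x03, 0x00, 0x01, 0x02]), loaded)
```

The sketch only unpacks the call; how Acorn looks up and invokes the function is outside what this section describes.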
Data Synchronization
TBD: Address format of data synchronized between servers and client devices (multiple pieces of data updated), and maybe to run an update or assignment "program".