4.3. primitiv File Format v0.1¶
primitiv File Format is a common binary format to store/load data used in primitiv. It uses the MessagePack wire format as the inner binary representation.
4.3.1. Legend¶
+------+ +---------------+---------------+...
| Type | = | Member Type 1 | Member Type 2 |
| | | Member Name 1 | Member Name 2 |
+------+ +---------------+---------------+...
4.3.2. Types¶
+-------+ +---------------+--------+
| Shape | = | array<uint32> | uint32 |
| | | dims | batch |
+-------+ +---------------+--------+
In the current version, the batch
member is always 1
for all Shape
objects.
+--------+ +-------+------+
| Tensor | = | Shape | bin |
| | | shape | data |
+--------+ +-------+------+
data
member has an array of single-precision floating number with the
following format:
- Byte order: Little-endian (differ than MessagePack’s float)
- Array order: Column-major (Fortran)
- Batch is treated as the last dimension of the shape
(if
shape.batch > 1
). I.e., The next data begins just after the previous data according to the column-major array order.
+-----------+ +--------+--------+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+.........
| Parameter | = | Tensor | uint32 | str | Tensor |
| | | value | N | stat_key[1] | stat_value[1] | N times
+-----------+ +--------+--------+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+.........
+-------+ +--------+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+.........
| Model | = | uint32 | array<str> | Parameter |
| | | N | param_key[1] | param_value[1] | N times
+-------+ +--------+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+.........
The key of each parameter represents the address of the parameter from the root model. E.g.:
param_key == ["foo"]
: Parameter has the name"foo"
, and is directly owned by the root model.param_key == ["foo", "bar"]
: Parameter has the name"bar"
, and is owned by the submodel"foo"
.
+-----------+ +------------------+-----------------+
| Optimizer | = | map<str, uint32> | map<str, float> |
| | | uint_configs | float_configs |
+-----------+ +------------------+-----------------+
4.3.3. File Format¶
+-----------+-----------+-----------+----------------------------------------+
| uint32 | uint32 | uint32 | Shape|Tensor|Parameter|Model|Optimizer |
| ver_major | ver_minor | data_type | data |
+-----------+-----------+-----------+----------------------------------------+
Version numbers are typically equal to following:
ver_major == 0
ver_minor == 1
Following table shows the correspondence between data_type
and data
:
data_type | data |
---|---|
0x0 |
Shape |
0x100 |
Tensor |
0x200 |
Parameter |
0x300 |
Model |
0x400 |
Optimizer |