Every binary format faces this question: "How do you encode a number when you don't know how big it will be?"
Until VSF, every format in existence picked one of these three bad answers:
Store everything as u32 or u64
Problem: Hits hard limits (4GB for u32) and wastes space for small numbers
Used by: TIFF, PNG, HDF5
7 bits per byte, MSB indicates "more bytes follow"
Problem: Must read every byte to find the end (literally cannot skip), hard cap at 64 bits
Used by: Protobuf, MessagePack
Store length as u32, then data
Problem: Length field itself has a limit! Recursion required for bigger lengths
Used by: Most TLV formats
VSF introduces Exponential Width Encoding (EWE) - a novel byte-aligned scheme where ASCII markers map directly to exponential size classes:
How it works:
0. Type marker: 'u' (unsigned), 'i' (signed), etc.
1. Size marker: ASCII character '0'-'Z'
2. Data: Exactly 2^(marker) bits follow
3. '0'=bool, '3'=8 bits, '4'=16 bits, '5'=32 bits, '6'=64 bits, ..., 'Z'=2^36 bits (8 GB)
Example: 'u' '5' [0x01234567]
│ │ └─ Data (2^5 bits = 32 bits = 4 bytes)
│ └─ Size class marker
└─ Type marker
Result: O(1) seekability + unbounded integers
Every number can be represented as mantissa × 2^exponent:
| Value | Overhead | Data | Total |
|---|---|---|---|
| 42 | 2 bytes | 1 byte | 3 bytes |
| 2^64-1 | 2 bytes | 8 bytes | 10 bytes |
| RSA-16384 prime | 2 bytes | 2048 bytes | 2050 bytes |
| Planck volumes in universe (~10^185) | 2 bytes | 23 bytes | 25 bytes |
The overhead stays negligible even for numbers larger than the universe.
❌ Planck volumes in observable universe: ~10^185
(Needs 185 bits, Protobuf stops at 64)
✅ VSF: 'u' 'B' + 23 bytes = 25 bytes total
❌ Cryptographic keys (RSA-16384 = 2048 bytes)
JSON can't represent integers > 2^53 exactly
✅ VSF: 'u' 'D' + 2048 bytes = 2050 bytes
❌ Storing 1 million boolean flags as u64
Wastes 8 MB instead of 125 KB
✅ VSF bitpacked: 125KB (1000x smaller)
Written entirely in Rust with zero wildcards in all match statements. Add a type? Won't compile until handled everywhere.
Hashes, signatures, keys, and MACs as first-class types with mandatory BLAKE3 file integrity.
Integrates Spirix for two's complement floating-point that preserves mathematical identities.
Store 12-bit camera RAW, quantized ML models, and sensor data at their natural bit depths.
use vsf::{VsfType, BitPackedTensor, Tensor};
// Store 12-bit camera RAW
let raw = BitPackedTensor::pack(12, vec![4096, 3072], &pixel_data);
let encoded = VsfType::p(raw).flatten();
// Store a tensor (8-bit grayscale image)
let tensor = Tensor::new(vec![1920, 1080], grayscale_data);
let img = VsfType::t_u3(tensor);
// Store text (automatically Huffman compressed)
let doc = VsfType::x("Hello, world!".to_string());
// Store a hash (BLAKE3)
use vsf::crypto_algorithms::HASH_BLAKE3;
let hash = VsfType::h(HASH_BLAKE3, hash_bytes);
// Round-trip
let decoded = VsfType::parse(&encoded)?;
assert_eq!(original, decoded);
| Format | Seekable | Unbounded | Optimal Size | Crypto Types |
|---|---|---|---|---|
| TIFF | ✅ | ❌ (4GB limit) | ❌ (fixed u32) | ❌ |
| PNG | ✅ | ❌ (4GB limit) | ❌ (12 byte overhead) | ❌ |
| HDF5 | ✅ | ❌ (64-bit max) | ❌ (u64 everywhere) | ❌ |
| Protobuf | ❌ (must parse) | ❌ (64-bit max) | ⚠️ (small only) | ❌ |
| JSON | ❌ | ❌ (precision loss) | ❌ (text bloat) | ❌ |
| VSF | ✅ | ✅ | ✅ | ✅ |
DNA sequencing quality scores use 6 bits but get stored in 8-bit ASCII. A human genome (3 billion bases) wastes 750MB on padding. VSF bitpacking eliminates this overhead while embedding cryptographic signatures.
Quantized neural networks use 4-bit or 8-bit weights. Standard formats store these in 32-bit arrays (4-8x waste). A 4-bit quantized LLaMA-7B: 3.5GB actual, 14GB in typical formats.
Particle physics experiments produce petabytes with heterogeneous precision. VSF selects optimal encoding per field. Spirix prevents IEEE-754 underflow in long-running cumulative calculations.
Satellite sensors transmit over power/bandwidth-constrained RF links. Temperature sensors: 12-bit, accelerometers: 10-bit. Storing as 16-bit wastes 20-40% per reading.
# Add to Cargo.toml
[dependencies]
vsf = "0.1"
# Or install directly
cargo add vsf
VSF is part of a broader computational foundation:
Each component addresses fundamental problems in computing.
Custom open-source:
11011101 10000 0100001
Built with zero wildcards.