GGUF v3 typed input (ADR-023 amended by ADR-060).
The GGUF spec defines no canonical form; this realization defines one — a flat Merkle skeleton. Two GGUF files that decode to the same logical content canonicalize to byte-identical skeletons. Every variable-length leaf (a string, an array payload, a tensor’s data region) is replaced by its streamed SHA-256 digest, so the skeleton’s size grows only with the KV / tensor counts (never with model size), while still binding every weight byte into the κ-label.
LE_u32(GGUF_MAGIC)
LE_u32(GGUF_VERSION_REQUIRED)
LE_u64(tensor_count)
LE_u64(kv_count)
LE_u64(canonical_alignment)
── metadata KVs, sorted by key bytes ──
for kv: sha256(key) || LE_u32(type_tag) || canonical_value(kv)
    scalar → the value's natural little-endian bytes
    string → LE_u64(len) || sha256(utf8 bytes)
    array  → LE_u32(elem_type) || LE_u64(len) || sha256(wire payload)
── tensor info, sorted by name bytes ──
for t: sha256(name) || LE_u32(n_dims) || (LE_u64(dim) × n_dims)
       || LE_u32(ggml_type_id) || LE_u64(recomputed_offset)
       || sha256(tensor data bytes)   ← streamed; binds the weights

recomputed_offset is the cumulative aligned byte position in
sorted-tensor order (NOT the input’s stored offset), so two inputs
whose tensor-data sections are laid out in different orders
canonicalize identically.
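The offset rule above can be sketched as follows. This is a minimal illustration, not the crate's code: `TensorInfo`, `align_up`, and `recompute_offsets` are hypothetical names, and the sketch assumes that each tensor starts at the next multiple of the alignment after the previous tensor's data ends.

```rust
/// Hypothetical tensor descriptor: a name and the byte length of its data.
/// (The real parsed type is not shown in these docs.)
#[derive(Debug, Clone)]
struct TensorInfo {
    name: String,
    data_len: u64,
}

/// Round `pos` up to the next multiple of `align` (assumes `align > 0`).
fn align_up(pos: u64, align: u64) -> u64 {
    (pos + align - 1) / align * align
}

/// Assign offsets in sorted-name order, ignoring whatever offsets the
/// input file stored. Assumption: each tensor begins at the aligned
/// position following the end of the previous tensor's data.
fn recompute_offsets(mut tensors: Vec<TensorInfo>, align: u64) -> Vec<(String, u64)> {
    // Sort by raw name bytes, matching the "sorted by name bytes" rule.
    tensors.sort_by(|a, b| a.name.as_bytes().cmp(b.name.as_bytes()));

    let mut cursor = 0u64;
    let mut out = Vec::with_capacity(tensors.len());
    for t in tensors {
        let offset = align_up(cursor, align);
        out.push((t.name, offset));
        cursor = offset + t.data_len;
    }
    out
}
```

Because the offsets depend only on the sorted names, the data lengths, and the alignment, two inputs that list the same tensors in different orders yield the same result.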
Under ADR-060 the full skeleton flows through the pipeline as a
[TermValue::Borrowed] carrier and ψ₉ folds it through the σ-axis —
there is no two-level commitment, no carrier ceiling, and no count /
width cap. Tensor data and large string / array payloads are streamed
through prism::crypto::Sha256Hasher with bounded resident memory,
so arbitrarily large weights bind into the κ-label.
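The bounded-memory streaming pattern might look like the sketch below. The interface of `prism::crypto::Sha256Hasher` is not shown in these docs, so the `ChunkSink` trait here is a hypothetical stand-in for any incremental hasher, and `CountingSink` is a toy implementation that only counts bytes (it is not a hash).

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for an incremental hasher. This is NOT the
/// real `prism::crypto::Sha256Hasher` interface, which is not shown here.
trait ChunkSink {
    fn update(&mut self, chunk: &[u8]);
}

/// Feed everything from `reader` into `sink` through one fixed-size
/// buffer, so resident memory stays constant regardless of input size.
/// Returns the total number of bytes streamed.
fn stream_into<R: Read, S: ChunkSink>(mut reader: R, sink: &mut S) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink.update(&buf[..n]);
        total += n as u64;
    }
    Ok(total)
}

/// Toy sink for demonstration: records the byte count and the largest
/// chunk it was handed. A real sink would update a SHA-256 state instead.
#[derive(Default)]
struct CountingSink {
    bytes: u64,
    max_chunk: usize,
}

impl ChunkSink for CountingSink {
    fn update(&mut self, chunk: &[u8]) {
        self.bytes += chunk.len() as u64;
        self.max_chunk = self.max_chunk.max(chunk.len());
    }
}
```

Passing a tensor's data region as the reader means only the 8 KiB buffer is ever held in memory, while every byte still reaches the sink.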
GgufValue (the owned parsed value, alloc-gated) holds the
skeleton; GgufCarrier is the borrowed model-input handle the
pipeline binds.
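The owned/borrowed split can be illustrated with a small sketch. The types below are hypothetical stand-ins, not the actual definitions of GgufValue or GgufCarrier; they only show why a borrowed handle over the skeleton bytes can be `Copy` and zero-copy.

```rust
/// Hypothetical owned holder of skeleton bytes (stand-in for an owned,
/// alloc-gated value; not the real GgufValue definition).
struct OwnedSkeleton {
    bytes: Vec<u8>,
}

/// Hypothetical thin borrowed handle (not the real GgufCarrier). It is
/// `Copy` because it holds nothing but a slice reference.
#[derive(Clone, Copy)]
struct SkeletonRef<'a> {
    bytes: &'a [u8],
}

impl OwnedSkeleton {
    /// Lend the skeleton bytes without copying them.
    fn borrow_handle(&self) -> SkeletonRef<'_> {
        SkeletonRef { bytes: &self.bytes }
    }
}

impl<'a> SkeletonRef<'a> {
    /// Return the borrowed bytes; the slice points into the owner's buffer.
    fn as_bytes(self) -> &'a [u8] {
        self.bytes
    }
}
```

Copying the handle duplicates only a pointer and a length, so it can be passed around freely while the owner keeps the single allocation alive.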
Structs
- GgufCarrier - Borrowed canonical-skeleton input handle (ADR-060 borrowed carrier). A thin, Copy borrow of the skeleton bytes produced by canonicalize; as_binding_value returns the Borrowed carrier zero-copy.
- GgufValue - A parsed, canonicalized GGUF v3 file. The stored bytes are the flat canonical skeleton (see module docs). alloc-gated; the pipeline binds the borrowed GgufCarrier.
Functions
- canonicalize - Canonical skeleton as an owned Vec<u8>.