Module value

GGUF v3 typed input (ADR-023 amended by ADR-060).

The GGUF spec defines no canonical form, so this realization defines one: a flat Merkle skeleton. Two GGUF files that decode to the same logical content canonicalize to byte-identical skeletons. Every variable-length leaf (a string, an array payload, a tensor’s data region) is replaced by its streamed SHA-256 digest, so the skeleton’s size grows only with the KV and tensor counts, never with model size, while still binding every weight byte into the κ-label.

LE_u32(GGUF_MAGIC)
LE_u32(GGUF_VERSION_REQUIRED)
LE_u64(tensor_count)
LE_u64(kv_count)
LE_u64(canonical_alignment)
── metadata KVs, sorted by key bytes ──
  for kv: sha256(key) || LE_u32(type_tag) || canonical_value(kv)
    scalar  → the value's natural little-endian bytes
    string  → LE_u64(len) || sha256(utf8 bytes)
    array   → LE_u32(elem_type) || LE_u64(len) || sha256(wire payload)
── tensor info, sorted by name bytes ──
  for t: sha256(name) || LE_u32(n_dims) || (LE_u64(dim) × n_dims)
      || LE_u32(ggml_type_id) || LE_u64(recomputed_offset)
      || sha256(tensor data bytes)        ← streamed; binds the weights
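
The fixed header and the sort-by-key-bytes rule in the layout above can be sketched as follows. This is only an illustration, not the module's code: encode_header and sort_by_key_bytes are hypothetical names, the magic is the "GGUF" byte sequence read as a little-endian u32, the version constant is assumed to be 3 because the module targets GGUF v3, and the alignment is taken as a parameter because its canonical value is not stated here.

```rust
/// "GGUF" read as a little-endian u32 (bytes 0x47 0x47 0x55 0x46).
const GGUF_MAGIC: u32 = 0x4655_4747;
/// Assumed to be 3, since the module targets GGUF v3.
const GGUF_VERSION_REQUIRED: u32 = 3;

/// Hypothetical helper: encode the fixed 32-byte skeleton header.
/// `canonical_alignment` is a parameter because its value is an assumption.
fn encode_header(tensor_count: u64, kv_count: u64, canonical_alignment: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity(32);
    out.extend_from_slice(&GGUF_MAGIC.to_le_bytes());
    out.extend_from_slice(&GGUF_VERSION_REQUIRED.to_le_bytes());
    out.extend_from_slice(&tensor_count.to_le_bytes());
    out.extend_from_slice(&kv_count.to_le_bytes());
    out.extend_from_slice(&canonical_alignment.to_le_bytes());
    out
}

/// Hypothetical helper: order metadata entries by raw key bytes,
/// so the order in which the input file stored them does not matter.
fn sort_by_key_bytes<V>(entries: &mut [(String, V)]) {
    entries.sort_by(|a, b| a.0.as_bytes().cmp(b.0.as_bytes()));
}
```

Calling encode_header(2, 3, 32) would yield 32 bytes whose first four are the ASCII characters GGUF; the per-KV and per-tensor records would then be appended in sorted order.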

recomputed_offset is the cumulative aligned byte position in sorted-tensor order (NOT the input’s stored offset), so two inputs whose tensor-data sections are laid out in different orders canonicalize identically.
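
A minimal sketch of how such offsets might be recomputed, assuming the first sorted tensor starts at offset 0 of the data section and each following tensor starts at the next multiple of the alignment. TensorInfo, align_up, and recomputed_offsets are hypothetical names used only for illustration.

```rust
/// Hypothetical tensor record: only the fields the offset walk needs.
struct TensorInfo {
    name: String,
    byte_len: u64,
}

/// Round `n` up to the next multiple of `align` (`align` must be non-zero).
fn align_up(n: u64, align: u64) -> u64 {
    (n + align - 1) / align * align
}

/// Returns (name, recomputed_offset) pairs in name-byte-sorted order.
/// Assumption: offsets are relative to the start of the tensor-data section.
fn recomputed_offsets(tensors: &[TensorInfo], alignment: u64) -> Vec<(String, u64)> {
    let mut sorted: Vec<&TensorInfo> = tensors.iter().collect();
    sorted.sort_by(|a, b| a.name.as_bytes().cmp(b.name.as_bytes()));

    let mut cursor = 0u64;
    let mut out = Vec::with_capacity(sorted.len());
    for t in sorted {
        out.push((t.name.clone(), cursor));
        // The next tensor begins at the next aligned position.
        cursor = align_up(cursor + t.byte_len, alignment);
    }
    out
}
```

Because the walk depends only on the sorted names and the data lengths, two inputs that store the same tensors in a different physical order produce the same offset list.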

Under ADR-060 the full skeleton flows through the pipeline as a TermValue::Borrowed carrier, and ψ₉ folds it through the σ-axis. There is no two-level commitment, no carrier ceiling, and no count or width cap. Tensor data and large string / array payloads are streamed through prism::crypto::Sha256Hasher with bounded resident memory, so arbitrarily large weights bind into the κ-label.
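
The bounded-memory streaming idea can be illustrated with a fixed-size read buffer feeding an incremental hasher. The sketch below does not use the prism::crypto::Sha256Hasher API, whose signature is not shown here: IncrementalHasher and stream_into are hypothetical names, and a toy FNV-1a hash stands in for SHA-256 so the example needs only the standard library.

```rust
use std::io::{self, Read};

/// Hypothetical incremental-hasher interface (a stand-in, not the prism API).
trait IncrementalHasher {
    fn update(&mut self, chunk: &[u8]);
}

/// Feed a reader to a hasher through one fixed 64 KiB buffer, so resident
/// memory stays constant however large the input is. Returns bytes consumed.
fn stream_into<R: Read, H: IncrementalHasher>(mut reader: R, hasher: &mut H) -> io::Result<u64> {
    let mut buf = vec![0u8; 64 * 1024];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        hasher.update(&buf[..n]);
        total += n as u64;
    }
    Ok(total)
}

/// Toy 64-bit FNV-1a, used here only in place of SHA-256.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
}

impl IncrementalHasher for Fnv1a {
    fn update(&mut self, chunk: &[u8]) {
        for &b in chunk {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }
}
```

Streaming a byte slice through stream_into gives the same state as a single update call over the whole slice, which is the property that lets a large tensor region be hashed without being held in memory.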

GgufValue (the owned parsed value, alloc-gated) holds the skeleton; GgufCarrier is the borrowed model-input handle the pipeline binds.

Structs§

GgufCarrier
Borrowed canonical-skeleton input handle (ADR-060 borrowed carrier). A thin, Copy borrow of the skeleton bytes produced by canonicalize; as_binding_value returns the Borrowed carrier zero-copy.
GgufValue
A parsed, canonicalized GGUF v3 file. The stored bytes are the flat canonical skeleton (see module docs). alloc-gated — the pipeline binds the borrowed GgufCarrier.

Functions§

canonicalize
Canonical skeleton as an owned Vec<u8>.