//! **`uor_addr::gguf` — the GGUF v3 realization of UOR-ADDR.**
//!
//! Typed content-addressing for GGUF v3 model files
//! (`GGUF_MAGIC = 0x46554747`, `version = 3`) under a spec-canonical
//! structural form. The default σ-projection is
//! [`prism::crypto::Sha256Hasher`]; [`address_blake3`], [`address_sha3_256`],
//! [`address_keccak256`], and [`address_sha512`] select the other axes
//! ([`crate::hash`]).
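//!
//! As a rough illustration of the two header fields named above, the
//! sketch below checks the magic and version in the first eight bytes of a
//! buffer. It assumes a little-endian file with both fields stored as
//! `u32`s at offsets 0 and 4; `check_header`, `MAGIC`, and `VERSION` are
//! hypothetical stand-ins for this example, not this crate's API.
//!
//! ```rust
//! // Hypothetical local constants; the crate re-exports its own
//! // `GGUF_MAGIC` and `GGUF_VERSION_REQUIRED`.
//! const MAGIC: u32 = 0x4655_4747; // the bytes "GGUF" read as a little-endian u32
//! const VERSION: u32 = 3;
//!
//! // Returns true when `bytes` starts with the GGUF magic followed by
//! // version 3. The rest of the header is not validated here.
//! fn check_header(bytes: &[u8]) -> bool {
//!     if bytes.len() < 8 {
//!         return false;
//!     }
//!     let magic = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
//!     let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
//!     magic == MAGIC && version == VERSION
//! }
//!
//! fn main() {
//!     let mut header = b"GGUF".to_vec();
//!     header.extend_from_slice(&3u32.to_le_bytes());
//!     assert!(check_header(&header));
//!     // Too short to hold both fields, so it is rejected.
//!     assert!(!check_header(b"GGML"));
//! }
//! ```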
//!
//! ## σ-axis vs. the canonical form
//!
//! The leaf commitments inside the skeleton (tensor-data, array-payload,
//! and long-string digests) are **SHA-256 by canonical-form definition**
//! ([`CANONICAL_FORM_VERSION`]) — they are a fixed part of the
//! serialization, exactly as JCS fixes JSON number formatting independently
//! of the κ-hash. The selected κ-axis `H` is applied *on top* of that fixed
//! canonical form: κ = `H(skeleton)`. So `address_blake3` yields
//! `blake3(skeleton-with-sha256-leaves)`. Every byte still binds (a flipped
//! tensor byte changes its SHA-256 leaf → changes the skeleton → changes
//! κ), and the sha256 κ-labels are byte-identical to prior releases.
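//!
//! The binding chain described above (leaf digest, then skeleton, then κ)
//! can be sketched with a toy model. This is only an illustration under
//! loud assumptions: `toy_digest` uses the standard library's
//! non-cryptographic `DefaultHasher` as a stand-in for both the SHA-256
//! leaves and the κ-axis `H`, and `toy_skeleton` / `toy_kappa` are
//! hypothetical helpers that do not reflect the real byte layout.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::Hasher;
//!
//! // Toy 64-bit digest. NOT cryptographic; chosen only so the sketch
//! // needs no external crates.
//! fn toy_digest(bytes: &[u8]) -> [u8; 8] {
//!     let mut h = DefaultHasher::new();
//!     h.write(bytes);
//!     h.finish().to_le_bytes()
//! }
//!
//! // Hypothetical skeleton: a fixed header followed by one leaf digest
//! // per tensor, so its size depends on the tensor count, not their bytes.
//! fn toy_skeleton(header: &[u8], tensors: &[&[u8]]) -> Vec<u8> {
//!     let mut out = header.to_vec();
//!     for t in tensors {
//!         out.extend_from_slice(&toy_digest(t));
//!     }
//!     out
//! }
//!
//! // κ = H(skeleton), with the toy digest playing the role of `H`.
//! fn toy_kappa(header: &[u8], tensors: &[&[u8]]) -> [u8; 8] {
//!     toy_digest(&toy_skeleton(header, tensors))
//! }
//!
//! fn main() {
//!     let a: &[u8] = &[1, 2, 3, 4];
//!     let flipped: &[u8] = &[1, 2, 3, 5];
//!     // Identical content gives the same label.
//!     assert_eq!(toy_kappa(b"hdr", &[a]), toy_kappa(b"hdr", &[a]));
//!     // One changed tensor byte changes the leaf, the skeleton, and the label
//!     // (barring a collision in the 64-bit toy digest).
//!     assert_ne!(toy_kappa(b"hdr", &[a]), toy_kappa(b"hdr", &[flipped]));
//! }
//! ```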
//!
//! ## Authoritative sources
//!
//! - GGUF v3 binary format — <https://github.com/ggml-org/ggml/blob/master/docs/gguf.md>
//! - Reference C++ header — <https://github.com/ggml-org/ggml/blob/master/include/gguf.h>
//! - Reference Python tooling — <https://github.com/ggml-org/llama.cpp/tree/master/gguf-py>
//! - `ggml_type` enum / `GGML_MAX_DIMS` — <https://github.com/ggml-org/ggml/blob/master/include/ggml.h>
//! - SHA-256 σ-projection — NIST FIPS 180-4.
//!
//! ## Canonical form
//!
//! The GGUF spec defines no canonical form; this realization defines one
//! (canonical form v2 — [`CANONICAL_FORM_VERSION`]). It is the **full
//! flat Merkle skeleton** (ADR-060): a structural form (header, metadata
//! KVs sorted by key bytes, tensor info sorted by name bytes with
//! recomputed canonical offsets) in which every variable-length leaf —
//! tensor data, metadata array payloads, long strings — is represented by
//! its 32-byte streamed SHA-256 digest. The skeleton's size grows only
//! with the KV / tensor counts (never with model size) and flows through
//! the pipeline as a `Borrowed` carrier that ψ₉ folds; tensor data is
//! streamed through the hash axis at the host boundary (true incremental
//! SHA-256) with bounded resident memory. There is no two-level
//! commitment and no count / width ceiling. See [`crate::gguf::value`]
//! for the full byte layout.
//!
//! Two GGUF files that decode to the same logical content (modulo
//! metadata-KV order, tensor order, and tensor-data layout) canonicalize
//! to byte-identical skeletons and therefore to the same κ-label.
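//!
//! The order-independence claimed above follows from sorting before
//! serializing. A minimal sketch, assuming metadata KVs are modelled as
//! `(key, value)` pairs with `u32` placeholder values;
//! `sort_by_key_bytes` is a hypothetical helper for this example, not the
//! crate's canonicalizer:
//!
//! ```rust
//! // Order KVs by the raw bytes of their keys, as the canonical form
//! // does for metadata entries.
//! fn sort_by_key_bytes<'a>(mut kvs: Vec<(&'a str, u32)>) -> Vec<(&'a str, u32)> {
//!     kvs.sort_by(|a, b| a.0.as_bytes().cmp(b.0.as_bytes()));
//!     kvs
//! }
//!
//! fn main() {
//!     // The same two entries, written in different orders by two files.
//!     let file_a = vec![("general.name", 1), ("general.architecture", 2)];
//!     let file_b = vec![("general.architecture", 2), ("general.name", 1)];
//!     // After sorting, both files present the entries identically.
//!     assert_eq!(sort_by_key_bytes(file_a), sort_by_key_bytes(file_b));
//! }
//! ```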
//!
//! ## Tensor element types
//!
//! Validated against the [`prism::tensor::dtype`] alphabet via
//! [`dtype::GgmlType`] — a total mapping of the 29 GGUF v3 `ggml_type`
//! IDs to `prism::tensor::dtype` shapes.

pub mod dtype;
pub mod model;
pub mod pipeline;
pub mod shapes;
pub mod value;
pub mod verbs;

/// Canonical-form version (see module docs). Bumped to 2 under ADR-060:
/// the canonical form is now the full flat Merkle skeleton (no two-level
/// commitment), so v2 κ-labels differ from the v1 commitment's.
pub const CANONICAL_FORM_VERSION: u32 = 2;

pub use dtype::GgmlType;
pub use model::{
    AddressModel, AddressModelBlake3, AddressModelKeccak256, AddressModelSha3_256,
    AddressModelSha512, AddressRoute,
};
#[cfg(feature = "alloc")]
pub use pipeline::{address, address_blake3, address_keccak256, address_sha3_256, address_sha512};
pub use pipeline::{AddressFailure, AddressOutcome, AddressWitness, VerifyError};
pub use shapes::bounds::{
    GGUF_DEFAULT_ALIGNMENT, GGUF_HEADER_BYTES, GGUF_MAGIC, GGUF_MAX_DIMS,
    GGUF_METADATA_ARRAY_DEPTH_MAX, GGUF_VERSION_REQUIRED,
};
pub use value::GgufCarrier;
#[cfg(feature = "alloc")]
pub use value::{canonicalize, GgufValue};
pub use verbs::{address_inference, VERB_TERMS_ADDRESS_INFERENCE};

/// The shared, format-independent ψ-tower (re-exported for convenience;
/// canonical path is [`crate::resolvers::AddressResolverTuple`]).
pub use crate::resolvers::AddressResolverTuple;