Content Addressing

Definition

Content addressing is the principle that an object's identity is determined by its content, not by an external location or name. In the UOR framework, this is formalized through the Element class.
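
As a minimal illustration of the principle (not the UOR Element implementation), the sketch below derives an identifier from bytes alone. The helper name content_address, the sample store, and its keys are hypothetical, and SHA-256 is used only because it ships with the Python standard library.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Derive an identifier from the bytes alone (illustrative helper)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

# Hypothetical store: two entries share the same bytes under different names.
store = {
    "notes/a.txt": b"hello",
    "backup/copy.txt": b"hello",
    "notes/b.txt": b"hello!",
}

addresses = {name: content_address(data) for name, data in store.items()}

# Same content gives the same address, regardless of name or location.
same = addresses["notes/a.txt"] == addresses["backup/copy.txt"]
# Different content gives a different address.
different = addresses["notes/a.txt"] != addresses["notes/b.txt"]
print(same, different)
```

The names in the store play no part in the result: renaming or moving an entry leaves its address unchanged, while changing a single byte produces a new one.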

Mathematical Basis

A content address is a canonical representation of an object derived from its bytes. The UOR framework uses the ring structure (see Ring) to define a canonical form — a unique representative for each equivalence class of objects.

The CanonicalFormResolver computes this canonical form by factorizing an object's representation in the dihedral group D_{2^n}.

Ontology Representation

The addressing namespace u/ provides the following foundational class:

Class    Description
Element  Universal content address

Properties in the u/ namespace:

Property         Description
wittLength       The Witt level n of this element
length           Length in bytes
digest           Content hash (algorithm-prefixed)
digestAlgorithm  Hash algorithm ('blake3' or 'sha256')
canonicalBytes   Canonical byte pre-image for hashing

Cryptographic Primitive Pinning (Amendment 43)

The digest property is pinned to two allowed hash algorithms: BLAKE3 (primary) and SHA-256 (secondary). The digest value is prefixed with the algorithm identifier and a colon, e.g. blake3:af13... or sha256:e3b0....

The hash input is a deterministic byte string stored in canonicalBytes. This byte string consists of a 4-byte header (the 2-byte magic UR, a Witt-level byte, and a reserved byte) followed by the datum value in little-endian byte order. At Witt level k the datum occupies k + 1 bytes, so the total length is 4 + (k + 1) bytes; a level-0 element, for example, has 5 canonical bytes.
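
The layout described above can be sketched as follows. This is an illustration under stated assumptions, not the framework's encoder: the function name canonical_bytes is hypothetical, the header fields are assumed to appear in the order listed (magic, Witt level, reserved), the Witt level is assumed to fit in one unsigned byte, the reserved byte is assumed to be zero, and the datum is treated as an unsigned integer.

```python
MAGIC = b"UR"     # two-byte magic named in the text
RESERVED = 0x00   # assumption: the reserved byte is zero

def canonical_bytes(witt_level: int, value: int) -> bytes:
    """Build header + little-endian datum for a given Witt level (sketch)."""
    if not 0 <= witt_level <= 255:
        raise ValueError("Witt level must fit in one byte in this sketch")
    width = witt_level + 1  # datum width in bytes at level k is k + 1
    if not 0 <= value < (1 << (8 * width)):
        raise ValueError("value does not fit in the datum width")
    header = MAGIC + bytes([witt_level, RESERVED])
    return header + value.to_bytes(width, "little")

level0 = canonical_bytes(0, 0x2A)     # 4-byte header + 1 datum byte
level1 = canonical_bytes(1, 0x0102)   # 4-byte header + 2 datum bytes
print(level0.hex(), level1.hex())
```

At level 1 the value 0x0102 is written as the bytes 02 01, low byte first, which is what little-endian order means here.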

The digestAlgorithm property records which algorithm was used, so that the algorithm prefix in the digest can be cross-checked against it.
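
A sketch of that cross-check, under stated assumptions: the function names split_digest and check_element are hypothetical, the digest body is assumed to be lowercase hex, and only SHA-256 is actually recomputed because BLAKE3 is not part of the Python standard library.

```python
import hashlib
from typing import Tuple

ALLOWED_ALGORITHMS = {"blake3", "sha256"}  # the two pinned algorithms

def split_digest(digest: str) -> Tuple[str, str]:
    """Split 'algorithm:hex' and reject algorithms outside the pinned set."""
    algorithm, sep, hex_value = digest.partition(":")
    if not sep or not hex_value or algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"malformed or unsupported digest: {digest!r}")
    return algorithm, hex_value

def check_element(digest: str, digest_algorithm: str, canonical: bytes) -> bool:
    """Cross-check the prefix against digestAlgorithm, then recompute the hash."""
    algorithm, hex_value = split_digest(digest)
    if algorithm != digest_algorithm:
        return False  # prefix and digestAlgorithm disagree
    if algorithm == "sha256":
        return hashlib.sha256(canonical).hexdigest() == hex_value
    # BLAKE3 needs a third-party package; this sketch does not verify it.
    raise NotImplementedError("blake3 verification is not covered here")

pre_image = b"UR\x00\x00\x2a"  # example pre-image, chosen for illustration
digest = "sha256:" + hashlib.sha256(pre_image).hexdigest()
print(check_element(digest, "sha256", pre_image))
```

Checking the prefix before hashing means a mismatched digestAlgorithm is rejected without computing anything.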

Resolution

Given a content address, the Resolver hierarchy resolves it in three stages:

  1. DihedralFactorizationResolver — factorizes in D_{2^n}
  2. CanonicalFormResolver — computes the canonical form
  3. EvaluationResolver — evaluates the canonical form

The result is a Partition decomposing the address into irreducible, reducible, unit, and exterior components.

Schema Integration

The Datum class represents raw byte content, while Term represents symbolic content. The two classes are declared owl:disjointWith each other: no individual can be both a datum and a term.

A Literal (a subclass of Term) can denote a Datum via the denotes property, bridging the symbolic and data layers without conflating them.
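
The relationship can be mirrored loosely in code. The sketch below is an analogy only: Python classes do not enforce OWL disjointness, and the field names value and symbol are hypothetical; only the class names and the denotes link come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Datum:
    """Raw byte content."""
    value: bytes

@dataclass(frozen=True)
class Term:
    """Symbolic content; shares no ancestry with Datum."""
    symbol: str

@dataclass(frozen=True)
class Literal(Term):
    """A term that points at a datum instead of being one."""
    denotes: Datum

lit = Literal(symbol="0x2A", denotes=Datum(b"\x2a"))

# The literal stays on the symbolic side; the bytes live in the datum it denotes.
print(isinstance(lit, Term), isinstance(lit, Datum), lit.denotes.value)
```

Keeping denotes as a reference, and not making Literal inherit from Datum, is what lets the two layers connect without any object belonging to both.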