Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Chapter 22: Database Systems

Introduction

The Hologram’s perfect hash and content-addressable memory eliminate traditional database pain points: index maintenance, collision resolution, and deduplication overhead. This chapter explores how these foundations enable a new class of index-free databases where identity is intrinsic, queries are proofs, and storage is automatically deduplicated.

Index-Free Architecture

The End of B-Trees

Traditional databases rely on auxiliary index structures. The Hologram eliminates them:

#![allow(unused)]
fn main() {
pub struct IndexFreeDB {
    lattice: Lattice12288,
    cam: ContentAddressableMemory,
}

impl IndexFreeDB {
    pub fn insert(&mut self, record: Record) -> Address {
        // No index update needed - address IS the index
        let address = self.cam.compute_address(&record);
        self.lattice.store_at(address, record);
        address
    }

    pub fn lookup(&self, key: &Key) -> Option<Record> {
        // Direct content-based lookup - O(1)
        let address = self.cam.address_from_key(key);
        self.lattice.retrieve(address)
    }

    pub fn range_query(&self, start: &Key, end: &Key) -> Vec<Record> {
        // Exploit lattice ordering
        let start_addr = self.cam.address_from_key(start);
        let end_addr = self.cam.address_from_key(end);

        self.lattice.scan_range(start_addr, end_addr)
            .filter(|record| record.is_lawful())
            .collect()
    }
}
}

Query as Proof

Queries return proofs of their results:

#![allow(unused)]
fn main() {
pub struct ProofQuery {
    predicate: Predicate,
    witness_builder: WitnessBuilder,
}

impl ProofQuery {
    pub fn execute(&self, db: &IndexFreeDB) -> QueryResult {
        let mut results = Vec::new();
        let mut proof = QueryProof::new();

        // Scan relevant region
        for record in db.scan_predicate_region(&self.predicate) {
            if self.predicate.matches(&record) {
                // Build witness for this match
                let witness = self.witness_builder.build(&record);
                proof.add_witness(witness);
                results.push(record);
            } else {
                // Proof of non-match
                let non_match_proof = self.prove_non_match(&record);
                proof.add_exclusion(non_match_proof);
            }
        }

        QueryResult {
            records: results,
            proof,
            receipt: proof.compute_receipt(),
        }
    }

    fn prove_non_match(&self, record: &Record) -> ExclusionProof {
        // Construct proof that record doesn't match predicate
        ExclusionProof {
            record_receipt: record.compute_receipt(),
            predicate_hash: self.predicate.hash(),
            violation: self.predicate.find_violation(record),
        }
    }
}
}

Schema-Free Storage

The lattice naturally handles schema evolution:

#![allow(unused)]
fn main() {
pub struct SchemaFreeStore {
    type_registry: TypeRegistry,
    poly_storage: PolyOntologicalStorage,
}

impl SchemaFreeStore {
    pub fn store_poly(&mut self, obj: impl PolyOntological) -> Address {
        // Object carries its own type information
        let type_facets = obj.type_facets();
        let canonical_form = obj.to_canonical();

        // Register new types dynamically
        for facet in &type_facets {
            self.type_registry.register_if_new(facet);
        }

        // Store with type receipts
        let storage_receipt = Receipt::with_types(
            canonical_form.compute_receipt(),
            type_facets
        );

        self.poly_storage.store_with_receipt(canonical_form, storage_receipt)
    }

    pub fn query_by_type<T: TypeFacet>(&self) -> impl Iterator<Item = T> {
        self.poly_storage
            .scan_all()
            .filter(|obj| obj.has_facet::<T>())
            .map(|obj| obj.as_facet::<T>().unwrap())
    }
}
}

Perfect Hash Tables

Collision-Free Hash Tables

The perfect hash eliminates collision handling:

#![allow(unused)]
fn main() {
pub struct PerfectHashTable {
    lattice: Lattice12288,
    normalizer: GaugeNormalizer,
}

impl PerfectHashTable {
    pub fn insert(&mut self, key: Key, value: Value) -> Result<(), HashError> {
        // Normalize to canonical form
        let normal_form = self.normalizer.normalize(&key);

        // Compute perfect hash
        let address = Address::from_normal_form(&normal_form);

        // Check lawfulness
        if !self.is_lawful_address(address) {
            return Err(HashError::UnlawfulKey);
        }

        // Direct store - no collision possible for lawful keys
        self.lattice.store(address, value);
        Ok(())
    }

    pub fn get(&self, key: &Key) -> Option<Value> {
        let normal_form = self.normalizer.normalize(key);
        let address = Address::from_normal_form(&normal_form);

        self.lattice.retrieve(address)
    }

    fn is_lawful_address(&self, addr: Address) -> bool {
        // Verify address satisfies conservation laws
        let receipt = Receipt::from_address(addr);
        receipt.verify_at_budget_zero()
    }
}
}

Dynamic Perfect Hashing

Handles insertions without rehashing:

#![allow(unused)]
fn main() {
pub struct DynamicPerfectHash {
    primary: PerfectHashTable,
    overflow: BTreeMap<Address, Value>, // For budget > 0 items
    rebalance_threshold: f64,
}

impl DynamicPerfectHash {
    pub fn insert(&mut self, key: Key, value: Value) -> Address {
        let address = self.compute_address(&key);

        // Try primary table (budget = 0)
        match self.primary.insert(key.clone(), value.clone()) {
            Ok(_) => address,
            Err(_) => {
                // Store in overflow with budget cost
                self.overflow.insert(address, value);
                self.maybe_rebalance();
                address
            }
        }
    }

    fn maybe_rebalance(&mut self) {
        let overflow_ratio = self.overflow.len() as f64 / 12288.0;

        if overflow_ratio > self.rebalance_threshold {
            self.rebalance();
        }
    }

    fn rebalance(&mut self) {
        // Find better gauge normalization to minimize overflow
        let items: Vec<_> = self.overflow.iter().collect();
        let new_gauge = self.optimize_gauge(&items);

        // Re-normalize all items with new gauge
        for (key, value) in items {
            let renormalized = new_gauge.normalize(key);
            let _ = self.primary.insert(renormalized, value.clone());
        }

        self.overflow.clear();
    }
}
}

Deduplication by Design

Automatic Deduplication

Content addressing provides automatic deduplication:

#![allow(unused)]
fn main() {
pub struct DeduplicatingStore {
    content_store: ContentStore,
    reference_counter: ReferenceCounter,
}

impl DeduplicatingStore {
    pub fn store(&mut self, data: &[u8]) -> StoreResult {
        // Compute content address
        let address = Address::from_content(data);

        // Check if already stored
        if self.reference_counter.exists(address) {
            // Just increment reference count
            self.reference_counter.increment(address);
            return StoreResult::Duplicate(address);
        }

        // Store new content
        self.content_store.store(address, data);
        self.reference_counter.initialize(address, 1);

        StoreResult::Stored(address)
    }

    pub fn dedupe_ratio(&self) -> f64 {
        let total_references = self.reference_counter.total_references();
        let unique_objects = self.reference_counter.unique_count();

        1.0 - (unique_objects as f64 / total_references as f64)
    }
}
}

Merkle DAG Storage

Store structured data as content-addressed DAGs:

#![allow(unused)]
fn main() {
pub struct MerkleDAG {
    node_store: NodeStore,
    root_tracker: RootTracker,
}

impl MerkleDAG {
    pub fn store_tree(&mut self, tree: Tree) -> MerkleRoot {
        self.store_node(&tree.root)
    }

    fn store_node(&mut self, node: &TreeNode) -> Address {
        match node {
            TreeNode::Leaf(data) => {
                // Store leaf data
                let addr = self.node_store.store_leaf(data);
                addr
            }
            TreeNode::Branch(children) => {
                // Recursively store children
                let child_addrs: Vec<_> = children
                    .iter()
                    .map(|child| self.store_node(child))
                    .collect();

                // Store branch with child addresses
                let branch_data = BranchData {
                    child_addresses: child_addrs,
                    metadata: node.metadata(),
                };

                self.node_store.store_branch(&branch_data)
            }
        }
    }

    pub fn retrieve_tree(&self, root: MerkleRoot) -> Option<Tree> {
        self.retrieve_node(root.address()).map(|node| Tree { root: node })
    }

    fn retrieve_node(&self, addr: Address) -> Option<TreeNode> {
        self.node_store.retrieve(addr).map(|data| {
            match data {
                NodeData::Leaf(leaf_data) => TreeNode::Leaf(leaf_data),
                NodeData::Branch(branch_data) => {
                    let children = branch_data.child_addresses
                        .iter()
                        .filter_map(|addr| self.retrieve_node(*addr))
                        .collect();
                    TreeNode::Branch(children)
                }
            }
        })
    }
}
}

Transaction Processing

ACID Without Locks

Achieve ACID properties through receipts:

#![allow(unused)]
fn main() {
pub struct ReceiptTransaction {
    transaction_id: TransactionId,
    operations: Vec<Operation>,
    isolation_receipt: IsolationReceipt,
}

impl ReceiptTransaction {
    pub fn execute(&mut self, db: &mut Database) -> TransactionResult {
        // Atomicity: All-or-nothing via receipts
        let mut operation_receipts = Vec::new();

        for op in &self.operations {
            match self.execute_operation(op, db) {
                Ok(receipt) => operation_receipts.push(receipt),
                Err(e) => {
                    // Rollback using receipts
                    self.rollback(db, &operation_receipts);
                    return TransactionResult::Aborted(e);
                }
            }
        }

        // Consistency: Verify constraints via receipts
        if !self.verify_consistency(&operation_receipts) {
            self.rollback(db, &operation_receipts);
            return TransactionResult::ConstraintViolation;
        }

        // Isolation: Check no conflicts
        if !self.isolation_receipt.verify_no_conflicts(&operation_receipts) {
            self.rollback(db, &operation_receipts);
            return TransactionResult::IsolationViolation;
        }

        // Durability: Commit with combined receipt
        let commit_receipt = Receipt::combine(operation_receipts);
        db.commit(self.transaction_id, commit_receipt.clone());

        TransactionResult::Committed(commit_receipt)
    }

    fn rollback(&self, db: &mut Database, receipts: &[Receipt]) {
        // Receipts enable perfect rollback
        for receipt in receipts.iter().rev() {
            db.undo_operation(receipt);
        }
    }
}
}

Multi-Version Concurrency

MVCC through content addressing:

#![allow(unused)]
fn main() {
pub struct MVCCDatabase {
    versions: BTreeMap<Timestamp, Address>,
    active_transactions: HashMap<TransactionId, Timestamp>,
}

impl MVCCDatabase {
    pub fn begin_transaction(&mut self) -> Transaction {
        let timestamp = self.get_timestamp();
        let snapshot = self.versions
            .range(..=timestamp)
            .last()
            .map(|(_, addr)| *addr)
            .unwrap_or_default();

        Transaction {
            id: TransactionId::new(),
            snapshot_address: snapshot,
            timestamp,
        }
    }

    pub fn read(&self, tx: &Transaction, key: Key) -> Option<Value> {
        // Read from transaction's snapshot
        let snapshot = self.load_snapshot(tx.snapshot_address);
        snapshot.get(key)
    }

    pub fn write(&mut self, tx: &mut Transaction, key: Key, value: Value) {
        // Copy-on-write for isolation
        let mut snapshot = self.load_snapshot(tx.snapshot_address);
        snapshot.insert(key, value);

        // Store new version
        let new_address = self.store_snapshot(&snapshot);
        tx.snapshot_address = new_address;
    }

    pub fn commit(&mut self, tx: Transaction) -> CommitResult {
        // Check for conflicts
        let conflicts = self.check_conflicts(&tx);
        if !conflicts.is_empty() {
            return CommitResult::Conflict(conflicts);
        }

        // Add new version
        self.versions.insert(tx.timestamp, tx.snapshot_address);
        self.active_transactions.remove(&tx.id);

        CommitResult::Success(tx.snapshot_address)
    }
}
}

Query Optimization

Receipt-Guided Optimization

Use receipts to guide query planning:

#![allow(unused)]
fn main() {
pub struct ReceiptOptimizer {
    statistics: QueryStatistics,
    receipt_cache: ReceiptCache,
}

impl ReceiptOptimizer {
    pub fn optimize_query(&self, query: Query) -> OptimizedQuery {
        // Analyze query predicates
        let predicate_receipts = query.predicates
            .iter()
            .map(|p| p.compute_receipt())
            .collect::<Vec<_>>();

        // Check cache for similar queries
        let cached_plans = self.receipt_cache
            .find_similar(&predicate_receipts);

        if let Some(cached) = cached_plans.first() {
            // Reuse successful plan
            return self.adapt_plan(cached, &query);
        }

        // Build new plan
        let access_paths = self.enumerate_access_paths(&query);
        let costs = access_paths
            .iter()
            .map(|path| self.estimate_cost(path))
            .collect::<Vec<_>>();

        // Select minimum cost path
        let best_index = costs
            .iter()
            .position_min()
            .unwrap();

        OptimizedQuery {
            plan: access_paths[best_index].clone(),
            estimated_cost: costs[best_index],
            receipt: predicate_receipts,
        }
    }

    fn estimate_cost(&self, path: &AccessPath) -> Cost {
        // Use receipt statistics for cost estimation
        let selectivity = self.statistics
            .estimate_selectivity(&path.predicate_receipt());

        let io_cost = self.estimate_io(path, selectivity);
        let cpu_cost = self.estimate_cpu(path, selectivity);

        Cost {
            io: io_cost,
            cpu: cpu_cost,
            total: io_cost + cpu_cost,
        }
    }
}
}

Parallel Query Execution

#![allow(unused)]
fn main() {
pub struct ParallelExecutor {
    thread_pool: ThreadPool,
    partition_size: usize,
}

impl ParallelExecutor {
    pub fn execute_parallel(&self, query: Query) -> QueryResult {
        // Partition query space
        let partitions = self.partition_query(&query);

        // Execute partitions in parallel
        let futures: Vec<_> = partitions
            .into_iter()
            .map(|partition| {
                let query = query.clone();
                self.thread_pool.spawn(async move {
                    Self::execute_partition(partition, query)
                })
            })
            .collect();

        // Merge results
        let partial_results = join_all(futures);
        self.merge_results(partial_results)
    }

    fn partition_query(&self, query: &Query) -> Vec<Partition> {
        // Use lattice structure for partitioning
        let mut partitions = Vec::new();

        for page in 0..48 {
            partitions.push(Partition {
                page,
                predicate: query.predicate.clone(),
            });
        }

        partitions
    }
}
}

Storage Engines

Log-Structured Merge

LSM trees with perfect hashing:

#![allow(unused)]
fn main() {
pub struct PerfectLSM {
    memtable: PerfectHashTable,
    immutable_memtables: VecDeque<PerfectHashTable>,
    levels: Vec<Level>,
}

impl PerfectLSM {
    pub fn write(&mut self, key: Key, value: Value) {
        // Write to memtable
        if self.memtable.size() >= MEMTABLE_SIZE {
            self.flush_memtable();
        }

        self.memtable.insert(key, value);
    }

    fn flush_memtable(&mut self) {
        // Move to immutable
        let table = std::mem::replace(
            &mut self.memtable,
            PerfectHashTable::new()
        );
        self.immutable_memtables.push_back(table);

        // Trigger background compaction
        self.maybe_compact();
    }

    fn compact_level(&mut self, level: usize) {
        let current = &self.levels[level];
        let next = &mut self.levels[level + 1];

        // Merge with perfect deduplication
        let merged = self.merge_tables(current, next);

        // Replace levels
        self.levels[level] = Level::new();
        self.levels[level + 1] = merged;
    }
}
}

Column-Oriented Storage

#![allow(unused)]
fn main() {
pub struct ColumnStore {
    columns: HashMap<ColumnId, ColumnData>,
    row_receipts: Vec<Receipt>,
}

impl ColumnStore {
    pub fn insert_row(&mut self, row: Row) {
        // Decompose into columns
        for (col_id, value) in row.columns() {
            self.columns
                .entry(col_id)
                .or_insert_with(ColumnData::new)
                .append(value);
        }

        // Store row receipt for consistency
        let receipt = row.compute_receipt();
        self.row_receipts.push(receipt);
    }

    pub fn scan_column<T>(&self, col_id: ColumnId) -> impl Iterator<Item = T> {
        self.columns
            .get(&col_id)
            .map(|col| col.typed_iterator::<T>())
            .into_iter()
            .flatten()
    }

    pub fn vectorized_aggregate<T, R>(&self, col_id: ColumnId, agg: impl Fn(&[T]) -> R) -> R {
        let column = &self.columns[&col_id];
        let data = column.as_typed_slice::<T>();
        agg(data)
    }
}
}

Exercises

  1. Join Without Indexes: Implement a hash join algorithm that uses content addresses instead of building temporary hash tables.

  2. Time-Travel Queries: Design a temporal database that uses receipts to query any point in history with O(log n) overhead.

  3. Compressed Storage: Create a storage engine that uses receipt patterns to identify and compress repetitive structures.

  4. Distributed Joins: Implement a distributed join that uses receipt-based partitioning to minimize network transfer.

  5. Schema Migration: Design a schema migration system that uses poly-ontological types to evolve schemas without downtime.

Summary

The Hologram’s perfect hash and content-addressable memory fundamentally change database architecture. Index-free storage eliminates maintenance overhead while providing O(1) lookups. Automatic deduplication through content addressing reduces storage requirements without explicit management. Receipt-based transactions provide ACID guarantees without locks, and query optimization uses receipt statistics for intelligent planning. These patterns show that databases can be both simpler and more powerful when built on lawful foundations.

Further Reading

  • Chapter 4: Content-Addressable Memory - For CAM theory
  • Chapter 5: Lawfulness as a Type System - For poly-ontological storage
  • Chapter 21: Distributed Systems - For distributed database patterns
  • Chapter 23: Compiler Construction - For query compilation