**File: temporal-db/ARCHITECTURE.md**

# TemporalDB Architecture

This document describes the design decisions and architecture of TemporalDB.

## Core Concepts

TemporalDB was designed around several key concepts that enable efficient versioning of application data:

1. **Content-Addressable Storage**: Data is stored by its content hash, enabling deduplication and efficient storage of similar versions
2. **Merkle Trees**: Hierarchical hash trees that allow efficient comparison and partial updating
3. **Path-based Diffing**: Changes are tracked at specific paths in the data structure
4. **Branching Model**: Similar to Git, separate branches can evolve independently

## Components

### Storage Layer

The Storage class provides:

- Content-addressable object storage using IndexedDB
- Branch reference management
- Commit metadata storage
- Path utilities for working with nested objects

The storage is structured as three IndexedDB object stores:

- `objects`: Stores actual data keyed by content hash
- `refs`: Stores branch and tag references
- `commits`: Stores commit metadata with indices for efficient lookup

### Merkle Tree Implementation

The MerkleTree class:

- Builds hash trees from JavaScript objects
- Enables efficient structural sharing
- Allows retrieval of only changed portions of data
- Provides efficient comparison between object versions

Each node in the tree contains:

- A hash of its content
- The type of node (object, array, primitive)
- For objects/arrays: references to child nodes
- For primitives: the actual value

### Diff Engine

The Diff class provides:

- Generation of patches between object versions
- Application of patches to objects
- Conflict detection between diffs
- Inversion of diffs (for undo functionality)

Diffs are represented as:

- `added`: Paths and values that were added
- `modified`: Paths and values that were changed
- `deleted`: Paths that were removed

### Branch Management

The Branch class handles:

- Branch creation and switching
- Commit creation and storage
- Data retrieval for branches
- Time travel to previous states

Branches are implemented as references to commit hashes, similar to Git.

### Merge Engine

The Merge class provides:

- Three-way merging between branches
- Common ancestor detection
- Conflict identification
- Automated and manual conflict resolution

The merge algorithm:

1. Finds the common ancestor of two branches
2. Computes diffs from ancestor to both branches
3. Identifies conflicting changes
4. Automatically merges non-conflicting changes
5. Returns conflicts for manual resolution

## Design Decisions

### Why IndexedDB?

We chose IndexedDB as the storage backend because:

- It's available in all modern browsers
- It provides transactional access to data
- It supports larger datasets than localStorage
- It allows efficient key-based lookups

### Path-based vs. Full-Object Diffing

We use path-based diffing rather than full-object diffing because:

- It's more efficient for large nested objects
- It provides more precise conflict detection
- It allows for targeted updates of specific paths
- It makes conflict resolution more intuitive

### Content-Addressable Storage

Using content-addressable storage provides:

- Automatic deduplication of identical data
- Efficient storage of similar object versions
- Natural support for structural sharing
- Immutability of stored data

### Performance Considerations

Several optimizations were made for performance:

- Hash caching to avoid recalculating hashes
- Structural sharing of unchanged subtrees
- Lazy loading of data when traversing history
- Efficient indexing of commits by branch and timestamp

## Future Improvements

Potential future enhancements include:

- Garbage collection for orphaned objects
- Remote synchronization between instances
- Compression of stored data
- Customizable conflict resolution strategies
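
To make the content-addressable storage and Merkle tree ideas from the Components section more concrete, here is a minimal TypeScript sketch. It is not TemporalDB's actual code: `MerkleNode`, `put`, `load`, and the in-memory `objects` map (a stand-in for the IndexedDB `objects` store) are hypothetical names, and the 32-bit FNV-1a style hash is only a placeholder for a real cryptographic hash such as SHA-256.

```typescript
// Hypothetical sketch; none of these names are taken from TemporalDB's API.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface MerkleNode {
  hash: string;
  type: "object" | "array" | "primitive";
  children: { [key: string]: string }; // key or index -> child hash (empty for primitives)
  value: Json; // the actual value for primitives, null otherwise
}

// FNV-1a style 32-bit hash over UTF-16 code units. A placeholder only:
// a real store would need a collision-resistant hash.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// In-memory stand-in for the `objects` store, keyed by content hash.
const objects = new Map<string, MerkleNode>();

// Stores `data` as a tree of nodes and returns the root hash.
function put(data: Json): string {
  const children: { [key: string]: string } = {};
  let type: MerkleNode["type"] = "primitive";
  let payload: string;
  if (data === null || typeof data !== "object") {
    payload = JSON.stringify(data);
  } else {
    type = Array.isArray(data) ? "array" : "object";
    const record = data as { [key: string]: Json };
    // Object keys are sorted so that key order does not change the hash.
    const keys = type === "array" ? Object.keys(record) : Object.keys(record).sort();
    for (const key of keys) children[key] = put(record[key]);
    payload = keys.map((key) => JSON.stringify(key) + ":" + children[key]).join(",");
  }
  const hash = fnv1a(type + "|" + payload);
  // Identical content maps to the same hash, so it is stored only once.
  if (!objects.has(hash)) {
    objects.set(hash, { hash, type, children, value: type === "primitive" ? data : null });
  }
  return hash;
}

// Rebuilds a value from its root hash.
function load(hash: string): Json {
  const node = objects.get(hash);
  if (!node) throw new Error("missing object " + hash);
  if (node.type === "primitive") return node.value;
  const children = node.children;
  if (node.type === "array") return Object.keys(children).map((i) => load(children[i]));
  const out: { [key: string]: Json } = {};
  for (const key of Object.keys(children)) out[key] = load(children[key]);
  return out;
}

// Two versions that differ only in `tags` share the stored `user` subtree.
const v1 = put({ user: { name: "Ada" }, tags: ["a", "b"] });
const v2 = put({ user: { name: "Ada" }, tags: ["a", "c"] });
const root1 = objects.get(v1);
const root2 = objects.get(v2);
const sharedUser =
  root1 !== undefined && root2 !== undefined &&
  root1.children["user"] === root2.children["user"];
```

In this sketch, comparing two versions only requires descending into children whose hashes differ, which is the property the Merkle tree section relies on for efficient comparison.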
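
The `added` / `modified` / `deleted` representation described under Diff Engine could look roughly like the sketch below. Several details are assumptions made for illustration, not confirmed behavior: paths are dot-separated strings, arrays are compared as whole values, and `deleted` keeps the removed values so that a diff can be inverted. The function names (`diff`, `invert`, `applyDiff`) are hypothetical.

```typescript
// Hypothetical sketch; the real Diff class may differ.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Doc = { [key: string]: Json };

interface Diff {
  added: { [path: string]: Json };
  modified: { [path: string]: { from: Json; to: Json } };
  deleted: { [path: string]: Json }; // old values kept so the diff can be inverted
}

function isPlainObject(v: Json): v is Doc {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function has(obj: Doc, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Recursively records changes between two values under `path`.
function walk(before: Json, after: Json, path: string, out: Diff): void {
  if (isPlainObject(before) && isPlainObject(after)) {
    for (const key of Object.keys(before)) {
      const child = path === "" ? key : path + "." + key;
      if (has(after, key)) walk(before[key], after[key], child, out);
      else out.deleted[child] = before[key];
    }
    for (const key of Object.keys(after)) {
      if (!has(before, key)) out.added[path === "" ? key : path + "." + key] = after[key];
    }
  } else if (JSON.stringify(before) !== JSON.stringify(after)) {
    // Primitives and arrays are treated as atomic values in this sketch.
    out.modified[path] = { from: before, to: after };
  }
}

function diff(before: Doc, after: Doc): Diff {
  const out: Diff = { added: {}, modified: {}, deleted: {} };
  walk(before, after, "", out);
  return out;
}

// Swaps the direction of a diff, which is what undo needs.
function invert(d: Diff): Diff {
  const modified: Diff["modified"] = {};
  for (const path of Object.keys(d.modified)) {
    modified[path] = { from: d.modified[path].to, to: d.modified[path].from };
  }
  return { added: d.deleted, modified, deleted: d.added };
}

// Finds the object that directly contains the last segment of `path`.
function locate(doc: Doc, path: string): { parent: Doc; key: string } {
  const parts = path.split(".");
  let node: Doc = doc;
  for (const part of parts.slice(0, -1)) {
    const next = node[part];
    if (!isPlainObject(next)) throw new Error("no object at " + part);
    node = next;
  }
  return { parent: node, key: parts[parts.length - 1] };
}

// Applies a diff to a deep copy of `doc` and returns the copy.
function applyDiff(doc: Doc, d: Diff): Doc {
  const copy: Doc = JSON.parse(JSON.stringify(doc));
  for (const path of Object.keys(d.deleted)) {
    const { parent, key } = locate(copy, path);
    delete parent[key];
  }
  for (const path of Object.keys(d.added)) {
    const { parent, key } = locate(copy, path);
    parent[key] = d.added[path];
  }
  for (const path of Object.keys(d.modified)) {
    const { parent, key } = locate(copy, path);
    parent[key] = d.modified[path].to;
  }
  return copy;
}

const before: Doc = { user: { name: "Ada", age: 36 }, theme: "dark" };
const after: Doc = { user: { name: "Ada", age: 37, email: "ada@example.com" } };
const changes = diff(before, after);
const redone = applyDiff(before, changes); // same content as `after`
const undone = applyDiff(after, invert(changes)); // same content as `before`
```

Here `changes` records `user.email` as added, `user.age` as modified, and `theme` as deleted; the unchanged `user.name` path does not appear at all, which is what makes path-level conflict detection possible.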
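
The five steps of the merge algorithm listed under Merge Engine can be sketched as follows. This is an illustration under simplifying assumptions rather than the library's implementation: each commit has a single parent, snapshots are already flattened into path-to-value maps, and a conflicting path keeps its ancestor value until it is resolved manually. All names (`Commit`, `findCommonAncestor`, `threeWayMerge`) are hypothetical.

```typescript
// Hypothetical sketch; the real Merge class may differ.
type Value = string | number | boolean | null;
type Snapshot = { [path: string]: Value }; // flattened path -> value

interface Commit {
  id: string;
  parent: string | null; // single-parent history only, for simplicity
  data: Snapshot;
}

interface Conflict {
  path: string;
  base: Value | undefined;
  ours: Value | undefined;
  theirs: Value | undefined;
}

function parentOf(commits: { [id: string]: Commit }, id: string): string | null {
  const commit = commits[id];
  if (!commit) throw new Error("unknown commit " + id);
  return commit.parent;
}

// Step 1: walk up from `a`, then walk up from `b` until a shared commit is hit.
function findCommonAncestor(
  commits: { [id: string]: Commit },
  a: string,
  b: string
): string | null {
  const seen = new Set<string>();
  let cur: string | null = a;
  while (cur !== null) {
    seen.add(cur);
    cur = parentOf(commits, cur);
  }
  cur = b;
  while (cur !== null) {
    if (seen.has(cur)) return cur;
    cur = parentOf(commits, cur);
  }
  return null;
}

// Steps 2 to 5: compare each path against the ancestor on both sides.
function threeWayMerge(
  base: Snapshot,
  ours: Snapshot,
  theirs: Snapshot
): { merged: Snapshot; conflicts: Conflict[] } {
  const merged: Snapshot = {};
  const conflicts: Conflict[] = [];
  const paths = new Set<string>(
    Object.keys(base).concat(Object.keys(ours), Object.keys(theirs))
  );
  paths.forEach((path) => {
    const b: Value | undefined = base[path];
    const o: Value | undefined = ours[path];
    const t: Value | undefined = theirs[path];
    let result: Value | undefined;
    if (o === t) result = o; // unchanged, or changed identically on both sides
    else if (o === b) result = t; // only theirs changed
    else if (t === b) result = o; // only ours changed
    else {
      // Both sides changed the same path differently: report it for manual resolution.
      conflicts.push({ path, base: b, ours: o, theirs: t });
      result = b;
    }
    if (result !== undefined) merged[path] = result; // undefined means the path was deleted
  });
  return { merged, conflicts };
}

const commits: { [id: string]: Commit } = {
  c0: { id: "c0", parent: null, data: { "user.name": "Ada", "user.age": 36, theme: "dark" } },
  c1: { id: "c1", parent: "c0", data: { "user.name": "Ada", "user.age": 37 } },
  c2: { id: "c2", parent: "c0", data: { "user.name": "Ada Lovelace", "user.age": 40, theme: "dark" } },
};

const ancestor = findCommonAncestor(commits, "c1", "c2");
const base: Snapshot = ancestor === null ? {} : commits[ancestor].data;
const outcome = threeWayMerge(base, commits["c1"].data, commits["c2"].data);
```

In this example the rename of `user.name` (theirs only) and the removal of `theme` (ours only) merge automatically, while `user.age`, changed on both sides to different values, is returned as a conflict.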