@trap_stevo/star-vault

# 🚀 Star-Vault

**Deterministic data engine that eliminates query-time joins and enables normalized data execution.**

Architect secure, scalable, real-time systems with integrated sharding, encryption, and event-driven data flows. Manage hierarchical structures, execute advanced queries, and power enterprise systems, real-time dashboards, collaborative platforms, IoT infrastructures, and complex data-driven applications.

Traditional systems force a fundamental tradeoff:

- relational databases preserve structure but depend on runtime joins
- NoSQL systems remove joins but sacrifice integrity through denormalization

**Star-Vault introduces Hyper-Normalization™, a model that resolves relationships deterministically within the storage and cache layers instead of reconstructing them during queries.**

This model delivers:

- near-constant-time access across normalized data structures
- zero join planning or query optimization overhead
- deterministic traversal across collections
- real-time consistency without duplication

Star-Vault integrates sharding, WAL, caching, and real-time events into a unified execution engine for high-performance, structure-driven systems.

> ⚡ In Star-Vault, normalization stops costing performance and starts driving it.

## 📘 Table of Contents

- [🌌 Features](#-features)
- [⚙️ System Requirements](#️-system-requirements)

### API Overview

- [📜 API Specifications](#-api-specifications)
- [🔧 Constructor](#-constructor)
- [🔐 authOptions (StarAuth Configuration)](#-authoptions-starauth-configuration)
- [🔒 Lockout Configuration](#-lockout-configuration)
- [Timing Heuristics](#-timing-heuristics)
- [Password Management](#-password-management)
- [🌍 Star Locator Configuration](#-star-locator-configuration)
- [Stellar Login](#stellar-login)
- [Hooks & Extensions](#hooks--extensions)
- [Guest Session Configurations](#guest-session-configurations)
- [Cleanup & Locking](#cleanup--locking)

### Core Engine

- [📦 Core Methods](#-core-methods)
- [📥 Import & Snapshot Methods](#-import--snapshot-methods)
- [🧾 StarTransactions (WAL + Idempotency)](#-startransactions-wal--idempotency)

### Query System

- [🔍 Query Engine](#-query-engine)
- [⚙️ Return Modes](#️-return-modes)
- [Execution Modes](#execution-modes)
- [Methods](#methods)
- [Examples](#examples)
- [⚙️ Advanced Controls](#️-advanced-controls)
- [📄 Cursor-Based Pagination](#️-cursor-based-pagination)
- [Performance & Behavior](#performance--behavior)
- [Developer Note](#developer-note)

### Hyper Normalization

- [⚡️ Hyper-Normalization™](#️-hyper-normalization)
- [🧬 Origin of Hyper-Normalization™](#-origin-of-hyper-normalization)
- [🌌 Why Hyper-Normalization Matters](#-why-hyper-normalization-matters)
- [🧩 User Display Example](#-example--the-user-display-case)
- [Benchmark & Reproducibility](#benchmark--reproducibility)
- [🚀 Performance Implications](#-performance-implications)

### Authentication

- [🛡️ Authentication](#️-authentication)
- [👤 User Registration & Authentication](#-user-registration--authentication)
- [🛰️ Session Management](#-session-management)
- [🔑 Password & Magic Link Recovery](#-password--magic-link-recovery)
- [👥 Guest Account Management](#-guest-account-management)
- [🔁 User Lifecycle Management](#-user-lifecycle-management)
- [📊 User & Session Querying](#-user--session-querying)
- [🧩 Internal Validation & Utility Methods](#-internal-validation--utility-methods)

### History + Logging

- [🧭 Record History & Timeline](#-record-history--timeline)
- [🧾 Auditing](#-auditing-history--timeline-reads)
- [🪶 Debug & Developer Logging Options](#-debug--developer-logging-options)

### Logger

- [🛰️ StarLogger](#-starlogger)
- [🔧 Core Logger Methods](#-core-logger-methods)
- [Write, System, and Audit Events](#write-system-and-audit-events)
- [Recovery, Query & Maintenance](#recovery-query--maintenance)
- [Rotation & Retention](#rotation--retention-built-in)

### Storage Layer

- [📂 Starchive (Storage Layer)](#-starchive-storage-layer)
- [⚙️ Starchive Configuration Modes](#️-starchive-configuration-modes)
- [⚙️ Starchive Configurations](#️-starchive-configurations)
- [📦 Starchive Core Methods](#-starchive-core-methods)
- [✍️ Starchive Convenience Methods](#️-starchive-convenience-methods)
- [🛰️ Remote Management Helpers](#-remote-management-helpers)
- [🔏 Starchive Signing & Presigning](#-starchive-signing--presigning)
- [🔏 Starchive Downloading](#-starchive-downloading)
- [📜 Starchive Records](#-starchive-records)

### Event System

- [🎧 Event Listening](#-event-listening)
- [🌌 Collection & Path Matching](#-collection--path-matching)
- [🌌 StarVault Events](#-starvault-events)

### Vacuum System

- [🧽 Vacuum & Auto-Maintenance System](#-vacuum--auto-maintenance-system)
- [⚙️ Automatic Vacuum Policy](#️-automatic-vacuum-policy)
- [🪄 Methods](#-methods-1)
- [🪶 Example](#-example)
- [⚡ Auto-Vacuum Triggers](#️-auto-vacuum-triggers)
- [📊 Metrics Example](#-metrics-example)

### Getting Started

- [✨ Getting Started](#-getting-started)
- [📦 Installation](#-installation)
- [Basic Usage](#basic-usage)
- [📈 Querying](#-querying)
- [🎧 Listening to Changes](#-listening-to-changes)
- [🌐 Wildcard Collection Queries](#-wildcard-collection-queries)
- [👥 Registering Users](#-registering-users-guest-and-normal)
- [📋 Listing Users and Sessions](#-listing-users-and-sessions)
- [🔍 Quick Starchive Examples](#-quick-starchive-examples)
- [✨ License](#-license)
- [🚀 Transform Data into Action](#-transform-data-into-action)

---

## 🌌 Features

- 🔐 **Optional Encryption** – Secure sensitive data at rest
- ⚙️ **Sharded Storage Engine** – Efficiently scales writes across shards
- 🧠 **In-Memory Caching** – High-speed read layer with `StarCache`
- 📜 **Write-Ahead Logging** – Resilient logs with rotation and retention policies
- 🔍 **Advanced Query Engine** – Chainable and expressive queries with filtering, search, sorting, and spatial support
- 🚀 **Real-Time Event Emission** – Listen to data changes with fine-grained control
- 🛡️ **Authentication Layer** – Optional handler to authorize every operation
- 🌠 **Collection Wildcards** – Seamlessly operate across multiple collections

---

## ⚙️ System Requirements

| Requirement | Version |
|---|---|
| **Node.js** | ≥ 19.x |
| **npm** | ≥ 9.x (recommended) |
| **OS** | Windows, macOS, Linux |

---

# 📜 API Specifications

## 🔧 Constructor

Use these parameters when initializing `StarVault`.

```js
new StarVault(dbPath, logPath, shardCount, maxLogSize, logRetention, options)
```

### Core Parameters

| Key | Description | Default |
|---|---|---|
| `dbPath` | Root directory for all collection data; each collection is stored in sharded files. | **Required** |
| `logPath` | Directory where write-ahead logs (WAL) are stored for durability and replay. | **Required** |
| `shardCount` | Number of shards used per collection. More shards improve concurrency. | `4` |
| `maxLogSize` | Max size of each WAL file before rotation. Accepts values like `"100MB"`. | `"869MB"` |
| `logRetention` | Duration to retain old WAL files. Accepts values like `"1w"` or `"30d"`. | `"1w"` |

### `options` Object

| Key | Description | Default |
|---|---|---|
| `enableEncryption` | Enables encryption for data at rest. | `false` |
| `vaultPath` | Path to the vault metadata file used for encryption. | `"./star-vault.json"` |
| `masterKey` | Master encryption key for vault encryption. Generate and store it securely. | `null` |
| `authHandler` | Custom function that receives `clientAuth` and returns `true`/`false`. | `null` |
| `authOptions` | Configuration object passed to internal `StarAuth` for authentication. | `{}` |
| `auditHistory` | Turns on default auditing for history/timeline reads (writes entries to `audit.log`). Can also be toggled per call. | `false` |
| `enableMesh` | Turns on the StarMesh sync layer. | `false` |
| `storageOptions` | Configuration object passed to the main storage engine for controlling low-level storage behavior, metrics flushing, shard hashing, and the automatic, idle-aware background compaction scheduler. | `{}` |
| `meshOptions` | Options passed to `StarMesh` when `enableMesh` = `true`. | `{}` |
| `starchive` | Configuration for the `Starchive` subsystem (local and remote file management). | `{}` |
| `queryMetrics` | Query metrics configuration object or `false` to disable metrics. | `{ enabled : true }` |
| `debugOptions` | Fine-grained developer logging controls for internal tracing (see below). | `{ all flags default to true }` |
| `dirMode` | UNIX permission mode for directories (e.g., `0o700`). | `0o700` |
| `fileMode` | UNIX permission mode for files (e.g., `0o600`). | `0o600` |

---

## storageOptions (Main Storage Configuration)

These options control low-level storage behavior, metrics flushing, shard hashing, and the automatic, idle-aware background compaction scheduler.

| Key | Type | Description | Default |
|---|---|---|---|
| `idxFlushIntervalMs` | `number` | Interval (ms) for flushing pending `.idx` writes when `idxSyncMode` is set to `interval`. Lower values improve durability and cross-process visibility; higher values reduce IO overhead. Minimum enforced value is `10`. | `50` |
| `idxSyncMode` | `"interval" \| "always"` | Controls how `.idx` files are synced to disk. `"interval"` batches fsync calls based on `idxFlushIntervalMs`; `"always"` fsyncs on every index write for maximum durability at the cost of IO. | `"always"` |
| `onWriteError` | `function` | Optional callback invoked when a write fails prior to persistence (for example, due to serialization failure). Receives the error and contextual metadata (`collection`, `id`, `value`). The write skips persistence and creates no pending or on-disk state. |
| `metricsFlushIntervalMs` | `number` | Interval (ms) for flushing in-memory storage metrics (`fileBytes`, `liveBytes`, update counters) to disk. Lower values increase durability; higher values reduce IO. | `50` |
| `shardHashFn` | `(id : string) => number` | Optional custom hash function used to determine shard placement. The return value is reduced modulo the shard count. | `undefined` |
| `compactionEnabled` | `boolean` | Enable or disable background compaction entirely. Manual `vacuum*()` APIs still work when disabled. | `true` |
| `compactionIdleMs` | `number` | Minimum time (ms) a collection must be idle (no writes) before it becomes eligible for background compaction. | `2500` |
| `compactionCooldownMs` | `number` | Cooldown period (ms) after a collection is compacted before it may be compacted again by the scheduler. | `60000` |
| `compactionTickMs` | `number` | Scheduler tick interval (ms). Determines how often pending compaction work is evaluated. | `200` |
| `compactionMaxCollectionsPerTick` | `number` | Maximum number of collections that may be compacted in a single scheduler tick. | `1` |
| `compactionMaxShardsPerCollectionPerTick` | `number` | Maximum number of shards compacted per collection per tick. Keeps pauses small and predictable. | `1` |

---

## 🔐 authOptions (StarAuth Configuration)

Pass this object under the `options.authOptions` key when creating your `StarVault` instance.

### Collection Settings

| Key | Description | Default |
|---|---|---|
| `starAuthEnabled` | Global switch that controls whether the StarAuth subsystem activates. Set to `false` to disable authentication features entirely while keeping vault access open. | `true` |
| `collection` | User collection name | `"auth-users"` |
| `sessionCollection` | Session tracking collection | `"auth-sessions"` |
| `resetCollection` | Password reset token collection | `"auth-resets"` |
| `lockCollection` | Lockout state tracking collection (per account/IP/UA). | `"auth-locks"` |
| `stellarCollection` | Magic link / code token collection | `"stellar-auths"` |

### Session Behavior

These options control the creation, validation, and protection of sessions against brute-force attacks.

| Key | Description | Default |
|---|---|---|
| `tokenExpiry` | Lifetime of each session token in seconds. | `3600` (1 hour) |
| `sessionValidationFields` | Fields compared on every request to detect hijacking (e.g. IP, fingerprint). | `["ip", "fingerprint"]` |
| `strictSessionValidation` | If `true`, any mismatch in validation fields invalidates the session. | `true` |
| `enableSuspiciousCheck` | If `true`, compares geo/IP against recent sessions to flag anomalies. | `true` |
| `sessionPolicy` | Session strategy: `"default"` (allow many), `"strict"` (1 active max), `"replace"` (replace old). | `"default"` |

---

### 🔒 Lockout Configuration

Controls the tracking, weighting, and escalation behavior of failed login attempts. All options go inside `options.authOptions.lockout`.

| Key | Description | Default |
|---|---|---|
| `strategy` | Lockout algorithm: `"fixed"`, `"slidingWindow"`, or `"exponential"`. | `"fixed"` |
| `scopes` | Dimensions to track failures against: `["account", "ip", "ipua"]`. | `["account", "ipua"]` |
| `maxAttempts` | Failures allowed before triggering lock. | `5` |
| `baseDuration` | Base lockout duration in ms. | `15 * 60 * 1000` |
| `maxDuration` | Max lockout duration in ms (for exponential strategy). | `24 * 60 * 60 * 1000` |
| `minDuration` | **Absolute minimum** lock duration in ms (applies to all strategies; jitter won’t go below this). | `30 * 1000` |
| `jitterDuration` | Random jitter ± ms added to lock duration. | `15 * 1000` |
| `decayDuration` | Auto-decay failures if no attempts within this duration. | `30 * 60 * 1000` |
| `windowDuration` | For `slidingWindow`, the rolling time window in ms. | `10 * 60 * 1000` |
| `windowThreshold` | For `slidingWindow`, number of failures in the window that triggers lock. | `7` |
| `windowGrowth` | For `slidingWindow`, how lock duration grows when over threshold: `"none" \| "linear" \| "exponential"`. | `"none"` |
| `windowGrowthFactor` | For `slidingWindow` with `windowGrowth: "exponential"`: multiplier per extra failure over threshold. | `2` |
| `windowGrowthStep` | For `slidingWindow` with `windowGrowth: "linear"`: extra ms per extra failure over threshold. | `baseDuration / 2` |
| `captchaAfter` | Number of failures after which a CAPTCHA challenge is required. | `null` |
| `otpAfter` | Number of failures after which an OTP challenge is required. | `null` |
| `captchaVerifier` | Function `(req) => boolean` to validate CAPTCHA. | `null` |
| `otpVerifier` | Function `({ email, code }) => boolean` to validate OTP. | `null` |
| `onLock` | Callback `(info) => void` fired when a lock is applied. | `null` |
| `onUnlock` | Callback `(info) => void` fired when a lock is cleared. | `null` |
| `onChallenge` | Callback `(info) => void` fired when a CAPTCHA/OTP challenge is required. | `null` |
| `notifyLock` | Optional notifier `(info) => void` for alerts/logging. | `null` |
| `bypassed` | Function `(req) => boolean` to completely bypass lockout logic for trusted requests. | `null` |

---

#### Timing Heuristics

`options.authOptions.lockout.timingHeuristics` adds intelligence by classifying failures as bursts, normal, or human-like delays.

| Key | Description | Default |
|---|---|---|
| `enabled` | Enable timing heuristics. | `true` |
| `burstDetectionDuration` | Attempts within this ms count as a burst (weighted more). | `3000` |
| `humanDetectionDuration` | Attempts slower than this ms count as human/grace. | `45000` |
| `maxConsiderDuration` | Ignore intervals longer than this ms. | `600000` |
| `burstWeight` | Weight applied to burst failures. | `2.0` |
| `normalWeight` | Weight applied to normal failures. | `1.0` |
| `graceWeight` | Weight applied to grace (slow) failures. | `0.0` |
| `emaAlpha` | Alpha for exponential moving average of attempt intervals. | `0.3` |
| `decayOnGrace` | Whether to decay counters when a grace attempt is detected. | `true` |
| `decayStep` | How many failures to subtract during a grace decay. | `1` |
| `minFailures` | Clamp: minimum number of failures tracked. | `0` |
| `maxFailures` | Clamp: maximum number of failures tracked. | `9999` |

---

#### Example Fixed Lockout Config

```js
authOptions: {
  lockout: {
    strategy: "fixed",
    scopes: ["account", "ipua"],
    maxAttempts: 5,                 // lock after N total failures
    baseDuration: 15 * 60 * 1000,   // fixed lock duration (with jitter)
    minDuration: 30 * 1000,
    jitterDuration: 15 * 1000,      // desync scripted retries
    decayDuration: 30 * 60 * 1000,  // clear counters after idle period

    // Optional human gates
    captchaAfter: 3,
    otpAfter: null,
    captchaVerifier: async (req) => verifyCaptcha(req.body.token),

    // Heuristics amplify bursts, forgive human pace
    timingHeuristics: {
      enabled: true,
      burstDetectionDuration: 3000,
      humanDetectionDuration: 45000,
      burstWeight: 2.0,
      normalWeight: 1.0,
      graceWeight: 0.0,
      emaAlpha: 0.3,
      decayOnGrace: true,
      decayStep: 1,
      minFailures: 0,
      maxFailures: 9999
    },

    onLock: (info) => console.log("LOCKED", info),
    onUnlock: (info) => console.log("UNLOCKED", info),
    onChallenge: (info) => console.log("CHALLENGE", info),
    bypassed: (req) => false
  }
}
```

👉 When to use: Simple setups where predictability matters. Good for smaller apps, prototypes, or admin dashboards where you want consistent enforcement.

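
To make the timing-heuristic weights more concrete, here is a minimal sketch of how burst, normal, and grace attempts could add up to a weighted failure count. `weightedFailures` is a hypothetical helper, not part of the Star-Vault API. The interval boundaries (`<=` and `>=`) and the treatment of the first attempt as "normal" are assumptions, and `emaAlpha` and `maxConsiderDuration` are left out for brevity.

```js
// Hypothetical illustration only: Star-Vault's internal scoring may differ.
function weightedFailures(attemptTimestamps, heuristics) {
  const {
    burstDetectionDuration, humanDetectionDuration,
    burstWeight, normalWeight, graceWeight,
    decayOnGrace, decayStep, minFailures, maxFailures
  } = heuristics;

  let failures = 0;
  for (let i = 0; i < attemptTimestamps.length; i++) {
    // The first attempt has no previous interval; assume it counts as "normal".
    const gap = i === 0 ? null : attemptTimestamps[i] - attemptTimestamps[i - 1];

    if (gap !== null && gap <= burstDetectionDuration) {
      failures += burstWeight;                  // rapid retry: weighted more
    } else if (gap !== null && gap >= humanDetectionDuration) {
      failures += graceWeight;                  // slow, human-paced retry
      if (decayOnGrace) failures -= decayStep;  // forgive one earlier failure
    } else {
      failures += normalWeight;
    }
    // Keep the counter inside the configured clamp.
    failures = Math.min(maxFailures, Math.max(minFailures, failures));
  }
  return failures;
}

const heuristics = {
  burstDetectionDuration: 3000, humanDetectionDuration: 45000,
  burstWeight: 2.0, normalWeight: 1.0, graceWeight: 0.0,
  decayOnGrace: true, decayStep: 1, minFailures: 0, maxFailures: 9999
};

const scripted = weightedFailures([0, 500, 1000], heuristics);   // three rapid failures
const human = weightedFailures([0, 60000, 120000], heuristics);  // one failure per minute
console.log(scripted, human); // 5 0
```

Under these assumptions, three scripted failures half a second apart already reach a `maxAttempts` of `5`, while the same number of failures at a human pace decays back to zero.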
---

#### Example Sliding Window Lockout Config

```js
authOptions: {
  lockout: {
    strategy: "slidingWindow",
    scopes: ["account", "ipua"],
    windowDuration: 10 * 60 * 1000,
    windowThreshold: 7,
    baseDuration: 60 * 1000,
    minDuration: 30 * 1000,
    maxDuration: 24 * 60 * 60 * 1000,
    jitterDuration: 15 * 1000,
    windowGrowth: "exponential",
    windowGrowthFactor: 2,
    // windowGrowthStep: 30 * 1000, // if using "linear"

    // Optional Human gates
    captchaAfter: 3,
    otpAfter: null,
    captchaVerifier: async (req) => verifyCaptcha(req.body.token),

    // Heuristics amplify bursts, forgive human pace
    timingHeuristics: {
      enabled: true,
      burstDetectionDuration: 2000,
      humanDetectionDuration: 45000,
      burstWeight: 2.5,
      normalWeight: 1.0,
      graceWeight: 0.0,
      emaAlpha: 0.35,
      decayOnGrace: true,
      decayStep: 1,
      minFailures: 0,
      maxFailures: 9999
    },

    onLock: (info) => console.log("LOCKED", info),
    onUnlock: (info) => console.log("UNLOCKED", info),
    onChallenge: (info) => console.log("CHALLENGE", info),
    bypassed: (req) => false
  }
}
```

👉 When to use: Ideal for games and consumer apps. It’s forgiving to legit players (counters decay naturally) but still stops brute-force bursts quickly.

---

#### Example Exponential Lockout Config

```js
authOptions: {
  lockout: {
    strategy: "exponential",
    scopes: ["account", "ipua"],
    maxAttempts: 5,
    baseDuration: 30 * 1000,
    minDuration: 30 * 1000,
    maxDuration: 24 * 60 * 60 * 1000,
    jitterDuration: 10 * 1000,

    captchaAfter: 3,
    otpAfter: 5,
    captchaVerifier: async (req) => verifyCaptcha(req.body.token),
    otpVerifier: async ({ email, code }) => verifyOtp(email, code),

    onLock: (info) => console.log("LOCKED", info),
    onUnlock: (info) => console.log("UNLOCKED", info),

    timingHeuristics: {
      enabled: true,
      burstWeight: 3.0,
      graceWeight: 0.0,
      emaAlpha: 0.4
    }
  }
}
```

👉 When to use: Best for high-security apps, admin portals, or payment flows. Makes repeated brute-forcing impractical, but can be harsh for end-users if tuned too aggressively.

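
As a rough guide to the sliding-window growth options (`windowGrowth`, `windowGrowthFactor`, `windowGrowthStep`), the sketch below computes a lock duration from the number of failures in the window. `slidingWindowLockDuration` is a hypothetical function written for this illustration. It assumes the lock starts once failures reach `windowThreshold`, that "extra failures" are counted from that threshold, and it omits jitter so the result stays deterministic.

```js
// Hypothetical illustration only: not Star-Vault's internal implementation.
function slidingWindowLockDuration(failuresInWindow, cfg) {
  const { windowThreshold, baseDuration, minDuration, maxDuration } = cfg;
  const growth = cfg.windowGrowth ?? "none";
  const factor = cfg.windowGrowthFactor ?? 2;
  const step = cfg.windowGrowthStep ?? baseDuration / 2;

  if (failuresInWindow < windowThreshold) return 0; // below threshold: no lock

  const extra = failuresInWindow - windowThreshold; // failures past the threshold
  let duration = baseDuration;
  if (growth === "linear") {
    duration = baseDuration + extra * step;              // fixed ms per extra failure
  } else if (growth === "exponential") {
    duration = baseDuration * Math.pow(factor, extra);   // multiply per extra failure
  }
  // Clamp into [minDuration, maxDuration].
  return Math.min(maxDuration, Math.max(minDuration, duration));
}

const windowCfg = {
  windowThreshold: 7,
  baseDuration: 60 * 1000,
  minDuration: 30 * 1000,
  maxDuration: 24 * 60 * 60 * 1000,
  windowGrowth: "exponential",
  windowGrowthFactor: 2
};

console.log(slidingWindowLockDuration(6, windowCfg));  // 0 (under the threshold)
console.log(slidingWindowLockDuration(7, windowCfg));  // 60000 (base duration)
console.log(slidingWindowLockDuration(9, windowCfg));  // 240000 (60000 * 2^2)
console.log(slidingWindowLockDuration(30, windowCfg)); // 86400000 (capped at maxDuration)
```

With `windowGrowth: "linear"` and `windowGrowthStep: 30 * 1000`, the same nine failures would give `60000 + 2 * 30000 = 120000` ms under these assumptions.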
---

#### 🧭 Legacy Lockout

These are the original “simple” lockout knobs. They remain in place for backward compatibility and apply **in addition** to the modern `authOptions.lockout` logic.

| Key | Description | Default |
|---|---|---|
| `lockoutDuration` | **(Legacy)** Fixed lock duration (ms) after too many failed logins. | `900000` (15 minutes) |
| `maxLoginAttempts` | **(Legacy)** Number of failed attempts before the legacy lockout triggers. | `5` |

**How they interact:**

- On every failed login, the modern lockout mechanism (if configured) evaluates first.
- The legacy counters also increment and, if `failedAttempts >= maxLoginAttempts`, a fixed lock of `lockoutDuration` is applied.
- This ensures older deployments keep their original behavior while you migrate to the new model.

**Recommendation:**

- Prefer the new `authOptions.lockout` settings for fine-grained control.
- If you want the legacy lockout to *rarely* affect users, set a **very large** `maxLoginAttempts` (e.g., `1_000_000`) and leave modern lockout to do the real work.

**Example (modern lockout active, legacy effectively neutralized):**

```js
authOptions: {
  // Modern lockout (recommended)
  lockout: {
    strategy: "exponential",
    scopes: ["account", "ipua"],
    maxAttempts: 5,
    baseDuration: 15 * 60 * 1000,      // 15 minutes
    maxDuration: 24 * 60 * 60 * 1000   // 24 hours
    // ...captcha/otp/timingHeuristics, etc.
  },

  // Legacy lockout: threshold set very high so it rarely triggers
  maxLoginAttempts: 1_000_000,
  lockoutDuration: 15 * 60 * 1000
}
```

---

### Password Management

| Key | Description | Default |
|---|---|---|
| `passwordRequirements` | Object defining the password policy | `{ minLength: 8, requireLetter: true, requireNumber: true, requireSymbol: false }` |
| └ `minLength` | Minimum number of characters required | `8` |
| └ `requireLetter` | Requires at least one letter (a-z or A-Z) | `true` |
| └ `requireNumber` | Requires at least one numeric digit (0-9) | `true` |
| └ `requireSymbol` | Requires at least one symbol (e.g., `!@#$%^&*`) | `false` |
| └ `customValidator` | Optional custom function `(password) => boolean` for advanced password checks | `null` |
| `lockingCombinations` | Password hash complexity | `10` |

### 🌍 Star Locator Configuration

| Key | Description | Default |
|---|---|---|
| `enableGeo` | Enable geo lookups. | `false` |
| `enableReverseGeo` | Reverse geocode coordinates. | `false` |
| `enableGeoDebug` | Log provider results. | `false` |
| `geoProviders` | Extra providers `(ip, {signal}) => result`. | `[]` |
| `ipgeolocationKey` | ipgeolocation.io key. | `null` |
| `ipinfoToken` | ipinfo.io token. | `null` |
| `nominatimUserAgent` | UA for Nominatim reverse geo. | `"StarAuth/1.0 (star-vault@sclpowerful.com)"` |
| `googleMapsKey` | Google Geocoding key. | `null` |
| `mapboxToken` | Mapbox token. | `null` |
| `geoCacheTtlMs` | Cache TTL. | `600000` |
| `geoDeadlineMs` | Deadline per lookup. | `3000` |
| `geoTimeoutMs` | Timeout per provider. | `2500` |
| `geoScoreOk` | Early-accept score threshold. | `5` |
| `geoMaxConcurrency` | Max concurrent calls. | `8` |

#### Example `geo` Object in Sessions

```js
geo: {
  requestIP: "203.0.113.45",
  ip: "203.0.113.45",
  city: "City",
  region: "Region",
  country: "Country",
  continent: "Continent",
  org: "Org / ASN",
  isp: "ISP",
  loc: "12.3456,-98.7654",
  timezone: "Area/City",
  postal: "ZIP",
  flag: "🌍",
  confidence: 0.83,
  geoAddress: "123 Main St, Example City, Country"
}
```

#### Custom Provider Registration

```js
const vault = new StarVault(dataDir, logDir, 4, "869MB", "1w", {
  authOptions: {
    enableGeo: true,
    enableReverseGeo: true,
    ipinfoToken: process.env.IPINFO_TOKEN,
    geoProviders: [
      async function customProvider(ip, { signal }) {
        return { city: "Custom City", country: "Customland" };
      }
    ]
  }
});
```

### Stellar Login

| Key | Description | Default |
|---|---|---|
| `stellarRequestCooldown` | Minimum delay between stellar requests | `60000` (1 min) |
| `generateStellarCode` | Function to generate a stellar numeric code | 6-digit default |

### Hooks & Extensions

| Key | Description |
|---|---|
| `onSuspiciousSession` | `(currentSession, pastSession) => void` |
| `handleHijack` | `(session, field, expected, actual) => void` |
| `onCleanup` | `({ result, timestamp, vaultID }) => void` |
| `tagSession` | `(session, userData) => string[]` |

### Guest Session Configurations

| Key | Description | Default |
|---|---|---|
| `allowGuestSessions` | Enables or disables guest session support. | `true` |
| `guestInactivityThreshold` | Duration of inactivity (in ms) before considering a guest account stale. | `7 * 24 * 60 * 60 * 1000` (1w) |
| `cleanupGuestInterval` | Interval (in ms) to check for inactive guest accounts. | `5 * 60 * 1000` (5m) |
| `guestActivityTrackers` | Array of custom functions to execute on guest activity updates. | `[]` |
| `generateGuestID` | Function to generate guest user IDs. | `() => "guest-" + UUID` |

### Cleanup & Locking

| Key | Description | Default |
|---|---|---|
| `vaultID` | ID for this vault instance | `null` |
| `autoCleanupInterval` | How often to auto-clean expired tokens | `null` |
| `expiredSessionCleanupInterval` | Interval for cleaning expired sessions | `null` |
| `detachedTimers` | Controls whether background cleanup timers remain linked to the process. Set to `false` to keep timers active until completion (useful for long-running or graceful shutdown scenarios). | `true` |
| `cleanupExpiredTokensActionInfo` | Metadata for cleanup activity | `{}` |
| `cleanupExpiredTokensClientAuth` | Auth context used for cleanup calls | `null` |
| `cleanupExpiredSessionsActionInfo` | Metadata for session cleanup | `{}` |
| `cleanupExpiredSessionsClientAuth` | Auth context used during session cleanup | `null` |

---

## 📦 Core Methods

The primary database interaction methods.

| Method | Description | Sync/Async |
|---|---|---|
| `create(collection, data, actionInfo = {}, clientAuth = null)` | Creates a new record in the specified collection. Returns the created record object. | ✅ Sync |
| `update(collection, id, updates, actionInfo = {}, clientAuth = null)` | Updates an existing record by ID. Setting a data property to `undefined` removes it. Returns the updated record. Throws an error if the record is not found. | ✅ Sync |
| `unset(collection, id, dotPaths = [], actionInfo = {}, clientAuth = null)` | Removes specific nested fields using dot notation (e.g., `"profile.stats.xp"`). Returns the updated record. | ✅ Sync |
| `deleteCollection(collection, actionInfo = {}, clientAuth = null)` | Deletes an entire collection. Returns `{ wholeCollection: true, deleted: true/false }`. | ✅ Sync |
| `deleteRecord(collection, id, actionInfo = {}, clientAuth = null)` | Deletes a specific record by ID. Returns `{ id, deleted: true/false }`. | ✅ Sync |
| `deleteMany(collection, ids = [], actionInfo = {}, clientAuth = null)` | Deletes many records efficiently (shard-grouped delete). Returns `{ requested, deleted, shardsTouched }`. | ✅ Sync |
| `delete(collection, id, actionInfo = {}, clientAuth = null)` | Soft-deletes a record by filtering it out and overwriting the collection. Returns `{ id, deleted: true }`. | ✅ Sync |
| `softDelete(collection, id, { reason = "soft-delete", deletedBy = null } = {}, actionInfo = {}, clientAuth = null)` | Marks a record as deleted **without removing it**. Sets flags: `deleted: true`, `deletedAt`, `deletedReason`, `deletedBy`. Emits `update`. Returns `{ id, softDeleted: true, at, reason, deletedBy }`. | ✅ Sync |
| `restore(collection, id, { restoredBy = null } = {}, actionInfo = {}, clientAuth = null)` | Reverses a prior soft delete. Clears `deleted*` flags and sets `restoredAt`, `restoredBy`. Emits `update`. Returns `{ id, restored: true, at, restoredBy }`. | ✅ Sync |

---

### 🧾 StarTransactions (WAL + Idempotency)

Star-Vault handles **multi-record transactions** with staged commits, WAL logging, and per-step idempotency.

#### Core Traits

- Logs each `TX-BEGIN` and `TX-COMMIT` before writing data.
- Replays committed groups safely after crash or restart.
- Keeps uncommitted changes isolated until commit.

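
The traits above can be pictured with a small replay loop. This is a sketch under stated assumptions, not Star-Vault's recovery code: the entry shapes, the `TX-OP` marker, and the `replayCommitted` function are hypothetical, and the real WAL format is not described in this document. Only `TX-BEGIN`, `TX-COMMIT`, and `TX-ABORT` are names taken from the text.

```js
// Hypothetical WAL entry shapes, for illustration only.
function replayCommitted(walEntries) {
  const staged = new Map();   // transactionID -> staged operations
  const store = new Map();    // "collection/id" -> record data
  const applied = new Set();  // idempotency keys already applied

  for (const entry of walEntries) {
    if (entry.type === "TX-BEGIN") {
      staged.set(entry.transactionID, []);
    } else if (entry.type === "TX-OP") {
      const ops = staged.get(entry.transactionID);
      if (ops) ops.push(entry); // stays isolated until the commit marker appears
    } else if (entry.type === "TX-ABORT") {
      staged.delete(entry.transactionID);
    } else if (entry.type === "TX-COMMIT") {
      for (const op of staged.get(entry.transactionID) ?? []) {
        if (applied.has(op.idempotencyKey)) continue; // duplicate step: skip
        applied.add(op.idempotencyKey);
        store.set(`${op.collection}/${op.id}`, op.data);
      }
      staged.delete(entry.transactionID);
    }
  }
  return store; // operations from uncommitted transactions never reach the store
}

const wal = [
  { type: "TX-BEGIN", transactionID: "t1" },
  { type: "TX-OP", transactionID: "t1", idempotencyKey: "t1:0", collection: "users", id: "u1", data: { name: "Nova" } },
  { type: "TX-COMMIT", transactionID: "t1" },
  { type: "TX-BEGIN", transactionID: "t2" }, // crash before commit
  { type: "TX-OP", transactionID: "t2", idempotencyKey: "t2:0", collection: "users", id: "u2", data: { name: "Orion" } }
];

const recovered = replayCommitted(wal);
console.log([...recovered.keys()]); // [ 'users/u1' ]
```

In this sketch, the second transaction has no commit marker, so its staged write is dropped on replay, which mirrors the "uncommitted changes stay isolated" trait.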
#### Transaction Methods

```js
// High-level helper: commits on success, aborts if the callback throws
const { transactionID, result } = await vault.transact({ type: "transfer" }, async (t) => {
     await t.update("accounts", "A", { balance: v => v - 100 });
     await t.update("accounts", "B", { balance: v => v + 100 });
     await t.create("ledger", { from: "A", to: "B", amount: 100, ts: Date.now() });
     return "ok";
});

// Manual flow
const tx = await vault.begin({ type: "batch-import" });
await tx.create("users", { id: "u1", name: "Nova" });
await tx.softDelete("posts", "p-123", { reason: "policy", deletedBy: "mod-9" });
await tx.restore("posts", "p-123", { restoredBy: "mod-9" });
await tx.commit(); // or await tx.abort("reason");
```

| Context | Method | Purpose | Sync/Async |
|----------|---------|----------|-------------|
| vault | `transact(actionInfo, fn)` | Open txn, run `fn(tx)`, commit on success, abort on failure. Returns `{ transactionID, result }`. | ⚙️ Async |
| vault | `begin(actionInfo?, clientAuth?)` | Start txn and return handle `{ id, create, update, unset, deleteRecord, softDelete, restore, commit, abort }`. | ⚙️ Async |
| tx | `create(collection, data, actionInfo?, clientAuth?)` | Stage record creation. | ⚙️ Async |
| tx | `update(collection, id, updates, actionInfo?, clientAuth?)` | Stage record update; supports function deltas `v => v - 50`. | ⚙️ Async |
| tx | `unset(collection, id, dotPaths = [], actionInfo?, clientAuth?)` | Stage removal of nested fields. | ⚙️ Async |
| tx | `deleteRecord(collection, id, actionInfo?, clientAuth?)` | Stage hard deletion. | ⚙️ Async |
| tx | `softDelete(collection, id, { reason?, deletedBy? } = {}, actionInfo?, clientAuth?)` | Stage soft deletion (flags only). | ⚙️ Async |
| tx | `restore(collection, id, { restoredBy? } = {}, actionInfo?, clientAuth?)` | Stage record restoration. | ⚙️ Async |
| tx | `commit(actionInfo?)` | Apply staged ops, refresh cache, emit events, seal WAL. | ⚙️ Async |
| tx | `abort(reason = "manual")` | Discard staging and log `TX-ABORT`. | ⚙️ Async |

#### Idempotency and WAL

- Each step logs an internal `transactionID` and a deterministic `idempotencyKey`.
- Steps replay safely on crash recovery; duplicates are ignored.

#### Example Flow

```js
await vault.transact({ actor: "u42" }, async (tx) => {
     await tx.create("users", { id: "42", name: "Orion" });
     await tx.update("profiles", "42", { visits: v => v + 1 });
     await tx.softDelete("sessions", "s-23", { reason: "expired", deletedBy: "system" });
});
```

---

## 🔍 Query Engine

The **Query Engine** powers Star-Vault’s adaptive data traversal and filtering system — providing a fully chainable, composable query builder with deterministic caching and automatic mode optimization.

### Overview

```js
vault.query("collection")
     .whereRecord("id", "value")                   // Filter by root record metadata
     .where({ key : "value" })                     // Filter by record data fields
     .search("field", "text")                      // Text search within a field
     .recent("timestamp", "7d")                    // Filter records from last 7 days
     .near("location", { lat : 0, lng : 0 }, 50)   // Geospatial radius filter
     .sort({ name : 1 })                           // Sort ascending (1) or descending (-1)
     .select(["id", "name"])                       // Return specific fields only
     .limit(10)                                    // Limit number of results
     .offset(0)                                    // Skip first N results
     .page({ limit : 10, after : null })           // Cursor-based pagination alternative
     .filterBy(record => record.active)            // Apply custom JS-level filter
     .callback(row => { row._scanned = true })     // Optional pre-execution side effect
     .return("data")                               // (Optional) Return only the record data
     .execute(false, "auto");                      // Execute with collection matching and mode control
```

---

### ⚙️ Return Modes

| Method | Description | Default |
|---------|-------------|----------|
| **`return("record")`** | Return the full record object including metadata (`{ id, data, timestamp, ... }`). | ✅ Default |
| **`return("data")`** | Return only the user data (`record.data`). | |
| **`returnFull()`** | Shorthand for `.return("record")`. | ✅ |

---

### Execution Modes

| Mode | Description | Ideal Use Case |
|------|--------------|----------------|
| **auto** | Automatically selects the best execution path depending on query shape. Defaults to `hyper` when possible. | General queries |
| **hyper** | Direct single-record retrieval path. Bypasses iteration and uses direct cache lookup for deterministic key-value equality filters. | Exact key lookups (`id`, `userID`, etc.) |
| **default** | Standard streaming scan with limit/offset early cutoffs. Supports progressive paging and partial reductions. | Sequential or paginated scans |

> ⚙️ Star-Vault automatically promotes `auto` queries to `hyper` when the filter is deterministic (single key-value pair + limit ≤ 1).

---

### Methods

| Method | Description | Sync/Async |
|--------|--------------|------------|
| `query(collection, actionInfo?)` | Begin a query on the specified collection. Returns a chainable QueryBuilder. | ✅ Sync |
| `whereRecord(criteria)` | Filter by root record metadata (`record.id`, etc.). | ✅ Sync |
| `where(criteria)` | Filter by record data key-value pairs. | ✅ Sync |
| `search(field, text)` | Text search (substring or tokenized) in a specific field. | ✅ Sync |
| `recent(field, duration)` | Filter by recency, e.g. `"7d"`, `"30m"`. | ✅ Sync |
| `near(field, center, radius)` | Filter spatially near a `{ lat, lng }` point. | ✅ Sync |
| `sort(criteria)` | Sort results ascending (`1`) or descending (`-1`). | ✅ Sync |
| `select(fields)` | Choose fields to include in the output. | ✅ Sync |
| `limit(number)` | Restrict result count. | ✅ Sync |
| `offset(number)` | Skip first N results for pagination. | ✅ Sync |
| `page(options)` | Enable cursor-based pagination with deterministic ordering. `execute()` then returns `{ items, nextCursor }`. | ✅ Sync |
| `filterBy(fn)` | Apply a custom filter function to each record. | ✅ Sync |
| `callback(fn)` | Add a hook or side effect before result finalization. | ✅ Sync |
| `configureAutoIndex(options)` | Enable or tune adaptive auto-indexing for queries that repeatedly miss fast paths. | ✅ Sync |
| `return(mode)` | Control whether queries return `"record"` (full record) or `"data"` (just data). | ✅ Sync |
| `returnFull()` | Explicitly reset return mode to `"record"`. | ✅ Sync |
| `execute(exactCollection = false, executionMode = "auto")` | Execute query. If `exactCollection = true`, restrict to the specified collection only. | ✅ Sync |
| `getByID(collection, id, actionInfo?)` | Retrieve a record directly by primary key. | ✅ Sync |
| `getManyByID(collection, ids, options?)` | Retrieve multiple records by an array of primary keys, **preserving input order**. | ✅ Sync |
| `range(min, max)` | Static helper to produce inclusive numeric ranges. | ✅ Sync |

---

### Examples

#### 🔸 Full Records (Default)

```js
const orders = vault.query("orders")
     .where({ status : "shipped" })
     .execute();

// Returns [{ id, data, timestamp, ... }]
```

#### 🔸 Data-Only View

```js
const orders = vault.query("orders")
     .where({ status : "shipped" })
     .return("data")
     .execute();

// Returns only record.data objects
```

#### 🔸 Revert to Full Records

```js
const orders = vault.query("orders")
     .where({ status : "shipped" })
     .returnFull()
     .execute();
```

---

#### 🔸 Auto Mode (Adaptive)

```js
const user = vault.query("users")
     .where({ id : "u-1" })
     .select(["name", "email"])
     .execute(false, "auto")[0];
```

Automatically selects `hyper` mode due to deterministic equality filter and small result limit.

---

#### 🔸 Hyper Mode (Accelerated)

```js
const user = vault.query("users")
     .where({ id : "u-1" })
     .limit(1)
     .execute(false, "hyper")[0];
```

Uses a direct reference lookup path through **StarCache** for **O(1)** deterministic retrieval.

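
The difference between the `hyper` and `default` paths can be pictured with a small standalone sketch. This is not Star-Vault's implementation: the `records` Map stands in for a cache layer, `pickMode` and `runQuery` are hypothetical helpers, and treating `id` as the only promotable key is an assumption made to keep the example short. It only illustrates the documented promotion rule (one key-value equality filter plus limit ≤ 1) and why a keyed lookup avoids iteration.

```js
// Illustrative only: not Star-Vault's internals.
// A Map keyed by ID stands in for a cache layer.
const records = new Map([
     ["u-1", { id : "u-1", name : "Nova", active : true }],
     ["u-2", { id : "u-2", name : "Orion", active : false }],
     ["u-3", { id : "u-3", name : "Lyra", active : true }]
]);

// Hypothetical rule mirroring the documented promotion condition:
// exactly one equality filter on the key field, and limit <= 1.
function pickMode(filter, limit)
{
     const keys = Object.keys(filter);

     return keys.length === 1 && keys[0] === "id" && limit <= 1 ? "hyper" : "default";
}

function runQuery(filter, limit = Infinity)
{
     const mode = pickMode(filter, limit);

     if (mode === "hyper")
     {
          // Direct lookup: a single Map access, no iteration.
          const hit = records.get(filter.id);

          return { mode, rows : hit ? [hit] : [] };
     }

     // Streaming scan that stops as soon as the limit is reached.
     const rows = [];

     for (const record of records.values())
     {
          if (rows.length >= limit) { break; }

          if (Object.entries(filter).every(([key, value]) => record[key] === value))
          {
               rows.push(record);
          }
     }

     return { mode, rows };
}

const single = runQuery({ id : "u-2" }, 1);
const scanned = runQuery({ active : true }, 10);

console.log(single.mode, single.rows.length);   // hyper 1
console.log(scanned.mode, scanned.rows.length); // default 2
```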

---

#### 🔸 Default Mode (Streaming Scan)

```js
const users = vault.query("users")
     .where({ active : true })
     .offset(100)
     .limit(50)
     .sort({ createdAt : -1 })
     .execute(false, "default");
```

Uses a sequential scan with **early cutoff** logic for predictable pagination over large datasets.

---

#### 🔸 **Many-by-ID (order-preserving)**

```js
const [ u1, u5, u3 ] = vault.getManyByID("users", ["u-1", "u-5", "u-3"]);
```

---

#### 🔸 Advanced Example (Custom Logic & Hooks)

```js
vault.query("orders")
     .where({ status : "pending" })
     .filterBy(order => order.total > 100)
     .callback(order => { order.checked = true })
     .select(["id", "total", "status"])
     .limit(5)
     .execute(false, "auto");
```

> The `callback()` hook works best for small, customized post-processing operations or live UI effects before final emission.

---

### ⚙️ Advanced Controls

#### `configureAutoIndex(options)`

Enable or tune adaptive auto-indexing for queries that repeatedly miss fast paths.

**Options**

- `enabled : boolean` — Turn auto-indexing on/off.
- `missThreshold : number` — How many cache/scan “misses” trigger an index build.
- `windowMs : number` — Rolling time window for counting misses.
- `maxConcurrentBuilds : number` — Cap background index builds.
- `lazyMaxScan : number` — Upper bound for a “lazy” scan before promoting to an index.

**Example**

```js
vault.query("orders")
     .configureAutoIndex({
          enabled : true,
          missThreshold : 8,
          windowMs : 60_000,
          maxConcurrentBuilds : 2,
          lazyMaxScan : 50_000
     })
     .where({ status : "pending" })
     .sort({ createdAt : -1 })
     .limit(100)
     .execute(false, "auto");
```

**Notes**

- `configureAutoIndex()` affects only the current query builder chain.
- The adaptive indexer monitors repeated scans and builds background indexes once the configured thresholds are reached.
- Star-Vault manages concurrency and I/O impact internally—no manual worker orchestration required.
- Auto-indexing integrates with StarCache and never blocks live queries.
- You can disable it in lightweight or memory-constrained deployments without affecting correctness.

---

### 📄 Cursor-Based Pagination

Star-Vault provides **deterministic, cursor-based pagination** for stable traversal across large datasets.

Unlike offset-based pagination, cursor pagination:

- avoids skipped/duplicated records under concurrent writes
- maintains stable ordering
- scales efficiently for large collections

---

#### Basic Usage

```js
const page1 = vault.query("users")
     .where({ country : "US" })
     .page({ limit : 5 })
     .execute();

console.log(page1.items);
console.log(page1.nextCursor);
```

---

#### Fetching the Next Page

```js
const page2 = vault.query("users")
     .where({ country : "US" })
     .page({ limit : 5, after : page1.nextCursor })
     .execute();
```

---

#### Return Shape

```js
{
     items : [ /* records */ ],
     nextCursor : "cursor-string-or-null"
}
```

| Field | Description |
|--------------|-------------|
| `items` | Array of records (or data, depending on return mode) |
| `nextCursor` | Cursor string for the next page, or `null` if no more results |

> ℹ️ Pagination changes the return type of `.execute()` from an array to a `{ items, nextCursor }` object.

---

#### Cursor Behavior

- `nextCursor = null` → no more pages
- Passing `after : cursor` resumes from the last record of the previous page
- Cursor encoding is **opaque and versioned internally**

---

#### Example — Full Pagination Flow

```js
let cursor = null;

do {
     const page = vault.query("users")
          .where({ country : "US" })
          .page({ limit : 5, after : cursor })
          .execute();

     console.log(page.items);

     cursor = page.nextCursor;
} while (cursor);
```

---

#### ⚙️ Deterministic Ordering

Pagination relies on Star-Vault’s **deterministic ID ordering**, ensuring:

- stable page boundaries
- no duplication across pages
- no missing records during traversal

> ⚠️ Important: Do not mutate record IDs or manually inject non-deterministic IDs if you rely on pagination stability.
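
To see why ID-ordered cursors stay stable under concurrent writes, here is a minimal standalone sketch. It does not reproduce Star-Vault's pagination code or cursor format (which the documentation describes as opaque): `encodeCursor`, `decodeCursor`, and `page` are hypothetical helpers, and the base64 JSON cursor is an assumption chosen for illustration. The sketch resumes strictly after the last ID seen, so records inserted between requests never cause a repeat. With `offset : 2` the second request would have returned `a02` a second time.

```js
// Illustrative only: not Star-Vault's cursor format or pagination code.
// The cursor wraps the last ID seen, with a version field, in base64 JSON.
const encodeCursor = (lastID) => Buffer.from(JSON.stringify({ v : 1, lastID })).toString("base64");
const decodeCursor = (cursor) => JSON.parse(Buffer.from(cursor, "base64").toString("utf8")).lastID;

function page(rows, { limit, after = null })
{
     // Deterministic order: sort by ID so boundaries never depend on insertion order.
     const ordered = [...rows].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
     const lastID = after ? decodeCursor(after) : null;

     // Resume strictly after the last ID of the previous page.
     const remaining = lastID === null ? ordered : ordered.filter(row => row.id > lastID);
     const items = remaining.slice(0, limit);
     const nextCursor = remaining.length > limit ? encodeCursor(items[items.length - 1].id) : null;

     return { items, nextCursor };
}

const users = ["a03", "a01", "a05", "a02"].map(id => ({ id }));

const page1 = page(users, { limit : 2 });

users.push({ id : "a00" }); // concurrent insert that sorts before the cursor
users.push({ id : "a04" }); // concurrent insert that sorts after the cursor

const page2 = page(users, { limit : 2, after : page1.nextCursor });
const page3 = page(users, { limit : 2, after : page2.nextCursor });

console.log(page1.items.map(row => row.id));                   // [ 'a01', 'a02' ]
console.log(page2.items.map(row => row.id));                   // [ 'a03', 'a04' ]
console.log(page3.items.map(row => row.id), page3.nextCursor); // [ 'a05' ] null
```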

---

### Performance & Behavior

- **Adaptive Execution Pathing** — chooses optimal mode based on filter complexity and limit count.
- **Cache-Aware Traversal** — leverages in-memory record handles for active collections.
- **Early-Cutoff Scanning** — truncates scan loops when limits are reached to reduce latency.
- **Geospatial and Temporal Filtering** — supports proximity, duration, and time-based conditions natively.
- **Lightweight Chaining** — each query stage mutates minimal state, maintaining sub-millisecond builder overhead.

---

### Developer Note

Query Engine provides direct low-level access to **StarVault's compositional access model**.
When combined with **Hyper-Normalization™**, queries can achieve sub-millisecond lookups even across multi-collection normalized schemas.

---

## ⚡️ Hyper-Normalization™

Star-Vault introduces **Hyper-Normalization™**, a revolutionary data-modeling paradigm that unlocks the **highest logical purity** of data — with **near-zero performance penalties**.

### Definition

Hyper-Normalization™ (noun)
/ˈhaɪ.pər ˌnɔr.mə.laɪˈzeɪ.ʃən/

A data-modeling paradigm introduced by Steven Compton in 2025 (Star-Vault) that implements physical-path normalization, eliminating runtime join planning through deterministic reference resolution and achieving near-constant-time cross-collection traversal under defined consistency and locality models.

_First published: 2025-11-04_
_Coined and defined by Steven Compton, creator of Star-Vault._

---

### 🧬 Origin of Hyper-Normalization™

**Before Star-Vault**, no database truly achieved runtime-lev