ruvector-extensions
Version:
Advanced features for ruvector: embeddings, UI, exports, temporal tracking, and persistence
387 lines (281 loc) • 8.45 kB
Markdown
# ruvector-extensions
Advanced persistence and extension features for the [ruvector](https://github.com/ruvnet/ruvector) vector database.
## Features
- 💾 **Multiple Persistence Formats**: JSON, Binary (MessagePack), SQLite
- 📸 **Snapshot Management**: Create, list, restore, and delete database snapshots
- 🔄 **Incremental Saves**: Save only changed data for efficiency
- 📤 **Export/Import**: Flexible data portability
- 🗜️ **Compression Support**: Gzip and Brotli compression for smaller files
- 📊 **Progress Tracking**: Real-time progress callbacks for large operations
- ⚡ **Auto-Save**: Configurable automatic saves
- 🔒 **Data Integrity**: Built-in checksum verification
## Installation
```bash
npm install ruvector-extensions ruvector
```
## Quick Start
```typescript
import { VectorDB } from 'ruvector';
import { DatabasePersistence } from 'ruvector-extensions';
// Create a vector database
const db = new VectorDB({ dimension: 384 });
// Add vectors
db.insert({
id: 'doc1',
vector: [0.1, 0.2, ...], // 384-dimensional vector
metadata: { title: 'My Document' }
});
// Create persistence manager
const persistence = new DatabasePersistence(db, {
baseDir: './data',
format: 'json',
compression: 'gzip',
autoSaveInterval: 60000, // Auto-save every minute
});
// Save database
await persistence.save({
onProgress: (p) => console.log(`${p.percentage}% - ${p.message}`)
});
// Create snapshot
const snapshot = await persistence.createSnapshot('backup-v1');
// Later: restore from snapshot
await persistence.restoreSnapshot(snapshot.id);
```
## API Documentation
### DatabasePersistence
Main class for managing database persistence.
#### Constructor
```typescript
new DatabasePersistence(db: VectorDB, options: PersistenceOptions)
```
**Options:**
- `baseDir` (string): Base directory for persistence files
- `format` (string): Default format - 'json', 'binary', or 'sqlite'
- `compression` (string): Compression type - 'none', 'gzip', or 'brotli'
- `incremental` (boolean): Enable incremental saves
- `autoSaveInterval` (number): Auto-save interval in ms (0 = disabled)
- `maxSnapshots` (number): Maximum snapshots to keep
- `batchSize` (number): Batch size for large operations
#### Save Operations
**save(options?): Promise<string>**
Save the entire database to disk.
```typescript
await persistence.save({
path: './backup.json.gz',
format: 'json',
compress: true,
onProgress: (p) => console.log(p.message)
});
```
**saveIncremental(options?): Promise<string | null>**
Save only changed data (returns null if no changes).
```typescript
const path = await persistence.saveIncremental();
if (path) {
console.log('Changes saved to:', path);
}
```
**load(options): Promise<void>**
Load database from disk.
```typescript
await persistence.load({
path: './backup.json.gz',
verifyChecksum: true,
onProgress: (p) => console.log(p.message)
});
```
#### Snapshot Management
**createSnapshot(name, metadata?): Promise<SnapshotMetadata>**
Create a named snapshot of the current database state.
```typescript
const snapshot = await persistence.createSnapshot('pre-migration', {
version: '2.0',
user: 'admin'
});
console.log(`Created snapshot ${snapshot.id}`);
console.log(`Size: ${formatFileSize(snapshot.fileSize)}`);
```
**listSnapshots(): Promise<SnapshotMetadata[]>**
List all available snapshots (sorted newest first).
```typescript
const snapshots = await persistence.listSnapshots();
for (const snap of snapshots) {
console.log(`${snap.name}: ${snap.vectorCount} vectors`);
}
```
**restoreSnapshot(id, options?): Promise<void>**
Restore database from a snapshot.
```typescript
await persistence.restoreSnapshot(snapshot.id, {
verifyChecksum: true,
onProgress: (p) => console.log(p.message)
});
```
**deleteSnapshot(id): Promise<void>**
Delete a snapshot.
```typescript
await persistence.deleteSnapshot(oldSnapshotId);
```
#### Export/Import
**export(options): Promise<void>**
Export database to a file.
```typescript
await persistence.export({
path: './export/database.json',
format: 'json',
compress: true,
includeIndex: false,
onProgress: (p) => console.log(p.message)
});
```
**import(options): Promise<void>**
Import database from a file.
```typescript
await persistence.import({
path: './export/database.json',
clear: true, // Clear existing data first
verifyChecksum: true,
onProgress: (p) => console.log(p.message)
});
```
#### Auto-Save
**startAutoSave(): void**
Start automatic saves at configured interval.
```typescript
persistence.startAutoSave();
```
**stopAutoSave(): void**
Stop automatic saves.
```typescript
persistence.stopAutoSave();
```
**shutdown(): Promise<void>**
Cleanup and perform final save.
```typescript
await persistence.shutdown();
```
### Utility Functions
**formatFileSize(bytes): string**
Format bytes as human-readable size.
```typescript
console.log(formatFileSize(1536000)); // "1.46 MB"
```
**formatTimestamp(timestamp): string**
Format Unix timestamp as ISO string.
```typescript
console.log(formatTimestamp(Date.now())); // "2024-01-15T10:30:00.000Z"
```
**estimateMemoryUsage(state): number**
Estimate memory usage of a database state.
```typescript
const usage = estimateMemoryUsage(state);
console.log(`Estimated: ${formatFileSize(usage)}`);
```
## Examples
### Example 1: Basic Persistence
```typescript
import { VectorDB } from 'ruvector';
import { DatabasePersistence } from 'ruvector-extensions';
const db = new VectorDB({ dimension: 128 });
// Add data
for (let i = 0; i < 1000; i++) {
db.insert({
id: `doc-${i}`,
vector: Array(128).fill(0).map(() => Math.random())
});
}
// Save
const persistence = new DatabasePersistence(db, {
baseDir: './data'
});
await persistence.save();
console.log('Database saved!');
```
### Example 2: Snapshot Workflow
```typescript
// Create initial snapshot
const v1 = await persistence.createSnapshot('version-1.0');
// Make changes
db.insert({ id: 'new-doc', vector: [...] });
// Create new snapshot
const v2 = await persistence.createSnapshot('version-1.1');
// Rollback to v1 if needed
await persistence.restoreSnapshot(v1.id);
```
### Example 3: Export/Import
```typescript
// Export to JSON
await persistence.export({
path: './backup.json',
format: 'json',
compress: false
});
// Import into new database
const db2 = new VectorDB({ dimension: 128 });
const p2 = new DatabasePersistence(db2, { baseDir: './data2' });
await p2.import({
path: './backup.json',
verifyChecksum: true
});
```
### Example 4: Progress Tracking
```typescript
await persistence.save({
onProgress: (progress) => {
console.log(`[${progress.percentage}%] ${progress.message}`);
console.log(`${progress.current}/${progress.total} items`);
}
});
```
### Example 5: Auto-Save
```typescript
const persistence = new DatabasePersistence(db, {
baseDir: './data',
autoSaveInterval: 300000, // Save every 5 minutes
incremental: true
});
// Auto-save runs automatically
// Stop when done
await persistence.shutdown();
```
## TypeScript Support
Full TypeScript definitions are included:
```typescript
import type {
PersistenceOptions,
SnapshotMetadata,
DatabaseState,
ProgressCallback,
ExportOptions,
ImportOptions
} from 'ruvector-extensions';
```
## Performance Tips
1. **Use Binary Format**: Faster than JSON for large databases
2. **Enable Compression**: Reduces storage size by 70-90%
3. **Incremental Saves**: Much faster for small changes
4. **Batch Size**: Adjust `batchSize` for optimal performance
5. **Auto-Save**: Use reasonable intervals (5-10 minutes)
## Error Handling
All async methods may throw errors:
```typescript
try {
await persistence.save();
} catch (error) {
if (error.code === 'ENOSPC') {
console.error('Not enough disk space');
} else if (error.message.includes('checksum')) {
console.error('Data corruption detected');
} else {
console.error('Save failed:', error.message);
}
}
```
## License
MIT - See [LICENSE](LICENSE) for details
## Contributing
Contributions welcome! Please see the main [ruvector repository](https://github.com/ruvnet/ruvector) for contribution guidelines.
## Support
- Documentation: https://github.com/ruvnet/ruvector
- Issues: https://github.com/ruvnet/ruvector/issues
- Discord: [Join our community](https://discord.gg/ruvector)