native-vector-store
High-performance local vector store with SIMD optimization for MCP servers
# Performance Case Study: Adaptive File Loading in Native Vector Store
## Executive Summary
We achieved significant performance improvements in our native vector store by implementing an adaptive file loading strategy that automatically selects the optimal I/O method based on file size. This resulted in up to **3x faster loading times** for typical workloads while maintaining simplicity for users.
## Background
Our vector store loads JSON documents containing embeddings from disk. The initial implementation used standard file I/O with buffering, which performed well but left room for improvement, especially for directories containing many files of varying sizes.
## The Challenge
We discovered that optimal file loading strategies vary dramatically based on file characteristics:
- **Large files (>5MB)**: Sequential reads with pre-allocated buffers perform best
- **Small files (<5MB)**: Memory-mapped I/O significantly reduces overhead
## Implementation Journey
### Phase 1: Baseline Optimization
First, we optimized the standard loader:
- Pre-allocated reusable buffers (1MB initial, grows as needed)
- Used `filesystem::file_size()` to avoid redundant syscalls
- Implemented producer-consumer pattern with lock-free queues
**Result**: ~10-15% improvement over naive implementation
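The buffered loader can be sketched roughly as follows. This is a minimal illustration, not the library's actual code (`BufferedLoader` is a hypothetical name), but it shows two of the ideas above: a reusable buffer that only grows, and a single `std::filesystem::file_size()` call instead of a seek/tell pair.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

// Sketch of the Phase 1 buffered loader (illustrative, not the real API):
// each worker owns one reusable buffer, sized up front from
// std::filesystem::file_size() so the read never reallocates mid-flight.
class BufferedLoader {
public:
    BufferedLoader() { buffer_.reserve(1 << 20); }  // 1MB initial capacity

    // Read the whole file into the reusable buffer, then return a copy.
    std::string load(const std::filesystem::path& path) {
        const auto size = std::filesystem::file_size(path);  // one stat, no seek/tell
        buffer_.resize(size);  // grows capacity only when needed
        std::ifstream in(path, std::ios::binary);
        in.read(buffer_.data(), static_cast<std::streamsize>(size));
        return std::string(buffer_.data(), buffer_.size());
    }

private:
    std::vector<char> buffer_;  // reused across files; grows, never shrinks
};
```

Because the buffer is reused across files, the allocation cost is paid once per worker rather than once per file, which accounts for part of the improvement.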
### Phase 2: Memory-Mapped I/O
We added memory-mapped file support:
- Zero-copy access to file data
- Cross-platform support (mmap on POSIX, MapViewOfFile on Windows)
- Eliminated buffer allocation overhead
**Result**: Mixed: faster for small files, slower for large files
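On the POSIX side, the memory-mapped path can be sketched like this (a simplified illustration; the library's real signature for `load_with_mmap` may differ, and the Windows branch via `CreateFileMapping`/`MapViewOfFile` is omitted):

```cpp
#include <fcntl.h>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// POSIX sketch of the Phase 2 memory-mapped loader (illustrative only).
std::string load_with_mmap(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return {};
    struct stat st{};
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return {}; }
    void* data = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (data == MAP_FAILED) return {};
    // The parser wants padded input, so a copy happens here anyway;
    // this is exactly why the benefit erodes for large files.
    std::string contents(static_cast<const char*>(data), st.st_size);
    munmap(data, st.st_size);
    return contents;
}
```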
### Phase 3: Adaptive Strategy
The key insight was that no single approach works best for all cases:
```cpp
// Adaptive loader chooses the best method per file
constexpr size_t SIZE_THRESHOLD = 5 * 1024 * 1024; // 5MB

for (const auto& file_info : file_infos) {
    if (file_info.size < SIZE_THRESHOLD) {
        // Use memory mapping for small files
        load_with_mmap(file_info);
    } else {
        // Use standard I/O for large files
        load_with_standard_io(file_info);
    }
}
```
## Benchmark Results
### Test Dataset 1: Large Files (2 files, ~340MB total)
| Method | Load Time | Relative Performance |
|--------|-----------|---------------------|
| Standard Loader | 731ms | 1.0x (baseline) |
| Memory-Mapped | 1070ms | 0.68x (slower) |
| **Adaptive** | **735ms** | **0.99x (parity)** |
### Test Dataset 2: Partitioned Files (66 files, ~340MB total)
| Method | Load Time | Relative Performance |
|--------|-----------|---------------------|
| Standard Loader | 415ms | 1.0x (baseline) |
| Memory-Mapped | 283ms | 1.47x (faster) |
| **Adaptive** | **278ms** | **1.49x (faster)** |
### Test Dataset 3: Small Files (465 files, ~45MB total)
| Method | Load Time | Relative Performance |
|--------|-----------|---------------------|
| Standard Loader | 146ms | 1.0x (baseline) |
| Memory-Mapped | 51ms | 2.86x (faster) |
| **Adaptive** | **49ms** | **2.98x (faster)** |
## Key Findings
1. **File size matters more than total data volume**
- Memory mapping excels with many small files
- Standard I/O wins for few large files
2. **A 5MB threshold works best in our benchmarks**
- Below 5MB: Memory mapping eliminates per-file overhead
- Above 5MB: SimdJSON's padding requirement negates mmap benefits
3. **Adaptive loading provides consistent best performance**
- Automatically selects optimal strategy
- No configuration required
- Negligible decision overhead (<1μs per file)
## Technical Details
### Why Memory Mapping Helps Small Files
- Eliminates buffer allocation (saves ~1-2ms per file)
- OS handles caching and prefetching
- Reduces memory copies
### Why Standard I/O Helps Large Files
- SimdJSON requires padded strings, forcing a copy anyway
- Sequential reads are highly optimized by OS
- Single large allocation is more efficient than mmap overhead
### Thread Safety Considerations
- Both strategies use the same producer-consumer pattern
- Lock-free atomic queues for work distribution
- No overlapping OpenMP regions (prevents TSAN warnings)
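The work-distribution pattern both strategies share can be sketched with a shared atomic cursor. This illustration substitutes `std::thread` for OpenMP and a toy `files`/`total` payload for real parsing work; the names are stand-ins, not the library's API.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Sketch of lock-free work distribution: a shared atomic cursor hands out
// indices, so workers never block on a mutex. Summing ints stands in for
// parsing one file per index.
void parallel_for_each(const std::vector<int>& files, std::atomic<long>& total) {
    std::atomic<std::size_t> next{0};
    auto worker = [&] {
        // fetch_add claims exactly one index per iteration, so every
        // element is processed by exactly one thread.
        for (std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
             i < files.size();
             i = next.fetch_add(1, std::memory_order_relaxed)) {
            total.fetch_add(files[i], std::memory_order_relaxed);  // "process" one file
        }
    };
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 2;  // hardware_concurrency() may report 0
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < n; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```

Claiming one index per `fetch_add` keeps the fast path to a single atomic instruction, which is why the per-file decision overhead stays negligible.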
## Usage
The adaptive loader is now the default:
```javascript
const store = new VectorStore(1536);
store.loadDir('./documents'); // Automatically uses adaptive strategy
```
For specific use cases, individual strategies remain available:
```javascript
store.loadDirMMap('./documents'); // Force memory mapping
store.loadDirAdaptive('./documents'); // Explicit adaptive
```
## Conclusion
By implementing an adaptive loading strategy, we achieved:
- **Up to 3x faster loading** for typical workloads
- **Zero configuration** - it just works
- **Consistent performance** across diverse file distributions
The lesson: Sometimes the best optimization is knowing when to use which technique. Our adaptive loader makes this decision automatically, giving users optimal performance without complexity.