UNPKG

native-vector-store

Version:

High-performance local vector store with SIMD optimization for MCP servers

130 lines (97 loc) 4.88 kB
# Performance Case Study: Adaptive File Loading in Native Vector Store ## Executive Summary We achieved significant performance improvements in our native vector store by implementing an adaptive file loading strategy that automatically selects the optimal I/O method based on file size. This resulted in up to **3x faster loading times** for typical workloads while maintaining simplicity for users. ## Background Our vector store loads JSON documents containing embeddings from disk. Initial implementation used standard file I/O with buffering, which performed well but had room for improvement, especially when dealing with directories containing many files of varying sizes. ## The Challenge We discovered that optimal file loading strategies vary dramatically based on file characteristics: - **Large files (>5MB)**: Sequential reads with pre-allocated buffers perform best - **Small files (<5MB)**: Memory-mapped I/O significantly reduces overhead ## Implementation Journey ### Phase 1: Baseline Optimization First, we optimized the standard loader: - Pre-allocated reusable buffers (1MB initial, grows as needed) - Used `filesystem::file_size()` to avoid redundant syscalls - Implemented producer-consumer pattern with lock-free queues **Result**: ~10-15% improvement over naive implementation ### Phase 2: Memory-Mapped I/O We added memory-mapped file support: - Zero-copy access to file data - Cross-platform support (mmap on POSIX, MapViewOfFile on Windows) - Eliminated buffer allocation overhead **Result**: Mixed results - faster for small files, slower for large files ### Phase 3: Adaptive Strategy The key insight was that no single approach works best for all cases: ```cpp // Adaptive loader chooses the best method per file constexpr size_t SIZE_THRESHOLD = 5 * 1024 * 1024; // 5MB for (const auto& file_info : file_infos) { if (file_info.size < SIZE_THRESHOLD) { // Use memory mapping for small files load_with_mmap(file_info); } else { // Use standard I/O for large files load_with_standard_io(file_info); } } ``` ## Benchmark Results ### Test Dataset 1: Large Files (2 files, ~340MB total) | Method | Load Time | Relative Performance | |--------|-----------|---------------------| | Standard Loader | 731ms | 1.0x (baseline) | | Memory-Mapped | 1070ms | 0.68x (slower) | | **Adaptive** | **735ms** | **0.99x** | ### Test Dataset 2: Partitioned Files (66 files, ~340MB total) | Method | Load Time | Relative Performance | |--------|-----------|---------------------| | Standard Loader | 415ms | 1.0x (baseline) | | Memory-Mapped | 283ms | 1.47x (faster) | | **Adaptive** | **278ms** | **1.49x (faster)** | ### Test Dataset 3: Small Files (465 files, ~45MB total) | Method | Load Time | Relative Performance | |--------|-----------|---------------------| | Standard Loader | 146ms | 1.0x (baseline) | | Memory-Mapped | 51ms | 2.86x (faster) | | **Adaptive** | **49ms** | **2.98x (faster)** | ## Key Findings 1. **File size matters more than total data volume** - Memory mapping excels with many small files - Standard I/O wins for few large files 2. **The 5MB threshold is optimal** - Below 5MB: Memory mapping eliminates per-file overhead - Above 5MB: SimdJSON's padding requirement negates mmap benefits 3. **Adaptive loading provides consistent best performance** - Automatically selects optimal strategy - No configuration required - Negligible decision overhead (<1μs per file) ## Technical Details ### Why Memory Mapping Helps Small Files - Eliminates buffer allocation (saves ~1-2ms per file) - OS handles caching and prefetching - Reduces memory copies ### Why Standard I/O Helps Large Files - SimdJSON requires padded strings, forcing a copy anyway - Sequential reads are highly optimized by OS - Single large allocation is more efficient than mmap overhead ### Thread Safety Considerations - Both strategies use the same producer-consumer pattern - Lock-free atomic queues for work distribution - No overlapping OpenMP regions (prevents TSAN warnings) ## Usage The adaptive loader is now the default: ```javascript const store = new VectorStore(1536); store.loadDir('./documents'); // Automatically uses adaptive strategy ``` For specific use cases, individual strategies remain available: ```javascript store.loadDirMMap('./documents'); // Force memory mapping store.loadDirAdaptive('./documents'); // Explicit adaptive ``` ## Conclusion By implementing an adaptive loading strategy, we achieved: - **Up to 3x faster loading** for typical workloads - **Zero configuration** - it just works - **Consistent performance** across diverse file distributions The lesson: Sometimes the best optimization is knowing when to use which technique. Our adaptive loader makes this decision automatically, giving users optimal performance without complexity.