# hnswsqlite
Vector search with HNSWlib and SQLite in TypeScript.
A TypeScript library that combines approximate nearest neighbor vector search (via HNSWlib) with SQLite for persistent, lightweight, and efficient semantic search. Perfect for building semantic search applications, recommendation systems, and more.
## Features
- 🚀 **Fast Vector Search**: Approximate nearest neighbor search using HNSW algorithm
- 💾 **Persistence**: All data stored in SQLite for durability and easy backup
- 🔌 **Plugin System**: Support for multiple embedding providers:
  - OpenAI
  - HuggingFace
  - Dummy (for testing)
  - **WebLLM** (browser-based LLMs)
  - **MediaPipe** (image/video feature extraction)
  - **TensorFlow.js** (text/image/audio feature extraction)
- 🛠️ **CLI Tool**: Full-featured command-line interface for easy interaction
- 📦 **Lightweight**: No external dependencies other than SQLite and HNSWlib
- 🧩 **Extensible**: Easy to integrate with existing applications
- 🔄 **Batch Operations**: Support for adding and deleting multiple documents at once
## Installation
### As a Library
```bash
npm install hnswsqlite
```
### As a CLI Tool
```bash
# Install globally
npm install -g hnswsqlite
# Or use with npx
npx hnswsqlite --help
```
- View on npm: [https://www.npmjs.com/package/hnswsqlite](https://www.npmjs.com/package/hnswsqlite)
- View on GitHub: [https://github.com/praveencs87/hnswsqlite](https://github.com/praveencs87/hnswsqlite)
## Usage
### JavaScript/TypeScript API
```typescript
import { VectorStore } from 'hnswsqlite';

// Initialize with SQLite database path and embedding dimension
const store = new VectorStore('my_vectors.db', 1536);

try {
  // Add a document with its embedding. Vectors are truncated here for
  // brevity; real ones should match the store's dimension (1536).
  const docId = store.addDocument('hello world', [0.1, 0.2, 0.3, /* ... */]);

  // Search for similar documents (top 5)
  const results = store.search([0.1, 0.2, 0.3, /* ... */], 5);

  // Delete a document
  const deleted = store.deleteDocument(docId);

  // Batch operations
  const docIds = store.addDocuments([
    { text: 'first document', embedding: [0.1, 0.2, /* ... */] },
    { text: 'second document', embedding: [0.3, 0.4, /* ... */] },
  ]);
} finally {
  // Always close the store when done
  store.close();
}
```
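Because HNSW search is approximate, it can help to compare its results against an exact search on a small sample. The sketch below is a standalone brute-force baseline and does not use hnswsqlite itself. It assumes cosine similarity, which this README does not confirm as the library's metric, and the names `cosineSimilarity` and `exactSearch` are illustrative.

```typescript
// Exact (brute-force) nearest-neighbor search, usable as a ground-truth
// baseline when checking approximate results. Cosine similarity is an
// assumption; the metric used by hnswsqlite is not stated in this README.
type Scored = { index: number; score: number };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Zero vectors have no direction; treat them as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k most similar vectors, best match first.
function exactSearch(vectors: number[][], query: number[], k: number): Scored[] {
  return vectors
    .map((v, index) => ({ index, score: cosineSimilarity(v, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const corpus = [
  [1, 0, 0],
  [0.9, 0.1, 0],
  [0, 1, 0],
];
const top2 = exactSearch(corpus, [1, 0, 0], 2);
console.log(top2.map((r) => r.index)); // indices 0 and 1, in that order
```

This scan is linear in the number of stored vectors, so it only suits spot checks on small samples.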
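### Command Line Interface

```bash
# Initialize a new database
hnswsqlite init

# Add a document (the embedding comes from the selected provider)
hnswsqlite add "Your document text here"

# Add a document with an explicit embedding
hnswsqlite add "Another document" 0.1 0.2 0.3 ...

# Add a document using a specific provider
hnswsqlite add "Text or image path" --provider webllm

# Search for similar documents
hnswsqlite search "search query"

# List stored documents
hnswsqlite list

# Delete a document by ID
hnswsqlite delete 1
```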
Common options:

```
-d, --database <path>   Path to the SQLite database (default: vectors.db)
--dim <dimension>       Dimension of the vectors (default: 1536)
--provider <name>       Embedding provider to use (openai, huggingface, webllm, mediapipe, tensorflowjs, dummy)
--verbose               Enable verbose output
```
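When an embedding is passed explicitly on the command line, its length has to agree with `--dim`. The sketch below shows one way such arguments could be validated in a script that wraps the CLI. `parseVectorArgs` is a hypothetical helper and is not claimed to be how hnswsqlite parses its own arguments.

```typescript
// Hypothetical helper (not part of hnswsqlite): turn trailing command-line
// arguments such as ["0.1", "0.2", "0.3"] into a validated numeric vector.
function parseVectorArgs(args: string[], dim: number): number[] {
  const vector = args.map((arg) => {
    const value = Number(arg);
    // Number('') is 0, so empty strings are rejected explicitly.
    if (arg.trim() === '' || !Number.isFinite(value)) {
      throw new Error(`Not a finite number: "${arg}"`);
    }
    return value;
  });
  if (vector.length !== dim) {
    throw new Error(`Expected ${dim} values, got ${vector.length}`);
  }
  return vector;
}

console.log(parseVectorArgs(['0.1', '0.2', '0.3'], 3)); // the numbers 0.1, 0.2, 0.3
```

Failing early with a clear message is usually easier to debug than a dimension error raised deep inside the index.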
## Advanced Usage
### Using Different Embedding Providers
All embedding providers implement a common interface:
```typescript
type EmbeddingPlugin = {
name: string;
generateEmbedding(input: string | Buffer): Promise<number[]>;
};
```
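As an illustration of the interface above, here is a minimal sketch of a custom plugin. `HashEmbedder` is hypothetical and not shipped with hnswsqlite: it hashes input bytes into a fixed-size vector, so it is deterministic but carries no semantic meaning. The type is redeclared locally so the block runs on its own, with `Uint8Array` standing in for `Buffer` (Node's `Buffer` is a `Uint8Array` subclass).

```typescript
// Local copy of the plugin interface so this sketch is self-contained.
// Uint8Array stands in for Buffer, which is a Uint8Array subclass in Node.
type EmbeddingPlugin = {
  name: string;
  generateEmbedding(input: string | Uint8Array): Promise<number[]>;
};

// Hypothetical plugin: buckets input bytes into a fixed-size,
// L2-normalized vector. Only suitable for tests and demos.
class HashEmbedder implements EmbeddingPlugin {
  name = 'hash-demo';
  private readonly dim: number;

  constructor(dim: number) {
    this.dim = dim;
  }

  async generateEmbedding(input: string | Uint8Array): Promise<number[]> {
    const bytes = typeof input === 'string' ? new TextEncoder().encode(input) : input;
    const vector = new Array<number>(this.dim).fill(0);
    bytes.forEach((byte, i) => {
      // Mix position and value so that reordered input lands in other buckets.
      vector[(i * 31 + byte) % this.dim] += 1;
    });
    const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
    return norm === 0 ? vector : vector.map((x) => x / norm);
  }
}

const demoEmbedder = new HashEmbedder(8);
demoEmbedder.generateEmbedding('hello world').then((v) => console.log(v.length)); // 8
```

The output length must equal the dimension the store was created with, so a plugin like this would be paired with `new VectorStore(path, 8)`.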
```typescript
import { VectorStore } from 'hnswsqlite';
import { OpenAIEmbedder } from 'hnswsqlite/plugins/openai';

// The store dimension should match the embedding model's output size
const store = new VectorStore('my_vectors.db', 1536);
const embedder = new OpenAIEmbedder('your-api-key');

// Generate an embedding, then store it alongside the source text
const embedding = await embedder.generateEmbedding('Your text here');
store.addDocument('Your text here', embedding);
```
```typescript
import { WebLLMPlugin } from 'hnswsqlite/plugins/webllm';
const plugin = new WebLLMPlugin();
const embedding = await plugin.generateEmbedding('Your text here');
```
```typescript
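import { readFileSync } from 'node:fs';
import { MediaPipePlugin } from 'hnswsqlite/plugins/mediapipe';
// 'photo.jpg' is a placeholder path to an image file
const imageBuffer = readFileSync('photo.jpg');
const plugin = new MediaPipePlugin();
const embedding = await plugin.generateEmbedding(imageBuffer);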
```
```typescript
import { TensorFlowPlugin } from 'hnswsqlite/plugins/tensorflow';
const plugin = new TensorFlowPlugin();
const embedding = await plugin.generateEmbedding('Your text or image buffer');
```
> **Note:** Each plugin may require additional dependencies or setup. See the plugin source for details.
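
Since every provider exposes the same `generateEmbedding` method, embedding a batch of texts can be written once for all of them. The following is a sketch under that assumption. `embedAll` and `fakePlugin` are hypothetical names, and the result shape (`{ text, embedding }`) mirrors the `addDocuments` input shown earlier in this README.

```typescript
// Local copy of the plugin interface so this sketch is self-contained.
type EmbeddingPlugin = {
  name: string;
  generateEmbedding(input: string | Uint8Array): Promise<number[]>;
};

// Same shape as the objects passed to addDocuments in the usage example.
type DocumentInput = { text: string; embedding: number[] };

// Hypothetical helper: embeds texts one at a time, preserving input order.
async function embedAll(plugin: EmbeddingPlugin, texts: string[]): Promise<DocumentInput[]> {
  const docs: DocumentInput[] = [];
  for (const text of texts) {
    // Sequential on purpose, to avoid bursting a rate-limited remote API.
    docs.push({ text, embedding: await plugin.generateEmbedding(text) });
  }
  return docs;
}

// Stand-in plugin so the sketch runs without network access or API keys.
const fakePlugin: EmbeddingPlugin = {
  name: 'fake',
  generateEmbedding: async (input) => [
    typeof input === 'string' ? input.length : input.byteLength,
    0,
  ],
};

embedAll(fakePlugin, ['first document', 'second document']).then((docs) => {
  console.log(docs.length); // 2
  // With the real library, the next step would be: store.addDocuments(docs);
});
```

For large batches against a remote provider, a bounded-concurrency variant may be faster, at the cost of more error handling.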
### Customizing HNSW Parameters

The optional third constructor argument tunes the underlying HNSW index:

```typescript
const store = new VectorStore('my_vectors.db', 1536, {
  maxElements: 100000, // Maximum number of elements in the index
  M: 16, // Maximum number of outgoing connections per node in the graph
  efConstruction: 200, // Candidate list size at build time: higher values improve index quality but slow construction
  randomSeed: 100, // Random seed for reproducibility
});
```
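
To get a feel for how `maxElements` and `M` affect memory, the sketch below applies the rule of thumb from the hnswlib documentation: about 4 bytes per vector component (float32) plus roughly 8 to 10 bytes of link storage per unit of `M`, per element. `estimateIndexBytes` is a hypothetical helper, the figures are rough estimates, and it is an assumption that hnswsqlite's in-memory index follows the same layout.

```typescript
// Rough in-memory size of an HNSW index, based on the hnswlib rule of thumb:
// float32 vector data plus about 8-10 bytes of links per unit of M per element.
// This is an estimate, not a measurement of hnswsqlite itself.
function estimateIndexBytes(
  maxElements: number,
  dim: number,
  M: number,
): { low: number; high: number } {
  const vectorBytes = dim * 4; // 4 bytes per float32 component
  return {
    low: maxElements * (vectorBytes + M * 8),
    high: maxElements * (vectorBytes + M * 10),
  };
}

const estimate = estimateIndexBytes(100_000, 1536, 16);
const toMiB = (bytes: number) => (bytes / 1024 / 1024).toFixed(0);
console.log(`~${toMiB(estimate.low)}-${toMiB(estimate.high)} MiB`); // ~598-601 MiB
```

At this dimension the vectors dominate the total, so lowering `M` saves little memory compared with using a smaller embedding model.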
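## Development

```bash
# Clone the repository and install dependencies
git clone https://github.com/praveencs87/hnswsqlite.git
cd hnswsqlite
npm install

# Build, test, and run the CLI from source
npm run build
npm test
npm run cli -- --help
```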
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT © [Praveen CS](https://www.linkedin.com/in/praveen-cs/)
---
Maintained by [Praveen CS](https://www.linkedin.com/in/praveen-cs/)
- GitHub: [praveencs87](https://github.com/praveencs87)