sqlite-wasm-vec
Version:
SQLite Wasm with sqlite-vec extension for vector similarity search, conveniently wrapped as an ES Module.
575 lines (459 loc) • 15.2 kB
Markdown
# SQLite Wasm with sqlite-vec
SQLite Wasm with [sqlite-vec](https://github.com/asg017/sqlite-vec) extension
for vector similarity search, conveniently wrapped as an ES Module.
This is a fork of
[@sqlite.org/sqlite-wasm](https://github.com/sqlite/sqlite-wasm) that includes
the sqlite-vec extension compiled into the WASM build, enabling vector
operations directly in the browser.
## Features
- 🚀 **Full SQLite** with WASM support (SQLite 3.51.0)
- 🔍 **Vector similarity search** using sqlite-vec extension (v0.1.7-alpha.2)
- 📦 **~1.5MB WASM file** (optimized with -O2)
- 🌐 **Browser & Node.js** compatible
- 💾 **OPFS support** for persistent storage
- 🎯 **TypeScript definitions** included
- ✅ **100% API compatible** with @sqlite.org/sqlite-wasm
## Node.js support
> [!Warning]
>
> Node.js is currently only supported for in-memory databases without
> persistence.
## Installation
```bash
npm install sqlite-wasm-vec
```
## sqlite-vec Features
All sqlite-vec functions are available out of the box. Here are comprehensive
examples:
### Basic Setup and Version Check
```javascript
import sqlite3InitModule from 'sqlite-wasm-vec';
// Initialize SQLite with vec extension
const sqlite3 = await sqlite3InitModule();
const db = new sqlite3.oo1.DB(':memory:'); // or use a file path
// Check sqlite-vec version
const vecVersion = db.selectValue('SELECT vec_version()');
console.log('sqlite-vec version:', vecVersion); // e.g., "v0.1.7-alpha.2"
```
### Creating Vector Tables
```javascript
// Create a virtual table for storing vectors
// Specify the vector dimension (e.g., 384 for all-MiniLM-L6-v2 embeddings)
db.exec(`
CREATE VIRTUAL TABLE vec_items USING vec0(
embedding float[384]
)
`);
// You can also create tables with different vector types
db.exec(`
CREATE VIRTUAL TABLE vec_int8_items USING vec0(
embedding int8[128] -- 8-bit integer vectors
)
`);
db.exec(`
CREATE VIRTUAL TABLE vec_bit_items USING vec0(
embedding bit[1024] -- Binary vectors for hamming distance
)
`);
```
### Inserting Vectors
```javascript
// Method 1: Using prepared statements (recommended for multiple inserts)
const vectors = [
{ id: 1, data: [0.1, 0.2, 0.3, 0.4] },
{ id: 2, data: [0.5, 0.6, 0.7, 0.8] },
{ id: 3, data: [0.9, 1.0, 1.1, 1.2] },
];
const stmt = db.prepare(
'INSERT INTO vec_items(rowid, embedding) VALUES (?, ?)',
);
for (const vector of vectors) {
const float32Array = new Float32Array(vector.data);
stmt.bind(vector.id, float32Array.buffer).stepReset();
}
stmt.finalize();
// Method 2: Direct insert with typed arrays
const embedding = new Float32Array(384);
// ... fill embedding with your data ...
db.exec({
sql: 'INSERT INTO vec_items(rowid, embedding) VALUES (?, ?)',
bind: [100, embedding.buffer],
});
```
### Vector Similarity Search
```javascript
// Perform k-NN search using MATCH operator
const queryVector = new Float32Array([0.2, 0.3, 0.4, 0.5]);
// Find 5 most similar vectors
const results = db.selectArrays(
'SELECT rowid, distance FROM vec_items WHERE embedding MATCH ? ORDER BY distance LIMIT 5',
queryVector.buffer,
);
// Results format: [[rowid1, distance1], [rowid2, distance2], ...]
results.forEach(([id, distance]) => {
console.log(`Item ${id}: distance = ${distance}`);
});
// With additional filtering
const filtered = db.selectObjects(
`SELECT
vi.rowid,
vi.distance,
items.title,
items.category
FROM vec_items vi
JOIN items ON items.id = vi.rowid
WHERE
embedding MATCH ? AND
items.category = 'electronics'
ORDER BY distance
LIMIT 10`,
queryVector.buffer,
);
```
### Using Distance Functions
```javascript
// Store vectors for comparison
db.exec('CREATE TEMP TABLE test_vectors (id INTEGER PRIMARY KEY, vec BLOB)');
const v1 = new Float32Array([1.0, 2.0, 3.0]);
const v2 = new Float32Array([4.0, 5.0, 6.0]);
const stmt = db.prepare('INSERT INTO test_vectors(id, vec) VALUES (?, ?)');
stmt.bind(1, v1.buffer).stepReset();
stmt.bind(2, v2.buffer).stepReset();
stmt.finalize();
// Calculate different distance metrics
const l2Distance = db.selectValue(`
SELECT vec_distance_l2(
(SELECT vec FROM test_vectors WHERE id = 1),
(SELECT vec FROM test_vectors WHERE id = 2)
)
`);
const cosineDistance = db.selectValue(`
SELECT vec_distance_cosine(
(SELECT vec FROM test_vectors WHERE id = 1),
(SELECT vec FROM test_vectors WHERE id = 2)
)
`);
const l1Distance = db.selectValue(`
SELECT vec_distance_l1(
(SELECT vec FROM test_vectors WHERE id = 1),
(SELECT vec FROM test_vectors WHERE id = 2)
)
`);
console.log('L2 distance:', l2Distance);
console.log('Cosine distance:', cosineDistance);
console.log('L1 distance:', l1Distance);
```
### Vector Operations
```javascript
// Vector normalization
const normalized = db.selectValue(
'SELECT vec_normalize(?)',
new Float32Array([3.0, 4.0]).buffer,
);
// Vector addition and subtraction
const sum = db.selectValue(
'SELECT vec_add(?, ?)',
new Float32Array([1, 2, 3]).buffer,
new Float32Array([4, 5, 6]).buffer,
);
// Get vector metadata
const length = db.selectValue(
'SELECT vec_length(?)',
new Float32Array([1, 2, 3, 4, 5]).buffer,
); // Returns 5
// Convert vector to JSON for inspection
const jsonArray = db.selectValue(
'SELECT vec_to_json(?)',
new Float32Array([1.5, 2.5, 3.5]).buffer,
); // Returns "[1.5,2.5,3.5]"
```
### Quantization for Memory Efficiency
```javascript
// Quantize float32 vectors to int8 (reduces memory by 4x)
const float32Vec = new Float32Array([0.1, 0.5, 0.9, -0.3]);
const int8Vec = db.selectValue(
'SELECT vec_quantize_int8(?)',
float32Vec.buffer,
);
// Quantize to binary vectors (reduces memory by 32x)
const binaryVec = db.selectValue(
'SELECT vec_quantize_binary(?)',
float32Vec.buffer,
);
// Use quantized vectors for approximate search
db.exec(`
CREATE VIRTUAL TABLE vec_binary USING vec0(
embedding bit[1024]
)
`);
```
### Practical Example: Building a Semantic Search
```javascript
import sqlite3InitModule from 'sqlite-wasm-vec';
async function buildSemanticSearch() {
const sqlite3 = await sqlite3InitModule();
const db = new sqlite3.oo1.DB(':memory:');
// Create tables for documents and their embeddings
db.exec(`
CREATE TABLE documents (
id INTEGER PRIMARY KEY,
title TEXT,
content TEXT,
metadata JSON
);
CREATE VIRTUAL TABLE doc_embeddings USING vec0(
embedding float[384] -- Using 384-dim embeddings
);
`);
// Insert sample documents with their embeddings
const documents = [
{
id: 1,
title: 'Introduction to AI',
content: 'AI is transforming...',
embedding: [
/* ... */
],
},
{
id: 2,
title: 'Machine Learning Basics',
content: 'ML algorithms learn...',
embedding: [
/* ... */
],
},
// ... more documents
];
const insertDoc = db.prepare('INSERT INTO documents VALUES (?, ?, ?, ?)');
const insertEmb = db.prepare(
'INSERT INTO doc_embeddings(rowid, embedding) VALUES (?, ?)',
);
for (const doc of documents) {
insertDoc
.bind(doc.id, doc.title, doc.content, JSON.stringify({}))
.stepReset();
insertEmb.bind(doc.id, new Float32Array(doc.embedding).buffer).stepReset();
}
insertDoc.finalize();
insertEmb.finalize();
// Search function
function search(queryEmbedding, limit = 5) {
return db.selectObjects(
`
SELECT
d.id,
d.title,
d.content,
e.distance
FROM doc_embeddings e
JOIN documents d ON d.id = e.rowid
WHERE e.embedding MATCH ?
ORDER BY e.distance
LIMIT ?
`,
new Float32Array(queryEmbedding).buffer,
limit,
);
}
// Perform a search
const queryVector = [
/* your query embedding */
];
const results = search(queryVector, 10);
return { db, search };
}
```
### Available Functions
All sqlite-vec functions are compiled and available:
**Core Functions:**
- `vec_version()` - Returns the version of the sqlite-vec extension
- `vec_debug()` - Debugging information about the extension
**Vector Creation and Manipulation:**
- `vec_f32(values...)` - Create a float32 vector from values
- `vec_int8(values...)` - Create an int8 vector from values
- `vec_bit(values...)` - Create a bit vector from values
- `vec_slice(vector, start, end)` - Extract a slice of a vector
- `vec_length(vector)` - Get the number of dimensions in a vector
- `vec_type(vector)` - Get the type of a vector (f32, int8, bit)
- `vec_to_json(vector)` - Convert vector to JSON array representation
**Vector Operations:**
- `vec_add(a, b)` - Element-wise addition of two vectors
- `vec_sub(a, b)` - Element-wise subtraction of two vectors
- `vec_normalize(vector)` - Normalize a vector to unit length
**Distance Functions:**
- `vec_distance_l2(a, b)` - Euclidean/L2 distance between vectors
- `vec_distance_cosine(a, b)` - Cosine distance (1 - cosine similarity)
- `vec_distance_l1(a, b)` - Manhattan/L1 distance between vectors
- `vec_distance_hamming(a, b)` - Hamming distance for bit vectors
**Quantization Functions:**
- `vec_quantize_binary(vector)` - Quantize to binary (bit) representation
- `vec_quantize_int8(vector)` - Quantize float32 to int8 representation
## Usage
There are three ways to use SQLite Wasm:
- [in the main thread with a wrapped worker](#in-a-wrapped-worker-with-opfs-if-available)
(🏆 preferred option)
- [in a worker](#in-a-worker-with-opfs-if-available)
- [in the main thread](#in-the-main-thread-without-opfs)
Only the worker versions allow you to use the origin private file system (OPFS)
storage back-end.
### In a wrapped worker (with OPFS if available):
> [!Warning]
>
> For this to work, you need to set the following headers on your server:
>
> `Cross-Origin-Opener-Policy: same-origin`
>
> `Cross-Origin-Embedder-Policy: require-corp`
```js
import { sqlite3Worker1Promiser } from 'sqlite-wasm-vec';
const log = console.log;
const error = console.error;
const initializeSQLite = async () => {
try {
log('Loading and initializing SQLite3 module...');
const promiser = await new Promise((resolve) => {
const _promiser = sqlite3Worker1Promiser({
onready: () => resolve(_promiser),
});
});
log('Done initializing. Running demo...');
const configResponse = await promiser('config-get', {});
log('Running SQLite3 version', configResponse.result.version.libVersion);
const openResponse = await promiser('open', {
filename: 'file:mydb.sqlite3?vfs=opfs',
});
const { dbId } = openResponse;
log(
'OPFS is available, created persisted database at',
openResponse.result.filename.replace(/^file:(.*?)\?vfs=opfs$/, '$1'),
);
// Your SQLite code here.
} catch (err) {
if (!(err instanceof Error)) {
err = new Error(err.result.message);
}
error(err.name, err.message);
}
};
initializeSQLite();
```
The `promiser` object above implements the
[Worker1 API](https://sqlite.org/wasm/doc/trunk/api-worker1.md#worker1-methods).
### In a worker (with OPFS if available):
> [!Warning]
>
> For this to work, you need to set the following headers on your server:
>
> `Cross-Origin-Opener-Policy: same-origin`
>
> `Cross-Origin-Embedder-Policy: require-corp`
```js
// In `main.js`.
const worker = new Worker('worker.js', { type: 'module' });
```
```js
// In `worker.js`.
import sqlite3InitModule from 'sqlite-wasm-vec';
const log = console.log;
const error = console.error;
const start = (sqlite3) => {
log('Running SQLite3 version', sqlite3.version.libVersion);
const db =
'opfs' in sqlite3
? new sqlite3.oo1.OpfsDb('/mydb.sqlite3')
: new sqlite3.oo1.DB('/mydb.sqlite3', 'ct');
log(
'opfs' in sqlite3
? `OPFS is available, created persisted database at ${db.filename}`
: `OPFS is not available, created transient database ${db.filename}`,
);
// Your SQLite code here.
};
const initializeSQLite = async () => {
try {
log('Loading and initializing SQLite3 module...');
const sqlite3 = await sqlite3InitModule({ print: log, printErr: error });
log('Done initializing. Running demo...');
start(sqlite3);
} catch (err) {
error('Initialization error:', err.name, err.message);
}
};
initializeSQLite();
```
The `db` object above implements the
[Object Oriented API #1](https://sqlite.org/wasm/doc/trunk/api-oo1.md).
### In the main thread (without OPFS):
```js
import sqlite3InitModule from 'sqlite-wasm-vec';
const log = console.log;
const error = console.error;
const start = (sqlite3) => {
log('Running SQLite3 version', sqlite3.version.libVersion);
const db = new sqlite3.oo1.DB('/mydb.sqlite3', 'ct');
// Your SQLite code here.
};
const initializeSQLite = async () => {
try {
log('Loading and initializing SQLite3 module...');
const sqlite3 = await sqlite3InitModule({
print: log,
printErr: error,
});
log('Done initializing. Running demo...');
start(sqlite3);
} catch (err) {
error('Initialization error:', err.name, err.message);
}
};
initializeSQLite();
```
The `db` object above implements the
[Object Oriented API #1](https://sqlite.org/wasm/doc/trunk/api-oo1.md).
## Usage with vite
If you are using [vite](https://vitejs.dev/), you need to add the following
config option in `vite.config.js`:
```js
import { defineConfig } from 'vite';
export default defineConfig({
server: {
headers: {
'Cross-Origin-Opener-Policy': 'same-origin',
'Cross-Origin-Embedder-Policy': 'require-corp',
},
},
optimizeDeps: {
exclude: ['sqlite-wasm-vec'],
},
});
```
Check out a
[sample project](https://stackblitz.com/edit/vitejs-vite-ttrbwh?file=main.js)
that shows this in action.
## Demo
See the [demo](https://github.com/sqlite/sqlite-wasm/tree/main/demo) folder for
examples of how to use this in the main thread and in a worker. (Note that the
worker variant requires special HTTP headers, so it can't be hosted on GitHub
Pages.) An example that shows how to use this with vite is available on
[StackBlitz](https://stackblitz.com/edit/vitejs-vite-ttrbwh?file=main.js).
## Projects using this package
See the list of
[npm dependents](https://www.npmjs.com/browse/depended/@sqlite.org/sqlite-wasm)
for this package.
## Deploying a new version
(These steps can only be executed by maintainers.)
1. Update the version number in `package.json` reflecting the current
[SQLite version number](https://sqlite.org/download.html) and add a build
identifier suffix like `-build1`. The complete version number should read
something like `3.41.2-build1`.
1. Run `npm run build` to build the ES Module. This downloads the latest SQLite
Wasm binary and builds the ES Module.
1. Run `npm run deploy` to commit the changes, push to GitHub, and publish the
new version to npm.
## License
Apache 2.0.
## Acknowledgements
This project is based on [SQLite Wasm](https://sqlite.org/wasm), which it
conveniently wraps as an ES Module and publishes to npm as
[`@sqlite.org/sqlite-wasm`](https://www.npmjs.com/package/@sqlite.org/sqlite-wasm).