ai-embed-search
Version:
Smart. Simple. Local. AI-powered semantic search in TypeScript using transformer embeddings. No cloud, no API keys โ 100% offline.
184 lines (140 loc) โข 6.05 kB
Markdown
# ๐ AI-embed-search โ Lightweight AI Semantic Search Engine
> **Smart. Simple. Local.**
> AI-powered semantic search in TypeScript using transformer embeddings. No cloud, no API keys โ 100% offline.
[](https://www.npmjs.com/package/ai-embed-search)
[](https://www.npmjs.com/package/ai-embed-search)
[](https://bundlephobia.com/package/ai-embed-search)
[](https://www.npmjs.com/package/ai-embed-search)
[](./LICENSE)
## ๐ Features
- ๐ง AI-powered semantic understanding
- โจ Super simple API: `init`, `embed`, `search`, `clear`
- โก๏ธ Fast cosine similarity-based retrieval
- ๐ฆ In-memory vector store (no DB required)
- ๐งฉ Save/load vectors to JSON file
- ๐ Search filters, caching & batch embed support
- ๐งฐ CLI-ready architecture
- ๐ Fully offline via `@xenova/transformers` (WASM/Node)
## ๐ฆ Installation
```bash
npm install ai-embed-search
```
or
```bash
yarn add ai-embed-search
```
Requires Node.js โฅ 18 or a modern browser for WASM.
## โก Quick Start
```typescript
import {embed, search, createEmbedder, initEmbedder} from 'ai-embed-search';
const embedder = await createEmbedder();
await initEmbedder({ embedder });
await embed([
{ id: '1', text: 'iPhone 15 Pro Max', meta: { brand: 'Apple', type: 'phone' } },
{ id: '2', text: 'Samsung Galaxy S24 Ultra', meta: { brand: 'Samsung', type: 'phone' } },
{ id: '3', text: 'Apple MacBook Pro', meta: { brand: 'Apple', type: 'laptop' } }
]);
const results = await search('apple phone', 2).exec();
console.log(results);
```
Result:
```typescript
[
{ id: '1', text: 'iPhone 15 Pro Max', score: 0.95, meta: { brand: 'Apple', type: 'phone' } },
{ id: '3', text: 'Apple MacBook Pro', score: 0.85, meta: { brand: 'Apple', type: 'laptop' } }
]
```
### ๐ง 1. Initialize the Embedding Model
```typescript
import { createEmbedder, initEmbedder } from 'ai-embed-search';
const embedder = await createEmbedder();
await initEmbedder({ embedder });
```
Loads the MiniLM model via @xenova/transformers. Required once at startup.
### ๐ฅ 2. Add Items to the Vector Store
```typescript
import { embed } from 'ai-embed-search';
await embed([
{ id: 'a1', text: 'Tesla Model S' },
{ id: 'a2', text: 'Electric Vehicle by Tesla' }
]);
```
Embeds and stores vector representations of the given items.
### ๐ 3. Perform Semantic Search
```typescript
import { search } from 'ai-embed-search';
const results = await search('fast electric car', 3).exec();
```
Returns:
```typescript
[
{ id: 'a1', text: 'Tesla Model S', score: 0.95 },
{ id: 'a2', text: 'Electric Vehicle by Tesla', score: 0.85 }
]
```
### ๐ฆ 4. Search with Metadata
You can add metadata to each item:
```typescript
const laptops = await search('computer', 5)
.filter(r => r.meta?.type === 'laptop')
.exec();
```
### ๐พ 5. Search with Cached Embeddings (Advanced)
You can store precomputed embeddings in your own DB or file:
```typescript
const precomputed = {
id: 'x1',
text: 'Apple Watch Series 9',
vector: [0.11, 0.32, ...] // 384-dim array
};
```
Then use cosine similarity to search across them, or build your own vector store using ai-embed-search functions.
### ๐งน 6. Clear the Vector Store
```typescript
import { removeVector, clearVectors } from 'ai-embed-search';
removeVector('a1'); // Remove by ID
clearVectors(); // Clear all vectors
```
### ๐ค 7. Find Similar Items
You can retrieve the most semantically similar items to an existing one in the vector store:
```typescript
import { getSimilarItems } from 'ai-embed-search';
const similar = await getSimilarItems('1', 3);
console.log(similar);
```
Result:
```typescript
[
{ id: '2', text: 'Samsung Galaxy S24 Ultra smartphone', score: 0.93 },
{ id: '3', text: 'Apple MacBook Air M3 laptop', score: 0.87 },
{ id: '5', text: 'Dell XPS 13 ultrabook', score: 0.85 }
]
```
This is useful for recommendation systems, "related items" features, or clustering.
## ๐ API Reference
### `initEmbedder()`
Initializes the embedding model. Must be called once before using `embed` or `search`.
### `embed(items: { id: string, text: string }[])`
Embeds and stores the provided items in the vector store. Each item must have a unique `id` and `text`.
### `search(query: string, limit: number)`
Performs a semantic search for the given query. Returns up to `limit` results sorted by similarity score (default is 5).
### `getSimilarItems(id: string,, limit: number)`
Finds the most similar items to the one with the given `id`. Returns up to `limit` results sorted by similarity score.
### `cacheFor(limit: number)`
Caches the embeddings for the next `limit` search queries. This is useful for optimizing performance when you know you'll be searching multiple times.
### `clearStore()`
Clears all embedded data from the vector store, freeing up memory.
## ๐ง Development
- Model: [MiniLM](https://huggingface.co/xenova/bert-base-uncased) via `@xenova/transformers`
- Vector type: 384-dim float32 array
- Similarity: Cosine similarity
- Storage: In-memory vector store (no database required)
- On-premises: Fully offline, no cloud dependencies
## ๐ SEO Keywords
ai search, semantic search, local ai search, vector search, transformer embeddings, cosine similarity, open source search engine, text embeddings, in-memory search, local search engine, typescript search engine, fast npm search, embeddings in JS, ai search npm package
## License
MIT ยฉ 2025 Peter Sibirtsev
## Contributing
Contributions are welcome! Please open an issue or submit a pull request.