<span align="center">
# easy-embeddings
### Easy, fast vector embeddings for the web platform. Use open source embedding models locally (WASM/WebGPU accelerated via ONNX/Transformers.js) or an API (OpenAI, Voyage, Mixedbread). Compatible with browsers, workers, Web Extensions, Node.js & co.
> **🔥 Please note:** This project relies on the currently unreleased V3 branch of `@xenova/transformers.js` combined with a patched development
> build of `onnxruntime-web` to enable bleeding-edge features (WebGPU and WASM acceleration) alongside broad
> compatibility (it even works in Web Extension Service Workers).
</span>
## 📚 Install
`npm/yarn/bun install easy-embeddings`
## ⚡ Use
### Remote inference (call an API)
#### Single text vector embedding
```ts
import { embed, type EmbeddingResponse } from "easy-embeddings";

// single embedding with a German embedding model
const embedding: EmbeddingResponse = await embed("Hallo, Welt!", "mixedbread-ai", {
  model: "mixedbread-ai/deepset-mxbai-embed-de-large-v1",
  normalized: true, // return normalized vectors
  dimensions: 512, // request 512-dimensional vectors
}, { apiKey: import.meta.env["mixedbread-ai_api_key"] });
```
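The exact shape of `EmbeddingResponse` isn't shown here; assuming it follows the OpenAI-compatible layout that embedding APIs commonly use, reading the raw vector would look roughly like this:

```ts
// Assumption: OpenAI-compatible response layout ({ data: [{ embedding: number[] }] });
// check the actual EmbeddingResponse type before relying on this shape.
const vector: number[] = embedding.data[0].embedding;
console.log(vector.length); // 512, as requested via `dimensions`
```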
#### Multi-text vector embeddings
```ts
import { embed, type EmbeddingResponse } from "easy-embeddings";

// multiple embeddings in one call
const embedding: EmbeddingResponse = await embed(["Hello", "World"], "openai", {
  model: "text-embedding-3-small",
}, { apiKey: import.meta.env["openai_api_key"] });
```
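One vector is returned per input string, in the same order as the inputs.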
### Local inference
```ts
import { embed } from "easy-embeddings";
// embed two texts locally; E5 models expect "query: "/"passage: " prefixes
const embedResult = await embed(
  ["query: Foo", "passage: Bar"],
  "local",
  {
    // https://huggingface.co/intfloat/multilingual-e5-small
    model: "Xenova/multilingual-e5-small",
    modelParams: {
      pooling: "mean",
      normalize: true, // normalized vectors: a single dot product yields a similarity score
      quantize: true, // use a quantized variant (more efficient, slightly less accurate)
    },
  },
  {
    modelOptions: {
      hideOnnxWarnings: false, // keep ONNX runtime warnings visible
      allowRemoteModels: false, // do not download remote models from huggingface.co
      allowLocalModels: true,
      localModelPath: "/models", // loads the model from the "models" subfolder of the public dir
      onnxProxy: false,
    },
  },
);
```
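Since the vectors are normalized (`normalize: true`), cosine similarity reduces to a plain dot product. Here is a minimal sketch; how you pull the raw `number[]` vectors out of `embedResult` depends on the response shape, which isn't spelled out here:

```ts
// Dot product of two unit-length vectors equals their cosine similarity (range -1..1).
function dotProduct(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Hypothetical usage, once the query and passage vectors are extracted:
// const score = dotProduct(queryVector, passageVector);
```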
#### Advanced: Using a custom WASM runtime loader
```ts
import { embed } from "easy-embeddings";
// @ts-ignore -- the .jsep module ships without type declarations
import getModule from "./public/ort-wasm-simd-threaded.jsep";

// same local embedding as above, but with a custom WASM runtime loader
const embedResult = await embed(
  ["query: Foo", "passage: Bar"],
  "local",
  {
    // https://huggingface.co/intfloat/multilingual-e5-small
    model: "Xenova/multilingual-e5-small",
    modelParams: {
      pooling: "mean",
      normalize: true, // normalized vectors: a single dot product yields a similarity score
      quantize: true, // use a quantized variant (more efficient, slightly less accurate)
    },
  },
  {
    // hand over a custom loader instead of letting onnxruntime-web fetch its own WASM glue code
    importWasmModule: async (
      _mjsPathOverride: string,
      _wasmPrefixOverride: string,
      _threading: boolean,
    ) => {
      return [
        undefined,
        async (moduleArgs = {}) => {
          return await getModule(moduleArgs);
        },
      ];
    },
    modelOptions: {
      hideOnnxWarnings: false, // keep ONNX runtime warnings visible
      allowRemoteModels: false, // do not download remote models from huggingface.co
      allowLocalModels: true,
      localModelPath: "/models", // loads the model from the "models" subfolder of the public dir
      onnxProxy: false,
    },
  },
);
```
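Supplying `importWasmModule` lets you bundle the ONNX runtime's WASM glue code yourself instead of having it fetched at load time, which matters in restricted environments such as the Web Extension Service Workers mentioned above.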
### Download models locally
To run fully offline (`allowRemoteModels: false`), write and execute a small script that downloads the model files ahead of time:
```ts
import { downloadModel } from "easy-embeddings/tools";
// downloads the model files into public/models
await downloadModel("Xenova/multilingual-e5-small", "public/models");
```
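If you serve the `public` directory as your web root, the downloaded files resolve under `/models`, which matches the `localModelPath: "/models"` setting used in the local inference examples above.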
## Help improve this project!
### Setup
Clone this repo, install the dependencies (`bun install` is recommended for speed),
and run `npm run test` to verify that everything works. You may also want to play with the experiments.