gliner
Version:
This is a GLiNER inference engine
145 lines (109 loc) • 4.63 kB
Markdown
is a TypeScript-based inference engine for running GLiNER (Generalist and Lightweight Named Entity Recognition) models. GLiNER can identify any entity type using a bidirectional transformer encoder, offering a practical alternative to traditional NER models and large language models.
<p align="center">
<a href="https://arxiv.org/abs/2311.08526">📄 Paper</a>
<span> • </span>
<a href="https://discord.gg/Y2yVxpSQnG">📢 Discord</a>
<span> • </span>
<a href="https://huggingface.co/spaces/urchade/gliner_mediumv2.1">🤗 Demo</a>
<span> • </span>
<a href="https://huggingface.co/models?library=gliner&sort=trending">🤗 Available models</a>
<span> • </span>
<a href="https://github.com/urchade/GLiNER">🧬 Official Repo</a>
</p>
- Flexible entity recognition without predefined categories
- Lightweight and fast inference
- Easy integration with web applications
- TypeScript support for better developer experience
```bash
npm install gliner
```
```javascript
const gliner = new Gliner({
tokenizerPath: "onnx-community/gliner_small-v2",
onnxSettings: {
modelPath: "public/model.onnx", // Can be a string path or Uint8Array/ArrayBufferLike
executionProvider: "webgpu", // Optional: "cpu", "wasm", "webgpu", or "webgl"
wasmPaths: "path/to/wasm", // Optional: path to WASM binaries
multiThread: true, // Optional: enable multi-threading (for wasm/cpu providers)
maxThreads: 4, // Optional: specify number of threads (for wasm/cpu providers)
fetchBinary: true, // Optional: prefetch binary from wasmPaths
},
transformersSettings: {
// Optional
allowLocalModels: true,
useBrowserCache: true,
},
maxWidth: 12, // Optional
modelType: "gliner", // Optional
});
await gliner.initialize();
const texts = ["Your input text here"];
const entities = ["city", "country", "person"];
const options = {
flatNer: false, // Optional
threshold: 0.1, // Optional
multiLabel: false, // Optional
};
const results = await gliner.inference({
texts,
entities,
...options,
});
console.log(results);
```
The inference results will be returned in the following format:
```javascript
// For a single text input:
[
{
spanText: "New York", // The extracted entity text
start: 10, // Start character position
end: 18, // End character position
label: "city", // Entity type
score: 0.95, // Confidence score
},
// ... more entities
];
// For multiple text inputs, you'll get an array of arrays
```
To use GLiNER models in a web environment, you need an ONNX format model. You can:
1. Search for pre-converted models on [HuggingFace](https://huggingface.co/onnx-community?search_models=gliner)
2. Convert a model yourself using the [official Python script](https://github.com/urchade/GLiNER/blob/main/convert_to_onnx.py)
Use the `convert_to_onnx.py` script with the following arguments:
- `model_path`: Location of the GLiNER model
- `save_path`: Where to save the ONNX file
- `quantize`: Set to True for IntU8 quantization (optional)
Example:
```bash
python convert_to_onnx.py --model_path /path/to/your/model --save_path /path/to/save/onnx --quantize True
```
## 🌟 Use Cases
GLiNER.js offers versatile entity recognition capabilities across various domains:
1. **Enhanced Search Query Understanding**
2. **Real-time PII Detection**
3. **Intelligent Document Parsing**
4. **Content Summarization and Insight Extraction**
5. **Automated Content Tagging and Categorization**
...
## 🔧 Areas for Improvement
- [ ] Further optimize inference speed
- [ ] Add support for token-based GLiNER architecture
- [ ] Implement bi-encoder GLiNER architecture for better scalability
- [ ] Enable model training capabilities
- [ ] Provide more usage examples
## Creating a PR
- for any changes, remember to run `pnpm changeset`, otherwise there will not be a version bump and the PR Github Action will fail.
## 🙏 Acknowledgements
- [GLiNER original authors](https://github.com/urchade/GLiNER)
- [ONNX Runtime Web](https://github.com/microsoft/onnxruntime)
- [Transformers.js](https://github.com/xenova/transformers.js)
## 📞 Support
For questions and support, please join our [Discord community](https://discord.gg/ApZvyNZU) or open an issue on GitHub.
GLiNER.js