@lenml/tokenizer-gpt2
GPT-2 tokenizer for Node.js/Browser
A standalone GPT-2 tokenizer for Node.js and the browser.
> based on `@lenml/tokenizers`
```ts
import { fromPreTrained } from "@lenml/tokenizer-gpt2";

const tokenizer = fromPreTrained();

// encode() maps text to token ids; add_special_tokens controls
// whether the tokenizer's special tokens are added.
console.log(
  "encode()",
  tokenizer.encode("Hello, my dog is cute", null, {
    add_special_tokens: true,
  })
);

// _encode_text() returns the intermediate string tokens,
// before they are mapped to ids.
console.log("_encode_text", tokenizer._encode_text("Hello, my dog is cute"));
```
Complete API parameters and usage can be found in the [transformers.js tokenizers documentation](https://huggingface.co/docs/transformers.js/v3.0.0/api/tokenizers).
License: Apache-2.0