clustex
Version:
Clustex is a lightweight text classification package designed to efficiently categorize text based on similarity metrics and learned token weights.
108 lines (74 loc) • 2.96 kB
Markdown
Clustex is a lightweight text classification package designed to efficiently categorize text based on similarity metrics and learned token weights.
`new Classifier(classifications: string[], learningRate?: number, threshold?: number)`
Creates a new classifier instance.
- Parameters:
- `classifications` (string[]): An array of classification labels.
- `learningRate` (number, optional): Influences how quickly token weights adjust. Default is `1`.
- `threshold` (number, optional): The minimum similarity score for tokens to influence classification. Default is `0.8`.
```js
const classifier = new Classifier(["positive", "negative"], 0.5, 0.9);
```
`classify(text: string) → string`
Determines the most probable classification for the provided `text`.
- Parameters:
- `text` (string): The input text to be classified.
- Returns:
- The classification label with the highest probability.
```js
const classifier = new Classifier(["positive", "negative"]);
classifier.example("This is amazing!", "positive");
console.log(classifier.classify("Such a wonderful day!"));
// Output: "positive"
```
`chance(text: string) → Object`
Returns an object with classification probabilities for the given `text`.
- Parameters:
- `text` (string): The input text to analyze.
- Returns:
- An object mapping classification labels to their probabilities.
```js
const classifier = new Classifier(["spam", "normal"]);
classifier.example("Buy now!", "spam");
console.log(classifier.chance("Limited-time offer!"));
// Output: { spam: 0.85, normal: 0.15 }
```
`example(text: string, classification: string)`
Trains the classifier with an example sentence and its corresponding classification.
- Parameters:
- `text` (string): The example sentence.
- `classification` (string): The label associated with the sentence.
- Returns:
- No explicit return, but updates the internal token weights.
```js
const classifier = new Classifier(["positive", "negative"]);
classifier.example("This is fantastic!", "positive");
classifier.example("I hate this.", "negative");
```
`dataset(name: string, iterations: number = 1)`
Loads a predefined dataset and trains the classifier using its entries.
- Parameters:
- `name` (string): The dataset name.
- `iterations` (number, optional): Number of training iterations. Defaults to 1.
- Returns:
- No explicit return, but updates internal token storage with learned weights.
```js
const classifier = new Classifier();
classifier.dataset("news", 3);
```
`datasets`
A static array listing available dataset names.
```js
console.log(Classifier.datasets);
// Output: ["news", "spam", "tone", "importance"]
```
MIT License