@mastra/core
Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.
# Keyword coverage scorer
The `createKeywordCoverageScorer()` function evaluates how well an LLM's output covers the important keywords from the input. It extracts keywords from both texts, ignoring common words and stop words, and measures how many of the input's keywords appear in the output.
## Parameters
The `createKeywordCoverageScorer()` function doesn't take any options.
This function returns an instance of the MastraScorer class. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
## `.run()` returns
**runId** (`string`): The id of the run (optional).
**preprocessStepResult** (`object`): Object with extracted keywords: { referenceKeywords: Set\<string>, responseKeywords: Set\<string> }
**analyzeStepResult** (`object`): Object with keyword coverage: { totalKeywords: number, matchedKeywords: number }
**score** (`number`): Coverage score (0-1) representing the proportion of matched keywords.
`.run()` returns a result in the following shape:
```typescript
{
  runId: string,
  preprocessStepResult: {
    referenceKeywords: Set<string>,
    responseKeywords: Set<string>
  },
  analyzeStepResult: {
    totalKeywords: number,
    matchedKeywords: number
  },
  score: number
}
```
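As a rough illustration of how this result might be consumed, the sketch below lists which reference keywords the response missed. The `KeywordCoverageResult` interface and the `missingKeywords` helper are hypothetical names introduced here, not Mastra exports. The sample object is hand-written rather than real scorer output, and it assumes the preprocessing field is named `preprocessStepResult` as in the field list above.

```typescript
// Hypothetical type mirroring the documented result shape (not a Mastra export).
interface KeywordCoverageResult {
  runId?: string
  preprocessStepResult: {
    referenceKeywords: Set<string>
    responseKeywords: Set<string>
  }
  analyzeStepResult: {
    totalKeywords: number
    matchedKeywords: number
  }
  score: number
}

// Hypothetical helper: reference keywords that do not appear in the response.
function missingKeywords(result: KeywordCoverageResult): string[] {
  const { referenceKeywords, responseKeywords } = result.preprocessStepResult
  return Array.from(referenceKeywords).filter(keyword => !responseKeywords.has(keyword))
}

// Hand-written sample data for illustration only.
const sample: KeywordCoverageResult = {
  runId: 'example-run',
  preprocessStepResult: {
    referenceKeywords: new Set(['javascript', 'react', 'vue']),
    responseKeywords: new Set(['react', 'javascript', 'library']),
  },
  analyzeStepResult: { totalKeywords: 3, matchedKeywords: 2 },
  score: 2 / 3,
}

console.log(missingKeywords(sample)) // [ 'vue' ]
```

Inspecting the missed keywords alongside `score` can make a low coverage value easier to diagnose than the number alone.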
## Scoring details
The scorer evaluates keyword coverage by matching keywords with the following features:
- Common word and stop word filtering (e.g., "the", "a", "and")
- Case-insensitive matching
- Word form variation handling
- Special handling of technical terms and compound words
### Scoring process
1. Processes keywords from input and output:
   - Filters out common words and stop words
   - Normalizes case and word forms
   - Handles special terms and compounds
2. Calculates keyword coverage:
   - Matches keywords between texts
   - Counts successful matches
   - Computes coverage ratio
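Final score: `(matched_keywords / total_keywords) * scale`, where `total_keywords` is the number of keywords extracted from the input and `matched_keywords` is how many of them also appear in the output. Because the scorer takes no options, the result stays in the 0-1 range.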
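The process above can be approximated with a simplified sketch. This is not the scorer's actual implementation: the stop-word list, the tokenization regex, and the function names `extractKeywords` and `keywordCoverage` are all assumptions made for illustration, and word-form normalization is left out entirely.

```typescript
// Small illustrative stop-word list; an assumption, not the list the scorer uses.
const STOP_WORDS = new Set(['the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'like', 'is', 'are'])

// Lowercase the text, split on anything other than letters, digits, '.' or '-',
// trim stray punctuation from token edges, and drop stop words.
function extractKeywords(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9.\-]+/)
    .map(token => token.replace(/^[.\-]+|[.\-]+$/g, ''))
    .filter(token => token.length > 0 && !STOP_WORDS.has(token))
  return new Set(tokens)
}

// Coverage ratio: matched reference keywords divided by total reference keywords.
function keywordCoverage(input: string, output: string): number {
  const reference = extractKeywords(input)
  const response = extractKeywords(output)
  // Guard against division by zero when the input yields no keywords.
  if (reference.size === 0) return response.size === 0 ? 1 : 0
  let matched = 0
  reference.forEach(keyword => {
    if (response.has(keyword)) matched += 1
  })
  return matched / reference.size
}

const coverage = keywordCoverage(
  'JavaScript frameworks like React and Vue',
  'React and Vue are popular javascript libraries',
)
console.log(coverage) // 0.75 (javascript, react, vue matched; frameworks missed)
```

Keeping `.` and `-` inside tokens is one simple way to stop terms such as "React.js" from being split apart. How the real scorer treats compounds may differ.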
### Score interpretation
A coverage score between 0 and 1:
- **1.0**: Complete coverage – all keywords present.
- **0.7–0.9**: High coverage – most keywords included.
- **0.4–0.6**: Partial coverage – some keywords present.
- **0.1–0.3**: Low coverage – few keywords matched.
- **0.0**: No coverage – no keywords found.
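If these bands are useful in reports, a small helper can map a score to a label. The sketch below is an assumption-laden illustration: `describeCoverage` and the `CoverageLevel` type are hypothetical names, and because the listed bands leave gaps (for example between 0.6 and 0.7), it assigns any in-between score to the lower band.

```typescript
// Hypothetical labels matching the bands listed above.
type CoverageLevel = 'complete' | 'high' | 'partial' | 'low' | 'none'

// Map a 0-1 coverage score to a label. Scores that fall between the listed
// bands (e.g. 0.65 or 0.95) are assigned to the lower band; this is an assumption.
function describeCoverage(score: number): CoverageLevel {
  if (Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be between 0 and 1, got ${score}`)
  }
  if (score === 1) return 'complete'
  if (score >= 0.7) return 'high'
  if (score >= 0.4) return 'partial'
  if (score > 0) return 'low'
  return 'none'
}

console.log(describeCoverage(0.75)) // high
```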
### Special cases
The scorer handles several special cases:
- Empty input/output: Returns a score of 1.0 if both are empty, or 0.0 if only one is empty
- Single word: Treated as a single keyword
- Technical terms: Preserves compound technical terms (e.g., "React.js", "machine learning")
- Case differences: "JavaScript" matches "javascript"
- Common words: Ignored in scoring to focus on meaningful keywords
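The empty input/output rule can be written as a short guard. This is a minimal sketch, not the scorer's code: `emptyCaseScore` is a hypothetical helper, and treating whitespace-only strings as empty is an assumption.

```typescript
// Hypothetical guard for the empty input/output rule described above.
// Returns a fixed score when at least one side is empty, or undefined
// when both sides have content and normal keyword scoring should run.
function emptyCaseScore(input: string, output: string): number | undefined {
  // Assumption: whitespace-only text counts as empty.
  const inputEmpty = input.trim().length === 0
  const outputEmpty = output.trim().length === 0
  if (inputEmpty && outputEmpty) return 1
  if (inputEmpty || outputEmpty) return 0
  return undefined
}

console.log(emptyCaseScore('', '')) // 1
console.log(emptyCaseScore('TypeScript generics', '')) // 0
```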
## Example
Evaluate keyword coverage between input queries and agent responses:
```typescript
import { runEvals } from '@mastra/core/evals'
import { createKeywordCoverageScorer } from '@mastra/evals/scorers/prebuilt'
import { myAgent } from './agent'

// The scorer takes no options.
const scorer = createKeywordCoverageScorer()

const result = await runEvals({
  // Each item's input is sent to the target agent; the scorer then compares
  // the keywords in the input with those in the agent's response.
  data: [
    {
      input: 'JavaScript frameworks like React and Vue',
    },
    {
      input: 'TypeScript offers interfaces, generics, and type inference',
    },
    {
      input:
        'Machine learning models require data preprocessing, feature engineering, and hyperparameter tuning',
    },
  ],
  scorers: [scorer],
  target: myAgent,
  // Log the coverage score for each item as it completes.
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    })
  },
})

// Scores across all items.
console.log(result.scores)
```
For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview) guide.
## Related
- [Completeness Scorer](https://mastra.ai/reference/evals/completeness)
- [Content Similarity Scorer](https://mastra.ai/reference/evals/content-similarity)
- [Answer Relevancy Scorer](https://mastra.ai/reference/evals/answer-relevancy)
- [Textual Difference Scorer](https://mastra.ai/reference/evals/textual-difference)