# Content similarity scorer
The `createContentSimilarityScorer()` function measures the textual similarity between two strings, providing a score that indicates how closely they match. It supports configurable options for case sensitivity and whitespace handling.
## Parameters
The `createContentSimilarityScorer()` function accepts a single options object with the following properties:
**ignoreCase** (`boolean`): Whether to ignore case differences when comparing strings. (Default: `true`)
**ignoreWhitespace** (`boolean`): Whether to normalize whitespace when comparing strings. (Default: `true`)
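To make the two options concrete, here is a minimal sketch of the kind of normalization they describe. `normalizeText` and `NormalizeOptions` are hypothetical names used only for illustration, not part of the Mastra API, and the exact whitespace rule (collapse runs to one space, then trim) is an assumption.

```typescript
// Hypothetical helper, not part of @mastra/evals: illustrates the effect
// the two options are described as having on a string before comparison.
interface NormalizeOptions {
  ignoreCase?: boolean
  ignoreWhitespace?: boolean
}

function normalizeText(text: string, options: NormalizeOptions = {}): string {
  // Both options default to true, matching the documented defaults.
  const ignoreCase = options.ignoreCase !== false
  const ignoreWhitespace = options.ignoreWhitespace !== false

  let normalized = text
  if (ignoreCase) {
    normalized = normalized.toLowerCase()
  }
  if (ignoreWhitespace) {
    // Assumption: "normalize" means collapsing whitespace runs and trimming.
    normalized = normalized.replace(/\s+/g, ' ').trim()
  }
  return normalized
}

// With both options disabled the text is left untouched.
const strict = normalizeText('Hello   World ', { ignoreCase: false, ignoreWhitespace: false })
// With the defaults, case and spacing differences disappear.
const relaxed = normalizeText('Hello   World ')
```

Under these assumptions, `'Hello   World '` and `'hello world'` compare as equal with the defaults, but differ when both options are set to `false`.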
This function returns an instance of the MastraScorer class. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
## `.run()` returns
**runId** (`string`): The id of the run (optional).
**preprocessStepResult** (`object`): Object with processed input and output: `{ processedInput: string, processedOutput: string }`
**analyzeStepResult** (`object`): Object with similarity: `{ similarity: number }`
**score** (`number`): Similarity score (0-1) where 1 indicates perfect similarity.
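The fields above can be written down as a TypeScript type for use in your own code. This is only a sketch: the `ContentSimilarityRunResult` name and the `describeRun` helper are hypothetical, and the sample value is hand-written for illustration rather than real scorer output.

```typescript
// Hypothetical type name; the fields mirror the list above.
interface ContentSimilarityRunResult {
  runId?: string
  preprocessStepResult: { processedInput: string; processedOutput: string }
  analyzeStepResult: { similarity: number }
  score: number
}

// Hypothetical helper that formats a result for logging.
function describeRun(run: ContentSimilarityRunResult): string {
  const id = run.runId === undefined ? 'unknown run' : run.runId
  return (
    id +
    ': score ' +
    run.score.toFixed(2) +
    ' (similarity ' +
    run.analyzeStepResult.similarity.toFixed(2) +
    ')'
  )
}

// Hand-written sample for illustration only, not actual output.
const sample: ContentSimilarityRunResult = {
  runId: 'example-run',
  preprocessStepResult: { processedInput: 'hello world', processedOutput: 'hello world' },
  analyzeStepResult: { similarity: 1 },
  score: 1,
}

const summary = describeRun(sample)
```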
## Scoring details
The scorer evaluates textual similarity through character-level matching and configurable text normalization.
### Scoring process
1. Normalizes text:
   - Case normalization (if `ignoreCase: true`)
   - Whitespace normalization (if `ignoreWhitespace: true`)
2. Compares the processed strings using a string-similarity algorithm:
   - Analyzes character sequences
   - Aligns word boundaries
   - Considers relative positions
   - Accounts for length differences
Final score: `similarity_value * scale`
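The comparison step can be illustrated with a Sørensen-Dice coefficient over character bigrams, a common choice for string-similarity libraries. That this matches the scorer's internals is an assumption; `diceSimilarity` is a hypothetical name, and the sketch expects strings that have already been normalized.

```typescript
// Sketch only: Sørensen-Dice coefficient over character bigrams.
// Assumption: this approximates the "string-similarity algorithm" named
// above; the scorer's actual implementation may differ.
function diceSimilarity(a: string, b: string): number {
  if (a === b) return 1 // identical strings, including two empty strings
  if (a.length < 2 || b.length < 2) return 0 // no bigrams to compare

  // Count every bigram in the first string.
  const counts: Record<string, number> = {}
  for (let i = 0; i < a.length - 1; i++) {
    const bigram = a.substring(i, i + 2)
    counts[bigram] = (counts[bigram] || 0) + 1
  }

  // Walk the second string and consume matching bigrams.
  let shared = 0
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2)
    const remaining = counts[bigram] || 0
    if (remaining > 0) {
      counts[bigram] = remaining - 1
      shared += 1
    }
  }

  // 2 * shared bigrams / total bigrams in both strings.
  return (2 * shared) / (a.length - 1 + (b.length - 1))
}

// "night" and "nacht" share one bigram ("ht") out of 4 + 4.
const example = diceSimilarity('night', 'nacht') // 0.25
```

Because the denominator counts the bigrams of both strings, a large length difference lowers the score even when the shorter string is fully contained in the longer one.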
## Example
Evaluate textual similarity between expected and actual agent outputs:
```typescript
import { runEvals } from '@mastra/core/evals'
import { createContentSimilarityScorer } from '@mastra/evals/scorers/prebuilt'
import { myAgent } from './agent'

const scorer = createContentSimilarityScorer()

const result = await runEvals({
  data: [
    {
      input: 'Summarize the benefits of TypeScript',
      groundTruth:
        'TypeScript provides static typing, better tooling support, and improved code maintainability.',
    },
    {
      input: 'What is machine learning?',
      groundTruth:
        'Machine learning is a subset of AI that enables systems to learn from data without explicit programming.',
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
      groundTruth: scorerResults[scorer.id].groundTruth,
    })
  },
})

console.log(result.scores)
```
For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview) guide.
### Score interpretation
A similarity score between 0 and 1:
- **1.0**: Perfect match – content is identical, or effectively identical after normalization.
- **0.7–0.9**: High similarity – minor differences in word choice or structure.
- **0.4–0.6**: Moderate similarity – general overlap with noticeable variation.
- **0.1–0.3**: Low similarity – few common elements or shared meaning.
- **0.0**: No similarity – completely different content.
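If you want to turn a score into one of these labels in code, a small helper along the following lines may be enough. `interpretSimilarity` is a hypothetical name, and since the bands above leave gaps (for example between 0.6 and 0.7), assigning in-between values to the lower band is an assumption of this sketch.

```typescript
// Hypothetical helper: maps a 0-1 similarity score to the bands listed above.
// Assumption: scores that fall between two listed bands (such as 0.65 or
// 0.95) are assigned to the lower band; the list does not define this.
function interpretSimilarity(score: number): string {
  if (isNaN(score) || score < 0 || score > 1) {
    throw new RangeError('score must be a number between 0 and 1')
  }
  if (score === 1) return 'perfect match'
  if (score >= 0.7) return 'high similarity'
  if (score >= 0.4) return 'moderate similarity'
  if (score > 0) return 'low similarity'
  return 'no similarity'
}

const label = interpretSimilarity(0.85)
```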
## Related
- [Completeness Scorer](https://mastra.ai/reference/evals/completeness)
- [Textual Difference Scorer](https://mastra.ai/reference/evals/textual-difference)
- [Answer Relevancy Scorer](https://mastra.ai/reference/evals/answer-relevancy)
- [Keyword Coverage Scorer](https://mastra.ai/reference/evals/keyword-coverage)