# Tone consistency scorer
The `createToneScorer()` function creates a scorer that evaluates emotional tone and sentiment consistency in text. It operates in one of two modes: comparing tone between an input/output pair, or analyzing tone stability within a single text.
## Parameters
The `createToneScorer()` function doesn't take any options.
This function returns an instance of the MastraScorer class. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
## `.run()` returns
**runId** (`string`): The ID of the run (optional).
**analyzeStepResult** (`object`): Object with tone metrics: `{ responseSentiment: number, referenceSentiment: number, difference: number }` in comparison mode, or `{ avgSentiment: number, sentimentVariance: number }` in stability mode.
**score** (`number`): Tone consistency/stability score (0-1).
`.run()` returns a result in the following shape:
```typescript
{
  runId: string,
  analyzeStepResult: {
    responseSentiment?: number,
    referenceSentiment?: number,
    difference?: number,
    avgSentiment?: number,
    sentimentVariance?: number,
  },
  score: number
}
```
## Scoring details
The scorer evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.
### Scoring process
1. Analyzes tone patterns:
   - Extracts sentiment features
   - Computes sentiment scores
   - Measures tone variations
2. Calculates mode-specific score:

   **Tone Consistency** (input and output):
   - Compares sentiment between texts
   - Calculates sentiment difference
   - Score = 1 - (sentiment\_difference / max\_difference)

   **Tone Stability** (single input):
   - Analyzes sentiment across sentences
   - Calculates sentiment variance
   - Score = 1 - (sentiment\_variance / max\_variance)

Final score: `mode_specific_score * scale`
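The two formulas above can be sketched in plain TypeScript. This is a minimal illustration and not the scorer's actual implementation: the function names, the use of population variance, the clamping to the 0-1 range, and the `maxDifference` / `maxVariance` bounds are all assumptions made for the example.

```typescript
// Sketch only: names, bounds, and clamping are assumptions, not Mastra internals.

/** Clamp a number into the [0, 1] range. */
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

/** Tone consistency: 1 - (sentiment_difference / max_difference). */
function consistencyScore(
  responseSentiment: number,
  referenceSentiment: number,
  maxDifference: number,
): number {
  const difference = Math.abs(responseSentiment - referenceSentiment);
  return clamp01(1 - difference / maxDifference);
}

/** Population variance of per-sentence sentiment scores (assumed variant). */
function variance(values: number[]): number {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
}

/** Tone stability: 1 - (sentiment_variance / max_variance). */
function stabilityScore(sentenceSentiments: number[], maxVariance: number): number {
  return clamp01(1 - variance(sentenceSentiments) / maxVariance);
}

// Hypothetical sentiments in [-1, 1], so the largest possible difference is 2.
const consistency = consistencyScore(0.8, 0.6, 2);

// Identical per-sentence sentiment gives zero variance, so stability is 1.
const stability = stabilityScore([0.5, 0.5, 0.5], 1);

console.log({ consistency, stability });
```

Multiplying either result by `scale` would give the final score described above.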
### Score interpretation
Scores range from 0 to `scale` (0 to 1 by default):
- 1.0: Perfect tone consistency/stability
- 0.7-0.9: Strong consistency with minor variations
- 0.4-0.6: Moderate consistency with noticeable shifts
- 0.1-0.3: Poor consistency with major tone changes
- 0.0: No consistency - completely different tones
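If you want to turn a score into one of these labels in code, a small helper along the following lines may be enough. It assumes the default 0-1 scale, and because the listed bands leave gaps (for example between 0.6 and 0.7), the cut-off points here are an assumption that treats each band's lower bound as its threshold. `interpretToneScore` and `ToneBand` are hypothetical names.

```typescript
// Hypothetical helper: thresholds are an assumption that closes the gaps
// between the documented bands by using each band's lower bound.
type ToneBand = 'perfect' | 'strong' | 'moderate' | 'poor' | 'none';

function interpretToneScore(score: number): ToneBand {
  if (score >= 1) return 'perfect'; // 1.0: perfect consistency/stability
  if (score >= 0.7) return 'strong'; // minor variations
  if (score >= 0.4) return 'moderate'; // noticeable shifts
  if (score > 0) return 'poor'; // major tone changes
  return 'none'; // 0.0: completely different tones
}

console.log(interpretToneScore(0.85));
```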
### `analyzeStepResult`
Object with tone metrics:
- **responseSentiment**: Sentiment score for the response (comparison mode).
- **referenceSentiment**: Sentiment score for the input/reference (comparison mode).
- **difference**: Absolute difference between sentiment scores (comparison mode).
- **avgSentiment**: Average sentiment across sentences (stability mode).
- **sentimentVariance**: Variance of sentiment across sentences (stability mode).
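Because the populated fields depend on the mode, consuming code may want to narrow the object before reading it. The sketch below shows one way to do that with a TypeScript type guard. The type names (`ComparisonMetrics`, `StabilityMetrics`, `ToneMetrics`) and both functions are hypothetical and are not exported by Mastra; only the field names come from the list above.

```typescript
// Hypothetical types mirroring the documented fields for each mode.
interface ComparisonMetrics {
  responseSentiment: number;
  referenceSentiment: number;
  difference: number;
}

interface StabilityMetrics {
  avgSentiment: number;
  sentimentVariance: number;
}

type ToneMetrics = ComparisonMetrics | StabilityMetrics;

/** Narrow to comparison mode by checking for a field only that mode reports. */
function isComparison(metrics: ToneMetrics): metrics is ComparisonMetrics {
  return 'difference' in metrics;
}

/** Build a short, mode-aware summary of the metrics. */
function describeMetrics(metrics: ToneMetrics): string {
  if (isComparison(metrics)) {
    return `comparison: difference=${metrics.difference}`;
  }
  return `stability: variance=${metrics.sentimentVariance}`;
}

// Sample values are made up for illustration.
console.log(describeMetrics({ responseSentiment: 0.8, referenceSentiment: 0.6, difference: 0.2 }));
console.log(describeMetrics({ avgSentiment: 0.5, sentimentVariance: 0.01 }));
```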
## Example
Evaluate tone consistency between related agent responses:
```typescript
import { runEvals } from '@mastra/core/evals'
import { createToneScorer } from '@mastra/evals/scorers/prebuilt'
import { myAgent } from './agent'

const scorer = createToneScorer()

const result = await runEvals({
  data: [
    {
      input: 'How was your experience with our service?',
      groundTruth: 'The service was excellent and exceeded expectations!',
    },
    {
      input: 'Tell me about the customer support',
      groundTruth: 'The support team was friendly and very helpful.',
    },
  ],
  scorers: [scorer],
  target: myAgent,
  // Log the tone score for each item as it completes
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    })
  },
})

// Log the scores reported for the whole run
console.log(result.scores)
```
For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview) guide.
## Related
- [Content Similarity Scorer](https://mastra.ai/reference/evals/content-similarity)
- [Toxicity Scorer](https://mastra.ai/reference/evals/toxicity)