# @prism-lang/confidence
Confidence extraction library for Prism - standardized patterns for extracting confidence values from LLMs and other sources.
**[Full Documentation](https://docs.prismlang.dev/docs/confidence/)** | **[API Reference](https://docs.prismlang.dev/docs/api/confidence/)** | **[Examples](https://docs.prismlang.dev/docs/examples/confidence-extraction)**
Because most LLM providers do not expose token log probabilities (OpenAI is a notable exception), this library focuses on practical confidence extraction methods:
- **Consistency-based extraction** - Sample multiple times and measure agreement
- **Response analysis** - Detect hedging language and certainty markers
- **Structured parsing** - Extract explicit confidence values from responses
- **Domain calibration** - Adjust confidence based on domain-specific knowledge
- **Ensemble methods** - Combine multiple confidence signals
## Installation
```bash
npm install @prism-lang/confidence
```
## Quick Start
### Level 1: Simple API
```typescript
import { confidence } from '@prism-lang/confidence';
// Extract confidence from any response
const response = "I'm fairly certain this is a SQL injection vulnerability.";
const result = await confidence.extract(response);
console.log(result.value); // 0.72
console.log(result.explanation); // "Response analysis confidence: 72.0% based on..."
```
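### Level 2: Choosing an Extraction Method
```typescript
import { confidence } from '@prism-lang/confidence';

// Specify extraction method
const response = "I'm fairly certain this is a SQL injection vulnerability.";
const result = await confidence.extractWithOptions(response, {
  method: 'response_analysis',
  checkHedging: true,
  checkCertainty: true
});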
```
### Level 3: Combining Extractors, Ensembles, and Calibration
```typescript
import { ConfidenceExtractor, ConfidenceEnsemble, SecurityCalibrator } from '@prism-lang/confidence';

// `llm`, `prompt`, and `response` are assumed to be provided by your application.
const extractor = new ConfidenceExtractor();
const ensemble = new ConfidenceEnsemble({
  weights: { consistency: 0.4, analysis: 0.3, structured: 0.3 }
});

// Multiple extraction methods
const signals = {
  consistency: await extractor.fromConsistency(async () => llm(prompt), { samples: 5 }),
  analysis: await extractor.fromResponseAnalysis(response),
  structured: await extractor.fromStructuredResponse(response)
};

// Combine signals
const combined = ensemble.combine(signals);

// Apply domain calibration
const calibrator = new SecurityCalibrator();
const final = await calibrator.calibrate(combined.value, {
  type: 'sql_injection',
  codeFeatures: ['parameterized_queries']
});
```
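To make the role of the `weights` option concrete, here is a minimal, self-contained sketch of a weighted average over confidence signals. It is an illustration only and not the library's implementation: the helper `combineWeighted` and its renormalisation rule are assumptions.

```typescript
// Hypothetical helper: weighted average of confidence signals in [0, 1].
// Signals without a weight contribute nothing; the result is normalised by
// the total weight of the signals that are present.
type Signals = Record<string, number>;

function combineWeighted(signals: Signals, weights: Signals): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const name of Object.keys(signals)) {
    const weight = weights[name] ?? 0;
    weightedSum += weight * (signals[name] ?? 0);
    totalWeight += weight;
  }
  // With no usable weights there is nothing to combine.
  return totalWeight > 0 ? weightedSum / totalWeight : 0;
}

const combinedValue = combineWeighted(
  { consistency: 0.8, analysis: 0.6, structured: 0.9 },
  { consistency: 0.4, analysis: 0.3, structured: 0.3 }
);
console.log(combinedValue.toFixed(2)); // 0.77
```

Normalising by the total weight keeps the result in [0, 1] even when the weights do not sum to 1 or a signal is missing.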
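## Consistency-Based Extraction
```typescript
// Assumes `extractor` (a ConfidenceExtractor) and `llm` are already defined.
const sampler = async () => llm("Is this code secure?");
const result = await extractor.fromConsistency(sampler, {
  samples: 5,
  temperature: [0.1, 0.3, 0.5, 0.7, 0.9]
});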
```
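One simple way to turn repeated samples into a confidence value is to measure how often the most common answer appears. The sketch below shows that idea in isolation. It is not how `fromConsistency` is implemented; `agreementConfidence` and its normalisation (trim plus lowercase) are assumptions made for illustration.

```typescript
// Hypothetical helper: confidence is the share of samples that agree with
// the most common answer after trimming and lowercasing.
function agreementConfidence(answers: string[]): { answer: string; value: number } {
  if (answers.length === 0) {
    return { answer: "", value: 0 };
  }
  const counts = new Map<string, number>();
  for (const raw of answers) {
    const key = raw.trim().toLowerCase();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  let best = "";
  let bestCount = 0;
  counts.forEach((count, key) => {
    if (count > bestCount) {
      best = key;
      bestCount = count;
    }
  });
  return { answer: best, value: bestCount / answers.length };
}

const agreement = agreementConfidence(["Yes", "yes", "No", "yes ", "Yes"]);
console.log(agreement.answer, agreement.value); // yes 0.8
```

Exact string matching only works for short, categorical answers. Free-form responses would need some form of semantic comparison instead.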
## Response Analysis
```typescript
const result = await extractor.fromResponseAnalysis(llmResponse, {
  checkHedging: true,      // "might be", "possibly"
  checkCertainty: true,    // "definitely", "certainly"
  checkSpecificity: true,  // Specific details vs vague
  checkCompleteness: true  // Response length and structure
});
```
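As a rough illustration of what hedging and certainty checks can look like, the sketch below scores a response by counting marker phrases. The marker lists, the neutral starting point of 0.5, and the step of 0.15 per marker are all assumptions chosen for the example, and do not reflect the library's actual heuristics.

```typescript
// Hypothetical marker lists; a real analyser would use a larger vocabulary
// and tuned weights.
const HEDGES = ["might be", "possibly", "perhaps", "not sure"];
const CERTAINTY = ["definitely", "certainly", "clearly"];

// Counts how many distinct markers occur in the text (case-insensitive).
function countMarkers(text: string, markers: string[]): number {
  const lower = text.toLowerCase();
  return markers.filter((marker) => lower.includes(marker)).length;
}

function markerConfidence(text: string): number {
  const base = 0.5;  // neutral starting point (assumption)
  const step = 0.15; // shift per distinct marker (assumption)
  const score =
    base + step * countMarkers(text, CERTAINTY) - step * countMarkers(text, HEDGES);
  return Math.min(1, Math.max(0, score)); // clamp to [0, 1]
}

const confidentScore = markerConfidence("This is definitely a SQL injection.");
const hedgedScore = markerConfidence("This might be an issue, possibly.");
console.log(confidentScore > hedgedScore); // true
```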
## Structured Parsing
```typescript
// Automatically detects patterns like:
// - "confidence: 85%"
// - "certainty: high"
// - "7/10 confident"
const result = await extractor.fromStructuredResponse(response);
```
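The three patterns listed in the comments above can be recognised with a few regular expressions. The following is a minimal sketch under stated assumptions: `parseExplicitConfidence` is a hypothetical helper, and the numeric values assigned to "high", "medium", and "low" are placeholders rather than the library's mapping.

```typescript
// Assumed mapping from verbal levels to numbers (illustration only).
const LEVELS: Record<string, number> = { high: 0.9, medium: 0.6, low: 0.3 };

function parseExplicitConfidence(text: string): number | null {
  // "confidence: 85%" or "certainty: 85%"
  const percent = /(?:confidence|certainty)\s*:\s*(\d{1,3})\s*%/i.exec(text);
  if (percent && percent[1] !== undefined) {
    return Math.min(100, Number(percent[1])) / 100;
  }
  // "certainty: high"
  const level = /(?:confidence|certainty)\s*:\s*(high|medium|low)\b/i.exec(text);
  if (level && level[1] !== undefined) {
    return LEVELS[level[1].toLowerCase()] ?? null;
  }
  // "7/10 confident"
  const ratio = /(\d+(?:\.\d+)?)\s*\/\s*10\s+confident/i.exec(text);
  if (ratio && ratio[1] !== undefined) {
    return Math.min(10, Number(ratio[1])) / 10;
  }
  return null; // no explicit confidence found
}

console.log(parseExplicitConfidence("Confidence: 85%"));    // 0.85
console.log(parseExplicitConfidence("certainty: high"));    // 0.9
console.log(parseExplicitConfidence("I'm 7/10 confident")); // 0.7
console.log(parseExplicitConfidence("No idea."));           // null
```

Returning `null` instead of a default value lets the caller fall back to another extraction method when the response contains no explicit confidence.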
## Domain Calibration
```typescript
// Pre-built calibrators
import { calibrators, DomainCalibrator } from '@prism-lang/confidence';

const calibrated = await calibrators.security.calibrate(0.8, {
  type: 'sql_injection',
  codeComplexity: 'high'
});

// Custom calibrator (assumes DomainCalibrator is exported from the package root)
class MyCalibrator extends DomainCalibrator {
  // ... implement domain-specific logic
}
```
## Confidence Budgets
```typescript
// `result1`, `result2`, and `getThirdOpinion` are placeholders for your own results.
const budget = new ConfidenceBudgetManager(2.5); // Require total confidence of 2.5
budget.add(result1, 0.8);
budget.add(result2, 0.9);
if (!budget.met()) {
  const result3 = await getThirdOpinion();
  budget.add(result3, 0.9);
}
```
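The budget idea in the snippet above can be read as "keep collecting opinions until their summed confidence reaches a target". The class below is a stripped-down stand-in that shows only that accounting. It is not `ConfidenceBudgetManager`: the name `SimpleConfidenceBudget`, the `remaining()` method, and the single-argument `add` are assumptions.

```typescript
// Hypothetical stand-in: tracks a running total of confidence values and
// reports whether the required total has been reached.
class SimpleConfidenceBudget {
  private total = 0;
  private readonly required: number;

  constructor(required: number) {
    this.required = required;
  }

  add(confidence: number): void {
    this.total += confidence;
  }

  met(): boolean {
    return this.total >= this.required;
  }

  // How much confidence is still needed (never negative).
  remaining(): number {
    return Math.max(0, this.required - this.total);
  }
}

const simpleBudget = new SimpleConfidenceBudget(2.5);
simpleBudget.add(0.8);
simpleBudget.add(0.9);
const metAfterTwo = simpleBudget.met();   // false: about 1.7 of 2.5
simpleBudget.add(0.9);
const metAfterThree = simpleBudget.met(); // true: about 2.6 of 2.5
console.log(metAfterTwo, metAfterThree);  // false true
```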
## Confidence Contracts
```typescript
const contract = new ConfidenceContractManager({
  security_check: 0.9,
  performance_check: 0.7,
  style_check: 0.5
});
const verification = contract.verify(results);
if (!verification.passed) {
  console.log('Failed checks:', verification.failures);
}
```
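Conceptually, a confidence contract compares each check's confidence against its minimum threshold. The sketch below shows one way to express that comparison. `verifyContract` is a hypothetical function, and treating a missing result as a failure is an assumption rather than documented behaviour of `ConfidenceContractManager`.

```typescript
type Thresholds = Record<string, number>;

// Hypothetical helper: a check fails when its result is missing or falls
// below the required minimum.
function verifyContract(
  required: Thresholds,
  actual: Thresholds
): { passed: boolean; failures: string[] } {
  const failures = Object.keys(required).filter((check) => {
    const value = actual[check];
    return value === undefined || value < (required[check] ?? 0);
  });
  return { passed: failures.length === 0, failures };
}

const contractCheck = verifyContract(
  { security_check: 0.9, performance_check: 0.7, style_check: 0.5 },
  { security_check: 0.95, performance_check: 0.6 }
);
console.log(contractCheck.passed);   // false
console.log(contractCheck.failures); // [ 'performance_check', 'style_check' ]
```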
## Differential Confidence
```typescript
const differential = new DifferentialConfidenceManager();
differential.setAspect('disease_identification', 0.9);
differential.setAspect('severity_assessment', 0.6);
differential.setAspect('treatment_selection', 0.4);
const highest = differential.getHighest();
// Use appropriate confidence for each decision
```
## Sensor Confidence
```typescript
const sensorExtractor = new SensorConfidenceExtractor();
const confidence = sensorExtractor.fromSensor(reading, {
  age: sensor.daysSinceCalibration(),
  environment: { temperature: 25, humidity: 60 },
  history: 0.95 // Historical accuracy
});
```
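To illustrate how calibration age might reduce a sensor's confidence, the sketch below applies exponential decay to a historical accuracy figure. This is one possible model, not a description of `SensorConfidenceExtractor`: the function `decayedConfidence` and the 30-day half-life are assumptions.

```typescript
// Hypothetical model: confidence halves every `halfLifeDays` since the
// sensor was last calibrated.
function decayedConfidence(
  historicalAccuracy: number,
  daysSinceCalibration: number,
  halfLifeDays = 30
): number {
  const age = Math.max(0, daysSinceCalibration); // ignore negative ages
  const decay = Math.pow(0.5, age / halfLifeDays);
  return historicalAccuracy * decay;
}

const freshReading = decayedConfidence(0.95, 0);
const staleReading = decayedConfidence(0.95, 30);
console.log(freshReading); // 0.95
console.log(staleReading); // 0.475
```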
## API Reliability
```typescript
const apiExtractor = new APIConfidenceExtractor();
const confidence = apiExtractor.fromAPIReliability({
  provider: 'weather.com',
  historicalAccuracy: 0.85,
  latency: 250,
  lastFailure: new Date('2024-01-01')
});
```
## Usage in Prism
```prism
import confidence from "@prism-lang/confidence"

// Simple extraction
response = llm("Analyze this code")
conf = confidence.extract(response)
result = response ~> conf

// Consistency-based
samples = confidence.sample(
  prompt => llm(prompt),
  "Is this secure?",
  n=5
)
conf = confidence.from_consistency(samples)
result = samples[0] ~> conf

// With calibration
calibrator = confidence.calibrators.security
raw_conf = confidence.extract(response)
calibrated = calibrator.calibrate(raw_conf, {type: "sql_injection"})
result = response ~> calibrated

// Confidence-aware control flow
uncertain if (result) {
  high { deploy() }
  medium { require_review() }
  low { escalate() }
}
```
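For readers coming from TypeScript, the `uncertain if` block above can be thought of as routing on confidence bands. The sketch below mimics that routing in plain TypeScript. The cut-offs of 0.7 and 0.4 are placeholders chosen for the example and are not Prism's actual thresholds, and `confidenceBand` is a hypothetical helper.

```typescript
type Band = "high" | "medium" | "low";

// Hypothetical banding with assumed cut-offs.
function confidenceBand(value: number, highCutoff = 0.7, mediumCutoff = 0.4): Band {
  if (value >= highCutoff) return "high";
  if (value >= mediumCutoff) return "medium";
  return "low";
}

// Action names mirror the branches of the Prism example.
const actions: Record<Band, string> = {
  high: "deploy",
  medium: "require_review",
  low: "escalate",
};

console.log(actions[confidenceBand(0.85)]); // deploy
console.log(actions[confidenceBand(0.5)]);  // require_review
console.log(actions[confidenceBand(0.1)]);  // escalate
```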
## Examples
See the `/examples` directory for complete examples including:
- Security vulnerability analysis with confidence
- Medical diagnosis with differential confidence
- Sensor fusion with temporal decay
- API response validation
## Related Packages
- [`@prism-lang/core`](https://www.npmjs.com/package/@prism-lang/core) - Core language implementation
- [`@prism-lang/llm`](https://www.npmjs.com/package/@prism-lang/llm) - LLM provider integrations
- [`@prism-lang/validator`](https://www.npmjs.com/package/@prism-lang/validator) - Validation toolkit
- [`@prism-lang/cli`](https://www.npmjs.com/package/@prism-lang/cli) - Command-line interface
- [`@prism-lang/repl`](https://www.npmjs.com/package/@prism-lang/repl) - Interactive REPL
## Contributing
See [CONTRIBUTING.md](../../CONTRIBUTING.md) for guidelines.
## License
MIT