# Creating Pieces with Improv
This guide shows how to create reusable pieces for your AI workflows while keeping their evaluation datasets in separate files.
## What are Pieces?
Pieces are reusable AI workflow components that encapsulate:
- A specific task or operation
- Configuration (models, temperature, tools, etc.)
- Input/output schemas
- Metadata
## Basic Structure
```typescript
import { PieceDefinition } from '@flatfile/improv';
import { z } from 'zod';

// Define your output schema
const outputSchema = z.object({
  category: z.enum(['bug', 'feature', 'question']),
  priority: z.enum(['low', 'medium', 'high'])
});

// Define your piece
export const classifyIssue: PieceDefinition<z.infer<typeof outputSchema>> = {
  name: "classify_issue",
  play: (groove) => {
    const issue = groove.feelVibe("issue");
    return `Classify this GitHub issue: ${issue}`;
  },
  config: {
    outputSchema,
    temperature: 0.1,
    systemPrompt: "You are an expert at triaging GitHub issues."
  },
  meta: {
    version: "1.0.0",
    description: "Classifies GitHub issues by type and priority",
    author: "Your Team"
  }
};
```
## Separating Evaluation Data
To prevent evaluation datasets from being bundled in production, keep them in separate files:
### src/pieces/classify-issue/index.ts
```typescript
// Production piece definition
export const classifyIssue: PieceDefinition<ClassifyOutput> = {
  // ... piece definition
};
```
### src/pieces/classify-issue/eval.ts
```typescript
// Evaluation data - NOT imported in production
export const classifyIssueEvalDataset = [
  {
    input: "App crashes when clicking submit",
    expected: { category: "bug", priority: "high" }
  },
  {
    input: "Add dark mode support",
    expected: { category: "feature", priority: "medium" }
  }
];

// Evaluation function
export async function evaluateClassifyIssue(driver: ThreadDriver) {
  // ... evaluation logic
}
```
## Using Pieces in Production
```typescript
import { Gig } from '@flatfile/improv';
import { classifyIssue } from './pieces/classify-issue';
// Note: Do NOT import from './pieces/classify-issue/eval'

const workflow = new Gig({
  label: "Issue Triage",
  driver: yourDriver
});

workflow.add(classifyIssue);
```
## Running Evaluations (Development Only)
```typescript
// In test/evaluation files only - here it IS safe to import from eval.ts
import { classifyIssue } from '../src/pieces/classify-issue';
import { evaluateClassifyIssue } from '../src/pieces/classify-issue/eval';

async function runEvals() {
  // testDriver: whichever driver your test setup provides
  const results = await evaluateClassifyIssue(testDriver);
  console.log(`Accuracy: ${results.accuracy}`);
}

runEvals().catch(console.error);
```
## Best Practices
1. **File Organization**
   ```
   src/
   ├── pieces/
   │   ├── classify-issue/
   │   │   ├── index.ts    # Piece definition
   │   │   └── eval.ts     # Evaluation data (dev only)
   │   └── extract-entities/
   │       ├── index.ts
   │       └── eval.ts
   └── workflows/
       └── triage.ts       # Uses pieces
   ```
2. **Bundle Configuration**
   - Ensure your bundler excludes `eval.ts` files
   - Use dynamic imports for evaluation code if needed
   - Consider using environment variables to conditionally load eval data, as sketched below
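   A minimal sketch of those last two points, assuming a Node-style `NODE_ENV` variable and the file layout above:

   ```typescript
   // Hypothetical guard: the env check skips eval code at runtime, and the
   // dynamic import() keeps it out of the main production bundle path.
   if (process.env.NODE_ENV !== 'production') {
     const { classifyIssueEvalDataset, evaluateClassifyIssue } = await import(
       './pieces/classify-issue/eval'
     );
     console.log(`Loaded ${classifyIssueEvalDataset.length} eval cases`);
   }
   ```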
3. **TypeScript Configuration**
   ```json
   {
     "compilerOptions": {
       // ... your options
     },
     "exclude": [
       "**/*.eval.ts",
       "**/eval.ts"
     ]
   }
   ```
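   Note that `exclude` only keeps these files out of the compiler's default program; it does not stop them from being type-checked or bundled if other code imports them directly, which is why the no-import rule above still matters.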
## Example: Agent Piece with Tools
```typescript
import { PieceDefinition, Tool } from '@flatfile/improv';
import { z } from 'zod';

const searchTool = new Tool({
  name: "search_docs",
  description: "Search documentation",
  parameters: z.object({
    query: z.string()
  }),
  executeFn: async ({ query }) => {
    // Implementation
    return { results: [] };
  }
});

export const researchPiece: PieceDefinition = {
  name: "research",
  play: (groove) => {
    const question = groove.feelVibe("question");
    return `Research and answer: ${question}`;
  },
  config: {
    tools: [searchTool],
    instructions: [
      { instruction: "Always cite sources", priority: 1 }
    ]
  }
};
```
## Integration with Evaluation Frameworks
### Braintrust Example
```typescript
// src/pieces/classify/eval.ts
import { Eval } from 'braintrust';
import { classifyIssue } from './index';

export async function evaluateWithBraintrust(driver: ThreadDriver) {
  return Eval("Issue Classification", {
    data: () => evalDataset, // the evaluation cases defined in this file
    task: async (input) => {
      // Run the piece against `input` and return its output;
      // Braintrust scores it against the case's `expected` value
    }
  });
}
```
### Custom Evaluation
```typescript
// runPiece and deepEqual are helpers you supply: runPiece executes a piece
// against a single input; deepEqual does a structural equality check.
export async function evaluate(piece: PieceDefinition, dataset: EvalCase[]) {
  const results = [];
  for (const { input, expected } of dataset) {
    const output = await runPiece(piece, input);
    results.push({
      input,
      expected,
      actual: output,
      correct: deepEqual(output, expected)
    });
  }
  return {
    accuracy: results.filter(r => r.correct).length / results.length,
    results
  };
}
```
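Using this with the piece and dataset defined earlier (names reused from the snippets above):

```typescript
const { accuracy, results } = await evaluate(classifyIssue, classifyIssueEvalDataset);
console.log(`Accuracy: ${(accuracy * 100).toFixed(1)}%`);
```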
## See Examples
Check out the `example/pieces/` directory for complete examples of:
- Sentiment analysis piece
- Request classification piece
- Research agent piece
These examples show best practices for structuring pieces and their evaluations.