# @flatfile/improv
A powerful TypeScript library for building AI-powered applications with three complementary APIs: Solo for simple structured outputs, Agent for complex tool-enabled workflows, and Gig for orchestrating multi-step AI operations. Features type-safe outputs, built-in retry/fallback mechanisms, and support for multiple LLM providers.
## Core Features
- 🎯 **Solo**: Simple, type-safe API for one-off LLM calls with structured output
- 🤖 **Agent**: Advanced AI agents with tool usage, knowledge bases, and multi-step reasoning
- 🎭 **Gig**: Orchestrate complex workflows with sequential/parallel execution and dependencies
- 🧩 **Pieces**: Create reusable workflow components with built-in evaluation support
- 🛠️ **Tool Integration**: Define and execute custom tools with type-safe parameters using Zod schemas
- 📎 **Attachment Support**: Handle various types of attachments (documents, images, videos)
- 🔄 **Event Streaming**: Built-in event system for real-time monitoring and response handling
- 💾 **Memory Management**: Track and persist conversation history and agent state
- 🔧 **Multi-Provider Support**: AWS Bedrock, OpenAI, Cohere, Gemini, Cerebras, and more
## Table of Contents
- [Quick Start](#quick-start)
- [📚 Documentation](#-documentation)
- [New: Solo, Agent, and Gig](#new-solo-agent-and-gig)
- [Solo - Structured LLM Calls](#solo---structured-llm-calls)
- [Agent - Tool-Enabled Workflows](#agent---tool-enabled-workflows)
- [Gig - Workflow Orchestration](#gig---workflow-orchestration)
- [Pieces - Reusable Components](#pieces---reusable-components)
- [Core Components](#core-components)
- [Agent](#agent)
- [Thread](#thread)
- [Tool](#tool)
- [Message](#message)
- [Event System](#event-system)
- [Best Practices](#best-practices)
- [License](#license)
- [Contributing](#contributing)
- [Follow-up Messages](#follow-up-messages)
- [State Management](#state-management)
- [Agent State](#agent-state)
- [Thread State](#thread-state)
- [Tool State](#tool-state)
- [Evaluator System](#evaluator-system)
- [State Management Patterns](#state-management-patterns)
- [Three-Keyed Lock Pattern](#three-keyed-lock-pattern)
- [AWS Bedrock Integration](#aws-bedrock-integration)
- [Model Drivers](#model-drivers)
- [Tool Decorators](#tool-decorators)
- [Streaming Support](#streaming-support)
- [Reasoning Capabilities](#reasoning-capabilities)
## Quick Start
```typescript
import { Agent, Tool, BedrockThreadDriver } from '@flatfile/improv';
import { z } from 'zod';
// Create a custom tool
const calculatorTool = new Tool({
  name: 'calculator',
  description: 'Performs basic arithmetic operations',
  parameters: z.object({
    operation: z.enum(['add', 'subtract', 'multiply', 'divide']),
    a: z.number(),
    b: z.number(),
  }),
  executeFn: async (args) => {
    const { operation, a, b } = args;
    switch (operation) {
      case 'add': return a + b;
      case 'subtract': return a - b;
      case 'multiply': return a * b;
      case 'divide': return b === 0 ? 'Cannot divide by zero' : a / b;
    }
  }
});
// Initialize the Bedrock driver
const driver = new BedrockThreadDriver({
  model: 'anthropic.claude-3-haiku-20240307-v1:0',
  temperature: 0.7,
});

// Create an agent
const agent = new Agent({
  knowledge: [
    { fact: 'The agent can perform basic arithmetic operations.' }
  ],
  instructions: [
    { instruction: 'Use the calculator tool for arithmetic operations.', priority: 1 }
  ],
  tools: [calculatorTool],
  driver,
});
// Create and use a thread
const thread = agent.createThread({
  prompt: 'What is 25 multiplied by 4?',
  onResponse: async (message) => {
    console.log('Agent response:', message.content);
  }
});

// Send the thread and wait for the complete response
await thread.send();

// Alternatively, stream the response instead of calling send()
const stream = await thread.stream();
for await (const text of stream) {
  process.stdout.write(text); // Print each chunk as it arrives
}
```
## 📚 Documentation
**Comprehensive guides and API references:**
- **[📖 Complete Documentation](./docs/)** - All guides and references
- **[🚀 API Overview](./docs/api-overview.md)** - Complete overview of all APIs
- **[🎯 Event System Guide](./docs/events-guide.md)** - Using events and monitoring
- **[📋 Events Reference](./docs/events.md)** - Complete event registry
- **[🔄 Structured Output](./docs/structured-output.md)** - Type-safe LLM responses
- **[🤖 Agent & Solo Patterns](./docs/agent-solo-patterns.md)** - When to use each API
- **[🧩 Creating Pieces](./docs/creating-pieces.md)** - Building reusable components
## New: Solo, Agent, and Gig
Improv now provides simplified APIs for different AI use cases:
### Solo - Structured LLM Calls
For simple, one-off LLM calls with structured output:
```typescript
import { Solo } from '@flatfile/improv';
import { z } from 'zod';
const solo = new Solo({
  driver,
  outputSchema: z.object({
    sentiment: z.enum(["positive", "negative", "neutral"]),
    confidence: z.number().min(0).max(1)
  })
});
const result = await solo.ask("Analyze the sentiment: 'This product is amazing!'");
// result.output is fully typed: { sentiment: "positive", confidence: 0.95 }
```
**Key Features:**
- Type-safe structured output with Zod schemas
- Built-in retry logic with exponential backoff
- Fallback driver support
- Streaming support
- Simple API for classification and extraction tasks
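The built-in retry behavior can be pictured with a small standalone helper. This is an illustrative sketch of exponential backoff, not the library's actual implementation, and the parameter names (`retries`, `baseDelayMs`) are assumptions:

```typescript
// Illustrative retry-with-exponential-backoff helper, sketching the
// strategy Solo applies internally (not the library's actual code).
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait baseDelayMs, 2x, 4x, ... between attempts
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```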
### Agent - Tool-Enabled Workflows
For complex, multi-step workflows that require tool usage:
```typescript
import { Agent, Tool } from '@flatfile/improv';
const searchTool = new Tool({
  name: "search",
  description: "Search the knowledge base",
  parameters: z.object({ query: z.string() }),
  executeFn: async ({ query }) => {
    // Your search implementation
    return { results: ["Result 1", "Result 2"] };
  }
});

const agent = new Agent({
  driver,
  tools: [searchTool],
  instructions: [
    { instruction: "Always search before answering", priority: 1 }
  ]
});

const thread = agent.createThread({
  prompt: "What are the best practices for error handling?"
});

await thread.send();
```
### Gig - Workflow Orchestration
Orchestrate multiple AI operations with dependencies and control flow:
```typescript
import { Gig } from '@flatfile/improv';

const gig = new Gig({
  label: "Customer Support Workflow",
  driver
});

// Add pieces sequentially (default behavior)
gig
  .add("classify", groove =>
    `Classify this support request: "${groove.feelVibe("request")}"`, {
    outputSchema: z.enum(["technical", "billing", "general"])
  })
  .add("sentiment", groove =>
    `Analyze sentiment: "${groove.feelVibe("request")}"`, {
    outputSchema: z.enum(["positive", "negative", "neutral"])
  })
  // Pieces can access previous results
  .add("research", groove => {
    const category = groove.recall("classify");
    return `Research solutions for ${category} issue`;
  }, {
    tools: [searchTool],
    driver: cheaperModel // Override driver for cost optimization
  })
  // Explicit parallel execution when needed
  .parallel([
    ["check_status", "Check system status"],
    ["find_similar", "Find similar resolved issues"]
  ])
  .add("respond", groove => {
    const sentiment = groove.recall("sentiment");
    const research = groove.recall("research");
    return `Generate ${sentiment} response using: ${research}`;
  });

// Execute the workflow
const results = await gig.perform({
  request: "My invoice is wrong and I'm very frustrated!"
});

console.log(results.recordings.get("classify")); // "billing"
console.log(results.recordings.get("sentiment")); // "negative"
```
**Key Features:**
- Sequential execution by default (predictable flow)
- Explicit `parallel()` for concurrent operations
- Auto-detection: uses Solo for structured output, Agent for tools
- Driver overrides per piece for optimization
- Access previous results with `groove.recall()`
- Type-safe with full TypeScript support
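The auto-detection rule can be summarized as: a piece that declares tools runs through an Agent, and everything else runs as a Solo call. A minimal sketch of that decision (illustrative logic with a simplified config shape, not the library's source):

```typescript
// Sketch of the Solo-vs-Agent auto-detection described above.
// The config shape here is simplified for illustration.
type PieceConfigSketch = { outputSchema?: unknown; tools?: unknown[] };

function executionMode(config: PieceConfigSketch): 'solo' | 'agent' {
  // Tools require an Agent; structured-output-only pieces can use Solo.
  return config.tools && config.tools.length > 0 ? 'agent' : 'solo';
}
```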
### Pieces - Reusable Components
Create reusable workflow components that can be shared across projects:
```typescript
import { PieceDefinition } from '@flatfile/improv';
import { z } from 'zod';
// Define a reusable piece
export const sentimentAnalysis: PieceDefinition<"positive" | "negative" | "neutral"> = {
  name: "sentiment",
  play: (groove) => {
    const text = groove.feelVibe("text");
    return `Analyze sentiment of: "${text}"`;
  },
  config: {
    outputSchema: z.enum(["positive", "negative", "neutral"]),
    temperature: 0.1
  },
  meta: {
    version: "1.0.0",
    description: "Analyzes emotional tone of text"
  }
};
// Use in any Gig
gig.add(sentimentAnalysis);
```
**Organizing Pieces with Evaluations:**
```
src/
  pieces/
    sentiment/
      index.ts   # Piece definition (production)
      eval.ts    # Evaluation data (dev only)
```
This separation ensures evaluation datasets don't get bundled in production builds.
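As a sketch, an `eval.ts` file could pair example inputs with expected outputs. The field names below are illustrative assumptions, not a format the library mandates:

```typescript
// Hypothetical eval.ts for the sentiment piece (dev only, never imported
// by production code). Field names are illustrative.
export const sentimentEvalCases = [
  { input: { text: "This product is amazing!" }, expected: "positive" },
  { input: { text: "Worst purchase I've ever made." }, expected: "negative" },
  { input: { text: "The package arrived on Tuesday." }, expected: "neutral" },
] as const;
```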
## Reasoning Capabilities
Improv supports advanced reasoning capabilities through the `reasoning_config` option in the thread driver. This allows the AI to perform step-by-step reasoning before providing a final answer.
```typescript
import { Agent, BedrockThreadDriver } from '@flatfile/improv';
const driver = new BedrockThreadDriver({
  model: 'anthropic.claude-3-7-sonnet-20250219-v1:0',
  temperature: 1,
  reasoning_config: {
    budget_tokens: 1024,
    type: 'enabled',
  },
});

const agent = new Agent({
  driver,
});

const thread = agent.createThread({
  systemPrompt: 'You are a helpful assistant that can answer questions about the world.',
  prompt: 'How many people will live in the world in 2040?',
});

const result = await thread.send();
console.log(result.last());
```
This example enables the AI to work through its reasoning process with a token budget of 1024 tokens before providing a final answer about population projections.
## Core Components
### Agent
The main agent class that manages knowledge, instructions, tools, and conversation threads.
```typescript
const agent = new Agent({
  knowledge?: AgentKnowledge[],      // Array of facts with optional source and timestamp
  instructions?: AgentInstruction[], // Array of prioritized instructions
  memory?: AgentMemory[],            // Array of stored thread histories
  systemPrompt?: string,             // Base system prompt
  tools?: Tool[],                    // Array of available tools
  driver: ThreadDriver,              // Thread driver implementation
  evaluators?: Evaluator[]           // Array of evaluators for response processing
});
```
### Thread
Manages a single conversation thread with message history and tool execution.
```typescript
const thread = new Thread({
  messages?: Message[],        // Array of conversation messages
  tools?: Tool[],              // Array of available tools
  driver: ThreadDriver,        // Thread driver implementation
  toolChoice?: 'auto' | 'any', // Tool selection mode
  maxSteps?: number            // Maximum number of tool execution steps
});
```
### Tool
Define custom tools that the agent can use during conversations.
```typescript
const tool = new Tool({
  name: string,              // Tool name
  description: string,       // Tool description
  parameters: z.ZodTypeAny,  // Zod schema for parameter validation
  followUpMessage?: string,  // Optional message to guide response evaluation
  executeFn: (args: Record<string, any>, toolCall: ToolCall) => Promise<any> // Tool execution function
});
```
### Message
Represents a single message in a conversation thread.
```typescript
const message = new Message({
  content?: string,                               // Message content
  role: 'system' | 'user' | 'assistant' | 'tool', // Message role
  toolCalls?: ToolCall[],                         // Array of tool calls
  toolResults?: ToolResult[],                     // Array of tool results
  attachments?: Attachment[],                     // Array of attachments
  cache?: boolean                                 // Whether to cache the message
});
```
## Event System
The library uses an event-driven architecture. All major components extend `EventSource`, allowing you to listen for various events:
```typescript
// Agent events
agent.on('agent.thread-added', ({ agent, thread }) => {});
agent.on('agent.thread-removed', ({ agent, thread }) => {});
agent.on('agent.knowledge-added', ({ agent, knowledge }) => {});
agent.on('agent.instruction-added', ({ agent, instruction }) => {});
// Thread events
thread.on('thread.response', ({ thread, message }) => {});
thread.on('thread.max_steps_reached', ({ thread, steps }) => {});
// Tool events
tool.on('tool.execution.started', ({ tool, name, args }) => {});
tool.on('tool.execution.completed', ({ tool, name, args, result }) => {});
tool.on('tool.execution.failed', ({ tool, name, args, error }) => {});
```
## Best Practices
1. **Choosing the Right API**
- Use **Solo** for: Classification, extraction, structured output, simple Q&A
- Use **Agent** for: Tool usage, multi-step reasoning, complex decision making
- Use **Gig** for: Orchestrating multiple operations, workflows with dependencies
2. **Piece Design**
- Keep pieces focused on a single task
- Use descriptive names that indicate the piece's purpose
- Leverage `groove.recall()` to access previous results
- Override drivers for cost optimization (e.g., cheaper models for simple tasks)
3. **Tool Design**
- Keep tools atomic and focused on a single responsibility
- Use Zod schemas for robust parameter validation
- Implement proper error handling in tool execution
- Use follow-up messages to guide response evaluation
4. **Workflow Organization**
- Default to sequential execution for predictable flow
- Use `parallel()` only when operations are truly independent
- Separate evaluation data into `eval.ts` files
- Create reusable pieces for common operations
5. **Error Handling & Resilience**
- Configure retry logic with appropriate backoff
- Set up fallback drivers for critical operations
- Use `onError` handlers in Gig pieces
- Monitor events for debugging and observability
6. **Type Safety**
- Define output schemas with Zod for Solo operations
- Use `PieceDefinition<T>` for type-safe pieces
- Leverage TypeScript's type inference with `groove.recall()`
## License
MIT
## Contributing
Contributions are welcome! Please read our contributing guidelines for details.
## Follow-up Messages
Tools can include follow-up messages that guide the AI's evaluation of tool responses. This is particularly useful for:
- Providing context for tool results
- Guiding the AI's interpretation of data
- Maintaining consistent response patterns
- Suggesting next steps or actions
```typescript
const tool = new Tool({
  name: 'dataAnalyzer',
  description: 'Analyzes data and returns insights',
  parameters: z.object({
    data: z.array(z.any()),
    metrics: z.array(z.string())
  }),
  followUpMessage: `Review the analysis results:
1. What are the key insights from the data?
2. Are there any concerning patterns?
3. What actions should be taken based on these results?`,
  executeFn: async (args) => {
    // Tool implementation
  }
});
```
## State Management
The library provides several mechanisms for managing state:
### Agent State
- Knowledge base for storing facts
- Prioritized instructions for behavior guidance
- Memory system for storing thread histories
- System prompt for base context
### Thread State
- Message history tracking
- Tool execution state
- Maximum step limits
- Response handlers
### Tool State
- Parameter validation
- Execution tracking
- Result processing
- Event emission
### Evaluator System
Evaluators provide a way to process and validate agent responses:
```typescript
const evaluator: Evaluator = async ({ thread, agent }, complete) => {
  // Process the thread response
  const lastMessage = thread.last();

  if (lastMessage?.content?.includes('done')) {
    complete(); // Signal completion
  } else {
    // Continue processing
    await thread.send(new Message({
      content: 'Please continue with the task...'
    }));
  }
};

const agent = new Agent({
  // ... other options ...
  evaluators: [evaluator]
});
```
Evaluators can:
- Process agent responses
- Trigger additional actions
- Control conversation flow
- Validate results
## State Management Patterns
### Three-Keyed Lock Pattern
The three-keyed lock pattern is a state management pattern that ensures controlled flow through tool execution, evaluation, and completion phases. It's implemented as a reusable evaluator:
```typescript
import { threeKeyedLockEvaluator } from '@flatfile/improv';
const agent = new Agent({
  // ... other options ...
  evaluators: [
    threeKeyedLockEvaluator({
      evalPrompt: "Are there other items to process? If not, say 'done'",
      exitPrompt: "Please provide a final summary of all actions taken."
    })
  ]
});
```
The pattern works through three distinct states:
```mermaid
stateDiagram-v2
    [*] --> ToolExecution

    state "Tool Execution" as ToolExecution {
        [*] --> Running
        Running --> Complete
        Complete --> [*]
    }
    state "Evaluation" as Evaluation {
        [*] --> CheckMore
        CheckMore --> [*]
    }
    state "Summary" as Summary {
        [*] --> Summarize
        Summarize --> [*]
    }

    ToolExecution --> Evaluation: Non-tool response
    Evaluation --> ToolExecution: Tool called
    Evaluation --> Summary: No more items
    Summary --> [*]: Complete

    note right of ToolExecution
        isEvaluatingTools = true
        Handles tool execution
    end note
    note right of Evaluation
        isEvaluatingTools = false
        nextMessageIsSummary = false
        Checks for more work
    end note
    note right of Summary
        nextMessageIsSummary = true
        Gets final summary
    end note
```
The evaluator manages these states through:
1. **Tool Execution State**
- Tracks when tools are being executed
- Resets state when new tools are called
- Handles multiple tool executions
2. **Evaluation State**
- Triggered after tool completion
- Prompts for more items to process
- Can return to tool execution if needed
3. **Summary State**
- Final state before completion
- Gathers summary of actions
- Signals completion when done
Key features:
- Automatic state transitions
- Event-based flow control
- Clean event listener management
- Configurable prompts
- Support for multiple tool executions
Example usage with custom prompts:
```typescript
const workflowAgent = new Agent({
  // ... agent configuration ...
  evaluators: [
    threeKeyedLockEvaluator({
      evalPrompt: "Review the results. Should we process more items?",
      exitPrompt: "Provide a detailed summary of all processed items."
    })
  ]
});
// The evaluator will automatically:
// 1. Let tools execute freely
// 2. After each tool completion, check if more processing is needed
// 3. When no more items need processing, request a final summary
// 4. Complete the evaluation after receiving the summary
```
This pattern is particularly useful for:
- Processing multiple items sequentially
- Workflows requiring validation between steps
- Tasks with dynamic tool usage
- Operations requiring final summaries
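The state transitions in the diagram above can be condensed into a small standalone function. This is a conceptual sketch of the pattern only, not the evaluator's actual implementation:

```typescript
// Conceptual sketch of the three-keyed lock transitions (not library code).
type LockState = 'tool-execution' | 'evaluation' | 'summary' | 'done';
type LockEvent = 'non-tool-response' | 'tool-called' | 'no-more-items' | 'summary-received';

function nextState(state: LockState, event: LockEvent): LockState {
  switch (state) {
    case 'tool-execution':
      // Leave tool execution once a non-tool response arrives
      return event === 'non-tool-response' ? 'evaluation' : 'tool-execution';
    case 'evaluation':
      // Either loop back into tool execution or move on to the summary
      if (event === 'tool-called') return 'tool-execution';
      if (event === 'no-more-items') return 'summary';
      return 'evaluation';
    case 'summary':
      return event === 'summary-received' ? 'done' : 'summary';
    default:
      return 'done';
  }
}
```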
## AWS Bedrock Integration
The library uses AWS Bedrock (Claude) as its default LLM provider. Configure your AWS credentials:
```typescript
// Required environment variables
process.env.AWS_ACCESS_KEY_ID = 'your-access-key';
process.env.AWS_SECRET_ACCESS_KEY = 'your-secret-key';
process.env.AWS_REGION = 'your-region';
// Initialize the driver
const driver = new BedrockThreadDriver({
model: 'anthropic.claude-3-haiku-20240307-v1:0', // Default model
temperature?: number, // Default: 0.7
maxTokens?: number, // Default: 4096
cache?: boolean // Default: false
});
```
## Model Drivers
Improv supports multiple LLM providers through dedicated thread drivers:
### Available Drivers
| Driver | Provider | Documentation |
|--------|----------|---------------|
| `BedrockThreadDriver` | AWS Bedrock (Claude) | [Bedrock Driver Documentation](src/model.drivers/bedrock.driver.md) |
| `OpenAIThreadDriver` | OpenAI | [OpenAI Driver Documentation](src/model.drivers/openai.driver.md) |
| `CohereThreadDriver` | Cohere | [Cohere Driver Documentation](src/model.drivers/cohere.driver.md) |
| `GeminiThreadDriver` | Google Gemini | [Gemini Driver Documentation](src/model.drivers/gemini.driver.md) |
| `CerebrasThreadDriver` | Cerebras | [Cerebras Driver Documentation](src/model.drivers/cerebras.driver.md) |
Each driver provides a consistent interface while supporting model-specific features:
```typescript
// OpenAI example
import { OpenAIThreadDriver } from '@flatfile/improv';

const driver = new OpenAIThreadDriver({
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_API_KEY,
  temperature: 0.7
});

// Cohere example
import { CohereThreadDriver } from '@flatfile/improv';

const driver = new CohereThreadDriver({
  model: 'command-r-plus',
  apiKey: process.env.COHERE_API_KEY
});

// Gemini example
import { GeminiThreadDriver } from '@flatfile/improv';

const driver = new GeminiThreadDriver({
  model: 'gemini-1.5-pro',
  apiKey: process.env.GOOGLE_API_KEY
});

// Cerebras example
import { CerebrasThreadDriver } from '@flatfile/improv';

const driver = new CerebrasThreadDriver({
  model: 'llama-4-scout-17b-16e-instruct',
  apiKey: process.env.CEREBRAS_API_KEY
});
```
Refer to each driver's documentation for available models and specific configuration options.
## Tool Decorators
The library provides decorators for creating tools directly on agent classes:
```typescript
import { Agent, ToolName, ToolDescription, ToolParam } from '@flatfile/improv';
import { z } from 'zod';

class CustomAgent extends Agent {
  @ToolName("sampleData")
  @ToolDescription("Sample the original data with the mapping program")
  private async sampleData(
    @ToolParam("count", "Number of records to sample", z.number())
    count: number,
    @ToolParam("seed", "Random seed", z.number().optional())
    seed?: number
  ): Promise<any> {
    return { count, seed };
  }
}
```
This provides:
- Type-safe tool definitions
- Automatic parameter validation
- Clean method-based tools
- Integrated error handling
## Streaming Support
The library provides built-in support for streaming responses from the AI model. Key features:
- Real-time text chunks as they're generated
- Automatic message management in the thread
- Event emission for stream progress
- Error handling and recovery
- Compatible with all thread features including tool calls
```typescript
const thread = agent.createThread({
  prompt: 'What is 25 multiplied by 4?',
});
const stream = await thread.stream();
for await (const text of stream) {
process.stdout.write(text);
}
// The final response is also available in the thread
console.log(thread.last()?.content);
```