# @flatfile/improv

A powerful TypeScript library for building AI-powered applications with three complementary APIs: Solo for simple structured outputs, Agent for complex tool-enabled workflows, and Gig for orchestrating multi-step AI operations. Features type-safe outputs, built-in retry/fallback mechanisms, and support for multiple LLM providers.

## Core Features

- 🎯 **Solo**: Simple, type-safe API for one-off LLM calls with structured output
- 🤖 **Agent**: Advanced AI agents with tool usage, knowledge bases, and multi-step reasoning
- 🎭 **Gig**: Orchestrate complex workflows with sequential/parallel execution and dependencies
- 🧩 **Pieces**: Create reusable workflow components with built-in evaluation support
- 🛠️ **Tool Integration**: Define and execute custom tools with type-safe parameters using Zod schemas
- 📎 **Attachment Support**: Handle various types of attachments (documents, images, videos)
- 🔄 **Event Streaming**: Built-in event system for real-time monitoring and response handling
- 💾 **Memory Management**: Track and persist conversation history and agent state
- 🔧 **Multi-Provider Support**: AWS Bedrock, OpenAI, Cohere, Gemini, Cerebras, and more

## Table of Contents

- [Quick Start](#quick-start)
- [📚 Documentation](#-documentation)
- [New: Solo, Agent, and Gig](#new-solo-agent-and-gig)
  - [Solo - Structured LLM Calls](#solo---structured-llm-calls)
  - [Agent - Tool-Enabled Workflows](#agent---tool-enabled-workflows)
  - [Gig - Workflow Orchestration](#gig---workflow-orchestration)
  - [Pieces - Reusable Components](#pieces---reusable-components)
- [Core Components](#core-components)
  - [Agent](#agent)
  - [Thread](#thread)
  - [Tool](#tool)
  - [Message](#message)
- [Event System](#event-system)
- [Best Practices](#best-practices)
- [License](#license)
- [Contributing](#contributing)
- [Follow-up Messages](#follow-up-messages)
- [State Management](#state-management)
  - [Agent State](#agent-state)
  - [Thread State](#thread-state)
  - [Tool State](#tool-state)
  - [Evaluator System](#evaluator-system)
- [State Management Patterns](#state-management-patterns)
  - [Three-Keyed Lock Pattern](#three-keyed-lock-pattern)
- [AWS Bedrock Integration](#aws-bedrock-integration)
- [Model Drivers](#model-drivers)
- [Tool Decorators](#tool-decorators)
- [Streaming Support](#streaming-support)
- [Reasoning Capabilities](#reasoning-capabilities)

## Quick Start

```typescript
import { Agent, Tool, BedrockThreadDriver } from '@flatfile/improv';
import { z } from 'zod';

// Create a custom tool
const calculatorTool = new Tool({
  name: 'calculator',
  description: 'Performs basic arithmetic operations',
  parameters: z.object({
    operation: z.enum(['add', 'subtract', 'multiply', 'divide']),
    a: z.number(),
    b: z.number(),
  }),
  executeFn: async (args) => {
    const { operation, a, b } = args;
    switch (operation) {
      case 'add': return a + b;
      case 'subtract': return a - b;
      case 'multiply': return a * b;
      case 'divide': return a / b;
    }
  }
});

// Initialize the Bedrock driver
const driver = new BedrockThreadDriver({
  model: 'anthropic.claude-3-haiku-20240307-v1:0',
  temperature: 0.7,
});

// Create an agent
const agent = new Agent({
  knowledge: [
    { fact: 'The agent can perform basic arithmetic operations.' }
  ],
  instructions: [
    { instruction: 'Use the calculator tool for arithmetic operations.', priority: 1 }
  ],
  tools: [calculatorTool],
  driver,
});

// Create and use a thread
const thread = agent.createThread({
  prompt: 'What is 25 multiplied by 4?',
  onResponse: async (message) => {
    console.log('Agent response:', message.content);
  }
});

// Send the thread
await thread.send();

// Stream the response
const stream = await thread.stream();
for await (const text of stream) {
  process.stdout.write(text); // Print each chunk as it arrives
}
```

## 📚 Documentation

**Comprehensive guides and API references:**

- **[📖 Complete Documentation](./docs/)** - All guides and references
- **[🚀 API Overview](./docs/api-overview.md)** - Complete overview of all APIs
- **[🎯 Event System Guide](./docs/events-guide.md)** - Using events and monitoring
- **[📋 Events Reference](./docs/events.md)** - Complete event registry
- **[🔄 Structured Output](./docs/structured-output.md)** - Type-safe LLM responses
- **[🤖 Agent & Solo Patterns](./docs/agent-solo-patterns.md)** - When to use each API
- **[🧩 Creating Pieces](./docs/creating-pieces.md)** - Building reusable components

## New: Solo, Agent, and Gig

Improv now provides simplified APIs for different AI use cases:

### Solo - Structured LLM Calls

For simple, one-off LLM calls with structured output:

```typescript
import { Solo } from '@flatfile/improv';
import { z } from 'zod';

const solo = new Solo({
  driver,
  outputSchema: z.object({
    sentiment: z.enum(["positive", "negative", "neutral"]),
    confidence: z.number().min(0).max(1)
  })
});

const result = await solo.ask("Analyze the sentiment: 'This product is amazing!'");
// result.output is fully typed: { sentiment: "positive", confidence: 0.95 }
```

**Key Features:**

- Type-safe structured output with Zod schemas
- Built-in retry logic with exponential backoff
- Fallback driver support
- Streaming support
- Simple API for classification and extraction tasks

### Agent - Tool-Enabled Workflows

For complex, multi-step workflows that require tool usage:

```typescript
import { Agent, Tool } from '@flatfile/improv';

const searchTool = new Tool({
  name: "search",
  description: "Search the knowledge base",
  parameters: z.object({ query: z.string() }),
  executeFn: async ({ query }) => {
    // Your search implementation
    return { results: ["Result 1", "Result 2"] };
  }
});

const agent = new Agent({
  driver,
  tools: [searchTool],
  instructions: [
    { instruction: "Always search before answering", priority: 1 }
  ]
});

const thread = agent.createThread({
  prompt: "What are the best practices for error handling?"
});

await thread.send();
```

### Gig - Workflow Orchestration

Orchestrate multiple AI operations with dependencies and control flow:

```typescript
import { Gig, PieceDefinition } from '@flatfile/improv';

const gig = new Gig({ label: "Customer Support Workflow", driver });

// Add pieces sequentially (default behavior)
gig
  .add("classify", groove => `Classify this support request: "${groove.feelVibe("request")}"`, {
    outputSchema: z.enum(["technical", "billing", "general"])
  })
  .add("sentiment", groove => `Analyze sentiment: "${groove.feelVibe("request")}"`, {
    outputSchema: z.enum(["positive", "negative", "neutral"])
  })
  // Pieces can access previous results
  .add("research", groove => {
    const category = groove.recall("classify");
    return `Research solutions for ${category} issue`;
  }, {
    tools: [searchTool],
    driver: cheaperModel // Override driver for cost optimization
  })
  // Explicit parallel execution when needed
  .parallel([
    ["check_status", "Check system status"],
    ["find_similar", "Find similar resolved issues"]
  ])
  .add("respond", groove => {
    const sentiment = groove.recall("sentiment");
    const research = groove.recall("research");
    return `Generate ${sentiment} response using: ${research}`;
  });

// Execute the workflow
const results = await gig.perform({
  request: "My invoice is wrong and I'm very frustrated!"
});

console.log(results.recordings.get("classify"));  // "billing"
console.log(results.recordings.get("sentiment")); // "negative"
```

**Key Features:**

- Sequential execution by default (predictable flow)
- Explicit `parallel()` for concurrent operations
- Auto-detection: uses Solo for structured output, Agent for tools
- Driver overrides per piece for optimization
- Access previous results with `groove.recall()`
- Type-safe with full TypeScript support

### Pieces - Reusable Components

Create reusable workflow components that can be shared across projects:

```typescript
import { PieceDefinition } from '@flatfile/improv';
import { z } from 'zod';

// Define a reusable piece
export const sentimentAnalysis: PieceDefinition<"positive" | "negative" | "neutral"> = {
  name: "sentiment",
  play: (groove) => {
    const text = groove.feelVibe("text");
    return `Analyze sentiment of: "${text}"`;
  },
  config: {
    outputSchema: z.enum(["positive", "negative", "neutral"]),
    temperature: 0.1
  },
  meta: {
    version: "1.0.0",
    description: "Analyzes emotional tone of text"
  }
};

// Use in any Gig
gig.add(sentimentAnalysis);
```

**Organizing Pieces with Evaluations:**

```
src/
  pieces/
    sentiment/
      index.ts   # Piece definition (production)
      eval.ts    # Evaluation data (dev only)
```

This separation ensures evaluation datasets don't get bundled in production builds.

## Reasoning Capabilities

Improv supports advanced reasoning capabilities through the `reasoning_config` option in the thread driver. This allows the AI to perform step-by-step reasoning before providing a final answer.
```typescript
import { Agent, BedrockThreadDriver } from '@flatfile/improv';

const driver = new BedrockThreadDriver({
  model: 'anthropic.claude-3-7-sonnet-20250219-v1:0',
  temperature: 1,
  reasoning_config: {
    budget_tokens: 1024,
    type: 'enabled',
  },
});

const agent = new Agent({ driver });

const thread = agent.createThread({
  systemPrompt: 'You are a helpful assistant that can answer questions about the world.',
  prompt: 'How many people will live in the world in 2040?',
});

const result = await thread.send();
console.log(result.last());
```

This example enables the AI to work through its reasoning process with a token budget of 1024 tokens before providing a final answer about population projections.

## Core Components

### Agent

The main agent class that manages knowledge, instructions, tools, and conversation threads.

```typescript
const agent = new Agent({
  knowledge?: AgentKnowledge[],      // Array of facts with optional source and timestamp
  instructions?: AgentInstruction[], // Array of prioritized instructions
  memory?: AgentMemory[],            // Array of stored thread histories
  systemPrompt?: string,             // Base system prompt
  tools?: Tool[],                    // Array of available tools
  driver: ThreadDriver,              // Thread driver implementation
  evaluators?: Evaluator[]           // Array of evaluators for response processing
});
```

### Thread

Manages a single conversation thread with message history and tool execution.

```typescript
const thread = new Thread({
  messages?: Message[],        // Array of conversation messages
  tools?: Tool[],              // Array of available tools
  driver: ThreadDriver,        // Thread driver implementation
  toolChoice?: 'auto' | 'any', // Tool selection mode
  maxSteps?: number            // Maximum number of tool execution steps
});
```

### Tool

Define custom tools that the agent can use during conversations.
```typescript
const tool = new Tool({
  name: string,             // Tool name
  description: string,      // Tool description
  parameters: z.ZodTypeAny, // Zod schema for parameter validation
  followUpMessage?: string, // Optional message to guide response evaluation
  executeFn: (args: Record<string, any>, toolCall: ToolCall) => Promise<any> // Tool execution function
});
```

### Message

Represents a single message in a conversation thread.

```typescript
const message = new Message({
  content?: string,           // Message content
  role: 'system' | 'user' | 'assistant' | 'tool', // Message role
  toolCalls?: ToolCall[],     // Array of tool calls
  toolResults?: ToolResult[], // Array of tool results
  attachments?: Attachment[], // Array of attachments
  cache?: boolean             // Whether to cache the message
});
```

## Event System

The library uses an event-driven architecture. All major components extend `EventSource`, allowing you to listen for various events:

```typescript
// Agent events
agent.on('agent.thread-added', ({ agent, thread }) => {});
agent.on('agent.thread-removed', ({ agent, thread }) => {});
agent.on('agent.knowledge-added', ({ agent, knowledge }) => {});
agent.on('agent.instruction-added', ({ agent, instruction }) => {});

// Thread events
thread.on('thread.response', ({ thread, message }) => {});
thread.on('thread.max_steps_reached', ({ thread, steps }) => {});

// Tool events
tool.on('tool.execution.started', ({ tool, name, args }) => {});
tool.on('tool.execution.completed', ({ tool, name, args, result }) => {});
tool.on('tool.execution.failed', ({ tool, name, args, error }) => {});
```

## Best Practices

1. **Choosing the Right API**
   - Use **Solo** for: classification, extraction, structured output, simple Q&A
   - Use **Agent** for: tool usage, multi-step reasoning, complex decision making
   - Use **Gig** for: orchestrating multiple operations, workflows with dependencies
2. **Piece Design**
   - Keep pieces focused on a single task
   - Use descriptive names that indicate the piece's purpose
   - Leverage `groove.recall()` to access previous results
   - Override drivers for cost optimization (e.g., cheaper models for simple tasks)
3. **Tool Design**
   - Keep tools atomic and focused on a single responsibility
   - Use Zod schemas for robust parameter validation
   - Implement proper error handling in tool execution
   - Use follow-up messages to guide response evaluation
4. **Workflow Organization**
   - Default to sequential execution for predictable flow
   - Use `parallel()` only when operations are truly independent
   - Separate evaluation data into `eval.ts` files
   - Create reusable pieces for common operations
5. **Error Handling & Resilience**
   - Configure retry logic with appropriate backoff
   - Set up fallback drivers for critical operations
   - Use `onError` handlers in Gig pieces
   - Monitor events for debugging and observability
6. **Type Safety**
   - Define output schemas with Zod for Solo operations
   - Use `PieceDefinition<T>` for type-safe pieces
   - Leverage TypeScript's type inference with `groove.recall()`

## License

MIT

## Contributing

Contributions are welcome! Please read our contributing guidelines for details.

## Follow-up Messages

Tools can include follow-up messages that guide the AI's evaluation of tool responses. This is particularly useful for:

- Providing context for tool results
- Guiding the AI's interpretation of data
- Maintaining consistent response patterns
- Suggesting next steps or actions

```typescript
const tool = new Tool({
  name: 'dataAnalyzer',
  description: 'Analyzes data and returns insights',
  parameters: z.object({
    data: z.array(z.any()),
    metrics: z.array(z.string())
  }),
  followUpMessage: `Review the analysis results:
1. What are the key insights from the data?
2. Are there any concerning patterns?
3. What actions should be taken based on these results?`,
  executeFn: async (args) => {
    // Tool implementation
  }
});
```

## State Management

The library provides several mechanisms for managing state:

### Agent State

- Knowledge base for storing facts
- Prioritized instructions for behavior guidance
- Memory system for storing thread histories
- System prompt for base context

### Thread State

- Message history tracking
- Tool execution state
- Maximum step limits
- Response handlers

### Tool State

- Parameter validation
- Execution tracking
- Result processing
- Event emission

### Evaluator System

Evaluators provide a way to process and validate agent responses:

```typescript
const evaluator: Evaluator = async ({ thread, agent }, complete) => {
  // Process the thread response
  const lastMessage = thread.last();
  if (lastMessage?.content?.includes('done')) {
    complete(); // Signal completion
  } else {
    // Continue processing
    thread.send(new Message({ content: 'Please continue with the task...' }));
  }
};

const agent = new Agent({
  // ... other options ...
  evaluators: [evaluator]
});
```

Evaluators can:

- Process agent responses
- Trigger additional actions
- Control conversation flow
- Validate results

## State Management Patterns

### Three-Keyed Lock Pattern

The three-keyed lock pattern is a state management pattern that ensures controlled flow through tool execution, evaluation, and completion phases. It's implemented as a reusable evaluator:

```typescript
import { threeKeyedLockEvaluator } from '@flatfile/improv';

const agent = new Agent({
  // ... other options ...
  evaluators: [
    threeKeyedLockEvaluator({
      evalPrompt: "Are there other items to process? If not, say 'done'",
      exitPrompt: "Please provide a final summary of all actions taken."
    })
  ]
});
```

The pattern works through three distinct states:

```mermaid
stateDiagram-v2
    [*] --> ToolExecution
    state "Tool Execution" as ToolExecution {
        [*] --> Running
        Running --> Complete
        Complete --> [*]
    }
    state "Evaluation" as Evaluation {
        [*] --> CheckMore
        CheckMore --> [*]
    }
    state "Summary" as Summary {
        [*] --> Summarize
        Summarize --> [*]
    }
    ToolExecution --> Evaluation: Non-tool response
    Evaluation --> ToolExecution: Tool called
    Evaluation --> Summary: No more items
    Summary --> [*]: Complete

    note right of ToolExecution
        isEvaluatingTools = true
        Handles tool execution
    end note
    note right of Evaluation
        isEvaluatingTools = false
        nextMessageIsSummary = false
        Checks for more work
    end note
    note right of Summary
        nextMessageIsSummary = true
        Gets final summary
    end note
```

The evaluator manages these states through:

1. **Tool Execution State**
   - Tracks when tools are being executed
   - Resets state when new tools are called
   - Handles multiple tool executions
2. **Evaluation State**
   - Triggered after tool completion
   - Prompts for more items to process
   - Can return to tool execution if needed
3. **Summary State**
   - Final state before completion
   - Gathers summary of actions
   - Signals completion when done

Key features:

- Automatic state transitions
- Event-based flow control
- Clean event listener management
- Configurable prompts
- Support for multiple tool executions

Example usage with custom prompts:

```typescript
const workflowAgent = new Agent({
  // ... agent configuration ...
  evaluators: [
    threeKeyedLockEvaluator({
      evalPrompt: "Review the results. Should we process more items?",
      exitPrompt: "Provide a detailed summary of all processed items."
    })
  ]
});

// The evaluator will automatically:
// 1. Let tools execute freely
// 2. After each tool completion, check if more processing is needed
// 3. When no more items need processing, request a final summary
// 4. Complete the evaluation after receiving the summary
```

This pattern is particularly useful for:

- Processing multiple items sequentially
- Workflows requiring validation between steps
- Tasks with dynamic tool usage
- Operations requiring final summaries

## AWS Bedrock Integration

The library uses AWS Bedrock (Claude) as its default LLM provider. Configure your AWS credentials:

```typescript
// Required environment variables
process.env.AWS_ACCESS_KEY_ID = 'your-access-key';
process.env.AWS_SECRET_ACCESS_KEY = 'your-secret-key';
process.env.AWS_REGION = 'your-region';

// Initialize the driver
const driver = new BedrockThreadDriver({
  model: 'anthropic.claude-3-haiku-20240307-v1:0', // Default model
  temperature?: number, // Default: 0.7
  maxTokens?: number,   // Default: 4096
  cache?: boolean       // Default: false
});
```

## Model Drivers

Improv supports multiple LLM providers through dedicated thread drivers:

### Available Drivers

| Driver | Provider | Documentation |
|--------|----------|---------------|
| `BedrockThreadDriver` | AWS Bedrock (Claude) | [Bedrock Driver Documentation](src/model.drivers/bedrock.driver.md) |
| `OpenAIThreadDriver` | OpenAI | [OpenAI Driver Documentation](src/model.drivers/openai.driver.md) |
| `CohereThreadDriver` | Cohere | [Cohere Driver Documentation](src/model.drivers/cohere.driver.md) |
| `GeminiThreadDriver` | Google Gemini | [Gemini Driver Documentation](src/model.drivers/gemini.driver.md) |
| `CerebrasThreadDriver` | Cerebras | [Cerebras Driver Documentation](src/model.drivers/cerebras.driver.md) |

Each driver provides a consistent interface while supporting model-specific features:

```typescript
// OpenAI example
import { OpenAIThreadDriver } from '@flatfile/improv';
const driver = new OpenAIThreadDriver({
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_API_KEY,
  temperature: 0.7
});

// Cohere example
import { CohereThreadDriver } from '@flatfile/improv';
const driver = new CohereThreadDriver({
  model: 'command-r-plus',
  apiKey: process.env.COHERE_API_KEY
});

// Gemini example
import { GeminiThreadDriver } from '@flatfile/improv';
const driver = new GeminiThreadDriver({
  model: 'gemini-1.5-pro',
  apiKey: process.env.GOOGLE_API_KEY
});

// Cerebras example
import { CerebrasThreadDriver } from '@flatfile/improv';
const driver = new CerebrasThreadDriver({
  model: 'llama-4-scout-17b-16e-instruct',
  apiKey: process.env.CEREBRAS_API_KEY
});
```

Refer to each driver's documentation for available models and specific configuration options.

## Tool Decorators

The library provides decorators for creating tools directly on agent classes:

```typescript
class CustomAgent extends Agent {
  @ToolName("sampleData")
  @ToolDescription("Sample the original data with the mapping program")
  private async sampleData(
    @ToolParam("count", "Number of records to sample", z.number())
    count: number,
    @ToolParam("seed", "Random seed", z.number().optional())
    seed?: number
  ): Promise<any> {
    return { count, seed };
  }
}
```

This provides:

- Type-safe tool definitions
- Automatic parameter validation
- Clean method-based tools
- Integrated error handling

## Streaming Support

The library provides built-in support for streaming responses from the AI model. Key features:

- Real-time text chunks as they're generated
- Automatic message management in the thread
- Event emission for stream progress
- Error handling and recovery
- Compatible with all thread features, including tool calls

```typescript
const thread = agent.createThread({
  prompt: 'What is 25 multiplied by 4?',
});

const stream = await thread.stream();
for await (const text of stream) {
  process.stdout.write(text);
}

// The final response is also available in the thread
console.log(thread.last()?.content);
```
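Since `thread.stream()` yields text chunks as an async iterable, any `for await` consumer works, and you can accumulate the chunks yourself if you need the full text before the thread stores it. A minimal self-contained sketch of that consumption pattern (the `fakeStream` generator below is a hypothetical stand-in for `thread.stream()`, since a real call requires a configured driver and credentials):

```typescript
// Hypothetical stand-in for thread.stream(): an async generator yielding text chunks.
async function* fakeStream(): AsyncGenerator<string> {
  for (const chunk of ["25 multiplied by 4 ", "is ", "100."]) {
    yield chunk;
  }
}

// Consume the stream the same way as a real thread stream: render each chunk
// as it arrives while accumulating the complete response text.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let full = "";
  for await (const text of stream) {
    process.stdout.write(text); // real-time rendering
    full += text;               // accumulated response
  }
  return full;
}

collect(fakeStream()).then((full) => {
  console.log("\nFull response:", full); // "25 multiplied by 4 is 100."
});
```

The same loop body works unchanged against a real `thread.stream()`, since both are async iterables of strings.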