@flatfile/improv

A powerful TypeScript library for building AI agents with multi-threaded conversations, tool execution, and event handling capabilities

# Agent vs Solo: Choosing the Right Pattern

## Overview

Improv provides two primary interfaces for interacting with LLMs:

1. **Solo**: For simple, one-off model calls with structured outputs
2. **Agent**: For complex, multi-step workflows with tool calling

## When to Use Solo

Solo is designed for **deterministic, single-response tasks** where you need:

- ✅ Classification into predefined categories
- ✅ Structured data extraction from text
- ✅ Simple transformations with type-safe outputs
- ✅ Retry and fallback mechanisms for reliability
- ✅ Consistent, predictable responses

### Solo Examples

```typescript
// Sentiment classification
const sentiment = await classify(
  "I love this product!",
  ["positive", "negative", "neutral"],
  driver
);

// Data extraction with schema
const PersonSchema = z.object({
  name: z.string(),
  email: z.string().email(),
  age: z.number()
});

const person = await extract(
  "John Doe (john@example.com) is 30 years old",
  PersonSchema,
  driver
);

// Intent classification with detailed output
const solo = new Solo({
  driver,
  outputSchema: z.object({
    intent: z.enum(["order_status", "complaint", "inquiry"]),
    confidence: z.number(),
    entities: z.record(z.string())
  })
});

const result = await solo.ask("Where is my order #12345?");
```

### Solo Features

- **Structured Output**: Use Zod schemas to ensure type-safe responses
- **Retry Logic**: Automatic retries with exponential backoff
- **Fallback Drivers**: Switch to backup models if the primary fails
- **Low Temperature**: Optimized for consistent, deterministic outputs
- **Schema Validation**: Runtime validation of LLM responses

## When to Use Agent

Agent is designed for **complex, iterative workflows** where you need:

- ✅ Tool calling and execution
- ✅ Multi-step reasoning processes
- ✅ Dynamic decision making
- ✅ Conversation state management
- ✅ Completion detection through evaluators

### Agent Examples

```typescript
// Agent with tools and evaluators
const agent = new Agent({
  tools: [databaseTool, calculatorTool, emailTool],
  evaluators: [threeKeyedLockEvaluator()],
  driver
});

const thread = agent.createThread({
  prompt: "Process all pending orders and send confirmation emails"
});

await thread.send(); // Will loop through tools until complete
```

### Agent Features

- **Tool Execution**: Define and execute custom tools
- **Evaluators**: Control flow and completion detection
- **Thread Management**: Maintain conversation state
- **Knowledge & Instructions**: Provide context and behavior rules
- **Memory System**: Persist and recall past conversations

## Decision Matrix

| Use Case | Solo | Agent |
|----------|------|-------|
| Classify customer sentiment | ✅ | ❌ |
| Extract entities from text | ✅ | ❌ |
| Generate structured JSON | ✅ | ❌ |
| Process multiple items with tools | ❌ | ✅ |
| Research task requiring web search | ❌ | ✅ |
| Multi-step workflow automation | ❌ | ✅ |
| Simple Q&A with consistent format | ✅ | ❌ |
| Complex reasoning with iterations | ❌ | ✅ |

## Complementary Patterns

### 1. **Agent as Orchestra, Solo as Instruments**

Use Agent for complex workflow orchestration while Solo handles specific, isolated tasks.

```typescript
import { Agent, Solo, OpenAIThreadDriver } from '@flatfile/improv';

class ContentModerationSystem {
  private agent: Agent;
  private classifierSolo: Solo;
  private summarizerSolo: Solo;

  constructor(driver: ThreadDriver) {
    // Agent handles the overall moderation workflow
    this.agent = new Agent({
      driver,
      systemPrompt: "You are a content moderation coordinator. Use tools to analyze and make decisions about content.",
      tools: [
        this.createClassificationTool(),
        this.createSummaryTool(),
        this.createDecisionTool()
      ]
    });

    // Solo instances for specific classification tasks
    this.classifierSolo = new Solo({
      driver,
      systemPrompt: "Classify content as: safe, questionable, harmful, spam. Respond with only the classification."
    });

    this.summarizerSolo = new Solo({
      driver,
      systemPrompt: "Summarize content in exactly one sentence focusing on key concerns for moderation."
    });
  }

  private createClassificationTool() {
    return {
      name: "classify_content",
      description: "Classify content using specialized model",
      parameters: z.object({
        content: z.string()
      }),
      executeTool: async ({ content }) => {
        // Solo handles the specialized classification
        const result = await this.classifierSolo.ask(content);
        return { classification: result.content.trim() };
      }
    };
  }
}
```

### 2. **Agent for Conversations, Solo for Analysis**

Agent maintains conversation state while Solo performs stateless analysis tasks.

```typescript
class CustomerSupportSystem {
  private conversationAgent: Agent;
  private sentimentSolo: Solo;
  private urgencySolo: Solo;

  async handleCustomerMessage(conversationId: string, message: string) {
    // Solo quickly analyzes the incoming message
    const [sentiment, urgency] = await Promise.all([
      this.sentimentSolo.ask(`Analyze sentiment of: "${message}"`),
      this.urgencySolo.ask(`Rate urgency 1-10 for: "${message}"`)
    ]);

    // Agent handles the conversation with enriched context
    const agent = this.getOrCreateAgent(conversationId);
    agent.addKnowledge({
      fact: `Current message sentiment: ${sentiment.content}`,
      source: "sentiment_analysis"
    });

    const thread = agent.createThread({ prompt: message });
    return thread.send();
  }
}
```

### 3. **Agent with Solo-Powered Tools**

Tools within an Agent can use Solo for their own LLM operations.

```typescript
class ResearchAgent {
  constructor(driver: ThreadDriver) {
    const webSearchSolo = new Solo({
      driver,
      systemPrompt: "Extract key information from search results. Be concise and factual."
    });

    const agent = new Agent({
      driver,
      tools: [{
        name: "web_search",
        description: "Search the web and summarize results",
        parameters: z.object({ query: z.string() }),
        executeTool: async ({ query }) => {
          // Fetch search results (pseudo-code)
          const searchResults = await fetchSearchResults(query);

          // Use Solo to analyze and summarize
          const analysis = await webSearchSolo.ask(
            `Summarize these search results for query "${query}":\n${searchResults}`
          );

          return { summary: analysis.content, source: "web_search" };
        }
      }]
    });
  }
}
```

### 4. **Microservice Architecture**

Different services use Solo for their specific LLM needs while a main service uses Agent for orchestration.

```typescript
// Email Service
class EmailService {
  private subjectSolo: Solo;
  private bodySolo: Solo;

  async generateEmail(context: EmailContext) {
    const [subject, body] = await Promise.all([
      this.subjectSolo.ask(`Generate email subject for: ${context.purpose}`),
      this.bodySolo.ask(`Generate email body for: ${JSON.stringify(context)}`)
    ]);

    return { subject: subject.content, body: body.content };
  }
}

// Main Orchestration Service
class WorkflowService {
  private orchestrationAgent: Agent;
  private emailService: EmailService;

  async processWorkflow(workflowData: any) {
    // Agent coordinates the overall workflow
    const thread = this.orchestrationAgent.createThread({
      prompt: `Process this workflow: ${JSON.stringify(workflowData)}`
    });

    // Agent can call tools that internally use other services' Solo instances
    return thread.send();
  }
}
```

## Implementation Patterns

### Solo Pattern: Classification Pipeline

```typescript
// Create a reusable classifier
class CustomerServiceClassifier {
  private solo: Solo<{
    category: string;
    urgency: "low" | "medium" | "high";
    requiresHuman: boolean;
  }>;

  constructor(driver: ThreadDriver) {
    this.solo = new Solo({
      driver,
      outputSchema: z.object({
        category: z.string(),
        urgency: z.enum(["low", "medium", "high"]),
        requiresHuman: z.boolean()
      }),
      retry: {
        maxAttempts: 3,
        exponentialBackoff: true
      },
      temperature: 0.1 // Low for consistency
    });
  }

  async classify(message: string) {
    return await this.solo.ask(message);
  }
}
```

### Agent Pattern: Workflow Automation

```typescript
// Create a workflow agent
class OrderProcessingAgent extends Agent {
  constructor(driver: ThreadDriver) {
    super({
      driver,
      tools: [
        databaseQueryTool,
        inventoryCheckTool,
        paymentProcessTool,
        emailSenderTool
      ],
      instructions: [
        { instruction: "Process orders in chronological order", priority: 1 },
        { instruction: "Check inventory before confirming payment", priority: 2 }
      ],
      evaluators: [
        threeKeyedLockEvaluator({
          evalPrompt: "Are there more orders to process?",
          exitPrompt: "Summarize all processed orders"
        })
      ]
    });
  }
}
```

## Best Practices

### Solo Best Practices

1. **Use Specific Schemas**: Define precise Zod schemas for expected outputs
2. **Set Low Temperature**: Use 0.1-0.3 for classification tasks
3. **Implement Fallbacks**: Have backup drivers for critical operations
4. **Cache Results**: Solo outputs are deterministic and cacheable
5. **Batch Similar Requests**: Process multiple classifications together

### Agent Best Practices

1. **Define Clear Tools**: Each tool should have a single responsibility
2. **Use Evaluators**: Implement proper completion detection
3. **Manage State**: Track progress through agent memory
4. **Set Appropriate Limits**: Configure `maxSteps` to prevent infinite loops
5. **Monitor Events**: Subscribe to agent events for debugging

## Migration Guide

### From Basic LLM Calls to Solo

Before:

```typescript
// Direct API call with manual parsing
const response = await openai.chat.completions.create({
  messages: [{ role: "user", content: "Classify: " + text }],
  model: "gpt-4"
});
const result = JSON.parse(response.choices[0].message.content);
```

After:

```typescript
// Type-safe with automatic retry
const result = await classify(
  text,
  ["category1", "category2", "category3"],
  driver
);
```

### From Complex Scripts to Agent

Before:

```typescript
// Manual orchestration
async function processOrders() {
  const orders = await getOrders();
  for (const order of orders) {
    const inventory = await checkInventory(order);
    if (inventory.available) {
      await processPayment(order);
      await sendEmail(order);
    }
  }
}
```

After:

```typescript
// Agent with tools and evaluators
const agent = new OrderAgent({ driver, tools: [...] });
const thread = agent.createThread({
  prompt: "Process all pending orders"
});
await thread.send(); // Handles all logic internally
```

## Performance Considerations

### Solo Performance

- **Latency**: Single API call (plus retries if needed)
- **Token Usage**: Minimal; only classification/extraction prompts
- **Caching**: Responses are deterministic and cacheable
- **Concurrency**: Can process many classifications in parallel

### Agent Performance

- **Latency**: Multiple API calls for tool execution
- **Token Usage**: Higher due to conversation context
- **State Management**: Maintains thread history
- **Concurrency**: Limited by the stateful nature of conversations

## Solo Optimizations

### 1. **Driver Reuse and Connection Pooling**

```typescript
class OptimizedSoloPool {
  private drivers: Map<string, ThreadDriver> = new Map();
  private soloInstances: Map<string, Solo> = new Map();

  getOrCreateSolo(purpose: string, systemPrompt: string): Solo {
    // hashString: any stable string-hash helper
    const key = `${purpose}-${hashString(systemPrompt)}`;

    if (!this.soloInstances.has(key)) {
      const driver = this.getOrCreateDriver(purpose);
      this.soloInstances.set(key, new Solo({ driver, systemPrompt }));
    }

    return this.soloInstances.get(key)!;
  }

  private getOrCreateDriver(purpose: string): ThreadDriver {
    if (!this.drivers.has(purpose)) {
      // Optimize driver settings based on purpose
      const config = this.getOptimalConfig(purpose);
      this.drivers.set(purpose, new OpenAIThreadDriver(config));
    }
    return this.drivers.get(purpose)!;
  }

  private getOptimalConfig(purpose: string): OpenAIConfig {
    switch (purpose) {
      case 'classification':
        return { model: 'gpt-4o-mini', temperature: 0.1, maxTokens: 50 };
      case 'creative':
        return { model: 'gpt-4o', temperature: 0.9, maxTokens: 1000 };
      case 'analysis':
        return { model: 'gpt-4o', temperature: 0.3, maxTokens: 500 };
      default:
        return { model: 'gpt-4o-mini', temperature: 0.7 };
    }
  }
}
```

### 2. **Caching for Similar Requests**

```typescript
class CachedSolo {
  private cache = new Map<string, { response: string; timestamp: number }>();
  private solo: Solo;
  private cacheTTL = 5 * 60 * 1000; // 5 minutes

  constructor(options: SoloOptions) {
    this.solo = new Solo(options);
  }

  async ask(prompt: string): Promise<string> {
    const cacheKey = this.hashPrompt(prompt);
    const cached = this.cache.get(cacheKey);

    if (cached && Date.now() - cached.timestamp < this.cacheTTL) {
      return cached.response;
    }

    const response = await this.solo.ask(prompt);
    this.cache.set(cacheKey, {
      response: response.content,
      timestamp: Date.now()
    });

    return response.content;
  }

  private hashPrompt(prompt: string): string {
    // Simple hash function; use crypto in production
    return btoa(prompt).slice(0, 32);
  }
}
```

### 3. **Batch Processing Optimization**

```typescript
class BatchSolo {
  private solo: Solo;
  private batchSize = 10;
  private flushDelayMs = 50;
  private flushTimer?: ReturnType<typeof setTimeout>;
  private processingQueue: Array<{
    prompt: string;
    resolve: (value: string) => void;
    reject: (error: Error) => void;
  }> = [];

  async ask(prompt: string): Promise<string> {
    return new Promise((resolve, reject) => {
      this.processingQueue.push({ prompt, resolve, reject });

      if (this.processingQueue.length >= this.batchSize) {
        this.processBatch();
      } else {
        // Flush partial batches after a short delay so that fewer than
        // batchSize pending prompts still resolve
        clearTimeout(this.flushTimer);
        this.flushTimer = setTimeout(() => this.processBatch(), this.flushDelayMs);
      }
    });
  }

  private async processBatch() {
    const batch = this.processingQueue.splice(0, this.batchSize);

    // Process batch items in parallel
    const promises = batch.map(async ({ prompt, resolve, reject }) => {
      try {
        const response = await this.solo.ask(prompt);
        resolve(response.content);
      } catch (error) {
        reject(error as Error);
      }
    });

    await Promise.all(promises);
  }
}
```

### 4. **Specialized Solo Variants**

```typescript
// Ultra-fast classification
class QuickClassifierSolo extends Solo {
  constructor(driver: ThreadDriver, categories: string[]) {
    super({
      driver,
      systemPrompt: `Classify input as one of: ${categories.join(', ')}. Respond with ONLY the category name, nothing else.`,
      maxSteps: 1 // No tool calling needed
    });
  }

  async classify(text: string): Promise<string> {
    const response = await this.ask(text);
    return response.content.trim().toLowerCase();
  }
}

// Streaming content generator
class StreamingContentSolo extends Solo {
  constructor(driver: ThreadDriver, contentType: string) {
    super({
      driver,
      systemPrompt: `Generate ${contentType} content. Be creative and engaging.`
    });
  }

  async *generateContent(prompt: string): AsyncGenerator<string, void> {
    yield* this.stream(prompt);
  }
}

// JSON-only response Solo
class JSONSolo extends Solo {
  constructor(driver: ThreadDriver, schema: string) {
    super({
      driver,
      systemPrompt: `Always respond with valid JSON matching this schema: ${schema}. Never include explanations, only JSON.`
    });
  }

  async askJSON<T>(prompt: string): Promise<T> {
    const response = await this.ask(prompt);
    try {
      return JSON.parse(response.content);
    } catch (error) {
      throw new Error(`Invalid JSON response: ${response.content}`);
    }
  }
}
```

### 5. **Performance Monitoring**

```typescript
class MonitoredSolo extends Solo {
  private errorCount = 0;
  private metrics = {
    totalCalls: 0,
    averageLatency: 0,
    errorRate: 0,
    cacheHitRate: 0
  };

  async ask(prompt: string): Promise<SoloResponse> {
    const startTime = Date.now();
    this.metrics.totalCalls++;

    try {
      const response = await super.ask(prompt);

      // Update metrics
      const latency = Date.now() - startTime;
      this.updateLatencyMetric(latency);

      return response;
    } catch (error) {
      this.errorCount++;
      this.metrics.errorRate = this.errorCount / this.metrics.totalCalls;
      throw error;
    }
  }

  getMetrics() {
    return { ...this.metrics };
  }

  private updateLatencyMetric(latency: number) {
    this.metrics.averageLatency =
      (this.metrics.averageLatency * (this.metrics.totalCalls - 1) + latency) /
      this.metrics.totalCalls;
  }
}
```

## Summary

Choose **Solo** when you need:

- Simple, deterministic outputs
- Type-safe structured data
- High reliability with retries
- Consistent classification results

Choose **Agent** when you need:

- Complex multi-step workflows
- Dynamic tool execution
- Iterative problem solving
- Stateful conversations

Both patterns can be combined: use Solo for classification steps within Agent workflows for the best of both worlds.