# @unified-llm/core

[![MIT License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![npm version](https://badge.fury.io/js/%40unified-llm%2Fcore.svg)](https://badge.fury.io/js/%40unified-llm%2Fcore)

A simple way to work with multiple LLMs (OpenAI, Anthropic, Google Gemini, DeepSeek, Azure OpenAI, Ollama) through a unified interface.

Why this matters:

- One interface for many LLMs: swap providers without changing app code.
- Event-based streaming API: start → text_delta* → stop → error.
- Same field for display: use `response.text` for both chat and stream.
- Clean streaming with tools: providers execute tool calls mid-stream; you only receive text.
- Power when you need it: access provider-native payloads via `rawResponse` on the final chunk.

## Features

- 🤖 **Multi-Provider Support** - OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Azure OpenAI, Ollama
- ⚡ **Event-Based Streaming API** - Unified `start/text_delta/stop/error` events across providers
- 🔧 **Function Calling** - Execute local functions and integrate external tools
- 📊 **Structured Output** - Guaranteed JSON schema compliance across all providers
- 💬 **Conversation Persistence** - SQLite-based chat history and thread management
- 🏠 **Local LLM Support** - Run models locally with Ollama's OpenAI-compatible API

## Installation

```bash
npm install @unified-llm/core
```

## Simplest way to chat with multiple LLMs

One tiny interface, many providers. Change only the `provider` and `model`.

```typescript
import { LLMClient } from '@unified-llm/core';

const providers = [
  { provider: 'openai',    model: 'gpt-4o-mini',             apiKey: process.env.OPENAI_API_KEY },
  { provider: 'anthropic', model: 'claude-3-haiku-20240307', apiKey: process.env.ANTHROPIC_API_KEY },
  { provider: 'google',    model: 'gemini-2.5-flash',        apiKey: process.env.GOOGLE_API_KEY },
  { provider: 'deepseek',  model: 'deepseek-chat',           apiKey: process.env.DEEPSEEK_API_KEY },
  // Azure and Ollama have dedicated sections below
];

for (const cfg of providers.filter(p => p.apiKey)) {
  const client = new LLMClient(cfg as any);
  const res = await client.chat({
    messages: [{ role: 'user', content: 'Give me one fun fact.' }]
  });
  console.log(cfg.provider, '→', res.text);
}
```

## Streaming Responses

Streaming uses an event-based API across providers. Instead of provider-specific chunk shapes, you receive structured events with an `eventType` and optional `delta`. This replaces the legacy streaming API and is not backward compatible. See docs/streaming-unification.md for the full spec.

### Basic Streaming Example

```typescript
import { LLMClient } from '@unified-llm/core';

const client = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-mini',
  apiKey: process.env.OPENAI_API_KEY,
  systemPrompt: 'You are a helpful assistant that answers questions in Japanese.',
});

const stream = await client.stream({
  messages: [
    { id: '1', role: 'user', content: 'What are some recommended tourist spots in Osaka?', createdAt: new Date() },
  ],
});

let acc = '';
for await (const ev of stream) {
  switch (ev.eventType) {
    case 'start':
      // initialize UI state if needed
      break;
    case 'text_delta':
      // ev.delta?.text is the incremental piece; ev.text is the accumulator
      process.stdout.write(ev.delta?.text ?? '');
      acc = ev.text;
      break;
    case 'stop':
      console.log('\nComplete response:', ev.text);
      // ev.rawResponse contains provider-native final response (or stream data)
      break;
    case 'error':
      console.error('Stream error:', ev.delta);
      break;
  }
}
```
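If you only need the final text rather than the incremental events, you can drain the stream and read the accumulator from the `stop` event. Below is a minimal sketch built only on the documented events; the `streamToText` helper is hypothetical, not part of the library:

```typescript
import { LLMClient } from '@unified-llm/core';

// Hypothetical helper (not part of @unified-llm/core): drain a stream
// and return the accumulated text from the final `stop` event.
async function streamToText(client: LLMClient, prompt: string): Promise<string> {
  const stream = await client.stream({
    messages: [{ id: '1', role: 'user', content: prompt, createdAt: new Date() }],
  });

  let finalText = '';
  for await (const ev of stream) {
    if (ev.eventType === 'stop') finalText = ev.text; // full accumulated response
    if (ev.eventType === 'error') throw new Error(`Stream error: ${JSON.stringify(ev.delta)}`);
  }
  return finalText;
}
```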
## Function Calling

The `defineTool` helper provides type safety for tool definitions, automatically inferring argument and return types from the handler function:

```typescript
import { LLMClient } from '@unified-llm/core';
import { defineTool } from '@unified-llm/core/tools';
import fs from 'fs/promises';

// Let AI read and analyze any file
const readFile = defineTool({
  type: 'function',
  function: {
    name: 'readFile',
    description: 'Read any text file',
    parameters: {
      type: 'object',
      properties: {
        filename: { type: 'string', description: 'Name of file to read' }
      },
      required: ['filename']
    }
  },
  handler: async (args: { filename: string }) => {
    const content = await fs.readFile(args.filename, 'utf8');
    return content;
  }
});

const client = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-mini',
  apiKey: process.env.OPENAI_API_KEY,
  tools: [readFile]
});

// Create a sample log file for demo
await fs.writeFile('app.log', `
[ERROR] 2024-01-15 Database connection timeout
[WARN] 2024-01-15 Memory usage at 89%
[ERROR] 2024-01-15 Failed to authenticate user rhyizm
[ERROR] 2024-01-15 Database connection timeout
[INFO] 2024-01-15 Server restarted
`);

// Ask AI to analyze the log file
const response = await client.chat({
  messages: [{
    role: 'user',
    content: "Read app.log and tell me what's wrong with my application",
    createdAt: new Date()
  }]
});

console.log(response.message.content);
// AI will read the actual file and give you insights about the errors!
```

### Streaming with Function Calls

During streaming, tool calls are handled provider-side for you. When a model requests tool input mid-stream, the provider accumulates the tool call, executes your registered tool handlers, and continues streaming the final assistant text. You only observe text events: `start → text_delta* → stop`.

```typescript
import { LLMClient } from '@unified-llm/core';
import { defineTool } from '@unified-llm/core/tools';

const getWeather = defineTool({
  type: 'function',
  function: {
    name: 'getWeather',
    description: 'Get current weather for a location',
    parameters: {
      type: 'object',
      properties: {
        location: { type: 'string', description: 'City name' }
      },
      required: ['location']
    }
  },
  handler: async (args: { location: string }) => {
    return `Weather in ${args.location}: Sunny, 27°C`;
  }
});

const client = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-mini',
  apiKey: process.env.OPENAI_API_KEY,
  tools: [getWeather]
});

const stream = await client.stream({
  messages: [{ id: '1', role: 'user', content: "What's the weather like in Tokyo?", createdAt: new Date() }]
});

for await (const ev of stream) {
  if (ev.eventType === 'text_delta') {
    process.stdout.write(ev.delta?.text ?? '');
  }
  if (ev.eventType === 'stop') {
    console.log('\nFinal text:', ev.text);
  }
}
```

## Structured Output

Structured Output ensures that AI responses follow a specific JSON schema format across all supported providers. This is particularly useful for applications that need to parse and process AI responses programmatically.
### Basic Structured Output

```typescript
import { LLMClient, ResponseFormat } from '@unified-llm/core';

// Define the expected response structure
const weatherFormat = new ResponseFormat({
  name: 'weather_info',
  description: 'Weather information for a location',
  schema: {
    type: 'object',
    properties: {
      location: { type: 'string' },
      temperature: { type: 'number' },
      condition: { type: 'string' },
      humidity: { type: 'number' }
    },
    required: ['location', 'temperature', 'condition']
  }
});

const client = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-2024-08-06', // Structured output requires specific models
  apiKey: process.env.OPENAI_API_KEY
});

const response = await client.chat({
  messages: [{ role: 'user', content: 'What is the weather like in Tokyo today?' }],
  generationConfig: { responseFormat: weatherFormat }
});

// Response will be guaranteed to follow the schema
console.log(JSON.parse(response.message.content[0].text));
// Output: { "location": "Tokyo", "temperature": 25, "condition": "Sunny", "humidity": 60 }
```

### Multi-Provider Structured Output

The same `ResponseFormat` works across all providers with automatic conversion:

```typescript
// Works with OpenAI (uses json_schema format internally)
const openaiClient = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-2024-08-06',
  apiKey: process.env.OPENAI_API_KEY
});

// Works with Google Gemini (uses responseSchema format internally)
const geminiClient = new LLMClient({
  provider: 'google',
  model: 'gemini-1.5-pro',
  apiKey: process.env.GOOGLE_API_KEY
});

// Works with Anthropic (uses prompt engineering internally)
const claudeClient = new LLMClient({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-latest',
  apiKey: process.env.ANTHROPIC_API_KEY
});

const userInfoFormat = new ResponseFormat({
  name: 'user_profile',
  schema: {
    type: 'object',
    properties: {
      name: { type: 'string' },
      age: { type: 'number' },
      email: { type: 'string' },
      interests: { type: 'array', items: { type: 'string' } }
    },
    required: ['name', 'age', 'email']
  }
});

const request = {
  messages: [{ role: 'user', content: 'Create a sample user profile' }],
  generationConfig: { responseFormat: userInfoFormat }
};

// All three will return structured JSON in the same format
const openaiResponse = await openaiClient.chat(request);
const geminiResponse = await geminiClient.chat(request);
const claudeResponse = await claudeClient.chat(request);
```

### Pre-built Response Format Templates

The library provides convenient templates for common structured output patterns:

```typescript
import { ResponseFormats } from '@unified-llm/core';

// Key-value extraction
const contactFormat = ResponseFormats.keyValue(['name', 'email', 'phone']);
const contactResponse = await client.chat({
  messages: [{ role: 'user', content: 'Extract contact info: John Doe, john@example.com, 555-1234' }],
  generationConfig: { responseFormat: contactFormat }
});

// Classification with confidence scores
const sentimentFormat = ResponseFormats.classification(['positive', 'negative', 'neutral']);
const sentimentResponse = await client.chat({
  messages: [{ role: 'user', content: 'Analyze sentiment: "I absolutely love this new feature!"' }],
  generationConfig: { responseFormat: sentimentFormat }
});
// Returns: { "category": "positive", "confidence": 0.95 }

// List responses
const taskFormat = ResponseFormats.list({
  type: 'object',
  properties: {
    task: { type: 'string' },
    priority: { type: 'string', enum: ['high', 'medium', 'low'] },
    deadline: { type: 'string' }
  }
});
const taskResponse = await client.chat({
  messages: [{ role: 'user', content: 'Create a task list for launching a mobile app' }],
  generationConfig: { responseFormat: taskFormat }
});
// Returns: { "items": [{ "task": "Design UI", "priority": "high", "deadline": "2024-02-01" }, ...] }
```
### Complex Nested Schemas

```typescript
const productReviewFormat = new ResponseFormat({
  name: 'product_review',
  schema: {
    type: 'object',
    properties: {
      rating: { type: 'number', minimum: 1, maximum: 5 },
      summary: { type: 'string' },
      pros: { type: 'array', items: { type: 'string' } },
      cons: { type: 'array', items: { type: 'string' } },
      recommendation: {
        type: 'object',
        properties: {
          wouldRecommend: { type: 'boolean' },
          targetAudience: { type: 'string' },
          alternatives: { type: 'array', items: { type: 'string' } }
        }
      }
    },
    required: ['rating', 'summary', 'pros', 'cons', 'recommendation']
  }
});

const reviewResponse = await client.chat({
  messages: [{ role: 'user', content: 'Review this smartphone: iPhone 15 Pro - great camera, expensive, good battery life' }],
  generationConfig: { responseFormat: productReviewFormat }
});
```

### Provider-Specific Notes

- **OpenAI**: Supports native structured outputs with `gpt-4o-2024-08-06` and newer models
- **Google Gemini**: Uses `responseMimeType: 'application/json'` with `responseSchema`
- **Anthropic**: Uses prompt engineering to request JSON format responses
- **DeepSeek**: Similar to OpenAI, supports JSON mode

The `ResponseFormat` class automatically handles the conversion to each provider's specific format, ensuring consistent behavior across all supported LLMs.
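Because some providers (notably Anthropic) rely on prompt engineering rather than a native JSON mode, it can still be worth parsing responses defensively. A minimal sketch of that pattern; the `parseStructured` helper and `WeatherInfo` type are hypothetical, written against the basic weather example above:

```typescript
// Hypothetical helper (not part of @unified-llm/core): parse a structured
// response and verify the fields your application depends on.
function parseStructured<T>(raw: string, requiredKeys: (keyof T)[]): T {
  const parsed = JSON.parse(raw) as T; // throws on malformed JSON
  for (const key of requiredKeys) {
    if (parsed[key] === undefined) {
      throw new Error(`Missing required field: ${String(key)}`);
    }
  }
  return parsed;
}

interface WeatherInfo {
  location: string;
  temperature: number;
  condition: string;
  humidity?: number;
}

// Reuses `response` from the basic example above; the JSON string lives in
// response.message.content[0].text, as shown there.
const weather = parseStructured<WeatherInfo>(
  response.message.content[0].text,
  ['location', 'temperature', 'condition']
);
```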
## Multi-Provider Example

```typescript
import { LLMClient } from '@unified-llm/core';

// Create LLM clients for different providers
const gpt = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-mini',
  apiKey: process.env.OPENAI_API_KEY,
  systemPrompt: 'You are a helpful assistant that answers concisely.'
});

const claude = new LLMClient({
  provider: 'anthropic',
  model: 'claude-3-haiku-20240307',
  apiKey: process.env.ANTHROPIC_API_KEY,
  systemPrompt: 'You are a thoughtful assistant that provides detailed explanations.'
});

const gemini = new LLMClient({
  provider: 'google',
  model: 'gemini-2.0-flash',
  apiKey: process.env.GOOGLE_API_KEY,
  systemPrompt: 'You are a creative assistant that thinks outside the box.'
});

const deepseek = new LLMClient({
  provider: 'deepseek',
  model: 'deepseek-chat',
  apiKey: process.env.DEEPSEEK_API_KEY,
  systemPrompt: 'You are a technical assistant specialized in coding.'
});

// Use the unified chat interface
const request = {
  messages: [{ id: '1', role: 'user', content: 'What are your thoughts on AI?', createdAt: new Date() }]
};

// Each provider will respond according to their system prompt
const gptResponse = await gpt.chat(request);
const claudeResponse = await claude.chat(request);
const geminiResponse = await gemini.chat(request);
const deepseekResponse = await deepseek.chat(request);
```

### Multi-Provider Streaming

Streaming works consistently across all supported providers using the unified event model:

```typescript
const providers = [
  { name: 'OpenAI',   provider: 'openai',    model: 'gpt-4o-mini' },
  { name: 'Claude',   provider: 'anthropic', model: 'claude-3-haiku-20240307' },
  { name: 'Gemini',   provider: 'google',    model: 'gemini-2.0-flash' },
  { name: 'DeepSeek', provider: 'deepseek',  model: 'deepseek-chat' }
];

for (const config of providers) {
  const client = new LLMClient({
    provider: config.provider as any,
    model: config.model,
    apiKey: process.env[`${config.provider.toUpperCase()}_API_KEY`]
  });

  console.log(`\n--- ${config.name} Response ---`);

  const stream = await client.stream({
    messages: [{ id: '1', role: 'user', content: 'Tell me a short story about AI.', createdAt: new Date() }]
  });

  for await (const ev of stream) {
    if (ev.eventType === 'text_delta') {
      process.stdout.write(ev.delta?.text ?? '');
    }
  }
}
```

## Unified Response Format

All providers return responses in a consistent format, making it easy to switch between different LLMs:

### Chat Response Format

```typescript
{
  id: "chatcmpl-Blub8EgOvVaP7c3lxzmVF4TJpVCun",
  model: "gpt-4o-mini",
  provider: "openai",
  message: {
    id: "msg_1750758679093_r9hqdhfzh",
    role: "assistant",
    content: [
      { type: "text", text: "The author of this project is rhyizm." }
    ],
    createdAt: "2025-06-24T09:51:19.093Z"
  },
  usage: {
    inputTokens: 72,
    outputTokens: 10,
    totalTokens: 82
  },
  finish_reason: "stop",
  createdAt: "2025-06-24T09:51:18.000Z",
  rawResponse: { /* Original response from the provider (as returned by OpenAI, Anthropic, Google, DeepSeek, etc.) */ }
}
```

### Stream Response Format

Each streaming event mirrors `UnifiedChatResponse` with a few additions:

```typescript
{
  id: "chatcmpl-example",
  model: "gpt-4o-mini",
  provider: "openai",
  message: {
    id: "msg_example",
    role: "assistant",
    content: [
      { type: "text", text: "Chunk of text..." }
    ],
    createdAt: "2025-01-01T00:00:00.000Z"
  },
  // createdAt is optional in streaming events
  eventType: "text_delta", // one of: start | text_delta | stop | error
  outputIndex: 0,
  delta: { type: "text", text: "Chunk of text..." }
}
```

On the final `stop` event:

- `finish_reason` may be present (e.g., "stop", "length").
- `usage` may be present when available.
- `rawResponse` contains the provider-native final result. For streaming providers that don't return a single object, it contains native stream data (e.g., an array of SSE chunks). For Gemini, it includes both `{ stream, response }`.
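A stream consumer can use these fields for logging or debugging once the stream finishes. A minimal sketch, assuming `client` is any configured `LLMClient` and guarding the optional fields:

```typescript
const stream = await client.stream({
  messages: [{ id: '1', role: 'user', content: 'Summarize the plot of Hamlet.', createdAt: new Date() }]
});

for await (const ev of stream) {
  if (ev.eventType === 'text_delta') {
    process.stdout.write(ev.delta?.text ?? '');
  }
  if (ev.eventType === 'stop') {
    // Both fields are optional; guard before using them.
    if (ev.finish_reason) console.log('\nFinished because:', ev.finish_reason);
    if (ev.usage) console.log('Tokens used:', ev.usage.totalTokens);
    // ev.rawResponse holds the provider-native final result for
    // provider-specific inspection.
  }
}
```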
Key benefits:

- **Consistent structure** across all providers (OpenAI, Anthropic, Google, DeepSeek, Azure)
- **Event-based streaming** with `eventType` and `delta` for incremental text
- **Unified usage tracking** when the provider reports it
- **Provider identification** to know which service generated the response
- **Raw response access** on the final chunk for provider-specific features

## Persistent LLM Client Configuration

```typescript
import { LLMClient } from '@unified-llm/core';

// Save LLM client configuration
const savedClientId = await LLMClient.save({
  name: 'My AI Assistant',
  provider: 'openai',
  model: 'gpt-4o-mini',
  systemPrompt: 'You are a helpful coding assistant.',
  tags: ['development', 'coding'],
  isActive: true
});

// Load saved LLM client
const client = await LLMClient.fromSaved(
  savedClientId,
  process.env.OPENAI_API_KEY
);

// List all saved LLM clients
const clients = await LLMClient.list({
  provider: 'openai',
  includeInactive: false
});
```

## Azure OpenAI Example

Azure OpenAI requires a different initialization pattern compared to other providers:

```typescript
import { AzureOpenAIProvider } from '@unified-llm/core/providers/azure';

// Azure OpenAI uses a different constructor pattern
// First parameter: Azure-specific configuration
// Second parameter: Base options (apiKey, tools, etc.)
const azureOpenAI = new AzureOpenAIProvider(
  {
    endpoint: process.env.AZURE_OPENAI_ENDPOINT!,     // Azure resource endpoint
    deployment: process.env.AZURE_OPENAI_DEPLOYMENT!, // Model deployment name
    apiVersion: '2024-10-21', // Optional, defaults to 'preview'
    useV1: true               // Use /openai/v1 endpoint format
  },
  {
    apiKey: process.env.AZURE_OPENAI_KEY!,
    tools: [] // Optional tools
  }
);

// Compare with standard LLMClient initialization:
// const client = new LLMClient({
//   provider: 'openai',
//   model: 'gpt-4o-mini',
//   apiKey: process.env.OPENAI_API_KEY
// });

// Use it like any other provider
const response = await azureOpenAI.chat({
  messages: [{ id: '1', role: 'user', content: 'Hello from Azure!', createdAt: new Date() }]
});
```

## Ollama & OpenAI-Compatible APIs

### Ollama (Local LLM) Example

Ollama provides an OpenAI-compatible API, allowing you to run large language models locally. You can use either `provider: 'ollama'` or `provider: 'openai'` with a custom `baseURL` (see "Why Use Ollama Provider?" below).

### Prerequisites

1. Install Ollama from [ollama.ai](https://ollama.ai)
2. Pull a model: `ollama pull llama3` (or any other model)
3. Start Ollama server (usually runs automatically at `http://localhost:11434`)

### Basic Usage

```typescript
import { LLMClient } from '@unified-llm/core';

// Ollama configuration - no API key required
const ollama = new LLMClient({
  provider: 'ollama',
  model: 'llama3', // or 'mistral', 'codellama', etc.
  baseURL: 'http://localhost:11434/v1', // Optional - this is the default
  systemPrompt: 'You are a helpful assistant running locally.'
});

// Use it just like any other provider
const response = await ollama.chat({
  messages: [{ id: '1', role: 'user', content: 'Explain quantum computing in simple terms.', createdAt: new Date() }]
});

console.log(response.message.content);
```
### Remote Ollama Server

If you're running Ollama on a different machine or port:

```typescript
const remoteOllama = new LLMClient({
  provider: 'ollama',
  model: 'llama3',
  baseURL: 'http://your-server:11434/v1' // Replace with your server address
});
```

### Available Models

Popular models you can use with Ollama:

- `llama3` - Meta's Llama 3
- `mistral` - Mistral AI's models
- `codellama` - Code-focused Llama variant
- `phi` - Microsoft's Phi models
- `gemma` - Google's Gemma models
- `mixtral` - Mixture of experts model

Check available models with: `ollama list`

### Why Use Ollama Provider?

While Ollama is OpenAI-compatible and could work with `provider: 'openai'`, using `provider: 'ollama'` offers:

- **Clearer intent** - Makes it obvious you're using a local model
- **No API key required** - Ollama doesn't need authentication
- **Future compatibility** - If we add Ollama-specific features, your code won't need changes

### Running Examples

You can run TypeScript examples directly with tsx:

```bash
# Add tsx to your project
npm install --save-dev tsx

# Run the example
npx tsx example.ts
```

Example file (`example.ts`):

```typescript
import { LLMClient } from '@unified-llm/core';

async function runOllamaExample() {
  const ollamaClient = new LLMClient({
    provider: 'ollama',
    model: 'llama3',
    baseURL: 'http://localhost:11434/v1'
  });

  const response = await ollamaClient.chat({
    messages: [{ id: '1', role: 'user', content: 'Hello, introduce yourself in Japanese.', createdAt: new Date() }]
  });

  console.log('Ollama Response:', JSON.stringify(response, null, 2));
}

runOllamaExample().catch(console.error);
```

## Environment Variables

```env
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GOOGLE_API_KEY=your-google-key
DEEPSEEK_API_KEY=your-deepseek-key
AZURE_OPENAI_KEY=your-azure-key # For Azure OpenAI
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com # For Azure OpenAI
AZURE_OPENAI_DEPLOYMENT=your-deployment-name # For Azure OpenAI
UNIFIED_LLM_DB_PATH=./chat-history.db # Optional custom DB path
```
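These can be exported in your shell, or loaded from a `.env` file with a package such as `dotenv` (a separate dependency, not part of @unified-llm/core). A minimal sketch:

```typescript
// Load variables from a local .env file before constructing clients.
// Requires: npm install dotenv
import 'dotenv/config';
import { LLMClient } from '@unified-llm/core';

const client = new LLMClient({
  provider: 'openai',
  model: 'gpt-4o-mini',
  apiKey: process.env.OPENAI_API_KEY, // now populated from .env
});
```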
## Supported Providers

| Provider | Models | Features |
|----------|--------|----------|
| **OpenAI** | GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5 | Function calling, streaming, vision, structured output |
| **Anthropic** | Claude 3.5 (Sonnet), Claude 3 (Opus, Sonnet, Haiku) | Tool use, streaming, long context, structured output |
| **Google** | Gemini 2.0 Flash, Gemini 1.5 Pro/Flash | Function calling, multimodal, structured output |
| **DeepSeek** | DeepSeek-Chat, DeepSeek-Coder | Function calling, streaming, code generation, structured output |
| **Azure OpenAI** | GPT-4o, GPT-4, GPT-3.5 (via Azure deployments) | Function calling, streaming, structured output |
| **Ollama** | Llama 3, Mistral, CodeLlama, Phi, Gemma, Mixtral, etc. | Local execution, OpenAI-compatible API, no API key required |

## API Methods

### Core Methods

```typescript
// Main chat method - returns complete response
await client.chat(request: UnifiedChatRequest)

// Streaming responses (returns async generator)
for await (const chunk of client.stream(request)) {
  console.log(chunk);
}
```

**Note:** Persistence methods are experimental and may be removed or moved to a separate package in the future.

### Persistence Methods

```typescript
// Save LLM client configuration
await LLMClient.save(config: LLMClientConfig)

// Load saved LLM client
await LLMClient.fromSaved(id: string, apiKey?: string)

// Get saved configuration
await LLMClient.getConfig(id: string)

// List saved LLM clients
await LLMClient.list(options?: { provider?: string, includeInactive?: boolean })

// Update configuration
await LLMClient.update(id: string, updates: Partial<LLMClientConfig>)

// Soft delete LLM client
await LLMClient.delete(id: string)
```

## Requirements

- Node.js 20 or higher
- TypeScript 5.4.5 or higher (for development)

## License

MIT - see [LICENSE](https://github.com/rhyizm/unified-llm/blob/main/LICENSE) for details.

## Links

- 🏠 [Homepage](https://github.com/rhyizm/unified-llm)
- 🐛 [Report Issues](https://github.com/rhyizm/unified-llm/issues)
- 💬 [Discussions](https://github.com/rhyizm/unified-llm/discussions)