<div align="center"> <h1>llm-polyglot</h1> </div> <br /> <p align="center"><i>> Universal client for LLM providers with OpenAI-compatible interface</i></p> <br /> <div align="center"> <a aria-label="NPM version" href="https://www.npmjs.com/package/llm-polyglot"> <img alt="llm-polyglot" src="https://img.shields.io/npm/v/llm-polyglot.svg?style=flat-square&logo=npm&labelColor=000000&label=llm-polyglot"> </a> <a aria-label="Island AI" href="https://github.com/hack-dance/island-ai"> <img alt="Island AI" src="https://img.shields.io/badge/Part of Island AI-000000.svg?style=flat-square&labelColor=000000&logo=data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPHN2ZyBpZD0iTGF5ZXJfMiIgZGF0YS1uYW1lPSJMYXllciAyIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyMTQuNjkgMjU5LjI0Ij4KICA8ZGVmcz4KICAgIDxzdHlsZT4KICAgICAgLmNscy0xIHsKICAgICAgICBmaWxsOiAjZmZmOwogICAgICAgIHN0cm9rZS13aWR0aDogMHB4OwogICAgICB9CiAgICA8L3N0eWxlPgogIDwvZGVmcz4KICA8ZyBpZD0iTGF5ZXJfMS0yIiBkYXRhLW5hbWU9IkxheWVyIDEiPgogICAgPGc+CiAgICAgIDxnPgogICAgICAgIDxwYXRoIGNsYXNzPSJjbHMtMSIgZD0ibTEwMC42MSwxNzguNDVoMTMuOTd2LTE5LjYyaC0xMy45N3YxOS42MlptMC0xMDguOTZ2MjMuNzJoMTMuOTd2LTIzLjcyaC0xMy45N1ptLTIuNzksMTg5Ljc1aDE5LjU2bC0yLjc5LTI4LjkyaC0xMy45N2wtMi43OSwyOC45MlptMi43OS0xMzcuNjJoMTMuOTd2LTE5LjYyaC0xMy45N3YxOS42MlptMCwyOC40MWgxMy45N3YtMTkuNjJoLTEzLjk3djE5LjYyWiIvPgogICAgICAgIDxjaXJjbGUgY2xhc3M9ImNscy0xIiBjeD0iOTQuNSIgY3k9IjY5LjExIiByPSIxNC4yNCIvPgogICAgICAgIDxjaXJjbGUgY2xhc3M9ImNscy0xIiBjeD0iMTIwLjE5IiBjeT0iNjkuMTEiIHI9IjE0LjI0Ii8+CiAgICAgICAgPHBhdGggY2xhc3M9ImNscy0xIiBkPSJtMjE0LjI1LDYyLjU5Yy0uNzktLjc1LTE4Ljc1LTE3LjQ4LTQ5LjQ2LTE5LjA0bDE1Ljc1LTUuODhjLTEuNjctMi40Ni00LjAxLTQuMTgtNi4zNS02LS4yMy0uMTgtLjAzLS41OC4yMy0uNTcsMy40NS4xNyw2LjgyLDEuNzUsMTAuMTIsMi42OCwxLjA2LjMsMi4wOS43MiwzLjA4LDEuMjRsMTkuNDUtNy4yNmMuNTMtLjIuOS0uNzEuOTEtMS4yOHMtLjMyLTEuMDktLjg1LTEuMzJjLTEuMDQtLjQ0LTI1Ljk2LTEwLjc2LTU3LjM1Ljk2LTEuMTkuNDQtMi4zNy45MS0zLjU0LDEuNDFsMTMuNTEtMTMuMTNjLTIuMTgtLjY3LTQuNC0uOTUtNi42My0xLjQ0LS4zOC
0uMDgtLjQxLS43NSwwLS44MSwzLjEyLS40NCw2LjU0LS45OCw5Ljg3LS45MWw5LjEzLTguODdjLjQxLS40LjUzLTEuMDEuMzItMS41My0uMjItLjUzLS44LS43OS0xLjMxLS44Ny0uOTYuMDEtMjMuNy40OS00My45NiwyMC4xOCwwLDAsMCwwLDAsMGwtMjAuMDcsMTkuNzYtMTkuNTgtMTkuNzZDNjcuMjUuNDksNDQuNTEuMDEsNDMuNTUsMGMtLjU2LjA1LTEuMDkuMzQtMS4zMS44Ny0uMjIuNTMtLjA5LDEuMTQuMzIsMS41M2w1LjY3LDUuNTFjNS4xLjIyLDEwLjE0LjcxLDE0LjQzLDQsLjQyLjMyLjIsMS4xMi0uMzkuOTMtMi41OC0uODYtNi4wMi0uODctOS4zOS0uNGwxNS41NiwxNS4xMmMtMS4xNy0uNS0yLjM2LS45Ny0zLjU0LTEuNDEtMzEuNC0xMS43Mi01Ni4zLTEuNDEtNTcuMzUtLjk2LS41Mi4yMi0uODYuNzUtLjg1LDEuMzJzLjM3LDEuMDguOTEsMS4yOGwxMS4wNiw0LjEzYzQuNDYtMS40OCw4LjctMi4zOSwxMC40Mi0yLjU1LjU3LS4wNS41Ni43My4xMi45MS0xLjg2Ljc0LTMuNjEsMi4yOS01LjI3LDMuNjFsMjUuOTQsOS42OEMxOS4xOCw0NS4xMSwxLjIyLDYxLjg0LjQzLDYyLjU5Yy0uNDEuMzktLjU1LDEtLjM0LDEuNTMuMjEuNTMuNzMuODgsMS4zLjg4aDEzLjljLjE1LS4wOS4zMS0uMTkuNDUtLjI4LDUuNzktMy41OCwxMS45NC02LjE5LDE4LjE4LTguODcuNjgtLjI5LDEuMjguNjQuNiwxLjAzLTMuNTQsMi4wMy02LjU0LDUuMS05LjQ5LDguMTNoMTQuNTljNC4yNy0zLjExLDguODItNS43LDEzLjE2LTguNy41OS0uNDEsMS4yMi40OS43NS45Ny0yLjM1LDIuMzgtNC40NCw1LjA2LTYuNTMsNy43NGgxMTYuODNjLS45OS0zLjE5LTIuMDItNi4zNS00LjEzLTkuMDQtLjMzLS40Mi4xOC0uOTYuNTktLjU5LDMuMzYsMy4wMSw3LjM3LDYuMTUsMTEuMDIsOS42M2gxNS4zNGMtMS4zOC0zLjUyLTMuMDUtNi44Mi01LjcxLTguNjctLjU0LS4zNy0uMDgtMS4xNS41MS0uODcsNC40LDIuMDgsOC4yNyw1Ljg2LDExLjY1LDkuNTRoMjAuMmMuNTcsMCwxLjA5LS4zNSwxLjMtLjg4LjIxLS41My4wOC0xLjE0LS4zNC0xLjUzWiIvPgogICAgICA8L2c+CiAgICAgIDxwYXRoIGNsYXNzPSJjbHMtMSIgZD0ibTEwMS4wNiwyMjEuMzNoMTMuOTd2LTMzLjZoLTEzLjk3djMzLjZaIi8+CiAgICA8L2c+CiAgPC9nPgo8L3N2Zz4="> </a> <a aria-label="Made by hack.dance" href="https://hack.dance"> <img alt="docs" src="https://img.shields.io/badge/MADE%20BY%20HACK.DANCE-000000.svg?style=flat-square&labelColor=000000"> </a> <a aria-label="Twitter" href="https://twitter.com/dimitrikennedy"> <img alt="follow" src="https://img.shields.io/twitter/follow/dimitrikennedy?style=social&labelColor=000000"> </a> </div> `llm-polyglot` extends the OpenAI SDK to provide a consistent interface across different 
LLM providers. Use the same familiar OpenAI-style API with Anthropic, Google, and others.

## Provider Support

**Native API Support Status:**

| Provider API | Status | Chat | Basic Stream | Functions/Tool calling | Function streaming | Notes |
|--------------|--------|------|--------------|------------------------|--------------------|-------|
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | Direct SDK proxy |
| Anthropic | ✅ | ✅ | ✅ | ✅ | ✅ | Claude models |
| Google | ✅ | ✅ | ✅ | ✅ | ✅ | Gemini models + context caching |
| Azure | 🚧 | | | | | OpenAI model hosting |
| Cohere | ❌ | - | - | - | - | Not supported |
| AI21 | ❌ | - | - | - | - | Not supported |

Stream Types:

- **Basic Stream**: Simple text streaming
- **Partial JSON Stream**: Progressive JSON object construction during streaming
- **Function Stream**: Streaming function/tool calls and their results

<br />

**OpenAI-Compatible Hosting Providers:**

These providers use the OpenAI SDK format, so they work directly with the OpenAI client configuration:

| Provider | How to Use | Available Models |
|----------|------------|------------------|
| Together | Use OpenAI client with Together base URL | Mixtral, Llama, OpenChat, Yi, others |
| Anyscale | Use OpenAI client with Anyscale base URL | Mistral, Llama, others |
| Perplexity | Use OpenAI client with Perplexity base URL | pplx-* models |
| Replicate | Use OpenAI client with Replicate base URL | Various open models |

## Installation

```bash
# Base installation
npm install llm-polyglot openai

# Provider-specific SDKs (as needed)
npm install @anthropic-ai/sdk       # For Anthropic
npm install @google/generative-ai   # For Google/Gemini
```

## Basic Usage

```typescript
import { createLLMClient } from "llm-polyglot";

// Initialize provider-specific client
const client = createLLMClient({
  provider: "anthropic" // or "google", "openai", etc.
});

// Use consistent OpenAI-style interface
const completion = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{
    role: "user",
    content: "Hello!"
  }],
  max_tokens: 1000
});
```

## Provider-Specific Features

### Anthropic

The llm-polyglot library provides support for Anthropic's API, including standard chat completions, streaming chat completions, and function calling. Both input parameters and responses match those of the OpenAI SDK exactly; for more detailed documentation, see the OpenAI docs: [https://platform.openai.com/docs/api-reference](https://platform.openai.com/docs/api-reference)

The Anthropic SDK is required when using the anthropic provider; only the types provided by the SDK are used.

```bash
bun add @anthropic-ai/sdk
```

```typescript
const client = createLLMClient({ provider: "anthropic" });

// Standard completion
const response = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }]
});

// Streaming
const stream = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ??
    "");
}

// Tool/Function calling
const result = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }]
});
```

### Google (Gemini)

The llm-polyglot library provides support for Google's Gemini API, including:

- Standard chat completions with an OpenAI-compatible interface
- Streaming chat completions with delta updates
- Function/tool calling with automatic schema conversion
- Context caching for token optimization (requires a paid API key)
- Grounding support with Google Search integration
- Safety settings and model generation config
- Session management for stateful conversations
- Automatic response transformation with source attribution

The Google generative-ai SDK is required when using the google provider:

```bash
bun add @google/generative-ai
```

For all of the above functionality, the schema matches OpenAI's format, since the OpenAI params spec is translated into Gemini's model spec.

#### Basic Usage

```typescript
const client = createLLMClient({ provider: "google" });

// Standard completion
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});

// With grounding (Google Search)
const groundedCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{
    role: "user",
    content: "What are the latest AI developments?"
  }],
  groundingThreshold: 0.7,
  max_tokens: 1000
});

// With safety settings
const safeCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }]
  }
});

// With session management
const sessionCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Remember this: I'm Alice" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});
```

#### Context Caching

[Context Caching](https://ai.google.dev/gemini-api/docs/caching) is a feature specific to Gemini that helps cut down on duplicate token usage by allowing you to create a cache with a TTL:

```typescript
// Create a cache
const cache = await client.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Context to cache" }],
  ttlSeconds: 3600 // Cache for 1 hour
});

// Use the cached context
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Follow-up question" }],
  additionalProperties: {
    cacheName: cache.name
  }
});
```

#### Function/Tool Calling

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }],
  tool_choice: {
    type: "function",
    function: { name: "analyze" }
  }
});
```

## Error Handling