# 🧬 ModelMix: Reliable interface with automatic fallback for AI LLMs
**ModelMix** is a versatile module that enables seamless integration of various language models from different providers through a unified interface. With ModelMix, you can effortlessly manage and utilize multiple AI models while controlling request rates to avoid provider restrictions. The module also supports the Model Context Protocol (MCP), allowing you to enhance your models with powerful capabilities like web search, code execution, and custom functions.
Ever found yourself wanting to integrate AI models into your projects but worried about reliability? ModelMix helps you build resilient AI applications by chaining multiple models together. If one model fails, it automatically switches to the next one, ensuring your application keeps running smoothly.
## ✨ Features
- **Unified Interface**: Interact with multiple AI models through a single, coherent API.
- **Request Rate Control**: Manage the rate of requests to adhere to provider limitations using Bottleneck.
- **Flexible Integration**: Easily integrate popular models like OpenAI, Anthropic, Gemini, Perplexity, Groq, Together AI, Lambda, OpenRouter, Ollama, LM Studio or custom models.
- **History Tracking**: Automatically logs the conversation history with model responses, allowing you to limit the number of historical messages with `max_history`.
- **Model Fallbacks**: Automatically try different models if one fails or is unavailable.
- **Round Robin Load Balancing**: Rotate through multiple models on each request to distribute load and maximize free tier quotas.
- **Chain Multiple Models**: Create powerful chains of models that work together, with automatic fallback if one fails.
- **Model Context Protocol (MCP) Support**: Seamlessly integrate external tools and capabilities like web search, code execution, or custom functions through the Model Context Protocol standard.
## 🛠️ Usage
1. **Install the ModelMix package:**
```bash
npm install modelmix
```
> **AI Skill**: You can also add ModelMix as a skill for AI agentic development:
> ```bash
> npx skills add https://github.com/clasen/ModelMix --skill modelmix
> ```
2. **Set up your environment variables (.env file)**:
Only the API keys you plan to use are required.
```plaintext
ANTHROPIC_API_KEY="sk-ant-..."
OPENAI_API_KEY="sk-proj-..."
OPENROUTER_API_KEY="sk-or-..."
MINIMAX_API_KEY="your-minimax-key..."
NVIDIA_API_KEY="nvapi-..."
...
GEMINI_API_KEY="AIza..."
```
For environment variables, use `dotenv` or Node's built-in `process.loadEnvFile()`.
3. **Create and configure your models**:
```javascript
import { ModelMix } from 'modelmix';
try { process.loadEnvFile(); } catch {}
// Get structured JSON responses
const model = ModelMix.new()
.sonnet46() // Anthropic claude-sonnet-4-6
.addText("Name and capital of 3 South American countries.");
const outputExample = { countries: [{ name: "", capital: "" }] };
console.log(await model.json(outputExample));
```
**Chain multiple models with automatic fallback**
```javascript
const setup = {
config: {
system: "You are ALF, if they ask your name, respond with 'ALF'.",
debug: 2
}
};
const model = await ModelMix.new(setup)
.sonnet46() // (main model) Anthropic claude-sonnet-4-6
.gpt5mini() // (fallback 2) OpenAI gpt-5-mini
.gemini3flash({ config: { temperature: 0 } }) // (fallback 3) Google gemini-3-flash
.grok43() // (fallback 4) Grok grok-4.3
.addText("What's your name?");
console.log(await model.message());
```
**Use Perplexity to get the price of ETH**
```javascript
const ETH = await ModelMix.new()
.sonar() // Perplexity sonar
.addText('How much is ETH trading in USD?')
.json({ price: 1000.1 });
console.log(ETH.price);
```
**Use providers with free quotas**
This example uses providers with free quotas (OpenRouter, Groq, Cerebras): just get the API key and you're ready to go. If one model runs out of quota, ModelMix automatically falls back to the next model in the chain.
```javascript
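const model = ModelMix.new()
    .gptOss()
    .kimiK25think()
    .deepseekR1()
    .hermes3()
    .addText('What is the capital of France?');
// Send the request; models are tried in the order they were added
console.log(await model.message());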
```
This pattern allows you to:
- Chain multiple models together
- Automatically fall back to the next model if one fails
- Get structured JSON responses when needed
- Track token usage across all providers
- Keep your code clean and maintainable
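The fallback idea behind the chains above can be pictured with a small, self-contained sketch. This is not ModelMix's implementation: `tryInOrder` and the stub providers are hypothetical names used only to illustrate "try the next one when the current one fails".

```javascript
// Hypothetical sketch of a fallback chain (not ModelMix internals).
// Each provider is an async function that resolves with text or throws.
async function tryInOrder(providers, prompt) {
    const errors = [];
    for (const { name, call } of providers) {
        try {
            // The first provider that resolves wins; later ones are never called.
            return { name, text: await call(prompt) };
        } catch (err) {
            errors.push(`${name}: ${err.message}`);
        }
    }
    throw new Error(`All providers failed -> ${errors.join('; ')}`);
}

// Stub providers standing in for real API calls.
const providers = [
    { name: 'primary', call: async () => { throw new Error('quota exceeded'); } },
    { name: 'fallback', call: async (prompt) => `echo: ${prompt}` },
];

tryInOrder(providers, 'What is the capital of France?')
    .then((result) => console.log(result.name, result.text));
```

Collecting the individual error messages makes the final failure easier to diagnose when every provider in the chain is down.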
## 🔧 Model Context Protocol (MCP) Integration
ModelMix makes it incredibly easy to enhance your AI models with powerful capabilities through the Model Context Protocol. With just a few lines of code, you can add features like web search, code execution, or any custom functionality to your models.
### Example: Adding Web Search Capability
Include the API key for Brave Search in your .env file.
```plaintext
BRAVE_API_KEY="BSA0..._fm"
```
```javascript
const mmix = ModelMix.new({ config: { max_history: 10 } }).gpt5nano();
mmix.setSystem('You are an assistant and today is ' + new Date().toISOString());
// Add web search capability through MCP
await mmix.addMCP('@modelcontextprotocol/server-brave-search');
mmix.addText('Use Internet: When did the last Christian pope die?');
console.log(await mmix.message());
```
This simple integration allows your model to:
- Search the web in real-time
- Access up-to-date information
- Combine AI reasoning with external data
The Model Context Protocol makes it easy to add any capability to your models, from web search to code execution, database queries, or custom functions. All with just a few lines of code!
## ⚡️ Shorthand Methods
ModelMix provides convenient shorthand methods for quickly accessing different AI models.
Here's a comprehensive list of available methods:
| Method | Provider | Model | Price (I/O) per 1 M tokens |
| ------------------- | ---------- | ---------------------------- | -------------------------- |
| `gpt54()` | OpenAI | gpt-5.4 | [\$2.50 / \$15.00][1] |
| `gpt54mini()` | OpenAI | gpt-5.4-mini | [\$0.75 / \$4.50][1] |
| `gpt54nano()` | OpenAI | gpt-5.4-nano | [\$0.20 / \$1.25][1] |
| `gpt53codex()` | OpenAI | gpt-5.3-codex | [\$1.25 / \$14.00][1] |
| `gpt52()` | OpenAI | gpt-5.2 | [\$1.75 / \$14.00][1] |
| `gpt51()` | OpenAI | gpt-5.1 | [\$1.25 / \$10.00][1] |
| `gpt5mini()` | OpenAI | gpt-5-mini | [\$0.25 / \$2.00][1] |
| `gpt5nano()` | OpenAI | gpt-5-nano | [\$0.05 / \$0.40][1] |
| `gpt41()` | OpenAI | gpt-4.1 | [\$2.00 / \$8.00][1] |
| `gpt41mini()` | OpenAI | gpt-4.1-mini | [\$0.40 / \$1.60][1] |
| `gpt41nano()` | OpenAI | gpt-4.1-nano | [\$0.10 / \$0.40][1] |
| `gptOss()` | Together | gpt-oss-120B | [\$0.15 / \$0.60][7] |
| `opus47[think]()` | Anthropic | claude-opus-4-7 | [\$5.00 / \$25.00][2] |
| `opus46[think]()` | Anthropic | claude-opus-4-6 | [\$5.00 / \$25.00][2] |
| `sonnet46[think]()` | Anthropic | claude-sonnet-4-6 | [\$3.00 / \$15.00][2] |
| `haiku45[think]()` | Anthropic | claude-haiku-4-5-20251001 | [\$1.00 / \$5.00][2] |
| `gemini31pro()` | Google | gemini-3.1-pro-preview | [\$2.00 / \$12.00][3] |
| `gemini35flash()` | Google | gemini-3.5-flash | N/A |
| `gemini31flashLite()`| Google | gemini-3.1-flash-lite-preview | [\$0.25 / \$1.50][3] |
| `grok43()` | Grok | grok-4.3 | [\$1.25 / \$2.50][6] |
| `grok420multiAgent()`| Grok | grok-4.20-multi-agent-0309 | [\$1.25 / \$2.50][6] |
| `grok420[think]()` | Grok | grok-4.20-0309 | [\$1.25 / \$2.50][6] |
| `grok41[think]()` | Grok | grok-4-1-fast | [\$0.20 / \$0.50][6] |
| `qwen36plus()` | Fireworks/Together | qwen3p6-plus / Qwen3.6-Plus | [\$0.50 / \$3.00][10] |
| `deepseekV4Pro()` | Fireworks | models/deepseek-v4-pro | [\$1.74 / \$3.48][10] |
| `GLM51()` | Fireworks | models/glm-5p1 | [\$1.05 / \$3.50][10] |
| `minimaxM27()` | MiniMax | MiniMax-M2.7 | [\$0.30 / \$1.20][9] |
| `sonar()` | Perplexity | sonar | [\$1.00 / \$1.00][4] |
| `sonarPro()` | Perplexity | sonar-pro | [\$3.00 / \$15.00][4] |
| `hermes3()` | Lambda | Hermes-3-Llama-3.1-405B-FP8 | [\$0.80 / \$0.80][8] |
| `kimiK25think()` | Together | Kimi-K2.5 | [\$0.50 / \$2.80][7] |
| `kimiK26think()` | Fireworks | models/kimi-k2p6 | [\$0.95 / \$4.00][10] |
[1]: https://platform.openai.com/docs/pricing "Pricing | OpenAI"
[2]: https://docs.anthropic.com/en/docs/about-claude/pricing "Pricing - Anthropic"
[3]: https://ai.google.dev/gemini-api/docs/pricing "Google AI for Developers"
[4]: https://docs.perplexity.ai/guides/pricing "Pricing - Perplexity"
[5]: https://groq.com/pricing/ "Groq Pricing"
[6]: https://docs.x.ai/docs/models "xAI"
[7]: https://www.together.ai/pricing "Together AI"
[8]: https://lambda.ai/inference "Lambda Pricing"
[9]: https://platform.minimax.io/docs/api-reference/anthropic-api-compatible-cache#supported-models-and-pricing "MiniMax Pricing"
[10]: https://fireworks.ai/pricing#serverless-pricing "Fireworks Pricing"
Each method accepts optional `options`, `config`, and (for multi-provider methods) `mix` parameters to customize behavior.
For NVIDIA on DeepSeek V4 Pro, use `deepseekV4Pro({ mix: { nvidia: true } })`.
For Together on Qwen 3.6 Plus, use `qwen36plus({ mix: { fireworks: false, together: true } })`.
```javascript
const result = await ModelMix.new({
    options: { temperature: 0.7 },
    config: { system: "You are a helpful assistant" }
})
    .sonnet46()
    .addText("Tell me a story about a cat")
    .message();
```
## 🔄 Templates
ModelMix includes a simple but powerful templating system. You can write your system prompts and user messages in external `.md` files with placeholders, then use `replace` to fill them in at runtime.
### Core methods
| Method | Description |
| --- | --- |
| `setSystemFromFile(path)` | Load the system prompt from a file |
| `addTextFromFile(path)` | Load a user message from a file |
| `replace({ key: value })` | Replace placeholders in all messages and the system prompt |
| `replaceKeyFromFile(key, path)` | Replace a placeholder with the contents of a file |
### Basic example with `replace`
```javascript
const gpt = ModelMix.new().gpt52();
gpt.addText('Write a short story about a {animal} that lives in {place}.');
gpt.replace({ '{animal}': 'cat', '{place}': 'a haunted castle' });
console.log(await gpt.message());
```
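To make the substitution concrete, here is a minimal sketch of literal placeholder replacement. It assumes, as in the example above, that the braces are part of each key; `fillPlaceholders` is a hypothetical helper, not a ModelMix function, and the library may handle replacement differently.

```javascript
// Hypothetical helper: replace every occurrence of each key with its value.
// split/join treats keys as literal text, so braces need no regex escaping.
function fillPlaceholders(template, replacements) {
    let text = template;
    for (const [key, value] of Object.entries(replacements)) {
        text = text.split(key).join(String(value));
    }
    return text;
}

const prompt = fillPlaceholders(
    'Write a short story about a {animal} that lives in {place}.',
    { '{animal}': 'cat', '{place}': 'a haunted castle' }
);
console.log(prompt);
// Write a short story about a cat that lives in a haunted castle.
```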
### Loading prompts from `.md` files
Instead of writing long prompts inline, keep them in separate Markdown files. This makes them easier to read, edit, and version control.
**`prompts/system.md`**
```markdown
You are {role}, an expert in {topic}.
Always respond in {language}.
```
**`prompts/task.md`**
```markdown
Analyze the following and provide 3 key insights:
{content}
```
**`app.js`**
```javascript
const gpt = ModelMix.new().gpt5mini();
gpt.setSystemFromFile('./prompts/system.md');
gpt.addTextFromFile('./prompts/task.md');
gpt.replace({
'{role}': 'a senior analyst',
'{topic}': 'market trends',
'{language}': 'Spanish',
'{content}': 'Bitcoin surpassed $100,000 in December 2024...'
});
console.log(await gpt.message());
```
### Injecting file contents into a placeholder
Use `replaceKeyFromFile` when the replacement value itself is a large text stored in a file.
**`prompts/summarize.md`**
```markdown
Summarize the following article in 3 bullet points:
{article}
```
**`app.js`**
```javascript
const gpt = ModelMix.new().gpt5mini();
gpt.addTextFromFile('./prompts/summarize.md');
gpt.replaceKeyFromFile('{article}', './data/article.md');
console.log(await gpt.message());
```
### Full template workflow
Combine all methods to build reusable, file-based prompt pipelines:
**`prompts/system.md`**
```markdown
You are {role}. Follow these rules:
- Be concise
- Use examples when possible
- Respond in {language}
```
**`prompts/review.md`**
```markdown
Review the following code and suggest improvements:
{code}
```
**`app.js`**
```javascript
const gpt = ModelMix.new().gpt5mini();
gpt.setSystemFromFile('./prompts/system.md');
gpt.addTextFromFile('./prompts/review.md');
gpt.replace({ '{role}': 'a senior code reviewer', '{language}': 'English' });
gpt.replaceKeyFromFile('{code}', './src/utils.js');
console.log(await gpt.message());
```
## 🧩 JSON Structured Output
The `json` method forces the model to return a structured JSON response. You define the shape with an example object and optionally describe each field.
```javascript
await model.json(schemaExample, schemaDescription, options)
```
### Basic usage
```javascript
const model = ModelMix.new()
.gpt5mini()
.addText('Name and capital of 3 South American countries.');
const result = await model.json({ countries: [{ name: "", capital: "" }] });
console.log(result);
// { countries: [{ name: "Argentina", capital: "Buenos Aires" }, ...] }
```
### Adding field descriptions
The second argument lets you describe each field so the model understands exactly what you expect. Descriptions can be **strings** (simple) or **descriptor objects** (with metadata):
```javascript
const result = await model.json(
{ countries: [{ name: "Argentina", capital: "BUENOS AIRES" }] },
{ countries: [{ name: "name of the country", capital: "capital of the country in uppercase" }] },
{ addNote: true }
);
// { countries: [
// { name: "Brazil", capital: "BRASILIA" },
// { name: "Colombia", capital: "BOGOTA" },
// { name: "Chile", capital: "SANTIAGO" }
// ]}
```
### Enhanced descriptors
Descriptions support **descriptor objects** with `description`, `required`, `enum`, `default`, and `nullable`:
```javascript
const result = await model.json(
{ name: 'Martin', age: 22, sex: 'male' },
{
name: { description: 'Name of the actor', required: false },
age: 'Age of the actor', // string still works
sex: { description: 'Gender', enum: ['male', 'female', null], default: null }
}
);
```
| Property | Type | Default | Description |
| --- | --- | --- | --- |
| `description` | `string` | — | Field description for the model |
| `required` | `boolean` | `true` | If `false`, field is removed from `required` and its type becomes nullable |
| `enum` | `array` | — | Restricts the field to specific values. Including `null` in the array auto-makes the type nullable |
| `default` | `any` | — | Default value hint for the model |
| `nullable` | `boolean` | `false` | If `true`, makes the type nullable without removing from `required` |
You can mix plain strings and descriptor objects freely in the same descriptions parameter:
```javascript
const result = await model.json(
{ name: 'Martin', age: 22, status: 'active' },
{
name: 'Full name', // plain string
age: { description: 'Age in years', required: false }, // optional field
status: { description: 'Account status', enum: ['active', 'inactive', 'banned'], default: 'active' }
}
);
```
### Nested object descriptions
Pass a nested object as the description value to describe fields inside a nested object:
```javascript
const result = await model.json(
{ user: { name: 'Alice', age: 30 } },
{
user: { name: 'Full name of the user', age: 'Age in years' }
}
);
```
To describe the object field itself (for example, to mark it optional), pass a descriptor with `description` / `required` as the value for the parent key. The descriptor applies only to the parent; its nested fields are then left undescribed:
```javascript
// Mark the parent optional but don't describe its children
const result = await model.json(
{ user: { name: 'Alice', age: 30 } },
{ user: { description: 'User details', required: false } }
);
```
### Array item descriptions
Pass descriptions for the items of an array by wrapping the descriptions in an array:
```javascript
const result = await model.json(
{ countries: [{ name: 'France', capital: 'Paris' }] },
{ countries: [{ name: 'Country name', capital: 'Capital city in uppercase' }] }
);
```
To describe the array field itself (for example, to mark it optional) rather than its items, use a descriptor on the key:
```javascript
const result = await model.json(
{ tags: ['admin'] },
{ tags: { description: 'List of user roles', required: false } }
);
```
### Automatic type and format detection
`generateJsonSchema` infers types and formats automatically from the example values:
| Example value | Inferred schema |
| --- | --- |
| `42` | `{ type: 'integer' }` |
| `19.99` | `{ type: 'number' }` |
| `true` / `false` | `{ type: 'boolean' }` |
| `null` | `{ type: 'null' }` |
| `'hello'` | `{ type: 'string' }` |
| `'user@example.com'` | `{ type: 'string', format: 'email' }` |
| `'1990-01-01'` | `{ type: 'string', format: 'date', description: 'Date in format YYYY-MM-DD' }` |
| `'14:30'` | `{ type: 'string', format: 'time', description: 'Time in format HH:MM' }` |
| `'09:15:45'` | `{ type: 'string', format: 'time', description: 'Time in format HH:MM:SS' }` |
| `[{ … }]` | `{ type: 'array', items: { … } }` — schema inferred from the first element |
| `{ … }` | `{ type: 'object', properties: { … }, required: […] }` |
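The table can be read as a small recursive function. The sketch below is a hypothetical re-implementation written only from the rows above; it is not the library's `generateJsonSchema`, and the regular expressions and the empty-array case are assumptions.

```javascript
// Hypothetical re-implementation of the inference table (not the library's code).
function inferSchema(value) {
    if (value === null) return { type: 'null' };
    if (Array.isArray(value)) {
        // Item schema comes from the first element; an empty array gets {} (assumption).
        return { type: 'array', items: value.length ? inferSchema(value[0]) : {} };
    }
    switch (typeof value) {
        case 'boolean': return { type: 'boolean' };
        case 'number': return { type: Number.isInteger(value) ? 'integer' : 'number' };
        case 'string': return inferString(value);
        case 'object': {
            const properties = {};
            for (const [key, child] of Object.entries(value)) {
                properties[key] = inferSchema(child);
            }
            // Every key is required unless a descriptor says otherwise.
            return { type: 'object', properties, required: Object.keys(value) };
        }
        default: throw new TypeError(`Unsupported example value: ${typeof value}`);
    }
}

// String formats are detected with simple patterns (assumed, not the library's).
function inferString(text) {
    if (/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(text)) return { type: 'string', format: 'email' };
    if (/^\d{4}-\d{2}-\d{2}$/.test(text)) return { type: 'string', format: 'date', description: 'Date in format YYYY-MM-DD' };
    if (/^\d{2}:\d{2}:\d{2}$/.test(text)) return { type: 'string', format: 'time', description: 'Time in format HH:MM:SS' };
    if (/^\d{2}:\d{2}$/.test(text)) return { type: 'string', format: 'time', description: 'Time in format HH:MM' };
    return { type: 'string' };
}

console.log(JSON.stringify(inferSchema({ name: 'Ana', age: 30, email: 'ana@example.com' }), null, 2));
```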
When a field carries an `enum` that includes `null`, or has `required: false` or `nullable: true`, its type is widened to `[type, 'null']`. For example:
```javascript
// enum with null → type becomes ['string', 'null']
{ description: 'Gender', enum: ['m', 'f', null] }
// required: false → removes from required[] and type becomes ['string', 'null']
{ description: 'Nickname', required: false }
// nullable: true → type becomes ['string', 'null'] but stays in required[]
{ description: 'Middle name', nullable: true }
```
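The three widening rules can be summarized in one function. This is a sketch under stated assumptions: `applyDescriptor` is a hypothetical name, and the real library may merge descriptors differently.

```javascript
// Hypothetical sketch: merge a descriptor into a field schema.
// Returns the updated schema plus whether the key stays in required[].
function applyDescriptor(fieldSchema, descriptor) {
    // Plain strings are shorthand for { description }.
    const d = typeof descriptor === 'string' ? { description: descriptor } : descriptor;
    const schema = { ...fieldSchema };
    if (d.description !== undefined) schema.description = d.description;
    if (d.enum !== undefined) schema.enum = d.enum;
    if (d.default !== undefined) schema.default = d.default;

    const required = d.required !== false; // required defaults to true
    const enumHasNull = Array.isArray(d.enum) && d.enum.includes(null);
    if (!required || d.nullable === true || enumHasNull) {
        schema.type = [fieldSchema.type, 'null'];
    }
    return { schema, required };
}

const nickname = applyDescriptor({ type: 'string' }, { description: 'Nickname', required: false });
console.log(nickname);
// -> schema.type is ['string', 'null'] and required is false
```

Only `required: false` removes the key from `required[]`; `nullable: true` and an `enum` containing `null` widen the type but leave the key required.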
### Array auto-wrap
When you pass a top-level array as the example, ModelMix automatically wraps it for better LLM compatibility and unwraps the result transparently:
```javascript
const result = await model.json([{ name: 'martin' }]);
// result is an array: [{ name: "Martin" }, { name: "Carlos" }, ...]
```
Internally, the array is wrapped as `{ out: [...] }` so the model receives a proper object schema, then `result.out` is returned automatically.
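A rough sketch of that wrap/unwrap step, assuming only what the paragraph above states (the `out` key). `wrapExample` and `unwrapResult` are hypothetical helper names, and the model reply is hard-coded for illustration.

```javascript
// Hypothetical sketch of the wrap/unwrap step; `out` is the key named above.
function wrapExample(example) {
    return Array.isArray(example)
        ? { wrapped: true, example: { out: example } }
        : { wrapped: false, example };
}

function unwrapResult(result, wrapped) {
    return wrapped ? result.out : result;
}

const { wrapped, example } = wrapExample([{ name: 'martin' }]);
console.log(example); // the object-shaped example the model would see

// Pretend the model answered using the wrapped object schema.
const modelReply = { out: [{ name: 'Martin' }, { name: 'Carlos' }] };
console.log(unwrapResult(modelReply, wrapped));
```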
### Options
| Option | Default | Description |
| --- | --- | --- |
| `addSchema` | `true` | Include the generated JSON schema in the system prompt |
| `addExample` | `false` | Include the example object in the system prompt |
| `addNote` | `false` | Add a note about JSON escaping to prevent parsing errors |
```javascript
// Include the example and the escaping note
const result = await model.json(
{ name: "John", age: 30, skills: ["JavaScript"] },
{ name: "Full name", age: "Age in years", skills: "List of programming languages" },
{ addExample: true, addNote: true }
);
```
These options give you fine-grained control over how much guidance you provide to the model for generating properly formatted JSON responses.
## 📊 Token Usage Tracking
ModelMix automatically tracks token usage for all requests across different providers, providing a unified format regardless of the underlying API.
### How it works
Every response from `raw()` includes a `tokens` object with the following structure:
```javascript
{
tokens: {
input: 150, // Number of tokens in the prompt/input
output: 75, // Number of tokens in the completion/output
total: 225, // Total tokens used (input + output)
cached: 100, // Cached input tokens reported by the provider (0 when absent)
cost: 0.0012, // Estimated cost in USD (null if model not in pricing table)
speed: 42 // Output tokens per second (int)
}
}
```
### `lastRaw` — Access full response after `message()` or `json()`
After calling `message()` or `json()`, use `lastRaw` to access the complete response (tokens, thinking, tool calls, etc.). It has the same structure as `raw()`.
```javascript
const text = await model.message();
console.log(model.lastRaw.tokens);
// { input: 122, output: 86, total: 208, cached: 41, cost: 0.000319, speed: 38 }
```
The `cached` field is a single aggregated count of cached input tokens reported by the provider. The `cost` field is the estimated cost in USD based on the model's pricing per 1M tokens (input/output). If the model is not found in the pricing table, `cost` will be `null`. The `speed` field is the generation speed measured in output tokens per second (integer).
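As a worked example of how `cost` and `speed` relate to the token counts, the sketch below recomputes them by hand. It is an illustration only: `summarizeUsage` is a hypothetical function, the rounding of `speed` is an assumption, and any discount a provider applies to cached tokens is ignored.

```javascript
// Hypothetical sketch: derive cost and speed from token counts.
// Prices are USD per 1M tokens; a missing price entry yields cost = null.
function summarizeUsage({ input, output, cached = 0 }, pricing, elapsedMs) {
    const cost = pricing
        ? (input * pricing.input + output * pricing.output) / 1_000_000
        : null;
    // Integer output tokens per second; the rounding mode is an assumption.
    const speed = elapsedMs > 0 ? Math.round(output / (elapsedMs / 1000)) : 0;
    return { input, output, total: input + output, cached, cost, speed };
}

// $3.00 / $15.00 per 1M tokens, as listed for sonnet46() in the pricing table.
const usage = summarizeUsage(
    { input: 150, output: 75, cached: 100 },
    { input: 3, output: 15 },
    1786 // elapsed generation time in ms (made-up value)
);
console.log(usage);
```

With these numbers the cost is (150 × 3 + 75 × 15) / 1,000,000 = 0.001575 USD, and 75 tokens in 1.786 s rounds to 42 tokens per second.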
## 🐛 Enabling Debug Mode
To activate debug mode in ModelMix and view detailed request information, follow these two steps:
1. In the ModelMix constructor, include a `debug` level in the configuration:
```javascript
const mix = ModelMix.new({
config: {
debug: 4 // 0=silent, 1=minimal, 2=summary, 3=full (no truncate), 4=verbose (raw details)
// ... other configuration options ...
}
});
```
2. When running your script from the command line, use the `DEBUG=ModelMix*` prefix:
```bash
DEBUG=ModelMix* node your_script.js
```
When you run your script this way, you'll see detailed information about the requests in the console, including the configuration and options used for each AI model request.
This information is valuable for debugging and understanding how ModelMix is processing your requests.
## 🚦 Bottleneck Integration
ModelMix uses Bottleneck to rate-limit API requests. This helps you stay within provider rate limits when working with multiple models or high request volumes.
### How it works:
1. **Configuration**: Bottleneck is configured in the ModelMix constructor. You can customize the settings or use the default configuration:
```javascript
const setup = {
config: {
bottleneck: {
maxConcurrent: 8, // Maximum number of concurrent requests
minTime: 500 // Minimum time between requests (in ms)
}
}
};
```
2. **Rate Limiting**: When you make a request using any of the attached models, Bottleneck automatically manages the request flow based on the configured settings.
3. **Automatic Queueing**: If the rate limit is reached, Bottleneck will automatically queue subsequent requests and process them as capacity becomes available.
This integration ensures that your application respects API rate limits while maximizing throughput, providing a robust solution for managing multiple AI model interactions.
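To see what `minTime` means in practice, the following sketch computes when queued requests would be launched if launches must be at least `minTime` milliseconds apart. It is a simplified model, not Bottleneck itself: `launchTimes` is a hypothetical function, it ignores `maxConcurrent`, and it assumes arrival times are sorted.

```javascript
// Hypothetical sketch: given request arrival times (ms), compute launch times
// when launches must be at least `minTime` ms apart. Ignores maxConcurrent.
function launchTimes(arrivals, minTime) {
    const launches = [];
    for (const arrival of arrivals) {
        const previous = launches.length ? launches[launches.length - 1] : -Infinity;
        launches.push(Math.max(arrival, previous + minTime));
    }
    return launches;
}

// Three requests arrive almost at once, a fourth much later.
console.log(launchTimes([0, 10, 20, 2000], 500));
// [ 0, 500, 1000, 2000 ]
```

The burst is spread out in 500 ms steps, while the late request starts immediately because enough time has already passed.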
## 🔁 Retry (Opt-In)
ModelMix supports optional intra-model retries for transient HTTP failures. When enabled, it retries the same provider before moving to fallback models.
```javascript
const mix = ModelMix.new({
config: {
retry: {
enabled: true, // Default: false (opt-in)
retries: 2, // Extra attempts after first try
baseDelayMs: 500, // Exponential backoff base delay
maxDelayMs: 5000, // Backoff cap
retryableStatusCodes: [408, 425, 429, 500, 502, 503, 504, 529]
}
}
});
```
Behavior summary:
- If retry is disabled (default), ModelMix falls back to the next model immediately on failure.
- If retry is enabled, ModelMix retries the same model only for configured transient status codes.
- After retries are exhausted (or for non-retryable errors), ModelMix continues with normal fallback chain.
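The behavior summary can be sketched as two small functions. This is an illustration under assumptions: the delay formula (plain doubling, capped, no jitter) is a guess at what "exponential backoff" means here, and `backoffDelay` and `shouldRetry` are hypothetical names rather than ModelMix APIs.

```javascript
// Hypothetical sketch of the retry decision and backoff delay.
// Assumes plain exponential doubling without jitter.
const retryConfig = {
    enabled: true,
    retries: 2,
    baseDelayMs: 500,
    maxDelayMs: 5000,
    retryableStatusCodes: [408, 425, 429, 500, 502, 503, 504, 529],
};

// attempt is 0 for the first retry, 1 for the second, and so on.
function backoffDelay(attempt, { baseDelayMs, maxDelayMs }) {
    return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

// Retry only while enabled, under the retry budget, and for transient codes.
function shouldRetry(status, attempt, config) {
    return config.enabled
        && attempt < config.retries
        && config.retryableStatusCodes.includes(status);
}

console.log([0, 1, 2, 3, 4].map((n) => backoffDelay(n, retryConfig)));
// [ 500, 1000, 2000, 4000, 5000 ]
console.log(shouldRetry(429, 0, retryConfig), shouldRetry(401, 0, retryConfig));
// true false
```

When `shouldRetry` returns `false`, the request would move on to the next model in the fallback chain.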
## 📚 ModelMix Class Overview
```javascript
new ModelMix(args = { options: {}, config: {} })
```
- **args**: Configuration object with `options` and `config` properties.
- **options**: This object contains default options that are applied to all models. These options can be overridden when creating a specific model instance. Examples of default options include:
- `max_tokens`: Sets the maximum number of tokens to generate, e.g., 2000.
- `temperature`: Controls the randomness of the model's output, e.g., 1.
- ...(Additional default options can be added as needed)
- **config**: This object contains configuration settings that control the behavior of the `ModelMix` instance. These settings can also be overridden for specific model instances. Examples of configuration settings include:
- `system`: Sets the default system message for the model, e.g., "You are an assistant."
- `max_history`: Limits the number of historical messages to retain, e.g., 1.
- `roundRobin`: When `true`, rotates through attached models on each request for load balancing. When `false` (default), uses fallback mode where models are tried sequentially only if previous ones fail.
- `bottleneck`: Configures the rate limiting behavior using Bottleneck. For example:
- `maxConcurrent`: Maximum number of concurrent requests
- `minTime`: Minimum time between requests (in ms)
- `reservoir`: Number of requests allowed in the reservoir period
- `reservoirRefreshAmount`: How many requests are added when the reservoir refreshes
- `reservoirRefreshInterval`: Reservoir refresh interval
- `retry`: Optional intra-model retry policy before fallback:
- `enabled`: Enables retry behavior (`false` by default)
- `retries`: Number of retries for retryable failures
- `baseDelayMs`: Initial backoff delay in milliseconds
- `maxDelayMs`: Maximum backoff delay in milliseconds
- `retryableStatusCodes`: HTTP status codes that should trigger retry
- ...(Additional configuration parameters can be added as needed)
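The difference between fallback mode and `roundRobin` can be illustrated by the order in which attached models would be tried for a given request. The sketch assumes that round robin rotates the starting model and keeps the remaining models as fallbacks; that detail is not confirmed by this document, and `attemptOrder` is a hypothetical function.

```javascript
// Hypothetical sketch: order in which attached models are tried for request n.
// Assumes round robin rotates the starting model and keeps the rest as fallbacks.
function attemptOrder(models, requestIndex, roundRobin = false) {
    if (!roundRobin || models.length === 0) return [...models];
    const start = requestIndex % models.length;
    return [...models.slice(start), ...models.slice(0, start)];
}

const models = ['sonnet46', 'gpt5mini', 'grok43'];
console.log(attemptOrder(models, 0, true));  // [ 'sonnet46', 'gpt5mini', 'grok43' ]
console.log(attemptOrder(models, 1, true));  // [ 'gpt5mini', 'grok43', 'sonnet46' ]
console.log(attemptOrder(models, 1, false)); // [ 'sonnet46', 'gpt5mini', 'grok43' ]
```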
**Methods**
- `attach(modelKey, modelInstance)`: Attaches a model instance to the `ModelMix`.
- `new()`: `static` Creates a new `ModelMix`.
- `new()`: Creates a new `ModelMix` using instance setup.
- `setSystem(text)`: Sets the system prompt.
- `setSystemFromFile(filePath)`: Sets the system prompt from a file.
- `addText(text, config = { role: "user" })`: Adds a text message.
- `addTextFromFile(filePath, config = { role: "user" })`: Adds a text message from a file.
- `addImage(filePath, config = { role: "user" })`: Adds an image message from a file path.
- `addImageFromUrl(url, config = { role: "user" })`: Adds an image message from URL.
- `replace(keyValues)`: Defines placeholder replacements for messages and system prompt.
- `replaceKeyFromFile(key, filePath)`: Defines a placeholder replacement with file contents as value.
- `message()`: Sends the message and returns the response.
- `raw()`: Sends the message and returns the complete response data including:
- `message`: The text response from the model
- `think`: Reasoning/thinking content (if available)
- `toolCalls`: Array of tool calls made by the model (if any)
- `tokens`: Object with `input`, `output`, `total`, and `cached` token counts, plus `cost` (USD) and `speed` (output tokens/sec)
- `response`: The raw API response
- `stream(callback)`: Sends the message and streams the response, invoking the callback with each streamed part.
- `json(schemaExample, descriptions = {}, options = {})`: Forces the model to return a response in a specific JSON format.
- `schemaExample`: Example of the JSON structure to be returned. Top-level arrays are auto-wrapped for better LLM compatibility.
- `descriptions`: Descriptions for each field. Each can be a string or a descriptor object with `{ description, required, enum, default, nullable }`.
- `options`: `{ addSchema: true, addExample: false, addNote: false }`
- Returns a Promise that resolves to the structured JSON response
- Example:
```javascript
const response = await handler.json(
{ time: '24:00:00', message: 'Hello' },
{ time: 'Time in format HH:MM:SS', message: { description: 'Greeting', required: false } }
);
```
- `block({ addText = true })`: Forces the model to return a response in a specific block format.
### MixCustom Class Overview
```javascript
new MixCustom(args = { config: {}, options: {}, headers: {} })
```
- **args**: Configuration object with `config`, `options`, and `headers` properties.
- **config**:
- `url`: The endpoint URL to which the model sends requests.
- `prefix`: An array of strings used as a prefix for requests.
- ...(Additional configuration parameters can be added as needed)
- **options**: This object contains default options that are applied to all models. These options can be overridden when creating a specific model instance. Examples of default options include:
- `max_tokens`: Sets the maximum number of tokens to generate, e.g., 2000.
- `temperature`: Controls the randomness of the model's output, e.g., 1.
- `top_p`: Controls the diversity of the output, e.g., 1.
- ...(Additional default options can be added as needed)
- **headers**:
- `authorization`: The authorization header, typically including a Bearer token for API access.
- `x-api-key`: A custom header for API key if needed.
- ...(Additional headers can be added as needed)
### MixOpenAI Class Overview
```javascript
new MixOpenAI(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for OpenAI, including the `apiKey`.
- **options**: Default options for OpenAI model instances.
### MixOpenRouter Class Overview
```javascript
new MixOpenRouter(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for OpenRouter, including the `apiKey`.
- **options**: Default options for OpenRouter model instances.
### MixAnthropic Class Overview
```javascript
new MixAnthropic(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for Anthropic, including the `apiKey`.
- **options**: Default options for Anthropic model instances.
### MixPerplexity Class Overview
```javascript
new MixPerplexity(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for Perplexity, including the `apiKey`.
- **options**: Default options for Perplexity model instances.
### MixGroq Class Overview
```javascript
new MixGroq(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for Groq, including the `apiKey`.
- **options**: Default options for Groq model instances.
### MixOllama Class Overview
```javascript
new MixOllama(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for Ollama.
- `url`: The endpoint URL to which the model sends requests.
- **options**: Default options for Ollama model instances.
### MixLMStudio Class Overview
```javascript
new MixLMStudio(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for LM Studio.
- `url`: The endpoint URL to which the model sends requests.
- **options**: Default options for LM Studio model instances.
### MixTogether Class Overview
```javascript
new MixTogether(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for Together AI, including the `apiKey`.
- **options**: Default options for Together AI model instances.
### MixGoogle Class Overview
```javascript
new MixGoogle(args = { config: {}, options: {} })
```
- **args**: Configuration object with `config` and `options` properties.
- **config**: Specific configuration settings for Google Gemini, including the `apiKey`.
- **options**: Default options for Google Gemini model instances.
## 🤝 Contributing
Contributions are welcome! If you find any issues or have suggestions for improvement, please open an issue or submit a pull request on the [GitHub repository](https://github.com/clasen/ModelMix).
## 📄 License
The MIT License (MIT)
Copyright (c) Martin Clasen
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.