UNPKG

clasp-ai

Version:

Claude Language Agent Super Proxy - Translate Claude/Anthropic API calls to OpenAI-compatible endpoints

528 lines (398 loc) 14 kB
# CLASP - Claude Language Agent Super Proxy A high-performance Go proxy that translates Claude/Anthropic API calls to OpenAI-compatible endpoints, enabling Claude Code to work with any LLM provider. ## Features - **Bundled Claude Code**: Automatically includes Claude Code as a dependency - single `npx clasp-ai` installs everything - **Multi-Provider Support**: OpenAI, Azure OpenAI, OpenRouter (200+ models), and custom endpoints (Ollama, vLLM, LM Studio) - **Full Protocol Translation**: Anthropic Messages API ↔ OpenAI Chat Completions API - **SSE Streaming**: Real-time token streaming with state machine processing - **Tool Calls**: Complete translation of tool_use/tool_result between formats - **Connection Pooling**: Optimized HTTP transport with persistent connections - **Retry Logic**: Exponential backoff for transient failures - **Metrics Endpoint**: Request statistics and performance monitoring - **API Key Authentication**: Secure the proxy with optional API key validation ## Installation ### Via npm (recommended) ```bash # Install globally npm install -g clasp-ai # Or run directly with npx npx clasp-ai ``` ### Via Go ```bash go install github.com/jedarden/clasp/cmd/clasp@latest ``` ### From Source ```bash git clone https://github.com/jedarden/CLASP.git cd CLASP make build ``` ### Via Docker ```bash # Run with Docker (from GitHub Container Registry) docker run -d -p 8080:8080 \ -e OPENAI_API_KEY=sk-... \ ghcr.io/jedarden/clasp:latest # With specific version docker run -d -p 8080:8080 \ -e OPENAI_API_KEY=sk-... \ ghcr.io/jedarden/clasp:0.24.8 # Or with docker-compose docker-compose up -d ``` **Available Docker tags:** - `latest` - Latest stable release - `0.24` - Latest 0.24.x release - `0.24.8` - Specific version ## Quick Start ### Using with OpenAI ```bash # Set your API key export OPENAI_API_KEY=sk-... # Start the proxy clasp -model gpt-4o # In another terminal, use Claude Code ANTHROPIC_BASE_URL=http://localhost:8080 claude ``` ### Using with Azure OpenAI ```bash export AZURE_API_KEY=your-key export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com export AZURE_DEPLOYMENT_NAME=gpt-4 clasp -provider azure ``` ### Using with OpenRouter ```bash export OPENROUTER_API_KEY=sk-or-... clasp -provider openrouter -model anthropic/claude-3-sonnet ``` ### Using with Local Models (Ollama) ```bash export CUSTOM_BASE_URL=http://localhost:11434/v1 clasp -provider custom -model llama3.1 ``` ## Configuration ### Command Line Options ``` clasp [options] Options: -port <port> Port to listen on (default: 8080) -provider <name> LLM provider: openai, azure, openrouter, custom -model <model> Default model to use for all requests -debug Enable debug logging (full request/response) -rate-limit Enable rate limiting -cache Enable response caching -cache-max-size <n> Maximum cache entries (default: 1000) -cache-ttl <n> Cache TTL in seconds (default: 3600) -multi-provider Enable multi-provider tier routing -fallback Enable fallback routing for auto-failover -auth Enable API key authentication -auth-api-key <key> API key for authentication (required with -auth) -version Show version information -help Show help message ``` ### Environment Variables | Variable | Description | Default | |----------|-------------|---------| | `PROVIDER` | LLM provider type | `openai` | | `CLASP_PORT` | Proxy server port | `8080` | | `CLASP_MODEL` | Default model | - | | `CLASP_MODEL_OPUS` | Model for Opus tier | - | | `CLASP_MODEL_SONNET` | Model for Sonnet tier | - | | `CLASP_MODEL_HAIKU` | Model for Haiku tier | - | | `OPENAI_API_KEY` | OpenAI API key | - | | `OPENAI_BASE_URL` | Custom OpenAI base URL | `https://api.openai.com/v1` | | `AZURE_API_KEY` | Azure OpenAI API key | - | | `AZURE_OPENAI_ENDPOINT` | Azure endpoint URL | - | | `AZURE_DEPLOYMENT_NAME` | Azure deployment name | - | | `AZURE_API_VERSION` | Azure API version | `2024-02-15-preview` | | `OPENROUTER_API_KEY` | OpenRouter API key | - | | `CUSTOM_BASE_URL` | Custom endpoint base URL | - | | `CUSTOM_API_KEY` | Custom endpoint API key | - | | `CLASP_DEBUG` | Enable all debug logging | `false` | | `CLASP_DEBUG_REQUESTS` | Log requests only | `false` | | `CLASP_DEBUG_RESPONSES` | Log responses only | `false` | | `CLASP_RATE_LIMIT` | Enable rate limiting | `false` | | `CLASP_RATE_LIMIT_REQUESTS` | Requests per window | `60` | | `CLASP_RATE_LIMIT_WINDOW` | Window in seconds | `60` | | `CLASP_RATE_LIMIT_BURST` | Burst allowance | `10` | | `CLASP_CACHE` | Enable response caching | `false` | | `CLASP_CACHE_MAX_SIZE` | Maximum cache entries | `1000` | | `CLASP_CACHE_TTL` | Cache TTL in seconds | `3600` | | `CLASP_MULTI_PROVIDER` | Enable multi-provider routing | `false` | | `CLASP_FALLBACK` | Enable fallback routing | `false` | | `CLASP_AUTH` | Enable API key authentication | `false` | | `CLASP_AUTH_API_KEY` | Required API key for access | - | | `CLASP_AUTH_ALLOW_ANONYMOUS_HEALTH` | Allow /health without auth | `true` | | `CLASP_AUTH_ALLOW_ANONYMOUS_METRICS` | Allow /metrics without auth | `false` | ### Model Mapping CLASP can automatically map Claude model tiers to your provider's models: ```bash # Map Claude tiers to specific models export CLASP_MODEL_OPUS=gpt-4o export CLASP_MODEL_SONNET=gpt-4o-mini export CLASP_MODEL_HAIKU=gpt-3.5-turbo ``` ### Multi-Provider Routing Route different Claude model tiers to different LLM providers for cost optimization: ```bash # Enable multi-provider routing export CLASP_MULTI_PROVIDER=true # Route Opus tier to OpenAI (premium) export CLASP_OPUS_PROVIDER=openai export CLASP_OPUS_MODEL=gpt-4o export CLASP_OPUS_API_KEY=sk-... # Optional, inherits from OPENAI_API_KEY # Route Sonnet tier to OpenRouter (cost-effective) export CLASP_SONNET_PROVIDER=openrouter export CLASP_SONNET_MODEL=anthropic/claude-3-sonnet export CLASP_SONNET_API_KEY=sk-or-... # Optional, inherits from OPENROUTER_API_KEY # Route Haiku tier to local Ollama (free) export CLASP_HAIKU_PROVIDER=custom export CLASP_HAIKU_MODEL=llama3.1 export CLASP_HAIKU_BASE_URL=http://localhost:11434/v1 clasp -multi-provider ``` **Multi-Provider Environment Variables:** | Variable | Description | |----------|-------------| | `CLASP_MULTI_PROVIDER` | Enable multi-provider routing (`true`/`1`) | | `CLASP_{TIER}_PROVIDER` | Provider for tier: `openai`, `openrouter`, `custom` | | `CLASP_{TIER}_MODEL` | Model name for the tier | | `CLASP_{TIER}_API_KEY` | API key (optional, inherits from main config) | | `CLASP_{TIER}_BASE_URL` | Base URL (optional, uses provider default) | Where `{TIER}` is `OPUS`, `SONNET`, or `HAIKU`. **Benefits:** - **Cost Optimization**: Use expensive providers only for complex tasks - **Latency Reduction**: Route simple requests to faster local models - **Redundancy**: Mix cloud and local providers for reliability - **A/B Testing**: Compare different models across tiers ## API Endpoints | Endpoint | Description | |----------|-------------| | `POST /v1/messages` | Anthropic Messages API (translated) | | `GET /health` | Health check | | `GET /metrics` | Request statistics (JSON) | | `GET /metrics/prometheus` | Prometheus metrics | | `GET /` | Server info | ## Example Usage ### With curl ```bash curl http://localhost:8080/v1/messages \ -H "Content-Type: application/json" \ -H "x-api-key: any-key" \ -d '{ "model": "claude-3-5-sonnet-20241022", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Hello!"} ] }' ``` ### Streaming ```bash curl http://localhost:8080/v1/messages \ -H "Content-Type: application/json" \ -H "x-api-key: any-key" \ -d '{ "model": "claude-3-5-sonnet-20241022", "max_tokens": 1024, "stream": true, "messages": [ {"role": "user", "content": "Count to 5"} ] }' ``` ## Response Caching CLASP can cache responses to reduce API costs and improve latency for repeated requests: ```bash # Enable caching with defaults (1000 entries, 1 hour TTL) clasp -cache # Custom cache settings clasp -cache -cache-max-size 500 -cache-ttl 1800 # Via environment CLASP_CACHE=true CLASP_CACHE_MAX_SIZE=500 clasp ``` **Caching behavior:** - Only non-streaming requests are cached - Requests with `temperature > 0` are not cached (non-deterministic) - Cache uses LRU (Least Recently Used) eviction when full - Cache entries expire after TTL (time-to-live) - Response headers include `X-CLASP-Cache: HIT` or `X-CLASP-Cache: MISS` ## Metrics Access `/metrics` for request statistics: ```json { "requests": { "total": 100, "successful": 98, "errors": 2, "streaming": 75, "tool_calls": 15, "success_rate": "98.00%" }, "performance": { "avg_latency_ms": "523.50", "requests_per_sec": "2.34" }, "cache": { "enabled": true, "size": 42, "max_size": 1000, "hits": 156, "misses": 44, "hit_rate": "78.00%" }, "uptime": "5m30s" } ``` ## Docker ### Build and Run ```bash # Build Docker image make docker # Run container make docker-run # Stop container make docker-stop ``` ### Docker Compose Create a `.env` file with your configuration: ```bash PROVIDER=openai OPENAI_API_KEY=sk-... CLASP_DEFAULT_MODEL=gpt-4o ``` Then start the service: ```bash docker-compose up -d ``` ### Docker Environment Variables All configuration is done through environment variables. See the Environment Variables section above. ## Development ```bash # Build make build # Run tests make test # Build for all platforms make build-all # Build Docker image make docker # Format code make fmt ``` ## Debugging Enable debug logging to troubleshoot issues: ```bash # Via CLI flag clasp -debug # Via environment variable CLASP_DEBUG=true clasp # Log only requests or responses CLASP_DEBUG_REQUESTS=true clasp CLASP_DEBUG_RESPONSES=true clasp ``` Debug output includes: - Incoming Anthropic requests - Outgoing OpenAI requests - Raw OpenAI responses - Transformed Anthropic responses ## Authentication Secure your CLASP proxy with API key authentication to control access: ```bash # Enable authentication with CLI flags clasp -auth -auth-api-key "my-secret-key" # Or via environment variables CLASP_AUTH=true CLASP_AUTH_API_KEY="my-secret-key" clasp ``` ### Providing the API Key Clients can provide the API key in two ways: ```bash # Via x-api-key header curl http://localhost:8080/v1/messages \ -H "x-api-key: my-secret-key" \ -H "Content-Type: application/json" \ -d '{"model": "claude-3-5-sonnet-20241022", ...}' # Via Authorization header (Bearer token) curl http://localhost:8080/v1/messages \ -H "Authorization: Bearer my-secret-key" \ -H "Content-Type: application/json" \ -d '{"model": "claude-3-5-sonnet-20241022", ...}' ``` ### Authentication Options | Variable | Description | Default | |----------|-------------|---------| | `CLASP_AUTH` | Enable authentication | `false` | | `CLASP_AUTH_API_KEY` | Required API key | - | | `CLASP_AUTH_ALLOW_ANONYMOUS_HEALTH` | Allow /health without auth | `true` | | `CLASP_AUTH_ALLOW_ANONYMOUS_METRICS` | Allow /metrics without auth | `false` | ### Endpoint Access with Authentication Enabled | Endpoint | Default Access | |----------|----------------| | `/` | Always accessible | | `/health` | Anonymous by default | | `/metrics` | Requires auth by default | | `/metrics/prometheus` | Requires auth by default | | `/v1/messages` | Requires auth | ### Using with Claude Code When authentication is enabled, set both the base URL and API key: ```bash # Start CLASP with auth OPENAI_API_KEY=sk-... clasp -auth -auth-api-key "proxy-key" # Use with Claude Code (the proxy key is passed as the Anthropic key) ANTHROPIC_BASE_URL=http://localhost:8080 ANTHROPIC_API_KEY=proxy-key claude ``` ## Request Queuing Queue requests during provider outages for automatic retry: ```bash # Enable request queuing clasp -queue # Custom queue settings clasp -queue -queue-max-size 200 -queue-max-wait 60 # Via environment CLASP_QUEUE=true CLASP_QUEUE_MAX_SIZE=200 clasp ``` ### Queue Options | Variable | Description | Default | |----------|-------------|---------| | `CLASP_QUEUE` | Enable request queuing | `false` | | `CLASP_QUEUE_MAX_SIZE` | Maximum queued requests | `100` | | `CLASP_QUEUE_MAX_WAIT` | Queue timeout in seconds | `30` | | `CLASP_QUEUE_RETRY_DELAY` | Retry delay in milliseconds | `1000` | | `CLASP_QUEUE_MAX_RETRIES` | Maximum retries per request | `3` | ## Circuit Breaker Prevent cascade failures with circuit breaker pattern: ```bash # Enable circuit breaker clasp -circuit-breaker # Custom circuit breaker settings clasp -circuit-breaker -cb-threshold 10 -cb-recovery 3 -cb-timeout 60 # Via environment CLASP_CIRCUIT_BREAKER=true clasp ``` ### Circuit Breaker Options | Variable | Description | Default | |----------|-------------|---------| | `CLASP_CIRCUIT_BREAKER` | Enable circuit breaker | `false` | | `CLASP_CIRCUIT_BREAKER_THRESHOLD` | Failures before opening circuit | `5` | | `CLASP_CIRCUIT_BREAKER_RECOVERY` | Successes to close circuit | `2` | | `CLASP_CIRCUIT_BREAKER_TIMEOUT` | Timeout in seconds before retry | `30` | ### Circuit Breaker States - **Closed**: Normal operation, requests pass through - **Open**: Circuit tripped, requests fail fast with 503 - **Half-Open**: Testing if service recovered, limited requests allowed ### Maximum Resilience Configuration For production deployments requiring maximum resilience: ```bash # Enable queue + circuit breaker + fallback OPENAI_API_KEY=sk-xxx OPENROUTER_API_KEY=sk-or-xxx \ clasp -queue -circuit-breaker -fallback \ -queue-max-size 200 \ -cb-threshold 5 \ -cb-timeout 30 ``` ## License MIT License - see [LICENSE](LICENSE) for details. ## Contributing Contributions are welcome! Please open an issue or submit a pull request on [GitHub](https://github.com/jedarden/CLASP).