spinal-obs-node: WithSpinal cost-aware OpenTelemetry SDK for Node.js
# What the Spinal SDK Tracks

This document explains what data the Spinal Observability SDK captures for LLM providers, particularly OpenAI.

## 🔍 **OpenAI Tracking Overview**

### **Current Implementation (HTTP-based)**

The SDK uses OpenTelemetry's HTTP instrumentation to capture OpenAI API interactions:

#### **Request Data Captured:**

- **Request URLs**: `https://api.openai.com/v1/chat/completions`
- **HTTP Method**: `POST`, `GET`, etc.
- **Request Headers**: Including authentication tokens (scrubbed)
- **Request Body Size**: Data volume being sent
- **Timing**: Request start time and duration

#### **Response Data Captured:**

- **Response Status**: Success/failure codes (200, 400, 429, etc.)
- **Response Headers**: Metadata about the response
- **Response Size**: Data volume received
- **Latency**: Total request duration

#### **Contextual Data:**

- **Custom Tags**: User-defined aggregation IDs, tenant info
- **Service Context**: Which service made the request
- **User Context**: User-specific metadata

## 💰 **Cost Tracking & Estimation**

### **Current Pricing Models**

The SDK includes built-in pricing for popular models:

```typescript
const catalog: PricingModel[] = [
  { model: 'openai:gpt-4o-mini', inputPer1K: 0.15, outputPer1K: 0.60 },
  { model: 'openai:gpt-4o', inputPer1K: 2.50, outputPer1K: 10.00 },
]
```

### **Cost Calculation**

```typescript
export function estimateCost(params: {
  model?: string
  inputTokens?: number
  outputTokens?: number
}): number {
  const { model = 'openai:gpt-4o-mini', inputTokens = 0, outputTokens = 0 } = params
  const entry = catalog.find((c) => c.model === model) ?? catalog[0]
  const inputCost = (inputTokens / 1000) * entry.inputPer1K
  const outputCost = (outputTokens / 1000) * entry.outputPer1K
  return roundUSD(inputCost + outputCost)
}
```

## 🛡️ **Privacy & Security**

### **Data Scrubbing**

The SDK automatically scrubs sensitive information:

- **API Keys**: Authentication tokens are removed
- **Sensitive Headers**: Authorization headers scrubbed
- **PII Data**: Personal information filtered out
- **Request Bodies**: Content may be sanitized

### **What Gets Exported**

Only safe, non-sensitive metadata is sent to Spinal cloud:

```json
{
  "span": {
    "name": "HTTP POST",
    "attributes": {
      "http.url": "https://api.openai.com/v1/chat/completions",
      "http.method": "POST",
      "http.status_code": 200,
      "http.request.header.content-length": "1234",
      "spinal.aggregationId": "user-chat-flow",
      "spinal.tenant": "acme-corp",
      "spinal.service": "chatbot"
    },
    "duration": 1500,
    "estimated_cost": 0.0025
  }
}
```

## 🎯 **Current vs Future Tracking**

### **Current Capabilities (HTTP-based)**

- ✅ Request/response metadata
- ✅ Timing and performance metrics
- ✅ Cost estimates (based on URL patterns)
- ✅ Basic usage statistics
- ✅ Error tracking and status codes
- ✅ Custom contextual tagging

### **Future Enhancements (Direct Integration)**

- 🔄 Exact token counts (input/output)
- 🔄 Model-specific pricing and usage
- 🔄 Detailed usage analytics
- 🔄 Performance optimization insights
- 🔄 Rate limiting analysis
- 🔄 Error pattern analysis

## 📊 **Example Tracking Data**

### **Successful API Call**

```json
{
  "span": {
    "name": "OpenAI Chat Completion",
    "attributes": {
      "http.url": "https://api.openai.com/v1/chat/completions",
      "http.method": "POST",
      "http.status_code": 200,
      "http.request.header.content-length": "1234",
      "http.response.header.content-length": "5678",
      "spinal.aggregationId": "user-chat-flow",
      "spinal.tenant": "acme-corp",
      "spinal.service": "chatbot",
      "spinal.model": "gpt-4o-mini",
      "spinal.estimated_cost": 0.0025
    },
    "duration": 1500,
    "startTime": "2024-01-15T10:30:00.000Z",
    "endTime": "2024-01-15T10:30:01.500Z"
  }
}
```

### **Failed API Call**

```json
{
  "span": {
    "name": "OpenAI Chat Completion",
    "attributes": {
      "http.url": "https://api.openai.com/v1/chat/completions",
      "http.method": "POST",
      "http.status_code": 429,
      "http.status_text": "Too Many Requests",
      "spinal.aggregationId": "user-chat-flow",
      "spinal.tenant": "acme-corp",
      "spinal.service": "chatbot",
      "error": true
    },
    "duration": 500,
    "startTime": "2024-01-15T10:30:00.000Z",
    "endTime": "2024-01-15T10:30:00.500Z"
  }
}
```

## 🔧 **Configuration Options**

### **Environment Variables**

```bash
# Enable OpenAI tracking (default: true)
SPINAL_INCLUDE_OPENAI=true

# Exclude specific hosts
SPINAL_EXCLUDED_HOSTS=api.openai.com,api.anthropic.com

# Custom endpoint
SPINAL_TRACING_ENDPOINT=https://cloud.withspinal.com
```

### **Custom Tagging**

```typescript
import { tag } from 'spinal-obs-node'

// Add context to your API calls
const t = tag({
  aggregationId: 'user-chat-flow',
  tenant: 'acme-corp',
  service: 'chatbot',
  userId: 'user-123',
  sessionId: 'session-456'
})

// Your OpenAI API call here
const response = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }]
})

// Clean up
t.dispose()
```

## 📈 **Usage Analytics**

### **What You Can Track**

- **API Call Frequency**: How often you're calling OpenAI
- **Model Usage**: Which models are most/least used
- **Cost Patterns**: Spending trends over time
- **Performance**: Response times and latency
- **Error Rates**: Failed requests and retries
- **User Patterns**: Usage by user, tenant, or service

### **Cost Optimization Insights**

- **Expensive Models**: Identify high-cost model usage
- **Inefficient Patterns**: Find opportunities to reduce calls
- **Rate Limiting**: Monitor and optimize for rate limits
- **Error Costs**: Track costs from failed requests

## 🚀 **Integration Examples**

### **Next.js Application**

```typescript
// lib/spinal.ts
import { configure, instrumentHTTP, instrumentOpenAI, tag } from 'spinal-obs-node'

export function initializeSpinal() {
  configure()
  instrumentHTTP()
  instrumentOpenAI()
}

export function trackChatCompletion(userId: string, sessionId: string) {
  return tag({
    aggregationId: 'chat-completion',
    tenant: 'my-app',
    userId,
    sessionId
  })
}
```

### **Express Backend**

```typescript
import { configure, instrumentHTTP, instrumentOpenAI, tag } from 'spinal-obs-node'

// Initialize
configure()
instrumentHTTP()
instrumentOpenAI()

app.post('/api/chat', async (req, res) => {
  const t = tag({
    aggregationId: 'api-chat',
    tenant: req.user.tenant,
    userId: req.user.id
  })

  try {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages: req.body.messages
    })
    res.json(response)
  } finally {
    t.dispose()
  }
})
```

## 🔮 **Roadmap**

### **Phase 1: Enhanced Token Tracking**

- Direct integration with OpenAI SDK
- Exact input/output token counts
- Model-specific usage analytics

### **Phase 2: Advanced Analytics**

- Usage pattern analysis
- Cost optimization recommendations
- Performance benchmarking

### **Phase 3: Multi-Provider Support**

- Anthropic Claude tracking
- Google Gemini tracking
- Azure OpenAI tracking
- Custom model support

## 📚 **Related Documentation**

- [README.md](../README.md) - Main SDK documentation
- [LOCAL_MODE.md](./LOCAL_MODE.md) - Local mode storage and data management
- [QUICKSTART.md](./QUICKSTART.md) - Getting started guide

## 🆘 **Support**

For questions about tracking capabilities:

- Check the [README.md](../README.md) for quickstart
- Review [DEPLOYMENT.md](./DEPLOYMENT.md) for setup
- Contact: founders@withspinal.com
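## 🧮 **Appendix: Worked Cost Example**

The pricing-catalog lookup and per-1K cost math from the Cost Tracking section can be exercised end to end. The sketch below restates `catalog` and `estimateCost` from this document as a self-contained example; the `PricingModel` interface and the 4-decimal `roundUSD` helper are assumptions for illustration, not necessarily the SDK's exact definitions:

```typescript
// Restatement of the pricing catalog from the cost-tracking section.
interface PricingModel {
  model: string
  inputPer1K: number   // USD per 1,000 input tokens
  outputPer1K: number  // USD per 1,000 output tokens
}

const catalog: PricingModel[] = [
  { model: 'openai:gpt-4o-mini', inputPer1K: 0.15, outputPer1K: 0.60 },
  { model: 'openai:gpt-4o', inputPer1K: 2.50, outputPer1K: 10.00 },
]

// Assumed rounding helper: round to 4 decimal places (hypothetical).
function roundUSD(value: number): number {
  return Math.round(value * 10_000) / 10_000
}

function estimateCost(params: {
  model?: string
  inputTokens?: number
  outputTokens?: number
}): number {
  const { model = 'openai:gpt-4o-mini', inputTokens = 0, outputTokens = 0 } = params
  // Unknown models fall back to the first catalog entry (gpt-4o-mini).
  const entry = catalog.find((c) => c.model === model) ?? catalog[0]
  const inputCost = (inputTokens / 1000) * entry.inputPer1K
  const outputCost = (outputTokens / 1000) * entry.outputPer1K
  return roundUSD(inputCost + outputCost)
}

// 10,000 input + 2,000 output tokens on gpt-4o-mini:
// (10 * 0.15) + (2 * 0.60) = 1.5 + 1.2 = 2.7 USD
console.log(estimateCost({ model: 'openai:gpt-4o-mini', inputTokens: 10_000, outputTokens: 2_000 }))
```

Note that the `?? catalog[0]` fallback prices any unrecognized model as `gpt-4o-mini`, so estimates for models not in the catalog should be treated as rough lower bounds rather than actual spend.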