UNPKG

@dooor-ai/toolkit

Version:

Guards, Evals & Observability for AI applications - works seamlessly with LangChain/LangGraph

763 lines (599 loc) 20.9 kB
# DOOOR AI Toolkit <div align="center"> ``` ██████╗ ██████╗ ██████╗ ██████╗ ██████╗ ██╔══██╗██╔═══██╗██╔═══██╗██╔═══██╗██╔══██╗ ██║ ██║██║ ██║██║ ██║██║ ██║██████╔╝ ██║ ██║██║ ██║██║ ██║██║ ██║██╔══██╗ ██████╔╝╚██████╔╝╚██████╔╝╚██████╔╝██║ ██║ ╚═════╝ ╚═════╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝ ``` **Guards, Evals & Observability for AI Applications** [![npm version](https://img.shields.io/npm/v/@dooor-ai/toolkit.svg)](https://www.npmjs.com/package/@dooor-ai/toolkit) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) </div> --- An all-in-one framework for securing, evaluating, and monitoring AI applications. Works seamlessly with LangChain/LangGraph. ## Installation ```bash npm install @dooor-ai/toolkit ``` Requires `@langchain/core` (0.3.x or 1.x) as peer dependency. ## What's New in v0.1.46 - **🐛 Fixed RAG File Upload**: Buffer to base64 conversion now works correctly when uploading PDF/DOCX files via `RAGContext` - **✅ Improved Error Handling**: Better error messages for CortexDB connection issues - **📦 Updated Dependencies**: Compatible with latest LangChain versions [View full changelog →](https://github.com/dooor-ai/toolkit/releases) ## Quick Start ```typescript import { ChatGoogleGenerativeAI } from "@langchain/google-genai"; import { dooorChatGuard } from "@dooor-ai/toolkit"; import { PromptInjectionGuard, ToxicityGuard } from "@dooor-ai/toolkit/guards"; import { LatencyEval, CostEval } from "@dooor-ai/toolkit/evals"; // 1. Create your LangChain provider normally const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, temperature: 0, }); // 2. Instrument with DOOOR Toolkit const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_api_key@your-host:8000/my_evals", providerName: "gemini", guards: [ new PromptInjectionGuard({ threshold: 0.8 }), new ToxicityGuard({ threshold: 0.7 }) ], evals: [ new LatencyEval({ threshold: 3000 }), new CostEval({ budgetLimitUsd: 0.10 }) ], observability: true, }); // 3. Use normally - Guards + Evals work automatically const response = await llm.invoke("What is the capital of France?"); ``` ## Features ### Guards (Pre-execution) Protect your LLM from malicious inputs before they reach the model: - **PromptInjectionGuard** - Detects jailbreak attempts and prompt injection - **ToxicityGuard** - Blocks toxic/offensive content via AI moderation - **PIIGuard** - Detects and masks personal information ### Evals (Post-execution) Evaluate response quality automatically: - **LatencyEval** - Track response time and alert on slow responses - **CostEval** - Monitor costs and alert when budget limits are exceeded - **RelevanceEval** - Measure answer relevance (coming soon) - **HallucinationEval** - Detect hallucinations (coming soon) ### Observability Full visibility into your LLM calls: - Automatic tracing with unique trace IDs - Latency and token tracking - Cost estimation per request - Guards and evals results logging - CortexDB integration for persistent storage ## Provider Support Works with ANY LangChain provider: ```typescript import { ChatOpenAI } from "@langchain/openai"; import { ChatAnthropic } from "@langchain/anthropic"; import { ChatGoogleGenerativeAI } from "@langchain/google-genai"; import { dooorChatGuard } from "@dooor-ai/toolkit"; // OpenAI const openai = dooorChatGuard(new ChatOpenAI({...}), toolkitConfig); // Anthropic const claude = dooorChatGuard(new ChatAnthropic({...}), toolkitConfig); // Google const gemini = dooorChatGuard(new ChatGoogleGenerativeAI({...}), toolkitConfig); ``` ## LangGraph Integration ```typescript import { StateGraph } from "@langchain/langgraph"; import { dooorChatGuard } from "@dooor-ai/toolkit"; const llm = dooorChatGuard(baseProvider, toolkitConfig); const workflow = new StateGraph(...) .addNode("agent", async (state) => { const response = await llm.invoke(state.messages); return { messages: [response] }; }); // Guards + Evals work automatically via callbacks ``` ## Configuration ### Toolkit Config ```typescript interface ToolkitConfig { // CortexDB connection string (optional, for AI proxy and observability) apiKey?: string; // AI Provider name from CortexDB Studio (optional, for AI-based guards) providerName?: string; // Guards to apply (run before LLM) guards?: Guard[]; // Evals to apply (run after LLM) evals?: Eval[]; // Output guards (validate LLM output) outputGuards?: Guard[]; // Enable observability (default: true) observability?: boolean; // Eval execution mode: "async" | "sync" | "sample" evalMode?: string; // Sample rate for evals (0-1, default: 1.0) evalSampleRate?: number; // Guard failure mode: "throw" | "return_error" | "log_only" guardFailureMode?: string; // Project name for tracing project?: string; } ``` ### Guard Configuration ```typescript new PromptInjectionGuard({ threshold: 0.8, blockOnDetection: true, enabled: true, }) new ToxicityGuard({ threshold: 0.7, providerName: "gemini", // Optional: override global providerName categories: ["hate", "violence", "sexual", "harassment"], }) new PIIGuard({ detectTypes: ["email", "phone", "cpf", "credit_card", "ssn"], action: "mask", // "mask" | "block" | "warn" }) ``` ### Eval Configuration ```typescript new LatencyEval({ threshold: 3000, // Alert if > 3s }) new CostEval({ budgetLimitUsd: 0.10, // Alert if > $0.10 alertOnExceed: true, }) ``` ## CortexDB Integration For the best experience, use with CortexDB: **Benefits:** - AI Provider Proxy (no API keys in your code for guards) - Centralized AI Provider management - Automatic trace storage in Postgres - Dashboard UI for observability - Self-hosted and open-source **Connection String Format:** ``` cortexdb://api_key@host:port/database ``` **Example:** ```typescript const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://cortexdb_adm123@35.223.201.25:8000/my_app", providerName: "gemini", // Configured in CortexDB Studio guards: [new ToxicityGuard()], // Uses Gemini via CortexDB proxy }); ``` ## RAG (Retrieval-Augmented Generation) The toolkit provides ephemeral RAG capabilities via CortexDB for ad-hoc document retrieval without creating permanent collections. ### Quick Start ```typescript import { RAGContext, RAGStrategy, buildRAGPrompt } from "@dooor-ai/toolkit/rag"; import { dooorChatGuard } from "@dooor-ai/toolkit"; // Create RAG context with documents const ragContext = new RAGContext({ documents: [ { content: "NestJS authentication guide: Use @nestjs/passport with JWT...", metadata: { source: "docs" } }, { content: "Database setup: Install @prisma/client and configure...", metadata: { source: "tutorial" } } ], embeddingProvider: "prod-gemini", // From CortexDB Studio strategy: RAGStrategy.SIMPLE, topK: 3, chunkSize: 500, chunkOverlap: 100, }); // Build prompt with RAG context const userQuery = "How to authenticate users in NestJS?"; const promptWithContext = await buildRAGPrompt(userQuery, ragContext); // Use with your LLM const llm = dooorChatGuard(baseProvider, {...}); const response = await llm.invoke(promptWithContext); ``` ### RAG Strategies #### 1. **SIMPLE** (Default - Fastest) Direct semantic search using cosine similarity. - ⚡ **Speed:** Fastest (1 embedding) - 💰 **Cost:** Lowest - 🎯 **Best for:** Direct questions, technical docs, FAQs ```typescript strategy: RAGStrategy.SIMPLE ``` #### 2. **HYDE** (Hypothetical Document Embeddings) Generates a hypothetical answer first, then searches for similar chunks. - ⚡ **Speed:** Medium (1 LLM call + 2 embeddings) - 💰 **Cost:** Medium - 🎯 **Best for:** Complex queries, conceptual questions **How it works:** 1. LLM generates hypothetical answer to your query 2. Embeds the hypothetical answer (not the query!) 3. Searches for chunks similar to the answer ```typescript strategy: RAGStrategy.HYDE ``` **Example:** ``` Query: "How to authenticate users?" ↓ LLM: "To authenticate users, use JWT tokens with passport..." ↓ Embed this answer → Search similar chunks ``` #### 3. **RERANK** (LLM-based Re-ranking) Retrieves more candidates, then uses LLM to rerank by relevance. - ⚡ **Speed:** Slower (1 embedding + 1 LLM rerank) - 💰 **Cost:** Medium - 🎯 **Best for:** Maximum precision, ambiguous queries **How it works:** 1. Semantic search returns top_k × 3 candidates 2. LLM analyzes all and ranks by relevance 3. Returns top_k most relevant ```typescript strategy: RAGStrategy.RERANK ``` #### 4. **FUSION** (SIMPLE + HYDE Combined) Runs SIMPLE and HYDE in parallel, combines results using Reciprocal Rank Fusion. - ⚡ **Speed:** Medium (parallel execution) - 💰 **Cost:** Highest (combines both strategies) - 🎯 **Best for:** Critical queries, maximum quality **How it works:** 1. Runs SIMPLE and HYDE simultaneously 2. Combines using RRF: `score = 1/(rank_simple + 60) + 1/(rank_hyde + 60)` 3. Returns top_k with highest fusion scores ```typescript strategy: RAGStrategy.FUSION ``` ### Strategy Comparison | Strategy | Speed | Cost | Precision | Best For | |----------|-------|------|-----------|----------| | **SIMPLE** | ⚡⚡⚡ | 💰 | ⭐⭐⭐ | Direct questions | | **HYDE** | ⚡⚡ | 💰💰 | ⭐⭐⭐⭐ | Complex queries | | **RERANK** | ⚡ | 💰💰 | ⭐⭐⭐⭐⭐ | Maximum precision | | **FUSION** | ⚡⚡ | 💰💰💰 | ⭐⭐⭐⭐⭐ | Critical queries | ### RAG with Files ```typescript const ragContext = new RAGContext({ files: [ { name: "manual.pdf", data: base64EncodedPDF, type: "application/pdf" } ], embeddingProvider: "prod-gemini", strategy: RAGStrategy.HYDE, topK: 5, }); ``` **Supported file types:** PDF, DOCX, MD, TXT ### RAG Observability All RAG calls are automatically logged to CortexDB: - Embedding tokens used - Chunks retrieved vs total - Strategy used - Timing breakdown (parse, embedding, search) - Similarity scores View in CortexDB Studio → Observability → RAG tab ## Real-World Examples ### NestJS Integration Perfect for building AI-powered APIs with guards, evals, and RAG: ```typescript import { Injectable, Logger } from '@nestjs/common'; import { ChatGoogleGenerativeAI } from "@langchain/google-genai"; import { createReactAgent } from '@langchain/langgraph/prebuilt'; import { dooorChatGuard, PromptInjectionGuard, ToxicityGuard, LatencyEval, AnswerRelevancyEval, RAGContext, RAGStrategy, buildRAGPrompt, } from "@dooor-ai/toolkit"; @Injectable() export class AIService { private readonly logger = new Logger(AIService.name); /** * Simple LLM with Guards + Evals */ async askQuestion(question: string) { const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, temperature: 0, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", project: "my-api", guards: [ new PromptInjectionGuard({ threshold: 0.8 }), new ToxicityGuard({ threshold: 0.7 }), ], evals: [ new LatencyEval({ threshold: 3000 }), new AnswerRelevancyEval({ threshold: 0.7 }), ], observability: true, }); const result = await llm.invoke([ { role: "user", content: question } ]); return result; } /** * LangGraph Agent with Tools */ async runAgent(userMessage: string) { const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, temperature: 0, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", project: "agent-api", guards: [new PromptInjectionGuard()], evals: [new LatencyEval({ threshold: 5000 })], observability: true, }); const agent = createReactAgent({ llm: llm, tools: [], // Your tools here prompt: `You are a helpful assistant.`, }); const result = await agent.invoke({ messages: [{ role: "user", content: userMessage }] }); return result; } /** * RAG with Documents (No files needed) */ async ragWithDocuments(query: string) { const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, temperature: 0, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", project: "rag-api", guards: [new PromptInjectionGuard()], evals: [new LatencyEval({ threshold: 5000 })], observability: true, }); // Create RAG context with plain text documents const ragContext = new RAGContext({ documents: [ { content: ` NestJS Authentication Guide: To authenticate users in NestJS: 1. Install @nestjs/passport and passport-jwt 2. Create an AuthModule with JwtStrategy 3. Use @UseGuards(JwtAuthGuard) on protected routes 4. Store JWT token in Authorization header as "Bearer <token>" Example: @Post('login') async login(@Body() loginDto: LoginDto) { const user = await this.authService.validateUser(loginDto); return this.authService.generateJwt(user); } `, metadata: { source: 'nestjs-auth-docs' } }, { content: ` Database Configuration in NestJS: Use Prisma for type-safe database access: 1. Install @prisma/client 2. Define schema in prisma/schema.prisma 3. Run npx prisma migrate dev 4. Inject PrismaService in your services `, metadata: { source: 'nestjs-database-docs' } }, ], embeddingProvider: "prod-gemini", strategy: RAGStrategy.SIMPLE, topK: 3, chunkSize: 500, chunkOverlap: 100, }); // Build prompt with RAG context const promptWithContext = await buildRAGPrompt(query, ragContext); this.logger.log(`🔍 RAG Query: ${query}`); this.logger.log(`📄 Processing ${ragContext.documents.length} documents`); const result = await llm.invoke([ { role: "user", content: promptWithContext } ]); this.logger.log('✅ RAG Response received!'); return result; } /** * RAG with PDF File */ async ragWithPdf(query: string, pdfPath: string) { const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, temperature: 0, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", project: "rag-pdf-api", guards: [], evals: [new LatencyEval({ threshold: 10000 })], observability: true, }); // Read PDF file const fs = require('fs').promises; const pdfBuffer = await fs.readFile(pdfPath); this.logger.log(`📄 PDF loaded: ${pdfPath} (${pdfBuffer.length} bytes)`); // Create RAG context with PDF const ragContext = new RAGContext({ files: [ { name: "manual.pdf", data: pdfBuffer, type: "application/pdf" } ], embeddingProvider: "prod-gemini", strategy: RAGStrategy.HYDE, // HyDE for complex queries topK: 5, chunkSize: 1000, chunkOverlap: 200, }); const promptWithContext = await buildRAGPrompt(query, ragContext); this.logger.log(`🔍 RAG Query: ${query}`); this.logger.log('📊 Strategy: HyDE'); const result = await llm.invoke([ { role: "user", content: promptWithContext } ]); this.logger.log('✅ RAG with PDF completed!'); return result; } /** * RAG with FUSION Strategy (Best Quality) */ async ragFusion(query: string) { const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, temperature: 0, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", project: "rag-fusion-api", guards: [], evals: [new LatencyEval({ threshold: 15000 })], observability: true, }); const ragContext = new RAGContext({ documents: [ { content: "Microservices architecture divides applications into small, independent services.", metadata: { source: "microservices-101" } }, { content: "Monolithic architecture keeps all application logic in a single codebase.", metadata: { source: "monolith-101" } }, { content: "Service mesh provides observability and traffic management for microservices.", metadata: { source: "service-mesh-guide" } } ], embeddingProvider: "prod-gemini", strategy: RAGStrategy.FUSION, // FUSION = SIMPLE + HYDE combined topK: 5, }); const promptWithContext = await buildRAGPrompt(query, ragContext); this.logger.log(`🔍 RAG Query: ${query}`); this.logger.log('📊 Strategy: FUSION (SIMPLE + HYDE combined)'); const result = await llm.invoke([ { role: "user", content: promptWithContext } ]); this.logger.log('✅ RAG with FUSION strategy completed!'); return result; } } ``` ### Express.js API ```typescript import express from 'express'; import { ChatGoogleGenerativeAI } from "@langchain/google-genai"; import { dooorChatGuard, PromptInjectionGuard, LatencyEval } from "@dooor-ai/toolkit"; const app = express(); app.use(express.json()); app.post('/api/chat', async (req, res) => { const { message } = req.body; const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", guards: [new PromptInjectionGuard()], evals: [new LatencyEval({ threshold: 3000 })], observability: true, }); try { const result = await llm.invoke([{ role: "user", content: message }]); res.json({ response: result.content }); } catch (error) { res.status(500).json({ error: error.message }); } }); app.listen(3000, () => console.log('Server running on port 3000')); ``` ### Standalone Script ```typescript import { ChatGoogleGenerativeAI } from "@langchain/google-genai"; import { dooorChatGuard, PromptInjectionGuard, LatencyEval } from "@dooor-ai/toolkit"; async function main() { const baseProvider = new ChatGoogleGenerativeAI({ model: "gemini-2.0-flash-exp", apiKey: process.env.GEMINI_API_KEY, }); const llm = dooorChatGuard(baseProvider, { apiKey: "cortexdb://your_key@host:8000/my_db", providerName: "gemini", guards: [new PromptInjectionGuard()], evals: [new LatencyEval({ threshold: 3000 })], observability: true, }); const result = await llm.invoke([ { role: "user", content: "What is the capital of France?" } ]); console.log(result.content); } main(); ``` ## Additional Examples See `examples/` directory in the repository: - `basic-usage.ts` - Complete example with all features - `multi-provider.ts` - Using different LangChain providers - `simple-usage.ts` - Minimal setup ## Development Status **Current:** MVP (Phase 1) - Complete **Completed Features:** - Core callback handler with lifecycle hooks - Provider-agnostic factory function - PromptInjectionGuard, ToxicityGuard, PIIGuard - LatencyEval, CostEval - CortexDB integration - Console and CortexDB observability backends **Roadmap (Phase 2):** - RelevanceEval (LLM-based quality scoring) - HallucinationEval (detect false information) - Dashboard UI in CortexDB Studio - Python SDK - Additional guards and evals ## License MIT ## Links - NPM: https://www.npmjs.com/package/@dooor-ai/toolkit - GitHub: https://github.com/dooor-ai/toolkit - Documentation: https://github.com/dooor-ai/toolkit/docs - Issues: https://github.com/dooor-ai/toolkit/issues