dphelper

# ai

AI and LLM utilities for text processing, token estimation, data formatting, and integration tools.

## Functions

| Function | Description | Example |
|----------|-------------|---------|
| `tokenCount` | Estimates token count for LLM input | `dphelper.ai.tokenCount({ users: [1,2,3] })` |
| `smartSanitize` | Removes PII (emails, phones, etc.) from text | `dphelper.ai.smartSanitize(text)` |
| `toon` | Converts JSON to TOON format | `dphelper.ai.toon({ users: [{id: 1, name: 'Ada'}] })` |
| `toonToJson` | Converts TOON format back to JSON | `dphelper.ai.toonToJson('users[1]{id,name}:\n 1,Ada')` |
| `chunker` | Splits long text into chunks for RAG | `dphelper.ai.chunker(text, { size: 1000, overlap: 200 })` |
| `similarity` | Calculates cosine similarity between vectors | `dphelper.ai.similarity(vecA, vecB)` |
| `extractReasoning` | Extracts AI reasoning tags from response | `dphelper.ai.extractReasoning(aiResponse)` |
| `prompt` | Template engine for prompt variable injection | `dphelper.ai.prompt('Hello {{name}}', { name: 'Ada' })` |
| `schema` | Generates TOON-style schema definition | `dphelper.ai.schema({ id: 1, name: 'Ada' })` |
| `snapshot` | Captures app state snapshot for AI debugging | `dphelper.ai.snapshot()` |

## Description

Comprehensive AI/LLM integration utilities:

- **Token Estimation** - Estimate token counts for API limits
- **Data Format Conversion** - TOON (Token-Oriented Object Notation) format
- **Text Processing** - PII removal, text chunking for RAG
- **Vector Operations** - Cosine similarity for embeddings
- **Prompt Engineering** - Template variables, schema generation
- **AI Debugging** - Extract reasoning, capture app snapshots

## Usage Examples

### Token Count Estimation

```javascript
// Estimate tokens for a string
const text = "Hello, how are you today?";
console.log(dphelper.ai.tokenCount(text)); // ~8 tokens

// Estimate tokens for an object (uses TOON format)
const data = {
  users: [
    { id: 1, name: "John", email: "john@example.com" },
    { id: 2, name: "Jane", email: "jane@example.com" }
  ]
};
console.log(dphelper.ai.tokenCount(data)); // ~45 tokens

// Check the estimate against a token limit before calling the API
const prompt = "Analyze the following data: " + JSON.stringify(data);
const tokens = dphelper.ai.tokenCount(prompt);
if (tokens > 4000) {
  console.log("Warning: Approaching token limit");
}
```

### PII Sanitization

```javascript
// Remove personal information from text
const text = `
My name is John Smith and my email is john.smith@example.com.
You can call me at (555) 123-4567.
My credit card is 1234-5678-9012-3456.
SSN: 123-45-6789
`;

const sanitized = dphelper.ai.smartSanitize(text);
console.log(sanitized);
// My name is John Smith and my email is [EMAIL].
// You can call me at [PHONE].
// My credit card is [CREDIT_CARD].
// SSN: [SSN]

// Use before sending data to external AI APIs
async function sendToAI(data) {
  const safeData = dphelper.ai.smartSanitize(JSON.stringify(data));
  return await fetch('/api/ai/analyze', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: safeData })
  });
}
```

### TOON Format Conversion

```javascript
// Convert JSON to TOON (compact, token-efficient format)
const data = {
  users: [
    { id: 1, name: "Ada", age: 30 },
    { id: 2, name: "Bob", age: 25 }
  ]
};

const toon = dphelper.ai.toon(data);
console.log(toon);
/*
users[2]{id,name,age}:
  1,Ada,30
  2,Bob,25
*/

// Convert back from TOON to JSON
const json = dphelper.ai.toonToJson(toon);
console.log(json); // Original data restored

// Simple array
const arr = [1, 2, 3, 4, 5];
console.log(dphelper.ai.toon(arr)); // [5]: 1, 2, 3, 4, 5
```

### Text Chunking for RAG

```javascript
const longText = `Lorem ipsum dolor sit amet...`.repeat(100);

// Split into chunks for vector embedding
const chunks = dphelper.ai.chunker(longText, { size: 1000, overlap: 200 });
console.log(chunks.length); // Number of chunks
console.log(chunks[0].length); // ~1000 characters

// Process each chunk for embedding
// (getEmbedding is a placeholder for your own embedding API call)
async function embedText(text) {
  const chunks = dphelper.ai.chunker(text, { size: 500, overlap: 50 });
  const embeddings = [];
  for (const chunk of chunks) {
    const embedding = await getEmbedding(chunk);
    embeddings.push({ chunk, embedding });
  }
  return embeddings;
}
```

### Cosine Similarity

```javascript
// Calculate similarity between two embedding vectors
const vectorA = [1.0, 0.5, 0.3, 0.8];
const vectorB = [0.9, 0.6, 0.2, 0.7];

const similarity = dphelper.ai.similarity(vectorA, vectorB);
console.log(similarity); // ~0.99 (very similar)

// Find most similar document
const query = [1.0, 0.5, 0.3];
const documents = [
  [0.9, 0.4, 0.2], // Similar
  [0.1, 0.1, 0.1], // Different
  [0.8, 0.6, 0.4]  // Similar
];

// Start below any possible score: cosine similarity ranges from -1 to 1
let mostSimilar = -Infinity;
let bestDoc = 0;
documents.forEach((doc, i) => {
  const sim = dphelper.ai.similarity(query, doc);
  if (sim > mostSimilar) {
    mostSimilar = sim;
    bestDoc = i;
  }
});
console.log(`Most similar: document ${bestDoc} (${mostSimilar})`);
```

### Extract AI Reasoning

```javascript
// Extract reasoning from AI response (e.g., Claude, OpenAI o1)
const aiResponse = `
<think>
The user is asking about the weather.
Let me think about what data I have available.
I can provide current conditions based on the user's location.
</think>

The current weather in your area is sunny with a temperature of 72°F.
`;

const extracted = dphelper.ai.extractReasoning(aiResponse);
console.log(extracted.reasoning);
// "The user is asking about the weather. Let me think about what data I have available..."
console.log(extracted.content);
// "The current weather in your area is sunny with a temperature of 72°F."
```

### Prompt Template Engine

```javascript
// Simple variable substitution
const template = "Hello {{name}}, welcome to {{place}}!";
const result = dphelper.ai.prompt(template, { name: "Ada", place: "our app" });
console.log(result); // "Hello Ada, welcome to our app!"

// With object variables (converts to TOON)
const template2 = `Analyze this user data:
{{user}}`;
const userData = {
  id: 123,
  name: "John",
  preferences: { theme: "dark", lang: "en" }
};
console.log(dphelper.ai.prompt(template2, { user: userData }));
/*
Analyze this user data:
id: 123
name: John
preferences:
  theme: dark
  lang: en
*/

// Default values for missing variables
const t3 = "Welcome, {{name}}! Your role is {{role:guest}}.";
console.log(dphelper.ai.prompt(t3, { name: "Alice" }));
// "Welcome, Alice! Your role is guest."
```

### Schema Generation

```javascript
// Generate TOON-style schema from data
const data = {
  id: 1,
  name: "John",
  email: "john@example.com",
  age: 30,
  active: true
};

const schema = dphelper.ai.schema(data);
console.log(schema);
/*
id: number
name: string
email: string
age: number
active: boolean
*/

// Array schema
const users = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Bob" }
];
console.log(dphelper.ai.schema(users));
// users: Array<{id: number, name: string}>
```

### App State Snapshot

```javascript
// Capture current app state for AI debugging
const snapshot = dphelper.ai.snapshot();
console.log(snapshot);
/*
context:
  env: browser
  time: 2026-03-02T09:44:55.000Z
  url: https://app.example.com/dashboard
  title: My Dashboard
state: ...
logs: ...
system:
  lang: en-US
  screen: 1920x1080
  zoom: 100
*/

// Useful for AI-assisted debugging
async function askAIForHelp(error) {
  const snapshot = dphelper.ai.snapshot();
  const prompt = `I'm seeing this error: ${error}\n\nCurrent state:\n${snapshot}`;
  return await fetch('/api/ai/debug', {
    method: 'POST',
    body: JSON.stringify({ prompt, snapshot })
  });
}
```

## Advanced Usage

### Complete RAG Pipeline

```javascript
class RAGPipeline {
  constructor(chunkSize = 500, overlap = 50) {
    this.chunkSize = chunkSize;
    this.overlap = overlap;
    this.documents = []; // In-memory index searched by query()
  }

  // Placeholder: replace with a call to your embedding provider
  async getEmbedding(text) {
    throw new Error("getEmbedding() is not implemented");
  }

  async indexDocument(text, metadata = {}) {
    // Chunk the text
    const chunks = dphelper.ai.chunker(text, {
      size: this.chunkSize,
      overlap: this.overlap
    });

    // Generate embeddings for each chunk
    const documents = [];
    for (let i = 0; i < chunks.length; i++) {
      const embedding = await this.getEmbedding(chunks[i]);
      documents.push({
        id: `doc_${this.documents.length + i}`, // Unique across indexed texts
        text: chunks[i],
        embedding,
        metadata,
        tokenCount: dphelper.ai.tokenCount(chunks[i])
      });
    }

    // Store the chunks so query() can search them
    this.documents.push(...documents);
    return documents;
  }

  async query(queryText, topK = 5) {
    const queryEmbedding = await this.getEmbedding(queryText);

    // Find most similar documents
    const results = this.documents
      .map(doc => ({
        ...doc,
        similarity: dphelper.ai.similarity(queryEmbedding, doc.embedding)
      }))
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, topK);

    return results;
  }
}
```

## Details

- **Author:** Dario Passariello
- **Version:** 0.0.3
- **Creation Date:** 20260220
- **Last Modified:** 20260221
- **Environment:** Works in both client and server environments

---

*Automatically generated document*
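
## Appendix: Illustrative Sketch of Chunking and Similarity

The chunking and cosine similarity examples above treat `chunker` and `similarity` as black boxes. The standalone sketch below shows one plausible way such helpers could compute their results. It is not dphelper's implementation: the names `chunkText` and `cosineSimilarity` are hypothetical, and fixed-size character windows and a zero result for zero-length vectors are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of dphelper): split text into fixed-size
// character windows, each sharing `overlap` characters with the previous one.
function chunkText(text, { size = 1000, overlap = 200 } = {}) {
  if (overlap >= size) {
    throw new RangeError("overlap must be smaller than size");
  }
  const step = size - overlap; // how far each window advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window reached the end
  }
  return chunks;
}

// Hypothetical helper (not part of dphelper): cosine similarity is
// dot(a, b) / (|a| * |b|), so it compares direction and ignores magnitude.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // assumed convention for zero vectors
  return dot / Math.sqrt(normA * normB);
}

console.log(chunkText("abcdefghij", { size: 4, overlap: 2 }));
// [ 'abcd', 'cdef', 'efgh', 'ghij' ]
console.log(cosineSimilarity([1, 2, 3], [2, 4, 6])); // 1
```

Because cosine similarity ignores magnitude, a vector such as `[0.1, 0.1, 0.1]` can still score fairly high against a query that points in a broadly similar direction. This is worth keeping in mind when reading similarity rankings.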