dphelper
Version:
dphelper devtools for developers
346 lines (284 loc) • 9.38 kB
Markdown
# ai
AI and LLM utilities for text processing, token estimation, data formatting, and integration tools.
## Functions
| Function | Description | Example |
|----------|-------------|---------|
| `tokenCount` | Estimates token count for LLM input | `dphelper.ai.tokenCount({ users: [1,2,3] })` |
| `smartSanitize` | Removes PII (emails, phones, etc.) from text | `dphelper.ai.smartSanitize(text)` |
| `toon` | Converts JSON to TOON format | `dphelper.ai.toon({ users: [{id: 1, name: 'Ada'}] })` |
| `toonToJson` | Converts TOON format back to JSON | `dphelper.ai.toonToJson('users[1]{id,name}:\n 1,Ada')` |
| `chunker` | Splits long text into chunks for RAG | `dphelper.ai.chunker(text, { size: 1000, overlap: 200 })` |
| `similarity` | Calculates cosine similarity between vectors | `dphelper.ai.similarity(vecA, vecB)` |
| `extractReasoning` | Extracts AI reasoning tags from response | `dphelper.ai.extractReasoning(aiResponse)` |
| `prompt` | Template engine for prompt variable injection | `dphelper.ai.prompt('Hello {{name}}', { name: 'Ada' })` |
| `schema` | Generates TOON-style schema definition | `dphelper.ai.schema({ id: 1, name: 'Ada' })` |
| `snapshot` | Captures app state snapshot for AI debugging | `dphelper.ai.snapshot()` |
## Description
Comprehensive AI/LLM integration utilities:
- **Token Estimation** - Estimate token counts for API limits
- **Data Format Conversion** - TOON (Token-Oriented Object Notation) format
- **Text Processing** - PII removal, text chunking for RAG
- **Vector Operations** - Cosine similarity for embeddings
- **Prompt Engineering** - Template variables, schema generation
- **AI Debugging** - Extract reasoning, capture app snapshots
## Usage Examples
### Token Count Estimation
```javascript
// Estimate tokens for a string
const text = "Hello, how are you today?";
console.log(dphelper.ai.tokenCount(text)); // ~8 tokens
// Estimate tokens for an object (uses TOON format)
const data = {
users: [
{ id: 1, name: "John", email: "john@example.com" },
{ id: 2, name: "Jane", email: "jane@example.com" }
]
};
console.log(dphelper.ai.tokenCount(data)); // ~45 tokens
// Use for API rate limiting
const prompt = "Analyze the following data: " + JSON.stringify(data);
const tokens = dphelper.ai.tokenCount(prompt);
if (tokens > 4000) {
console.log("Warning: Approaching token limit");
}
```
### PII Sanitization
```javascript
// Remove personal information from text
const text = `
My name is John Smith and my email is john.smith@example.com.
You can call me at (555) 123-4567.
My credit card is 1234-5678-9012-3456.
SSN: 123-45-6789
`;
const sanitized = dphelper.ai.smartSanitize(text);
console.log(sanitized);
// My name is John Smith and my email is [EMAIL].
// You can call me at [PHONE].
// My credit card is [CREDIT_CARD].
// SSN: [SSN]
// Use before sending data to external AI APIs
async function sendToAI(data) {
const safeData = dphelper.ai.smartSanitize(JSON.stringify(data));
return await fetch('/api/ai/analyze', {
method: 'POST',
body: JSON.stringify({ text: safeData })
});
}
```
### TOON Format Conversion
```javascript
// Convert JSON to TOON (compact, token-efficient format)
const data = {
users: [
{ id: 1, name: "Ada", age: 30 },
{ id: 2, name: "Bob", age: 25 }
]
};
const toon = dphelper.ai.toon(data);
console.log(toon);
/*
users[2]{id,name,age}:
1,Ada,30
2,Bob,25
*/
// Convert back from TOON to JSON
const json = dphelper.ai.toonToJson(toon);
console.log(json); // Original data restored
// Simple array
const arr = [1, 2, 3, 4, 5];
console.log(dphelper.ai.toon(arr));
// [5]: 1, 2, 3, 4, 5
```
### Text Chunking for RAG
```javascript
const longText = sit amet...`. `Lorem ipsum dolorrepeat(100);
// Split into chunks embedding for vector
const chunks = dphelper.ai.chunker(longText, { size: 1000, overlap: 200 });
console.log(chunks.length); // Number of chunks
console.log(chunks[0].length); // ~1000 characters
// Process each chunk for embedding
async function embedText(text) {
const chunks = dphelper.ai.chunker(text, { size: 500, overlap: 50 });
const embeddings = [];
for (const chunk of chunks) {
const embedding = await getEmbedding(chunk);
embeddings.push({ chunk, embedding });
}
return embeddings;
}
```
### Cosine Similarity
```javascript
// Calculate similarity between two embedding vectors
const vectorA = [1.0, 0.5, 0.3, 0.8];
const vectorB = [0.9, 0.6, 0.2, 0.7];
const similarity = dphelper.ai.similarity(vectorA, vectorB);
console.log(similarity); // ~0.98 (very similar)
// Find most similar document
const query = [1.0, 0.5, 0.3];
const documents = [
[0.9, 0.4, 0.2], // Similar
[0.1, 0.1, 0.1], // Different
[0.8, 0.6, 0.4] // Similar
];
let mostSimilar = 0;
let bestDoc = 0;
documents.forEach((doc, i) => {
const sim = dphelper.ai.similarity(query, doc);
if (sim > mostSimilar) {
mostSimilar = sim;
bestDoc = i;
}
});
console.log(`Most similar: document ${bestDoc} (${mostSimilar})`);
```
### Extract AI Reasoning
```javascript
// Extract reasoning from AI response (e.g., Claude, OpenAI o1)
const aiResponse = `
<think>
The user is asking about the weather. Let me think about what data I have available.
I can provide current conditions based on the user's location.
</think>
The current weather in your area is sunny with a temperature of 72°F.
`;
const extracted = dphelper.ai.extractReasoning(aiResponse);
console.log(extracted.reasoning);
// "The user is asking about the weather. Let me think about what data I have available..."
console.log(extracted.content);
// "The current weather in your area is sunny with a temperature of 72°F."
```
### Prompt Template Engine
```javascript
// Simple variable substitution
const template = "Hello {{name}}, welcome to {{place}}!";
const result = dphelper.ai.prompt(template, { name: "Ada", place: "our app" });
console.log(result); // "Hello Ada, welcome to our app!"
// With object variables (converts to TOON)
const template2 = `Analyze this user data:
{{user}}`;
const userData = {
id: 123,
name: "John",
preferences: { theme: "dark", lang: "en" }
};
console.log(dphelper.ai.prompt(template2, { user: userData }));
/*
Analyze this user data:
id: 123
name: John
preferences:
theme: dark
lang: en
*/
// Default values for missing variables
const t3 = "Welcome, {{name}}! Your role is {{role:guest}}.";
console.log(dphelper.ai.prompt(t3, { name: "Alice" }));
// "Welcome, Alice! Your role is guest."
```
### Schema Generation
```javascript
// Generate TOON-style schema from data
const data = {
id: 1,
name: "John",
email: "john@example.com",
age: 30,
active: true
};
const schema = dphelper.ai.schema(data);
console.log(schema);
/*
id: number
name: string
email: string
age: number
active: boolean
*/
// Array schema
const users = [
{ id: 1, name: "Ada" },
{ id: 2, name: "Bob" }
];
console.log(dphelper.ai.schema(users));
// users: Array<{id: number, name: string}>
```
### App State Snapshot
```javascript
// Capture current app state for AI debugging
const snapshot = dphelper.ai.snapshot();
console.log(snapshot);
/*
context:
env: browser
time: 2026-03-02T09:44:55.000Z
url: https://app.example.com/dashboard
title: My Dashboard
state: ...
logs: ...
system:
lang: en-US
screen: 1920x1080
zoom: 100
*/
// Useful for AI-assisted debugging
async function askAIForHelp(error) {
const snapshot = dphelper.ai.snapshot();
const prompt = `I'm seeing this error: ${error}\n\nCurrent state:\n${snapshot}`;
return await fetch('/api/ai/debug', {
method: 'POST',
body: JSON.stringify({ prompt, snapshot })
});
}
```
## Advanced Usage
### Complete RAG Pipeline
```javascript
class RAGPipeline {
constructor(chunkSize = 500, overlap = 50) {
this.chunkSize = chunkSize;
this.overlap = overlap;
}
async indexDocument(text, metadata = {}) {
// Chunk the text
const chunks = dphelper.ai.chunker(text, {
size: this.chunkSize,
overlap: this.overlap
});
// Generate embeddings for each chunk
const documents = [];
for (let i = 0; i < chunks.length; i++) {
const embedding = await this.getEmbedding(chunks[i]);
documents.push({
id: `doc_${i}`,
text: chunks[i],
embedding,
metadata,
tokenCount: dphelper.ai.tokenCount(chunks[i])
});
}
return documents;
}
async query(queryText, topK = 5) {
const queryEmbedding = await this.getEmbedding(queryText);
// Find most similar documents
const results = this.documents
.map(doc => ({
...doc,
similarity: dphelper.ai.similarity(queryEmbedding, doc.embedding)
}))
.sort((a, b) => b.similarity - a.similarity)
.slice(0, topK);
return results;
}
}
```
## Details
- **Author:** Dario Passariello
- **Version:** 0.0.3
- **Creation Date:** 20260220
- **Last Modified:** 20260221
- **Environment:** Works in both client and server environments
---
*Automatically generated document*