llm-json-fix

# LLM JSON Fix

[![npm version](https://img.shields.io/npm/v/llm-json-fix.svg)](https://www.npmjs.com/package/llm-json-fix)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![npm downloads](https://img.shields.io/npm/dm/llm-json-fix.svg)](https://www.npmjs.com/package/llm-json-fix)

A comprehensive library for repairing malformed JSON outputs from Large Language Models (LLMs).

## Why This Library?

JSON outputs from LLMs are powerful but notoriously inconsistent. Even a 1% failure rate in JSON formatting can cause system failures that are difficult to debug. This library automatically identifies and repairs common issues in LLM-generated JSON, making your AI integrations more robust and reliable.

## Features

- **LLM-Specific Repairs**: Handles unique issues in AI-generated content
- **Markdown Cleanup**: Removes code blocks, explanatory text, and other non-JSON content
- **Streaming Support**: Processes arbitrarily large documents with minimal memory usage
- **Schema Flexibility**: Works with any JSON structure
- **Model-Specific Optimizations**: Can be configured for OpenAI, Anthropic, or other LLMs

## Installation

```bash
# Using npm
npm install llm-json-fix

# Using yarn
yarn add llm-json-fix

# Using pnpm
pnpm add llm-json-fix
```

### Requirements

- Node.js 14.0.0 or higher
- Works in both CommonJS and ESM environments

## Basic Usage

```javascript
import { fixLLMJson } from 'llm-json-fix';

// Fix malformed JSON from an LLM
const response = `Here's the JSON you requested:
\`\`\`json
{
  name: "John",
  items: ['apple', 'banana', ...],
  active: True
}
\`\`\``;

// Repair the JSON
const fixedJson = fixLLMJson(response);

// Use the fixed JSON
const data = JSON.parse(fixedJson);
console.log(data);
```

## Issues Fixed

### Incomplete JSON Structures

- Truncated outputs where closing brackets are missing
- Unfinished arrays or objects due to token limits
- Partial final elements

### Quote Inconsistencies

- Mixing of single and double quotes
- Unclosed quotes
- Incorrectly escaped quotes within strings

### Schema Violations

- Property names without quotes
- Extra or missing commas
- Trailing commas (valid in JavaScript but invalid in JSON)

### Markdown Artifacts

- Code block markers (`` ``` ``) included in the JSON
- Explanation text mixed with JSON output
- Markdown formatting within JSON strings

### LLM Hallucinations

- Explanatory comments included in the JSON
- "..." or "[more items]" placeholders
- Natural language interruptions mid-JSON

### Nested JSON Formatting Issues

- Inconsistent indentation
- Improperly escaped nested JSON strings
- Confusion between string representations of objects and actual objects

## API Reference

### Regular API

```typescript
fixLLMJson(text: string, options?: FixLLMJsonOptions): string
```

#### Options

```typescript
interface FixLLMJsonOptions {
  // Whether to apply model-specific fixes (default: true)
  applyModelSpecificFixes?: boolean;

  // The specific LLM model being used, for optimized repairs
  // Supported values: 'openai', 'anthropic', 'general'
  model?: 'openai' | 'anthropic' | 'general';

  // Whether to preserve comments in the JSON (default: false)
  preserveComments?: boolean;

  // Whether to be verbose about changes being made
  verbose?: boolean;
}
```

### Streaming API

For processing large files or streams:

```typescript
import { createLLMJsonFixStream } from 'llm-json-fix/stream';
import { createReadStream, createWriteStream } from 'fs';
import { pipeline } from 'stream';

const inputStream = createReadStream('broken.json');
const outputStream = createWriteStream('fixed.json');

const fixStream = createLLMJsonFixStream({
  bufferSize: 64 * 1024, // 64KB
  model: 'openai'
});

pipeline(inputStream, fixStream, outputStream, (err) => {
  if (err) {
    console.error('Error:', err);
  } else {
    console.log('JSON successfully repaired!');
  }
});
```

## Command Line Interface

This package provides a command-line tool for repairing JSON files:

```bash
# Install globally
npm install -g llm-json-fix

# Repair a file
llm-json-fix broken.json > fixed.json

# Or with options
llm-json-fix broken.json --output fixed.json --model openai --verbose
```

### CLI Options

```
--version, -v         Show application version
--help, -h            Display help for command
--output, -o          Output file
--overwrite           Overwrite the input file
--buffer              Buffer size in bytes, for example 64K (default) or 1M
--model               Specify the LLM model (openai, anthropic, general)
--verbose             Show detailed repair information
--preserve-comments   Preserve comments in the output
```

## Examples

See the [examples](./examples) directory for more usage examples:

- [Basic Usage](./examples/basic-usage.js)
- [Streaming API](./examples/streaming-api.js)
- [OpenAI Integration](./examples/openai-integration.js)

## Common Patterns & Integration Tips

### With OpenAI

```javascript
import OpenAI from 'openai';
import { fixLLMJson } from 'llm-json-fix';

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();

// `prompt` is your own user prompt string
try {
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: "Respond with valid JSON only." },
      { role: "user", content: prompt }
    ]
  });

  const content = response.choices[0].message.content;
  const fixedJson = fixLLMJson(content, { model: 'openai' });
  const data = JSON.parse(fixedJson);

  // Use the data...
} catch (error) {
  console.error('Error:', error);
}
```

### With Anthropic Claude

```javascript
import Anthropic from '@anthropic-ai/sdk';
import { fixLLMJson } from 'llm-json-fix';

// The client reads ANTHROPIC_API_KEY from the environment by default
const anthropic = new Anthropic();

// `prompt` is your own user prompt string
try {
  const response = await anthropic.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 4000,
    messages: [
      { role: "user", content: "Return this data as JSON: " + prompt }
    ],
    system: "Return only valid JSON data with no additional text."
  });

  const content = response.content[0].text;
  const fixedJson = fixLLMJson(content, { model: 'anthropic' });
  const data = JSON.parse(fixedJson);

  // Use the data...
} catch (error) {
  console.error('Error:', error);
}
```

## License

[MIT License](LICENSE)

## Package Contents

The npm package includes:

- CommonJS build for Node.js environments
- ESM build for modern JavaScript environments
- UMD build for browser usage
- TypeScript type definitions
- CLI executable
- Full documentation

For more information, see the [changelog](CHANGELOG.md).
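
## Additional Pattern: Parse First, Repair on Failure

The integration examples in this README always run the repair step before `JSON.parse`. When most responses are already valid, one possible variation is to try strict parsing first and repair only when it fails. The sketch below is an illustration, not part of the library: `parseWithRepair` and `stripTrailingCommas` are hypothetical names, and the repair function is passed in as an argument so the snippet runs without the package installed. In real use you would pass `fixLLMJson` instead of the stand-in.

```javascript
// Hypothetical helper (not part of llm-json-fix): try strict parsing first,
// and fall back to a repair function only when JSON.parse throws.
function parseWithRepair(text, repair) {
  try {
    return { data: JSON.parse(text), repaired: false };
  } catch {
    // `repair` is assumed to behave like fixLLMJson: string in, string out.
    // If the repaired text is still invalid, JSON.parse throws to the caller.
    const fixedText = repair(text);
    return { data: JSON.parse(fixedText), repaired: true };
  }
}

// Stand-in repair function for demonstration only. It removes a comma that
// directly precedes a closing brace or bracket, and it is not string-aware.
const stripTrailingCommas = (s) => s.replace(/,\s*([}\]])/g, '$1');

const ok = parseWithRepair('{"a": 1}', stripTrailingCommas);
const fixed = parseWithRepair('{"a": [1, 2,],}', stripTrailingCommas);

console.log(ok.repaired, fixed.repaired, JSON.stringify(fixed.data));
```

The `repaired` flag makes it easy to log or count how often the fallback is needed, which can help when tuning prompts.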