# LLM JSON Fix
A comprehensive library for repairing malformed JSON outputs from Large Language Models (LLMs).
## Why This Library?
LLMs can produce structured JSON, but the formatting is notoriously inconsistent. Even a 1% failure rate in JSON formatting can cause failures that are difficult to debug. This library automatically identifies and repairs common issues in LLM-generated JSON, making your AI integrations more robust and reliable.
## Features
- **LLM-Specific Repairs**: Handles unique issues in AI-generated content
- **Markdown Cleanup**: Removes code blocks, explanatory text, and other non-JSON content
- **Streaming Support**: Process arbitrarily large documents with minimal memory usage
- **Schema Flexibility**: Works with any JSON structure
- **Model-Specific Optimizations**: Can be configured for OpenAI, Anthropic, or other LLMs
## Installation
```bash
# Using npm
npm install llm-json-fix
# Using yarn
yarn add llm-json-fix
# Using pnpm
pnpm add llm-json-fix
```
### Requirements
- Node.js 14.0.0 or higher
- Works in both CommonJS and ESM environments
## Basic Usage
```javascript
import { fixLLMJson } from 'llm-json-fix';
// Fix malformed JSON from an LLM
const response = `Here's the JSON you requested: \`\`\`json
{
  name: "John",
  items: ['apple', 'banana', ...],
  active: True
}
\`\`\``;
// Repair the JSON
const fixedJson = fixLLMJson(response);
// Use the fixed JSON
const data = JSON.parse(fixedJson);
console.log(data);
```
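When most responses are already valid, it can be cheaper to try `JSON.parse` first and repair only on failure. The following is a minimal sketch of that pattern, assuming a repair function shaped like `fixLLMJson` (string in, string out). `parseWithRepair` and `stubRepair` are hypothetical names used for illustration and are not part of this library.

```javascript
// Hypothetical helper: try strict parsing first, then fall back to a repair function.
// `repair` is any function (string) => string, for example fixLLMJson.
function parseWithRepair(text, repair) {
  try {
    return { data: JSON.parse(text), repaired: false };
  } catch (firstError) {
    // Strict parsing failed, so repair the text and parse again.
    // If the repaired text is still invalid, this second parse throws to the caller.
    return { data: JSON.parse(repair(text)), repaired: true };
  }
}

// Stand-in repair function for demonstration only: it removes trailing commas
// and is not aware of string contents.
const stubRepair = (text) => text.replace(/,\s*([}\]])/g, '$1');

const result = parseWithRepair('{"name": "John",}', stubRepair);
console.log(result.data.name, result.repaired); // John true
```

In real use, pass `fixLLMJson` in place of `stubRepair`. The `repaired` flag makes it easy to log how often a model's output needed fixing.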
## Issues Fixed
### Incomplete JSON Structures
- Truncated outputs where closing brackets are missing
- Unfinished arrays or objects due to token limits
- Partial final elements
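
To make the truncation case concrete, here is an illustrative sketch of one way to close a cut-off document: track open brackets on a stack while skipping string contents, then append whatever closers are still owed. This is not the library's implementation, and `closeTruncatedJson` is a hypothetical name. It covers only the simpler cases (for example, it does not handle a document cut off right after a key's colon).

```javascript
// Illustrative sketch (not the library's implementation): append the closers
// that a truncated JSON document is missing.
function closeTruncatedJson(text) {
  const closers = [];    // stack of closing characters still owed
  let inString = false;  // true while scanning inside a "..." string
  let escaped = false;   // true right after a backslash inside a string

  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') closers.push('}');
    else if (ch === '[') closers.push(']');
    else if (ch === '}' || ch === ']') closers.pop();
  }

  let result = text;
  if (inString) result += '"';                 // terminate an unfinished string
  result = result.replace(/,\s*$/, '');        // drop a dangling trailing comma
  return result + closers.reverse().join('');  // close innermost structures first
}

console.log(closeTruncatedJson('{"items": ["apple", "ban'));
// {"items": ["apple", "ban"]}
```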
### Quote Inconsistencies
- Mixing of single and double quotes
- Unclosed quotes
- Incorrectly escaped quotes within strings
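
As an illustration of the quote problem, the sketch below rewrites single-quoted strings as double-quoted JSON strings with a small character scanner. It is a simplified example under the assumption that the input's strings are properly terminated; `normalizeQuotes` is a hypothetical name, not a function exported by this library.

```javascript
// Illustrative sketch: rewrite single-quoted strings as double-quoted JSON strings.
function normalizeQuotes(text) {
  let out = '';
  let quote = null; // the quote character that opened the current string, or null

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (quote === null) {
      if (ch === '"' || ch === "'") {
        quote = ch;   // remember which quote opened the string
        out += '"';   // always emit a double quote
      } else {
        out += ch;
      }
    } else if (ch === '\\') {
      const next = text[i + 1];
      i++; // consume the escaped character as well
      if (next === "'") out += "'";      // \' is not a valid JSON escape, so unescape it
      else out += '\\' + (next ?? '');   // keep every other escape as written
    } else if (ch === quote) {
      quote = null;
      out += '"';
    } else if (ch === '"') {
      out += '\\"'; // a bare double quote inside a single-quoted string needs escaping
    } else {
      out += ch;
    }
  }
  return out;
}

const quoted = "{'name': 'say \"hi\"', 'note': 'it\\'s'}";
console.log(normalizeQuotes(quoted));
// {"name": "say \"hi\"", "note": "it's"}
```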
### Schema Violations
- Property names without quotes
- Extra or missing commas
- Trailing commas (valid in JavaScript but invalid in JSON)
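
The sketch below shows a regex-based approach to two of these problems: quoting bare property names and removing trailing commas. It is deliberately naive, because the patterns do not know whether they are inside a string value, so treat it as an illustration of the idea rather than how this library works. `quoteKeysAndTrimCommas` is a hypothetical name.

```javascript
// Illustrative sketch: quote bare property names and remove trailing commas.
// Caveat: the patterns are not string-aware, so a string value that itself
// contains text like ", key:" or ",]" would be altered.
function quoteKeysAndTrimCommas(text) {
  return text
    // {name: or , name:  becomes  {"name": or , "name":
    .replace(/([{,]\s*)([A-Za-z_$][\w$]*)(\s*:)/g, '$1"$2"$3')
    // ,] or ,}  (with optional whitespace) loses the comma
    .replace(/,(\s*[}\]])/g, '$1');
}

console.log(quoteKeysAndTrimCommas('{name: "John", tags: ["a", "b",], }'));
// {"name": "John", "tags": ["a", "b"] }
```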
### Markdown Artifacts
- Code block markers (```) included in the JSON
- Explanation text mixed with JSON output
- Markdown formatting within JSON strings
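
One way to isolate the JSON from surrounding Markdown is to prefer the contents of the first fenced code block and otherwise fall back to the outermost brackets. The sketch below illustrates that idea; it is an assumption about a reasonable strategy, not a description of this library's internals, and `extractJsonCandidate` is a hypothetical name.

```javascript
// Build the fence marker programmatically so this example can itself live
// inside a Markdown code block.
const FENCE = '`'.repeat(3);
const FENCE_PATTERN = new RegExp(FENCE + '[a-zA-Z]*\\s*([\\s\\S]*?)' + FENCE);

// Illustrative sketch: pull the most likely JSON payload out of a chat-style reply.
function extractJsonCandidate(text) {
  // Prefer the contents of the first fenced block, with or without a language tag.
  const fenced = text.match(FENCE_PATTERN);
  if (fenced) return fenced[1].trim();

  // Otherwise take everything from the first { or [ to the last } or ].
  const start = text.search(/[{\[]/);
  const end = Math.max(text.lastIndexOf('}'), text.lastIndexOf(']'));
  return start !== -1 && end > start ? text.slice(start, end + 1) : text.trim();
}

const reply = 'Here is the JSON:\n' + FENCE + 'json\n{"a": 1}\n' + FENCE + '\nHope that helps!';
console.log(extractJsonCandidate(reply)); // {"a": 1}
```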
### LLM Hallucinations
- Explanatory comments included in the JSON
- "..." or "[more items]" placeholders
- Natural language interruptions mid-JSON
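
Comments and `...` placeholders can be removed with a scanner that skips over string contents, so that a URL or an ellipsis inside a string value is left alone. The following sketch illustrates that approach under simplifying assumptions. It is not the library's implementation, `stripCommentsAndPlaceholders` is a hypothetical name, and it does not handle bracketed placeholders such as "[more items]".

```javascript
// Illustrative sketch: drop // and /* */ comments and bare ... placeholders
// that appear outside of string literals.
function stripCommentsAndPlaceholders(text) {
  let out = '';
  let inString = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      out += ch;
      if (ch === '\\') {
        out += text[i + 1] ?? ''; // copy the escaped character untouched
        i++;
      } else if (ch === '"') {
        inString = false;
      }
    } else if (ch === '"') {
      inString = true;
      out += ch;
    } else if (text.startsWith('//', i)) {
      const eol = text.indexOf('\n', i);
      i = eol === -1 ? text.length : eol - 1; // resume at the newline, which is kept
    } else if (text.startsWith('/*', i)) {
      const close = text.indexOf('*/', i + 2);
      i = close === -1 ? text.length : close + 1; // resume after the closing */
    } else if (text.startsWith('...', i)) {
      i += 2; // skip all three dots
    } else {
      out += ch;
    }
  }
  // Removing a placeholder can leave a comma dangling before a closer.
  // Caveat: this final cleanup is a plain regex and is not string-aware.
  return out.replace(/,(\s*[}\]])/g, '$1');
}

const noisy = `{
  "items": ["apple", "banana", ...], // list of fruit
  "active": true /* always on */
}`;
console.log(JSON.stringify(JSON.parse(stripCommentsAndPlaceholders(noisy))));
// {"items":["apple","banana"],"active":true}
```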
### Nested JSON Formatting Issues
- Inconsistent indentation
- Improperly escaped nested JSON strings
- Confusion between string representations of objects and actual objects
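
When a model returns an object as a string (for example `"{\"name\": \"John\"}"` where an object was expected), one possible post-processing step is to walk the parsed result and expand any string that itself parses as a JSON object or array. The sketch below assumes that behaviour is wanted for every field, which depends on your schema. `expandNestedJson` is a hypothetical name and not part of this library.

```javascript
// Illustrative sketch: recursively replace strings that contain a JSON object
// or array with the parsed value. Strings that fail to parse are left as-is.
function expandNestedJson(value) {
  if (typeof value === 'string') {
    const trimmed = value.trim();
    if (/^[{\[]/.test(trimmed)) {
      try {
        return expandNestedJson(JSON.parse(trimmed));
      } catch {
        return value; // looked like JSON but was not valid, keep the original string
      }
    }
    return value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => expandNestedJson(item));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, expandNestedJson(inner)])
    );
  }
  return value; // numbers, booleans, and null pass through unchanged
}

const parsedOnce = JSON.parse('{"user": "{\\"name\\": \\"John\\"}", "note": "[not json"}');
const expanded = expandNestedJson(parsedOnce);
console.log(expanded.user.name); // John
```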
## API Reference
### Regular API
```typescript
fixLLMJson(text: string, options?: FixLLMJsonOptions): string
```
#### Options
```typescript
interface FixLLMJsonOptions {
  // Whether to apply model-specific fixes (default: true)
  applyModelSpecificFixes?: boolean;

  // The specific LLM model being used, for optimized repairs
  // Supported values: 'openai', 'anthropic', 'general'
  model?: 'openai' | 'anthropic' | 'general';

  // Whether to preserve comments in the JSON (default: false)
  preserveComments?: boolean;

  // Whether to be verbose about changes being made
  verbose?: boolean;
}
```
### Streaming API
For processing large files or streams:
```typescript
import { createLLMJsonFixStream } from 'llm-json-fix/stream';
import { createReadStream, createWriteStream } from 'fs';
import { pipeline } from 'stream';
const inputStream = createReadStream('broken.json');
const outputStream = createWriteStream('fixed.json');
const fixStream = createLLMJsonFixStream({
  bufferSize: 64 * 1024, // 64KB
  model: 'openai'
});

pipeline(inputStream, fixStream, outputStream, (err) => {
  if (err) {
    console.error('Error:', err);
  } else {
    console.log('JSON successfully repaired!');
  }
});
```
## Command Line Interface
This package provides a command-line tool for repairing JSON files:
```bash
# Install globally
npm install -g llm-json-fix
# Repair a file
llm-json-fix broken.json > fixed.json
# Or with options
llm-json-fix broken.json --output fixed.json --model openai --verbose
```
### CLI Options
```
--version, -v         Show application version
--help, -h            Display help for command
--output, -o          Output file
--overwrite           Overwrite the input file
--buffer              Buffer size in bytes, for example 64K (default) or 1M
--model               Specify the LLM model (openai, anthropic, general)
--verbose             Show detailed repair information
--preserve-comments   Preserve comments in the output
```
## Examples
See the [examples](./examples) directory for more usage examples:
- [Basic Usage](./examples/basic-usage.js)
- [Streaming API](./examples/streaming-api.js)
- [OpenAI Integration](./examples/openai-integration.js)
## Common Patterns & Integration Tips
### With OpenAI
```javascript
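import OpenAI from 'openai';
import { fixLLMJson } from 'llm-json-fix';

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

// `prompt` is assumed to be a string defined elsewhere in your application.
try {
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: "Respond with valid JSON only." },
      { role: "user", content: prompt }
    ]
  });

  // Repair the raw model output before parsing it
  const content = response.choices[0].message.content;
  const fixedJson = fixLLMJson(content, { model: 'openai' });
  const data = JSON.parse(fixedJson);
  // Use the data...
} catch (error) {
  // Covers both API errors and JSON that could not be repaired
  console.error('Error:', error);
}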
```
### With Anthropic Claude
```javascript
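import Anthropic from '@anthropic-ai/sdk';
import { fixLLMJson } from 'llm-json-fix';

// The client reads the API key from the ANTHROPIC_API_KEY environment variable.
const anthropic = new Anthropic();

// `prompt` is assumed to be a string defined elsewhere in your application.
try {
  const response = await anthropic.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 4000,
    messages: [
      { role: "user", content: "Return this data as JSON: " + prompt }
    ],
    system: "Return only valid JSON data with no additional text."
  });

  // Repair the raw model output before parsing it
  const content = response.content[0].text;
  const fixedJson = fixLLMJson(content, { model: 'anthropic' });
  const data = JSON.parse(fixedJson);
  // Use the data...
} catch (error) {
  // Covers both API errors and JSON that could not be repaired
  console.error('Error:', error);
}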
```
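
The two integrations above differ only in where the response text lives and which `model` value is passed to the repair step. If you support both providers, that difference can be isolated in one place. The sketch below is one possible arrangement: `extractText` and `parseModelResponse` are hypothetical helpers, and the repair function is passed in as a parameter (in real use, `fixLLMJson`).

```javascript
// Hypothetical helper: read the text payload from a provider response,
// based on the response shapes used in the two examples above.
function extractText(response, provider) {
  if (provider === 'openai') return response.choices[0].message.content;
  if (provider === 'anthropic') return response.content[0].text;
  throw new Error(`Unsupported provider: ${provider}`);
}

// Hypothetical helper: extract, repair, and parse in one step.
// `repair` is expected to have the shape (text, options) => string.
function parseModelResponse(response, provider, repair) {
  const text = extractText(response, provider);
  return JSON.parse(repair(text, { model: provider }));
}

// Demonstration with mock responses and a pass-through repair function.
const passThrough = (text) => text;
const mockOpenAI = { choices: [{ message: { content: '{"ok": true}' } }] };
const mockAnthropic = { content: [{ text: '{"ok": false}' }] };

console.log(parseModelResponse(mockOpenAI, 'openai', passThrough).ok);       // true
console.log(parseModelResponse(mockAnthropic, 'anthropic', passThrough).ok); // false
```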
## License
[MIT License](LICENSE)
## Package Contents
The npm package includes:
- CommonJS build for Node.js environments
- ESM build for modern JavaScript environments
- UMD build for browser usage
- TypeScript type definitions
- CLI executable
- Full documentation
For more information, see the [changelog](CHANGELOG.md).