UNPKG

vector-chunk

Version:

🚀 Next-Gen Content Intelligence - The most powerful, lightweight, and intelligent vector search package for modern applications. Zero dependencies, AI-powered search, real-time processing, content analysis, tone detection, style matching, DNA fingerprint

460 lines (384 loc) 12 kB
# 🚀 Vector Search Pro - Complete Usage Guide ## 🎯 Overview Next-Gen Content Intelligence package with zero dependencies, featuring AI-powered search, content analysis, tone detection, style matching, DNA fingerprinting, auto-summarization, and adaptive optimization. Perfect for modern applications requiring intelligent content understanding. ## 📦 Installation ```bash npm install vector-chunk ``` ## 🔄 Complete Workflow ``` Frontend Upload → Content Analysis → Intelligent Chunking → Database Storage → Elasticsearch Indexing → Smart Search & Retrieval ``` ## 📡 Backend API Implementation ### 1. File Upload & Content Intelligence ```typescript import { VectorSearch } from 'vector-chunk'; import { Client } from '@elastic/elasticsearch'; app.post('/api/upload', upload.single('file'), async (req, res) => { try { const file = req.file; const fileContent = await fs.readFile(file.path, 'utf8'); // Initialize intelligent search engine const searchEngine = new VectorSearch({ similarityMetric: 'cosine', maxResults: 10, threshold: 0.0 }); // Analyze content for intelligence const contentAnalysis = await searchEngine.analyzeContent(fileContent); // Create intelligent chunks with metadata const chunks = searchEngine.createChunks(fileContent, { documentId: file.originalname, chunkType: 'text' }); // Store in database with intelligence data const fileRecord = await db.files.create({ filename: file.originalname, size: file.size, chunks: chunks.length, tone: contentAnalysis.tone.dominantTone, complexity: contentAnalysis.dna.complexity, qualityScore: contentAnalysis.qualityScore, keywords: contentAnalysis.keywords }); // Store intelligent chunks for (const chunk of chunks) { await db.chunks.create({ fileId: fileRecord.id, content: chunk.content, vector: JSON.stringify(chunk.vector), metadata: chunk.metadata, chunkIndex: chunk.chunkIndex, tone: contentAnalysis.tone.dominantTone, style: contentAnalysis.style.readabilityScore }); } // Index in Elasticsearch with intelligence const esClient = new Client({ node: 'http://localhost:9200' }); for (const chunk of chunks) { await esClient.index({ index: 'documents', body: { fileId: fileRecord.id, filename: file.originalname, content: chunk.content, vector: chunk.vector.values, metadata: chunk.metadata, chunkIndex: chunk.chunkIndex, tone: contentAnalysis.tone.dominantTone, complexity: contentAnalysis.dna.complexity, keywords: contentAnalysis.keywords } }); } res.json({ success: true, fileId: fileRecord.id, chunks: chunks.length, intelligence: { tone: contentAnalysis.tone.dominantTone, quality: contentAnalysis.qualityScore, complexity: contentAnalysis.dna.complexity, summary: contentAnalysis.summary } }); } catch (error) { res.status(500).json({ error: error.message }); } }); ``` ### 2. Intelligent Search & Retrieval ```typescript app.get('/api/search', async (req, res) => { try { const { query, filters } = req.query; // Initialize search engine const searchEngine = new VectorSearch(); // Search with content intelligence const searchResults = await searchEngine.searchContent( "Your document content here...", query ); // Get optimization recommendations const recommendations = searchEngine.getOptimizationRecommendations(); // Get performance analytics const analytics = searchEngine.getPerformanceAnalytics(); res.json({ results: searchResults, intelligence: { optimization: recommendations, performance: analytics } }); } catch (error) { res.status(500).json({ error: error.message }); } }); ``` ### 3. Multi-source Content Fusion ```typescript app.post('/api/fuse', async (req, res) => { try { const { sources } = req.body; const searchEngine = new VectorSearch(); // Fuse multiple content sources intelligently const fusion = await searchEngine.fuseContent(sources); // Get fusion insights and recommendations const insights = searchEngine.getFusionInsights(fusion); res.json({ summary: fusion.summary, coherence: fusion.coherence, conflicts: fusion.conflicts, gaps: fusion.gaps, insights: insights }); } catch (error) { res.status(500).json({ error: error.message }); } }); ``` ## 🎯 Frontend Implementation ### React Component Example ```typescript import React, { useState } from 'react'; import { VectorSearch } from 'vector-chunk'; const ContentIntelligence = () => { const [content, setContent] = useState(''); const [analysis, setAnalysis] = useState(null); const [searchResults, setSearchResults] = useState([]); const searchEngine = new VectorSearch(); const analyzeContent = async () => { try { const result = await searchEngine.analyzeContent(content); setAnalysis(result); } catch (error) { console.error('Analysis failed:', error); } }; const searchContent = async (query: string) => { try { const results = await searchEngine.searchContent(content, query); setSearchResults(results); } catch (error) { console.error('Search failed:', error); } }; return ( <div> <textarea value={content} onChange={(e) => setContent(e.target.value)} placeholder="Enter content to analyze..." /> <button onClick={analyzeContent}>Analyze Content</button> {analysis && ( <div> <h3>Content Intelligence Results</h3> <p>Tone: {analysis.tone.dominantTone}</p> <p>Quality: {(analysis.qualityScore * 100).toFixed(1)}%</p> <p>Complexity: {(analysis.dna.complexity * 100).toFixed(1)}%</p> <p>Summary: {analysis.summary}</p> </div> )} <input type="text" placeholder="Search query..." onKeyPress={(e) => e.key === 'Enter' && searchContent(e.target.value)} /> {searchResults.length > 0 && ( <div> <h3>Search Results</h3> {searchResults.map((result, index) => ( <div key={index}> <p>Similarity: {(result.similarity * 100).toFixed(1)}%</p> <p>Content: {result.content}</p> </div> ))} </div> )} </div> ); }; export default ContentIntelligence; ``` ## 🗄️ Database Schema ### Prisma Schema Example ```prisma model File { id String @id @default(cuid()) filename String size Int chunks Int tone String? complexity Float? qualityScore Float? keywords String[] createdAt DateTime @default(now()) updatedAt DateTime @updatedAt chunks Chunk[] } model Chunk { id String @id @default(cuid()) fileId String content String vector String // JSON string of vector values metadata Json chunkIndex Int tone String? style Float? createdAt DateTime @default(now()) file File @relation(fields: [fileId], references: [id]) } ``` ## 🔍 Elasticsearch Mapping ```json { "mappings": { "properties": { "fileId": { "type": "keyword" }, "filename": { "type": "text" }, "content": { "type": "text" }, "vector": { "type": "dense_vector", "dims": 128 }, "metadata": { "type": "object" }, "chunkIndex": { "type": "integer" }, "tone": { "type": "keyword" }, "complexity": { "type": "float" }, "keywords": { "type": "keyword" } } } } ``` ## 🚀 Performance Optimization ### 1. Adaptive Configuration ```typescript const searchEngine = new VectorSearch( // Search config { maxResults: 20, threshold: 0.1 }, // Optimization config { enableAutoOptimization: true, learningRate: 0.15, performanceThreshold: 0.85 }, // Adaptive config { optimizationStrategy: 'aggressive', autoTuning: true } ); ``` ### 2. Performance Monitoring ```typescript // Record metrics for optimization searchEngine.recordPerformanceMetrics({ searchTime: 45, chunkSize: 512, memoryUsage: 2.5, accuracy: 0.85 }); // Get optimization stats const stats = searchEngine.getOptimizationStats(); console.log(`Performance improvement: ${(stats.performanceImprovement * 100).toFixed(1)}%`); ``` ## 🎯 Use Cases & Applications ### 1. **Content Management Systems** - Intelligent document organization - Content quality assessment - Style consistency checking - Tone optimization ### 2. **E-commerce Platforms** - Smart product search - Content recommendation - Review analysis - Product description optimization ### 3. **Knowledge Bases** - Multi-source synthesis - Conflict detection - Gap identification - Intelligent summarization ### 4. **Research Tools** - Paper analysis - Citation management - Literature review - Research synthesis ### 5. **Legal Systems** - Contract analysis - Policy search - Compliance checking - Document comparison ### 6. **Marketing Platforms** - Content tone analysis - Style optimization - Audience targeting - Content effectiveness ## 🔧 Configuration Options ### Search Configuration ```typescript { similarityMetric: 'cosine', maxResults: 10, threshold: 0.0 } ``` ### Optimization Configuration ```typescript { enableAutoOptimization: true, learningRate: 0.1, performanceThreshold: 0.8, optimizationInterval: 60000, maxOptimizationAttempts: 10 } ``` ### Adaptive Configuration ```typescript { enableLearning: true, performanceTracking: true, autoTuning: true, optimizationStrategy: 'balanced' } ``` ## 📊 Performance Benchmarks | Feature | Performance | Benefits | |---------|-------------|----------| | **Search Speed** | < 10ms | Instant results | | **Memory Usage** | 2-5MB | Efficient storage | | **Setup Time** | 0 seconds | No configuration | | **Dependencies** | 0 | Pure implementation | | **Learning** | Continuous | Self-improving | | **Optimization** | Automatic | Zero maintenance | ## 🚀 Getting Started 1. **Install the package** ```bash npm install vector-chunk ``` 2. **Initialize the engine** ```typescript import { VectorSearch } from 'vector-chunk'; const searchEngine = new VectorSearch(); ``` 3. **Analyze content** ```typescript const analysis = await searchEngine.analyzeContent(content); ``` 4. **Search intelligently** ```typescript const results = await searchEngine.searchContent(content, query); ``` 5. **Fuse multiple sources** ```typescript const fusion = await searchEngine.fuseContent(sources); ``` 6. **Monitor performance** ```typescript const analytics = searchEngine.getPerformanceAnalytics(); ``` ## 💡 Best Practices 1. **Start with defaults**: The package is pre-optimized 2. **Monitor performance**: Use built-in analytics 3. **Let it learn**: Performance improves automatically 4. **Batch operations**: Process multiple documents together 5. **Follow insights**: Use optimization recommendations --- **Vector Search Pro** - Where content meets intelligence! 🚀✨