# Astermind Pro

**Premium ML Toolkit** - Advanced RAG, Reranking, Summarization, and Information Flow Analysis

Astermind Pro extends the base `@astermind/astermind-elm` package with premium features for production-grade machine learning applications.

## Features

### 🚀 Core Premium Features

- **Omega RAG System** - Complete RAG pipeline with recursive compression
- **OmegaRR Reranking** - Production-grade reranking with engineered features and MMR filtering
- **OmegaSumDet** - Intent-aware, deterministic summarization
- **Transfer Entropy** - Information flow analysis with PWS and closed-loop control
- **Advanced Numerical Methods** - KRR, RFF, OnlineRidge, and production math utilities
- **Hybrid Retrieval** - Sparse (TF-IDF) + dense (kernel) retrieval system
- **Auto-Tuning** - Hyperparameter optimization (dev worker only)
- **Tree-Aware Parsing** - Hierarchical markdown processing
- **Advanced ELM Variants** - 5 premium ELM variants (Multi-Kernel, Deep Pro, Online Kernel, Multi-Task, Sparse)

### 📦 Package Structure

All APIs are **public and extensible** - no private APIs. Build your own pipelines using this professional toolbox.

```
src/
├── math/                      # Production-grade numerical methods
├── omega/                     # Omega RAG system
├── retrieval/                 # Hybrid retrieval system (sparse + dense)
│   ├── vectorization.ts       # TF-IDF, sparse/dense operations
│   ├── index-builder.ts       # Vocabulary, IDF, Nyström landmarks
│   └── hybrid-retriever.ts    # Hybrid retrieval with ridge regularization
├── elm/                       # Advanced ELM variants
│   ├── multi-kernel-elm.ts    # Multi-Kernel ELM
│   ├── deep-elm-pro.ts        # Improved Deep ELM
│   ├── online-kernel-elm.ts   # Online Kernel ELM
│   ├── multi-task-elm.ts      # Multi-Task ELM
│   └── sparse-elm.ts          # Sparse ELM
├── reranking/                 # OmegaRR reranking
├── summarization/             # OmegaSumDet summarization
├── infoflow/                  # Transfer Entropy analysis
├── workers/                   # Web Workers (dev & production)
├── utils/                     # Utility functions
│   ├── tokenization.ts        # Tokenization & stemming
│   ├── markdown.ts            # Markdown parsing & chunking
│   ├── autotune.ts            # Hyperparameter optimization
│   └── model-serialization.ts # Model export/import
└── types.ts                   # TypeScript types
```

**Key Feature:** All retrieval and utility functions are now **reusable outside of workers** - use them directly in your applications!

## Installation

```bash
npm install @astermind/astermind-pro
```

**Prerequisites:**

- `@astermind/astermind-elm` (peer dependency)
- `@astermindai/license-runtime` (included as dependency)

**Note:** Astermind Pro subscription includes **Astermind Synth** - a synthetic data generator for bootstrapping your projects. See the [Developer Guide](./docs/guides/DEVELOPER_GUIDE.md#bootstrapping-with-astermind-synth) for details.

## License Setup

Astermind Pro uses a **centralized license configuration** that automatically propagates to both Pro and Synth.

### Get a Trial License

You can obtain a trial license by making a request to the license server:

```bash
curl -X POST "https://license.astermind.ai/v1/trial/create" \
  -H "Content-Type: application/json" \
  -d '{"email": "Your-email@example.com", "product": "astermind-elm"}'
```

The response will contain your trial license token.

### Quick Setup:

1. **Edit `src/config/license-config.ts`**:

   ```typescript
   export const LICENSE_TOKEN: string | null = 'YOUR_LICENSE_TOKEN_HERE';
   ```

2. **Or use an environment variable**:

   ```bash
   export ASTERMIND_LICENSE_TOKEN="your-license-token-here"
   ```

3. **Or set it programmatically**:

   ```typescript
   import { setLicenseTokenFromString } from '@astermind/astermind-pro';

   await setLicenseTokenFromString('your-license-token-here');
   ```

See [LICENSE_SETUP.md](./LICENSE_SETUP.md) for the complete license setup guide.

## Usage

### Basic Import

```typescript
import {
  // License Management
  initializeLicense,
  checkLicense,
  setLicenseTokenFromString,

  // Math utilities
  cosine,
  l2,
  normalizeL2,
  ridgeSolvePro,
  OnlineRidge,
  buildRFF,

  // Retrieval (NEW - reusable outside workers!)
  tokenize,
  expandQuery,
  toTfidf,
  hybridRetrieve,
  buildIndex,
  parseMarkdownToSections,
  flattenSections,

  // Omega RAG
  omegaComposeAnswer,

  // Reranking
  rerank,
  rerankAndFilter,
  filterMMR,

  // Summarization
  summarizeDeterministic,

  // Information Flow
  TransferEntropy,
  InfoFlowGraph,
  TEController,

  // Auto-tuning (NEW - reusable!)
  autoTune,
  sampleQueriesFromCorpus,

  // Model serialization (NEW - reusable!)
  exportModel,
  importModel,

  // Advanced ELM Variants (NEW!)
  MultiKernelELM,
  DeepELMPro,
  OnlineKernelELM,
  MultiTaskELM,
  SparseELM,

  // Types
  SerializedModel,
  Settings
} from '@astermind/astermind-pro';
```

### Development Worker (with Training)

For development and training:

```typescript
// In browser context
const worker = new Worker(
  new URL('@astermind/astermind-pro/workers/dev-worker', import.meta.url),
  { type: 'module' }
);

worker.postMessage({
  action: 'init',
  payload: {
    settings: { /* ... */ },
    chaptersPath: '/chapters.json'
  }
});

// Training, autotune, etc. available
worker.postMessage({
  action: 'autotune',
  payload: { budget: 40, sampleQueries: 24 }
});
```

### Production Worker (Inference Only)

For production deployments - optimized for inference:

```typescript
// In browser context
const worker = new Worker(
  new URL('@astermind/astermind-pro/workers/prod-worker', import.meta.url),
  { type: 'module' }
);

// Load pre-trained model
worker.postMessage({
  action: 'init',
  payload: {
    model: serializedModel // SerializedModel from dev-worker exportModel()
  }
});

// Query only
worker.postMessage({
  action: 'ask',
  payload: { q: 'your query here' }
});
```

## Key Features Overview

### Omega RAG System

Complete RAG pipeline with recursive compression, query-aligned sentence selection, and personality modes (neutral, teacher, scientist).

**Use Cases:** Technical documentation assistants, customer support systems, knowledge base Q&A

### OmegaRR Reranking

Production-grade reranking with rich feature engineering (TF-IDF, BM25, structural signals), weak supervision, and MMR filtering.

**Use Cases:** Search engines, legal document retrieval, product search optimization

### OmegaSumDet Summarization

Intent-aware, deterministic summarization with code-aware processing and heading alignment.

**Use Cases:** Code explanation generation, research paper summarization, technical documentation summaries

### Transfer Entropy Analysis

Information flow monitoring with streaming TE estimation, PWS variant, and closed-loop adaptive control.

**Use Cases:** Pipeline quality assurance, automatic hyperparameter tuning, system health monitoring

### Advanced Numerical Methods

Production-grade math including KRR (Cholesky + CG fallback), RFF approximation, OnlineRidge, and overflow-safe operations.

### Hybrid Retrieval

Sparse (TF-IDF) + dense (kernel) retrieval with Nyström approximation and multiple kernel types.

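To make the sparse-plus-dense idea concrete, here is a minimal, self-contained sketch of how a TF-IDF cosine score and an RBF kernel score might be blended into a single ranking score. It illustrates the general technique only and is not Astermind Pro's implementation: `cosineSim`, `rbfKernel`, `blendScores`, `rankHybrid`, the `Doc` shape, and the `mix` weight are all hypothetical names, and the library's own `alpha`, `beta`, and `ridge` settings may behave differently. TF-IDF vectors are stored as plain arrays here for brevity.

```typescript
// Illustrative sketch only; NOT the library's code. All names are hypothetical.

/** Cosine similarity between two equal-length vectors (0 if either has zero norm). */
function cosineSim(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

/** RBF (Gaussian) kernel: exp(-||x - y||^2 / (2 * sigma^2)). */
function rbfKernel(x: number[], y: number[], sigma: number): number {
  let squaredDist = 0;
  for (let i = 0; i < x.length; i++) {
    const d = x[i] - y[i];
    squaredDist += d * d;
  }
  return Math.exp(-squaredDist / (2 * sigma * sigma));
}

/** Convex blend (assumed scheme): mix = 1 is sparse-only, mix = 0 is dense-only. */
function blendScores(sparse: number, dense: number, mix: number): number {
  return mix * sparse + (1 - mix) * dense;
}

interface Doc {
  id: string;
  tfidf: number[]; // sparse-side features (TF-IDF weights)
  dense: number[]; // dense-side features (e.g. an embedding)
}

/** Score every document against the query and sort by descending blended score. */
function rankHybrid(
  queryTfidf: number[],
  queryDense: number[],
  docs: Doc[],
  mix: number,
  sigma: number
): { id: string; score: number }[] {
  return docs
    .map((doc) => ({
      id: doc.id,
      score: blendScores(
        cosineSim(queryTfidf, doc.tfidf),
        rbfKernel(queryDense, doc.dense, sigma),
        mix
      ),
    }))
    .sort((a, b) => b.score - a.score);
}

// Document 'a' matches the query terms; document 'b' matches the dense vector.
const docs: Doc[] = [
  { id: 'a', tfidf: [1, 0, 0], dense: [0, 0] },
  { id: 'b', tfidf: [0, 1, 0], dense: [1, 1] },
];
const ranked = rankHybrid([1, 0, 0], [1, 1], docs, 0.7, 1.0);
```

With `mix = 0.7` the term-matching document `'a'` ranks first; lowering `mix` toward 0 shifts the ranking toward kernel similarity in the dense space, so `'b'` wins instead.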
**Now available as standalone modules** - use `hybridRetrieve()` and `buildIndex()` directly in your code, not just in workers!

### Auto-Tuning System

Automated hyperparameter optimization with random search, refinement, and real-time progress reporting.

**Now available as standalone function** - use `autoTune()` directly in your applications.

## Performance

- **Training Speed**: Milliseconds (vs. minutes for traditional ML)
- **Inference Latency**: Microseconds per query
- **Model Size**: KB-sized (vs. GB for large language models)
- **Memory Usage**: Minimal - runs entirely on-device
- **Scalability**: Handles millions of documents

## 🎁 Bonus: Astermind Synth Included

**Every Astermind Pro subscription includes Astermind Synth** - the synthetic data generator that helps you bootstrap your ML projects quickly.

**Features:**

- 5 Generation Modes - From simple retrieval to premium generation
- Pretrained Models - Ready-to-use generators for common data types
- Label-Conditioned - Generate data for specific categories
- High Realism - 56%+ realism scores on internal benchmarks
- ELM Integration - Train ELM models directly from synthetic data

See the [Developer Guide](./docs/guides/DEVELOPER_GUIDE.md#bootstrapping-with-astermind-synth) for complete examples.

## Documentation

- 📖 **[Developer Guide](./docs/guides/DEVELOPER_GUIDE.md)** - Complete API reference (1,657+ lines)
- 💡 **[Examples](./docs/guides/EXAMPLES.md)** - 15+ practical code examples
- 🧠 **[ELM Variants Examples](./docs/features/ELM_VARIANTS_EXAMPLES.md)** - Complete examples for all 5 advanced ELM variants
- 💼 **[ELM Variants Business Examples](./docs/features/ELM_VARIANTS_BUSINESS_EXAMPLES.md)** - Real-world business use cases across industries
- ⚡ **[Quick Reference](./docs/guides/QUICK_REFERENCE.md)** - Quick lookup guide
- 🎯 **[Premium Features](./docs/features/PREMIUM_FEATURES.md)** - Detailed feature documentation
- 📚 **[Documentation Index](./docs/DOCS_INDEX.md)** - Complete documentation overview

## Technical Specifications

- **Language**: TypeScript/JavaScript
- **Platform**: Browser & Node.js
- **Dependencies**: @astermind/astermind-elm (peer dependency)
- **License**: Proprietary
- **Browser Support**: Modern browsers (Chrome, Firefox, Safari, Edge)
- **Node.js**: Version 18+

## Professional Architecture

- **No Private APIs** - Everything is public and extensible
- **Fully Modular** - Use components independently or build custom pipelines
- **Type-Safe** - Full TypeScript support with comprehensive types
- **Production Ready** - Optimized workers for dev and production deployments

## Real-World Applications

- **Technical Documentation** - Build intelligent assistants that understand code, APIs, and technical concepts
- **Legal Research** - Extract relevant information from legal documents with citation-aware ranking
- **Customer Support** - Provide accurate, helpful answers from knowledge bases
- **E-Commerce** - Improve product search relevance and generate comparison summaries
- **Medical Information** - Retrieve accurate medical information with trust-weighted ranking
- **Research Analysis** - Summarize research papers and extract key findings automatically

## Quick Start Examples

### Custom Retrieval Pipeline (Outside Workers)

```typescript
import {
  buildIndex,
  hybridRetrieve,
  rerankAndFilter,
  summarizeDeterministic
} from '@astermind/astermind-pro';

// `yourDocuments` is a placeholder for your own chunked documents
const query = 'your query';

// Build index from your documents
const index = buildIndex({
  chunks: yourDocuments,
  vocab: 10000,
  landmarks: 256,
  headingW: 2.0,
  useStem: true,
  kernel: 'rbf',
  sigma: 1.0
});

// Perform hybrid retrieval
const retrieved = hybridRetrieve({
  query,
  chunks: yourDocuments,
  vocabMap: index.vocabMap,
  idf: index.idf,
  tfidfDocs: index.tfidfDocs,
  denseDocs: index.denseDocs,
  landmarksIdx: index.landmarksIdx,
  landmarkMat: index.landmarkMat,
  vocabSize: index.vocabMap.size,
  kernel: 'rbf',
  sigma: 1.0,
  alpha: 0.7,
  beta: 0.1,
  ridge: 0.08,
  headingW: 2.0,
  useStem: true,
  expandQuery: false,
  topK: 10
});

// Rerank and summarize
const reranked = rerankAndFilter(query, retrieved.items, {
  lambdaRidge: 1e-2,
  probThresh: 0.45,
  useMMR: true
});

const summary = summarizeDeterministic(query, reranked, {
  maxAnswerChars: 1000,
  includeCitations: true
});
```

### Traditional Pipeline (Using Workers)

```typescript
import {
  rerankAndFilter,
  summarizeDeterministic,
  InfoFlowGraph
} from '@astermind/astermind-pro';

// `query` and `documents` are placeholders supplied by your application

// Build your custom pipeline
const results = rerankAndFilter(query, documents, {
  lambdaRidge: 1e-2,
  probThresh: 0.45,
  useMMR: true
});

const summary = summarizeDeterministic(query, results, {
  maxAnswerChars: 1000,
  includeCitations: true
});
```

## Support & Resources

- **Documentation**: See [DOCS_INDEX.md](./docs/DOCS_INDEX.md) for complete documentation
- **Examples**: See [EXAMPLES.md](./docs/guides/EXAMPLES.md) for practical code examples
- **Pricing**: See [PRICING_PAGE.md](./PRICING_PAGE.md) for pricing information
- **Legal**: See [LEGAL_INDEX.md](./LEGAL_INDEX.md) for terms, privacy, and legal documents

## License

**PROPRIETARY** - This is a premium package. See [TERMS_OF_SERVICE.md](./TERMS_OF_SERVICE.md) for usage rights.

---

**Astermind Pro** - Professional ML Toolkit for Production Applications

For questions and support, contact AsterMind LLC.