@astermind/astermind-pro
Astermind Pro - Premium ML Toolkit with Advanced RAG, Reranking, Summarization, and Information Flow Analysis
# Advanced ELM Variants - Practical Examples
Complete guide with practical examples for all 5 advanced ELM variants in Astermind Pro.
---
## Table of Contents
1. [Multi-Kernel ELM (MK-ELM)](#multi-kernel-elm-mk-elm)
2. [DeepELMPro](#deepelmpro)
3. [Online Kernel ELM](#online-kernel-elm)
4. [Multi-Task ELM](#multi-task-elm)
5. [Sparse ELM](#sparse-elm)
---
## Multi-Kernel ELM (MK-ELM)
Combines multiple kernel types (e.g., RBF and linear) in a single model, blending them through a weighted sum so the classifier can capture heterogeneous patterns that no single kernel fits well.
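Conceptually, the combined kernel is just a weighted sum of the base kernels. A minimal, self-contained sketch of that idea in plain TypeScript (this illustrates the math, not the library's internals; `makeCombined` is an illustrative name):

```typescript
type Vec = number[];

const dot = (a: Vec, b: Vec) => a.reduce((s, ai, i) => s + ai * b[i], 0);

// RBF kernel: exp(-gamma * ||a - b||^2)
const rbf = (gamma: number) => (a: Vec, b: Vec) =>
  Math.exp(-gamma * a.reduce((s, ai, i) => s + (ai - b[i]) ** 2, 0));

// Linear kernel: <a, b>
const linear = (a: Vec, b: Vec) => dot(a, b);

// Combined kernel: K(a, b) = wRbf * K_rbf(a, b) + wLin * K_lin(a, b)
const makeCombined = (wRbf: number, wLin: number, gamma: number) =>
  (a: Vec, b: Vec) => wRbf * rbf(gamma)(a, b) + wLin * linear(a, b);

const k = makeCombined(0.6, 0.4, 0.01);
console.log(k([1, 0], [1, 0])); // 0.6 * exp(0) + 0.4 * 1 = 1.0
```

With `learnWeights: true`, the library fits the `wRbf`/`wLin` blend for you instead of taking it as fixed.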
### Example 1: Basic Multi-Kernel Classification
```typescript
import { MultiKernelELM } from '@astermind/astermind-pro';
// Create Multi-Kernel ELM with RBF and linear kernels
const mkElm = new MultiKernelELM(['positive', 'negative', 'neutral'], {
  kernels: [
    { type: 'rbf', params: { gamma: 0.01 }, weight: 0.6 },
    { type: 'linear', weight: 0.4 },
  ],
  ridgeLambda: 0.001,
  learnWeights: true, // Automatically learn optimal kernel weights
  nystrom: {
    m: 100,
    strategy: 'uniform',
  },
});
// Prepare training data
const X = [
  [1, 2, 3, 4, 5],
  [6, 7, 8, 9, 10],
  [11, 12, 13, 14, 15],
  // ... more samples
];
const y = [
  [1, 0, 0], // positive
  [0, 1, 0], // negative
  [0, 0, 1], // neutral
  // ... more labels
];
// Train
mkElm.fit(X, y);
// Predict
const predictions = mkElm.predict([1, 2, 3, 4, 5], 3);
console.log(predictions);
// Output: [{ label: 'positive', prob: 0.85 }, ...]
// Get learned kernel weights
const weights = mkElm.getKernelWeights();
console.log('Kernel weights:', weights);
// Output: [0.65, 0.35] (learned optimal weights)
```
### Example 2: Text Sentiment Analysis
```typescript
import { MultiKernelELM } from '@astermind/astermind-pro';
import { tokenize } from '@astermind/astermind-pro';
// Prepare text data
const texts = [
  'I love this product!',
  'This is terrible.',
  'It is okay, nothing special.',
  // ... more texts
];
// Convert to feature vectors (using tokenization)
const X = texts.map(text => {
  const tokens = tokenize(text, true);
  // Convert to numeric features (simplified - in practice use proper encoding)
  return tokens.map(t => t.charCodeAt(0) % 100);
});
const y = [
  [1, 0, 0], // positive
  [0, 1, 0], // negative
  [0, 0, 1], // neutral
];
// Create Multi-Kernel ELM optimized for text
const mkElm = new MultiKernelELM(['positive', 'negative', 'neutral'], {
  kernels: [
    { type: 'rbf', params: { gamma: 0.1 } }, // For non-linear patterns
    { type: 'linear' }, // For linear patterns
  ],
  learnWeights: true,
});
mkElm.fit(X, y);
// Classify new text
const newText = 'This is amazing!';
const features = tokenize(newText, true).map(t => t.charCodeAt(0) % 100);
const result = mkElm.predict(features, 1);
console.log(`Sentiment: ${result[0].label} (${(result[0].prob * 100).toFixed(1)}%)`);
```
### Example 3: Image Classification with Multiple Kernels
```typescript
import { MultiKernelELM } from '@astermind/astermind-pro';
// Image features (e.g., from CNN or hand-crafted features)
const imageFeatures = [
  [0.1, 0.2, 0.3, 0.4, 0.5, ...], // Image 1
  [0.2, 0.3, 0.4, 0.5, 0.6, ...], // Image 2
  // ... more images
];
const labels = [
  [1, 0, 0, 0], // cat
  [0, 1, 0, 0], // dog
  [0, 0, 1, 0], // bird
  [0, 0, 0, 1], // other
];
// Use multiple kernels to capture different patterns
const mkElm = new MultiKernelELM(['cat', 'dog', 'bird', 'other'], {
  kernels: [
    { type: 'rbf', params: { gamma: 0.01 } }, // For local patterns
    { type: 'rbf', params: { gamma: 0.1 } }, // For global patterns
    { type: 'linear' }, // For linear relationships
  ],
  learnWeights: true,
  ridgeLambda: 0.0001,
});
mkElm.fit(imageFeatures, labels);
// Classify new image
const newImage = [0.15, 0.25, 0.35, 0.45, 0.55, ...];
const prediction = mkElm.predict(newImage, 1);
console.log(`Predicted: ${prediction[0].label}`);
```
---
## DeepELMPro
Improved multi-layer ELM with autoencoder pretraining, regularization, batch normalization, and dropout.
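Of the features listed, dropout is the easiest to see in isolation. A sketch of inverted dropout (an assumption about how `useDropout`/`dropoutRate` behave, not the library's actual code): during training, each activation is zeroed with probability `rate`, and survivors are scaled by `1 / (1 - rate)` so inference needs no rescaling.

```typescript
// Inverted dropout: drop with probability `rate`, rescale survivors.
// `rng` is injectable so the behavior can be made deterministic.
function invertedDropout(
  activations: number[],
  rate: number,
  rng: () => number = Math.random,
): number[] {
  return activations.map(a => (rng() < rate ? 0 : a / (1 - rate)));
}

// With an rng that never drops, only the rescaling remains:
console.log(invertedDropout([1, 2], 0.2, () => 0.99)); // [1.25, 2.5]
```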
### Example 1: Basic DeepELMPro with All Features
```typescript
import { DeepELMPro } from '@astermind/astermind-pro';
// Create DeepELMPro with all advanced features
const deepElm = new DeepELMPro({
  layers: [256, 128, 64, 32], // Four hidden layers
  categories: ['spam', 'ham', 'promotional'],
  activation: 'relu',
  useDropout: true,
  dropoutRate: 0.2,
  useBatchNorm: true,
  regularization: {
    type: 'l2',
    lambda: 0.0001,
  },
  layerWiseTraining: true,
  pretraining: true, // Enable autoencoder pretraining
  maxLen: 100,
});
// Training data
const X = [
  [1, 2, 3, 4, 5, ...],
  [6, 7, 8, 9, 10, ...],
  // ... more samples
];
const y = [0, 1, 2, ...]; // Labels
// Train with improved strategies
await deepElm.train(X, y);
// Predict
const predictions = deepElm.predict([1, 2, 3, 4, 5, ...], 3);
console.log(predictions);
```
### Example 2: Document Classification with Pretraining
```typescript
import { DeepELMPro } from '@astermind/astermind-pro';
// Classify documents into categories
const deepElm = new DeepELMPro({
  layers: [512, 256, 128], // Deep network for complex patterns
  categories: ['technical', 'business', 'legal', 'medical'],
  activation: 'relu',
  pretraining: true, // Pretrain layers as autoencoders
  layerWiseTraining: true, // Train sequentially
  regularization: {
    type: 'elastic',
    lambda: 0.0001,
    alpha: 0.5, // Balance between L1 and L2
  },
  useBatchNorm: true, // Stabilize training
  maxLen: 500,
});
// Document feature vectors (e.g., TF-IDF or embeddings)
const documents = [
  [0.1, 0.2, 0.0, 0.3, ...], // Technical doc
  [0.0, 0.1, 0.4, 0.2, ...], // Business doc
  // ... more documents
];
const labels = [0, 1, 2, 3, ...];
await deepElm.train(documents, labels);
// Classify new document
const newDoc = [0.15, 0.25, 0.05, 0.3, ...];
const result = deepElm.predict(newDoc, 1);
console.log(`Category: ${result[0].label}`);
```
### Example 3: Handling Overfitting with Regularization
```typescript
import { DeepELMPro } from '@astermind/astermind-pro';
// When you have limited training data and risk overfitting
const deepElm = new DeepELMPro({
  layers: [128, 64],
  categories: ['class1', 'class2', 'class3'],
  useDropout: true,
  dropoutRate: 0.3, // Higher dropout for more regularization
  regularization: {
    type: 'l2',
    lambda: 0.001, // Stronger regularization
  },
  useBatchNorm: true,
  pretraining: false, // Skip pretraining for small datasets
  layerWiseTraining: false, // Use joint training
});
// Small dataset
const X = [
  [1, 2, 3],
  [4, 5, 6],
  // ... limited samples
];
const y = [0, 1, ...];
await deepElm.train(X, y);
```
### Example 4: Base DeepELM vs. DeepELMPro
```typescript
import { DeepELM } from '@astermind/astermind-elm';
import { DeepELMPro } from '@astermind/astermind-pro';
// Base DeepELM (from astermind-elm)
const baseDeep = new DeepELM({
  layers: [{ hiddenUnits: 128, activation: 'relu' }],
  maxLen: 100,
});
// DeepELMPro (improved version)
const proDeep = new DeepELMPro({
  layers: [128],
  categories: ['a', 'b', 'c'],
  activation: 'relu',
  pretraining: true, // ✅ Not in base
  layerWiseTraining: true, // ✅ Not in base
  regularization: { // ✅ Not in base
    type: 'l2',
    lambda: 0.0001,
  },
  useBatchNorm: true, // ✅ Not in base
  useDropout: true, // ✅ Not in base
  dropoutRate: 0.2,
  maxLen: 100,
});
// DeepELMPro provides better generalization and training stability
```
---
## Online Kernel ELM
Real-time learning for streaming data with incremental updates and forgetting mechanisms.
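The forgetting mechanism can be pictured as a decaying sliding window. A self-contained sketch of the semantics this suggests for `windowSize`/`decayFactor` (an illustration of the idea, not the library's implementation; `DecayingWindow` is an assumed name):

```typescript
type Stored = { features: number[]; label: number; weight: number };

class DecayingWindow {
  private samples: Stored[] = [];
  constructor(private windowSize: number, private decayFactor: number) {}

  update(features: number[], label: number): void {
    // Every existing sample's influence fades by decayFactor...
    for (const s of this.samples) s.weight *= this.decayFactor;
    // ...the new sample enters at full weight...
    this.samples.push({ features, label, weight: 1 });
    // ...and the window bounds total memory.
    if (this.samples.length > this.windowSize) this.samples.shift();
  }

  totalWeight(): number {
    return this.samples.reduce((sum, s) => sum + s.weight, 0);
  }
}

const win = new DecayingWindow(2, 0.5);
win.update([1], 0);
win.update([2], 0);
win.update([3], 1); // oldest sample evicted
console.log(win.totalWeight()); // 0.5 + 1 = 1.5
```

A smaller window and stronger decay adapt faster to new patterns at the cost of remembering less history.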
### Example 1: Streaming Data Classification
```typescript
import { OnlineKernelELM } from '@astermind/astermind-pro';
// Create Online Kernel ELM for streaming data
const onlineKelm = new OnlineKernelELM({
  kernel: {
    type: 'rbf',
    gamma: 0.01,
  },
  categories: ['normal', 'anomaly', 'critical'],
  ridgeLambda: 0.001,
  windowSize: 1000, // Keep last 1000 samples
  decayFactor: 0.99, // Exponential decay for old samples
  maxLandmarks: 100,
});
// Initial batch training
const initialX = [
  [1, 2, 3],
  [4, 5, 6],
  // ... initial samples
];
const initialY = [0, 1, 0, ...];
onlineKelm.fit(initialX, initialY);
// Stream new data and update incrementally
async function processStream(dataStream: AsyncIterable<{ features: number[], label: number }>) {
  for await (const { features, label } of dataStream) {
    // Update model with new sample
    onlineKelm.update(features, label);
    // Predict immediately
    const prediction = onlineKelm.predict(features, 1);
    console.log(`Predicted: ${prediction[0].label}`);
  }
}
```
### Example 2: Real-Time Anomaly Detection
```typescript
import { OnlineKernelELM } from '@astermind/astermind-pro';
// Real-time anomaly detection system
const anomalyDetector = new OnlineKernelELM({
  kernel: { type: 'rbf', gamma: 0.1 },
  categories: ['normal', 'anomaly'],
  windowSize: 500, // Recent samples only
  decayFactor: 0.95, // Faster decay for anomaly detection
});
// Initial training on normal data
const normalSamples = [
  [10, 20, 30],
  [11, 21, 31],
  // ... normal patterns
];
anomalyDetector.fit(normalSamples, new Array(normalSamples.length).fill(0));
// Process real-time sensor data
function processSensorReading(sensorData: number[]) {
  // Update with new reading
  const isNormal = Math.random() > 0.1; // Simulated
  anomalyDetector.update(sensorData, isNormal ? 0 : 1);
  // Check for anomalies
  const prediction = anomalyDetector.predict(sensorData, 1);
  if (prediction[0].label === 'anomaly' && prediction[0].prob > 0.8) {
    console.warn('Anomaly detected!', sensorData);
  }
}
// Simulate streaming data
setInterval(() => {
  const reading = [Math.random() * 100, Math.random() * 100, Math.random() * 100];
  processSensorReading(reading);
}, 1000);
```
### Example 3: Adaptive Learning with Concept Drift
```typescript
import { OnlineKernelELM } from '@astermind/astermind-pro';
// Handle concept drift in streaming data
const adaptiveModel = new OnlineKernelELM({
  kernel: { type: 'rbf', gamma: 0.01 },
  categories: ['class1', 'class2', 'class3'],
  windowSize: 200, // Small window for faster adaptation
  decayFactor: 0.9, // Strong decay to forget old patterns
});
// Initial training (initialData/initialLabels prepared as in Example 1)
adaptiveModel.fit(initialData, initialLabels);
// Continuously adapt to changing patterns
function adaptToNewPattern(newSample: number[], newLabel: number) {
  // Update model (old samples automatically decayed)
  adaptiveModel.update(newSample, newLabel);
  // Model automatically adapts to new patterns;
  // old patterns fade away due to the decay factor
}
```
---
## Multi-Task ELM
Joint learning across related tasks with shared feature extraction and task-specific outputs.
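The structure this describes is a shared trunk feeding one output head per task. A minimal sketch of that forward pass in plain TypeScript (illustrative only; `multiTaskForward` and the tiny weight matrices are assumptions, not the library's internals):

```typescript
type Vec = number[];
type Mat = number[][];

const relu = (v: Vec): Vec => v.map(x => Math.max(0, x));
const matVec = (W: Mat, v: Vec): Vec =>
  W.map(row => row.reduce((s, w, i) => s + w * v[i], 0));

function multiTaskForward(x: Vec, sharedW: Mat, heads: Map<string, Mat>) {
  const h = relu(matVec(sharedW, x)); // shared features, computed once
  const out = new Map<string, Vec>();
  for (const [task, W] of heads) {
    out.set(task, matVec(W, h)); // each task gets its own linear head
  }
  return out;
}

const heads = new Map<string, Mat>([
  ['sentiment', [[2, 0], [0, 3]]],
  ['topic', [[1, 1]]],
]);
const scores = multiTaskForward([1, -2], [[1, 0], [0, 1]], heads);
console.log(scores.get('sentiment')); // [2, 0], since h = relu([1, -2]) = [1, 0]
```

Because the trunk is shared, every task's training signal shapes the same features, which is what lets related tasks help each other.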
### Example 1: Sentiment + Topic Classification
```typescript
import { MultiTaskELM } from '@astermind/astermind-pro';
// Classify both sentiment and topic simultaneously
const mtElm = new MultiTaskELM({
  tasks: [
    {
      name: 'sentiment',
      categories: ['positive', 'negative', 'neutral'],
      weight: 1.0, // Equal importance
    },
    {
      name: 'topic',
      categories: ['technology', 'sports', 'politics', 'entertainment'],
      weight: 1.0,
    },
  ],
  sharedHiddenUnits: 256,
  taskSpecificHiddenUnits: [128, 128],
  activation: 'relu',
  useTokenizer: true,
  maxLen: 100,
});
// Training data
const texts = [
  'I love this new phone!',
  'The game was terrible.',
  'Politics is complicated.',
  // ... more texts
];
// Prepare task labels
const sentimentLabels = [0, 1, 2, ...]; // positive, negative, neutral
const topicLabels = [0, 1, 2, 3, ...]; // tech, sports, politics, entertainment
const yTaskData = new Map([
  ['sentiment', sentimentLabels],
  ['topic', topicLabels],
]);
// Train on both tasks simultaneously
mtElm.train(texts, yTaskData);
// Predict for all tasks
const newText = 'This technology is amazing!';
const allPredictions = mtElm.predict(newText, 2);
console.log('Sentiment:', allPredictions.get('sentiment'));
// Output: [{ task: 'sentiment', label: 'positive', prob: 0.9 }, ...]
console.log('Topic:', allPredictions.get('topic'));
// Output: [{ task: 'topic', label: 'technology', prob: 0.85 }, ...]
// Or predict for specific task
const sentimentOnly = mtElm.predictTask(newText, 'sentiment', 1);
console.log('Sentiment:', sentimentOnly[0].label);
```
### Example 2: Multi-Label Document Classification
```typescript
import { MultiTaskELM } from '@astermind/astermind-pro';
// Classify documents by multiple attributes
const docClassifier = new MultiTaskELM({
  tasks: [
    {
      name: 'category',
      categories: ['news', 'blog', 'academic', 'legal'],
      weight: 1.0,
    },
    {
      name: 'language',
      categories: ['english', 'spanish', 'french'],
      weight: 0.8, // Slightly less important
    },
    {
      name: 'formality',
      categories: ['formal', 'informal', 'mixed'],
      weight: 0.6,
    },
  ],
  sharedHiddenUnits: 512, // Large shared layer
  taskSpecificHiddenUnits: [256, 128, 128],
});
// Document feature vectors
const documents = [
  [0.1, 0.2, 0.3, ...],
  [0.4, 0.5, 0.6, ...],
  // ... more documents
];
const categoryLabels = [0, 1, 2, 3, ...];
const languageLabels = [0, 1, 2, ...];
const formalityLabels = [0, 1, 2, ...];
const yTaskData = new Map([
  ['category', categoryLabels],
  ['language', languageLabels],
  ['formality', formalityLabels],
]);
docClassifier.train(documents, yTaskData);
// Classify new document
const newDoc = [0.2, 0.3, 0.4, ...];
const results = docClassifier.predict(newDoc, 1);
console.log('Category:', results.get('category')?.[0].label);
console.log('Language:', results.get('language')?.[0].label);
console.log('Formality:', results.get('formality')?.[0].label);
```
### Example 3: Customer Support Ticket Classification
```typescript
import { MultiTaskELM } from '@astermind/astermind-pro';
// Classify support tickets by urgency, category, and department
const ticketClassifier = new MultiTaskELM({
  tasks: [
    {
      name: 'urgency',
      categories: ['low', 'medium', 'high', 'critical'],
      weight: 2.0, // More important
    },
    {
      name: 'category',
      categories: ['technical', 'billing', 'account', 'general'],
      weight: 1.0,
    },
    {
      name: 'department',
      categories: ['support', 'engineering', 'sales', 'billing'],
      weight: 1.0,
    },
  ],
  sharedHiddenUnits: 256,
  taskSpecificHiddenUnits: [128, 128, 128],
});
// Ticket text features
const tickets = [
  'My account is locked',
  'I need a refund',
  'The app crashes on startup',
  // ... more tickets
];
const urgencyLabels = [2, 1, 3, ...]; // high, medium, critical
const categoryLabels = [2, 1, 0, ...]; // account, billing, technical
const departmentLabels = [0, 3, 1, ...]; // support, billing, engineering
const yTaskData = new Map([
  ['urgency', urgencyLabels],
  ['category', categoryLabels],
  ['department', departmentLabels],
]);
ticketClassifier.train(tickets, yTaskData);
// Route new ticket
const newTicket = 'I cannot log in to my account';
const routing = ticketClassifier.predict(newTicket, 1);
const urgency = routing.get('urgency')?.[0].label;
const dept = routing.get('department')?.[0].label;
console.log(`Route to ${dept} department (${urgency} urgency)`);
```
---
## Sparse ELM
Delivers efficiency and interpretability on high-dimensional data by combining L1/L2 regularization with feature selection.
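The sparsification idea behind options like `lambda` and `pruneThreshold` can be sketched with L1-style soft-thresholding followed by hard pruning (an illustration of the math; `softThreshold` and `sparsify` are assumed names, not the library API):

```typescript
// Soft-thresholding: shrink |w| by lambda, snapping small weights to zero.
const softThreshold = (w: number, lambda: number): number =>
  Math.sign(w) * Math.max(0, Math.abs(w) - lambda);

function sparsify(weights: number[], lambda: number, pruneThreshold: number) {
  const pruned = weights
    .map(w => softThreshold(w, lambda))
    // Anything left below the prune threshold is zeroed outright.
    .map(w => (Math.abs(w) < pruneThreshold ? 0 : w));
  const sparsity = pruned.filter(w => w === 0).length / pruned.length;
  return { pruned, sparsity };
}

const { pruned, sparsity } = sparsify([0.5, -0.02, 0.001, -1], 0.05, 1e-6);
console.log(pruned);   // ≈ [0.45, 0, 0, -0.95]
console.log(sparsity); // 0.5
```

Larger `lambda` zeroes more weights, which is why the L1 examples below use a strong `lambda` when the goal is feature selection rather than mere regularization.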
### Example 1: High-Dimensional Feature Selection
```typescript
import { SparseELM } from '@astermind/astermind-pro';
// Handle high-dimensional data (e.g., 10,000+ features)
const sparseElm = new SparseELM({
  categories: ['class1', 'class2', 'class3'],
  hiddenUnits: 256,
  regularization: {
    type: 'l1', // L1 for feature selection
    lambda: 0.01,
  },
  sparsityTarget: 0.7, // Target 70% sparsity
  pruneThreshold: 1e-6,
  useTokenizer: false, // Using numeric features
});
// High-dimensional feature vectors
const X = [
  [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...], // 10,000 features
  [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, ...],
  // ... more samples
];
const y = [0, 1, 2, ...];
sparseElm.train(X, y);
// Get sparsity statistics
const stats = sparseElm.getSparsityStats();
console.log(`Sparsity: ${(stats.sparsity * 100).toFixed(1)}%`);
console.log(`Active weights: ${stats.activeWeights} / ${stats.totalWeights}`);
// Get feature importance
const importance = sparseElm.getFeatureImportance();
console.log('Top 10 most important features:');
const topFeatures = importance
  .map((imp, idx) => ({ idx, imp }))
  .sort((a, b) => b.imp - a.imp)
  .slice(0, 10);
topFeatures.forEach(({ idx, imp }) => {
  console.log(`  Feature ${idx}: ${(imp * 100).toFixed(1)}%`);
});
```
### Example 2: Interpretable Text Classification
```typescript
import { SparseELM } from '@astermind/astermind-pro';
// Create interpretable model for text classification
const sparseElm = new SparseELM({
  categories: ['spam', 'ham'],
  hiddenUnits: 128,
  useTokenizer: true,
  maxLen: 200,
  regularization: {
    type: 'elastic', // Elastic net for balanced selection
    lambda: 0.001,
    alpha: 0.5,
  },
  sparsityTarget: 0.5, // 50% sparsity for interpretability
});
// Training data
const emails = [
  'Win a free prize! Click now!',
  'Meeting scheduled for tomorrow',
  // ... more emails
];
const labels = [0, 1, ...]; // spam, ham
sparseElm.train(emails, labels);
// Get feature importance to understand what words matter
const importance = sparseElm.getFeatureImportance();
// In practice, map feature indices back to words/tokens
console.log('Most important features for classification:', importance);
```
### Example 3: Efficient Model for Production
```typescript
import { SparseELM } from '@astermind/astermind-pro';
// Create efficient sparse model for production
const sparseElm = new SparseELM({
  categories: ['cat', 'dog', 'bird'],
  hiddenUnits: 512,
  regularization: {
    type: 'l2',
    lambda: 0.01, // Strong regularization
  },
  sparsityTarget: 0.8, // 80% sparsity for efficiency
  pruneThreshold: 1e-5,
});
// Train
sparseElm.train(X, y);
// Model is now efficient:
// - 80% of weights are zero (faster inference)
// - Smaller memory footprint
// - Still maintains good accuracy
const stats = sparseElm.getSparsityStats();
console.log(`Model efficiency: ${(stats.sparsity * 100).toFixed(1)}% sparse`);
console.log(`Memory saved: ~${(stats.sparsity * 100).toFixed(1)}%`);
// Fast prediction with sparse model
const prediction = sparseElm.predict(newSample, 1);
```
### Example 4: Feature Selection for High-Dimensional Data
```typescript
import { SparseELM } from '@astermind/astermind-pro';
// Use Sparse ELM to identify important features
const featureSelector = new SparseELM({
  categories: ['disease', 'healthy'],
  hiddenUnits: 256,
  regularization: {
    type: 'l1', // L1 encourages sparsity
    lambda: 0.1, // Strong L1 for feature selection
  },
  sparsityTarget: 0.9, // Very sparse
});
// Medical data with many features (genes, biomarkers, etc.)
const medicalData = [
  [0.1, 0.0, 0.0, 0.3, 0.0, 0.0, 0.2, ...], // 1000+ features
  [0.0, 0.2, 0.0, 0.0, 0.1, 0.0, 0.0, ...],
  // ... more samples
];
const diagnoses = [0, 1, ...]; // disease, healthy
featureSelector.train(medicalData, diagnoses);
// Identify important biomarkers/features
const importance = featureSelector.getFeatureImportance();
const importantFeatures = importance
  .map((imp, idx) => ({ feature: idx, importance: imp }))
  .filter(f => f.importance > 0.1) // Threshold
  .sort((a, b) => b.importance - a.importance);
console.log('Important features for diagnosis:');
importantFeatures.forEach(({ feature, importance }) => {
  console.log(`  Feature ${feature}: ${(importance * 100).toFixed(1)}%`);
});
```
---
## Comparison: When to Use Each Variant
### Use Multi-Kernel ELM when:
- You have heterogeneous data patterns
- Single kernel doesn't capture all patterns
- You want automatic kernel weight optimization
- **Example**: Text classification with both linear and non-linear patterns
### Use DeepELMPro when:
- You need deep feature hierarchies
- Base DeepELM overfits or doesn't converge
- You want better generalization
- You have complex pattern recognition tasks
- **Example**: Document classification with multiple abstraction levels
### Use Online Kernel ELM when:
- You have streaming/real-time data
- Data distribution changes over time (concept drift)
- You need incremental learning
- **Example**: Real-time anomaly detection, sensor monitoring
### Use Multi-Task ELM when:
- You have multiple related classification tasks
- Tasks share underlying features
- You want to leverage task relationships
- **Example**: Sentiment + topic classification, multi-label problems
### Use Sparse ELM when:
- You have high-dimensional data (1000+ features)
- You need interpretability (feature importance)
- You want efficient models (memory/speed)
- You need feature selection
- **Example**: Gene expression analysis, text with many features
---
## Combining Variants
You can combine multiple variants for even better performance:
### Example: Combining Sparse ELM and Multi-Task ELM
```typescript
import { MultiTaskELM, SparseELM } from '@astermind/astermind-pro';
// A single sparse multi-task model would require a custom implementation,
// but the variants compose naturally in sequence:
// 1. Train a SparseELM and keep the features it rates as important
selector.train(X, y); // selector: a SparseELM configured with L1 regularization
const keep = selector
  .getFeatureImportance()
  .map((imp, idx) => ({ imp, idx }))
  .filter(f => f.imp > 0.05) // illustrative threshold
  .map(f => f.idx);
// 2. Project samples onto those features, then train a MultiTaskELM on them
mtElm.train(X.map(row => keep.map(i => row[i])), yTaskData);
```
---
## Performance Tips
1. **Multi-Kernel ELM**: Start with 2-3 kernels, let it learn weights automatically
2. **DeepELMPro**: Use pretraining for large datasets, skip for small datasets
3. **Online Kernel ELM**: Adjust `decayFactor` based on how fast your data changes
4. **Multi-Task ELM**: Balance task weights based on importance
5. **Sparse ELM**: Use L1 for feature selection, L2 for general regularization
---
## Next Steps
- See [DEVELOPER_GUIDE.md](../guides/DEVELOPER_GUIDE.md) for advanced patterns
- Check [EXAMPLES.md](../guides/EXAMPLES.md) for integration examples
- Review [PREMIUM_FEATURES.md](./PREMIUM_FEATURES.md) for complete feature list