@vectorchat/mcp-server

VectorChat MCP Server: encrypted AI-to-AI communication with hardware security (YubiKey/TPM). 45+ MCP tools for Windsurf, Claude, and AI assistants. Model-based identity with EMDM encryption. Dynamic AI playbook system, communication zones, and message relay.
# Session Management & Context Compression
## Overview
Comprehensive session management system with context awareness, token tracking, and automatic compression.
---
## Features
### 1. Session Management
**What is a Session?**
- A conversation with a specific peer using a specific AI model
- Maintains context across messages
- Tracks token usage
- Automatically compresses when the context window is exhausted
**Session Structure:**
```dart
class ChatSession {
  String peer;                      // Who you're chatting with
  String modelId;                   // Which AI model
  List<ChatMessage> messages;       // Full conversation history
  List<ChatMessage> contextWindow;  // Current context (compressed)
  int tokenCount;                   // Current tokens used
  int maxTokens;                    // Model's context window size
  DateTime created;
  DateTime lastActive;
  String? compressedHistory;        // Older messages (compressed)
}
```
---
### 2. Token Window Tracking
**Display in UI:**
```
┌─────────────────────────────────────┐
│ Chat with Allicia │
│ Context: 1024/2048 tokens (50%) │
│ [████████████░░░░░░░░░░░░░░] │
└─────────────────────────────────────┘
```
**Color Coding:**
- 🟢 Green: below 70% (plenty of space)
- 🟡 Yellow: 70-90% (getting full)
- 🔴 Red: above 90% (compression needed)
**Implementation:**
```dart
class TokenTracker {
  int currentTokens = 0;
  int maxTokens = 2048;

  double get percentage => currentTokens / maxTokens;

  Color get statusColor {
    if (percentage < 0.7) return Colors.green;
    if (percentage < 0.9) return Colors.yellow;
    return Colors.red;
  }

  bool get needsCompression => percentage > 0.9;
}
```
---
### 3. Context Compression
**When to Compress:**
- Token usage > 90% of window
- User explicitly requests compression
- Switching to a new topic
**Compression Strategy:**
```
Original Context (2000 tokens):
Message 1: "Hello"
Message 2: "How are you?"
Message 3: "Tell me about quantum physics"
Message 4: [Long explanation - 1500 tokens]
Message 5: "Thanks, that was helpful"
Message 6: "Now tell me about AI"
After Compression (500 tokens):
Compressed Summary: "User asked about quantum physics.
AI explained [key points]. User satisfied."
Recent Context (kept):
Message 6: "Now tell me about AI"
New window: 500 tokens used, 1548 available
```
**Compression Algorithm:**
```dart
class ContextCompressor {
  Future<String> compress(List<ChatMessage> messages) async {
    // Use AI to summarize older messages
    final summary = await _aiSummarize(messages);
    // Store the compressed version
    return summary;
  }

  Future<String> _aiSummarize(List<ChatMessage> messages) async {
    // Ask the AI for a short summary that preserves key facts
    final prompt = '''
Summarize this conversation, keeping key facts and context:

${messages.map((m) => '${m.sender}: ${m.content}').join('\n')}

Summary (max 100 tokens):
''';
    return await aiModel.generate(prompt);
  }
}
```
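One detail the strategy above leaves implicit is how the summary re-enters the context window. A minimal sketch of one way to do it; the synthetic `system` sender and the simplified `ChatMessage` constructor shape are assumptions for illustration, not part of the source:

```dart
// Minimal stand-in for the ChatMessage type from the session structure.
class ChatMessage {
  final String sender;
  final String content;
  ChatMessage({required this.sender, required this.content});
}

// Fold a compressed summary back into the context window: the summary
// becomes a synthetic leading message, followed by the recent messages
// that were kept verbatim.
List<ChatMessage> rebuildWindow(String summary, List<ChatMessage> recent) => [
      ChatMessage(sender: 'system', content: 'Earlier conversation: $summary'),
      ...recent,
    ];

void main() {
  final window = rebuildWindow(
    'User asked about quantum physics; AI explained key points.',
    [ChatMessage(sender: 'peer', content: 'Now tell me about AI')],
  );
  print(window.length); // 2
}
```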
---
### 4. Model Switcher with "More..."
**Quick Switcher:**
```dart
PopupMenuButton<String>(
  icon: Icon(Icons.psychology),
  tooltip: 'Change AI Model',
  onSelected: (model) {
    if (model == 'more') {
      _showFullModelSelection();
    } else {
      _quickSwitchModel(model);
    }
  },
  itemBuilder: (context) => [
    PopupMenuItem(value: 'gpt2', child: Text('GPT-2 (Fast)')),
    PopupMenuItem(value: 'tinyllama', child: Text('TinyLlama')),
    PopupMenuItem(value: 'phi2', child: Text('Phi-2')),
    PopupMenuDivider(),
    PopupMenuItem(
      value: 'more',
      child: Row(
        children: [
          Icon(Icons.more_horiz, size: 20),
          SizedBox(width: 8),
          Text('More Models...'),
        ],
      ),
    ),
  ],
)
```
**Quick Switch (Same Identity):**
```dart
void _quickSwitchModel(String model) {
  // Quick-switching is only allowed if the model uses the same identity
  if (_canQuickSwitch(model)) {
    setState(() => _currentModel = model);
    _wsService.send({'type': 'change_model', 'model_name': model});
  } else {
    // A different identity is required
    _showIdentityRequiredDialog(model);
  }
}

bool _canQuickSwitch(String model) {
  // Check if the model uses the same fingerprint
  final currentFingerprint = _identityManager.fingerprint;
  final newFingerprint = _getModelFingerprint(model);
  return currentFingerprint == newFingerprint;
}
```
**Full Selection (New Identity):**
```dart
void _showFullModelSelection() async {
  final result = await Navigator.push(
    context,
    MaterialPageRoute(
      builder: (context) => ModelSelectionScreen(isFirstRun: false),
    ),
  );

  if (result != null) {
    // User selected a new model; must go through identity setup
    await _setupIdentityForModel(result);
  }
}

Future<void> _setupIdentityForModel(String modelPath) async {
  final result = await Navigator.push(
    context,
    MaterialPageRoute(
      builder: (context) => IdentitySetupScreen(
        modelPath: modelPath,
        isFirstRun: false,
      ),
    ),
  );

  if (result != null) {
    // Identity created; restart the app to apply it
    _showRestartDialog();
  }
}
```
---
### 5. Session Persistence
**Save Session:**
```dart
class SessionManager {
  Future<void> saveSession(ChatSession session) async {
    final sessionData = {
      'peer': session.peer,
      'model_id': session.modelId,
      'messages': session.messages.map((m) => m.toJson()).toList(),
      'context_window': session.contextWindow.map((m) => m.toJson()).toList(),
      'token_count': session.tokenCount,
      'max_tokens': session.maxTokens,
      'compressed_history': session.compressedHistory,
      'created': session.created.toIso8601String(),
      'last_active': session.lastActive.toIso8601String(),
    };

    // Encrypt with EMDM V2
    final encrypted = await _emdmEncoder.encode(
      jsonEncode(sessionData),
      _identityManager.currentIdentity,
    );

    // Save to disk
    final file = File(_getSessionPath(session.peer, session.modelId));
    await file.writeAsBytes(encrypted);
  }

  // Nullable return type: null means no saved session exists for this pair
  Future<ChatSession?> loadSession(String peer, String modelId) async {
    final file = File(_getSessionPath(peer, modelId));
    if (!await file.exists()) return null;

    // Decrypt with EMDM V2
    final encrypted = await file.readAsBytes();
    final decrypted = await _emdmEncoder.decode(
      encrypted,
      _identityManager.currentIdentity,
    );

    final data = jsonDecode(decrypted);
    return ChatSession.fromJson(data);
  }
}
```
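`_getSessionPath` is referenced in both `saveSession` and `loadSession` but never shown. A minimal sketch, assuming one encrypted file per (peer, model) pair; the directory layout and sanitization rule are illustrative only:

```dart
// Hypothetical sketch of _getSessionPath: one session file per
// (peer, model) pair under a sessions/ directory.
String getSessionPath(String peer, String modelId, String baseDir) {
  // Sanitize components so peer names can't escape the sessions directory
  String safe(String s) => s.replaceAll(RegExp(r'[^A-Za-z0-9_-]'), '_');
  return '$baseDir/sessions/${safe(peer)}_${safe(modelId)}.session';
}

void main() {
  print(getSessionPath('alice@host', 'gpt2', '/data'));
  // /data/sessions/alice_host_gpt2.session
}
```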
---
### 6. UI Components
**Token Window Display:**
```dart
Widget _buildTokenWindow() {
  final tracker = _sessionManager.currentSession?.tokenTracker;
  if (tracker == null) return SizedBox.shrink();

  return Container(
    padding: EdgeInsets.symmetric(horizontal: 16, vertical: 8),
    child: Column(
      crossAxisAlignment: CrossAxisAlignment.start,
      children: [
        Row(
          mainAxisAlignment: MainAxisAlignment.spaceBetween,
          children: [
            Text(
              'Context Window',
              style: TextStyle(fontSize: 12, fontWeight: FontWeight.bold),
            ),
            Text(
              '${tracker.currentTokens}/${tracker.maxTokens}',
              style: TextStyle(
                fontSize: 12,
                color: tracker.statusColor,
                fontWeight: FontWeight.bold,
              ),
            ),
          ],
        ),
        SizedBox(height: 4),
        LinearProgressIndicator(
          value: tracker.percentage,
          backgroundColor: Colors.grey[300],
          valueColor: AlwaysStoppedAnimation<Color>(tracker.statusColor),
        ),
        if (tracker.needsCompression)
          Padding(
            padding: EdgeInsets.only(top: 4),
            child: Row(
              children: [
                Icon(Icons.warning, size: 14, color: Colors.orange),
                SizedBox(width: 4),
                Text(
                  'Context window full - compression recommended',
                  style: TextStyle(fontSize: 10, color: Colors.orange),
                ),
                Spacer(),
                TextButton(
                  onPressed: _compressContext,
                  child: Text('Compress', style: TextStyle(fontSize: 10)),
                ),
              ],
            ),
          ),
      ],
    ),
  );
}
```
**Session Selector:**
```dart
Widget _buildSessionSelector() {
  return PopupMenuButton<ChatSession>(
    icon: Icon(Icons.history),
    tooltip: 'Session History',
    onSelected: (session) => _loadSession(session),
    itemBuilder: (context) {
      final sessions = _sessionManager.getSessions(_selectedPeer);
      return [
        ...sessions.map((session) => PopupMenuItem(
              value: session,
              child: ListTile(
                leading: Icon(Icons.chat_bubble_outline),
                title: Text(session.modelId),
                subtitle: Text(
                  '${session.messages.length} messages • '
                  '${session.tokenCount}/${session.maxTokens} tokens',
                ),
                trailing: Text(
                  _formatDate(session.lastActive),
                  style: TextStyle(fontSize: 10),
                ),
              ),
            )),
        PopupMenuDivider(),
        PopupMenuItem(
          // A null value never reaches onSelected, so handle
          // "New Session" with the item's own onTap callback
          onTap: _startNewSession,
          child: ListTile(
            leading: Icon(Icons.add),
            title: Text('New Session'),
          ),
        ),
      ];
    },
  );
}
```
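`_formatDate` is used in the selector above but not defined anywhere in this document. A plausible sketch; the relative "time ago" format is an assumption:

```dart
// Hypothetical sketch of _formatDate: relative timestamps for recent
// sessions, falling back to a plain date for anything older than a day.
String formatDate(DateTime dt) {
  final diff = DateTime.now().difference(dt);
  if (diff.inMinutes < 1) return 'just now';
  if (diff.inHours < 1) return '${diff.inMinutes}m ago';
  if (diff.inDays < 1) return '${diff.inHours}h ago';
  return '${dt.year}-'
      '${dt.month.toString().padLeft(2, '0')}-'
      '${dt.day.toString().padLeft(2, '0')}';
}

void main() {
  print(formatDate(DateTime.now().subtract(Duration(minutes: 5)))); // 5m ago
}
```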
---
### 7. Automatic Compression
**Background Compression:**
```dart
class AutoCompressor {
  Timer? _compressionTimer;

  void startMonitoring(ChatSession session) {
    _compressionTimer = Timer.periodic(
      Duration(seconds: 30),
      (_) => _checkAndCompress(session),
    );
  }

  void stopMonitoring() => _compressionTimer?.cancel();

  Future<void> _checkAndCompress(ChatSession session) async {
    if (session.tokenTracker.needsCompression) {
      print('🔄 Auto-compressing context...');

      // Keep the last 5 messages; compress everything older
      // (Dart lists have no takeLast, so use skip/take)
      const keep = 5;
      final recentMessages =
          session.messages.skip(session.messages.length - keep).toList();
      final oldMessages =
          session.messages.take(session.messages.length - keep).toList();

      // Compress old messages into a summary
      final compressed = await _compressor.compress(oldMessages);

      // Update session: recent messages plus the summary's token cost
      session.compressedHistory = compressed;
      session.contextWindow = recentMessages;
      session.tokenCount =
          _calculateTokens(recentMessages) + _estimateTokens(compressed);

      await _sessionManager.saveSession(session);

      print('✅ Context compressed: ${oldMessages.length} messages → summary');
      print('   Tokens: ${session.tokenCount}/${session.maxTokens}');
    }
  }
}
```
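The token counts above rely on helpers the document never shows. A rough sketch assuming the common ~4-characters-per-token heuristic; a real implementation should use the target model's tokenizer, and these function names are illustrative:

```dart
// Rough token estimators; ~4 characters per token is a common heuristic
// for English text, not an exact count.
int estimateTokens(String text) {
  if (text.isEmpty) return 0;
  // Round up so short strings count as at least 1 token
  return (text.length / 4).ceil();
}

int calculateTokens(List<String> messages) =>
    messages.fold(0, (sum, m) => sum + estimateTokens(m));

void main() {
  print(estimateTokens('Now tell me about AI')); // 20 chars -> 5 tokens
  print(calculateTokens(['Hello', 'How are you?'])); // 2 + 3 = 5
}
```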
---
## Implementation Priority
### Phase 1 (Immediate)
1. Add "More..." to model switcher
2. Token window display
3. Basic session persistence
### Phase 2 (Next)
4. Context compression
5. Session selector UI
6. Auto-compression
### Phase 3 (Future)
7. Multi-session management
8. Session export/import
9. Advanced compression strategies
---
## Benefits
✅ **Better UX** - Users see token usage
✅ **No context loss** - Compression preserves key info
✅ **Model flexibility** - Easy switching with proper identity
✅ **Session continuity** - Pick up where you left off
✅ **Efficient** - Automatic compression when needed
---
**This is how professional AI chat applications work!** 🎯