@vectorchat/mcp-server

VectorChat MCP Server - Encrypted AI-to-AI communication with hardware security (YubiKey/TPM). 45+ MCP tools for Windsurf, Claude, and AI assistants. Model-based identity with EMDM encryption. Dynamic AI playbook system, communication zones, message relay

# Session Management & Context Compression

## Overview

Comprehensive session management system with context awareness, token tracking, and automatic compression.

---

## Features

### 1. Session Management

**What is a Session?**

- A conversation with a specific peer using a specific AI model
- Maintains context across messages
- Tracks token usage
- Automatically compresses when the window is exhausted

**Session Structure:**

```dart
class ChatSession {
  String peer;                      // Who you're chatting with
  String modelId;                   // Which AI model
  List<ChatMessage> messages;       // Full conversation history
  List<ChatMessage> contextWindow;  // Current context (compressed)
  int tokenCount;                   // Current tokens used
  int maxTokens;                    // Model's context window size
  DateTime created;
  DateTime lastActive;
  String? compressedHistory;        // Older messages (compressed)
}
```

---

### 2. Token Window Tracking

**Display in UI:**

```
┌─────────────────────────────────────┐
│ Chat with Allicia                   │
│ Context: 1024/2048 tokens (50%)     │
│ [████████████░░░░░░░░░░░░░░]        │
└─────────────────────────────────────┘
```

**Color Coding:**

- 🟢 Green: 0-70% (plenty of space)
- 🟡 Yellow: 70-90% (getting full)
- 🔴 Red: 90-100% (compression needed)

**Implementation:**

```dart
class TokenTracker {
  int currentTokens = 0;
  int maxTokens = 2048;

  double get percentage => currentTokens / maxTokens;

  Color get statusColor {
    if (percentage < 0.7) return Colors.green;
    if (percentage < 0.9) return Colors.yellow;
    return Colors.red;
  }

  bool get needsCompression => percentage > 0.9;
}
```

---

### 3. Context Compression

**When to Compress:**

- Token usage > 90% of window
- User explicitly requests compression
- Switching to a new topic

**Compression Strategy:**

```
Original Context (2000 tokens):
  Message 1: "Hello"
  Message 2: "How are you?"
  Message 3: "Tell me about quantum physics"
  Message 4: [Long explanation - 1500 tokens]
  Message 5: "Thanks, that was helpful"
  Message 6: "Now tell me about AI"

After Compression (500 tokens):
  Compressed Summary: "User asked about quantum physics.
    AI explained [key points]. User satisfied."

  Recent Context (kept):
  Message 6: "Now tell me about AI"

New window: 500 tokens used, 1548 available
```

**Compression Algorithm:**

```dart
class ContextCompressor {
  Future<String> compress(List<ChatMessage> messages) async {
    // Use AI to summarize older messages
    final summary = await _aiSummarize(messages);

    // Store compressed version
    return summary;
  }

  Future<String> _aiSummarize(List<ChatMessage> messages) async {
    // Send to AI: "Summarize this conversation in 100 tokens"
    final prompt = '''
Summarize this conversation, keeping key facts and context:

${messages.map((m) => '${m.sender}: ${m.content}').join('\n')}

Summary (max 100 tokens):
''';

    return await aiModel.generate(prompt);
  }
}
```

---

### 4. Model Switcher with "More..."
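The core rule of this section — a model can be swapped in place only when it maps to the same identity fingerprint as the current one — is independent of the Flutter UI below. A minimal sketch (TypeScript for brevity; the registry contents and fingerprint values are purely illustrative, not from the source):

```typescript
// Quick-switch decision: same fingerprint → switch in place,
// different fingerprint → full identity setup is required.
// The registry and its fingerprints are hypothetical examples.
type Fingerprint = string;

const modelRegistry = new Map<string, Fingerprint>([
  ["gpt2", "fp-aaaa"],
  ["tinyllama", "fp-aaaa"], // shares an identity with gpt2 in this example
  ["phi2", "fp-bbbb"],
]);

function canQuickSwitch(current: string, next: string): boolean {
  const a = modelRegistry.get(current);
  const b = modelRegistry.get(next);
  // Unknown models never qualify for a quick switch
  return a !== undefined && a === b;
}

console.log(canQuickSwitch("gpt2", "tinyllama")); // true  (same fingerprint)
console.log(canQuickSwitch("gpt2", "phi2"));      // false (different identity)
```

An unknown model name falls through to `false`, which matches the safe default of forcing the full identity-setup path.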
**Quick Switcher:**

```dart
PopupMenuButton<String>(
  icon: Icon(Icons.psychology),
  tooltip: 'Change AI Model',
  onSelected: (model) {
    if (model == 'more') {
      _showFullModelSelection();
    } else {
      _quickSwitchModel(model);
    }
  },
  itemBuilder: (context) => [
    PopupMenuItem(value: 'gpt2', child: Text('GPT-2 (Fast)')),
    PopupMenuItem(value: 'tinyllama', child: Text('TinyLlama')),
    PopupMenuItem(value: 'phi2', child: Text('Phi-2')),
    PopupMenuDivider(),
    PopupMenuItem(
      value: 'more',
      child: Row(
        children: [
          Icon(Icons.more_horiz, size: 20),
          SizedBox(width: 8),
          Text('More Models...'),
        ],
      ),
    ),
  ],
)
```

**Quick Switch (Same Identity):**

```dart
void _quickSwitchModel(String model) {
  // Can only quick-switch if the model uses the same identity
  if (_canQuickSwitch(model)) {
    setState(() => _currentModel = model);
    _wsService.send({'type': 'change_model', 'model_name': model});
  } else {
    // Different identity required
    _showIdentityRequiredDialog(model);
  }
}

bool _canQuickSwitch(String model) {
  // Check if the model uses the same fingerprint
  final currentFingerprint = _identityManager.fingerprint;
  final newFingerprint = _getModelFingerprint(model);
  return currentFingerprint == newFingerprint;
}
```

**Full Selection (New Identity):**

```dart
void _showFullModelSelection() async {
  final result = await Navigator.push(
    context,
    MaterialPageRoute(
      builder: (context) => ModelSelectionScreen(isFirstRun: false),
    ),
  );

  if (result != null) {
    // User selected a new model;
    // must go through identity setup
    await _setupIdentityForModel(result);
  }
}

Future<void> _setupIdentityForModel(String modelPath) async {
  final result = await Navigator.push(
    context,
    MaterialPageRoute(
      builder: (context) => IdentitySetupScreen(
        modelPath: modelPath,
        isFirstRun: false,
      ),
    ),
  );

  if (result != null) {
    // Identity created, restart app to apply
    _showRestartDialog();
  }
}
```

---

### 5. Session Persistence

**Save Session:**

```dart
class SessionManager {
  Future<void> saveSession(ChatSession session) async {
    final sessionData = {
      'peer': session.peer,
      'model_id': session.modelId,
      'messages': session.messages.map((m) => m.toJson()).toList(),
      'context_window':
          session.contextWindow.map((m) => m.toJson()).toList(),
      'token_count': session.tokenCount,
      'max_tokens': session.maxTokens,
      'compressed_history': session.compressedHistory,
      'created': session.created.toIso8601String(),
      'last_active': session.lastActive.toIso8601String(),
    };

    // Encrypt with EMDM V2
    final encrypted = await _emdmEncoder.encode(
      jsonEncode(sessionData),
      _identityManager.currentIdentity,
    );

    // Save to disk
    final file = File(_getSessionPath(session.peer, session.modelId));
    await file.writeAsBytes(encrypted);
  }

  Future<ChatSession?> loadSession(String peer, String modelId) async {
    final file = File(_getSessionPath(peer, modelId));
    if (!await file.exists()) return null;

    // Decrypt with EMDM V2
    final encrypted = await file.readAsBytes();
    final decrypted = await _emdmEncoder.decode(
      encrypted,
      _identityManager.currentIdentity,
    );

    final data = jsonDecode(decrypted);
    return ChatSession.fromJson(data);
  }
}
```

---

### 6. UI Components
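The widgets in this section display `currentTokens`, but the document never shows how tokens are counted (`_calculateTokens` is referenced without an implementation). A rough character-based estimate — about 4 characters per token for English text, a common heuristic rather than anything the source specifies — could stand in; sketched here in TypeScript:

```typescript
// Rough token estimator: ~4 characters per token is a common rule of
// thumb for English text. Real counts depend on the model's tokenizer,
// so this is only an illustrative stand-in for _calculateTokens.
interface ChatMessage {
  sender: string;
  content: string;
}

function estimateTokens(messages: ChatMessage[]): number {
  const chars = messages.reduce((n, m) => n + m.content.length, 0);
  return Math.ceil(chars / 4);
}

const history: ChatMessage[] = [
  { sender: "user", content: "Tell me about quantum physics" },       // 29 chars
  { sender: "ai", content: "It is the physics of the very small." },  // 36 chars
];

console.log(estimateTokens(history)); // ceil(65 / 4) → 17
```

An estimator like this is cheap enough to run on every message, which is what the progress bar below needs; swapping in the model's real tokenizer only changes this one function.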
**Token Window Display:**

```dart
Widget _buildTokenWindow() {
  final tracker = _sessionManager.currentSession?.tokenTracker;
  if (tracker == null) return SizedBox.shrink();

  return Container(
    padding: EdgeInsets.symmetric(horizontal: 16, vertical: 8),
    child: Column(
      crossAxisAlignment: CrossAxisAlignment.start,
      children: [
        Row(
          mainAxisAlignment: MainAxisAlignment.spaceBetween,
          children: [
            Text(
              'Context Window',
              style: TextStyle(fontSize: 12, fontWeight: FontWeight.bold),
            ),
            Text(
              '${tracker.currentTokens}/${tracker.maxTokens}',
              style: TextStyle(
                fontSize: 12,
                color: tracker.statusColor,
                fontWeight: FontWeight.bold,
              ),
            ),
          ],
        ),
        SizedBox(height: 4),
        LinearProgressIndicator(
          value: tracker.percentage,
          backgroundColor: Colors.grey[300],
          valueColor: AlwaysStoppedAnimation<Color>(tracker.statusColor),
        ),
        if (tracker.needsCompression)
          Padding(
            padding: EdgeInsets.only(top: 4),
            child: Row(
              children: [
                Icon(Icons.warning, size: 14, color: Colors.orange),
                SizedBox(width: 4),
                Text(
                  'Context window full - compression recommended',
                  style: TextStyle(fontSize: 10, color: Colors.orange),
                ),
                Spacer(),
                TextButton(
                  onPressed: _compressContext,
                  child: Text('Compress', style: TextStyle(fontSize: 10)),
                ),
              ],
            ),
          ),
      ],
    ),
  );
}
```

**Session Selector:**

```dart
Widget _buildSessionSelector() {
  // Nullable type parameter so the "New Session" item can use a null value
  return PopupMenuButton<ChatSession?>(
    icon: Icon(Icons.history),
    tooltip: 'Session History',
    onSelected: (session) => _loadSession(session),
    itemBuilder: (context) {
      final sessions = _sessionManager.getSessions(_selectedPeer);
      return [
        ...sessions.map((session) => PopupMenuItem(
              value: session,
              child: ListTile(
                leading: Icon(Icons.chat_bubble_outline),
                title: Text(session.modelId),
                subtitle: Text(
                  '${session.messages.length} messages • '
                  '${session.tokenCount}/${session.maxTokens} tokens',
                ),
                trailing: Text(
                  _formatDate(session.lastActive),
                  style: TextStyle(fontSize: 10),
                ),
              ),
            )),
        PopupMenuDivider(),
        PopupMenuItem(
          value: null, // null signals "start a new session"
          child: ListTile(
            leading: Icon(Icons.add),
            title: Text('New Session'),
          ),
        ),
      ];
    },
  );
}
```

---

### 7. Automatic Compression

**Background Compression:**

```dart
class AutoCompressor {
  Timer? _compressionTimer;

  void startMonitoring(ChatSession session) {
    _compressionTimer = Timer.periodic(
      Duration(seconds: 30),
      (_) => _checkAndCompress(session),
    );
  }

  Future<void> _checkAndCompress(ChatSession session) async {
    if (session.tokenTracker.needsCompression) {
      print('🔄 Auto-compressing context...');

      // Keep the last 5 messages (Dart's List has no takeLast)
      final keep = session.messages.length - 5;
      final recentMessages = session.messages.sublist(keep);
      final oldMessages = session.messages.take(keep).toList();

      // Compress old messages
      final compressed = await _compressor.compress(oldMessages);

      // Update session. The summary is a String, not a ChatMessage,
      // so it needs a string-based counter (assumed defined elsewhere).
      session.compressedHistory = compressed;
      session.contextWindow = recentMessages;
      session.tokenCount = _calculateTokens(recentMessages) +
          _estimateStringTokens(compressed);

      await _sessionManager.saveSession(session);

      print('✅ Context compressed: ${oldMessages.length} messages → summary');
      print('   Tokens: ${session.tokenCount}/${session.maxTokens}');
    }
  }
}
```

---

## Implementation Priority

### Phase 1 (Immediate)

1. ✅ Add "More..." to model switcher
2. ✅ Token window display
3. ✅ Basic session persistence

### Phase 2 (Next)

4. ✅ Context compression
5. ✅ Session selector UI
6. ✅ Auto-compression

### Phase 3 (Future)

7. ✅ Multi-session management
8. ✅ Session export/import
9. ✅ Advanced compression strategies

---

## Benefits

- ✅ **Better UX** - Users see token usage
- ✅ **No context loss** - Compression preserves key info
- ✅ **Model flexibility** - Easy switching with proper identity
- ✅ **Session continuity** - Pick up where you left off
- ✅ **Efficient** - Automatic compression when needed

---

**This is how professional AI chat applications work!** 🎯
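Putting sections 2, 3, and 7 together, the monitor-and-compress loop reduces to a few lines. A minimal, framework-free sketch (TypeScript for brevity; `maybeCompress` and the other names are illustrative, and `summarize` is stubbed where the real system would call the AI model):

```typescript
// Sketch of the auto-compression loop: the 90% threshold and the
// "keep the last 5 messages" rule come from the document above;
// everything else here is an illustrative simplification.
interface Session {
  messages: string[];
  compressedHistory: string | null;
  tokenCount: number;
  maxTokens: number;
}

// ~4 characters per token, as in the earlier heuristic
const estimate = (texts: string[]): number =>
  Math.ceil(texts.reduce((n, t) => n + t.length, 0) / 4);

function maybeCompress(s: Session, summarize: (old: string[]) => string): void {
  // Nothing to do below 90% usage, or with too few messages to split
  if (s.tokenCount / s.maxTokens <= 0.9 || s.messages.length <= 5) return;

  const old = s.messages.slice(0, -5);
  s.messages = s.messages.slice(-5); // keep the 5 most recent messages
  s.compressedHistory = summarize(old);
  s.tokenCount = estimate(s.messages) + estimate([s.compressedHistory]);
}
```

Running this on a timer (as `AutoCompressor` does with `Timer.periodic`) keeps the window bounded: each pass replaces an arbitrarily long prefix of the history with one fixed-cost summary.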