claude-coordination-system
🤖 Multi-Claude Parallel Processing Coordination System - Organize multiple Claude AI instances to work together seamlessly on complex development tasks
# 🤖 CLAUDE COORDINATION SYSTEM - COMPLETE OPERATIONS GUIDE
**⚠️ IMPORTANT**: This file belongs to the `claude-coordination-system` package. All instructions here are specific to managing Claude AI workers through the coordination system.
## 📋 TABLE OF CONTENTS
1. [System Architecture](#system-architecture)
2. [Configuration Files](#configuration-files)
3. [Coordinator Operations](#coordinator-operations)
4. [Worker Management](#worker-management)
5. [Group Management](#group-management)
6. [Troubleshooting](#troubleshooting)
7. [Safety Protocols](#safety-protocols)
8. [Advanced Operations](#advanced-operations)
9. [Quick Reference](#quick-reference)
## 🏗️ SYSTEM ARCHITECTURE
### Core Components
- **Coordinator**: Central management system (1 per project)
- **Workers**: AI processing units (multiple per project)
- **Groups**: Work categorization (TypeScript, UI, Deploy, etc.)
- **Dashboard**: Web monitoring interface (port 7777 by default; this guide uses 7778 as the alternate port)
### File Structure
```
project/
├── claude-coord.json # Main configuration file
├── .claude-coord/ # Coordination directory
│ ├── system-state.json # Live system state
│ ├── workers/ # Worker state files
│ └── locks/ # File lock management
└── DEVELOPMENT.log # System activity log
```
## ⚙️ CONFIGURATION FILES
### 1. Main Configuration: `claude-coord.json`
**Location**: Project root directory
**Purpose**: Defines groups, tasks, dependencies, and system settings
**Structure**:
```json
{
"project": {
"name": "ProjectName",
"type": "nextjs",
"coordination": "file-based"
},
"groups": {
"TYPESCRIPT": {
"name": "TypeScript & Build System",
"priority": 1,
"files": ["**/*.ts", "**/*.tsx", "tsconfig.json"],
"dependencies": [],
"tasks": [
"Fix TypeScript compilation errors",
"Add type definitions",
"Update TypeScript configuration"
]
}
},
"limits": {
"maxWorkers": 6,
"memoryPerWorker": 256
}
}
```
### 2. System State: `.claude-coord/system-state.json`
**Location**: `.claude-coord/` directory
**Purpose**: Live tracking of coordinator and workers
**⚠️ WARNING**: Never edit manually - system managed
### 3. Configuration Commands
```bash
# View all groups
claude-coord list-groups
# Set up rules interactively
claude-coord setup-rules -i
# Initialize project
claude-coord init
# Show configuration
claude-coord config show
```
## 🖥️ COORDINATOR OPERATIONS
### Starting Coordinator
```bash
# Standard start (port 7777)
claude-coord start
# Custom port
claude-coord start --port=7778
# Development mode
claude-coord start --mode=dev
# Background mode
claude-coord start &
```
### Coordinator Status
```bash
# Check if running
ps aux | grep "claude-coord\|cli.js start"
# View dashboard
open http://localhost:7777
# Monitor logs
tail -f DEVELOPMENT.log
# System health
claude-coord status
```
### Stopping Coordinator
```bash
# Graceful stop (if interactive)
# Press 'Q' in coordinator terminal
# Force stop (find process ID first)
ps aux | grep "cli.js start"
kill <PID>
# Stop all claude processes
# (pkill patterns are extended regular expressions, so alternation is a plain "|")
pkill -f "claude-coord|claude-worker"
```
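The force-stop path above can be made gentler by giving the process time to exit before escalating. The following is a minimal sketch and not part of the package: `stop_gracefully` is a hypothetical helper name, and it assumes the coordinator shuts down cleanly on `SIGTERM`, which this guide does not confirm.

```bash
# stop_gracefully PID [TIMEOUT_SECONDS]
# Sends SIGTERM, waits up to TIMEOUT_SECONDS for the process to exit,
# then falls back to SIGKILL. Returns 0 once the process is gone.
stop_gracefully() {
  pid=$1
  timeout=${2:-10}

  # Nothing to do if the process is not running.
  kill -0 "$pid" 2>/dev/null || return 0

  # Polite request first (SIGTERM is the default signal).
  kill "$pid" 2>/dev/null

  # Poll once per second until the process exits or the timeout runs out.
  while [ "$timeout" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0
    sleep 1
    timeout=$((timeout - 1))
  done

  # Last resort: SIGKILL cannot be caught, so no cleanup will run.
  kill -9 "$pid" 2>/dev/null
  sleep 1
  ! kill -0 "$pid" 2>/dev/null
}
```

For example, `stop_gracefully <PID> 15` waits up to 15 seconds before forcing the stop. Find the PID with the `ps` command shown above.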
## 👥 WORKER MANAGEMENT
### Worker Creation
**Standard Worker**:
```bash
# Create worker for specific group
claude-worker --id=claude_main --group=TYPESCRIPT --verbose
# Standby worker (no initial group)
claude-worker --id=claude_flex --standby
# Background worker
claude-worker --id=claude_bg --group=UI --verbose &
```
**Interactive Group Creation**:
When you specify an unknown group, the system will:
1. Show available groups
2. Ask to create new group
3. Suggest tasks based on group name
4. Configure group settings interactively
### Worker Status
```bash
# List all running workers
ps aux | grep claude-worker
# Check worker in dashboard
open http://localhost:7777
# Worker logs in terminal
# (visible in worker's own terminal)
```
### Worker Control
```bash
# Standby worker commands (in worker terminal):
# join <group> - Join specific group
# list - List available groups
# status - Show worker status
# quit - Exit worker
# From outside worker:
kill <worker_PID> # Stop specific worker
pkill -f "worker.js --id=worker_name" # Stop by name
```
## 🏷️ GROUP MANAGEMENT
### Understanding Groups
Groups categorize work by:
- **Technology** (TypeScript, React, Vue)
- **Domain** (UI, API, Database)
- **Process** (Deploy, Testing, Documentation)
### Viewing Groups
```bash
# List all configured groups
claude-coord list-groups
# Example output:
# TYPESCRIPT: TypeScript & Build System
# UI: UI Components & Pages (depends on: TYPESCRIPT, ESLINT)
# DEPLOY: Deployment & Production (depends on: TYPESCRIPT)
```
### Adding New Groups
**Method 1: Interactive CLI**
```bash
# Start interactive setup
claude-coord setup-rules -i
# Follow prompts for:
# - Group name and description
# - File patterns
# - Dependencies
# - Tasks
```
**Method 2: Manual Config Edit**
```bash
# 1. Edit claude-coord.json
vim claude-coord.json # or your preferred editor
# 2. Add group structure:
{
"groups": {
"DEPLOY": {
"name": "Deployment & Production",
"priority": 4,
"files": [
"vercel.json",
"next.config.*",
"package.json"
],
"dependencies": ["TYPESCRIPT", "ESLINT"],
"tasks": [
"Monitor deployment status",
"Fix production build errors",
"Handle deployment failures"
]
}
}
}
# 3. NO NEED TO RESTART - Config auto-reloads
```
**Method 3: Worker-Initiated Creation**
```bash
# When starting worker with unknown group:
claude-worker --id=claude_deploy --group=Deploy
# System will:
# 1. Detect unknown group "Deploy"
# 2. Show available groups
# 3. Ask to create new group
# 4. Suggest Deploy-related tasks automatically
# 5. Configure group interactively
# 6. Start worker in new group
```
### Group Dependencies
```json
{
"UI": {
"dependencies": ["TYPESCRIPT", "ESLINT"]
}
}
```
**Meaning**: UI workers won't start until TYPESCRIPT and ESLINT workers are active.
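Dependencies like these define a partial order, and the standard `tsort` utility can turn that into one valid start order. This is only an illustrative sketch: the pairs are typed by hand from the UI example, `ui_start_order` is a hypothetical name, and nothing here describes how the coordinator itself resolves dependencies.

```bash
# Each pair reads "PREREQUISITE DEPENDENT": the left-hand group must be
# active before the right-hand group starts. These pairs restate the UI
# example by hand; they are not read from claude-coord.json.
ui_start_order() {
  printf '%s\n' \
    'TYPESCRIPT UI' \
    'ESLINT UI' | tsort
}

# Prints one group per line, prerequisites first.
# tsort also warns on stderr if the pairs contain a cycle.
```

Calling `ui_start_order` lists TYPESCRIPT and ESLINT (in either order) before UI, which is the order in which to launch the corresponding workers.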
## 🔧 TROUBLESHOOTING
### Common Issues & Solutions
#### 1. "Address already in use" Error
```bash
# Problem: Port 7777 occupied
Error: listen EADDRINUSE: address already in use :::7777
# Solution A: Find and kill process
lsof -i :7777
kill <PID>
# Solution B: Use different port
claude-coord start --port=7778
```
#### 2. Worker Shows "Unknown" in Dashboard
```bash
# Problem: Worker not properly registered
Active Workers: Unknown WORKING Group: DEPLOY
# Diagnosis:
ps aux | grep claude-worker # Check if worker running
tail -f DEVELOPMENT.log # Check registration logs
# Solutions:
# 1. Restart worker only (NOT coordinator)
kill <worker_PID>
claude-worker --id=claude_deploy --group=DEPLOY --verbose
# 2. Check group exists
claude-coord list-groups
# 3. Verify coordination directory
ls -la .claude-coord/
```
#### 3. Configuration Not Loading
```bash
# Problem: Changes to claude-coord.json not appearing
# Check file location
find . -name "claude-coord.json" -type f
# Verify JSON syntax
node -e "console.log(JSON.parse(require('fs').readFileSync('claude-coord.json', 'utf8')))"
# Force reload (if needed)
# Usually auto-reloads, but if stuck:
# Restart coordinator ONLY if no workers running
pgrep -f claude-worker # Must print nothing (plain "ps aux | grep" always matches its own grep process)
claude-coord start
```
#### 4. Worker Stuck in Loop
```bash
# Check worker terminal output
# Look for repeated error messages
# Common causes:
# - Missing dependencies
# - Invalid group configuration
# - File permission issues
# - Memory limits exceeded
# Solution: Stop and restart with verbose
kill <worker_PID>
claude-worker --id=worker_name --group=GROUP_NAME --verbose
```
#### 5. Dashboard Not Loading
```bash
# Check coordinator status
ps aux | grep "cli.js start"
# Verify port
netstat -an | grep 7777
# Try alternative dashboard ports
open http://localhost:7777
open http://localhost:7778
# Check logs
tail -f DEVELOPMENT.log | grep "dashboard\|web"
```
## 🚨 SAFETY PROTOCOLS
### Critical Rules
1. **NEVER restart coordinator when workers are running**
```bash
# Always check first:
ps aux | grep claude-worker
# If workers found, stop them first:
pkill -f claude-worker
# Then restart coordinator:
claude-coord start
```
2. **NEVER edit .claude-coord/ files manually**
- System-managed directory
- Manual edits cause corruption
- Use CLI commands instead
3. **ALWAYS use background processes for workers**
```bash
# Correct:
claude-worker --id=worker1 --group=UI --verbose &
# Also correct (in Claude Code):
# Use run_in_background=true parameter
```
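Rule 1 can be turned into a guard that refuses to proceed while workers are alive. The sketch below is an assumption-laden illustration: `workers_running` and `safe_to_restart` are hypothetical helper names, and matching on the string `claude-worker` assumes worker command lines contain it, as the `ps` examples in this guide suggest.

```bash
# workers_running [PATTERN]
# Succeeds if any process command line matches PATTERN
# (default: claude-worker). pgrep never reports itself.
workers_running() {
  pgrep -f "${1:-claude-worker}" >/dev/null 2>&1
}

# safe_to_restart [PATTERN]
# Returns non-zero, with a message on stderr, while workers are alive.
safe_to_restart() {
  if workers_running "$1"; then
    echo "Workers still running; stop them before restarting the coordinator." >&2
    return 1
  fi
  echo "No workers found; coordinator restart is safe."
}
```

A typical use is `safe_to_restart && claude-coord start`, so the coordinator only starts when the check passes.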
### Pre-Flight Checklist
Before any major operation:
```bash
# 1. Check system status
ps aux | grep claude
lsof -i :7777
# 2. Backup configuration
cp claude-coord.json claude-coord.json.backup
# 3. Check coordination directory
ls -la .claude-coord/
```
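The backup step above always writes to the same file name, so each run overwrites the previous backup. A timestamped copy avoids that. This is a small sketch, and `backup_config` is a hypothetical helper name rather than a command provided by the package.

```bash
# backup_config [FILE]
# Copies FILE (default: claude-coord.json) to FILE.<UTC timestamp>.backup
# and prints the path of the backup that was written.
backup_config() {
  src=${1:-claude-coord.json}

  # Refuse to continue if there is nothing to back up.
  [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }

  # A UTC timestamp keeps earlier backups from being overwritten.
  dest="$src.$(date -u +%Y%m%dT%H%M%SZ).backup"

  # -p preserves the original modification time and permissions.
  cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

Note that the Emergency Recovery procedure restores from `claude-coord.json.backup`, so when using timestamped names, substitute the path printed by `backup_config` in that step.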
### Emergency Recovery
```bash
# If system corrupted:
# 1. Stop everything
pkill -f "claude-coord|claude-worker"
# 2. Clean coordination directory (discards all live state; only after step 1)
rm -rf .claude-coord/
# 3. Restore from backup (if available)
cp claude-coord.json.backup claude-coord.json
# 4. Restart fresh
claude-coord start
# 5. Restart workers one by one
claude-worker --id=worker1 --group=GROUP1 &
```
## 🚀 ADVANCED OPERATIONS
### Monitoring & Logging
```bash
# Real-time system monitoring
tail -f DEVELOPMENT.log
# Filter specific worker
tail -f DEVELOPMENT.log | grep "worker_name"
# Monitor coordination events
tail -f DEVELOPMENT.log | grep "coordinator\|registration\|heartbeat"
# Dashboard monitoring
open http://localhost:7777
# Check: Active Workers, System Status, Group Distribution
```
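For a quick snapshot instead of a live tail, the same keywords can be counted in the log. This sketch assumes only that log lines contain those words somewhere. The actual format of `DEVELOPMENT.log` is not documented here, and `summarize_log` is a hypothetical helper name.

```bash
# summarize_log [FILE]
# Prints "<keyword> <count>" for each coordination keyword, where count is
# the number of log lines mentioning it (case-insensitive).
summarize_log() {
  log=${1:-DEVELOPMENT.log}

  # Fail early if the log cannot be read.
  [ -r "$log" ] || { echo "cannot read: $log" >&2; return 1; }

  for keyword in coordinator registration heartbeat; do
    # grep -c prints 0 when nothing matches, so every keyword gets a line.
    printf '%s %s\n' "$keyword" "$(grep -ci "$keyword" "$log")"
  done
}
```

Running `summarize_log` in the project root reads `./DEVELOPMENT.log`. A heartbeat count that stops growing between runs is a hint that a worker has gone quiet.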
### Performance Tuning
```bash
# Adjust worker memory limits
claude-worker --id=worker1 --group=UI --memory=512
# Monitor memory usage
# (Visible in worker terminal and dashboard)
# Group priority optimization
# Edit claude-coord.json:
{
"groups": {
"CRITICAL": {"priority": 1},
"NORMAL": {"priority": 2},
"LOW": {"priority": 3}
}
}
```
### Multi-Project Setup
```bash
# Project A (port 7777)
cd /path/to/projectA
claude-coord start
# Project B (port 7778)
cd /path/to/projectB
claude-coord start --port=7778
# Workers connect to local coordinator automatically
# Each project has independent coordination
```
### Custom Group Templates
Common group patterns:
```json
{
"FRONTEND": {
"name": "Frontend Development",
"files": ["src/components/**", "src/pages/**", "**/*.tsx"],
"dependencies": ["TYPESCRIPT"],
"tasks": [
"Fix React component errors",
"Update component styling",
"Add responsive design"
]
},
"BACKEND": {
"name": "Backend API",
"files": ["src/app/api/**", "**/*.ts"],
"dependencies": ["TYPESCRIPT"],
"tasks": [
"Fix API endpoints",
"Add error handling",
"Update database queries"
]
},
"TESTING": {
"name": "Testing & Quality",
"files": ["**/*.test.*", "**/*.spec.*"],
"dependencies": ["TYPESCRIPT", "FRONTEND", "BACKEND"],
"tasks": [
"Write unit tests",
"Fix failing tests",
"Add integration tests"
]
}
}
```
## 🎯 QUICK REFERENCE
### Essential Commands
```bash
# System Status
ps aux | grep claude
lsof -i :7777
# Start System
claude-coord start
claude-worker --id=worker1 --group=GROUP --verbose &
# Monitor System
open http://localhost:7777
tail -f DEVELOPMENT.log
# Emergency Stop
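# (a bare "pkill -f claude" would also kill unrelated processes with "claude" in their command line)
pkill -f "claude-coord|claude-worker"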
# Configuration
claude-coord list-groups
vim claude-coord.json
```
### File Locations
- **Config**: `./claude-coord.json`
- **State**: `./.claude-coord/system-state.json`
- **Logs**: `./DEVELOPMENT.log`
- **Dashboard**: `http://localhost:7777`
### Common Patterns
- **1 Coordinator per project**
- **Multiple workers per coordinator**
- **Background worker processes**
- **Group-based task organization**
- **Dependency-driven execution order**
**📞 Remember**: This guide is part of `claude-coordination-system`. All commands and procedures here are specific to managing Claude AI workers through this coordination framework.