claude-flow-novice

Claude Flow Novice - Advanced orchestration platform for multi-agent AI workflows with CFN Loop architecture. Includes Local RuVector Accelerator and all CFN skills for complete functionality.

github.com/cfn-dev/claude-flow-novice


# Claude Flow Novice - Features Matrix (v2.18.1)

## RuVector Codebase Search (v2.18.1 - NEW)

### Purpose

Semantic codebase search for faster file discovery using vector embeddings.

### Commands

```bash
# Index (full)
./.claude/skills/cfn-ruvector-codebase-index/index.sh --full

# Search
./.claude/skills/cfn-ruvector-codebase-index/search.sh "authentication logic" --top 5

# Incremental reindex
/cfn-ruvector:codebase-reindex
```

### Architecture

- **Embeddings**: OpenAI text-embedding-3-small (1536 dimensions)
- **Storage**: @ruvector/core (redb format)
- **Location**: `.claude/skills/cfn-ruvector-codebase-index/data/`

### Manifest System

`.cfn-manifest.json` tracks file ownership:

- `cfn_files`: CFN-managed, updated on package updates
- `protected_files`: Never touched (config.json, data/*)
- Unlisted: Custom files, always preserved

### Distribution

Projects receive skill files via manifest. Custom work preserved during updates.

---

## Integration Standardization (v2.16.0)

### Purpose

Unification of patterns across skills for lifecycle, database operations, and data formats.
### Skill Lifecycle Automation

**Purpose**: Deploy, version, and promote skills with single command execution

**Features**:
- Automated skill deployment from staging to production
- Version management with semantic versioning enforcement
- Promotion workflows with validation gates
- Rollback capabilities for failed deployments
- Dependency tracking across skill updates

**Usage**:
```bash
# Deploy skill to production
./.claude/skills/skill-lifecycle-manager/deploy-skill.sh \
  --skill cfn-coordination \
  --version 2.1.0 \
  --environment production

# Promote skill from staging to production
./.claude/skills/skill-lifecycle-manager/promote-skill.sh \
  --skill cfn-agent-spawning \
  --from staging \
  --to production

# Rollback to previous version
./.claude/skills/skill-lifecycle-manager/rollback-skill.sh \
  --skill cfn-loop-validation \
  --target-version 1.9.5
```

**Components**:
- `deploy-skill.sh`: Deploy with validation and dependency checks
- `promote-skill.sh`: Multi-environment promotion workflows
- `rollback-skill.sh`: Safe version rollback with state restoration
- `version-manager.sh`: Semantic versioning enforcement

**Integration**: Automatic version tracking in skill metadata, deployment hooks for pre/post validation

### Cross-Database Transactions

**Purpose**: Coordinate operations across PostgreSQL, SQLite, and Redis with ACID guarantees

**Features**:
- Transaction coordinator for multi-database operations
- Two-phase commit protocol implementation
- Automatic rollback on partial failures
- Consistency validation across databases
- Performance monitoring and optimization

**Architecture**:
```
# Transaction flow
1. Begin transaction (all databases)
2. Execute operations (PostgreSQL + SQLite + Redis)
3. Pre-commit validation
4. Commit phase 1 (prepare)
5. Commit phase 2 (finalize)
6. Rollback on any failure
```

**Usage**:
```bash
# Execute coordinated transaction
./.claude/skills/cross-db-integration/execute-transaction.sh \
  --postgres "INSERT INTO agents (...)" \
  --sqlite "INSERT INTO audit_log (...)" \
  --redis "SET swarm:state:active true" \
  --timeout 30

# Validate consistency
./.claude/skills/cross-db-integration/validate-consistency.sh \
  --transaction-id "tx-1730545678"
```

**Components**:
- `execute-transaction.sh`: Multi-database transaction coordinator
- `validate-consistency.sh`: Cross-database consistency checks
- `rollback-transaction.sh`: Automatic rollback on failures
- `monitor-transactions.sh`: Performance and health monitoring

**Databases Supported**:
- PostgreSQL: Relational data, agent profiles, skill metadata
- SQLite: Local audit trails, test benchmarks, ACE context
- Redis: Coordination state, pub/sub messaging, cache

**Error Handling**:
- Partial failure detection
- Automatic rollback with state restoration
- Transaction log persistence for debugging
- Retry with exponential backoff

### File System Standardization

**Purpose**: Unified patterns for backups, logging, and state persistence across all skills

**Features**:
- Standardized backup directory structure
- Unified logging format (JSON + human-readable)
- State persistence with TTL management
- Atomic file operations with fallback
- Path validation and sanitization

**Directory Structure**:
```
.backups/[agent-id]/[timestamp]_[hash]/   # Pre-edit backups (24h TTL)
.cfn/logs/[skill]/[date]/                 # Skill execution logs (7d TTL)
.cfn/state/[skill]/                       # Persistent state files
/tmp/cfn_[skill]/                         # Temporary scratch space
```

**Logging Standard**:
```bash
# JSON format for machine parsing
{
  "timestamp": "2025-11-17T10:30:00Z",
  "skill": "cfn-coordination",
  "level": "INFO",
  "message": "Agent spawned successfully",
  "context": {"agent_id": "backend-1", "task_id": "cfn-123"}
}

# Human-readable format for debugging
[2025-11-17 10:30:00] [INFO] cfn-coordination: Agent spawned successfully (agent_id=backend-1)
```

**Usage**:
```bash
# Create standardized backup
./.claude/skills/file-system-standard/create-backup.sh \
  --source src/file.ts \
  --agent-id backend-1 \
  --ttl 86400

# Write standardized log
./.claude/skills/file-system-standard/write-log.sh \
  --skill cfn-agent-spawning \
  --level INFO \
  --message "Agent spawn initiated" \
  --context '{"agent_type": "backend-developer"}'

# Persist state with TTL
./.claude/skills/file-system-standard/persist-state.sh \
  --skill cfn-coordination \
  --key "active_agents" \
  --value '["agent-1", "agent-2"]' \
  --ttl 3600
```

**Components**:
- `create-backup.sh`: SHA-256 hashing, TTL management, metadata storage
- `write-log.sh`: Dual-format logging (JSON + human-readable)
- `persist-state.sh`: Atomic writes with TTL and validation
- `cleanup-expired.sh`: TTL-based garbage collection

**Integration**: All 43 skills updated to use standardized file operations

### Data Format Harmonization

**Purpose**: Consistent JSON schemas across all skill inputs and outputs

**Features**:
- Unified JSON schema definitions
- Automatic validation on skill invocation
- Schema versioning and evolution
- Type-safe data exchange between skills
- Backward compatibility enforcement

**Standard Schemas**:

**Agent Metadata**:
```json
{
  "agent_id": "string (required)",
  "agent_type": "string (required)",
  "status": "enum: spawned|active|completed|failed",
  "confidence": "number (0.0-1.0)",
  "spawned_at": "ISO8601 timestamp",
  "completed_at": "ISO8601 timestamp (optional)",
  "metadata": "object (optional)"
}
```

**Task Configuration**:
```json
{
  "task_id": "string (required)",
  "task_description": "string (required)",
  "mode": "enum: mvp|standard|enterprise",
  "scope": {
    "epic_goal": "string",
    "in_scope": ["array of strings"],
    "out_of_scope": ["array of strings"],
    "deliverables": ["array of file paths"],
    "acceptance_criteria": ["array of criteria"]
  },
  "thresholds": {
    "gate": "number (0.0-1.0)",
    "consensus": "number (0.0-1.0)",
    "max_iterations": "integer"
  }
}
```

**Skill Output**:
```json
{
  "skill": "string (skill name)",
  "version": "string (semver)",
  "status": "enum: success|warning|error",
  "result": "object (skill-specific)",
  "metrics": {
    "execution_time_ms": "integer",
    "resource_usage": "object"
  },
  "errors": ["array of error objects (optional)"]
}
```

**Usage**:
```bash
# Validate input against schema
./.claude/skills/data-format-harmonization/validate-input.sh \
  --schema agent-metadata \
  --input '{"agent_id": "backend-1", "agent_type": "backend-developer"}'

# Generate schema for new skill
./.claude/skills/data-format-harmonization/generate-schema.sh \
  --skill cfn-new-skill \
  --template task-config

# Validate skill output
./.claude/skills/data-format-harmonization/validate-output.sh \
  --skill cfn-coordination \
  --output-file /tmp/skill-output.json
```

**Components**:
- `validate-input.sh`: JSON schema validation for skill inputs
- `validate-output.sh`: JSON schema validation for skill outputs
- `generate-schema.sh`: Schema template generation for new skills
- `migrate-schema.sh`: Schema version migration with backward compatibility

**Schema Registry**: Central repository at `.cfn/schemas/` with versioned definitions

### Edge Case Auto-Patching

**Purpose**: Feedback loop generates patches from test failures automatically

**Features**:
- Test failure analysis with pattern extraction
- Automatic patch generation from failure context
- Git patch creation with descriptive commits
- Validation of generated patches before application
- Metrics tracking for patch effectiveness

**Workflow**:
```
1. Test suite executes and detects failure
2. Failure analyzer extracts context (stack trace, inputs, expected vs actual)
3. Pattern matcher identifies failure type (null pointer, type mismatch, etc.)
4. Patch generator creates fix based on pattern library
5. Patch validator runs tests against generated patch
6. Git commit created if validation passes
7. Metrics updated with success/failure tracking
```

**Usage**:
```bash
# Analyze test failure and generate patch
./.claude/skills/edge-case-auto-patch/analyze-failure.sh \
  --test-file tests/cfn-v3/test-coordination.sh \
  --failure-log /tmp/test-failure.log

# Apply generated patch
./.claude/skills/edge-case-auto-patch/apply-patch.sh \
  --patch-id patch-1730545678 \
  --validate-before-commit

# Track patch effectiveness
./.claude/skills/edge-case-auto-patch/track-metrics.sh \
  --patch-id patch-1730545678 \
  --outcome success
```

**Pattern Library**:
- Null pointer dereference fixes
- Type mismatch corrections
- Missing error handling additions
- Race condition resolution
- Resource leak prevention

**Components**:
- `analyze-failure.sh`: Extract failure context from test logs
- `generate-patch.sh`: Create patch from pattern matching
- `apply-patch.sh`: Validate and apply generated patch
- `track-metrics.sh`: Effectiveness tracking and reporting

**Integration**: Hooks into test runner system for automatic failure detection

**Metrics**:
- Patch generation success rate
- Patch application success rate
- Test pass rate after patching
- False positive rate

### Documentation

**Integration Standardization Plan**: `planning/integration-standardization/INDEX.md`

**Implementation Progress**: 27/30 tasks complete (90%)

**Cross-Database Guide**: `docs/CROSS_DATABASE_INTEGRATION.md`

**File System Standards**: `docs/FILE_SYSTEM_STANDARDIZATION.md`

**Schema Registry**: `.cfn/schemas/README.md`

## Provider Routing System (v2.16.0)

### Purpose

Cross-provider model compatibility for CFN Loop agent routing without modifying agent profiles.

### Implementation

Centralized mapping system translates agent-specified models (sonnet/haiku/opus) to provider-specific model names.
**Core Components**:
- `provider-model-mappings.yaml` - Single source of truth for model mappings
- `resolve-provider-model.ts` - TypeScript resolver service
- Zero Redis dependency for model resolution

### Provider Support

**Available Providers**:
- `zai`: glm-4.6, glm-4.6-flash models
- `kimi`: kimi-k2-turbo-preview model
- `openrouter`: anthropic/claude-sonnet-4.5, anthropic/claude-3.5-haiku
- `gemini`: google/gemini-2.0-flash-001, google/gemini-pro-exp
- `xai`: grok-beta model
- `anthropic`: claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022

### Agent Model Translation

**Agent Profile**: `model: sonnet`

**Resolution**:
- Z.ai → glm-4.6
- Kimi → kimi-k2-turbo-preview
- OpenRouter → anthropic/claude-sonnet-4.5
- Anthropic → claude-3-5-sonnet-20241022

### Usage

**Command Line**:
```bash
./resolve-provider-model.ts --provider zai --model sonnet
# Returns: glm-4.6
```

**Agent Spawning Integration**:
```bash
AGENT_MODEL="sonnet"
PROVIDER="zai"
RESOLVED_MODEL=$(./resolve-provider-model.ts --provider "$PROVIDER" --model "$AGENT_MODEL")
export ANTHROPIC_MODEL="$RESOLVED_MODEL"
```

### Model Selection Tiers

**Configuration**: Economy/Standard/Premium tiers for model selection

**Example**:
```bash
# Economy tier for bulk operations
./resolve-provider-model.ts --provider kimi --model haiku --tier economy
# Returns: kimi-k2-turbo-preview
```

### Integration Points

- Agent spawning scripts use resolver for model mapping
- CLI mode provider flags specify target provider
- CFN Loop agents maintain existing model specifications
- No changes required to 65+ agent profile files

### Performance

- Resolution latency: <1ms per lookup
- Memory footprint: <1MB configuration object
- Startup time: <10ms YAML file load
- O(1) lookup time for unlimited concurrent requests

## Namespace Isolation

**Purpose**: Prevent file collision when installing CFN package

**Strategy**:
- Agents in `.claude/agents/cfn-dev-team/` subfolder
- Skills prefixed with `cfn-*`
- Hooks prefixed with `cfn-*`
- Commands in `.claude/commands/cfn/` subdirectory

**Collision risk**: ~0.01%

**Installation**:
```bash
npm install claude-flow-novice
npx cfn-init  # Copies namespace-isolated files
```

**NPM Distribution Workaround**:
- npm has dotfile extraction limitation on Windows/WSL2
- `.claude/` directories don't extract from tarballs
- Solution: Auto-generate `claude-assets/` during `npm pack`
- Single source of truth: `.claude/` in development
- `prepack` script copies to `claude-assets/` before publish
- `claude-assets/` not tracked in git (auto-generated only)

**Benefits**:
- User custom agents/skills/hooks preserved
- Safe updates (only cfn-* files overwritten)
- Can run cfn-init multiple times safely
- All content distributes correctly despite npm limitations

### Agent Statistics (v2.10.6)

- **Development Team**: 23 agents in cfn-dev-team
- **Production Agents**: 23 agents
- **Package Metrics**:
  - Size: 4.7 MB unpacked (1.2 MB tarball)
  - Files: 1300 (includes all agents, skills, hooks, commands)
  - Files: 303 files (68% reduction)

### 6. Pre-Edit Backup System

**Purpose**: Safe file revert without git operations during parallel agent sessions

**Implementation**:
- Backup location: `.backups/[agent-id]/[timestamp]_[hash]/`
- SHA-256 file hashing for unique identification
- JSON metadata with TTL (default 24h)
- Agent-isolated directories prevent conflicts

**Usage**:
```bash
# Create backup before edit
BACKUP_PATH=$(./.claude/hooks/cfn-invoke-pre-edit.sh "src/file.ts" --agent-id "coder-1")

# Revert to most recent backup
./.claude/skills/pre-edit-backup/revert-file.sh "src/file.ts" --agent-id "coder-1"

# Interactive mode (select backup)
./.claude/skills/pre-edit-backup/revert-file.sh "src/file.ts" --agent-id "coder-1" --interactive

# List available backups
./.claude/skills/pre-edit-backup/revert-file.sh "src/file.ts" --agent-id "coder-1" --list-only
```

**Configuration**:
- TTL: `.claude/hooks/post-edit.config.json` → `pre_edit_backup.default_ttl`
- Cleanup: Automatic removal after TTL expiration
- Permissions: 700 (owner-only access)

**Integration**: Required before all Edit/Write operations in agent workflows

### 7. Multi-Language Code Validators

#### Purpose

Detect language-specific bugs before runtime via post-edit hook pipeline

#### Supported Languages

- **Bash**: Pipe safety, dependency validation, line ending enforcement
- **Python**: Subprocess safety, async/await validation, import checking
- **JavaScript/TypeScript**: Promise handling, ESLint integration
- **Rust**: Command safety, future handling, dependency validation

#### Validators

**Bash** (3 validators, 8/8 tests passing):
- `bash-pipe-safety.sh` - Unsafe pipe detection with set -o pipefail
- `bash-dependency-checker.sh` - Validates script dependencies exist
- `enforce-lf.sh` - Auto-converts CRLF to LF line endings

**Python** (3 validators, 6/6 tests passing):
- `python-subprocess-safety.py` - AST-based subprocess stderr validation
- `python-async-safety.py` - Async/await parent tracking for safety
- `python-import-checker.py` - Third-party import resolution checking

**JavaScript/TypeScript** (2 validators, 4/4 tests passing):
- `js-promise-safety.sh` - Context-aware unhandled promise detection
- `.eslintrc.json` - ESLint plugin with no-floating-promises rule

**Rust** (3 validators, 6/6 tests passing):
- `rust-command-safety.sh` - Command::new stderr validation
- `rust-future-safety.sh` - Async fn .await detection
- `rust-dependency-checker.sh` - Cargo.toml dependency validation

#### Exit Code Convention

- **0**: Pass (validation succeeded)
- **1**: Error (blocks edit, critical issue)
- **2**: Warning (non-blocking, logged)

#### Integration

Automatically triggered via post-edit hook for matching file extensions (.sh, .py, .js, .ts, .rs)

**Configuration**: `.claude/hooks/cfn-post-edit.config.json`

**Test Location**: `tests/post-edit/test-*-validators.sh`

**Documentation**: See [logs-hooks.md](./logs-hooks.md#multi-language-code-validators)

### 8. Cyclomatic Complexity Analysis

#### Purpose

Automatic code complexity monitoring integrated into post-edit pipeline

#### Features

- **Automatic Analysis**: Triggers on files >200 lines
- **Two-Tier Warnings**:
  - Complexity 30-39: Warning (exit code 8)
  - Complexity ≥40: Critical + detailed Lizard analysis (exit code 7)
- **Multi-Language Support**: Bash, JavaScript, TypeScript, Python
- **Performance**: ~23ms overhead per file

#### Tools

- **simple-complexity.sh**: Fast bash-native analyzer
- **Lizard**: Professional multi-language analyzer (auto-installed)
- **cyclomatic-complexity-reducer agent**: Automated refactoring

#### Integration

- Post-edit hook runs automatically
- GitHub Actions workflow (manual trigger)
- Real-time feedback during development

#### Configuration

```bash
# Disable in .claude/hooks/post-edit.config.json
"complexityChecks": { "enabled": false }

# Adjust thresholds in config/hooks/post-edit-pipeline.js
if (complexity >= 30) { /* warning */ }
if (complexity >= 40) { /* critical */ }
```

#### Output

**Warning (30-39)**:
```json
{
  "status": "COMPLEXITY_WARNING",
  "metrics": { "cyclomaticComplexity": 35 },
  "recommendations": [{
    "type": "complexity",
    "priority": "medium",
    "message": "Cyclomatic complexity is 35 (threshold: 30)",
    "action": "Consider refactoring to reduce complexity"
  }]
}
```

**Critical (≥40)**:
```json
{
  "status": "COMPLEXITY_CRITICAL",
  "metrics": { "cyclomaticComplexity": 74 },
  "complexityAnalysis": {
    "tool": "lizard",
    "detailedReport": "NLOC CCN token PARAM length location\n..."
  },
  "recommendations": [{
    "priority": "critical",
    "action": "Run cyclomatic-complexity-reducer agent"
  }]
}
```

### 9. CFN Loop v3 Dual-Mode Architecture

#### Purpose

Flexible agent spawning with architectural optimization and context management

#### Modes of Operation

1. **CLI Mode (Default)**
   - Routing: Main Chat → Coordinator → Orchestrator Script → CLI Agents
   - Context Management: Redis-based storage with full scope injection
   - Resource Optimization: Redis-based coordination
2. **Task Mode**
   - Routing: Main Chat → JSON Config File → Direct Task Agent Spawning
   - Context Management: Shared config file (`.cfn/task-configs/task-[id].json`)
   - Routing Provider: Anthropic native routing
   - Detailed Tracking: Full visibility in Main Chat

#### Core Features

- **Redis Context Storage** (CLI Mode)
  - Eliminates CLI JSON escaping complexities
  - Enables stateful agent coordination
  - Supports swarm recovery after interruptions
  - **Full scope injection** (v2.10.7): epicGoal, inScope, outOfScope, deliverables, acceptanceCriteria
  - Product owner receives complete context for informed decisions
- **Config File Context** (Task Mode)
  - Structured JSON configuration generated at startup
  - All agents read from shared config file
  - Automatic scope extraction from task description
  - Prevents "consensus on vapor" with explicit deliverable lists
- **Enhanced CLI Context Parsing (v2.9.0)**
  - Automatic JSON-to-markdown conversion
  - Converts file lists to bullet points
  - Converts requirements to numbered lists
  - Maintains Task agent clarity with CLI efficiency
  - Supported fields: task, files, requirements, deliverables, instructions, acceptanceCriteria, batch, directory
  - Fallback: Plain text if not valid JSON
- **Domain-Specific Validation**
  - 6 Structured Validation Templates:
    1. Software Development
    2. Content Creation
    3. Research Analysis
    4. Design Workflows
    5. Infrastructure Management
    6. Data Processing
- **Advanced Monitoring Capabilities**
  - Intervention Detection Mechanisms
    * Confidence plateau tracking
    * Recurring feedback identification
    * Stuck deliverable recognition
- **Adaptive Learning**
  - Playbook Pattern Extraction
    * SQLite-based pattern storage
    * Automatic agent selection based on historical performance
  - Retrospective Analysis (Loop 5 Post-Sprint)
    * Pattern identification
    * Performance optimization recommendations

#### Integration Points

- Seamless compatibility with:
  - `/cfn-loop-cli`
  - `/cfn-loop-task`
  - `/cfn-loop-frontend`

#### Configuration

- Mode toggling via `/cfn-mode` command
- Granular control over spawning behavior
- Zero-configuration default settings

### 9. ACE System (Adaptive Context Extension)

#### Purpose

Learn from sprint execution to prevent repeating mistakes

#### Core Capabilities

- Automatic reflection after sprint completion
- Anti-pattern detection from low-confidence sprints (<0.70)
- Unified context injection (positive + negative)
- Relevance scoring with adaptive limits
- A/B testing and effectiveness tracking

#### Skills

- `invoke-context-reflect.sh` - Extract lessons from sprints
- `invoke-context-inject.sh` - Inject adaptive context
- `query-contexts.sh` - Retrieve strategies/patterns
- `query-anti-patterns.sh` - Retrieve anti-patterns
- `format-negative-context.sh` - Format anti-patterns with visual distinction
- `extract-tags.sh` - Automatic tag extraction from task descriptions
- `classify-task.sh` - Domain classification (frontend, backend, security, etc.)
- `score-relevance.sh` - Multi-factor relevance scoring
- `track-ab-test.sh` - A/B testing analytics
- `analyze-anti-pattern-effectiveness.sh` - Effectiveness metrics tracking
- `export-ace-metrics.sh` - Dashboard metrics export

#### Context Types

- **Strategies (STRAT-XXX)**: High-level approaches proven effective
- **Patterns (PATTERN-XXX)**: Reusable implementation patterns
- **Anti-Patterns (ANTI-XXX)**: Practices to avoid based on failures

#### Relevance Scoring

- Exact tag match: 1.0
- Partial tag match: 0.6
- Domain match: 0.3
- Recency boost: +0.1 (last 30 days)
- Frequency boost: +0.05 per repetition

#### Adaptive Context Limits

- High relevance (≥0.8): 10 bullets
- Medium relevance (0.5-0.79): 5 bullets
- Low relevance (<0.5): 3 bullets

#### Storage

- SQLite database: `ace-context.db`
- Indexed queries: tags, domain, confidence, timestamp
- Redis analytics tracking

#### Integration

- CFN Loop orchestrator integration
- Post-sprint reflection hooks
- Agent context injection in real-time
- Dashboard metrics visualization

#### Configuration

- Reflection modes: automatic, manual, disabled
- Context limits: adaptive (3-10 bullets based on relevance)
- A/B testing: enabled/disabled per agent

### 10. Frontend CFN Loop (Visual Iteration Workflow)

#### Purpose

Specialized CFN Loop for frontend development with visual validation and brand consistency enforcement

#### Key Features

- **Design-First Approach**
  - Mockup integration (PNG/JPG images)
  - Brand guideline extraction from mockups
  - Design token management (colors, typography, spacing)
- **Dual Validation System**
  - Screenshot analysis: Visual fidelity (colors, layout, spacing)
  - Video analysis: Interaction quality (animations, loading states, error handling)
  - Combined score threshold: ≥85% required
- **Visual Iteration Loop**
  - Playwright screenshot capture
  - Playwright video recording (`video: 'on'` in config)
  - Image analysis via `mcp__zai-mcp-server__analyze_image`
  - Video analysis via `mcp__zai-mcp-server__analyze_video`
  - Structured feedback: static discrepancies + interaction issues
- **Coordinator Orchestration**
  - Coordinator orchestrates only (does NOT implement code)
  - Spawns frontend specialists (react-frontend-engineer, accessibility-advocate-persona)
  - Manages iteration cycles based on combined visual + interaction score

#### Integration

- Works in both CLI and Task modes
- Supports `/cfn-loop-cli` and `/cfn-loop-task` with `--spawn-mode` parameter
- Brand guidelines stored in `.claude/brand-guidelines.json`

#### Configuration

```bash
# Task Mode (full visibility)
/cfn-loop-frontend "Build login UI" \
  --mockup=/path/to/mockup.png \
  --brand-guidelines=/path/to/brand.json \
  --spawn-mode=task

# CLI Mode (production)
/cfn-loop-frontend "Build dashboard" \
  --mockup=/path/to/dashboard.png \
  --mode=enterprise
```

#### Output Artifacts

- Screenshots: `tests/screenshots/*.png`
- Videos: `test-results/**/video.webm`
- Brand guidelines: `.claude/brand-guidelines.json`
- Sprint docs: `docs/SPRINT_*.md`

#### Documentation

- Guide: `.claude/commands/cfn/CFN_LOOP_FRONTEND.md`
- Covers: Phase 0 planning, brand guidelines, visual iteration, validator coordination

### 11. Task Mode Execution (CFN Loop)

#### Purpose

Simplified CFN Loop execution with direct agent spawning, full visibility, and structured scope configuration

#### ANTI-023 Memory Leak Protection

**Critical Fix (v2.14.28)**: Prevents Task Mode agents from executing CLI coordination scripts

**Problem Solved**: System architect agent caused 25GB memory usage when spawned via Task tool due to embedded Redis protocols

**Three-Layer Defense System**:
1. **Agent Documentation**: Clean completion protocols without Redis commands (29 agent files updated)
2. **Agent-Level Detection**: `detect_task_mode_and_exit()` functions reject CLI calls
3. **Code-Level Blocking**: Runtime checks in coordination scripts exit immediately

**Detection Logic**:
```bash
# Task Mode detection using environment variables
if [[ -z "${TASK_ID:-}" || -z "${AGENT_ID:-}" ]]; then
  echo "❌ TASK MODE DETECTED - CLI commands forbidden"
  exit 1
fi
```

**Protected Components**:
- Validator agents: reviewer, tester, perf-analyzer, security-specialist
- Coordination scripts: report-completion.sh, consensus.sh, orchestrate.sh
- Agent spawning: spawn-agent.sh, spawn-agents.sh

**Mode-Specific Behavior**:
- **Task Mode**: Return structured JSON directly to Main Chat
- **CLI Mode**: Use Redis coordination and CLI scripts

**Memory Impact**:
- **Before**: Up to 23GB memory consumption per hanging agent
- **After**: <100MB normal usage, automatic process cleanup

#### Task Config Initialization

**New in v2.10.7**: Automatic config generation at CFN Loop startup

**Location**: `.cfn/task-configs/task-[task-id].json`

**Structure**:
```json
{
  "taskId": "cfn-phase-1730545678",
  "taskDescription": "Implement JWT authentication",
  "mode": "standard",
  "scope": {
    "epicGoal": "Build authentication system",
    "inScope": ["JWT token generation", "OAuth2 integration"],
    "outOfScope": ["Multi-factor auth", "Biometric login"],
    "deliverables": ["src/auth/jwt.ts", "tests/auth.test.ts"],
    "directory": "src/auth",
    "acceptanceCriteria": ["Tokens expire correctly", "Tests pass >80%"]
  },
  "agents": {
    "loop3": ["backend-developer", "researcher"],
    "loop2": ["reviewer", "tester", "architect", "security-specialist"]
  },
  "thresholds": {"gate": 0.75, "consensus": 0.90, "maxIterations": 10}
}
```

**Usage**:
```bash
# Automatic initialization in /cfn-loop-cli or /cfn-loop-task command
CONFIG_PATH=$(./.claude/skills/cfn-task-config-init/initialize-config.sh \
  --task-description "Implement JWT auth" \
  --mode "standard" \
  --task-id "cfn-phase-1730545678")
```

**Benefits**:
- Product owner receives complete scope context
- Distinguishes sprint vs epic completion
- Prevents "consensus on vapor" (approving plans without code)
- All agents receive consistent scope information
- Enables autonomous sprint transitions

#### Key Differences from CLI Mode

- **No coordinator agent**: Main Chat coordinates directly
- **Task() spawning**: Agents spawned via Task tool (not CLI)
- **Config-based context**: All agents read from shared config file
- **Anthropic routing**: All agents use Main Chat provider
- **Automatic scope extraction**: Task description analyzed for deliverables/criteria

#### Agent Specialization

- **Loop 3 (Implementation)**: backend-developer, researcher, mobile-dev, devops, rust-developer
- **Loop 2 (Validation)**: reviewer, tester, architect, security-specialist, accessibility-advocate-persona
- **Loop 4 (Product Owner)**: product-owner (autonomous decision-making, no manual approval)

#### Adaptive Validator Scaling

| Complexity | Files | LOC | Validators | Agents | Threshold |
|------------|-------|-----|------------|--------|-----------|
| Simple | 1-2 | <200 | 2 | reviewer, tester | 0.85 |
| Standard | 3-5 | 200-500 | 4 | +architect, +security-specialist | 0.90 |
| Complex/Enterprise | >5 | >500 | 5+ | +code-analyzer, +perf/ada | 0.92-0.95 |

#### Sprint Workflow

1. Initialize task config (scope, deliverables, criteria)
2. Main Chat reads config and guide: `.claude/commands/CFN_LOOP_TASK_MODE.md`
3. Spawns Loop 3 agents in parallel with full scope context
4. Collects confidence scores, checks gate threshold
5. Spawns Loop 2 validators with acceptance criteria
6. Product Owner makes autonomous PROCEED/ITERATE/ABORT decision
7. Git commit + push on PROCEED (no manual approval)

#### Product Owner Autonomous Execution

**Updated in v2.10.7**: Product owner eliminates manual approval gates

**Decision Output**:
```
Decision: PROCEED
Reasoning: Consensus 0.92 exceeds 0.90 threshold, all deliverables created
Confidence: 0.95
Next Action: Proceed to Sprint 2 (epic has 3 sprints remaining)
```

**No user confirmation prompts** - Product owner determines next action and continues autonomously

#### Backlog Management

- **P1 (critical)**: Blocking issues, security fixes
- **P2 (high)**: Important features, performance improvements
- **P3 (background)**: Nice-to-have features, cleanup tasks

#### Documentation

- Guide: `.claude/commands/CFN_LOOP_TASK_MODE.md`
- Config Init: `.claude/skills/cfn-task-config-init/SKILL.md`
- Covers: Agent selection, adaptive scaling, sprint completion, scope management, backlog mechanism

### 12. Test Runner System with Benchmarking

**Purpose**: Automated test execution with SQLite benchmarking and regression detection

**Components**:
- Test runner: `.claude/skills/cfn-test-runner/run-all-tests.sh`
- Benchmark storage: `.claude/skills/cfn-test-runner/store-benchmarks.sh`
- Regression detection: `.claude/skills/cfn-test-runner/detect-regressions.sh`
- SQLite database: `.test-benchmarks.db`

**Test Coverage**:
- Hello World: 7 layers (tool validation, coordinator spawning, review handoff, error retry)
- CFN E2E: 9 tests (coordinator handoff, gate checks, Loop 2/3 validation, Product Owner decision)
- Total: 13 automated tests

**Benchmarking**:
- Stores: run timestamp, git commit/branch, test counts, duration, success rate
- Baseline: 10-run moving average
- Regression threshold: 10% (configurable)
- Exit code 1 if regression detected

**Usage**:
```bash
# Run all tests with benchmarking
./.claude/skills/cfn-test-runner/run-all-tests.sh --benchmark --detect-regressions

# Via slash command
/run-tests --benchmark --detect-regressions --threshold 5
```

**Integration**: CI/CD ready with exit code validation

### 13. Redis Key Validation

**Purpose**: Enforce consistent Redis key patterns across codebase

**Validator**: `.claude/skills/cfn-test-runner/validate-redis-keys.sh`

**Standard Patterns**:
- Agent completion: `swarm:${TASK_ID}:${AGENT_ID}:done`
- Confidence reporting: `swarm:${TASK_ID}:${AGENT_ID}:confidence`
- Product Owner decision: `swarm:${TASK_ID}:decision`
- Gate checks: `swarm:${TASK_ID}:gate-passed|gate-failed`
- Loop 2 consensus: `swarm:${TASK_ID}:loop2:consensus`
- Metrics (global): `swarm:metrics:decisions:*`

**Validation Checks**:
- Anti-pattern detection (non-standard keys)
- Product Owner key validation
- Namespace consistency (swarm: prefix)
- TTL enforcement

**Audit Results** (3 independent agents):
- Overall confidence: 0.92 (High)
- Consistency: 90%+ adherence
- Documentation: `docs/REDIS_KEY_CONSISTENCY_AUDIT.md`

**Usage**:
```bash
# Manual validation
./.claude/skills/cfn-test-runner/validate-redis-keys.sh

# Integration with test suite
/run-tests --validate-redis-keys
```

**Exit Codes**:
- 0: PASS (all keys valid)
- 1: FAIL (violations found)

### 14. E2E Test Suite

**Purpose**: Validate CFN Loop coordination across complete workflow scenarios

**Components**:
- Main suite: `tests/cfn-v3/test-e2e-cfn-loop.sh` (19 test scenarios)
- Decision parsing: `tests/cfn-v3/test-execute-decision-defensive.sh` (18 unit tests)
- Process cleanup: `tests/cfn-v3/cleanup-test-processes.sh`

**Test Coverage**:
- Coordinator spawning and handoff
- Gate threshold validation (Loop 3 self-assessment)
- Loop 2 validator coordination
- Product Owner decision parsing
- Redis key consistency
- Process lifecycle management
- Error recovery and retry logic

**Key Improvements**:

**TEST 5 Fix: Product Owner Decision Key**
- Defensive file handling in `execute-decision.sh`
- Guarantees Redis `:decision` key creation
- Handles missing/empty output files
- 18 unit tests validate edge cases (empty output, malformed JSON, missing decision)

**TEST 9 Fix: JSON Parsing**
- Remove special characters from test context
- Prevents orchestrator timeout from malformed JSON
- Validates context extraction before agent spawn

**Process Management**:
- Process groups with `setsid` for clean termination
- Trap handlers (`SIGTERM`, `SIGINT`, `EXIT`) prevent orphaned processes
- Cleanup utility kills agent process trees
- Prevents resource leaks during test failures

**Test Reliability**:
- BLPOP event-driven waits eliminate race conditions
- Adaptive timeouts handle variable execution times
- Sequential validation prevents premature checks
- Success rate: 100% (19/19 tests pass)

**Usage**:
```bash
# Run full E2E suite
tests/cfn-v3/test-e2e-cfn-loop.sh

# Run decision parsing tests
tests/cfn-v3/test-execute-decision-defensive.sh

# Clean up orphaned processes
tests/cfn-v3/cleanup-test-processes.sh
```

**Integration**: Validates complete CFN Loop flow from coordinator spawn to Product Owner decision

### 15. Product Owner Decision Automation

**Purpose**: Guarantee Product Owner decision execution and Redis storage

**Implementation**: `.claude/skills/cfn-product-owner-decision/execute-decision.sh`

**Features**:
- Robust multi-pattern parsing (PROCEED/ITERATE/ABORT)
- Deliverable verification (prevents "consensus on vapor")
- TTL management (1 hour expiration)
- Decision storage: `swarm:${TASK_ID}:decision`

**Decision Logic**:
```bash
# Parse from agent output: keep only the bare keyword, normalized to upper case,
# so the comparison below can match (the first grep alone yields "Decision: PROCEED")
DECISION_TYPE=$(echo "$PO_OUTPUT" \
  | grep -oiE "Decision:[[:space:]]*(PROCEED|ITERATE|ABORT)" \
  | grep -oiE "(PROCEED|ITERATE|ABORT)" \
  | head -n 1 \
  | tr '[:lower:]' '[:upper:]')

# Verify deliverables if PROCEED
if [ "$DECISION_TYPE" = "PROCEED" ]; then
  # Count added, modified, or untracked paths, staged or unstaged
  FILES_CHANGED=$(git status --short | grep -cE '^(\?\?|[AM ][AM ])')
  if [ "$FILES_CHANGED" -eq 0 ]; then
    DECISION_TYPE="ITERATE"
    REASONING="No deliverables created"
  fi
fi

# Store with TTL
redis-cli SET "swarm:${TASK_ID}:decision" "$DECISION_TYPE" EX 3600
```

**Integration**: Orchestrator invokes after Loop 2 consensus (v2.14.6)

**Documentation**: `docs/BUG_11_PRODUCT_OWNER_DECISION_KEY_MISSING.md`

### 16. 
n8n MCP Integration #### Purpose Execute marketing workflows via n8n webhooks, enabling multi-platform automation #### Architecture - CFN Skills invoke n8n workflows via HTTP webhooks - n8n workflows serve as MCP servers - Bash operation scripts handle webhook authentication (N8N_BASE_URL + N8N_API_KEY) - JSON payloads constructed with jq for type safety #### Implementation Pattern ```bash #!/bin/bash set -euo pipefail # Call n8n webhook curl -X POST "$N8N_BASE_URL/webhook/endpoint" \ -H "X-N8N-API-KEY: $N8N_API_KEY" \ -H "Content-Type: application/json" \ -d "$PAYLOAD" ``` #### Platform Integrations **34 platforms across 6 categories:** - **Email**: Mailchimp, SendGrid, HubSpot - **Social**: Meta, LinkedIn, Twitter/X, TikTok - **Analytics**: Google Analytics 4, HubSpot CRM, Salesforce - **Paid Ads**: Google Ads, Meta Ads, LinkedIn Ads - **Conversational**: Intercom, Drift, Twilio, Plivo - **Intelligence**: BuzzSumo, SEMrush, Ahrefs, Unbounce, Instapage - **PR/Media**: PR Newswire, Business Wire, Muck Rack, HARO, Meltwater, Brandwatch #### Compliance Frameworks - **TCPA**: Opt-in verification, DNC registry check, consent logging (exit code 3 for violations) - **A/B Testing**: 95% confidence, 100 min conversions, 7-day minimum duration - **Crisis Detection**: <15 min alert latency, 2-hour response SLA - **Budget Validation**: Hard-coded spend limits, multi-tier approval #### Skills Created - **12 skills**: 65 operations total - **12 workflows**: Across 5 marketing phases - **Exit Code Pattern**: 1=params, 2=API/network, 3=validation/compliance #### Use Cases - Marketing department automation (email, social, ads, analytics) - Multi-platform campaign orchestration - Compliance-enforced communications (TCPA, BANT) - Real-time crisis monitoring and alerting [... rest of previous content remains unchanged ...] ### 17. 
CFN Loop Forgiveness Mechanisms **Purpose**: Prevent CFN Loop failures through systematic error handling and recovery **Mechanisms**: **Multi-tier Agent Spawning Fallback**: - Strategy 1: Instrumented npx spawn - Strategy 2: Direct npx spawn - Strategy 3: Global binary fallback - Strategy 4: Placeholder agent (degraded mode) - Impact: 95% reduction in agent spawning failures **Pre-flight Validation**: - Dependency validation: Node.js, npx, Redis connectivity - Resource checking: Disk space (100MB min), memory (512MB min) - Helper script validation: Critical script availability - Graceful degradation: Continue with warnings vs hard failures **Adaptive Timeout Calculation**: - Phase-specific base timeouts - Memory-based adjustment: +50% when <1GB available - Concurrency monitoring: Alert when >10 processes - Bounds enforcement: 60s minimum, 1800s maximum **Race Condition Prevention**: - Collision-resistant agent ID generation - Timestamp + random suffix for uniqueness - Prevents ID collisions in concurrent orchestrators **Graceful Shutdown**: - Signal handlers: SIGTERM, SIGINT, EXIT - Process group cleanup - State persistence before exit - Resource deallocation **Checkpoint/Restart System**: - Periodic state snapshots - Iteration checkpoints every N iterations - Resume from last checkpoint on failure - State restoration validation **Fallback Mode Operation**: - Redis connection failure handling - Local state management fallback - Reduced functionality mode - Automatic recovery when Redis available **Self-healing Error Recovery**: - Automatic retry with backoff - Error categorization and routing - Resource pressure detection - Recovery strategy selection **Test Coverage**: - CLI test suite: `tests/test-cfn-forgiveness-cli-hello-world.sh` - Docker test suite: `tests/test-cfn-forgiveness-docker-hello-world.sh` - Adaptive timeout tests: `tests/test-adaptive-timeout-*.sh` - Graceful shutdown tests: `tests/test-graceful-shutdown-*.sh` **Documentation**: - Implementation: 
`docs/CFN_FORGIVENESS_MECHANISMS_COMPLETE.md` - Testing guide: `docs/CFN_FORGIVENESS_TESTING_GUIDE.md` - Docker guide: `docs/DOCKER_FORGIVENESS_TESTING_GUIDE.md` ### 18. CFN Error Logging Skill **Purpose**: Capture and report CFN Loop failures for debugging **Actions**: - `capture`: Capture error data on CFN Loop failure - `report`: Generate user-friendly error report - `cleanup`: Manage error log retention - `list`: List error logs **Data Captured**: - System diagnostics: CPU, memory, disk, dependencies - CFN Loop state: Configuration, execution, Redis data - Error context: Type, message, exit code, stack traces - Process tree: Parent/child relationships **Integration Points**: - CLI Loop: orchestrate.sh error handling - Docker Loop: container failure handling - Agent spawning: npx command failures **Storage**: - Location: `/tmp/cfn_error_logs/` - Retention: 7 days default - Format: JSON (machine-readable), Markdown (user-friendly) - Privacy: No code content, no credentials **Usage**: ```bash # Capture error ./.claude/skills/cfn-error-logging/invoke-error-logging.sh \ --action capture \ --task-id "$TASK_ID" \ --error-type "orchestrator" \ --error-message "Agent spawning failed" # Generate report ./.claude/skills/cfn-error-logging/invoke-error-logging.sh \ --action report \ --task-id "$TASK_ID" \ --format markdown ``` **Error Categories**: orchestrator, agent-spawn, timeout, consensus, resource ### CFN Docker Test Infrastructure **Purpose**: Validate Docker-based agent coordination and Redis-based state management **Architecture**: - 4-layer validation system (tool → mesh → review → error handling) - 3 specialized validation tests (context injection, Redis keys, Product Owner decisions) - Docker agent image with MCP integration - Redis coordination testing with fallback mechanisms **Test Layers**: - **Layer 0**: Docker tool validation and environment setup - **Layer 1**: Mesh coordination and agent spawning validation - **Layer 2**: Review coordination and 
consensus validation - **Layer 3**: Error handling and retry mechanism validation **Specialized Tests**: - **Context Injection Between CFN Loops**: Validates context flow between Loop 3 → Loop 2 → Product Owner - **Redis Key Structure Validation**: Validates correct Redis key patterns and namespace usage - **Product Owner Decision Flow**: Tests decision making with confidence scoring and risk assessment **Components**: - DockerTestUtils: Container coordination utilities - RedisTestUtils: Redis state management validation - SpecializedTestRunner: Orchestrates all validation tests **Usage**: ```bash # Run complete test suite node tests/hello-world/test-runner.cjs # Run specialized tests only node tests/hello-world/specialized/specialized-test-runner.cjs # Run individual validation layers node tests/hello-world/layer0/layer0-docker-tool-validation.cjs ``` **Integration**: - Works with existing CFN Loop coordination skills - Validates Redis-based agent coordination patterns - Tests MCP authentication and Docker networking [... rest of previous content remains unchanged ...]
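
**Example: checking a key against the standard patterns**

The Redis Key Structure Validation test and the standard patterns listed under Redis Key Validation can be illustrated with a small check. This is a minimal sketch, not the shipped `validate-redis-keys.sh`: the function name `is_standard_swarm_key` is hypothetical, and the character set allowed for `TASK_ID` and `AGENT_ID` is an assumption.

```bash
#!/bin/bash
# Hypothetical helper (not part of the package): returns 0 when a key matches
# one of the standard patterns documented above, 1 otherwise.
is_standard_swarm_key() {
  # Assumption: TASK_ID and AGENT_ID consist of letters, digits, '.', '_' and '-'
  _cfn_id='[A-Za-z0-9._-]+'
  printf '%s\n' "$1" | grep -Eq \
    -e "^swarm:${_cfn_id}:${_cfn_id}:(done|confidence)\$" \
    -e "^swarm:${_cfn_id}:(decision|gate-passed|gate-failed)\$" \
    -e "^swarm:${_cfn_id}:loop2:consensus\$" \
    -e "^swarm:metrics:decisions:.+\$"
}
```

A caller could use it as `is_standard_swarm_key "swarm:${TASK_ID}:decision" || echo "non-standard key"`. Keys outside the `swarm:` namespace, or with an unknown suffix, are rejected.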
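
**Example: adaptive timeout bounds**

The adaptive timeout rules listed under CFN Loop Forgiveness Mechanisms (+50% when less than 1GB of memory is available, clamped to 60s minimum and 1800s maximum) can be sketched as follows. The function name `adaptive_timeout`, its argument order, and the reading of 1GB as 1024 MB are assumptions for illustration; the actual implementation may differ.

```bash
#!/bin/bash
# Hypothetical helper (not part of the package): prints an adjusted timeout.
#   $1 = phase-specific base timeout in seconds
#   $2 = available memory in MB
adaptive_timeout() {
  _base="$1"
  _avail_mb="$2"
  _timeout="$_base"

  # Memory-based adjustment: +50% when less than 1 GB (assumed 1024 MB) is free
  if [ "$_avail_mb" -lt 1024 ]; then
    _timeout=$(( _base * 3 / 2 ))
  fi

  # Bounds enforcement: 60s minimum, 1800s maximum
  if [ "$_timeout" -lt 60 ]; then
    _timeout=60
  elif [ "$_timeout" -gt 1800 ]; then
    _timeout=1800
  fi

  echo "$_timeout"
}
```

For example, `adaptive_timeout 300 512` applies the low-memory adjustment to a 300-second base, while `adaptive_timeout 1500 512` is capped at the upper bound.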