
# Ralph-RLM Integration

**Version**: 1.0.0

**Status**: Active

**Research Foundation**: REF-089 Recursive Language Models (Zhang et al., 2026), REF-018 ReAct

**Issue**: #329

## Overview

This document describes how the RLM (Recursive Language Models) addon integrates with Ralph loops as a processing strategy for long-context tasks. Ralph provides the outer TAO (Thought→Action→Observation) iteration loop, while RLM handles recursive decomposition of large-context tasks into manageable sub-problems.

## Quick Reference

```bash
# Explicit RLM strategy
ralph "Analyze this large codebase" --strategy rlm --completion "analysis complete"

# Auto-detection (Ralph suggests RLM when appropriate)
ralph "Search for API key leaks across all repos" --completion "report generated"
# → Detects batch pattern, suggests: "Use --strategy rlm for better performance?"

# Override RLM defaults
ralph "Analyze security risks" --strategy rlm --max-depth 5 --max-sub-calls 50
```

## Why RLM + Ralph?

### Problem: Context Window Limitations

Ralph alone:

- User: "Analyze all 89 research papers for memory patterns"
- Ralph iteration 1: Tries to load all 89 papers into context → context overflow
- Ralph iteration 2: Summarizes all papers → loses critical details
- Result: Incomplete or shallow analysis

Ralph + RLM:

- User: "Analyze all 89 research papers for memory patterns"
- Ralph uses RLM agent strategy
- RLM: Filters to 12 relevant papers → delegates per-paper analysis → aggregates
- Result: Complete, lossless analysis within context limits

### Solution: Structural Equivalence

RLM's REPL loop IS structurally equivalent to Ralph's TAO loop:

| RLM Component | Ralph Component | Description |
|---------------|-----------------|-------------|
| `code ← LLM(hist)` | Thought → Action | Generate action based on state |
| `REPL(state, code)` | Action execution | Execute tool/code in environment |
| `Metadata(stdout)` | Observation | Capture execution result |
| `state[Final] is set` | Completion criteria | Task completion signal |

This structural equivalence makes RLM a natural strategy for Ralph when context decomposition is needed.

## TAO Loop Mapping

### RLM REPL Cycle

From REF-089:

```
Loop:
  1. code ← LLM(hist)
  2. stdout, stderr ← REPL(state, code)
  3. hist ← hist ∪ Metadata(stdout)
  4. IF state[Final] is set: DONE
  5. ELSE: continue loop
```

### Ralph TAO Cycle

From AIWG Ralph implementation:

```
Loop:
  1. THOUGHT: What should I do next?
  2. ACTION: Execute tool/command
  3. OBSERVATION: Capture result
  4. IF completion criteria met: DONE
  5. ELSE: continue loop
```

### Mapping Table

| Concept | RLM | Ralph | Notes |
|---------|-----|-------|-------|
| **Decision** | `LLM(hist)` generates code | `THOUGHT` determines action | Both: reasoning about next step |
| **Execution** | `REPL(state, code)` runs code | `ACTION` executes tool | Both: interact with environment |
| **Feedback** | `Metadata(stdout)` captures output | `OBSERVATION` records result | Both: incorporate execution outcome |
| **State** | `state` variables (Final, prompt, etc.) | Ralph reflection memory | Both: persistent context |
| **Completion** | `state[Final] != null` | Completion criteria met | Both: explicit termination |
| **Recursion** | `llm_query()` spawns sub-LMs | Ralph sub-agents via Task tool | Both: delegate sub-problems |

### Example Parallel Execution

**RLM REPL Trajectory**:

```python
# Iteration 1
code: grep_files("**/*.ts", "authenticate")
output: ["src/auth.ts:42", "src/login.ts:18"]
state: {"files_found": 2}

# Iteration 2
code: llm_query("Analyze src/auth.ts authentication logic")
output: sub_result_1
state: {"files_found": 2, "auth_analysis": "..."}

# Iteration 3
code: set_final("Analysis complete")
state: {"Final": "Analysis complete", ...}
```

**Ralph TAO Trajectory**:

```yaml
# Iteration 1
thought: "I need to find files with authentication logic"
action: Grep(pattern="authenticate", glob="**/*.ts")
observation: ["src/auth.ts:42", "src/login.ts:18"]
state: {"files_found": 2}

# Iteration 2
thought: "Delegate analysis of src/auth.ts to sub-agent"
action: Task("Analyze src/auth.ts authentication logic")
observation: sub_agent_result
state: {"files_found": 2, "auth_analysis": "..."}

# Iteration 3
thought: "Analysis is complete"
action: Write(final_report)
observation: success
completion: criteria met
```

**Key Insight**: RLM's REPL loop is a domain-specific instance of Ralph's TAO loop, specialized for code-driven context decomposition.
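
The mapping above can be made concrete with a small sketch. The snippet below is illustrative only and is not taken from the Ralph or RLM code: `LoopSpec`, `run_loop`, `scripted_decide`, and `toy_execute` are hypothetical names, and a fixed three-step plan stands in for the LLM. It shows a single driver loop whose four roles (decide, execute, observe, check completion) can be read either as the RLM REPL cycle or as the Ralph TAO cycle.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


@dataclass
class LoopSpec:
    """The four roles shared by both loops (hypothetical names)."""
    decide: Callable[[List[str], State], str]   # LLM(hist)         / THOUGHT
    execute: Callable[[State, str], str]        # REPL(state, code) / ACTION
    observe: Callable[[str], str]               # Metadata(stdout)  / OBSERVATION
    is_done: Callable[[State], bool]            # state[Final] set  / criteria met


def run_loop(spec: LoopSpec, state: State, max_iterations: int = 10) -> List[str]:
    """Run decide -> execute -> observe until done or the iteration cap is hit."""
    hist: List[str] = []
    for _ in range(max_iterations):
        step = spec.decide(hist, state)
        raw_output = spec.execute(state, step)
        hist.append(spec.observe(raw_output))
        if spec.is_done(state):
            break
    return hist


def scripted_decide(hist: List[str], state: State) -> str:
    """Stand-in for the model: follow a fixed plan, one step per iteration."""
    plan = ["find_files", "analyze", "finish"]
    return plan[len(hist)]


def toy_execute(state: State, step: str) -> str:
    """Stand-in for the environment: mutate state and return raw output."""
    if step == "find_files":
        state["files_found"] = 2
        return "src/auth.ts:42, src/login.ts:18"
    if step == "analyze":
        state["auth_analysis"] = "..."
        return "sub-result for src/auth.ts"
    state["Final"] = "Analysis complete"
    return "final answer set"


spec = LoopSpec(
    decide=scripted_decide,
    execute=toy_execute,
    observe=lambda out: out[:40],                    # keep only short metadata
    is_done=lambda s: s.get("Final") is not None,    # RLM-style completion
)

state: State = {}
history = run_loop(spec, state)
print(len(history), state["Final"])  # 3 Analysis complete
```

Swapping `is_done` for a check against a completion-criteria string would give the Ralph-style termination without touching the driver, which is the sense in which the two loops share a structure.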
## Explicit RLM Strategy: `--strategy rlm`

### Usage

```bash
ralph "task description" --strategy rlm [--rlm-options] --completion "criteria"
```

### How It Changes Ralph Behavior

| Aspect | Ralph Default | Ralph + RLM Strategy |
|--------|---------------|---------------------|
| **Agent Selection** | General-purpose agent | RLM-specific agent (rlm-agent.md) |
| **Context Handling** | Load files directly into context | Programmatic filtering via Grep/Glob first |
| **Decomposition** | Manual agent decision | Automatic recursive decomposition |
| **Sub-Agent Pattern** | Ad-hoc Task tool usage | Structured llm_query() pattern |
| **State Management** | Ralph reflection memory | RLM state variables (.aiwg/rlm/state/) |
| **Completion Signal** | Completion criteria string match | `Final` variable set in RLM state |

### Configuration Options

```yaml
rlm_strategy_config:
  max_depth: 5              # Maximum recursion depth
  max_sub_calls: 20         # Maximum sub-agents per iteration
  sub_model: "sonnet"       # Model for sub-agents (default: same as parent)
  parallel_sub_calls: true  # Allow parallel Task execution
  budget_tokens: 500000     # Token budget across entire task tree
  intermediate_dir: ".aiwg/rlm/tasks/{task-id}/intermediate/"
  state_dir: ".aiwg/rlm/tasks/{task-id}/state/"
  completion_artifact: "final-result.md"
```

### Command-Line Override

```bash
# Override max depth
ralph "Analyze codebase" --strategy rlm --max-depth 3

# Override sub-call limit
ralph "Batch process files" --strategy rlm --max-sub-calls 50

# Override token budget
ralph "Large corpus analysis" --strategy rlm --budget-tokens 1000000

# Combine overrides
ralph "Complex task" --strategy rlm --max-depth 4 --max-sub-calls 30 --sub-model haiku
```

## Auto-Detection Heuristics

Ralph automatically suggests RLM mode when it detects long-context patterns.
### Detection Triggers

| Pattern | Example | Confidence | Action |
|---------|---------|------------|--------|
| **Batch keywords** | "all files", "entire codebase", "every module" | High (0.9) | Auto-activate RLM |
| **Large file count** | Task targets >50 files | High (0.85) | Auto-activate RLM |
| **Estimated tokens** | Context >100K tokens | High (0.9) | Auto-activate RLM |
| **Corpus mention** | "all papers", "research corpus" | Medium (0.7) | Suggest RLM |
| **Fan-out pattern** | "for each X, do Y" | Medium (0.65) | Suggest RLM |
| **Recursive keywords** | "recursively", "all subdirectories" | Medium (0.6) | Suggest RLM |

### Detection Algorithm

```python
import os
import re
from typing import List, Tuple


def estimate_token_count(context_files: List[str]) -> int:
    """Placeholder (an assumption, not the AIWG implementation): ~4 bytes per token."""
    return sum(os.path.getsize(p) for p in context_files if os.path.isfile(p)) // 4


def should_use_rlm(task: str, context_files: List[str]) -> Tuple[bool, float]:
    """
    Returns: (should_use, confidence)

    should_use is True only for auto-activation. Callers treat
    0.5 <= confidence < 0.8 as "suggest RLM" and anything lower as "don't use".
    """
    task_lower = task.lower()
    confidence = 0.0

    # Keyword analysis
    batch_keywords = ["all files", "entire", "every", "across all"]
    if any(kw in task_lower for kw in batch_keywords):
        confidence += 0.4

    # File count analysis
    if len(context_files) > 50:
        confidence += 0.3
    elif len(context_files) > 20:
        confidence += 0.15

    # Token estimation
    estimated_tokens = estimate_token_count(context_files)
    if estimated_tokens > 100000:
        confidence += 0.3
    elif estimated_tokens > 50000:
        confidence += 0.15

    # Corpus patterns
    corpus_keywords = ["corpus", "all papers", "all documents"]
    if any(kw in task_lower for kw in corpus_keywords):
        confidence += 0.2

    # Fan-out patterns ("map" is matched as a whole word so "roadmap" does not count)
    if "for each" in task_lower or re.search(r"\bmap\b", task_lower):
        confidence += 0.1

    # The additive scores can exceed 1.0, so clamp to a valid confidence
    confidence = min(confidence, 1.0)

    # Decision thresholds
    if confidence >= 0.8:
        return (True, confidence)   # Auto-activate
    elif confidence >= 0.5:
        return (False, confidence)  # Suggest
    else:
        return (False, confidence)  # Don't use RLM
```

### User Experience

**Auto-activation (confidence ≥ 0.8)**:

```bash
$ ralph "Analyze security of all 500 source files" --completion "report ready"

→ Detected long-context task (confidence: 0.90)
→ Auto-activating RLM strategy for optimal performance
→ Override with --no-rlm if you prefer standard processing

[Ralph loop begins with RLM agent]
```

**Suggestion (0.5 ≤ confidence < 0.8)**:

```bash
$ ralph "Find API key leaks across the repository" --completion "results saved"

⚠️ This task might benefit from RLM strategy (confidence: 0.65)
   Reason: Batch pattern detected, multiple files expected

   Continue with standard processing? [Y/n]
   Or use RLM: ralph "..." --strategy rlm
```

**No suggestion (confidence < 0.5)**:

```bash
$ ralph "Fix the bug in src/auth.ts" --completion "tests pass"

[Standard Ralph processing, no RLM suggestion]
```

## Checkpoint Integration

RLM state integrates with Ralph's checkpoint system.

### RLM State as Ralph Checkpoints

RLM maintains explicit state variables that map to Ralph checkpoints:

```yaml
# Ralph checkpoint structure
ralph_checkpoint:
  iteration: 5
  thought: "Delegating per-file analysis"
  action: "Task(analyze_file_1)"
  observation: "..."
  rlm_state:
    state_id: "state-a1b2c3d4"
    tree_id: "tree-87654321"
    variables:
      Final: null
      prompt: "Analyze all source files"
      files_analyzed: 12
      files_remaining: 38
    checkpoints:
      - checkpoint_id: "ckpt-iteration-5"
        snapshot_path: ".aiwg/rlm/checkpoints/ckpt-a1b2c3d4.json"
```

### On Ralph Loop Crash

Ralph's crash recovery protocol with RLM:

```
1. Ralph detects crash (unhandled exception, timeout, OOM)
2. Ralph loads last checkpoint
3. Ralph detects RLM state reference in checkpoint
4. Ralph restores RLM state from `.aiwg/rlm/states/{state_id}/state.json`
5. Ralph resumes with RLM agent at last known state
6. RLM agent reads state variables, sees partial progress
7. RLM continues from where it left off (no re-work)
```

**Example Recovery**:

```bash
# Original run
$ ralph "Analyze 100 papers" --strategy rlm
→ Iteration 1-10: Analyzed 30 papers
→ Iteration 11: Started analyzing paper 31
→ CRASH (OOM)

# Resume
$ ralph --resume last
→ Loaded checkpoint from iteration 10
→ Restored RLM state (30 papers analyzed)
→ Resuming from paper 31
→ No duplicate work
```

### Partial Results Preserved

RLM writes intermediate results to files, not just memory:

```
.aiwg/rlm/tasks/task-{id}/
├── state/
│   └── state.json                 # State variables (Final, files_analyzed, etc.)
├── intermediate/
│   ├── analysis-paper-001.md      # Preserved across crashes
│   ├── analysis-paper-002.md
│   └── ...
├── checkpoints/
│   ├── ckpt-iteration-5.json
│   └── ckpt-iteration-10.json
└── final-result.md                # Only written on completion
```

**On crash**: All intermediate files remain on disk. On resume, RLM reads these files instead of re-analyzing.

### State Variable Restoration

```yaml
# Before crash (iteration 10)
rlm_state:
  variables:
    Final: null
    papers_analyzed: 30
    papers_remaining: 70
    current_batch: [31, 32, 33, 34, 35]
    results_so_far: "file:.aiwg/rlm/intermediate/aggregated-results.json"

# After crash recovery
rlm_state:
  variables:
    Final: null
    papers_analyzed: 30                  # Restored
    papers_remaining: 70                 # Restored
    current_batch: [31, 32, 33, 34, 35]  # Restored
    results_so_far: "file:.aiwg/rlm/intermediate/aggregated-results.json"  # File still exists
```

### Checkpoint Frequency

```yaml
checkpoint_policy:
  # Ralph creates checkpoints every N iterations
  standard_frequency: 5

  # RLM creates internal checkpoints on state changes
  rlm_internal_checkpoints:
    - before_sub_agent_spawn
    - after_batch_completion
    - on_state_variable_update

  # Combined: Ralph checkpoints include RLM state snapshots
  combined_checkpoint_trigger:
    - every_5_ralph_iterations
    - every_rlm_internal_checkpoint
    - on_ralph_loop_boundary
```

## RLM State Variables → Ralph Reflection Memory

Mapping between RLM and Ralph state systems:

| RLM State Variable | Ralph Reflection Memory | Purpose |
|-------------------|------------------------|---------|
| `Final` | Completion signal | Both: indicates task done |
| `prompt` | Original task | Both: preserve original request |
| `{custom_vars}` | Reflection entries | Both: intermediate findings |
| `state[files_analyzed]` | Progress counter | Track work completed |
| `state[errors]` | Failure log | Debug failed sub-calls |
| `state[cost_so_far]` | Cost tracking | Monitor budget usage |

### Cross-System Access

```yaml
# RLM agent can read Ralph reflection memory
rlm_access_to_ralph:
  - "Read reflection memory to avoid duplicate work"
  - "Check if similar task was attempted before"
  - "Learn from past failures"

# Ralph can read RLM state variables
ralph_access_to_rlm:
  - "Check RLM completion status (Final variable)"
  - "Monitor RLM progress (custom variables)"
  - "Incorporate RLM results into reflection"
```

### Example Integration

```yaml
# Ralph iteration 15
ralph_reflection:
  iteration: 15
  thought: "RLM agent has analyzed 50/100 papers"
  rlm_state_snapshot:
    papers_analyzed: 50
    cost_so_far_usd: 2.35
    current_quality_score: 0.87
  decision: "Continue RLM processing, quality is good"

# RLM uses reflection
rlm_state:
  variables:
    Final: null
    papers_analyzed: 50
    ralph_reflection_note: "Ralph confirms quality is good, continue"
```

## Combined Workflow Patterns

### Ralph Outer + RLM Inner

**Pattern**: Ralph orchestrates high-level flow, RLM handles context-heavy steps.
```yaml
workflow:
  phase_1_requirements:
    agent: requirements-analyst
    strategy: standard

  phase_2_architecture:
    agent: rlm-agent
    strategy: rlm
    reason: "Need to analyze all existing APIs for consistency"

  phase_3_implementation:
    agent: software-implementer
    strategy: standard

  phase_4_testing:
    agent: rlm-agent
    strategy: rlm
    reason: "Need to analyze all code paths for test coverage"
```

### When to Use Each Approach

| Scenario | Use Ralph Alone | Use RLM Alone | Use Ralph + RLM |
|----------|----------------|---------------|-----------------|
| **Single focused task** | ✓ | | |
| **Iterative refinement** | ✓ | | |
| **Large file analysis** | | ✓ | |
| **Multi-file search** | | ✓ | |
| **Multi-phase workflow with large-context steps** | | | ✓ |
| **Long-running corpus analysis** | | ✓ | |
| **Interactive debugging** | ✓ | | |
| **Batch processing** | | ✓ | |

### Decision Matrix

```
┌─────────────────────────────────────────────────────────────┐
│                    Task Characteristics                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Context Size      Iteration Need     Decomposition         │
│                                                             │
│  Small (<10K)      High            →  Ralph Alone           │
│  Small (<10K)      Low             →  Direct Agent          │
│  Large (>100K)     Low             →  RLM Alone             │
│  Large (>100K)     High            →  Ralph + RLM           │
│  Medium (10-100K)  High            →  Ralph (w/ RLM?)       │
│  Medium (10-100K)  Low             →  RLM or Ralph          │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Configuration Examples

### Basic Ralph + RLM

```yaml
# .aiwg/ralph-config.yml
ralph:
  max_iterations: 50
  completion_criteria: "analysis complete"
  strategy: rlm

rlm:
  max_depth: 3
  max_sub_calls: 20
  budget_tokens: 500000
```

### Advanced Ralph + RLM

```yaml
# .aiwg/ralph-config.yml
ralph:
  max_iterations: 100
  completion_criteria: "all files analyzed AND report generated"
  strategy: rlm
  checkpoint_frequency: 5
  enable_reflection_memory: true

rlm:
  max_depth: 5
  max_sub_calls: 50
  budget_tokens: 1000000
  parallel_sub_calls: true
  chunk_strategy: "by_structure"  # auto | by_function | by_section
  cache_intermediate: true
  cost_tracking: true
  intermediate_dir: ".aiwg/rlm/tasks/{task-id}/intermediate/"
  state_dir: ".aiwg/rlm/tasks/{task-id}/state/"

  # Sub-agent configuration
  sub_agent_config:
    model: "sonnet"
    temperature: 0.7
    timeout_ms: 300000

  # Checkpoint configuration
  checkpoint_policy:
    create_on_state_change: true
    create_on_sub_call_complete: true
    retention_count: 10
```

### Completion Criteria with RLM

```yaml
# Standard Ralph completion
ralph:
  completion_criteria: "tests pass"

# RLM-specific completion (Final variable)
ralph:
  strategy: rlm
  completion_criteria:
    type: rlm_final_set
    expected_pattern: "Analysis complete.*"
    timeout_iterations: 50

# Hybrid completion (all conditions must be true)
ralph:
  strategy: rlm
  completion_criteria:
    - rlm_final_set: true
    - output_file_exists: ".aiwg/rlm/final-result.md"
    - ralph_criteria: "report generated"
```

## Example Workflows

### Workflow 1: Codebase Security Analysis (Auto-Detect)

**Command**:

```bash
ralph "Analyze security of entire codebase" --completion "security report ready"
```

**Auto-Detection**:

```
→ Detected: "entire codebase" (batch keyword)
→ File count: 487 files
→ Estimated tokens: 850K
→ Confidence: 0.95
→ AUTO-ACTIVATING RLM strategy
```

**Execution**:

```yaml
# Ralph Iteration 1
thought: "Need to discover all source files"
action: RLM.discover_files()
observation: "Found 487 .ts/.js files"
rlm_state:
  files_discovered: 487

# Ralph Iteration 2-10
thought: "RLM is analyzing files in batches"
action: RLM.batch_analyze(batch_size=50)
observation: "Batch 1/10 complete (50 files analyzed)"
rlm_state:
  files_analyzed: 50
  security_issues_found: 12

# Ralph Iteration 11-50
[RLM continues batch processing]

# Ralph Iteration 51
thought: "RLM has completed analysis"
action: RLM.finalize_report()
observation: "Security report written"
rlm_state:
  Final: "Security analysis complete. 47 issues found."
completion: "security report ready" → MATCHES
```

### Workflow 2: Refactor All Tests (Explicit --strategy rlm)

**Command**:

```bash
ralph "Refactor all test files to use new pattern" --strategy rlm --completion "all tests pass"
```

**Execution**:

```yaml
# Ralph Iteration 1
thought: "Using explicit RLM strategy"
action: RLM.discover_test_files()
observation: "Found 123 test files"
rlm_state:
  test_files: 123

# Ralph Iteration 2
thought: "Create refactoring plan"
action: RLM.analyze_current_pattern()
observation: "Current pattern identified"
rlm_state:
  current_pattern: "describe/it blocks"
  target_pattern: "test/expect blocks"

# Ralph Iteration 3-20
thought: "Delegate per-file refactoring"
action: RLM.spawn_refactor_agents(parallel=true)
observation: "Spawned 123 sub-agents"
rlm_state:
  refactorings_complete: 0/123

# Ralph Iteration 21-70
[RLM sub-agents complete refactoring]
rlm_state:
  refactorings_complete: 123/123

# Ralph Iteration 71
thought: "All refactorings complete, run tests"
action: Bash("npm test")
observation: "All tests pass"
rlm_state:
  Final: "Refactoring complete. All 123 files updated."
completion: "all tests pass" → MATCHES
```

### Workflow 3: Multi-Repo API Key Search (rlm-batch)

**Command**:

```bash
ralph "Search for API key leaks across all 5 repositories" --strategy rlm --completion "report saved"
```

**Execution**:

```yaml
# Ralph Iteration 1
thought: "RLM strategy for multi-repo search"
action: RLM.discover_repos()
observation: "Found 5 repos"
rlm_state:
  repos: ["repo1", "repo2", "repo3", "repo4", "repo5"]

# Ralph Iteration 2
thought: "Spawn parallel search per repo"
action: RLM.spawn_repo_searches(parallel=true)
observation: "5 sub-agents spawned"
rlm_state:
  searches_complete: 0/5

# Ralph Iteration 3-8
[Sub-agents search in parallel]
sub_agent_1: "repo1: 0 API keys found"
sub_agent_2: "repo2: 3 API keys found"
sub_agent_3: "repo3: 0 API keys found"
sub_agent_4: "repo4: 1 API key found"
sub_agent_5: "repo5: 0 API keys found"
rlm_state:
  searches_complete: 5/5
  total_keys_found: 4

# Ralph Iteration 9
thought: "Aggregate results"
action: RLM.generate_report()
observation: "Report written to .aiwg/rlm/api-key-leaks-report.md"
rlm_state:
  Final: "Search complete. 4 API keys found across 2 repos."
completion: "report saved" → MATCHES
```

## Workflow Visualization

### Ralph-Only Workflow

```
User Task
    ↓
┌───────────────────┐
│   Ralph Loop      │
│                   │
│  Iteration 1-N:   │
│  - Think          │
│  - Act            │
│  - Observe        │
│  - Reflect        │
│                   │
│  Completion:      │
│  - Criteria met   │
└───────────────────┘
    ↓
  Output
```

### RLM-Only Workflow

```
User Task
    ↓
┌────────────────────────────────────────┐
│         RLM Agent                      │
│                                        │
│  1. Discover context (Grep/Glob)       │
│  2. Decompose into sub-tasks           │
│  3. Spawn sub-agents (recursive)       │
│  4. Aggregate results                  │
│  5. Set Final variable                 │
│                                        │
│  Completion:                           │
│  - Final != null                       │
└────────────────────────────────────────┘
    ↓
  Output
```

### Ralph + RLM Workflow

```
User Task
    ↓
┌──────────────────────────────────────────────────────────┐
│                    Ralph Loop (Outer)                    │
│                                                          │
│  Iteration 1:                                            │
│  - Thought: "This needs RLM strategy"                    │
│  - Action: Initialize RLM agent                          │
│  - Observation: RLM agent ready                          │
│                                                          │
│  Iteration 2-N:                                          │
│   ┌──────────────────────────────────────────────────┐   │
│   │                RLM Agent (Inner)                 │   │
│   │                                                  │   │
│   │  RLM Step 1: Discover context                    │   │
│   │  RLM Step 2: Decompose                           │   │
│   │  RLM Step 3: Spawn sub-agents                    │   │
│   │    ├── Sub-agent 1 (depth 1)                     │   │
│   │    ├── Sub-agent 2 (depth 1)                     │   │
│   │    └── Sub-agent N (depth 1)                     │   │
│   │  RLM Step 4: Aggregate                           │   │
│   │  RLM Step 5: Set Final                           │   │
│   └──────────────────────────────────────────────────┘   │
│  - Observation: RLM Final set, results ready             │
│                                                          │
│  Iteration N+1:                                          │
│  - Thought: "RLM complete, verify results"               │
│  - Action: Check completion criteria                     │
│  - Observation: Criteria met                             │
│                                                          │
│  Completion:                                             │
│  - Ralph criteria AND RLM Final set                      │
└──────────────────────────────────────────────────────────┘
    ↓
  Output
```

## Cost Model

Based on REF-089 research:

| Metric | Ralph-Only | RLM-Only | Ralph + RLM |
|--------|-----------|----------|-------------|
| **Median cost** | 1.0x | 0.8-1.2x | 0.9-1.3x |
| **Best case** | 1.0x | 0.3x (sparse access) | 0.4x |
| **Worst case** | 3.0x (context overflow) | 3.0x (bad decomposition) | 3.5x |
| **Cost variance** | Low | Moderate | Moderate |

**When Ralph + RLM is cheaper**:

- Long contexts (>100K tokens)
- Sparse access patterns (only need 10% of context)
- Parallelizable sub-problems
- Good decomposition (RLM agent understands structure)

**When Ralph + RLM is more expensive**:

- Short contexts (<10K tokens): overhead not justified
- Dense access patterns (need 90%+ of context)
- Sequential dependencies (can't parallelize)
- Poor decomposition (RLM creates too many sub-calls)

## Integration Points

### With Agent Supervisor

Agent Supervisor routes tasks to RLM-enabled Ralph:

```yaml
agent_supervisor_routing:
  rules:
    - condition: "task mentions 'all files' OR file_count > 50"
      route_to: ralph_with_rlm
      strategy: rlm

    - condition: "task is focused AND context < 10K tokens"
      route_to: ralph_standard
      strategy: standard
```

### With Cost Tracking

```yaml
cost_tracking:
  ralph_level:
    - total_iterations
    - total_duration
    - total_tokens

  rlm_level:
    - sub_calls_count
    - tree_depth
    - parallel_efficiency
    - per_node_cost

  combined_report:
    - ralph_overhead: "Ralph coordination cost"
    - rlm_decomposition_cost: "RLM planning cost"
    - rlm_execution_cost: "RLM sub-agent cost"
    - total_cost: sum(ralph + rlm)
    - cost_vs_baseline: comparison to direct processing
```

### With Reflection Memory

```yaml
reflection_memory_integration:
  # Ralph writes RLM state to reflection
  on_rlm_checkpoint:
    - capture_rlm_state_snapshot
    - write_to_ralph_reflection
    - tag: "rlm_state"

  # RLM reads past attempts from reflection
  on_rlm_init:
    - load_ralph_reflection_memory
    - check_for_similar_past_tasks
    - reuse_decomposition_if_applicable
```

## Best Practices

### When to Use --strategy rlm

✅ **Use RLM when**:

- Task involves >20 files or >50K tokens
- Task contains batch keywords ("all", "entire", "every")
- Need to preserve information fidelity (lossless)
- Sub-problems are parallelizable
- Cost efficiency matters

❌ **Don't use RLM when**:

- Task is focused on 1-3 files
- Context is <10K tokens
- Summarization is acceptable (lossy is OK)
- Real-time constraints (RLM adds latency)
- No clear decomposition strategy

### Effective Decomposition

```yaml
good_decomposition:
  - "Analyze each file independently"
  - "Search all repos, aggregate results"
  - "Per-module security review"
  - "Batch process documents"

poor_decomposition:
  - "Understand the entire system"  # too vague
  - "Find all bugs"                 # no clear sub-structure
  - "Improve code quality"          # subjective, hard to parallelize
```

### Monitoring RLM Progress

```bash
# Check RLM state during execution
ralph-status --show-rlm-state

# Output:
# Ralph Iteration: 25/50
# RLM State:
#   - Files analyzed: 120/487
#   - Current depth: 2
#   - Sub-calls active: 8
#   - Cost so far: $3.42
#   - Estimated completion: 15 iterations
```

## Troubleshooting

### RLM Not Activating

**Problem**: Ralph doesn't use RLM even though the task is long-context.

**Solutions**:

```bash
# Explicit activation
ralph "task" --strategy rlm

# Check detection confidence
ralph "task" --debug-strategy
# Shows: "RLM confidence: 0.45 (below threshold 0.5)"

# Lower threshold (if needed)
export AIWG_RLM_THRESHOLD=0.4
ralph "task"  # Now activates if confidence > 0.4
```

### RLM Creating Too Many Sub-Calls

**Problem**: RLM spawns 100+ sub-agents, costs spike.

**Solutions**:

```bash
# Limit sub-calls
ralph "task" --strategy rlm --max-sub-calls 20

# Increase chunk size (fewer sub-calls)
ralph "task" --strategy rlm --chunk-size 2000

# Use coarser decomposition
ralph "task" --strategy rlm --chunk-strategy by_module  # vs by_function
```

### RLM Recursion Too Deep

**Problem**: RLM creates depth-5 trees, loses context.

**Solutions**:

```bash
# Limit depth
ralph "task" --strategy rlm --max-depth 2

# Force flat decomposition
ralph "task" --strategy rlm --decomposition-strategy parallel
```

### Ralph + RLM Never Completes

**Problem**: Loop runs to max iterations without completion.

**Solutions**:

```bash
# Check RLM Final variable
ralph-status --show-rlm-state | grep Final
# If Final is null, RLM hasn't signaled completion

# Debug RLM state
cat .aiwg/rlm/tasks/{task-id}/state/state.json | jq '.variables.Final'

# Adjust completion criteria
ralph "task" --strategy rlm --completion "Final set AND output exists"
```

## References

- @agentic/code/addons/rlm/agents/rlm-agent.md - RLM agent definition
- @agentic/code/addons/rlm/schemas/rlm-task-tree.yaml - Task tree structure
- @agentic/code/addons/rlm/schemas/rlm-state.yaml - State management
- @agentic/code/addons/ralph/agents/ralph-loop.md - Ralph loop implementation
- @.aiwg/research/findings/REF-089-recursive-language-models.md - Research foundation
- @.claude/rules/tao-loop.md - TAO loop standardization
- @tools/daemon/agent-supervisor.mjs - Task routing
- Issue #329 - Ralph-RLM integration epic

---

**Status**: Active

**Last Updated**: 2026-02-09