aiwg
Version:
Cognitive architecture for AI-augmented software development with structured memory, ensemble validation, and closed-loop correction. FAIR-aligned artifacts, 84% cost reduction via human-in-the-loop, standards adopted by 100+ organizations.
444 lines (313 loc) • 11.1 kB
Markdown
# Production-Grade AIWG Guide
This guide documents the production-grade improvements to AIWG based on academic research analysis.
## Research Foundation
AIWG production-grade features are grounded in peer-reviewed research:
- **REF-001**: Bandara et al. (2024) ["Production-Grade Agentic AI Workflows"](#research-production-workflows) (arxiv 2512.08769)
- **REF-002**: Roig (2025) ["How Do LLMs Fail In Agentic Scenarios?"](#research-failure-modes) (arxiv 2512.07497v2)
## Agent Design Bible
The [Agent Design Bible](#ref-agent-design) defines 10 Golden Rules for building production-grade agents.
### Quick Reference
| Rule | Name | Key Principle |
|------|------|---------------|
| 1 | Single Responsibility | One agent, one job |
| 2 | Minimal Tools | 0-3 tools per agent |
| 3 | Explicit I/O | Clear inputs and outputs |
| 4 | Grounding First | Verify before acting |
| 5 | Escalate Uncertainty | Never guess silently |
| 6 | Scoped Context | Filter irrelevant data |
| 7 | Recovery-First | Design for failure |
| 8 | Model Tier | Match complexity to task |
| 9 | Parallel-Ready | Enable concurrent execution |
| 10 | Observable | Emit structured logs |
### Agent Linting
Validate agents against the 10 rules:
```bash
# Lint all framework agents
aiwg lint agents
# Lint specific directory with verbose output
aiwg lint agents .claude/agents/ --verbose
# CI-friendly JSON output
aiwg lint agents --json --strict
```
### Agent Scaffolding
Create new agents using validated templates:
```bash
# Simple agent (haiku, minimal tools)
aiwg add-agent code-analyzer --to my-addon --template simple
# Complex agent (sonnet, multiple tools)
aiwg add-agent architecture-reviewer --to my-addon --template complex
# Orchestrator (opus, coordinates other agents)
aiwg add-agent workflow-coordinator --to my-addon --template orchestrator
```
## LLM Failure Mode Mitigations
REF-002 identifies four failure archetypes. AIWG provides specific mitigations:
### Archetype 1: Premature Action Without Grounding
**Problem**: Agent guesses database schema instead of inspecting it.
**Mitigation**: Rule 4 (Grounding First)
```markdown
## Grounding Protocol
Before external operations:
1. List available resources (files, tables, APIs)
2. Verify target exists
3. Inspect structure before manipulation
4. NEVER assume schema from similar names
```
### Archetype 2: Over-Helpfulness Under Uncertainty
**Problem**: Agent substitutes similar entity when target not found.
**Mitigation**: Rule 5 (Escalate Uncertainty)
```markdown
## Uncertainty Protocol
When input cannot be resolved:
1. List candidates with confidence scores
2. Explain why no exact match
3. Ask user to select or provide more context
4. NEVER silently substitute
```
### Archetype 3: Distractor-Induced Context Pollution
**Problem**: Irrelevant data in context causes errors ("Chekhov's gun" effect).
**Mitigation**: Context Curator Addon
```bash
# Deploy context curator
aiwg use sdlc # Includes context-curator addon
```
The Context Curator agent scores context as:
- **RELEVANT**: Directly needed for current task
- **PERIPHERAL**: May be useful, keep accessible
- **DISTRACTOR**: Remove from active context
### Archetype 4: Fragile Execution Under Load
**Problem**: Agent enters loops, loses coherence, or makes malformed tool calls.
**Mitigation**: Resilience Protocol
```markdown
## PAUSE→DIAGNOSE→ADAPT→RETRY→ESCALATE
1. **PAUSE**: Stop, preserve state
2. **DIAGNOSE**: Classify error type
3. **ADAPT**: Choose different approach
4. **RETRY**: Max 3 adapted attempts
5. **ESCALATE**: Structured report to user
```
Loop detection triggers when:
- Same tool called 3+ times consecutively
- Same error occurs 2+ times
- Identical outputs from different attempts
## @-Mention Traceability
Claude Code 2.0.43 fixed @-mention loading for nested files. AIWG leverages this for live traceability.
### Conventions
```
# Requirements
@.aiwg/requirements/UC-{NNN}-{slug}.md # Use cases
@.aiwg/requirements/NFR-{CAT}-{NNN}.md # Non-functional
# Architecture
@.aiwg/architecture/adrs/ADR-{NNN}-{slug}.md # Decisions
# Security
@.aiwg/security/TM-{NNN}.md # Threats
@.aiwg/security/controls/{id}.md # Controls
# Code
@src/{path} # Source
@test/{path} # Tests
```
### Code Integration
Add @-mentions to file headers:
```typescript
/**
* @file Authentication Service
* @implements @.aiwg/requirements/UC-003-user-auth.md
* @architecture @.aiwg/architecture/adrs/ADR-005-jwt-strategy.md
* @security @.aiwg/security/controls/authn-001.md
* @tests @test/integration/auth.test.ts
*/
```
### Wiring Utilities
```bash
# Analyze codebase and suggest @-mentions
aiwg wire-mentions --dry-run
# Apply high-confidence suggestions
aiwg wire-mentions --auto
# Validate all @-mentions resolve
aiwg validate-mentions --strict
# Lint for style consistency
aiwg mention-lint --fix
```
## Hooks
AIWG hooks integrate with Claude Code 2.0.43+ lifecycle events.
### Trace Hook
Captures agent execution history for debugging and recovery.
```javascript
// .claude/hooks/aiwg-trace.js
// Automatically logs:
// - SubagentStart: agent_id, type, timestamp
// - SubagentStop: agent_id, transcript_path, duration
```
View traces:
```bash
# Tree view
node ~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-hooks/scripts/trace-viewer.mjs tree
# Timeline view
node ~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-hooks/scripts/trace-viewer.mjs timeline
```
### Permission Hook
Auto-approve trusted operations to reduce prompts:
```javascript
// .claude/hooks/aiwg-permissions.js
// Auto-approves:
// - Write to .aiwg/**
// - Read from ai-writing-guide/**
// - Git operations on AIWG branches
```
### Session Hook
Manage named sessions for workflow persistence:
```bash
# Generate session name
node aiwg-session.js suggest inception-to-elaboration
# Output: aiwg-inception-to-elaboration-2025-01-15-1030
# Record session
node aiwg-session.js record aiwg-my-session --workflow security-review
# List recent sessions
node aiwg-session.js list
```
## Prompt Registry
Import prompts directly into CLAUDE.md using @-imports.
### Available Prompts
| Path | Purpose |
|------|---------|
| `prompts/core/orchestrator.md` | Claude orchestrator guidance |
| `prompts/core/multi-agent-pattern.md` | Primary→Reviewers→Synthesizer |
| `prompts/core/consortium-pattern.md` | Multi-expert coordination |
| `prompts/agents/design-rules.md` | Condensed 10 Golden Rules |
| `prompts/reliability/decomposition.md` | Task decomposition templates |
| `prompts/reliability/parallel-hints.md` | Parallel execution patterns |
| `prompts/reliability/resilience.md` | Recovery protocol |
### Usage
In your project's CLAUDE.md:
```markdown
## Orchestration Guidelines
@~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-utils/prompts/core/orchestrator.md
## Multi-Agent Pattern
@~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-utils/prompts/core/multi-agent-pattern.md
```
## Deploy Generators
Generate production-ready deployment configurations.
```bash
# Docker (multi-stage, non-root, health checks)
/deploy-gen docker --app-name myapp --port 3000
# Kubernetes (SecurityContext, probes, resources)
/deploy-gen k8s --app-name myapp --port 3000
# Docker Compose
/deploy-gen compose --app-name myapp --port 3000
```
Generated configurations include:
- Multi-stage builds (deps → builder → runner)
- Non-root user execution
- Health check endpoints
- Resource limits
- Security contexts
- Anti-affinity rules (K8s)
## Evals Framework
Automated testing for agent behavior against failure archetypes.
### Running Evals
```bash
# Test agent against grounding scenario
/eval-agent my-agent --category archetype --scenario grounding-test
# Test agent against distractor scenario
/eval-agent my-agent --category archetype --scenario distractor-test
# Test recovery behavior
/eval-agent my-agent --category archetype --scenario recovery-test
```
### Scenario Types
| Scenario | Tests | Pass Criteria |
|----------|-------|---------------|
| grounding-test | Archetype 1 | Agent inspects before acting |
| distractor-test | Archetype 3 | Agent ignores irrelevant data |
| recovery-test | Archetype 4 | Agent recovers from errors |
### Creating Custom Scenarios
```yaml
# .aiwg/evals/my-scenario.yaml
name: custom-validation
category: archetype
description: Test custom validation behavior
setup:
files:
- path: target.md
content: |
# Target Document
This is the correct target.
validation:
type: output_contains
expected: "target.md"
pass_threshold: 1.0
```
## Agent Personas
Pre-configured agents for common workflows.
### Available Personas
| Persona | Focus | Model | Permissions |
|---------|-------|-------|-------------|
| `aiwg-orchestrator` | Full SDLC orchestration | opus | full |
| `aiwg-reviewer` | Code review | sonnet | write-artifacts |
| `aiwg-security` | Security audit | sonnet | read-only |
| `aiwg-writer` | Documentation | sonnet | write-artifacts |
### Usage
```bash
# Launch with persona
claude --agent aiwg-orchestrator
# Or via AIWG CLI
aiwg --persona orchestrator
```
## Multi-Agent Patterns
### Primary → Reviewers → Synthesizer
Standard pattern for document generation with review:
```
Primary Author (opus) → Creates draft
↓
Parallel Reviewers (sonnet) → Independent reviews
↓
Synthesizer (sonnet) → Merges feedback
↓
Archive → .aiwg/[category]/
```
**Key**: Launch reviewers in SINGLE message for true parallelism.
### Consortium Pattern
Multi-expert coordination with trade-off documentation:
```
Coordinator (opus)
↓
Parallel Experts (sonnet) → Independent analysis
↓
Trade-off Matrix → Document disagreements
↓
Synthesis → Majority + dissent record
```
## Metrics & Targets
Based on REF-002 benchmarks:
| Metric | Target | Source |
|--------|--------|--------|
| Grounding compliance | >90% | Archetype 1 |
| Entity substitution rate | <5% | Archetype 2 |
| Distractor error reduction | ≥50% | Archetype 3 |
| Recovery success rate | ≥80% | Archetype 4 |
| Parallel utilization | >60% | REF-001 BP-9 |
## Quick Start
### New Project
```bash
# Create project with SDLC framework
aiwg -new my-project
cd my-project
# Verify agent linting passes
aiwg lint agents
# Wire @-mentions
aiwg wire-mentions --dry-run
```
### Existing Project
```bash
# Deploy SDLC framework
aiwg use sdlc
# Setup hooks
cp ~/.local/share/ai-writing-guide/agentic/code/addons/aiwg-hooks/hooks/*.js .claude/hooks/
# Generate intake from codebase
/intake-from-codebase .
# Wire @-mentions
aiwg wire-mentions --interactive
```
## References
- [Agent Design Bible](#ref-agent-design) - 10 Golden Rules
- [REF-001](#research-production-workflows) - Production-Grade Research
- [REF-002](#research-failure-modes) - LLM Failure Modes
- [SDLC Framework](#quickstart-sdlc) - Complete lifecycle