# UAT-MCP Toolkit

Agent-executable acceptance testing via MCP connections. Generate phased UAT plans from MCP tool manifests, execute them against live connections, and produce structured coverage reports.

## Quick Start

```bash
# Install the addon
aiwg use uat-mcp

# Generate a UAT plan from connected MCP servers
/uat-generate --mode mcp

# Execute the plan
/uat-execute .aiwg/testing/uat/plan.md

# Generate coverage report
/uat-report .aiwg/testing/uat/results/
```

Or use natural language:

```
"run UAT on the MCP tools"
"generate a UAT plan for this server"
"acceptance test the MCP connections"
```

## Components

| Type | Name | Purpose |
|------|------|---------|
| Agent | `uat-planner` | Designs phased UAT plans from MCP tool manifests and domain context |
| Agent | `uat-executor` | Executes UAT plans step-by-step via MCP, filing issues on failure |
| Command | `/uat-generate` | Discover MCP tools and scaffold phased UAT plan with test specs |
| Command | `/uat-execute` | Run a UAT plan against live MCP connections |
| Command | `/uat-report` | Generate UAT completion report with coverage metrics |
| Skill | `uat-mode` | Natural language detection for UAT-related requests |

## Key Principles

### MCP-First Policy

All tests use MCP tool calls. If a tool doesn't exist for an operation, that's a finding: file a bug. Never fall back to curl/HTTP. The purpose is to validate the interface agents actually use.

### Phase Structure

Tests are organized into sequential phases:

1. **Preflight**: Verify MCP connectivity and authentication
2. **Seed Data**: Create test data via MCP tools
3. **Per-Category**: Test each tool category (CRUD, search, admin, etc.)
4. **E2E Chains**: Cross-phase workflows using stored variables
5. **Cleanup**: Remove test data (always runs, regardless of failures)

### Negative Test Isolation

Tests expecting errors run in isolation (single MCP call per turn) to prevent sibling-call cascades from polluting results.
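The phase ordering and isolation rules above can be sketched in a few lines of Python. This is an illustration only, not the toolkit's implementation: `UatTest`, `plan_turns`, `run_phases`, and `PHASE_ORDER` are hypothetical names, and stopping at the first failed phase is an assumption (the README only states that cleanup always runs).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical phase identifiers; cleanup is handled separately so it always runs.
PHASE_ORDER = ["preflight", "seed_data", "per_category", "e2e_chains"]


@dataclass
class UatTest:
    name: str
    expects_error: bool = False  # True for negative tests


def plan_turns(tests: List[UatTest]) -> List[List[UatTest]]:
    """Group tests into turns; each error-expecting test gets a turn of its own."""
    turns: List[List[UatTest]] = []
    batch: List[UatTest] = []
    for test in tests:
        if test.expects_error:
            if batch:
                turns.append(batch)
                batch = []
            turns.append([test])  # isolated: a single call in this turn
        else:
            batch.append(test)
    if batch:
        turns.append(batch)
    return turns


def run_phases(phases: Dict[str, Callable[[], None]],
               cleanup: Callable[[], None]) -> List[str]:
    """Run phases in order and always run cleanup, even after a failure."""
    log: List[str] = []
    try:
        for name in PHASE_ORDER:
            phases[name]()
            log.append(name)
    except Exception as exc:  # assumption: a failed phase stops later phases
        log.append(f"failed: {exc}")
    finally:
        cleanup()
        log.append("cleanup")
    return log
```

Keeping cleanup in a `finally` block mirrors the "always runs" guarantee, and giving each negative test its own turn means an expected error cannot take sibling calls down with it.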
### Auto-Issue Filing

Failed tests automatically create issues tagged `bug` + `uat` in the configured tracker (Gitea or GitHub).

## Execution Modes

| Mode | Tests Run | Duration | Use Case |
|------|-----------|----------|----------|
| Quick Smoke | Preflight + 1 happy path per tool | ~5 min | CI/pre-commit |
| Standard | All happy paths + key edge cases | ~15 min | Sprint validation |
| Full | All tests including negative + E2E | ~30 min | Release qualification |

## Configuration

In `.aiwg/config.yaml`:

```yaml
uat:
  mode: mcp                      # Default test mode (mcp, future: api, ui)
  issue_filing: true             # Auto-create issues for failures
  issue_provider: gitea          # gitea | github | local
  max_phases: 30                 # Safety limit on phase count
  execution_mode: standard       # quick | standard | full
  cleanup_always: true           # Run cleanup phase even on failure
  negative_test_isolation: true  # Isolate error-expecting tests
```

## When to Use

- **Pre-release**: Validate MCP tool surface before shipping
- **After refactors**: Ensure MCP tools still behave correctly
- **New MCP server setup**: Generate baseline test suite from tool manifest
- **CI integration**: Run quick smoke tests on every push
- **Regression detection**: Compare results across runs

## Future Modes

The `--mode` parameter defaults to `mcp` but is designed for extensibility:

| Mode | Status | Description |
|------|--------|-------------|
| `mcp` | Available | Test MCP tool connections |
| `api` | Planned | Test REST/GraphQL API endpoints |
| `ui` | Planned | Test UI interactions via browser automation |

## Related

- Issue: #380
- RLM addon (similar structure): `agentic/code/addons/rlm/`
- MCP server implementation: `src/mcp/`