aiwg
Version:
Deployment tool and support utility for AI context. Copies agents, skills, commands, rules, and behaviors into the paths each AI platform reads (Claude Code, Codex, Copilot, Cursor, Warp, OpenClaw, and 6 more) so one source of truth works across 10 platfo
110 lines (77 loc) • 3.76 kB
Markdown
# UAT-MCP Toolkit
Agent-executable acceptance testing via MCP connections. Generate phased UAT plans from MCP tool manifests, execute them against live connections, and produce structured coverage reports.
## Quick Start
```bash
# Install the addon
aiwg use uat-mcp
# Generate a UAT plan from connected MCP servers
/uat-generate --mode mcp
# Execute the plan
/uat-execute .aiwg/testing/uat/plan.md
# Generate coverage report
/uat-report .aiwg/testing/uat/results/
```
Or use natural language:
```
"run UAT on the MCP tools"
"generate a UAT plan for this server"
"acceptance test the MCP connections"
```
## Components
| Type | Name | Purpose |
|------|------|---------|
| Agent | `uat-planner` | Designs phased UAT plans from MCP tool manifests and domain context |
| Agent | `uat-executor` | Executes UAT plans step-by-step via MCP, filing issues on failure |
| Command | `/uat-generate` | Discover MCP tools and scaffold phased UAT plan with test specs |
| Command | `/uat-execute` | Run a UAT plan against live MCP connections |
| Command | `/uat-report` | Generate UAT completion report with coverage metrics |
| Skill | `uat-mode` | Natural language detection for UAT-related requests |
## Key Principles
### MCP-First Policy
All tests use MCP tool calls. If a tool doesn't exist for an operation, that's a finding — file a bug. Never fall back to curl/HTTP. The purpose is to validate the interface agents actually use.
### Phase Structure
Tests are organized into sequential phases:
1. **Preflight** — Verify MCP connectivity and authentication
2. **Seed Data** — Create test data via MCP tools
3. **Per-Category** — Test each tool category (CRUD, search, admin, etc.)
4. **E2E Chains** — Cross-phase workflows using stored variables
5. **Cleanup** — Remove test data (always runs, regardless of failures)
### Negative Test Isolation
Tests expecting errors run in isolation (single MCP call per turn) to prevent sibling-call cascades from polluting results.
### Auto-Issue Filing
Failed tests automatically create issues tagged `bug` + `uat` in the configured tracker (Gitea or GitHub).
## Execution Modes
| Mode | Tests Run | Duration | Use Case |
|------|-----------|----------|----------|
| Quick Smoke | Preflight + 1 happy path per tool | ~5 min | CI/pre-commit |
| Standard | All happy paths + key edge cases | ~15 min | Sprint validation |
| Full | All tests including negative + E2E | ~30 min | Release qualification |
## Configuration
In `.aiwg/config.yaml`:
```yaml
uat:
mode: mcp # Default test mode (mcp, future: api, ui)
issue_filing: true # Auto-create issues for failures
issue_provider: gitea # gitea | github | local
max_phases: 30 # Safety limit on phase count
execution_mode: standard # quick | standard | full
cleanup_always: true # Run cleanup phase even on failure
negative_test_isolation: true # Isolate error-expecting tests
```
## When to Use
- **Pre-release**: Validate MCP tool surface before shipping
- **After refactors**: Ensure MCP tools still behave correctly
- **New MCP server setup**: Generate baseline test suite from tool manifest
- **CI integration**: Run quick smoke tests on every push
- **Regression detection**: Compare results across runs
## Future Modes
The `--mode` parameter defaults to `mcp` but is designed for extensibility:
| Mode | Status | Description |
|------|--------|-------------|
| `mcp` | Available | Test MCP tool connections |
| `api` | Planned | Test REST/GraphQL API endpoints |
| `ui` | Planned | Test UI interactions via browser automation |
## Related
- Issue: #380
- RLM addon (similar structure): `agentic/code/addons/rlm/`
- MCP server implementation: `src/mcp/`