claude-flow

Ruflo - Enterprise AI agent orchestration for Claude Code. Deploy 60+ specialized agents in coordinated swarms with self-learning, fault-tolerant consensus, vector memory, and MCP integration

/**
 * GAIA End-to-End Smoke - ADR-133
 *
 * Wires gaia-agent.ts + gaia-judge.ts into a single end-to-end pipeline:
 *
 *   for each question in SMOKE_FIXTURE:
 *     1. runGaiaAgent(question) - Haiku agent loop, ≤8 turns
 *     2. judgeAnswer(question, result.finalAnswer) - exact-match fast-path,
 *        Sonnet LLM-judge only if exact-match misses
 *
 * Reports: pass rate, total cost, mean turn count.
 * Asserts: ≥ 3/5 questions pass (lenient: smoke fixture is not trivial).
 *
 * Cost discipline:
 *   - Agent: claude-haiku-4-5 at $0.25/$1.25 per M tokens
 *   - Judge: claude-sonnet-4-6 at $3/$15 per M tokens (only when needed)
 *   - Expected total for 5 questions × ~2 turns × Haiku + 1-2 Sonnet
 *     judge calls ≈ $0.02
 *
 * Usage:
 *   ANTHROPIC_API_KEY=sk-ant-... npx tsx src/benchmarks/gaia-e2e-smoke.ts
 *
 * Refs: ADR-133, #2156
 */
declare function runE2ESmoke(): Promise<void>;
export { runE2ESmoke };
//# sourceMappingURL=gaia-e2e-smoke.d.ts.map
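
The declaration file exposes only `runE2ESmoke`, so the pipeline body is not visible here. The sketch below illustrates the loop the header comment describes: agent run, exact-match fast-path, LLM judge as a fallback, then a summary and a pass threshold. Every type and helper in it (`Question`, `AgentResult`, `Verdict`, `exactMatch`, `summarize`, `runSmoke`, and the injected `agent` / `llmJudge` callbacks) is hypothetical. The real `runGaiaAgent` and `judgeAnswer` signatures, and the way the actual judge normalizes answers, may differ.

```typescript
// Hypothetical shapes: the real gaia-agent.ts / gaia-judge.ts types are not shown.
interface Question { id: string; prompt: string; expected: string }
interface AgentResult { finalAnswer: string; turns: number; costUsd: number }
interface Verdict { pass: boolean; costUsd: number }
interface Row { pass: boolean; costUsd: number; turns: number }
interface Summary { passed: number; total: number; totalCostUsd: number; meanTurns: number }

type Agent = (q: Question) => Promise<AgentResult>;
type LlmJudge = (q: Question, answer: string) => Promise<Verdict>;

// Assumed normalization: trim, lowercase, collapse whitespace.
function exactMatch(expected: string, answer: string): boolean {
  const norm = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return norm(expected) === norm(answer);
}

// Fast-path first; the (more expensive) LLM judge runs only on a miss.
async function judgeAnswer(q: Question, answer: string, llmJudge: LlmJudge): Promise<Verdict> {
  if (exactMatch(q.expected, answer)) return { pass: true, costUsd: 0 };
  return llmJudge(q, answer);
}

// Aggregate pass count, total cost, and mean turn count.
function summarize(rows: Row[]): Summary {
  const total = rows.length;
  const passed = rows.filter((r) => r.pass).length;
  const totalCostUsd = rows.reduce((sum, r) => sum + r.costUsd, 0);
  const meanTurns = total === 0 ? 0 : rows.reduce((sum, r) => sum + r.turns, 0) / total;
  return { passed, total, totalCostUsd, meanTurns };
}

// Runs the fixture sequentially and throws if fewer than minPass questions pass.
async function runSmoke(
  fixture: Question[],
  agent: Agent,
  llmJudge: LlmJudge,
  minPass = 3,
): Promise<Summary> {
  const rows: Row[] = [];
  for (const q of fixture) {
    const result = await agent(q);
    const verdict = await judgeAnswer(q, result.finalAnswer, llmJudge);
    rows.push({
      pass: verdict.pass,
      costUsd: result.costUsd + verdict.costUsd,
      turns: result.turns,
    });
  }
  const summary = summarize(rows);
  console.log(
    `pass ${summary.passed}/${summary.total}, ` +
      `cost $${summary.totalCostUsd.toFixed(4)}, mean turns ${summary.meanTurns.toFixed(2)}`,
  );
  if (summary.passed < minPass) {
    throw new Error(`smoke failed: ${summary.passed}/${summary.total} passed, need ${minPass}`);
  }
  return summary;
}
```

Injecting `agent` and `llmJudge` as parameters is a choice made for this sketch so the loop can be exercised with stubs and no API key. The published module may instead import its agent and judge directly.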