claude-flow
Ruflo - Enterprise AI agent orchestration for Claude Code. Deploy 60+ specialized agents in coordinated swarms with self-learning, fault-tolerant consensus, vector memory, and MCP integration
/**
* GAIA Question Decomposer — ADR-135 Track E
*
* Decomposes complex GAIA benchmark questions into 1-5 ordered sub-questions
* that can each be answered with a single tool call, then synthesizes the
* sub-answers into a final response.
*
* Motivation (iter 29 finding): tool quality is the bottleneck on L1 (~20.8%).
* Tools that fail on a complex query may still succeed on focused sub-queries.
* This mirrors the decompose-then-solve approach of human solvers, who score
* 92% on GAIA.
* Expected L1 lift: +5-10pp on multi-step questions (~30-40% of the L1 set).
*
* Design:
* - NEW standalone file only; NOT wired into gaia-bench.ts to avoid merge
*   conflicts with in-flight iter 29/31/34/35/36 branches.
*   Wiring is a small follow-up PR.
* - `decomposeQuestion()`: Haiku-cheap classification + decomposition
*   (~$0.0003 per question).
* - `synthesizeFromSubAnswers()`: Sonnet synthesis from sub-answers
*   (the sub-answers are the hard work; synthesis is just combination).
* - Atomic questions are returned as-is, so there is no overhead when
*   decomposition is not needed.
* - Graceful fallback to atomic on API errors or malformed JSON.
*
* Cost discipline:
* - Decomposition uses claude-haiku-4-5 (~$0.0003/question).
* - Synthesis uses claude-sonnet-4-6 (~$0.002/question).
* - Total overhead per question (when decomposed): ~$0.002-0.003.
*
* Plugin sync TODO (follow-up PR after gaia-bench wiring):
* - Update plugins/ruflo-workflows/commands/gaia-run.md with --decompose flag.
* - Update plugins/ruflo-workflows/skills/gaia-debugging/SKILL.md: add
*   decomposition as a recommended strategy for multi-step failures.
*
* Refs: ADR-135, ADR-133, iter 29 finding, #2156
*/
export interface DecomposedQuestion {
  /** The original unmodified question text. */
  originalQuestion: string;
  /**
   * Sub-questions in dependency order (1-5 entries).
   * If decomposed=false, contains exactly one entry equal to originalQuestion.
   */
  subQuestions: string[];
  /**
   * Brief hint for how to combine sub-answers into the final answer.
   * Example: "Multiply the two values found in sub-questions 1 and 2."
   */
  synthesisHint: string;
  /**
   * true if the question was split into multiple sub-questions;
   * false if it was deemed atomic (single lookup / single computation).
   */
  decomposed: boolean;
  /** USD spent on the decomposition API call (0 if fallback/atomic). */
  cost: number;
}
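The atomic case described in the field comments above can be sketched as follows. This is only an illustration: `atomicFallback` is a hypothetical helper, not part of the declared API, and the empty `synthesisHint` is an assumption, since the declarations do not say what the hint holds for atomic questions. The interface is repeated locally so the sketch is self-contained.

```typescript
// Local copy of the DecomposedQuestion shape declared above.
interface DecomposedQuestion {
  originalQuestion: string;
  subQuestions: string[];
  synthesisHint: string;
  decomposed: boolean;
  cost: number;
}

// Hypothetical helper (an assumption, not the package's API): wraps a
// question in its atomic form, i.e. one sub-question equal to the original.
function atomicFallback(questionText: string): DecomposedQuestion {
  return {
    originalQuestion: questionText,
    subQuestions: [questionText], // exactly one entry when decomposed=false
    synthesisHint: "",            // assumed: nothing to combine
    decomposed: false,
    cost: 0,                      // documented as 0 for fallback/atomic
  };
}

const atomic = atomicFallback("What is the capital of France?");
console.log(atomic.subQuestions);
```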
export interface DecomposerOptions {
  /**
   * Model to use for decomposition classification.
   * Default: 'claude-haiku-4-5' (cheap).
   */
  model?: string;
  /**
   * Maximum sub-questions to produce when decomposing.
   * Default: 5.
   */
  maxSubQuestions?: number;
  /** API key override. Falls back to ANTHROPIC_API_KEY env var. */
  apiKey?: string;
}
export interface SynthesizerOptions {
  /**
   * Model to use for final-answer synthesis.
   * Default: 'claude-sonnet-4-6'.
   */
  model?: string;
  /** API key override. Falls back to ANTHROPIC_API_KEY env var. */
  apiKey?: string;
}
export interface SynthesisResult {
  /** The synthesized final answer string. */
  finalAnswer: string;
  /** Brief reasoning that led to the final answer. */
  reasoning: string;
  /** USD spent on the synthesis API call. */
  cost: number;
}
/**
* Decomposes a complex question into 1-5 sub-questions in dependency order,
* OR returns the question as-is if it is already atomic.
*
* Uses `claude-haiku-4-5` by default for cheap classification + decomposition
* (~$0.0003 per question).
*
* Heuristics the model uses internally for "should decompose":
* - Question contains "and", "then", "after", "if X …"
* - Question contains multiple named entities that must each be looked up
* - Question asks for a derived/computed answer (X of Y where Y must be found)
*
* Graceful degradation: on API errors or malformed JSON, returns the question
* as atomic so the calling agent can still attempt a direct answer.
*
* @param questionText - The full GAIA question text.
* @param options - Optional overrides (model, maxSubQuestions, apiKey).
* @returns A DecomposedQuestion with subQuestions and synthesisHint.
*/
export declare function decomposeQuestion(questionText: string, options?: DecomposerOptions): Promise<DecomposedQuestion>;
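The graceful-degradation rule documented for `decomposeQuestion()` (fall back to atomic on malformed JSON) might look roughly like the sketch below. The implementation is not shown in this declaration file, so everything here is an assumption: `parseDecomposition` is a hypothetical helper, and the JSON field names (`decomposed`, `subQuestions`, `synthesisHint`) are guessed from the result interface rather than taken from the real prompt contract.

```typescript
// Local copy of the DecomposedQuestion shape declared above.
interface DecomposedQuestion {
  originalQuestion: string;
  subQuestions: string[];
  synthesisHint: string;
  decomposed: boolean;
  cost: number;
}

// Hypothetical helper: turns raw model output into a DecomposedQuestion,
// returning the atomic form whenever the output cannot be trusted.
function parseDecomposition(
  raw: string,
  questionText: string,
  maxSubQuestions = 5,
  cost = 0,
): DecomposedQuestion {
  const atomic: DecomposedQuestion = {
    originalQuestion: questionText,
    subQuestions: [questionText],
    synthesisHint: "",
    decomposed: false,
    cost: 0,
  };
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return atomic;
    const obj = parsed as Record<string, unknown>;

    // Keep only non-empty string entries; anything else is discarded.
    const rawSubs = obj["subQuestions"];
    const subs = Array.isArray(rawSubs)
      ? rawSubs.filter(
          (s): s is string => typeof s === "string" && s.trim().length > 0,
        )
      : [];

    // Fewer than two usable sub-questions means there is nothing to split.
    if (obj["decomposed"] !== true || subs.length < 2) return atomic;

    const hint = obj["synthesisHint"];
    return {
      originalQuestion: questionText,
      subQuestions: subs.slice(0, maxSubQuestions), // enforce the cap
      synthesisHint: typeof hint === "string" ? hint : "",
      decomposed: true,
      cost,
    };
  } catch {
    // Malformed JSON: degrade to atomic so the caller can answer directly.
    return atomic;
  }
}
```

Returning the atomic form instead of throwing keeps the caller's control flow identical in the success and failure cases, which matches the "no overhead when not needed" goal stated in the header comment.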
/**
* Given a decomposed question and answers to each sub-question, synthesizes
* a final concise answer.
*
* Uses `claude-sonnet-4-6` by default for higher reasoning quality — the
* sub-answers contain the hard-won information; synthesis is recombination.
*
* Graceful degradation: on API errors or malformed JSON, returns the last
* sub-answer concatenated with a reasoning note.
*
* @param decomposed - The DecomposedQuestion from `decomposeQuestion()`.
* @param subAnswers - Array of string answers, one per sub-question.
* Must be the same length as decomposed.subQuestions.
* @param options - Optional overrides (model, apiKey).
* @returns A SynthesisResult with finalAnswer, reasoning, and cost.
*/
export declare function synthesizeFromSubAnswers(decomposed: DecomposedQuestion, subAnswers: string[], options?: SynthesizerOptions): Promise<SynthesisResult>;
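The header notes that wiring into gaia-bench.ts is left to a follow-up PR, so the sketch below is only one plausible way the two functions could be combined, not the project's actual wiring. `solveWithDecomposition`, the `AnswerOne` callback, and the three stubs are all hypothetical; the decompose and synthesize steps are injected as parameters so the example runs without API calls.

```typescript
// Local copies of the shapes declared above.
interface DecomposedQuestion {
  originalQuestion: string;
  subQuestions: string[];
  synthesisHint: string;
  decomposed: boolean;
  cost: number;
}
interface SynthesisResult {
  finalAnswer: string;
  reasoning: string;
  cost: number;
}

type Decompose = (question: string) => Promise<DecomposedQuestion>;
type Synthesize = (
  d: DecomposedQuestion,
  subAnswers: string[],
) => Promise<SynthesisResult>;
// Hypothetical per-sub-question solver; receives earlier answers as context.
type AnswerOne = (subQuestion: string, priorAnswers: string[]) => Promise<string>;

// Hypothetical pipeline: decompose, answer each sub-question in dependency
// order, then synthesize only when the question was actually split.
async function solveWithDecomposition(
  question: string,
  decompose: Decompose,
  answerOne: AnswerOne,
  synthesize: Synthesize,
): Promise<{ answer: string; cost: number }> {
  const d = await decompose(question);
  const subAnswers: string[] = [];
  for (const sq of d.subQuestions) {
    // Sequential on purpose: later sub-questions may depend on earlier ones.
    subAnswers.push(await answerOne(sq, subAnswers.slice()));
  }
  if (!d.decomposed) {
    return { answer: subAnswers[0] ?? "", cost: d.cost };
  }
  const s = await synthesize(d, subAnswers);
  return { answer: s.finalAnswer, cost: d.cost + s.cost };
}

// Stubs standing in for the real API-backed functions.
const stubDecompose: Decompose = async (q) => ({
  originalQuestion: q,
  subQuestions: ["What is 6 times 7?", "What is that result plus 1?"],
  synthesisHint: "Report the answer to sub-question 2.",
  decomposed: true,
  cost: 0.0003,
});
const stubAnswer: AnswerOne = async (_sq, prior) =>
  prior.length === 0 ? "42" : "43";
const stubSynthesize: Synthesize = async (_d, answers) => ({
  finalAnswer: answers[answers.length - 1] ?? "",
  reasoning: "Took the last sub-answer.",
  cost: 0.002,
});
```

In real use, `decomposeQuestion` and `synthesizeFromSubAnswers` would be passed in place of the stubs; the injected-function shape is a convenience for testing, not something the declarations require.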
//# sourceMappingURL=gaia-decomposer.d.ts.map