claude-flow
Ruflo - Enterprise AI agent orchestration for Claude Code. Deploy 60+ specialized agents in coordinated swarms with self-learning, fault-tolerant consensus, vector memory, and MCP integration
/**
* GAIA Hardness Predictor — Training Data Loader (ADR-136 Track Q)
*
* Loads labelled training examples from prior bench-run result JSONs
* (iter-15, iter-23, iter-28 outputs) and converts them into the
* `LabeledExample[]` format consumed by `HardnessPredictor.train()`.
*
* Expected result JSON schema (matches gaia-bench --output json):
*   {
*     level: number,
*     model: string,
*     summary: { total, passed, passRate, estCostUsd, meanTurns, meanWallMs },
*     results: [
*       {
*         task_id: string, question: string, model: string, correct: boolean,
*         answer: string | null, expected_output: string, error?: string,
*         turns?: number, wallMs?: number, inputTokens?: number, outputTokens?: number
*       }
*     ]
*   }
*
* The file may contain one of:
*   (a) a single JSON object (one model run), or
*   (b) a JSON array of objects (multi-model run from --models a,b,c), or
*   (c) a text preamble followed by JSON (raw output from gaia-bench text mode;
*       we scan for the first '[' or '{' and parse from there).
*
* Missing files are silently skipped (returns empty array).
* Malformed files emit a warning to stderr and are skipped.
*
* Default search paths (tried in order, first found wins per iter):
*   /tmp/gaia-l1-full.json
*   /tmp/gaia-l1-haiku.json
*   /tmp/gaia-all-p1b.json
*   /tmp/gaia-all-p2.json
*   <custom paths passed by caller>
*
* Refs: ADR-136, #2156
*/
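The three accepted file forms described in the header comment can be handled by one small parsing step. The sketch below is an illustration only, not the package's implementation: the `BenchResult` and `BenchRun` interfaces are inferred from the schema comment, and `parseRuns` is a hypothetical helper name. It assumes that anything before the first `[` or `{` is preamble and that nothing follows the JSON payload.

```typescript
// Hypothetical shapes, inferred from the schema in the header comment.
interface BenchResult {
  task_id: string;
  question: string;
  model: string;
  correct: boolean;
  answer: string | null;
  expected_output: string;
  error?: string;
  turns?: number;
  wallMs?: number;
  inputTokens?: number;
  outputTokens?: number;
}

interface BenchRun {
  level: number;
  model: string;
  summary: Record<string, number>;
  results: BenchResult[];
}

/** Hypothetical helper: extract runs from raw file text in form (a), (b), or (c). */
function parseRuns(text: string): BenchRun[] {
  // Form (c): skip any text preamble by locating the first '[' or '{'.
  const start = text.search(/[\[{]/);
  if (start === -1) return [];
  try {
    const parsed: unknown = JSON.parse(text.slice(start));
    // Form (b) is already an array; form (a) is wrapped into one.
    return (Array.isArray(parsed) ? parsed : [parsed]) as BenchRun[];
  } catch (err) {
    // Malformed input: warn on stderr and contribute nothing.
    console.error(`warning: could not parse bench results: ${(err as Error).message}`);
    return [];
  }
}
```

One limitation of scanning for the first bracket is that a preamble containing `[` or `{` (for example a log prefix such as `[info]`) would start the parse too early; under this sketch such a file is treated as malformed and skipped.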
import type { LabeledExample } from './predictor.js';
/** Default candidate paths for historical bench-run result JSONs. */
export declare const DEFAULT_RESULT_PATHS: readonly string[];
/**
* Load labelled training examples from historical bench-run result JSONs.
*
* @param additionalPaths - Extra file paths to scan beyond the defaults.
* @param verbose - If true, log loaded example counts to stderr.
* @returns Deduplicated array of LabeledExample (dedup by task_id, last write wins).
*/
export declare function loadTrainingData(additionalPaths?: string[], verbose?: boolean): LabeledExample[];
//# sourceMappingURL=train-data-loader.d.ts.map
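The "dedup by task_id, last write wins" rule documented for `loadTrainingData` can be sketched with a `Map`. This is a minimal illustration under stated assumptions: `Example` is a hypothetical stand-in for the real `LabeledExample` type (whose fields are not shown in this declaration file), and `dedupByTaskId` is a hypothetical helper, not part of the package.

```typescript
/** Hypothetical minimal record; the real LabeledExample type lives in ./predictor.js. */
interface Example {
  task_id: string;
  correct: boolean;
}

/** Keep one entry per task_id; a later entry replaces an earlier one. */
function dedupByTaskId<T extends { task_id: string }>(examples: T[]): T[] {
  const byId = new Map<string, T>();
  for (const ex of examples) {
    byId.set(ex.task_id, ex); // last write wins
  }
  return Array.from(byId.values());
}

// 't1' appears twice; the later entry (correct: true) is the one kept.
const merged = dedupByTaskId<Example>([
  { task_id: 't1', correct: false },
  { task_id: 't2', correct: true },
  { task_id: 't1', correct: true },
]);
```

Because a `Map` keeps a key at the position of its first insertion, `merged` lists `t1` before `t2` while carrying the later `t1` value. Under this rule, the order in which result files are scanned decides which label survives when the same task appears in more than one file.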