agent-contracts-runtime

Runtime bridge for executing agent-contracts workflows on Agent SDKs

# Reference Templates

Complete file templates for LLM command integration. Replace `{project}`, `{agent-name}`, `{domain}` with project-specific values.

## DSL: Main entry (`dsl/{project}-dsl.yaml`)

```yaml
version: 1

system:
  id: {project}
  name: "{Project} Agent System"
  default_workflow_order:
    - {workflow-1}
    - {workflow-2}
    - {workflow-3}

agents: { $ref: "./agents/" }
tasks: { $ref: "./tasks.yaml" }
handoff_types: { $ref: "./handoff-types.yaml" }

guardrails:
  output-schema-conformance:
    description: >-
      Ensure all agent outputs conform to the result handoff schema.
      Validation is performed at runtime via Zod.
    scope:
      agents:
        - {agent-name}
      tasks:
        - {task-1}
        - {task-2}
        - {task-3}
    rationale: >-
      Structured output conformance is critical for CLI parsing
      and downstream tooling integration.

  no-{dangerous-action}:
    description: >-
      Agent must never {perform dangerous action}.
    scope:
      agents:
        - {agent-name}
    rationale: >-
      The agent operates in read-only analysis mode.

  confidence-threshold:
    description: >-
      Findings with confidence below 0.5 must include an explicit
      uncertainty disclaimer in their recommendation.
    scope:
      tasks:
        - {task-1}
        - {task-2}
    rationale: >-
      Low-confidence findings without disclaimers may mislead users.

guardrail_policies:
  {domain}-safety-policy:
    description: >-
      Groups all guardrails for the {agent-name} agent.
    rules:
      - guardrail: output-schema-conformance
        severity: critical
        action: block
      - guardrail: no-{dangerous-action}
        severity: critical
        action: block
      - guardrail: confidence-threshold
        severity: warning
        action: warn

workflow:
  {workflow-1}:
    description: "{Workflow 1 description}"
    trigger: cli-command
    entry_conditions:
      - "{Entry condition}"
    steps:
      - type: delegate
        task: {task-1}
        from_agent: {agent-name}

  {workflow-2}:
    description: "{Workflow 2 description}"
    trigger: cli-command
    entry_conditions:
      - "{Entry condition}"
    steps:
      - type: delegate
        task: {task-2}
        from_agent: {agent-name}

  {workflow-3}:
    description: "{Workflow 3 description}"
    trigger: cli-command
    entry_conditions:
      - "{Entry condition}"
    steps:
      - type: delegate
        task: {task-3}
        from_agent: {agent-name}
```

## DSL: Agent (`dsl/agents/{agent-name}.yaml`)

```yaml
{agent-name}:
  role_name: "{Agent Role Name}"
  purpose: >-
    {Describe the agent's purpose. What domain knowledge does it have?
    What can it do that static rules cannot?}
  mode: read-only
  can_read_artifacts: []
  can_write_artifacts: []
  can_invoke_agents: []
  can_execute_tools: []
  can_return_handoffs:
    - {result-handoff-1}
    - {result-handoff-2}
    - {result-handoff-3}
  responsibilities:
    - "{Responsibility 1}"
    - "{Responsibility 2}"
    - "{Responsibility 3}"
  constraints:
    - "Do not claim safety without citing specific assumptions"
    - "Output must conform to AgentAuditResult schema (summary, riskLevel, findings, recommendedActions, metadata)"
    - "Findings must use {project} category vocabulary ({cat-1}, {cat-2}, ...)"
    - "Each finding must include a recommendation when severity is warning or above"
  rules:
    - id: "R-{PREFIX}-001"
      description: "{Rule 1}"
      severity: mandatory
    - id: "R-{PREFIX}-002"
      description: "{Rule 2}"
      severity: mandatory
  escalation_criteria:
    - condition: "Input cannot be parsed or is structurally invalid"
      action: stop_and_report
    - condition: "Input contains dynamic content that cannot be statically analyzed"
      action: stop_and_report
```

## DSL: Tasks (`dsl/tasks.yaml`)

```yaml
{task-1}:
  description: "{Task 1: audit-style}"
  target_agent: {agent-name}
  allowed_from_agents:
    - {agent-name}
  workflow: {workflow-1}
  input_artifacts: []
  invocation_handoff: {request-handoff}
  result_handoff: {result-handoff-1}
  responsibilities:
    - "Evaluate all items for risks"
    - "Classify findings by severity and category"
  completion_criteria:
    - "All items evaluated"
    - "Findings classified by severity and category"
    - "Output conforms to AgentAuditResult schema"

{task-2}:
  description: "{Task 2: proposal-style}"
  target_agent: {agent-name}
  allowed_from_agents:
    - {agent-name}
  workflow: {workflow-2}
  input_artifacts: []
  invocation_handoff: {request-handoff}
  result_handoff: {result-handoff-2}
  responsibilities:
    - "Identify problems requiring structured proposals"
    - "Generate valid, actionable proposals"
  completion_criteria:
    - "Proposal is complete and actionable"
    - "Each step has valid output"
    - "recommendedActions include concrete edit_file actions"

{task-3}:
  description: "{Task 3: explain-style}"
  target_agent: {agent-name}
  allowed_from_agents:
    - {agent-name}
  workflow: {workflow-3}
  input_artifacts: []
  invocation_handoff: {request-handoff}
  result_handoff: {result-handoff-3}
  responsibilities:
    - "Summarize output for non-expert readers"
    - "Provide concrete next steps as recommendedActions"
  completion_criteria:
    - "Explanation accurately summarizes the command output"
    - "Action items are concrete and actionable"
    - "Technical terms explained for non-expert readers"
```

## DSL: Handoff types (`dsl/handoff-types.yaml`)

**Preferred**: Use `$ref` to `cli-contract.yaml` AgentAuditResult as SSoT (this is how cli-contracts itself does it):

```yaml
{request-handoff}:
  version: 1
  description: Request to execute a {project} audit task
  schema:
    type: object
    properties:
      task_id:
        type: string
        description: Identifier of the task to execute
      context:
        type: string
        description: Built context string
    required:
      - task_id

{result-handoff-1}:
  version: 1
  description: >-
    Result of a {project} audit task.
    Schema references cli-contract.yaml AgentAuditResult as SSoT.
  schema:
    $ref: '../cli-contract.yaml#/components/schemas/AgentAuditResult'
```

**Fallback** (when agent-contracts cannot resolve cross-file `$ref` at generate time): Inline the schema with a comment noting the SSoT. The inlined schema must follow the **cli-contracts reference specification**:

```yaml
{result-handoff-1}:
  version: 1
  description: >-
    Result of a {project} audit task.
    Inlined from cli-contract.yaml AgentAuditResult (SSoT).
  schema:
    type: object
    required: [summary, riskLevel, findings]
    properties:
      summary:
        type: string
      riskLevel:
        type: string
        enum: [low, medium, high, critical]
      findings:
        type: array
        items:
          # AgentFinding — cli-contracts reference schema
          type: object
          required: [severity, category, message]
          properties:
            id:
              type: string
              description: Unique finding identifier
            severity:
              type: string
              enum: [info, warning, error, critical]
            category:
              type: string
              description: Finding category (domain-specific vocabulary)
            target:
              type: string
              description: Target of the finding (file path, command ID, etc.)
            location:
              type: string
              description: Location within the target
            message:
              type: string
            recommendation:
              type: string
            confidence:
              type: number
              minimum: 0
              maximum: 1
              description: Confidence score (0-1) for LLM-generated findings
            evidence:
              type: array
              items:
                type: object
                properties:
                  type: { type: string }
                  content: { type: string }
                  source: { type: string }
            details:
              type: object
              additionalProperties: true
      recommendedActions:
        type: array
        items:
          # AgentRecommendedAction — cli-contracts reference schema
          type: object
          required: [kind, title]
          properties:
            kind:
              type: string
              enum: [run_command, edit_file, review, confirm, block, ignore]
            title:
              type: string
            command:
              type: string
              description: CLI command to run (for run_command kind)
            target:
              type: string
              description: Target file or resource
            rationale:
              type: string
      metadata:
        type: object
        properties:
          tool: { type: string }
          command: { type: string }
          version: { type: string }
          generatedAt: { type: string }
          adapter: { type: string }
          model: { type: string }
```

**Domain-specific result types** extend the base shape with additional fields while keeping all AgentAuditResult fields intact:

```yaml
{result-handoff-2}:
  version: 1
  description: >-
    Domain-specific result extending AgentAuditResult.
    Adds {domain-specific fields} alongside base fields.
  schema:
    type: object
    required: [summary, riskLevel, findings]
    properties:
      # All AgentAuditResult base fields (same as above)
      summary: { type: string }
      riskLevel: { type: string, enum: [low, medium, high, critical] }
      findings:
        type: array
        items: { ... }  # Same AgentFinding shape
      recommendedActions:
        type: array
        items: { ... }  # Same AgentRecommendedAction shape
      metadata: { ... }
      # Domain-specific extensions:
      phases:
        type: array
        items:
          type: object
          required: [name, sql]
          properties:
            name: { type: string }
            sql: { type: string }
            description: { type: string }

{result-handoff-3}:
  version: 1
  description: >-
    Explain-style result extending AgentAuditResult.
  schema:
    type: object
    required: [summary, riskLevel, findings]
    properties:
      # All AgentAuditResult base fields
      summary: { type: string }
      riskLevel: { type: string, enum: [low, medium, high, critical] }
      findings: { ... }
      recommendedActions: { ... }
      metadata: { ... }
      # Domain-specific extension:
      explanation:
        type: string
        description: Human-readable explanation of the command output
```

## DSL: Runtime config (`dsl/agent-runtime.config.yaml`)

```yaml
dsl: ./{project}-dsl.yaml
generated_dir: ../src/generated/dsl
```

## TypeScript: `src/agents/types.ts`

```typescript
import type { {ResultType} } from "../generated/dsl/handoffs.js";

export type TaskId =
  | "{task-1}"
  | "{task-2}"
  | "{task-3}";

export interface AgentConfig {
  adapter?: string;
  model?: string;
  temperature?: number;
}

export interface AgentOptions {
  showPrompt?: boolean;
  failOn?: "warning" | "error" | "critical";
}

export interface AgentRunResult {
  taskId: TaskId;
  data: {ResultType} | null;
  raw: string;
  prompt: string;
  showPrompt: boolean;
  status: "success" | "error" | "escalation" | "validation_error";
  errorMessage?: string;
  followUpsUsed: number;
  retriesUsed: number;
}
```

## TypeScript: `src/agents/orchestrator.ts`

```typescript
import type { AgentConfig, AgentOptions, AgentRunResult, TaskId } from "./types.js";

export const EXIT_RUNTIME_MISSING = 11;
export const EXIT_ADAPTER_ERROR = 12;

async function createAdapter(runtimePkg: string, name: string, config: AgentConfig) {
  switch (name) {
    case "mock": {
      const mod = await import(`${runtimePkg}/adapters/mock`);
      return new mod.MockAdapter();
    }
    case "cursor": {
      const mod = await import(`${runtimePkg}/adapters/cursor-sdk`);
      const apiKey = process.env.CURSOR_API_KEY;
      if (!apiKey) throw new Error("CURSOR_API_KEY environment variable is required");
      return mod.CursorSdkAdapter.create({ apiKey, model: config.model });
    }
    case "openai": {
      const mod = await import(`${runtimePkg}/adapters/openai-agents-sdk`);
      return new mod.OpenAIAgentsSdkAdapter({ model: config.model, maxTurns: 1 });
    }
    case "gemini": {
      const mod = await import(`${runtimePkg}/adapters/gemini-sdk`);
      return new mod.GeminiSdkAdapter({
        apiKey: process.env.GEMINI_API_KEY,
        model: config.model,
        temperature: config.temperature,
      });
    }
    case "claude": {
      const mod = await import(`${runtimePkg}/adapters/claude-agent-sdk`);
      return new mod.ClaudeAgentSdkAdapter({
        model: config.model,
        tools: ["Read", "Glob", "Grep"],
        permissionMode: "bypassPermissions",
      });
    }
    default:
      throw new Error(`Unsupported adapter: "${name}". Available: mock, cursor, openai, gemini, claude.`);
  }
}

export async function runAgentTask(
  userRequest: string,
  taskId: TaskId,
  config: AgentConfig,
  options: AgentOptions,
): Promise<AgentRunResult> {
  if (options.showPrompt) {
    return {
      taskId,
      data: null,
      raw: "",
      prompt: userRequest,
      showPrompt: true,
      status: "success",
      followUpsUsed: 0,
      retriesUsed: 0,
    };
  }

  const RUNTIME_PKG = ["agent-contracts", "runtime"].join("-");

  let runTask;
  try {
    const runtime = await import(RUNTIME_PKG);
    runTask = runtime.runTask;
  } catch {
    throw Object.assign(
      new Error("agent-contracts-runtime is not installed.\n npm install agent-contracts-runtime"),
      { exitCode: EXIT_RUNTIME_MISSING },
    );
  }

  let agentRegistry, taskRegistry, handoffSchemas;
  try {
    const dsl = await import("../generated/dsl/index.js");
    agentRegistry = dsl.agentRegistry;
    taskRegistry = dsl.taskRegistry;
    handoffSchemas = dsl.handoffSchemas;
  } catch {
    agentRegistry = {};
    taskRegistry = {};
    handoffSchemas = {};
  }

  let adapter;
  try {
    adapter = await createAdapter(RUNTIME_PKG, config.adapter ?? "mock", config);
  } catch (err) {
    throw Object.assign(err as Error, { exitCode: EXIT_ADAPTER_ERROR });
  }

  const result = await runTask(adapter, taskId, { user_request: userRequest }, {
    maxFollowUps: 3,
    maxRetries: 1,
    agentRegistry,
    taskRegistry,
    handoffSchemas,
  });

  const outcome = result.outcome;
  return {
    taskId,
    data: outcome.status === "success" ? outcome.data : null,
    raw: (outcome.raw as string) ?? "",
    prompt: userRequest,
    showPrompt: false,
    status: outcome.status,
    errorMessage:
      outcome.status === "error" ? outcome.message
      : outcome.status === "escalation" ? outcome.reason
      : outcome.status === "validation_error" ? outcome.errors?.message
      : undefined,
    followUpsUsed: result.follow_ups_used,
    retriesUsed: result.retries_used,
  };
}
```

## TypeScript: `src/agents/formatter.ts`

```typescript
import type { AgentRunResult, AgentOptions } from "./types.js";

export function computeExitCode(result: AgentRunResult, options: AgentOptions): number {
  if (result.showPrompt) return 0;
  if (result.status !== "success" || !result.data) return 1;

  const failOn = options.failOn ?? "error";
  const order = ["info", "warning", "error", "critical"] as const;
  const threshold = order.indexOf(failOn);
  const hasBlocking = result.data.findings.some(
    (f) => order.indexOf(f.severity) >= threshold,
  );
  return hasBlocking ? 10 : 0;
}

export function formatResultText(result: AgentRunResult): string {
  if (result.showPrompt) return result.prompt;
  if (result.status !== "success" || !result.data) {
    return result.errorMessage ?? `Task failed: ${result.status}`;
  }

  const d = result.data;
  const lines = [`Risk Level: ${d.riskLevel.toUpperCase()}`, `Summary: ${d.summary}`, ""];

  if (d.findings.length > 0) {
    lines.push(`Findings (${d.findings.length}):`, "");
    for (const f of d.findings) {
      const icon =
        f.severity === "critical" || f.severity === "error" ? "✖"
        : f.severity === "warning" ? "⚠"
        : "ℹ";
      lines.push(` ${icon} [${f.category}] ${f.message}`);
      if (f.target) lines.push(` Target: ${f.target}`);
      if (f.location) lines.push(` Location: ${f.location}`);
      if (f.recommendation) lines.push(` Recommendation: ${f.recommendation}`);
      if (f.confidence != null) lines.push(` Confidence: ${f.confidence}`);
      lines.push("");
    }
  }

  if (d.recommendedActions?.length) {
    lines.push("Recommended Actions:");
    for (const a of d.recommendedActions) {
      lines.push(` - [${a.kind}] ${a.title}`);
      if (a.command) lines.push(` $ ${a.command}`);
      if (a.target) lines.push(` Target: ${a.target}`);
      if (a.rationale) lines.push(` Rationale: ${a.rationale}`);
    }
  }

  return lines.join("\n");
}

export function formatResultJson(result: AgentRunResult): string {
  if (result.showPrompt) return JSON.stringify({ showPrompt: true, prompt: result.prompt }, null, 2);
  if (!result.data) return JSON.stringify({ error: result.errorMessage, status: result.status }, null, 2);
  return JSON.stringify(result.data, null, 2);
}
```

## CLI registration pattern

```typescript
import { Command } from "commander";

const audit = new Command("audit")
  .description("Semantic safety audit via LLM")
  .argument("[target]", "file or directory to audit")
  .option("--adapter <name>", "LLM adapter (cursor, openai, gemini, claude, mock)", "mock")
  .option("--model <name>", "Model name to pass to the adapter")
  .option("--show-prompt", "Output the constructed prompt without calling the LLM API")
  .option("--fail-on <level>", "Minimum severity for non-zero exit", "error")
  .option("-o, --output <file>", "Write result to file instead of stdout")
  .option("--report-format <fmt>", "Output format (json, text, yaml)", "json")
  .action(async (target, opts) => {
    // load config, call command handler
  });

program.addCommand(audit);
```

## cli-contract.yaml: LLM command options + x-agent

```yaml
commands:
  audit:
    description: "Semantic safety audit via LLM"
    arguments:
      - name: target
        index: 0
        required: false
        description: "File or directory to audit"
        schema:
          type: string
    options:
      - name: adapter
        description: LLM adapter to use.
        valueName: name
        schema:
          type: string
          enum: [mock, cursor, claude, openai, gemini]
      - name: model
        description: Model name to pass to the adapter.
        valueName: name
        schema:
          type: string
      - name: show-prompt
        description: Output the constructed prompt without calling the LLM API.
        schema:
          type: boolean
          default: false
      - name: fail-on
        description: Minimum severity that causes a non-zero exit.
        valueName: level
        schema:
          type: string
          enum: [warning, error, critical]
          default: error
      - name: output
        aliases: [o]
        description: Write result to a file instead of stdout.
        valueName: file
        schema:
          type: string
        file:
          mode: write
          mediaType: application/json
          encoding: utf-8
      - name: report-format
        description: Output format for the audit report.
        valueName: fmt
        schema:
          type: string
          enum: [json, text, yaml]
          default: json
    x-agent:
      riskLevel: low
      requiresConfirmation: false
      idempotent: true
      sideEffects: [network]
      sideEffectNote: >-
        Network calls to LLM provider when adapter is not mock.
        Filesystem write only when --output is specified.
      safeDryRunOption: show-prompt
      expectedDurationMs: 120000
      retryableExitCodes: [1, 12]
    exits:
      '0':
        description: Completed without blocking findings.
        stdout:
          format: '{options.report-format}'
          schema:
            $ref: '#/components/schemas/AgentAuditResult'
      '1':
        description: Unexpected error.
        stderr:
          format: json
          schema:
            $ref: '#/components/schemas/Error'
      '3':
        description: Input validation/parse failed.
        stderr:
          format: json
          schema:
            $ref: '#/components/schemas/Error'
      '10':
        description: Completed with blocking findings.
        stdout:
          format: '{options.report-format}'
          schema:
            $ref: '#/components/schemas/AgentAuditResult'
      '11':
        description: Runtime dependency missing (agent-contracts-runtime).
        stderr:
          format: json
          schema:
            $ref: '#/components/schemas/Error'
      '12':
        description: Adapter initialization error.
        stderr:
          format: json
          schema:
            $ref: '#/components/schemas/Error'
```

## cli-contract.yaml: AgentAuditResult schema (in components/schemas)

This is the reference specification from cli-contracts. Include it in your project's `cli-contract.yaml` `components/schemas` section:

```yaml
components:
  schemas:
    AgentFinding:
      type: object
      description: >-
        A single finding from an agent audit.
        Reference schema for AI agent interoperability.
      required: [severity, category, message]
      properties:
        id:
          type: string
          description: Unique finding identifier.
        severity:
          type: string
          enum: [info, warning, error, critical]
        category:
          type: string
          description: Finding category (domain-specific vocabulary).
        target:
          type: string
          description: Target of the finding (file path, command ID, etc.).
        location:
          type: string
          description: Location within the target.
        message:
          type: string
        recommendation:
          type: string
        confidence:
          type: number
          minimum: 0
          maximum: 1
          description: Confidence score (0-1) for LLM-generated findings.
        evidence:
          type: array
          items:
            $ref: '#/components/schemas/AgentEvidence'
        details:
          type: object
          additionalProperties: true

    AgentEvidence:
      type: object
      properties:
        type: { type: string }
        content: { type: string }
        source: { type: string }

    AgentRecommendedAction:
      type: object
      description: >-
        A recommended action from an agent audit.
        Reference schema for AI agent interoperability.
      required: [kind, title]
      properties:
        kind:
          type: string
          enum: [run_command, edit_file, review, confirm, block, ignore]
        title:
          type: string
        command:
          type: string
          description: CLI command to run (for run_command kind).
        target:
          type: string
          description: Target file or resource.
        rationale:
          type: string

    AgentAuditResult:
      type: object
      description: >-
        Top-level result from an agent audit.
        Reference schema for AI agent interoperability.
      required: [summary, riskLevel, findings]
      properties:
        summary:
          type: string
        riskLevel:
          type: string
          enum: [low, medium, high, critical]
        findings:
          type: array
          items:
            $ref: '#/components/schemas/AgentFinding'
        recommendedActions:
          type: array
          items:
            $ref: '#/components/schemas/AgentRecommendedAction'
        metadata:
          type: object
          properties:
            tool: { type: string }
            command: { type: string }
            version: { type: string }
            generatedAt: { type: string }
            adapter: { type: string }
            model: { type: string }
```
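
The required fields and enums in the AgentAuditResult schema above can also be checked by hand, for example in a unit test or when inspecting raw adapter output. The following is a minimal sketch in plain TypeScript with no dependencies. It is not the runtime's own validation (which the DSL describes as Zod-based), and the name `checkAuditResult` and its error messages are hypothetical, introduced only for illustration.

```typescript
// Hypothetical, dependency-free check against the AgentAuditResult reference schema.
// checkAuditResult is an illustrative name; it is not part of agent-contracts-runtime.

const RISK_LEVELS: readonly string[] = ["low", "medium", "high", "critical"];
const SEVERITIES: readonly string[] = ["info", "warning", "error", "critical"];
const ACTION_KINDS: readonly string[] = [
  "run_command", "edit_file", "review", "confirm", "block", "ignore",
];

type Json = Record<string, unknown>;

function isObject(v: unknown): v is Json {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function oneOf(values: readonly string[], v: unknown): boolean {
  return typeof v === "string" && values.indexOf(v) !== -1;
}

/** Returns a list of problems; an empty list means the value matches the schema's constraints. */
function checkAuditResult(value: unknown): string[] {
  if (!isObject(value)) return ["result must be an object"];
  const problems: string[] = [];
  const { summary, riskLevel, findings, recommendedActions } = value;

  // Required top-level fields: summary, riskLevel, findings.
  if (typeof summary !== "string") problems.push("summary must be a string");
  if (!oneOf(RISK_LEVELS, riskLevel)) problems.push("riskLevel is not an allowed value");

  if (!Array.isArray(findings)) {
    problems.push("findings must be an array");
  } else {
    findings.forEach((f: unknown, i: number) => {
      if (!isObject(f)) {
        problems.push(`findings[${i}] must be an object`);
        return;
      }
      // Required per finding: severity, category, message.
      if (!oneOf(SEVERITIES, f.severity)) problems.push(`findings[${i}].severity is not an allowed value`);
      if (typeof f.category !== "string") problems.push(`findings[${i}].category must be a string`);
      if (typeof f.message !== "string") problems.push(`findings[${i}].message must be a string`);
      // Optional confidence must be a number within [0, 1].
      const c = f.confidence;
      if (c !== undefined && (typeof c !== "number" || c < 0 || c > 1)) {
        problems.push(`findings[${i}].confidence must be a number between 0 and 1`);
      }
    });
  }

  // recommendedActions is optional; when present each item needs kind and title.
  if (recommendedActions !== undefined) {
    if (!Array.isArray(recommendedActions)) {
      problems.push("recommendedActions must be an array");
    } else {
      recommendedActions.forEach((a: unknown, i: number) => {
        if (!isObject(a)) {
          problems.push(`recommendedActions[${i}] must be an object`);
          return;
        }
        if (!oneOf(ACTION_KINDS, a.kind)) problems.push(`recommendedActions[${i}].kind is not an allowed value`);
        if (typeof a.title !== "string") problems.push(`recommendedActions[${i}].title must be a string`);
      });
    }
  }

  return problems;
}
```

Returning a list of problems rather than a boolean keeps the check usable for error reporting: a caller could, for instance, join the messages into the `Error` payload for exit code `3`. Domain-specific result types that add fields such as `phases` or `explanation` would still pass, since the sketch ignores unknown properties.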