# agent-contracts
[npm package](https://www.npmjs.com/package/agent-contracts)
[License: MIT](https://opensource.org/licenses/MIT)
**Design multi-agent systems as contracts.**
`agent-contracts` is a toolkit for declaratively defining multi-agent development workflows in **YAML DSL**, with **static validation, semantic linting, and prompt rendering**.
It is designed for teams that need more than “agents that happen to work”.
It helps you define, validate, and evolve:
- who each agent is
- what tasks can be delegated
- which artifacts exist and who owns them
- what validations are required
- how handoffs are structured
- how prompts are rendered from the design itself
Instead of letting workflow rules live only in prompts and code, `agent-contracts` makes the system **explicit, reviewable, and CI-checkable**.
---
## Why agent-contracts?
Most agent frameworks focus on **runtime execution**.
`agent-contracts` focuses on **design-time guarantees**.
As multi-agent systems grow, teams usually run into the same problems:
- agent responsibilities become ambiguous
- handoff rules drift across prompts
- artifact ownership is unclear
- validation logic is inconsistent
- prompts diverge from the intended workflow
- shared team conventions stop being enforceable
`agent-contracts` addresses this by treating your agent workflow as a **contract**, not just a set of prompts.
You can think of it as:
- **OpenAPI for multi-agent workflows**
- **a contract layer above runtime orchestration**
- **a source of truth for agent roles, handoffs, and artifact flows**
---
## Who this is for
`agent-contracts` is a strong fit for teams that build or operate:
- multi-agent coding workflows
- spec → implement → audit → release style pipelines
- internal agent platforms
- review-heavy or gate-heavy delivery processes
- agent systems where artifact ownership matters
- reusable team definitions shared across projects
Typical users include:
- platform teams standardizing agent workflows
- engineering teams building internal coding/review agents
- products that require explicit validation and handoff policies
- teams that want CI enforcement for agent design consistency
---
## Who this is not for
`agent-contracts` is probably **not** the right starting point if you want:
- a single-agent chatbot
- a quick prompt prototype
- an all-in-one hosted agent runtime
- built-in scheduling, memory, tracing, or hosting
- a purely code-first orchestration style with no declarative spec
- maximum flexibility with minimal process constraints
In short:
- if you want to **run agents quickly**, start with a runtime framework
- if you want to **design multi-agent systems that stay coherent over time**, use `agent-contracts`
---
## What makes it different?
`agent-contracts` does not try to replace every agent framework.
It occupies a different layer.
### Positioning
| Product / approach | Primary focus | Best fit | How `agent-contracts` differs |
|---|---|---|---|
| **OpenAI Agents SDK** | runtime execution with instructions, tools, and handoffs | apps built around agent runtime behavior | `agent-contracts` focuses on design contracts, static guarantees, and artifact relationships |
| **CrewAI** | agent/task workflow orchestration | teams that want runtime task execution in YAML | `agent-contracts` goes deeper on validation, ownership, inheritance, and renderable design specs |
| **AutoGen** | code-first multi-agent programming | research or custom orchestration flows | `agent-contracts` is more declarative, reviewable, and CI-oriented |
| **Google ADK style patterns** | choosing runtime interaction patterns | production systems built around runtime composition | `agent-contracts` is framework-agnostic and centered on workflow design as a contract |
The key distinction is simple:
> Other frameworks mainly answer: **How do I run these agents?**
> `agent-contracts` answers: **What is the allowed structure of this agent system, and how do we keep it correct as it evolves?**
This positioning reflects a common split in the ecosystem: some frameworks center the agent runtime, while others separate agent definition from task invocation. `agent-contracts` works across those execution models as a **design-time contract layer**.
---
## Quick Start
Define your system in a single YAML file:
````yaml
# agent-contracts.yaml
version: 1
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order: [design, implement]
agents:
  architect:
    role_name: "Architect"
    purpose: "Drive phases and delegate work"
    can_invoke_agents: [implementer]
  implementer:
    role_name: "Implementer"
    purpose: "Implement features based on specs"
tasks:
  implement-feature:
    description: "Delegate feature implementation"
    target_agent: implementer
    allowed_from_agents: [architect]
    workflow: implement
    input_artifacts: [spec-md]
    invocation_handoff: task-delegation
    result_handoff: implementation-result
artifacts:
  spec-md:
    type: document
    owner: architect
    producers: [architect]
    editors: [architect]
    consumers: [implementer]
    states: [draft, reviewed, approved]
````
Validate and generate:
````bash
agent-contracts validate
agent-contracts generate -c agent-contracts.config.yaml
````
A working example is available in [`sample/`](./sample), including:
* [`sample/agent-contracts.yaml`](./sample/agent-contracts.yaml)
* [`sample/agent-contracts.config.yaml`](./sample/agent-contracts.config.yaml)
* [`sample/templates`](./sample/templates)
* [`sample/output`](./sample/output)
A multi-team example is available in [`sample/multi-team/`](./sample/multi-team), demonstrating cross-team interface declaration and consumption.
---
## Core concepts
### Agent
An **Agent** defines who an execution entity is:
* role name
* purpose
* capabilities
* permissions
* constraints
* behavioral rules
* structured content sections (reference material, procedures, criteria)
* memory — optional capability declaration for session resume support (`resumable`, `ref_required`, `emits_memory_ref`)
### Task
A **Task** defines a delegatable unit of work:
* target agent
* allowed callers
* workflow
* input artifacts
* invocation/result handoffs
* task-specific execution expectations
* `model_class` — optional LLM capability requirement (`fast`, `standard`, `thinking`)
### Artifact
An **Artifact** defines the objects that move through the workflow:
* owner
* producers
* editors
* consumers
* states
* required validations
* visibility
### Tool
A **Tool** defines an invokable CLI/MCP tool:
* kind (cli, mcp, etc.)
* input/output artifacts
* invokable_by (which agents can use it)
* `extends` — inherit from a base tool definition
* `command` — single command name (alternative to `commands[]`)
* `commands` — structured list of sub-commands with `category`, `reads`, `writes`, and `purpose`
* `cli_contract` — path to a CLI contract YAML (for CLI/MCP adapter invocation)
* `component_contract` — path to an AaaC Component contract YAML (for in-process / SDK / MCP Component invocation). Mutually exclusive with `cli_contract`.
* `artifact_bindings` — maps contract slot names to project artifact IDs
* `effects` (on agents/tasks) — optional narrow-only override of capability effects derived from executable tools
### Workflow
A **Workflow** defines a phase-level execution sequence:
* `description` — human-readable summary
* `entry_conditions`
* `trigger`
* `external_participants` — actors/participants outside the agent system (e.g., User, external advisory)
* ordered steps (`delegate`, `gate`, `team_task`, `decision`; legacy: `handoff`, `validation`)
Workflow steps support additional properties:
* `group` — consecutive steps with the same group are rendered as `par` (parallel) blocks in sequence diagrams
* `depends_on` — list of step task IDs that must complete before this step starts. When specified, the runtime can execute independent steps in parallel. When omitted, the step implicitly depends on all preceding steps (sequential execution)
* `max_retries` (delegate steps) — maximum number of full task re-executions (new sessions) allowed per step. Defaults to `0` (no retries), or `1` when a `retry` block is present
* `max_follow_ups` (delegate steps) — maximum number of lightweight same-session follow-up messages for output format corrections
* `retry` (delegate steps) — defines a conditional retry loop with `condition`, `fix_task`, and optional `revalidate_task`. These are rendered as recovery instructions in the LLM prompt
* `routing_key` (decision steps) — the field that determines branch selection. The legacy field `on` is still accepted but deprecated due to YAML 1.1 reserved word collision
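
To make the `depends_on` rule above concrete, here is a minimal TypeScript sketch that groups steps into waves a runtime could execute in parallel. It is not the toolkit's scheduler: `PlanStep`, `planWaves`, and the step names are hypothetical, and the only behavior taken from the text is that an omitted `depends_on` means "depends on all preceding steps".

```typescript
// Hypothetical shape for illustration; not the toolkit's internal type.
interface PlanStep {
  task: string;
  depends_on?: string[];
}

// Group steps into "waves". Every step in a wave has all of its
// dependencies completed by earlier waves, so one wave could run in parallel.
function planWaves(steps: PlanStep[]): string[][] {
  const deps: { [task: string]: string[] } = {};
  steps.forEach((step, index) => {
    // An omitted depends_on means "depends on all preceding steps".
    deps[step.task] =
      step.depends_on !== undefined
        ? step.depends_on.slice()
        : steps.slice(0, index).map((s) => s.task);
  });

  const done: { [task: string]: boolean } = {};
  const waves: string[][] = [];
  let remaining: string[] = steps.map((s) => s.task);

  while (remaining.length > 0) {
    // A step is ready once every dependency is already marked done.
    const ready = remaining.filter((task) =>
      (deps[task] || []).every((d) => done[d] === true)
    );
    if (ready.length === 0) {
      throw new Error("Cyclic or unknown dependency among: " + remaining.join(", "));
    }
    ready.forEach((task) => {
      done[task] = true;
    });
    waves.push(ready);
    remaining = remaining.filter((task) => done[task] !== true);
  }
  return waves;
}

const exampleWaves = planWaves([
  { task: "read-specs", depends_on: [] },
  { task: "write-tests", depends_on: ["read-specs"] },
  { task: "implement", depends_on: ["read-specs"] },
  { task: "audit" }, // no depends_on: waits for every preceding step
]);
// exampleWaves is [["read-specs"], ["write-tests", "implement"], ["audit"]]
```

In this sketch `write-tests` and `implement` land in the same wave because both depend only on `read-specs`, while `audit` falls back to sequential behavior.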
### Validation
A **Validation** defines a verification step for an artifact:
* `target_artifact` — the artifact being verified
* `kind` — the type of verification (see below)
* `executor_type` — `tool` (automated) or `agent` (agent-driven)
* `executor` — the tool or agent that runs the validation
* `blocking` — whether the validation must pass before proceeding
* `produces_evidence` — optional artifact produced as evidence
#### Validation kinds
| Kind | Purpose | Example |
|------|---------|---------|
| `schema` | Structural schema check | JSON Schema validation, OpenAPI lint, SQL syntax |
| `mechanical` | Automated tool check | CLI linters, diff checks, coverage reports |
| `semantic` | Meaning-level review | Agent-based review of spec intent, plan coherence |
| `approval` | Human/agent sign-off gate | Architect approval before implementation |
| `provenance` | Source derivation verification | Confirm generated artifact derives from its canonical source (e.g., manifest from API contracts) |
| `traceability` | Cross-artifact link completeness | Verify every spec requirement reaches contracts, tests, and code |
| `fidelity` | Semantic faithfulness to source | Confirm tests actually verify spec intent, not just structural compliance |
`schema` and `mechanical` are best suited for automated checks via tools. `semantic`, `fidelity`, and `approval` are typically agent-driven. `provenance` and `traceability` can be either tool or agent-based depending on the verification complexity.
### Guardrail
A **Guardrail** declares a cross-cutting constraint:
* description — what is protected
* scope — which DSL entities it applies to (agents, tasks, tools, artifacts, workflows)
* rationale — why the constraint exists
* tags — classification for filtering
* exemptions — glob patterns or entity IDs exempt from the guardrail
### Guardrail Policy
A **Guardrail Policy** defines enforcement strategy for guardrails:
* rules — array of enforcement rules mapping guardrails to actions
* Each rule specifies: severity (`critical`/`mandatory`/`warning`/`info`), action (`block`/`warn`/`shadow`/`info`), override permissions
* `action` supports a conditional form for state-dependent enforcement: `{ default: "block", when: { maintenance: "shadow" } }`
* Available states are declared system-wide via `system.states`
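
The conditional `action` form can be read as "use `default` unless the current state has an override". The TypeScript sketch below illustrates that reading only; `ActionSpec` and `effectiveAction` are hypothetical names, and it assumes a single active workspace state, which the text above does not specify.

```typescript
type GuardrailAction = "block" | "warn" | "shadow" | "info";

// Either a plain action or the conditional form { default, when }.
type ActionSpec =
  | GuardrailAction
  | { default: GuardrailAction; when?: { [state: string]: GuardrailAction } };

// Pick the effective action for the current workspace state (assumed single-valued).
function effectiveAction(spec: ActionSpec, currentState?: string): GuardrailAction {
  if (typeof spec === "string") {
    return spec; // plain form: the same action in every state
  }
  const when = spec.when;
  if (
    currentState !== undefined &&
    when !== undefined &&
    Object.prototype.hasOwnProperty.call(when, currentState)
  ) {
    return when[currentState];
  }
  return spec.default;
}

const exampleAction: ActionSpec = { default: "block", when: { maintenance: "shadow" } };
const inMaintenance = effectiveAction(exampleAction, "maintenance"); // "shadow"
const inNormal = effectiveAction(exampleAction, "normal"); // "block"
```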
### Handoff Type
A **Handoff Type** defines the schema for inter-agent messages:
* `schema` — a JSON Schema object describing the full message structure
* `description`
* `example`
* `version`
Schemas can use `allOf` with `$ref: "#/components/schemas/..."` to compose shared fields (e.g., common envelope) with type-specific properties.
### Components
**Components** provide reusable definitions, following the OpenAPI pattern:
* `components.schemas` — named JSON Schema fragments that can be referenced from anywhere via `$ref: "#/components/schemas/<name>"`
---
## Why teams adopt it
### 1. Explicit workflow design
Your architecture stops living only in prompts, code, and tribal knowledge.
### 2. Static guarantees before runtime
You can catch broken references, invalid ownership, missing validations, and workflow inconsistencies before execution.
### 3. Prompt generation from source of truth
Rendered prompts come from the same DSL that defines roles, tasks, artifacts, and policies.
### 4. Reuse across teams and projects
Shared base definitions can be extended safely with `extends`.
### 5. Better CI discipline
Design regressions become testable.
---
## Features
* **Declarative YAML DSL** for multi-agent development workflows
* **Agent `sections`** for embedding structured reference material, procedures, and criteria directly in agent definitions
* **Static schema validation**
* **Reference integrity checks**
* **Semantic linting**
* **Structured handoff definitions** with formal JSON Schema and `allOf` composition
* **Reusable schema components** via `components.schemas` and JSON Pointer `$ref`
* **Artifact ownership and lifecycle modeling**
* **Config-driven prompt rendering** with `skip_empty` support for conditional file generation
* **Variable substitution** via `${vars.xxx}` in DSL values
* **Inheritance with merge operators via `extends`**
* **Guardrail definitions** for cross-cutting process constraints
* **Guardrail policies** with configurable enforcement (block/warn/shadow/info)
* **State-dependent guardrail action** — `action` accepts a conditional form `{ default, when }` keyed by `system.states` for workspace-mode-aware enforcement
* **Software bindings** (DI) for tool-specific guardrail implementation (Cursor, Git, GitHub)
* **Guardrail generation** from DSL + policy + bindings via `generate guardrails`
* **Navigation index** — compile-time artifact-centric model mapping artifacts to operations, agent permissions, relations, and action routes
* **Artifact coverage** — measure what percentage of project files are covered by artifact `path_patterns` definitions, with CI gating via `--min-coverage`
* **Tool `extends`** — tool inheritance for sharing `cli_contract`, `artifact_bindings`, and other metadata across related tool definitions
* **Interface generation** from DSL via `generate interface` for cross-team contracts
* **Flexible file splitting** via `$ref` (replacement), `$refs` (import + deep-merge), and JSON Pointer `$ref` (in-document)
* **Multi-team collaboration** via `team_interface` (public boundary), `imports` (team consumption), and `team_task` (cross-team delegation)
* **YAML safety linting** for reserved word collision detection across YAML 1.1/1.2
* **`extensions` declarations** with scope, schema validation, and strict enforcement for custom `x-*` fields
* **`resolve --expand-defaults`** to materialize all Zod schema defaults in output
* **DSL completeness scoring** with 7 dimensions, text/JSON output, and `--threshold` CI gate
* **LLM-based semantic audit** — design coherence, prompt fidelity, and completeness checks via Claude, OpenAI, Gemini, or Cursor adapters
* **JSON Schema for editor support and external tooling**
* **CI-friendly workflow checks**
---
## DSL structure
Entities are defined as **maps keyed by ID**.
````yaml
version: 1
extends: "./base/"
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order:
    - analyze
    - specify
    - plan
    - implement
    - audit
    - release
    - reflect
  states: [] # optional: named workspace states for conditional guardrail action
agents: {}
tasks: {}
artifacts: {}
tools: {}
validations: {}
handoff_types: {}
team_interface: # optional: multi-team public boundary
  version: 1
  accepts:
    workflows: {}
  exposes:
    artifacts: []
imports: {} # optional: consumed team interfaces
workflow: {}
policies: {}
guardrails: {}
guardrail_policies: {}
components:
  schemas: {}
extensions:
  x-flags:
    type: array
    items: string
    description: "CLI flags for tool commands"
  x-path-hint:
    type: string
    description: "Filesystem path hint"
    scope: [artifact]
    schema:
      type: string
      minLength: 1
    required: true
extensions_strict: false
````
This makes definitions easy to merge, extend, and reference by stable identifiers.
### Single-file format
````yaml
version: 1
system: { ... }
agents: { ... }
tasks: { ... }
artifacts: { ... }
````
### Multi-file format (section-level `$ref`)
````yaml
version: 1
extends: "./base/"
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order: [analyze, specify, plan, implement, audit, release, reflect]
agents: { $ref: "./agents.yaml" }
tasks: { $ref: "./tasks.yaml" }
artifacts: { $ref: "./artifacts.yaml" }
tools: { $ref: "./tools.yaml" }
validations: { $ref: "./validations.yaml" }
handoff_types: { $ref: "./handoff-types.yaml" }
workflow: { $ref: "./workflow.yaml" }
policies: { $ref: "./policies.yaml" }
````
### Per-entry `$ref`
`$ref` can be used at any object position. This allows splitting individual entries into separate files:
````yaml
agents:
  architect: { $ref: "./agents/architect.yaml" }
  implementer: { $ref: "./agents/implementer.yaml" }
  test-writer: { $ref: "./agents/test-writer.yaml" }
````
Each referenced file contains the agent definition directly (without the key):
````yaml
# agents/architect.yaml
role_name: "Architect"
purpose: "Drive phases and delegate work"
can_invoke_agents: [implementer]
````
### Directory `$ref`
When `$ref` points to a directory, all `*.yaml` / `*.yml` files in the directory are loaded and merged:
````yaml
agents: { $ref: "./agents/" }
````
Each file in the directory contains one or more keyed entries:
````yaml
# agents/architect.yaml
architect:
  role_name: "Architect"
  purpose: "Drive phases and delegate work"
````
Files are loaded in alphabetical order. Conflicting leaf values across files result in an error.
### `$refs` (import and merge)
`$refs` imports multiple files and **deep-merges** them into the containing map.
Unlike `$ref` (which replaces an object entirely), `$refs` allows mixing inline definitions with external files.
````yaml
agents:
  inline-agent:
    role_name: "Inline Agent"
    purpose: "Defined right here"
  $refs:
    - "./agents/architect.yaml"
    - "./agents/implementer.yaml"
    - "./more-agents/" # directories are also supported
````
Each referenced file uses the same keyed format:
````yaml
# agents/architect.yaml
architect:
  role_name: "Architect"
  purpose: "Drive phases and delegate work"
````
`$refs` can also be used at the root level to compose a DSL from multiple aspect-oriented files:
````yaml
version: 1
system:
  id: my-project
  name: My Agent Workflow
  default_workflow_order: [analyze, implement]
$refs:
  - "./agents-core.yaml" # agents + artifacts definitions
  - "./agents-constraints.yaml" # constraints for the same agents
  - "./tasks.yaml"
````
Overlapping map keys are deep-merged recursively. Conflicting leaf values (scalar or array) result in an error.
| Directive | Type | Behavior |
| --------- | ------ | -------------------------------------------------------- |
| `$ref` | string | Replace the object at that position with file contents |
| `$ref` (`#/...`) | string | Replace with the value at the given JSON Pointer path within the document |
| `$refs` | array | Import files and deep-merge into the containing map |
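
The deep-merge rule described above (overlapping maps merge recursively, conflicting leaves are errors) can be sketched in TypeScript as follows. This is an illustration rather than the toolkit's loader: `mergeStrict` is a hypothetical name, and treating two identical leaf values as non-conflicting is an assumption the text does not confirm.

```typescript
type YamlMap = { [key: string]: YamlValue };
type YamlValue = string | number | boolean | null | YamlValue[] | YamlMap;

function isMap(value: YamlValue): value is YamlMap {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merge b into a copy of a. Maps merge recursively; a scalar or array
// defined on both sides with different values is reported as a conflict.
function mergeStrict(a: YamlMap, b: YamlMap, path: string = ""): YamlMap {
  const out: YamlMap = {};
  Object.keys(a).forEach((key) => {
    out[key] = a[key];
  });
  Object.keys(b).forEach((key) => {
    const where = path === "" ? key : path + "." + key;
    if (!Object.prototype.hasOwnProperty.call(out, key)) {
      out[key] = b[key];
      return;
    }
    const left = out[key];
    const right = b[key];
    if (isMap(left) && isMap(right)) {
      out[key] = mergeStrict(left, right, where);
    } else if (JSON.stringify(left) !== JSON.stringify(right)) {
      // Assumption: identical leaves are tolerated, differing leaves are not.
      throw new Error("Conflicting values at " + where);
    }
  });
  return out;
}

// Two aspect-oriented files contributing to the same agent.
const coreFile: YamlMap = { agents: { architect: { role_name: "Architect" } } };
const constraintsFile: YamlMap = {
  agents: { architect: { constraints: ["Never write code directly"] } },
};
const mergedAgents = mergeStrict(coreFile, constraintsFile);
```

Here `mergedAgents.agents.architect` carries both `role_name` and `constraints`, whereas a second file that set a different `role_name` for `architect` would raise an error at `agents.architect.role_name`.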
### JSON Pointer `$ref`
`$ref` also supports **in-document references** using JSON Pointer syntax (RFC 6901).
When the value starts with `#/`, it resolves against the root document instead of the file system.
````yaml
components:
  schemas:
    handoff-common:
      type: object
      required: [from_agent, to_agent]
      properties:
        from_agent: { type: string }
        to_agent: { type: string }
handoff_types:
  task-delegation:
    version: 1
    schema:
      allOf:
        - $ref: "#/components/schemas/handoff-common"
        - type: object
          required: [payload]
          properties:
            payload:
              type: object
              required: [objective]
              properties:
                objective: { type: string }
````
This is particularly useful for sharing common schema fragments across multiple `handoff_types` entries via `components.schemas`.
JSON Pointer references are resolved in the same processing phase as file `$ref` — before Zod validation. They can be used anywhere in the document, not just within `handoff_types`.
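
For readers unfamiliar with RFC 6901, the TypeScript sketch below shows how a `#/...` reference can be resolved against a parsed document. `resolvePointer` is a hypothetical helper, not the toolkit's resolver, and it skips the percent-decoding that URI fragments may additionally require.

```typescript
// Resolve an in-document reference such as "#/components/schemas/handoff-common".
function resolvePointer(root: unknown, ref: string): unknown {
  if (ref.slice(0, 2) !== "#/") {
    throw new Error("Not an in-document reference: " + ref);
  }
  // RFC 6901 escapes: "~1" stands for "/", "~0" stands for "~" (decode in that order).
  const tokens = ref
    .slice(2)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"));

  let current: unknown = root;
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    if (typeof current !== "object" || current === null) {
      throw new Error("Cannot descend into a scalar at token: " + token);
    }
    const container = current as { [key: string]: unknown };
    if (!Object.prototype.hasOwnProperty.call(container, token)) {
      throw new Error("Unresolved reference: " + ref);
    }
    current = container[token];
  }
  return current;
}

const exampleDoc = {
  components: {
    schemas: {
      "handoff-common": { type: "object", required: ["from_agent", "to_agent"] },
    },
  },
};
const commonSchema = resolvePointer(exampleDoc, "#/components/schemas/handoff-common");
// commonSchema is the handoff-common object from exampleDoc
```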
---
## Example: Agent definition
````yaml
agents:
  main-architect:
    role_name: "Architect"
    purpose: "Drive phases, delegate, make gate decisions, integrate audits"
    dispatch_only: true
    mode: read-only
    can_read_artifacts:
      - spec-md
      - codebase
      - test-report
    can_write_artifacts:
      - review-note
    can_execute_tools:
      - spec-impact-check
    can_perform_validations:
      - evidence-gate-review
    can_invoke_agents:
      - implementer
      - test-writer
    can_return_handoffs:
      - evidence-gate-verdict
    responsibilities:
      - "Manage phase progression and gate decisions"
    constraints:
      - "Never write code directly"
    memory:
      resumable: true
      emits_memory_ref: true
    sections:
      - title: "Delegation Protocol"
        content: |
          You act as the Architect. You NEVER implement or test directly.
          Instead you delegate to specialist sub-agents.
````
---
## Example: Task definition
````yaml
tasks:
  implement-feature:
    description: "Delegate feature implementation"
    target_agent: implementer
    allowed_from_agents:
      - main-architect
    workflow: implement
    model_class: standard # optional: fast | standard | thinking
    input_artifacts:
      - spec-md
      - plan-md
    invocation_handoff: task-delegation
    result_handoff: dependency-evidence
    responsibilities:
      - "Implement all requirements from spec-md"
    execution_steps:
      - id: read-specs
        action: "Read spec-md and design-docs"
        reads_artifact: spec-md
      - id: implement
        action: "Implement changes in codebase"
        produces_artifact: codebase
        depends_on: [read-specs]
      - id: run-db-lint
        action: "Run db-lint"
        uses_tool: db-lint
        x-timeout: 120
    completion_criteria:
      - "canonical artifacts updated"
````
`x-` prefixed custom properties work at any nesting level — including inside
`execution_steps`, `rules`, `workflow.steps`, and other nested objects.
### Extension declarations
Projects can declare their custom `x-*` extension fields in the DSL using `extensions`. This makes extensions discoverable, self-documenting, and — optionally — machine-validated:
````yaml
extensions:
  x-flags:
    type: array
    items: string
    description: "CLI flags for tool commands"
  x-path-hint:
    type: string
    description: "Filesystem path hint"
    scope: [artifact]
    schema:
      type: string
      minLength: 1
    required: true
extensions_strict: true # undeclared x-* properties become errors
````
Each key must start with `x-` (validated at schema level). The declaration supports:
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `type` | `string` | *(required)* | Informational type descriptor |
| `items` | `string` | — | Item type (for array-typed extensions) |
| `description` | `string` | — | Human-readable description |
| `scope` | `string[]` | all node types | Restricts which DSL node types this extension may appear on |
| `schema` | `object` | — | JSON Schema to validate the extension value |
| `required` | `boolean` | `false` | Whether the extension must be present on every in-scope entity |
**Scope values**: `root`, `system`, `agent`, `task`, `execution_step`, `artifact`, `tool`, `tool_command`, `validation`, `handoff_type`, `workflow`, `workflow_step`, `policy`, `guardrail`, `guardrail_policy`, `rule`, `escalation_criterion`, `prerequisite`
**`extensions_strict`**: When `true`, any `x-*` property not declared in `extensions` is an error. When `false` (default), undeclared extensions produce a warning.
**Diagnostics**:
| Code | Severity | Trigger |
|------|----------|---------|
| `extension-scope-mismatch` | error | Extension used on a node type outside its declared `scope` |
| `extension-schema-violation` | error | Extension value fails the declared JSON Schema |
| `extension-required-missing` | error | Required extension missing on an in-scope entity |
| `undeclared-extension` | warning/error | Extension not declared in `extensions` (error when `extensions_strict: true`) |
> **Backward compatibility:** `x-extensions` and `x-extensions-strict` are still accepted as deprecated aliases. They produce a `deprecated-property` warning and are normalized to `extensions` / `extensions_strict` during validation.
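
The diagnostics above can be pictured with a small TypeScript sketch. It covers only the scope, required, and undeclared checks (JSON Schema validation is left out), and every name in it (`ExtensionDecl`, `checkExtensions`, the sample nodes) is hypothetical rather than part of the toolkit's API.

```typescript
interface ExtensionDecl {
  type: string;
  scope?: string[];
  required?: boolean;
}

interface ExtensionDiagnostic {
  code: string;
  severity: "error" | "warning";
  property: string;
}

// Check the x-* properties of one DSL node against the declared extensions.
function checkExtensions(
  nodeType: string,
  node: { [key: string]: unknown },
  declared: { [name: string]: ExtensionDecl },
  strict: boolean
): ExtensionDiagnostic[] {
  const diagnostics: ExtensionDiagnostic[] = [];
  // An extension without a scope applies to all node types.
  const inScope = (decl: ExtensionDecl): boolean =>
    decl.scope === undefined || decl.scope.indexOf(nodeType) !== -1;

  Object.keys(node).forEach((key) => {
    if (key.slice(0, 2) !== "x-") {
      return;
    }
    if (!Object.prototype.hasOwnProperty.call(declared, key)) {
      // Warning by default, error when extensions_strict is true.
      diagnostics.push({
        code: "undeclared-extension",
        severity: strict ? "error" : "warning",
        property: key,
      });
    } else if (!inScope(declared[key])) {
      diagnostics.push({ code: "extension-scope-mismatch", severity: "error", property: key });
    }
  });

  Object.keys(declared).forEach((name) => {
    const decl = declared[name];
    const present = Object.prototype.hasOwnProperty.call(node, name);
    if (decl.required === true && inScope(decl) && !present) {
      diagnostics.push({ code: "extension-required-missing", severity: "error", property: name });
    }
  });
  return diagnostics;
}

const declaredExtensions: { [name: string]: ExtensionDecl } = {
  "x-path-hint": { type: "string", scope: ["artifact"], required: true },
};
// A task that misuses x-path-hint and carries an undeclared x-timeout.
const taskFindings = checkExtensions(
  "task",
  { "x-path-hint": "src/", "x-timeout": 120 },
  declaredExtensions,
  false
);
// An artifact that omits the required x-path-hint.
const artifactFindings = checkExtensions("artifact", { type: "document" }, declaredExtensions, false);
```

With these inputs the task yields `extension-scope-mismatch` and an `undeclared-extension` warning, and the artifact yields `extension-required-missing`.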
---
## Example: Artifact definition
````yaml
artifacts:
  spec-md:
    type: document
    description: "Specification document"
    owner: main-architect
    producers: [main-architect]
    editors: [main-architect]
    consumers: [implementer, test-writer]
    states: [draft, reviewed, approved]
    required_validations: [spec-semantic-review]
    visibility: internal
````
---
## artifact-contracts integration
agent-contracts integrates with [artifact-contracts](https://github.com/foo-log-inc/artifact-contracts) and [cli-contracts](https://github.com/foo-log-inc/cli-contracts) to provide a unified artifact governance model.
### Design principle
Relationships flow in one direction: **agent → artifact**.
- Agents declare which artifacts they own (`own_artifacts`), read (`can_read_artifacts`), or write (`can_write_artifacts`)
- Tools declare which abstract slots map to project artifacts (`artifact_bindings`)
- Artifact definitions themselves do not reference agents (the legacy `owner`/`producers`/`editors`/`consumers` fields are deprecated)
### How it works
**1. Define project artifacts** in `artifact-contracts.yaml` (project-specific):
````yaml
artifacts:
  api-specs:
    type: source
    authority: canonical
    path_patterns: ["specs/**/*.yaml"]
  api-contracts:
    type: generated-code
    authority: generated
    path_patterns: ["src/generated/**/*.ts"]
````
**2. Import artifacts** into your agent-contracts DSL via `$ref`:
````yaml
artifacts: { $ref: "./artifact-contracts.yaml#/artifacts" }
````
**3. cli-contracts define tools with domain-agnostic slot names** (reusable across projects):
````yaml
# cli-contract.yaml (tool's interface)
artifactSlots:
  source-specs:
    description: "Source specification files"
    direction: read
  contract-output:
    description: "Generated contract output"
    direction: write
commandSets:
  tool-name:
    commands:
      generate:
        summary: Generate contracts
        effects:
          reads: [source-specs]
          writes: [contract-output]
        exits:
          '0':
            description: Success
````
**4. Map abstract slots to project artifacts** using `artifact_bindings` on tools:
````yaml
tools:
  micro-contracts:
    kind: cli
    cli_contract: tools/micro-contracts/cli-contract.yaml
    artifact_bindings:
      source-specs: api-specs
      contract-output: api-contracts
````
**5. Agents reference tools and artifacts** directly:
````yaml
agents:
  architect:
    own_artifacts: [api-contracts, api-specs]
    can_read_artifacts: [api-specs, api-contracts]
    can_write_artifacts: [api-contracts]
    can_execute_tools: [micro-contracts]
````
### Validation and linting
- `own_artifacts` entries are validated to exist in the `artifacts` section
- `artifact_bindings` values are validated to exist in the `artifacts` section
- A lint rule warns if `own_artifacts` entries are not included in `can_read_artifacts`
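
The ownership checks listed above amount to two set lookups per agent. The TypeScript sketch below illustrates them under stated assumptions: `checkOwnership` and the `releaser` agent are hypothetical, and the message wording is invented for the example rather than copied from the tool's output.

```typescript
interface AgentArtifacts {
  own_artifacts?: string[];
  can_read_artifacts?: string[];
}

interface OwnershipFinding {
  level: "error" | "warning";
  agent: string;
  artifact: string;
}

// Error: an owned artifact is not defined. Warning: an owned artifact is not readable.
function checkOwnership(
  agents: { [id: string]: AgentArtifacts },
  artifactIds: string[]
): OwnershipFinding[] {
  const findings: OwnershipFinding[] = [];
  Object.keys(agents).forEach((agentId) => {
    const agent = agents[agentId];
    const readable = agent.can_read_artifacts || [];
    (agent.own_artifacts || []).forEach((artifact) => {
      if (artifactIds.indexOf(artifact) === -1) {
        findings.push({ level: "error", agent: agentId, artifact: artifact });
      } else if (readable.indexOf(artifact) === -1) {
        findings.push({ level: "warning", agent: agentId, artifact: artifact });
      }
    });
  });
  return findings;
}

const ownershipFindings = checkOwnership(
  {
    architect: {
      own_artifacts: ["api-contracts", "api-specs"],
      can_read_artifacts: ["api-specs", "api-contracts"],
    },
    // Hypothetical agent added to trigger both kinds of finding.
    releaser: { own_artifacts: ["api-contracts", "release-notes"], can_read_artifacts: [] },
  },
  ["api-specs", "api-contracts"]
);
```

The `architect` entry passes cleanly, while `releaser` produces a warning for `api-contracts` (owned but not readable) and an error for `release-notes` (not defined in `artifacts`).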
---
## Example: Workflow definition
````yaml
workflow:
  specify:
    description: "Externalize requirements: create spec.md from user stories"
    entry_conditions:
      - User story or feature request received
    trigger: "User invokes /speckit.specify or asks to create a feature spec."
    steps:
      - type: delegate
        task: specify-feature
        from_agent: main-architect
      - type: validation
        validation: spec-semantic-review
      - type: decision
        routing_key: evidence-gate-verdict.verdict
        branches:
          PASS: [plan]
          REVISE: [specify-feature]
````
Decision steps use `routing_key` to specify the field that determines branching. The legacy `on` field is still accepted but deprecated — see [YAML safety](#yaml-safety) below.
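
The reason `on` is risky is that YAML 1.1 resolves unquoted words such as `on`, `off`, `yes`, and `no` to booleans, so a 1.1 parser may load the key `on` as `true`. The TypeScript sketch below flags such keys; it is a rough illustration with hypothetical names (`isYaml11Boolean`, `findRiskyKeys`), not the toolkit's YAML safety linter.

```typescript
// Base words that YAML 1.1 treats as booleans when unquoted.
const YAML11_BOOLEAN_WORDS: string[] = ["y", "n", "yes", "no", "on", "off", "true", "false"];

// YAML 1.1 accepts the lowercase, Capitalized, and UPPERCASE spellings.
function isYaml11Boolean(word: string): boolean {
  const lower = word.toLowerCase();
  if (YAML11_BOOLEAN_WORDS.indexOf(lower) === -1) {
    return false;
  }
  const capitalized = lower.charAt(0).toUpperCase() + lower.slice(1);
  return word === lower || word === lower.toUpperCase() || word === capitalized;
}

// Return the unquoted keys that a YAML 1.1 parser could load as booleans.
function findRiskyKeys(unquotedKeys: string[]): string[] {
  return unquotedKeys.filter((key) => isYaml11Boolean(key));
}

const riskyKeys = findRiskyKeys(["type", "on", "routing_key", "branches", "No"]);
// riskyKeys is ["on", "No"]
```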
---
## Example: Handoff type definition
Handoff types define the schema for inter-agent messages using JSON Schema.
````yaml
handoff_types:
  task-delegation:
    version: 1
    description: "Delegate a task to a sub-agent"
    schema:
      type: object
      required: [task, objective]
      properties:
        task: { type: string }
        objective: { type: string }
        constraints:
          type: array
          items: { type: string }
````
### Using `components.schemas` with `allOf`
Common fields (e.g., `from_agent`, `to_agent`, `run_id`) can be shared across handoff types by placing them in `components.schemas` and composing via `allOf`:
````yaml
components:
  schemas:
    handoff-common:
      type: object
      required: [from_agent, to_agent]
      properties:
        from_agent: { type: string }
        to_agent: { type: string }
        run_id: { type: string }
handoff_types:
  task-delegation:
    version: 1
    description: "Delegate a task"
    schema:
      allOf:
        - $ref: "#/components/schemas/handoff-common"
        - type: object
          required: [payload]
          properties:
            payload:
              type: object
              required: [objective]
              properties:
                objective: { type: string }
  implementation-result:
    version: 1
    description: "Return implementation results"
    schema:
      allOf:
        - $ref: "#/components/schemas/handoff-common"
        - type: object
          required: [payload]
          properties:
            payload:
              type: object
              required: [result]
              properties:
                result: { type: string }
                evidence:
                  type: array
                  items: { type: string }
````
The `$ref: "#/..."` references are resolved during loading, before validation. The resulting merged schema is then meta-validated as valid JSON Schema.
---
## Inheritance and merge operators
`agent-contracts` supports shared base definitions with project-level overrides through `extends`.
````yaml
extends: "./base/"
agents:
  implementer:
    constraints:
      $append:
        - "Use only approved external libraries"
  designer:
    role_name: "Designer"
    purpose: "UI design"
tasks:
  implement-feature:
    execution_steps:
      $insert_after:
        target: run-db-lint
        items:
          - id: run-contract-pipeline
            action: "Run contract pipeline"
            uses_tool: api-pipeline
````
### `$clone` — resolve-time entity duplication
`$clone` creates a new entity by copying an existing entity within the same section and optionally applying a merge diff:
````yaml
agents:
  implementer.api:
    $clone:
      from: implementer
      merge:
        purpose: "API-specialized implementer"
        can_write_artifacts:
          $replace: [openapi-spec]
        responsibilities:
          $append: ["Validate schema changes"]
````
`$clone` is processed during `resolve` (after `extends`, before tool inheritance). All merge operators (`$append`, `$prepend`, `$replace`, `$remove`, `$insert_after`) work within `merge`. The base entity is preserved; chained clones (A→B→C) are resolved via topological sort. Circular clones are rejected.
Supported merge operators:
| Operator | Behavior |
| --------------- | ------------------------------------------ |
| `$append` | Append entries to end of map/array |
| `$prepend` | Prepend entries to beginning of map/array |
| `$insert_after` | Insert after element with specified key/id |
| `$replace` | Replace entire value |
| `$remove` | Remove entries by key/id |
| direct value | Override scalar field |
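
To show how the operators in the table behave on a list, here is a TypeScript sketch restricted to arrays whose items are either strings or objects with an `id`. `applyListOps` is a hypothetical helper, and the order in which several operators on the same field are applied is an assumption, since the table does not define it.

```typescript
type ListItem = string | { id: string; [key: string]: unknown };

interface ListOps {
  $append?: ListItem[];
  $prepend?: ListItem[];
  $replace?: ListItem[];
  $remove?: string[];
  $insert_after?: { target: string; items: ListItem[] };
}

// Items are matched by their own value (strings) or by their id (objects).
function itemKey(item: ListItem): string {
  return typeof item === "string" ? item : item.id;
}

function applyListOps(base: ListItem[], ops: ListOps): ListItem[] {
  if (ops.$replace !== undefined) {
    return ops.$replace.slice(); // replace the entire value
  }
  let result = base.slice();
  const removeKeys = ops.$remove;
  if (removeKeys !== undefined) {
    result = result.filter((item) => removeKeys.indexOf(itemKey(item)) === -1);
  }
  const insertion = ops.$insert_after;
  if (insertion !== undefined) {
    let index = -1;
    for (let i = 0; i < result.length; i++) {
      if (itemKey(result[i]) === insertion.target) {
        index = i;
        break;
      }
    }
    if (index === -1) {
      throw new Error("$insert_after target not found: " + insertion.target);
    }
    result = result.slice(0, index + 1).concat(insertion.items, result.slice(index + 1));
  }
  if (ops.$prepend !== undefined) {
    result = ops.$prepend.concat(result);
  }
  if (ops.$append !== undefined) {
    result = result.concat(ops.$append);
  }
  return result;
}

// Base execution steps (ids taken from the task example; contents abbreviated).
const baseSteps: ListItem[] = [{ id: "read-specs" }, { id: "implement" }, { id: "run-db-lint" }];
const projectSteps = applyListOps(baseSteps, {
  $insert_after: {
    target: "run-db-lint",
    items: [{ id: "run-contract-pipeline", uses_tool: "api-pipeline" }],
  },
});
// Hypothetical base constraint extended by the project.
const projectConstraints = applyListOps(["Follow the coding standard"], {
  $append: ["Use only approved external libraries"],
});
```

The base list is never mutated, so the same base definition can be extended by several projects independently.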
---
## Multi-team collaboration
`agent-contracts` supports multi-team workflows where teams declare public interfaces and consume each other's capabilities.
### Team Interface
A `team_interface` declares what a team exposes to the outside:
````yaml
team_interface:
  version: 1
  description: "Backend team public interface"
  accepts:
    workflows:
      implement:
        internal_workflow: feature-implement
        input_handoff: feature-request
        output_handoff: implementation-result
        description: "Request a feature implementation"
  exposes:
    artifacts:
      - api-contract
      - build-report
  constraints:
    - "feature-request must include acceptance_criteria"
````
Key design decisions:
* **Workflow-level accepts** — external callers invoke a workflow, not individual tasks
* **Explicit mapping** — `internal_workflow` separates the stable public name from the internal workflow ID
* **Listing-based exposure** — an entity is external only if listed in `team_interface`
### Imports
A team consumes another team's generated interface via `imports`:
````yaml
imports:
  backend:
    interface: ./teams/backend/team-interface.yaml
    version: ">=1"
````
Imported entities are referenced as `{team_id}.{public_name}` in cross-team workflow steps.
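
As a small illustration of the `{team_id}.{public_name}` form, the TypeScript sketch below splits a reference and checks the team against the declared imports. `parseTeamRef` is a hypothetical helper; splitting on the first dot assumes that team IDs contain no dots, which the text does not state.

```typescript
interface TeamRef {
  team: string;
  name: string;
}

// Split "{team_id}.{public_name}" on the first dot and verify the team is imported.
function parseTeamRef(ref: string, importedTeams: string[]): TeamRef {
  const dot = ref.indexOf(".");
  if (dot <= 0 || dot === ref.length - 1) {
    throw new Error("Expected {team_id}.{public_name}, got: " + ref);
  }
  const team = ref.slice(0, dot);
  if (importedTeams.indexOf(team) === -1) {
    throw new Error("Team not declared in imports: " + team);
  }
  return { team: team, name: ref.slice(dot + 1) };
}

const backendImplement = parseTeamRef("backend.implement", ["backend"]);
// backendImplement is { team: "backend", name: "implement" }
```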
### `team_task` workflow step
Cross-team delegation uses the `team_task` step type:
````yaml
workflow:
  execute-tests:
    steps:
      - type: team_task
        to_team: backend
        workflow: implement
        handoff: feature-request
        expects: implementation-result
        description: "Delegate implementation to backend team"
````
| Field | Description |
|-------|-------------|
| `to_team` | Team ID from `imports` |
| `workflow` | Public workflow name from the imported interface |
| `handoff` | Handoff type for the request |
| `expects` | Handoff type for the response |
### Generating a team interface
The `generate interface` command produces a self-contained `team-interface.yaml`:
````bash
agent-contracts generate interface -c agent-contracts.config.yaml
agent-contracts generate interface -c agent-contracts.config.yaml --team backend
agent-contracts generate interface -c agent-contracts.config.yaml -o custom-output.yaml
agent-contracts generate interface -c agent-contracts.config.yaml --dry-run
````
The output includes:
* Workflow entries with handoff key references
* A `handoff_types` section containing only schemas referenced by external workflows
* An `exposes.artifacts` section with type, description, and states
* Metadata (`team_id`, `team_name`, `version`, `generated_at`)
### Interface drift detection
The `check` command detects drift between the declared `team_interface` and the generated `team-interface.yaml`:
````bash
agent-contracts check -c agent-contracts.config.yaml
````
If a `team-interface.yaml` exists and differs from what would be regenerated, the check reports drift.
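Conceptually, drift detection is a comparison between the interface on disk and a freshly regenerated one. The sketch below illustrates that idea on already-parsed documents. It is an assumption-laden illustration, not the `check` command's actual algorithm: `stripVolatile` and `hasDrift` are hypothetical names, and ignoring `generated_at` and key order is an assumption made here so that a timestamp alone does not count as drift.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Drop the (assumed volatile) `generated_at` field and sort keys so that
// two documents with the same content compare equal regardless of key order.
function stripVolatile(value: Json): Json {
  if (Array.isArray(value)) return value.map(stripVolatile);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      if (key !== "generated_at") out[key] = stripVolatile(value[key]);
    }
    return out;
  }
  return value;
}

// True when the on-disk interface differs from what would be regenerated.
function hasDrift(onDisk: Json, regenerated: Json): boolean {
  return JSON.stringify(stripVolatile(onDisk)) !== JSON.stringify(stripVolatile(regenerated));
}
```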
For managing multiple teams from a single configuration file (shared bindings, vars, and `--team` filtering), see [Multi-team configuration](#multi-team-configuration).
---
## Variable substitution
When using `extends` to share a base DSL across projects, base definitions often contain values that differ per project (project name, language, repository URL, etc.).
`vars` in `agent-contracts.config.yaml` lets you define project-specific values that are substituted into DSL string values using `${vars.xxx}` syntax.
### Defining vars
Add a `vars` section to your config file. Values must be flat string key-value pairs.
````yaml
# agent-contracts.config.yaml
vars:
  project_name: "my-service"
  language: "TypeScript"
  repo_url: "https://github.com/org/my-service"
````
### Using placeholders in DSL
Use `${vars.<key>}` in any string value within the DSL YAML (base or project).
````yaml
# base/agent-contracts.yaml
agents:
  implementer:
    purpose: "Implements features for ${vars.project_name}"
    constraints:
      - "Use ${vars.language} for all implementations"
      - "Repository: ${vars.repo_url}"
````
### Processing order
Variable substitution happens **after** DSL resolution (`extends` merge) and **before** schema validation:
1. Load config (including `vars`)
2. Resolve DSL (load + merge `extends`)
3. Substitute `${vars.xxx}` in all string values
4. Validate schema
5. Render / lint / check
This ensures that merged strings from both base and project are substituted, and the resulting values pass schema validation.
### Error handling
If a placeholder references an undefined variable, the command exits with an error:
````
VarsSubstitutionError: Undefined variable "repo_url" in value "Repository: ${vars.repo_url}"
Defined vars: project_name, language
````
### Notes
- Only string values are substituted; object keys are not affected.
- `vars` is optional. If omitted, no substitution occurs.
- Patterns that do not match `${vars.<key>}` (e.g. `${env.HOME}`, `$vars.xxx`, `{{vars.xxx}}`) are left unchanged.
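The substitution rules in this section (string values only, keys untouched, non-matching patterns left alone, undefined variables rejected) can be sketched as a recursive walk over the resolved DSL. This is an illustrative sketch rather than the toolkit's code: `substituteVars` is a hypothetical name, and the allowed key characters (`A-Z`, `a-z`, `0-9`, `_`) are an assumption.

```typescript
type Dsl = null | boolean | number | string | Dsl[] | { [key: string]: Dsl };

// Matches only the exact `${vars.<key>}` form. Other shapes such as
// `${env.HOME}`, `$vars.xxx`, or `{{vars.xxx}}` do not match and pass through.
// Assumption: keys consist of letters, digits, and underscores.
const VARS_PATTERN = /\$\{vars\.([A-Za-z0-9_]+)\}/g;

function substituteVars(node: Dsl, vars: Record<string, string>): Dsl {
  if (typeof node === "string") {
    const text = node;
    return text.replace(VARS_PATTERN, (_match: string, key: string) => {
      if (!Object.prototype.hasOwnProperty.call(vars, key)) {
        throw new Error(
          `Undefined variable "${key}" in value "${text}"\nDefined vars: ${Object.keys(vars).join(", ")}`,
        );
      }
      return vars[key];
    });
  }
  if (Array.isArray(node)) return node.map((item) => substituteVars(item, vars));
  if (node !== null && typeof node === "object") {
    // Keys are copied verbatim; only values are substituted.
    const out: { [key: string]: Dsl } = {};
    for (const key of Object.keys(node)) out[key] = substituteVars(node[key], vars);
    return out;
  }
  // Numbers, booleans, and null are returned unchanged.
  return node;
}
```

Because the walk runs on the already merged document, strings contributed by the base and by the project are treated identically, which matches the processing order described above.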
---
## CLI
For the full CLI reference with all commands, options, arguments, exit codes, and AI agent policies, see the [CLI Reference](docs/cli-reference.md).
The CLI contract specification is defined in [`cli-contract.yaml`](cli-contract.yaml) using [CLI Contracts](https://github.com/foo-log-inc/cli-contracts). Commands that have side effects declare structured `effects` metadata, and the `--introspect` global option outputs the derived policy as JSON without executing the command.
### Installation
````bash
# Install globally
npm install -g agent-contracts
# Or add as a dev dependency
npm install -D agent-contracts
# Or run without installing
npx agent-contracts
````
### Main commands
| Command | Description |
| --------------------------------- | ------------------------------------------------------ |
| `agent-contracts resolve [path]` | Resolve `extends` inheritance and output resolved YAML |
| `agent-contracts validate [path]` | Validate schema and references |
| `agent-contracts lint [path]` | Run semantic lint |
| `agent-contracts generate` | Generate all artifacts (templates + guardrails + interface) |
| `agent-contracts generate templates` | Render template outputs from config |
| `agent-contracts generate guardrails` | Generate guardrail artifacts from bindings |
| `agent-contracts generate interface` | Generate team interface YAML from DSL |
| `agent-contracts score [path]` | Calculate DSL completeness score |
| `agent-contracts audit <type>` | Run LLM-based semantic audit (render/dsl/prompt/all) |
| `agent-contracts check` | Run resolve → validate → lint → render --check |
| `agent-contracts navigation-index` | Build artifact-centric navigation index |
| `agent-contracts artifact-coverage` | Measure file coverage by artifact definitions |
| `agent-contracts extract` | Extract embedded CLI contract specification |
| `agent-contracts render` | _(deprecated)_ Alias for `generate templates` |
The `[path]` argument defaults to `agent-contracts.yaml` in the current directory.
If `-c` / `--config` is specified, the DSL path from the config file is used.
All commands also accept `--team <id>` to limit execution to a single team when using a [multi-team configuration](#multi-team-configuration).
#### `--introspect` (global)
Any command can be invoked with `--introspect` to output the derived policy as JSON **without executing** the command. This is useful for AI agents to inspect what side effects a command would have before deciding whether to run it.
````bash
agent-contracts generate --introspect
agent-contracts audit --introspect
agent-contracts validate --introspect
````
The output follows the [CLI Contracts](https://github.com/foo-log-inc/cli-contracts) `IntrospectionResult` shape:
````json
{
  "command": "generate",
  "activeOptions": ["format"],
  "policy": {
    "riskLevel": "low",
    "requiresConfirmation": false,
    "idempotent": true,
    "sideEffects": ["file_write"],
    "reads": [],
    "writes": [
      {
        "kind": "semantic",
        "target": "configured render, guardrail, and interface output paths",
        "idempotent": true,
        "source": "command:generate"
      }
    ]
  }
}
````
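An agent could use this JSON as a pre-flight check before running a command. The sketch below shows one possible rule. The `IntrospectionResult` interface here is inferred only from the sample above (the authoritative shape lives in CLI Contracts), and `canRunUnattended` with its allow-list rule is a hypothetical policy, not something the CLI enforces.

```typescript
// Field types are inferred from the sample JSON above; treat them as assumptions.
interface IntrospectionResult {
  command: string;
  activeOptions: string[];
  policy: {
    riskLevel: string;
    requiresConfirmation: boolean;
    idempotent: boolean;
    sideEffects: string[];
    reads: unknown[];
    writes: { kind: string; target: string; idempotent: boolean; source: string }[];
  };
}

// Hypothetical agent-side rule: run unattended only when the command is low risk,
// needs no confirmation, and every declared side effect is on the allow-list.
function canRunUnattended(result: IntrospectionResult, allowedEffects: string[]): boolean {
  const policy = result.policy;
  return (
    policy.riskLevel === "low" &&
    !policy.requiresConfirmation &&
    policy.sideEffects.every((effect) => allowedEffects.indexOf(effect) !== -1)
  );
}

// Usage: parse the output of `agent-contracts generate --introspect`.
const introspected: IntrospectionResult = JSON.parse(
  '{"command":"generate","activeOptions":["format"],"policy":{"riskLevel":"low",' +
    '"requiresConfirmation":false,"idempotent":true,"sideEffects":["file_write"],' +
    '"reads":[],"writes":[]}}',
);
console.log(canRunUnattended(introspected, ["file_write"]));
```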
#### `resolve` options
| Option | Description |
|--------|-------------|
| `--format <text\|json>` | Output format (default: `text`) |
| `--expand-defaults` | Expand all Zod default values in output. Fields like `required_validations: []`, `tags: []`, and `can_read_artifacts: []` are written explicitly instead of being silently applied by schema defaults. |
| `-c, --config <path>` | Path to `agent-contracts.config.yaml` |
| `--team <id>` | Limit to one team (multi-team config only) |
#### `score` options
| Option | Description |
|--------|-------------|
| `--format <text\|json>` | Output format (default: `text`) |
| `--threshold <number>` | Minimum score; exit 1 if below (for CI gates) |
| `-c, --config <path>` | Path to `agent-contracts.config.yaml` |
| `--team <id>` | Limit to one team (multi-team config only) |
#### `audit` options
| Option | Description |
|--------|-------------|
| `--format <text\|json\|markdown>` | Output format (default: `text`) |
| `--scope <filter>` | Limit audit scope (e.g. `agents:architect,implementer`) |
| `--dry-run` | Output the audit prompt without calling the LLM |
| `--adapter <name>` | SDK adapter: `claude`, `openai`, `gemini`, `cursor` (overrides config) |
| `--model <name>` | LLM model override (overrides config) |
| `-l, --log-file <path>` | Write structured agent progress log to a file |
| `-c, --config <path>` | Path to `agent-contracts.config.yaml` |
| `--team <id>` | Limit to one team (multi-team config only) |
The `audit` command requires `agent-contracts-runtime` (optional peer dependency) to be installed. Configure the default adapter and model in `agent-contracts.config.yaml`:
```yaml
audit:
  adapter: openai
  model: gpt-4.1
```
#### `artifact-coverage` options
| Option | Description |
|--------|-------------|
| `--format <text\|json>` | Output format (default: `text`) |
| `--min-coverage <number>` | Minimum coverage %; exit 1 if below (for CI gates) |
| `-c, --config <path>` | Path to `agent-contracts.config.yaml` |
| `--team <id>` | Limit to one team (multi-team config only) |
Configure additional exclude patterns in `agent-contracts.config.yaml`:
```yaml
artifact_coverage:
  exclude_patterns:
    - "*.lock"
    - "**/*.snap"
    - ".cursor/**"
```
#### `score` dimensions
The `score` command evaluates seven dimensions:
| Dimension | What it measures | Weight |
|-----------|-----------------|--------|
| Artifact validation coverage | % of artifacts with non-empty `required_validations` | High |
| Task validation coverage | % of tasks with at least one entry in `validations` | High |
| Guardrail policy coverage | % of guardrails referenced by at least one policy rule | Medium |
| Workflow validation integration | % of blocking validations referenced in workflow steps or tasks | High |
| Schema completeness | % of optional fields filled (description, rationale, trigger, etc.) | Low |
| Cross-reference bidirectionality | % of agent↔artifact, agent↔tool refs that are reciprocated | Medium |
| Guardrail scope resolution | % of guardrail scope entries that resolve to existing entities | Medium |
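To make the role of the weights concrete, here is a sketch of how per-dimension percentages might combine into one score and feed a `--threshold` gate. The numeric weights (3, 2, 1) are purely hypothetical, since this README only states High, Medium, and Low; the weighted-average formula and the names `overallScore` and `exitCodeFor` are likewise assumptions, not the `score` command's documented algorithm.

```typescript
// Hypothetical numeric weights; the table above only says High / Medium / Low.
const WEIGHTS = { High: 3, Medium: 2, Low: 1 } as const;
type Weight = keyof typeof WEIGHTS;

interface DimensionResult {
  dimension: string;
  percent: number; // 0 to 100
  weight: Weight;
}

// Weighted average of the dimension percentages (assumed formula).
function overallScore(results: DimensionResult[]): number {
  let weighted = 0;
  let total = 0;
  for (const r of results) {
    weighted += r.percent * WEIGHTS[r.weight];
    total += WEIGHTS[r.weight];
  }
  return total === 0 ? 0 : weighted / total;
}

// Mirrors the documented `--threshold` behavior: exit 1 when below the minimum.
function exitCodeFor(score: number, threshold: number): number {
  return score < threshold ? 1 : 0;
}
```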
### Common usage
````bash
agent-contracts resolve
agent-contracts resolve --expand-defaults --format json
agent-contracts validate
agent-contracts lint --strict
agent-contracts score
agent-contracts score -c agent-contracts.config.yaml --threshold 70
agent-contracts score --format json
agent-contracts generate -c agent-contracts.config.yaml
agent-contracts generate templates -c agent-contracts.config.yaml
agent-contracts generate templates -c agent-contracts.config.yaml --check
agent-contracts check -c agent-contracts.config.yaml --strict
agent-contracts generate interface -c agent-contracts.config.yaml
agent-contracts generate interface -c agent-contracts.config.yaml --dry-run
agent-contracts generate interface -c agent-contracts.config.yaml --format json
agent-contracts audit dsl -c agent-contracts.config.yaml
agent-contracts audit render -c agent-contracts.config.yaml --format json
agent-contracts audit all -c agent-contracts.config.yaml --adapter claude
agent-contracts audit dsl -c agent-contracts.config.yaml --dry-run
agent-contracts navigation-index
agent-contracts navigation-index --format yaml
agent-contracts navigation-index --artifact api-contracts
agent-contracts artifact-coverage
agent-contracts artifact-coverage --format json
agent-contracts artifact-coverage --min-coverage 80
agent-contracts artifact-coverage -c agent-contracts.config.yaml
````
---
## Config-driven rendering
Rendering is configured via `agent-contracts.config.yaml`.
````yaml
dsl: ./agent-contracts.yaml
vars:
  project_name: "my-service"
  language: "TypeScript"
  repo_url: "https://github.com/org/my-service"
renders:
  - template: ./templates/agent-prompt.md.hbs
    context: agent
    output: ./output/{agent.id}.md
  - template: ./templates/overview.md.hbs
    context: system
    output: ./output/overview.md
````
This lets you generate static outputs for:
* agent prompts
* task specs
* overviews
* artifact docs
* validation docs
* workflow docs
all from the same resolved DSL.
### Multi-team configuration
When several teams (for example backend, QA, infra) are managed from one workspace, you can list every team in a single config file instead of maintaining separate configs.
This complements the DSL-level [multi-team collaboration](#multi-team-collaboration) features (`team_interface`, `imports`, `team_task`).
````yaml
teams:
  _defaults:
    bindings:
      - ./bindings/cursor.yaml
    vars:
      language: TypeScript
    paths:
      cursor_root: .cursor
    active_guardrail_policy: default-enforcement
  backend:
    dsl: ./teams/backend/agent-contracts.yaml
    interface_output: ./teams/backend/team-interface.yaml
    bindings:
      - ./teams/backend/bindings/observability.yaml
    vars:
      team_name: backend
  qa:
    dsl: ./teams/qa/agent-contracts.yaml
    vars:
      team_name: qa
````
**`_defaults`:** Reserved meta-entry in the `teams` map. It uses the same schema as team entries except `dsl` is not required. Values are inherited by all teams. The underscore prefix avoids colliding with real team IDs.
**Merge with `_defaults`:**
* `bindings` — `_defaults` bindings are prepended before team-specific bindings
* `vars` — shallow merge; team values win
* `paths` — shallow merge; team values win
* `active_guardrail_policy` — team wins when present
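The four merge rules above can be expressed as one small function. This is a sketch of the described semantics only: `TeamEntry` models just the fields shown in this README, and `mergeWithDefaults` is a hypothetical name, not an exported API.

```typescript
// Only the fields shown in this README are modeled.
interface TeamEntry {
  dsl?: string;
  interface_output?: string;
  bindings?: string[];
  vars?: Record<string, string>;
  paths?: Record<string, string>;
  active_guardrail_policy?: string;
}

function mergeWithDefaults(defaults: TeamEntry, team: TeamEntry): TeamEntry {
  return {
    ...defaults,
    ...team,
    // `_defaults` bindings come first, then team-specific bindings.
    bindings: (defaults.bindings ?? []).concat(team.bindings ?? []),
    // Shallow merges: team values win on key collisions.
    vars: { ...defaults.vars, ...team.vars },
    paths: { ...defaults.paths, ...team.paths },
    // Team wins when present; otherwise fall back to the default.
    active_guardrail_policy: team.active_guardrail_policy ?? defaults.active_guardrail_policy,
  };
}

// Usage with the `backend` team from the example config.
const mergedBackend = mergeWithDefaults(
  {
    bindings: ["./bindings/cursor.yaml"],
    vars: { language: "TypeScript" },
    paths: { cursor_root: ".cursor" },
    active_guardrail_policy: "default-enforcement",
  },
  {
    dsl: "./teams/backend/agent-contracts.yaml",
    bindings: ["./teams/backend/bindings/observability.yaml"],
    vars: { team_name: "backend" },
  },
);
console.log(mergedBackend.bindings);
```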
All commands accept `--team <id>` to run against a single team:
````bash
agent-contracts validate -c config.yaml # all teams
agent-contracts validate -c config.yaml --team backend # one team
agent-contracts check -c config.yaml --team qa # one team
````
The `check` command also validates that imported interface files exist on disk (cross-team references).
**Design constraints:**
* `dsl` and `teams` are mutually exclusive at the config root
* Every team except `_defaults` must specify `dsl`
* Existing single-team configs (top-level `dsl` only) remain valid unchanged
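These constraints are simple enough to check structurally. The sketch below is one way to do so; `RootConfig` and `checkConfigRoot` are hypothetical names, and the real toolkit may enforce the same rules differently (for example, through its schema layer).

```typescript
// Minimal, hypothetical view of the config root; other fields are omitted.
interface RootConfig {
  dsl?: string;
  teams?: Record<string, { dsl?: string }>;
}

// Returns a list of violations; an empty list means the root is well-formed.
function checkConfigRoot(config: RootConfig): string[] {
  const errors: string[] = [];
  if (config.dsl !== undefined && config.teams !== undefined) {
    errors.push("'dsl' and 'teams' are mutually exclusive at the config root");
  }
  const teams = config.teams ?? {};
  for (const id of Object.keys(teams)) {
    // `_defaults` is the reserved meta-entry and does not need a DSL path.
    if (id !== "_defaults" && !teams[id].dsl) {
      errors.push(`team "${id}" must specify 'dsl'`);
    }
  }
  return errors;
}
```

A single-team config such as `{ dsl: "./agent-contracts.yaml" }` passes unchanged, which reflects the backward-compatibility constraint.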
### Artifact binding (config-level)
The `artifact_binding` config field connects DSL artifact definitions to an external artifact registry (e.g., `artifact-contracts.yaml`). Registry values override DSL defaults using deep-merge semantics.
Two forms are supported:
````yaml
# Simple form (IDs match between DSL and registry)
artifact_binding: ./artifact-contracts.yaml
# Explicit mapping form (IDs differ)
artifact_binding:
  source: ./artifact-contracts.yaml
  mappings:
    openapi-spec: billing_api_contract
````
**Merge semantics:**
* Registry fields override DSL fields (deep-merge at field level)
* DSL-only fields are preserved
* `{var}` templates in `path_patterns` are substituted using `config.paths`
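The merge semantics above can be illustrated with a short sketch. It is not the toolkit's implementation: `deepMerge` and `expandPathPattern` are hypothetical names, arrays are assumed to be replaced wholesale rather than concatenated, and unknown `{var}` names are assumed to be left untouched.

```typescript
type Value = null | boolean | number | string | Value[] | { [key: string]: Value };
type Obj = { [key: string]: Value };

function isObj(v: Value | undefined): v is Obj {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Registry fields win; nested objects merge recursively; DSL-only fields survive.
// Assumption: arrays and scalars from the registry replace the DSL value outright.
function deepMerge(dsl: Obj, registry: Obj): Obj {
  const out: Obj = { ...dsl };
  for (const key of Object.keys(registry)) {
    const a = out[key];
    const b = registry[key];
    out[key] = isObj(a) && isObj(b) ? deepMerge(a, b) : b;
  }
  return out;
}

// Replace `{var}` in a path pattern using `config.paths`.
// Assumption: names missing from `paths` are left as-is.
function expandPathPattern(pattern: string, paths: Record<string, string>): string {
  return pattern.replace(/\{([A-Za-z0-9_]+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(paths, key) ? paths[key] : match,
  );
}
```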
**Diagnostics:**
| Rule | Severity | Description |
|------|----------|-------------|
| `unbound-artifact` | warning | DSL artifact has no registry counterpart |
| `orphan-binding` | warning | Registry artifact has no DSL counterpart |
| `type-mismatch` | warning | DSL and registry disagree on `type`/`authority` |
**Placement:** Top-level for single-team configs, or per-team in `teams` (inheritable from `_defaults`).
Currently consumed by the `navigation-index` and `artifact-coverage` commands. When not configured, behavior is unchanged (full backward compatibility).
### Render target options
Each entry in `renders` supports these fields:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `template` | string | yes | Path to Handlebars template |
| `context` | string | yes | Context type (see below) |
| `output` | string | yes | Output file path (supports `{<context>.id}` placeholder) |
| `include` | string[] | no | Only render these entity IDs (not with `system`) |
| `exclude` | string[] | no | Skip these entity IDs (not with `system`) |
| `skip_empty` | boolean | no | When `true`, if the rendered output is empty or whitespace-only, the file is **not written**. If the file already exists, it is **deleted**. |
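The fields in this table interact in three places: choosing which entities to render, expanding the output path, and deciding what happens to an empty result. The sketch below walks through each. All function names are hypothetical, and the behavior when both `include` and `exclude` are set (apply both filters) is an assumption.

```typescript
// Hypothetical subset of a render target; `template` is omitted for brevity.
interface RenderTarget {
  context: string;
  output: string;
  include?: string[];
  exclude?: string[];
  skip_empty?: boolean;
}

// `include` narrows the set of entity IDs; `exclude` removes from it.
function selectIds(target: RenderTarget, allIds: string[]): string[] {
  return allIds.filter(
    (id) =>
      (target.include === undefined || target.include.indexOf(id) !== -1) &&
      (target.exclude === undefined || target.exclude.indexOf(id) === -1),
  );
}

// Expand the `{<context>.id}` placeholder, e.g. "./output/{agent.id}.md".
function outputPath(target: RenderTarget, id: string): string {
  return target.output.split(`{${target.context}.id}`).join(id);
}

// With `skip_empty`, empty or whitespace-only output is never written,
// and a previously written file is deleted.
function planWrite(
  target: RenderTarget,
  rendered: string,
  fileExists: boolean,
): "write" | "delete" | "skip" {
  if (target.skip_empty === true && rendered.trim() === "") {
    return fileExists ? "delete" : "skip";
  }
  return "write";
}
```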
#### `skip_empty` usage
`skip_empty` is useful when a single template applies to all entities of a context type, but only some entities produce meaningful output.
For example, when using `context: tool` to generate per-tool scripts, tools without an `x-script` property would produce empty files. Wi