@blundergoat/goat-flow

Version:

AI coding agent harness and local dashboard for Claude Code, OpenAI Codex, Google Antigravity, and GitHub Copilot - setup audits, guardrails, structured skills, deny hooks, and persistent learning loops.

github.com/blundergoat/goat-flow

blundergoat/goat-flow

90 lines (69 loc) • 9.98 kB

Markdown

View Raw

# Deterministic Audit Checks `npx goat-flow audit` currently registers **37 deterministic checks**: - **20 build checks**: 16 setup-scope checks plus 4 agent-scope checks - **17 harness checks**: additional checks enabled by `--harness` Default `npx goat-flow audit .` runs the build checks. `npx goat-flow audit . --harness` runs those same build checks plus the harness checks. Harness checks are still deterministic even when they are typed as `integrity`, `advisory`, or `metric`; the type changes scoring behavior, not whether the check is deterministic. Machine-readable output is available with `--format json` or `--format sarif`. SARIF output registers each active audit check id as a SARIF rule id and emits results only for actionable audit findings: failing setup/agent/harness checks plus drift/content findings when those optional checks are enabled. SARIF rule ids are derived from the stable goat-flow check ids below, so renaming a check id is a downstream alert-identity change for SARIF consumers. Source of truth: - `src/cli/audit/check-goat-flow.ts` exports `SETUP_CHECKS` - `src/cli/audit/check-agent-setup.ts` exports `AGENT_CHECKS` - `src/cli/audit/harness/index.ts` exports `HARNESS_CHECKS` ## Build Checks Build mode is the structural install gate. It validates files, directories, config, skills, settings, and deny wiring. It does **not** execute the project's lint, test, or build toolchain commands. ### Setup Scope (16) | Check id | Display name | What it validates | |----------|--------------|-------------------| | `lessons` | Lessons | `.goat-flow/learning-loop/lessons/` and `.goat-flow/learning-loop/lessons/README.md` exist | | `footguns` | Footguns | `.goat-flow/learning-loop/footguns/` and `.goat-flow/learning-loop/footguns/README.md` exist | | `architecture` | Architecture | `.goat-flow/architecture.md` exists | | `code-map` | Code map | `.goat-flow/code-map.md` exists | | `glossary` | Glossary | `.goat-flow/glossary.md` exists | | `patterns` | Patterns | `.goat-flow/learning-loop/patterns/README.md` exists | | `decisions` | Decisions | `.goat-flow/learning-loop/decisions/` exists | | `session-logs` | Session logs | `.goat-flow/logs/sessions/` exists | | `plans` | Plans | `.goat-flow/plans/`, `.goat-flow/plans/.gitignore`, and `.goat-flow/plans/README.md` exist | | `scratchpad` | Scratchpad | `.goat-flow/scratchpad/`, `.goat-flow/scratchpad/.gitignore`, and `.goat-flow/scratchpad/README.md` exist | | `goat-flow-gitignore` | goat-flow gitignore exceptions | `.goat-flow/.gitignore` exists and contains the `!learning-loop/`, `!learning-loop/**`, `!skill-docs/`, `!skill-docs/**`, `!hooks/`, `!hooks/**`, `!plans/`, and `!plans/**` un-ignore entries. Catches stale installs whose gitignore silently hides committed goat-flow surfaces from git. Remediation re-runs the installer and prompts a `git add` of the previously hidden directories | | `instruction-file-skill-docs-pointer` | Instruction file skill-docs/playbooks pointer | Requires the full skill-docs and playbook pack, a READ-step rule to consult `.goat-flow/skill-docs/playbooks/` and run playbook Availability Checks before declaring tools unavailable, and a Router Table pointer in each present instruction file (`CLAUDE.md`, `AGENTS.md`, `.github/copilot-instructions.md`). Missing `.goat-flow/skill-docs/` or `.goat-flow/skill-docs/playbooks/` files fail here instead of falling through to `other-files` | | `other-files` | Other required files | Every manifest-required file or directory not already covered by a named setup check exists, including local log README surfaces | | `config-parses` | Config file | `.goat-flow/config.yaml` exists, parses as YAML, and validates against the manifest-backed config contract | | `config-version` | Config version | `.goat-flow/config.yaml` declares the current `AUDIT_VERSION` | | `hook-version` | Hook version | Installed hook dispatchers in `.goat-flow/hooks/` (`deny-dangerous.sh`, `gruff-code-quality.sh`, `post-turn-safety.sh`) carry the current `goat-flow-hook-version` stamp; a missing or behind stamp signals a partial upgrade and the fix re-runs `hooks sync` | ### Agent Scope (4) | Check id | Display name | What it validates | |----------|--------------|-------------------| | `agent-instruction` | Agent instruction file | The selected agent's instruction file exists; for Copilot, `.github/copilot-instructions.md` must also reference `docs/coding-standards/git-commit.md` under a `## Commit Messages` section. Without `--agent`, this also detects orphaned agent artifacts and incomplete Copilot installs | | `agent-skills` | Agent skills | The selected agent has every canonical skill file, each installed skill declares the current `goat-flow-skill-version`, and no deprecated skill directories remain | | `agent-settings` | Agent settings | The selected agent's settings file parses as valid JSON or TOML | | `agent-guardrails` | Agent deny mechanism | The selected agent has a deny mechanism, any installed shell hooks pass `bash -n`, deny patterns exist, installed `deny-dangerous.sh` plus `.goat-flow/hooks/deny-dangerous/` match the workflow templates, `deny-dangerous-self-test.sh --self-test=smoke` passes when the hook scripts exist, and a runtime-shaped blocked Bash payload is denied through the registered hook path | Aggregate-mode nuance: - The audit report always includes these 4 registered agent-scope checks. - Without `--agent <id>`, only `agent-instruction` can actively fail; the other 3 agent-scope checks are effectively no-ops until the audit is scoped to a concrete agent. ## Harness Checks `npx goat-flow audit . --harness` adds **17** deterministic harness-completeness checks on top of the 20 build checks. These checks are grouped by concern and typed as `integrity`, `advisory`, or `metric`. JSON output exposes each check's raw `status` plus `displayStatus`, `impact`, and optional `assurance` so score-only metric/advisory warnings and platform-limited passes do not look like ordinary hard failures or full-assurance passes. | Concern | Check id | Type | What it validates | |---------|----------|------|-------------------| | Context | `instruction-line-count` | `advisory` | Each configured instruction file stays within `lineLimits.limit` from `.goat-flow/config.yaml` | | Context | `execution-loop-present` | `advisory` | Structural smoke check for the Execution Loop heading plus READ / SCOPE / ACT / VERIFY vocabulary | | Context | `doc-paths-resolve` | `integrity` | Router-table paths, `.goat-flow/architecture.md` backtick paths, and curated audit/glossary docs backtick paths resolve to real files | | Context | `instruction-sections-present` | `advisory` | Structural smoke check for required hot-path headings: Truth Order, Execution Loop, Definition of Done, and Router Table | | Context | `boundary-guidance-present` | `advisory` | Structural smoke check for workspace boundary guidance (controlling workspace vs target workspace separation) | | Constraints | `deny-covers-secrets` | `integrity` | Direct literal secret-path reads are blocked by the deny layer; agents with file-read deny need both settings/Codex permission coverage and Bash-hook direct-path coverage. Codex permission coverage is limited to exact paths and trailing `/**` subtrees accepted by the current CLI. Script-only agents can pass with `assurance: "limited"` because file-read deny is unavailable; a 100 constraints score means known deny-pattern coverage, not broad file read/write enforcement. | | Constraints | `deny-blocks-dangerous` | `integrity` | Deny patterns block broad recursive deletion, all git push (ADR-025), and `chmod` | | Constraints | `deny-blocks-pipe-to-shell` | `advisory` | Deny patterns block `curl | bash` and `wget | sh` pipe-to-shell execution | | Constraints | `deny-hook-registered` | `integrity` | A deny hook that exists on disk is registered in the correct pre-tool hook slot | | Verification | `hooks-registered` | `integrity` | Post-turn hook registrations and on-disk hook files stay in sync | | Verification | `commit-guidance` | `advisory` | Commit guidance exists at the canonical `docs/coding-standards/git-commit.md`; old GitHub commit-guidance locations are flagged as misplaced | | Verification | `evidence-before-claims` | `metric` | Present instruction files carry the Hallucination red-flags clauses and Rationalisations-to-reject pointer | | Verification | `post-turn-hook-integrity` | `metric` | For agents with a manifest-backed post-turn event, reports whether the registered post-turn hook is the universal safety guard or a custom hook with literal validation commands and whether it swallows failures; absence or masking is no hook evidence, not proof. Agents with `supportsPostTurnHook=false` are skipped as not applicable instead of penalized. | | Recovery | `milestone-tracking` | `integrity` | `.goat-flow/plans/` exists; task count, checkbox completion, milestone status, and roadmap progress are optional local workflow state | | Recovery | `session-logs` | `integrity` | `.goat-flow/logs/sessions/` exists | | Feedback loop | `feedback-loop-active` | `integrity` | The lessons and footguns directories exist, with valid metadata and non-stale evidence references | | Feedback loop | `decisions-tracked` | `integrity` | `.goat-flow/learning-loop/decisions/` exists | ## Command Matrix | Command | Checks included | Notes | |---------|-----------------|-------| | `npx goat-flow audit .` | 16 setup + 4 agent = 20 build checks | Structural install gate only | | `npx goat-flow audit . --agent <id>` | Same 20 build checks, with agent checks enforced for the selected agent | Best way to validate one runtime's install state | | `npx goat-flow audit . --harness` | 20 build + 17 harness = 37 checks | Adds harness completeness, still deterministic | Harness mode is still structural. It does not judge whether the content is actually good for the project; that remains the job of `npx goat-flow quality`.