@dollhousemcp/mcp-server
DollhouseMCP - A Model Context Protocol (MCP) server that enables dynamic AI persona management from markdown files, allowing Claude and other compatible AI assistants to activate and switch between different behavioral personas.
/**
* Autonomy Evaluator Service
*
* Determines whether an agent should continue autonomously or pause
* for human input after each step. It drives the automatic
* continue/pause decisions in the agentic loop.
*
* Decision factors:
* 1. Step count vs maxAutonomousSteps
* 2. Next action vs requiresApproval patterns
* 3. Safety tier (ALLOW/VERIFY/DENY)
* 4. Risk tolerance configuration
*
* ## Danger Zone Verification Flow (Issue #142)
*
* When an agent action triggers the DANGER_ZONE or VERIFY safety tier,
* a human-in-the-loop verification is required before the agent can proceed.
*
* End-to-end sequence:
*
* 1. Agent proposes an action (nextActionHint) during the agentic loop
* 2. evaluateAutonomy() calls checkSafetyTier() → returns DANGER_ZONE or VERIFY
* 3. createVerificationChallenge() generates a challenge with:
* - challengeId (UUID v4 via crypto.randomUUID)
* - displayCode (human-readable code, e.g. "ABC123")
* - expiresAt (configurable, default 5 minutes)
* 4. storeAndDisplayChallenge():
* a. Stores {code, expiresAt, reason} in VerificationStore (server-side, one-time use)
* b. Shows displayCode to human via OS-native dialog (AppleScript/zenity/PowerShell)
* c. displayCode is NEVER included in the LLM-facing directive
* 5. For DANGER_ZONE: DangerZoneEnforcer.block() programmatically blocks the agent
* with verificationId=challengeId (file-backed, survives restarts)
* 6. LLM receives a directive containing verification.verificationId and actionable
* guidance ("Use verify_challenge { challenge_id, code }"), but NO displayCode
* 7. Human reads the code from the OS dialog and relays it to the LLM
* 8. LLM calls verify_challenge MCP-AQL operation
* 9. MCPAQLHandler verify handler:
* a. Validates UUID v4 format (rejects invalid IDs before store lookup)
* b. Checks rate limit (max 10 failures per 60s sliding window)
* c. Distinguishes expired (VERIFICATION_EXPIRED) from wrong code (VERIFICATION_FAILED)
* d. On success: finds blocked agent by challengeId → DangerZoneEnforcer.unblock()
* e. Logs granular security events at each stage
* 10. Agent retries the operation → passes (no longer blocked)
*
* Security invariants:
* - displayCode never reaches the LLM (stripped before directive is built)
* - Codes are one-time use (VerificationStore deletes after any verify attempt)
* - Expired challenges are rejected with distinct VERIFICATION_EXPIRED events
* - Challenge IDs are validated as UUID v4 before store lookup (anti-enumeration)
* - Failed attempts are rate-limited globally (anti-brute-force)
* - All verification events are logged with granular types for monitoring
*
* Part of the Agentic Loop Completion (Epic #380).
*
* @since v2.0.0
*/
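/*
* Illustrative sketch of the one-time-use and expiry invariants described above.
* This is NOT the package's VerificationStore: the class name, method names,
* result strings other than VERIFICATION_EXPIRED / VERIFICATION_FAILED, and the
* injected `now` parameter are assumptions made for the example.
*
```typescript
// Hypothetical result type; "INVALID_ID" is an assumed name for the UUID-format rejection.
type SketchVerifyResult = "OK" | "VERIFICATION_EXPIRED" | "VERIFICATION_FAILED" | "INVALID_ID";

interface SketchStoredChallenge {
  code: string;
  expiresAt: number; // epoch milliseconds
  reason: string;
}

// UUID v4: version nibble is 4, variant nibble is 8, 9, a, or b.
const SKETCH_UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

class SketchVerificationStore {
  private readonly challenges = new Map<string, SketchStoredChallenge>();

  store(challengeId: string, challenge: SketchStoredChallenge): void {
    this.challenges.set(challengeId, challenge);
  }

  verify(challengeId: string, code: string, now: number = Date.now()): SketchVerifyResult {
    // Reject malformed IDs before touching the store.
    if (!SKETCH_UUID_V4.test(challengeId)) return "INVALID_ID";

    const challenge = this.challenges.get(challengeId);
    if (challenge === undefined) return "VERIFICATION_FAILED";

    // One-time use: the entry is removed after any attempt, right or wrong.
    this.challenges.delete(challengeId);

    // Expiry is reported separately from a wrong code.
    if (now > challenge.expiresAt) return "VERIFICATION_EXPIRED";
    return code === challenge.code ? "OK" : "VERIFICATION_FAILED";
  }
}
```
* Because the entry is deleted on every attempt, a wrong guess consumes the
* challenge; a later call with the correct code also fails.
*/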
import type { AutonomyContext, AutonomyDirective, AgentAutonomyConfig } from './types.js';
/**
* Default autonomy configuration
*
* Used when an agent doesn't specify autonomy settings.
* Conservative defaults prioritize safety over speed.
*
* Note: maxAutonomousSteps uses a getter to support env-var overrides
* (Issue #390). The static value here serves as the baseline; callers
* should prefer mergeWithDefaults(), which reads the live config.
*/
export declare const DEFAULT_AUTONOMY_CONFIG: Required<AgentAutonomyConfig>;
/**
* Snapshot of autonomy evaluation metrics.
* Counters are held in memory only (not persisted) and reset on server restart.
*/
export interface AutonomyMetricsSnapshot {
/** Total evaluateAutonomy() calls since startup */
totalEvaluations: number;
/** Number of evaluations that returned continue=true */
continueCount: number;
/** Number of evaluations that returned continue=false */
pauseCount: number;
/** Distribution of pause reasons (reason string → count) */
pauseReasons: Record<string, number>;
/** Number of danger zone blocks triggered */
dangerZoneTriggered: number;
/** Number of verification challenges created */
verificationRequired: number;
/** Average step count at time of pause (0 if no pauses yet) */
averageStepCountAtPause: number;
}
/**
* Get the current autonomy evaluation metrics snapshot.
* Issue #391: Programmatic access for monitoring/diagnostics.
*/
export declare function getAutonomyMetrics(): AutonomyMetricsSnapshot;
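/*
* Illustrative sketch of how counters like those in AutonomyMetricsSnapshot
* could be accumulated, including the "0 if no pauses yet" rule for the average.
* The recorder class, its method names, and the "unspecified" default reason
* are hypothetical; the package's internal bookkeeping may differ.
*
```typescript
interface SketchMetrics {
  totalEvaluations: number;
  continueCount: number;
  pauseCount: number;
  pauseReasons: Record<string, number>;
  averageStepCountAtPause: number;
}

class SketchMetricsRecorder {
  private total = 0;
  private continues = 0;
  private pauses = 0;
  private stepSumAtPause = 0;
  private reasons: Record<string, number> = {};

  // Record one evaluation outcome; `reason` only matters for pauses.
  record(shouldContinue: boolean, stepCount: number, reason: string = "unspecified"): void {
    this.total += 1;
    if (shouldContinue) {
      this.continues += 1;
      return;
    }
    this.pauses += 1;
    this.stepSumAtPause += stepCount;
    this.reasons[reason] = (this.reasons[reason] ?? 0) + 1;
  }

  // Return a copy so callers cannot mutate the live counters.
  snapshot(): SketchMetrics {
    return {
      totalEvaluations: this.total,
      continueCount: this.continues,
      pauseCount: this.pauses,
      pauseReasons: { ...this.reasons },
      averageStepCountAtPause: this.pauses === 0 ? 0 : this.stepSumAtPause / this.pauses,
    };
  }
}
```
*/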
/**
* Reset autonomy metrics (for testing only).
* @internal
*/
export declare function resetAutonomyMetrics(): void;
/**
* Evaluate whether an agent should continue autonomously or pause
*
* This is the main entry point for autonomy decisions. Call this after
* each step to determine if the LLM can proceed or must wait for
* human approval.
*
* @param context - The autonomy evaluation context
* @returns AutonomyDirective indicating continue or pause
*/
export declare function evaluateAutonomy(context: AutonomyContext): AutonomyDirective;
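/*
* Illustrative sketch of the first decision factor only (step count vs
* maxAutonomousSteps). The function name, the directive shape, the reason
* string, and the choice of ">=" as the comparison are assumptions; the real
* evaluateAutonomy() also weighs approval patterns, safety tier, and risk
* tolerance, and takes an AutonomyContext rather than two numbers.
*
```typescript
interface SketchDirective {
  continue: boolean;
  reason?: string;
}

// Pause once the agent has used up its autonomous step budget.
function sketchEvaluateStepLimit(stepCount: number, maxAutonomousSteps: number): SketchDirective {
  if (stepCount >= maxAutonomousSteps) {
    return { continue: false, reason: "max_autonomous_steps_reached" };
  }
  return { continue: true };
}
```
*/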
/**
* Quick check if an action would be auto-approved given an autonomy config.
*
* Utility function for pre-checking actions before execution.
* Does NOT evaluate safety tiers or risk scores — only pattern matching.
*
* @param action - The proposed action string to check
* @param config - Optional autonomy config (uses defaults if not provided)
* @returns true if the action matches an autoApprove pattern or the agent
* uses aggressive tolerance; false if it matches a requiresApproval pattern
* or matches no pattern at all
*/
export declare function wouldAutoApprove(action: string, config?: AgentAutonomyConfig): boolean;
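/*
* Illustrative sketch of a pattern-only pre-check in the spirit of
* wouldAutoApprove(). Several details are assumptions, not taken from the
* package: patterns are treated as case-insensitive globs where "*" matches
* any run of characters, requiresApproval is checked before autoApprove, the
* tolerance field is named riskTolerance, and its values other than
* "aggressive" are invented for the example.
*
```typescript
interface SketchAutonomyConfig {
  riskTolerance?: "conservative" | "moderate" | "aggressive"; // field name and values assumed
  requiresApproval?: string[];
  autoApprove?: string[];
}

// Convert a simple glob into an anchored, case-insensitive RegExp.
function sketchGlobToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters except "*", then expand "*" to ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.split("*").join(".*") + "$", "i");
}

function sketchWouldAutoApprove(action: string, config: SketchAutonomyConfig = {}): boolean {
  const matchesAny = (patterns: string[] = []): boolean =>
    patterns.some((p) => sketchGlobToRegExp(p).test(action));

  // Assumed precedence: an explicit approval requirement always wins.
  if (matchesAny(config.requiresApproval)) return false;
  if (matchesAny(config.autoApprove)) return true;
  // No pattern matched: only aggressive tolerance auto-approves.
  return config.riskTolerance === "aggressive";
}
```
* Checking requiresApproval first is the conservative choice: an aggressive
* tolerance setting cannot override an action the config explicitly gates.
*/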
//# sourceMappingURL=autonomyEvaluator.d.ts.map