xc-mcp
MCP server that wraps Xcode command-line tools for iOS/macOS development workflows
/**
* Workflow: Tap Element - High-level semantic UI interaction
*
* Orchestrates multiple tools to find and tap a UI element by name/label:
* 1. accessibility-quality-check → Assess UI richness
* 2. idb-ui-find-element → Find element by semantic search
* 3. idb-ui-tap → Tap at element coordinates
* 4. (optional) idb-ui-input → Type text after tap
* 5. (optional) screenshot → Verify result
*
* This workflow keeps intermediate results internal, returning only the final outcome.
* Reduces agent context usage by ~80% compared to calling each tool manually.
*
* Part of the Programmatic Tool Calling pattern from Anthropic:
* https://www.anthropic.com/engineering/advanced-tool-use
*/
export interface TapElementArgs {
elementQuery: string;
inputText?: string;
verifyResult?: boolean;
udid?: string;
screenContext?: string;
}
export declare function workflowTapElementTool(args: TapElementArgs): Promise<{
content: {
type: "text";
text: string;
}[];
isError: boolean;
}>;
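/*
A minimal sketch of how the five steps listed in the header comment could be chained,
assuming each underlying tool is injected as a plain async function. The step signatures,
the FoundElement shape, the name tapElementSketch, and the choice to serialize the summary
as JSON are all hypothetical; this declaration file does not show the real implementation.

```typescript
// Hypothetical element shape; the real tool's output is not shown in this file.
interface FoundElement { label: string; x: number; y: number; }

// Hypothetical step functions standing in for the underlying tools.
interface WorkflowSteps {
  checkAccessibility(udid?: string): Promise<string>;
  findElement(query: string, udid?: string): Promise<FoundElement | null>;
  tap(x: number, y: number, udid?: string): Promise<void>;
  inputText(text: string, udid?: string): Promise<void>;
  screenshot(udid?: string): Promise<void>;
}

interface SketchArgs {
  elementQuery: string;
  inputText?: string;
  verifyResult?: boolean;
  udid?: string;
}

interface SketchResult {
  content: { type: "text"; text: string }[];
  isError: boolean;
}

async function tapElementSketch(args: SketchArgs, steps: WorkflowSteps): Promise<SketchResult> {
  const summary: Record<string, unknown> = { success: false };

  // Step 1: assess UI richness; only the summary string is kept.
  summary.accessibilityQuality = await steps.checkAccessibility(args.udid);

  // Step 2: semantic search; a miss ends the workflow with guidance.
  const element = await steps.findElement(args.elementQuery, args.udid);
  if (element === null) {
    summary.guidance = `No element matched "${args.elementQuery}"`;
    return { content: [{ type: "text", text: JSON.stringify(summary) }], isError: true };
  }

  // Step 3: tap at the discovered coordinates.
  await steps.tap(element.x, element.y, args.udid);
  summary.tappedElement = element;
  summary.success = true;

  // Step 4 (optional): text entry; a failure is recorded but is not fatal.
  if (args.inputText !== undefined) {
    try {
      await steps.inputText(args.inputText, args.udid);
      summary.inputText = "entered";
    } catch {
      summary.inputText = "failed";
    }
  }

  // Step 5 (optional): verification screenshot; also non-fatal.
  if (args.verifyResult) {
    try {
      await steps.screenshot(args.udid);
      summary.verified = true;
    } catch {
      summary.verified = false;
    }
  }

  // Only the consolidated summary is returned to the caller.
  return { content: [{ type: "text", text: JSON.stringify(summary) }], isError: false };
}
```

Injecting the steps keeps the sketch testable with stubs and mirrors the documented behaviour
that input and screenshot errors do not fail the workflow.
*/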
/**
* Workflow Tap Element documentation for RTFM
*/
export declare const WORKFLOW_TAP_ELEMENT_DOCS = "\n# workflow-tap-element\n\nHigh-level semantic UI interaction - find and tap elements by name without coordinate hunting.\n\n## Overview\n\nOrchestrates accessibility-first UI automation in a single call:\n1. **Check Accessibility** - Assess UI richness for automation approach\n2. **Find Element** - Semantic search by label/identifier\n3. **Tap Element** - Execute tap at discovered coordinates\n4. **Input Text** (optional) - Type into tapped field\n5. **Verify Result** (optional) - Screenshot for confirmation\n\nThis workflow keeps intermediate results internal, reducing agent context usage by ~80% compared to calling each tool manually.\n\n## Parameters\n\n### Required\n- **elementQuery** (string): Search term for element (e.g., \"Login\", \"Submit\", \"Email\")\n - Case-insensitive partial matching (\"log\" matches \"Login\")\n\n### Optional\n- **inputText** (string): Text to type after tapping (for text fields)\n- **verifyResult** (boolean): Take screenshot after action (default: false)\n- **udid** (string): Target device - auto-detected if omitted\n- **screenContext** (string): Screen name for tracking (e.g., \"LoginScreen\")\n\n## Returns\n\nConsolidated result with:\n- **success**: Overall workflow success\n- **tappedElement**: Found element details (type, label, coordinates)\n- **inputText**: Text entry status (if requested)\n- **verified**: Screenshot status (if requested)\n- **accessibilityQuality**: UI richness assessment\n- **totalDuration**: Total workflow time\n- **guidance**: Next steps\n\n## Examples\n\n### Tap Login Button\n```json\n{\"elementQuery\": \"Login\"}\n```\nFinds and taps the Login button.\n\n### Tap Email Field and Enter Text\n```json\n{\n \"elementQuery\": \"Email\",\n \"inputText\": \"user@example.com\",\n \"screenContext\": \"LoginScreen\"\n}\n```\nFinds email field, taps it, enters text.\n\n### Full Verification Workflow\n```json\n{\n \"elementQuery\": \"Submit\",\n \"verifyResult\": true,\n \"screenContext\": \"SignupForm\"\n}\n```\nTaps Submit button and captures verification screenshot.\n\n## Why Use This Workflow?\n\n### Token Efficiency\n- **Manual approach**: 4-5 tool calls \u00D7 ~50 tokens each = ~200+ tokens in responses\n- **Workflow approach**: 1 call with consolidated response = ~80 tokens\n\n### Reduced Context Pollution\n- Intermediate accessibility data not exposed\n- Element search results summarized\n- Only actionable outcome returned\n\n### Error Handling\n- Graceful degradation on partial failures\n- Helpful guidance when element not found\n- Clear troubleshooting steps\n\n## Related Tools\n\n- **idb-ui-find-element**: Direct element search (used internally)\n- **idb-ui-tap**: Direct tap (used internally)\n- **accessibility-quality-check**: Direct quality check (used internally)\n- **workflow-fresh-install**: Clean app installation workflow\n\n## Notes\n\n- Falls back gracefully if accessibility is minimal\n- Non-fatal errors (input, screenshot) don't fail the workflow\n- Element matching uses partial, case-insensitive search\n- Small delay between tap and input for keyboard appearance\n";
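/*
The docs above describe elementQuery matching as partial and case-insensitive ("log" matches
"Login"). A minimal sketch of that rule follows, assuming matching is a plain substring test
on the element label. The names matchesQuery, findFirstMatch, and LabeledElement are
hypothetical and are not part of this package; the real search may also consult identifiers
or rank its candidates.

```typescript
// Hypothetical minimal element shape for illustration.
interface LabeledElement { label: string; }

// True when the query occurs anywhere in the label, ignoring case.
// Note that an empty query matches every label.
function matchesQuery(label: string, query: string): boolean {
  return label.toLowerCase().indexOf(query.toLowerCase()) !== -1;
}

// Returns the first element whose label matches, or undefined when none does.
function findFirstMatch<T extends LabeledElement>(elements: T[], query: string): T | undefined {
  const hits = elements.filter((el) => matchesQuery(el.label, query));
  return hits.length > 0 ? hits[0] : undefined;
}
```

With this rule a short query such as "log" can match several labels, so a more specific
elementQuery makes the result easier to predict.
*/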
export declare const WORKFLOW_TAP_ELEMENT_DOCS_MINI = "Find and tap elements by name. Use rtfm({ toolName: \"workflow-tap-element\" }) for docs.";
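/*
The Token Efficiency bullets in WORKFLOW_TAP_ELEMENT_DOCS can be checked with a few lines of
arithmetic. The sketch below uses only the figures stated there (4-5 calls at ~50 tokens each
versus ~80 tokens for one consolidated response); reductionPercent is a hypothetical helper,
not part of the package.

```typescript
// Percentage saved when workflowTokens replaces manualTokens, rounded to a whole number.
function reductionPercent(manualTokens: number, workflowTokens: number): number {
  return Math.round((1 - workflowTokens / manualTokens) * 100);
}

const tokensPerCall = 50;   // "~50 tokens each" from the docs
const workflowTokens = 80;  // "~80 tokens" from the docs

const reductionWithFourCalls = reductionPercent(4 * tokensPerCall, workflowTokens); // 200 -> 80 tokens
const reductionWithFiveCalls = reductionPercent(5 * tokensPerCall, workflowTokens); // 250 -> 80 tokens
```

On these response-size figures alone the saving works out to 60 and 68 percent, so the ~80%
figure quoted in the overview is not derivable from these two bullets by themselves.
*/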
//# sourceMappingURL=tap-element.d.ts.map