mcp-appium-visual

MCP Server for Appium mobile automation with visual recovery

# MCP-Appium Example: Testing a Mobile App

This document provides a complete example of using MCP-Appium to automate testing for a mobile application.

## Example Scenario: Testing a Calculator App

In this example, we'll automate testing of a calculator app on an Android device:

1. Launch the calculator app
2. Perform a simple addition (5 + 7)
3. Verify the result (12)
4. Take a screenshot of the result

### Step 1: Setting Up the Environment

First, ensure you have:

- Appium server running (`appium`)
- Android device connected or emulator running
- MCP-Appium server built (`npm run build`)

### Step 2: Create a Test Script

Create a file called `calculator-test.ts` in your project:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
import * as path from "path";

// Return the text of the first content item in a tool result.
function firstText(result: unknown): string {
  const { content } = result as { content?: Array<{ text?: string }> };
  return content?.[0]?.text ?? "";
}

async function testCalculator() {
  console.log("Starting MCP-Appium test for calculator app...");

  // The stdio transport starts the MCP-Appium server as a child process.
  // Server stderr is passed through to this terminal for debugging.
  const serverPath = path.resolve(process.cwd(), "dist", "index.js");
  const transport = new StdioClientTransport({
    command: "node",
    args: [serverPath],
  });
  const client = new Client(
    { name: "calculator-test", version: "1.0.0" },
    { capabilities: {} }
  );

  try {
    // Connect to the MCP server
    await client.connect(transport);
    console.log("Connected to MCP-Appium server");

    // Step 1: List connected devices
    const devicesResult = await client.callTool({
      name: "list-devices",
      arguments: {},
    });
    const devicesText = firstText(devicesResult);
    console.log("Available devices:");
    console.log(devicesText);

    // Get the first device ID (you may want to select a specific one)
    const deviceIdLine = devicesText.split("\n")[1];
    if (!deviceIdLine || deviceIdLine.includes("No devices connected")) {
      throw new Error(
        "No devices found. Please connect a device or start an emulator."
      );
    }
    const deviceId = deviceIdLine.trim();
    console.log(`Using device: ${deviceId}`);

    // Step 2: Initialize Appium session for the calculator app
    console.log("Starting Appium session for calculator app...");
    await client.callTool({
      name: "initialize-appium",
      arguments: {
        platformName: "Android",
        deviceName: deviceId,
        appPackage: "com.google.android.calculator", // Package name for Google Calculator
        appActivity: "com.android.calculator2.Calculator", // Main activity
        automationName: "UiAutomator2",
      },
    });

    // Step 3: Take a screenshot of initial state
    console.log("Taking initial screenshot...");
    const initialScreenshot = await client.callTool({
      name: "appium-screenshot",
      arguments: { name: "calculator-initial" },
    });
    console.log(firstText(initialScreenshot));

    // Step 4: Perform calculation (5 + 7 = 12)
    console.log("Performing calculation: 5 + 7...");

    // Tap on '5'
    await client.callTool({
      name: "tap-element",
      arguments: {
        selector: '//android.widget.Button[@text="5"]',
        strategy: "xpath",
      },
    });

    // Tap on '+'
    await client.callTool({
      name: "tap-element",
      arguments: {
        selector: '//android.widget.Button[@content-desc="plus"]',
        strategy: "xpath",
      },
    });

    // Tap on '7'
    await client.callTool({
      name: "tap-element",
      arguments: {
        selector: '//android.widget.Button[@text="7"]',
        strategy: "xpath",
      },
    });

    // Tap on '='
    await client.callTool({
      name: "tap-element",
      arguments: {
        selector: '//android.widget.Button[@content-desc="equals"]',
        strategy: "xpath",
      },
    });

    // Step 5: Wait for the result element to appear
    console.log("Waiting for result...");
    await client.callTool({
      name: "wait-for-element",
      arguments: {
        selector:
          '//android.widget.TextView[@resource-id="com.google.android.calculator:id/result"]',
        strategy: "xpath",
        timeoutMs: 5000,
      },
    });

    // Step 6: Get page source and verify the result
    const pageSource = await client.callTool({
      name: "get-page-source",
      arguments: {},
    });
    const xmlSource = firstText(pageSource);
    // Assumes the result is exposed as a text="12" attribute in the page source XML
    if (!xmlSource.includes('text="12"')) {
      throw new Error("Expected result 12 was not found on screen");
    }

    // Step 7: Take a screenshot of the result
    console.log("Taking result screenshot...");
    const resultScreenshot = await client.callTool({
      name: "appium-screenshot",
      arguments: { name: "calculator-result" },
    });
    console.log(firstText(resultScreenshot));

    // Step 8: Extract locators from the page source to find the result element
    const extractResult = await client.callTool({
      name: "extract-locators",
      arguments: {
        xmlSource,
        elementType: "android.widget.TextView",
        maxResults: 5,
      },
    });
    console.log("Found elements:");
    console.log(firstText(extractResult));

    // Step 9: Close the Appium session
    console.log("Closing Appium session...");
    await client.callTool({
      name: "close-appium",
      arguments: {},
    });

    console.log("Test completed successfully!");
  } catch (error) {
    console.error("Test failed:", error);
  } finally {
    // Close the client; this also stops the server process
    await client.close();
  }
}

// Run the test
testCalculator().catch(console.error);
```

### Step 3: Run the Test

Compile and run the test script:

```bash
# Compile the script
npx tsc calculator-test.ts --esModuleInterop --target es2020 --module NodeNext

# Run the test
node calculator-test.js
```

## Alternative: Using MCP-Appium with Claude Desktop

You can also use the MCP-Appium server through Claude Desktop:

1. Configure Claude Desktop to use MCP-Appium (see README)
2. In Claude Desktop, ask questions like:
   - "Can you help me test a calculator app on my Android device?"
   - "I need to automate testing of this Android app with package name com.example.myapp"
   - "Show me all the installed apps on my connected Android device"

Claude will use the MCP-Appium tools to:

1. Check for connected devices
2. Provide insights about the app structure
3. Generate test scripts for you
4. Execute commands to interact with the app

## Using MCP-Appium for Real App Testing

For testing your own application:

1. **Find App Package and Activity**: with your app in the foreground, run:

   ```bash
   adb shell dumpsys window | grep -E 'mCurrentFocus|mFocusedApp'
   ```

2. **Update the Appium Initialization**:

   ```typescript
   await client.callTool({
     name: "initialize-appium",
     arguments: {
       platformName: "Android",
       deviceName: deviceId,
       appPackage: "your.app.package",
       appActivity: "your.app.MainActivity",
       automationName: "UiAutomator2",
     },
   });
   ```

3. **Find Element Locators**:
   - Launch your app
   - Get the page source and extract locators:

   ```typescript
   const pageSource = await client.callTool({
     name: "get-page-source",
     arguments: {},
   });

   const extractResult = await client.callTool({
     name: "extract-locators",
     arguments: {
       // firstText is the helper defined in calculator-test.ts above
       xmlSource: firstText(pageSource),
       maxResults: 20,
     },
   });
   ```

4. **Create Interaction Sequences**:
   - Identify the elements you want to interact with
   - Create a sequence of tap, input, and wait operations
   - Verify results using screenshots and page source analysis

## Advanced: Testing UI Flows

For testing more complex UI flows, combine multiple actions:

```typescript
import type { Client } from "@modelcontextprotocol/sdk/client/index.js";

// Login flow example
async function testLoginFlow(
  client: Client,
  username: string,
  password: string
) {
  // Tap username field
  await client.callTool({
    name: "tap-element",
    arguments: {
      selector: '//android.widget.EditText[@resource-id="username_field"]',
      strategy: "xpath",
    },
  });

  // Enter username
  await client.callTool({
    name: "send-keys",
    arguments: {
      selector: '//android.widget.EditText[@resource-id="username_field"]',
      text: username,
      strategy: "xpath",
    },
  });

  // Tap password field
  await client.callTool({
    name: "tap-element",
    arguments: {
      selector: '//android.widget.EditText[@resource-id="password_field"]',
      strategy: "xpath",
    },
  });

  // Enter password
  await client.callTool({
    name: "send-keys",
    arguments: {
      selector: '//android.widget.EditText[@resource-id="password_field"]',
      text: password,
      strategy: "xpath",
    },
  });

  // Tap login button
  await client.callTool({
    name: "tap-element",
    arguments: {
      selector: '//android.widget.Button[@text="Login"]',
      strategy: "xpath",
    },
  });

  // Wait for home screen
  await client.callTool({
    name: "wait-for-element",
    arguments: {
      selector: '//android.widget.TextView[@text="Welcome"]',
      strategy: "xpath",
      timeoutMs: 10000,
    },
  });

  // Take screenshot of home screen
  await client.callTool({
    name: "appium-screenshot",
    arguments: { name: "home-screen" },
  });
}
```

## Tips for Effective Mobile Testing with MCP-Appium

1. **Dynamic Wait Strategies**: Always use `wait-for-element` before interacting with elements to ensure they're visible
2. **Error Handling**: Wrap each interaction in try/catch blocks for robust test scripts
3. **Screenshots**: Take screenshots at key points in your test flow for debugging
4. **Device Management**: Always check for connected devices before starting tests
5. **Element Location**: Use multiple strategies (ID, XPath, Accessibility ID) for reliable element location
6. **Test Organization**: Organize tests into small, reusable functions for different flows

# MCP-Appium Examples

This document provides practical examples of how to use MCP-Appium for mobile app testing and automation.

## Basic Setup with Claude

### Configuration Example

Add MCP-Appium to your Claude Desktop configuration file (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "mcp-appium": {
      "command": "mcp-appium"
    }
  }
}
```

### Example Conversation with Claude

```
User: I want to test my Android app. Can you help me connect to my device and take a screenshot?

Claude: I'd be happy to help you test your Android app! Let's connect to your device and take a screenshot.

First, let's check what devices are connected:

[Claude uses list-devices tool to find connected Android devices]

Great! I found your connected device. Now let's initialize an Appium session and take a screenshot:

[Claude uses initialize-appium and appium-screenshot tools]

Here's your screenshot! The screenshot has been saved to: ./test-screenshots/screenshot_2025-05-14_12-34-56.png

Would you like me to help you interact with any specific elements on the screen?
```

## Command Line Interface Examples

### Starting the MCP-Appium Server

```bash
# Basic startup (starts both Appium and MCP server)
mcp-appium

# Show version information
mcp-appium version

# Show help
mcp-appium help
```

### Using the Interactive CLI

```bash
# Launch the interactive CLI
mcp-appium cli
```

In the CLI, you can:

1. List connected devices
2. Install APK files
3. Launch apps
4. Take screenshots
5. Find elements by text or XPath
6. Interact with UI elements
7. Extract element locators for automation

## Programmatic Usage Examples

### Basic Android Test

```typescript
import { AppiumHelper, AppiumCapabilities } from "mcp-appium";

async function testAndroidApp() {
  // Initialize AppiumHelper with screenshot directory
  const appiumHelper = new AppiumHelper("./screenshots");

  try {
    // Set up capabilities for Android
    const capabilities: AppiumCapabilities = {
      platformName: "Android",
      deviceName: "Android Device",
      automationName: "UiAutomator2",
      // For an installed app:
      appPackage: "com.example.app",
      appActivity: "com.example.app.MainActivity",
      noReset: true,
    };

    // Initialize driver
    await appiumHelper.initializeDriver(capabilities);
    console.log("Driver initialized successfully");

    // Take a screenshot
    const screenshotPath = await appiumHelper.takeScreenshot("initial_screen");
    console.log(`Screenshot taken: ${screenshotPath}`);

    // Find and tap the button that opens the login form
    await appiumHelper.tapElement(
      '//android.widget.Button[@text="Login"]',
      "xpath"
    );

    // Send keys to the text fields
    await appiumHelper.sendKeys(
      '//android.widget.EditText[@resource-id="username"]',
      "testuser",
      "xpath"
    );
    await appiumHelper.sendKeys(
      '//android.widget.EditText[@resource-id="password"]',
      "password123",
      "xpath"
    );

    // Tap the submit button
    await appiumHelper.tapElement(
      '//android.widget.Button[@text="Submit"]',
      "xpath"
    );

    // Wait for the next screen to load
    await appiumHelper.waitForElement(
      '//android.widget.TextView[@text="Welcome"]',
      "xpath",
      5000
    );

    // Take another screenshot
    await appiumHelper.takeScreenshot("logged_in");
  } catch (error) {
    console.error("Test failed:", error);
  } finally {
    // Close the driver
    await appiumHelper.closeDriver();
  }
}

testAndroidApp().catch(console.error);
```

### iOS Test Example

```typescript
import { AppiumHelper, AppiumCapabilities } from "mcp-appium";

async function testIosApp() {
  const appiumHelper = new AppiumHelper("./screenshots");

  try {
    // Set up capabilities for iOS
    const capabilities: AppiumCapabilities = {
      platformName: "iOS",
      deviceName: "iPhone",
      automationName: "XCUITest",
      // For an installed app:
      bundleId: "com.example.app",
      // Or to install and launch:
      // app: '/path/to/app.ipa',
      noReset: true,
    };

    // Initialize driver
    await appiumHelper.initializeDriver(capabilities);

    // Take initial screenshot
    await appiumHelper.takeScreenshot("ios_initial");

    // Find element using iOS predicate string
    const loginButton = await appiumHelper.findByIosPredicate(
      'type == "XCUIElementTypeButton" AND name == "Login"'
    );
    await loginButton.click();

    // Send keys to text fields
    await appiumHelper.sendKeys(
      '//XCUIElementTypeTextField[@name="username"]',
      "testuser",
      "xpath"
    );
    await appiumHelper.sendKeys(
      '//XCUIElementTypeSecureTextField[@name="password"]',
      "password123",
      "xpath"
    );

    // Tap the submit button using iOS class chain
    await appiumHelper
      .findByIosClassChain('**/XCUIElementTypeButton[`name == "Submit"`]')
      .then((el) => el.click());

    // Take final screenshot
    await appiumHelper.takeScreenshot("ios_logged_in");
  } catch (error) {
    console.error("Test failed:", error);
  } finally {
    // Close the driver
    await appiumHelper.closeDriver();
  }
}

testIosApp().catch(console.error);
```

### Deep Linking Example

```typescript
import { AppiumHelper, AppiumCapabilities } from "mcp-appium";

async function testDeepLinks() {
  const appiumHelper = new AppiumHelper("./screenshots");

  try {
    // Set up capabilities
    const capabilities: AppiumCapabilities = {
      platformName: "Android",
      deviceName: "Android Device",
      automationName: "UiAutomator2",
      noReset: true,
    };

    // Initialize driver
    await appiumHelper.initializeDriver(capabilities);

    // Open a deep link
    await appiumHelper.openDeepLink("myapp://products/1234");
    console.log("Opened deep link successfully");

    // Take a screenshot of the opened screen
    await appiumHelper.takeScreenshot("deep_link_screen");

    // For Android, you can also use the Android-specific method with extras
    if (capabilities.platformName === "Android") {
      await appiumHelper.openAndroidDeepLink("myapp://search", {
        query: "shoes",
        filter: "popular",
      });

      // Take another screenshot
      await appiumHelper.takeScreenshot("android_deep_link_with_extras");
    }
  } catch (error) {
    console.error("Test failed:", error);
  } finally {
    // Close the driver
    await appiumHelper.closeDriver();
  }
}

testDeepLinks().catch(console.error);
```

## Advanced Examples

### UI Recovery Example

When standard element locators fail, MCP-Appium can attempt recovery using UI analysis:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { registerRecoveryTools } from "mcp-appium";

// In your MCP server setup (the name and version here are placeholders):
const server = new McpServer({ name: "my-mcp-server", version: "1.0.0" });
registerRecoveryTools(server);

// Now you can use tools like:
// - smart-action (with automatic recovery)
// - analyze-screen (to identify UI elements via screenshot analysis)
```

Usage example:

```typescript
// Instead of this, which might fail if the element isn't found:
await appiumHelper.tapElement("//android.widget.Button[@text='Accept']");

// You can use the smart-action tool:
await mcpClient.callTool({
  name: "smart-action",
  arguments: {
    action: "tap",
    selector: "//android.widget.Button[@text='Accept']",
    strategy: "xpath",
    fallbackToScreenshot: true,
  },
});
```

This example shows how MCP-Appium can automatically recover and find the element even when traditional selectors fail.

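
Before falling back to screenshot analysis, a client can also try several conventional locators in turn, in line with the earlier tip about using multiple strategies. The sketch below shows one way to build such an ordered candidate list. The `ElementHints` shape, the `buildLocatorCandidates` name, and the `"id"` and `"accessibility id"` strategy values are assumptions made for illustration; only `"xpath"` appears in this document, so check which strategies the `tap-element` tool actually accepts.

```typescript
// Hypothetical description of an element; not part of MCP-Appium.
interface ElementHints {
  className: string; // e.g. "android.widget.Button"
  resourceId?: string; // e.g. "com.example.app:id/accept"
  contentDesc?: string; // accessibility label
  text?: string; // visible text (assumed to contain no double quotes)
}

interface LocatorCandidate {
  strategy: string;
  selector: string;
}

// Build locators ordered from the most specific hint to the least specific.
function buildLocatorCandidates(hints: ElementHints): LocatorCandidate[] {
  const candidates: LocatorCandidate[] = [];
  if (hints.resourceId) {
    candidates.push({ strategy: "id", selector: hints.resourceId });
  }
  if (hints.contentDesc) {
    candidates.push({
      strategy: "accessibility id",
      selector: hints.contentDesc,
    });
  }
  if (hints.text) {
    candidates.push({
      strategy: "xpath",
      selector: `//${hints.className}[@text="${hints.text}"]`,
    });
  }
  // Last resort: match on class alone (may match several elements).
  candidates.push({ strategy: "xpath", selector: `//${hints.className}` });
  return candidates;
}

const candidates = buildLocatorCandidates({
  className: "android.widget.Button",
  contentDesc: "equals",
  text: "=",
});
console.log(candidates.map((c) => `${c.strategy}: ${c.selector}`).join("\n"));
```

Each candidate could then be passed to `tap-element` in order, stopping at the first call that succeeds and leaving `smart-action` as the final fallback.
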
## More Examples

You can find more examples in the `examples` directory of the project:

- `appium-test.ts` - Basic Appium testing
- `deeplink-test.ts` - Mobile deep linking examples
- `visual-recovery-test.ts` - UI recovery examples
- `calculator-test.ts` - Simple calculator app testing
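
The XPath selectors used throughout these examples place the target text directly between quotes, which breaks as soon as the text itself contains a quote character. XPath 1.0 has no escape character, so such values have to be wrapped in the other quote type or assembled with `concat()`. A small helper along these lines could keep selectors valid; the names `xpathLiteral` and `byText` are hypothetical and not part of MCP-Appium.

```typescript
// Quote an arbitrary string as an XPath 1.0 string literal.
function xpathLiteral(value: string): string {
  if (!value.includes('"')) return `"${value}"`;
  if (!value.includes("'")) return `'${value}'`;
  // Both quote types present: double-quote each piece between the
  // double quotes and join the pieces with a single-quoted '"'.
  const parts = value.split('"').map((part) => `"${part}"`);
  return `concat(${parts.join(", '\"', ")})`;
}

// Build a selector of the form used in this document: //class[@text=...]
function byText(className: string, text: string): string {
  return `//${className}[@text=${xpathLiteral(text)}]`;
}

console.log(byText("android.widget.Button", "5"));
console.log(byText("android.widget.TextView", `Say "hi"`));
console.log(byText("android.widget.TextView", `It's "ok"`));
```

The resulting string can be used wherever a `selector` with `strategy: "xpath"` is expected.
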