# CodeceptJS MCP Server
Model Context Protocol (MCP) server for CodeceptJS. Lets AI agents drive a CodeceptJS browser session — list tests, run arbitrary `I.*` code, pause-and-poke through a scenario, capture artifacts, and read aiTrace markdown — all in-process, sharing one browser and one container.
## Overview
The MCP server exposes the following tools:
- `list_tests` / `list_actions` — enumerate tests and `I.*` methods
- `start_browser` / `stop_browser` — open / close the session (only place plugin overrides go)
- `run_code` — run arbitrary JS with `I` and the full CodeceptJS scope; captures steps, console, return value, and a settled-state snapshot
- `snapshot` — capture URL/HTML/ARIA/screenshot/console/storage at any moment
- `run_test` — run a specific scenario; supports `pauseAt` for programmatic breakpoints
- `run_step_by_step` — pause after every step
- `continue` — release a paused test; it runs to the next pause or to completion
- `cancel` — abort the in-progress / paused run without closing the browser
## Invocation
Two ways to launch the server:
- `npx codeceptjs-mcp` — the published bin
- `node node_modules/codeceptjs/bin/mcp-server.js` — direct path, useful for editor / agent configs
> ⚠️ **Run from the project's local `codeceptjs`, never a global install.**
> The MCP server resolves helpers, plugins, page objects, and custom support from the project's `node_modules`. A globally installed `codeceptjs` won't see project-local helpers (`@codeceptjs/helper`, `@codeceptjs/configure`, custom plugins) or your `include:` support objects, and per-project versions can drift from the global one. Always invoke via `npx codeceptjs-mcp` from inside the project directory, or point your MCP client config at `<project>/node_modules/codeceptjs/bin/mcp-server.js` directly.
## Configuration
Set up the MCP server in your client (Claude Desktop, Cursor, Continue, etc.):
### Basic
```json
{
  "mcpServers": {
    "codeceptjs": {
      "command": "npx",
      "args": ["codeceptjs-mcp"]
    }
  }
}
```
The server looks for `codecept.conf.js` (then `.cjs`) in the current working directory.
### With env vars
```json
{
  "mcpServers": {
    "codeceptjs": {
      "command": "npx",
      "args": ["codeceptjs-mcp"],
      "env": {
        "CODECEPTJS_CONFIG": "/absolute/path/to/codecept.conf.js",
        "CODECEPTJS_PROJECT_DIR": "/absolute/path/to/project"
      }
    }
  }
}
```
| Variable | Description |
|----------|-------------|
| `CODECEPTJS_CONFIG` | Absolute path to `codecept.conf.js`. Overrides cwd lookup. |
| `CODECEPTJS_PROJECT_DIR` | Absolute path to the project root. Used as the resolution base for the config file. |
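
The lookup order can be pictured with a small sketch. It is an illustration under stated assumptions, not the server's actual resolution code: `resolveConfigPath` and the injected `exists` callback are hypothetical names, and only POSIX-style paths are handled.

```javascript
// Hypothetical sketch of the config lookup order; not the server's real code.
// `exists` is injected so the logic can be exercised without touching disk.
function resolveConfigPath(env, cwd, exists) {
  // 1. An explicit CODECEPTJS_CONFIG wins over any directory lookup.
  if (env.CODECEPTJS_CONFIG) return env.CODECEPTJS_CONFIG;

  // 2. Otherwise search the project dir if set, else the working directory.
  const base = (env.CODECEPTJS_PROJECT_DIR || cwd).replace(/\/+$/, '');

  // 3. Try codecept.conf.js first, then the .cjs variant.
  for (const name of ['codecept.conf.js', 'codecept.conf.cjs']) {
    const candidate = `${base}/${name}`;
    if (exists(candidate)) return candidate;
  }
  return null; // nothing found: the caller should report "config not found"
}

const onDisk = new Set(['/work/app/codecept.conf.cjs']);
const found = resolveConfigPath(
  { CODECEPTJS_PROJECT_DIR: '/work/app' },
  '/tmp',
  (p) => onDisk.has(p)
);
console.log(found); // /work/app/codecept.conf.cjs
```

In a real script, `exists` would typically be `fs.existsSync`.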
## Session Defaults
When the session starts, the MCP server applies two plugin defaults so the agent gets useful telemetry out of the box:
- **`aiTrace: { enabled: true, on: 'step' }`** — every step persists DOM/ARIA/screenshot/console artifacts to `output/trace_<TestName>_<hash>/`. Each scenario's `traceFile` is returned in run results so the agent can `Read` the markdown directly.
- **`browser: { enabled: true, show: false }`** — headless. Switch to headed mode via the `plugins` argument of `start_browser`.
Both can be overridden (or disabled) via `start_browser`'s `plugins` argument. Plugin settings that the user explicitly set in `codecept.conf.js` are still merged in.
## Available Tools
### `start_browser`
Initializes the session — loads config, builds the container, opens the browser, kicks off the synthetic test scope so `run_code` and `snapshot` work. This is the only tool that customizes initialization; every other tool either uses the active session or auto-inits with project defaults.
**Parameters:**
- `config` (string, optional) — absolute path to `codecept.conf.js`. Defaults to `$CODECEPTJS_CONFIG`, then `./codecept.conf.js` in `$CODECEPTJS_PROJECT_DIR` or cwd.
- `plugins` (object, optional) — plugin configs keyed by name. Same shape as `plugins` in `codecept.conf.js`; `enabled: true` is added automatically. Most useful entries:
  - `{ browser: { show: true } }` — visible browser
  - `{ browser: { browser: "firefox", windowSize: "1280x720" } }` — switch browser + viewport
  - `{ aiTrace: { enabled: false } }` — disable per-step trace overhead on a re-run
  - `{ pause: { on: "fail" } }` / `{ screenshot: { on: "step" } }` — any other plugin works the same way
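
To make the "`enabled: true` is added automatically" rule concrete, here is a minimal sketch. `normalizePlugins` is a hypothetical name, and the real server may merge these overrides differently.

```javascript
// Hypothetical sketch: how `start_browser` plugin overrides could be normalized.
function normalizePlugins(overrides = {}) {
  const result = {};
  for (const [name, config] of Object.entries(overrides)) {
    // `enabled: true` is the default; an explicit `enabled: false` survives
    // because the caller's keys are spread last.
    result[name] = { enabled: true, ...config };
  }
  return result;
}

const normalized = normalizePlugins({
  browser: { show: true },
  aiTrace: { enabled: false },
});
console.log(JSON.stringify(normalized));
// {"browser":{"enabled":true,"show":true},"aiTrace":{"enabled":false}}
```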
**Returns:**
```json
{
  "status": "Session started — run_code and snapshot are now available",
  "plugins": { "browser": { "show": false } }
}
```
### `stop_browser`
Closes the browser handles, drops the synthetic test scope, but **keeps the container, codecept, and Mocha alive**. Subsequent `start_browser` reopens the browser without rebuilding everything — important because ESM-loaded test files don't re-execute their top-level `Scenario(...)` on reload, so a fresh Mocha would have no suites.
**Parameters:** none
**Returns:**
```json
{ "status": "Browser stopped — Mocha and config preserved; call start_browser to reopen" }
```
### `cancel`
Aborts the currently paused or in-progress test run **without closing the browser**. Use when you want to bail out of a paused test and start something else. Mocha + container stay alive; the next `run_test` / `run_step_by_step` works immediately.
**Parameters:** none
**Returns:**
```json
{ "status": "Run cancelled — browser kept open" }
```
### `list_tests`
Lists all tests resolved from the project's `tests:` glob.
**Parameters:** none
**Returns:**
```json
{
  "count": 5,
  "tests": [
    { "file": "/abs/path/to/work_orders_test.js", "relativePath": "work_orders_test.js" }
  ]
}
```
### `list_actions`
Lists every `I.*` method from enabled helpers and support objects.
**Parameters:** none
**Returns:**
```json
{
  "count": 120,
  "actions": [
    { "helper": "Playwright", "action": "amOnPage", "signature": "I.amOnPage(url)" },
    { "helper": "SupportObject", "action": "loginAsAdmin", "signature": "I.loginAsAdmin()" }
  ]
}
```
### `run_code`
Run arbitrary JavaScript inside the live test scope. Captures steps, console output, return value, and a final-state snapshot.
**Parameters:**
- `code` (string, required) — JS source. Use `await` on `I.*` calls.
- `timeout` (number, optional) — ms (default `60000`).
- `saveArtifacts` (boolean, optional) — capture final-state artifacts (default `true`).
- `settleMs` (number, optional) — wait this many ms after the code finishes before capturing artifacts (default `300`). Bump to `1000`+ for slow re-renders, `0` to skip.
**Scope (everything reachable as a bare identifier in `code`):**
| Symbol | Source |
|--------|--------|
| `I` | The actor (with all helper methods) |
| Custom support objects | `include:` in `codecept.conf.js` (e.g. page objects, `login` from `auth` plugin) |
| `locate`, `within`, `session`, `secret`, `inject`, `pause`, `share` | from `codeceptjs` |
| `tryTo`, `retryTo`, `hopeThat` | from `codeceptjs/effects` |
| `step` | from `codeceptjs/steps` |
| `element`, `eachElement`, `expectElement`, `expectAnyElement`, `expectAllElements` | from `codeceptjs/els` |
| `container` | the DI container |
| `helpers` | live helpers map (e.g. `helpers.Playwright.page` for raw Playwright access) |
The full live list is returned in every response under `availableObjects`.
**Return-value handling:**
- An explicit `return X` is JSON-stringified (with circular-ref handling). Capped at 20 KB.
- If you forget `return`, the last grabbed step value is returned automatically (`await I.grabTitle()` on the last line works).
- A returned `WebElement` (or array of them, from `I.grabWebElement(s)`) is auto-described to a plain object: `{ text, html, visible, enabled, attrs }`.
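
The first rule (JSON-stringify with circular-reference handling and a size cap) can be sketched as follows. This is an assumption-laden illustration, not the server's code: `safeStringify` is a hypothetical name, the `[Circular]` marker and the truncation suffix are invented for the example, and the cap is counted in characters rather than bytes.

```javascript
// Hypothetical sketch of "JSON-stringify with circular-ref handling, capped at 20 KB".
// The cap is measured in characters here for simplicity; the server may count bytes.
function safeStringify(value, maxChars = 20 * 1024) {
  const ancestors = [];
  const json = JSON.stringify(value, function (key, val) {
    if (typeof val !== 'object' || val === null) return val;
    // `this` is the direct parent of `val`; unwind the stack back to it.
    while (ancestors.length > 0 && ancestors[ancestors.length - 1] !== this) {
      ancestors.pop();
    }
    if (ancestors.includes(val)) return '[Circular]';
    ancestors.push(val);
    return val;
  }, 2);

  if (json === undefined) return undefined; // e.g. the code returned nothing
  return json.length > maxChars
    ? json.slice(0, maxChars) + '... [truncated]'
    : json;
}

const page = { url: 'http://localhost:8000/' };
page.self = page; // circular reference
console.log(safeStringify(page));
```

Tracking ancestors rather than every visited object means a value that merely appears twice is still serialized in full; only true cycles are replaced.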
**Returns:**
```json
{
  "status": "success",
  "output": "Code executed successfully",
  "error": null,
  "commands": ["I am on page \"/\"", "I grab text from \"h1\""],
  "logs": [{ "level": "log", "message": "headline Welcome", "t": 47 }],
  "returnValue": "{\n \"url\": \"http://localhost:8000/\",\n \"text\": \"Welcome\"\n}",
  "availableObjects": ["I", "container", "eachElement", "element", "expectAllElements", "expectAnyElement", "expectElement", "helpers", "hopeThat", "inject", "locate", "login", "pause", "retryTo", "secret", "session", "share", "step", "tryTo", "within"],
  "artifacts": {
    "url": "http://localhost:8000/",
    "html": "file:///output/trace_run_code_.../mcp_page.html",
    "aria": "file:///output/trace_run_code_.../mcp_aria.txt",
    "screenshot": "file:///output/trace_run_code_.../mcp_screenshot.png",
    "console": "file:///output/trace_run_code_.../mcp_console.json",
    "storage": "file:///output/trace_run_code_.../mcp_storage.json",
    "cookieCount": 3,
    "localStorageCount": 5
  },
  "ariaDiff": "...",
  "dir": "/output/trace_run_code_...",
  "traceFile": "file:///output/trace_run_code_.../trace.md"
}
```
- `traceFile` — markdown summary of this call. `Read` it for full context.
- `ariaDiff` — present when the call mutated the page; diff between the previous aiTrace ARIA snapshot and the new one.
- `aiTraceHint` — appears when aiTrace is disabled, suggesting how to re-enable it.
**Example:**
```json
{
  "name": "run_code",
  "arguments": {
    "code": "await I.amOnPage('/'); const t = await I.grabTextFrom('h1'); return { url: await I.grabCurrentUrl(), text: t };"
  }
}
```
### `snapshot`
Capture the current browser state without performing any action.
**Parameters:**
- `fullPage` (boolean, optional) — full-page screenshot (default `false`).
- `settleMs` (number, optional) — wait before capture (default `300`).
**Returns:**
```json
{
  "status": "success",
  "dir": "/output/snapshot_1700000000000_abcd1234",
  "traceFile": "file:///output/snapshot_.../trace.md",
  "artifacts": {
    "url": "http://localhost:8000/dashboard",
    "html": "file:///output/snapshot_.../snapshot_page.html",
    "aria": "file:///output/snapshot_.../snapshot_aria.txt",
    "screenshot": "file:///output/snapshot_.../snapshot_screenshot.png",
    "console": "file:///output/snapshot_.../snapshot_console.json",
    "storage": "file:///output/snapshot_.../snapshot_storage.json",
    "cookieCount": 3,
    "localStorageCount": 5
  }
}
```
### `run_test`
Run a specific scenario. Returns reporter JSON with one entry per scenario; each entry has a `traceFile` (file:// URL) pointing to the per-scenario aiTrace markdown — `Read` it on failures to see the failing step's DOM/ARIA/screenshot.
If the test calls `pause()`, or if `pauseAt` matches a step, the call returns early with `status: "paused"` so the agent can inspect state via `run_code` and then release the test with `continue` (or abort it with `cancel`).
**Parameters:**
- `test` (string, required) — file path or partial test name; resolved to a single test file.
- `timeout` (number, optional) — overall ms (default `60000`).
- `grep` (string, optional) — filter scenarios by title; passed to `mocha.grep`. Mirrors `--grep` on the CLI.
- `pauseAt` (number | string, optional) — programmatic breakpoint. One of:
  - `number` — 1-based step index (test pauses after the Nth step completes)
  - `string` — case-insensitive substring match against step name
  - `"/regex/i"` — regex literal (the `/.../i` form is honored verbatim)
**Returns (completed normally):**
```json
{
  "status": "completed",
  "file": "/path/to/test.js",
  "reporterJson": {
    "stats": { "tests": 1, "passes": 1, "failures": 0 },
    "tests": [
      {
        "title": "lists materials",
        "file": "/path/to/materials_test.js",
        "status": "passed",
        "duration": 4123,
        "traceFile": "file:///output/trace_materials__lists_materials_xxxx/trace.md"
      }
    ]
  },
  "error": null
}
```
**Returns (paused):**
```json
{
  "status": "paused",
  "file": "/path/to/test.js",
  "pausedAfter": { "index": 7, "name": "I select option {\"css\":\"main select\"}, \"Flux\"", "status": "success" },
  "page": { "url": "https://app.example.com/materials", "title": "Materials", "contentSize": 18432 },
  "suggestions": [
    "Call snapshot to capture URL/HTML/ARIA/screenshot/console/storage at this point",
    "Call run_code to inspect or manipulate state (e.g. return await I.grabText(\"h1\"))",
    "Call continue to release the pause and let the test run the next step (or finish)"
  ]
}
```
**Examples:**
```json
{ "name": "run_test", "arguments": { "test": "checkout_test", "pauseAt": 5 } }
{ "name": "run_test", "arguments": { "test": "checkout_test", "pauseAt": "fill field" } }
{ "name": "run_test", "arguments": { "test": "checkout_test", "pauseAt": "/grab.*url/i" } }
```
### `run_step_by_step`
Run a test interactively, pausing after every step. The agent advances with `continue` or inspects with `run_code` / `snapshot`.
**Parameters:**
- `test` (string, required)
- `timeout` (number, optional)
- `grep` (string, optional)
- `plugins` (object, optional) — same as `start_browser`. Most useful is `{ browser: { show: true } }` so you can watch the run between pauses.
**Returns (after each step):**
```json
{
  "status": "paused",
  "file": "/path/to/test.js",
  "pausedAfter": { "index": 1, "name": "I am on page \"/\"", "status": "success" },
  "page": { "url": "http://localhost:8000/", "title": "Test App", "contentSize": 1832 },
  "suggestions": [...]
}
```
**Returns (after the last step):** same shape as `run_test`'s completed response — every scenario carries its `traceFile`.
### `continue`
Release a paused test. The test runs until the next pause (`run_step_by_step`), the next `pause()` call, or completion.
**Parameters:**
- `timeout` (number, optional) — ms to wait for the next pause / completion (default `60000`).
**Returns (re-paused):** same shape as `run_test`'s paused response, with the new `pausedAfter` index.
**Returns (completed):** same shape as `run_test`'s completed response.
## Pause-and-poke flow
```json
{ "name": "run_step_by_step", "arguments": { "test": "checkout_test" } }
// → { "status": "paused", "pausedAfter": { "index": 1, ... } }
{ "name": "snapshot", "arguments": {} }
// → full artifact bundle for step 1
{ "name": "run_code", "arguments": { "code": "return await I.grabCurrentUrl()" } }
// → { "status": "success", "returnValue": "http://...", "artifacts": { ... } }
{ "name": "run_code", "arguments": { "code": "await I.click('Save')" } }
// → { "status": "success", ... } — actually mutates the live page
{ "name": "continue", "arguments": {} }
// → { "status": "paused", "pausedAfter": { "index": 2, ... } }
// ... or bail out:
{ "name": "cancel", "arguments": {} }
// → { "status": "Run cancelled — browser kept open" }
```
Notes:
- Pause runs in-process: `run_code` and the test share the same `I` / browser. There's no subprocess, no IPC.
- `run_test` / `run_step_by_step` / `continue` silence stdout/stderr while running so step output doesn't interleave with the MCP JSON-RPC stream.
- TTY behaviour is unchanged — `npx codeceptjs run --debug` at a terminal still opens the readline REPL when `process.stdin.isTTY` is true. The MCP server only intercepts pause when its handler is registered.
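
From the agent's side, the flow above amounts to a loop over `paused` responses. The following sketch assumes a generic `callTool(name, args)` function supplied by whatever MCP client is in use; that function, `stepThrough`, and the `maxSteps` guard are all hypothetical and not part of CodeceptJS.

```javascript
// Hypothetical agent-side loop. `callTool(name, args)` stands in for whatever
// MCP client call is available; it is not part of CodeceptJS.
async function stepThrough(callTool, test, maxSteps = 100) {
  const visited = [];
  let result = await callTool('run_step_by_step', { test });

  while (result.status === 'paused') {
    visited.push(result.pausedAfter.name);
    if (visited.length >= maxSteps) {
      // Bail out without closing the browser.
      await callTool('cancel', {});
      return { status: 'cancelled', visited };
    }
    await callTool('snapshot', {}); // inspect state at this step
    result = await callTool('continue', {});
  }
  return { status: result.status, visited };
}

// Fake tool backend with two steps, just to exercise the loop.
const script = [
  { status: 'paused', pausedAfter: { index: 1, name: 'I am on page "/"' } },
  { status: 'paused', pausedAfter: { index: 2, name: 'I click "Save"' } },
  { status: 'completed' },
];
const fakeCallTool = async (name) =>
  name === 'snapshot' ? { status: 'success' } : script.shift();

stepThrough(fakeCallTool, 'checkout_test').then((summary) => console.log(summary));
```

The `maxSteps` guard is a design choice for the sketch: it keeps a runaway scenario from holding the session indefinitely and uses `cancel` so the browser stays open.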
## Trace files (aiTrace)
When `aiTrace` is on (the default for MCP sessions), every step in a scenario produces:
```
output/
└── trace_Materials__lists_materials_<hash>/
    ├── 0001_<step>_screenshot.png
    ├── 0001_<step>_page.html      # minified → trash classes/scripts/styles stripped → beautified
    ├── 0001_<step>_aria.txt       # Playwright only
    ├── 0001_<step>_console.json
    ├── 0002_...
    └── trace.md                   # AI-friendly markdown index
```
`run_test` / `run_step_by_step` results expose the `trace.md` URL per scenario (`reporterJson.tests[].traceFile`) — `Read` it on failure to see exactly what the failing step saw.
For ad-hoc `run_code` / `snapshot` runs, only a single set of artifacts is produced (`mcp_*` / `snapshot_*` prefix), packaged with their own `trace.md`.
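
As an illustration of how an agent might pick out which traces to read, the sketch below filters a completed `run_test` result for failed scenarios. `failedTraceFiles` is a hypothetical helper, and the `"failed"` status string is an assumption, since the sample response in this document only shows `"passed"`.

```javascript
// Hypothetical helper: pick the trace.md URLs of failed scenarios out of a
// completed `run_test` / `run_step_by_step` result.
// Assumption: failed scenarios carry `status: "failed"`.
function failedTraceFiles(result) {
  const tests = (result.reporterJson && result.reporterJson.tests) || [];
  return tests
    .filter((t) => t.status === 'failed' && t.traceFile)
    .map((t) => ({ title: t.title, traceFile: t.traceFile }));
}

const sampleResult = {
  status: 'completed',
  reporterJson: {
    tests: [
      { title: 'lists materials', status: 'passed', traceFile: 'file:///output/trace_a/trace.md' },
      { title: 'creates material', status: 'failed', traceFile: 'file:///output/trace_b/trace.md' },
    ],
  },
};
console.log(JSON.stringify(failedTraceFiles(sampleResult)));
// [{"title":"creates material","traceFile":"file:///output/trace_b/trace.md"}]
```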
### `trace.md` shape
```markdown
# Test: Login functionality
**Status**: failed
**File**: tests/login_test.js
## Steps
1. **I.amOnPage("/login")** — passed (150ms)
2. **I.fillField("#username", "user")** — passed (80ms)
3. **I.click("#login")** — passed (100ms)
4. **I.see("Welcome")** — failed (50ms)
## Error
Element "Welcome" not found
## Artifacts
- Screenshot: 0004_screenshot.png
- HTML: 0004_page.html
- ARIA: 0004_aria.txt
```
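
If the step lines follow the shape above, the failing step can be located mechanically. The parser below is a sketch that assumes exactly that line layout; `parseTraceSteps` and `firstFailedStep` are hypothetical names, and real `trace.md` files may contain variations this regex does not cover.

```javascript
// Hypothetical parser for the "## Steps" lines of a trace.md shaped like the
// example above. The separator between the step and its status is matched
// loosely (\S+) so the exact dash character does not matter.
function parseTraceSteps(markdown) {
  const stepLine = /^(\d+)\.\s+\*\*(.+)\*\*\s+\S+\s+(passed|failed)\s+\((\d+)ms\)\s*$/;
  const steps = [];
  for (const line of markdown.split('\n')) {
    const m = stepLine.exec(line.trim());
    if (m) {
      steps.push({ index: Number(m[1]), step: m[2], status: m[3], ms: Number(m[4]) });
    }
  }
  return steps;
}

function firstFailedStep(markdown) {
  return parseTraceSteps(markdown).find((s) => s.status === 'failed') || null;
}

const sampleTrace = [
  '## Steps',
  '3. **I.click("#login")** - passed (100ms)',
  '4. **I.see("Welcome")** - failed (50ms)',
  '## Error',
].join('\n');
console.log(JSON.stringify(firstFailedStep(sampleTrace)));
// {"index":4,"step":"I.see(\"Welcome\")","status":"failed","ms":50}
```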
## HTML formatting
Every HTML snapshot saved by the MCP server (and the `aiTrace` / `pageInfo` plugins, since they all funnel through `captureSnapshot` in `lib/utils/trace.js`) goes through:
1. **Minify** (`html-minifier-terser`) — strip comments, collapse whitespace, drop redundant attributes.
2. **Clean** — drop `<style>`, `<noscript>`, and inline `<script>` (no `src`); keep `<script src="...">`; strip trash class names (Tailwind utilities, framework hashes, `xl:hidden`-style scoped classes); drop `style="..."` attributes. Semantic attributes (`id`, `aria-*`, `data-*`, `role`, `href`, `src`, `alt`, `title`, `name`) are preserved.
3. **Beautify** (`js-beautify`) — re-indent at 2 spaces; keep inline elements with their text.
Result: a multi-line, low-noise HTML doc that's far cheaper for an LLM to reason about than raw page source.
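
To give a feel for the **Clean** step, here is a deliberately simplified, regex-based sketch covering only part of it: dropping `<style>`, `<noscript>`, inline `<script>`, and `style="..."` attributes. It is not the code in `lib/utils/trace.js`; `cleanHtml` is a hypothetical name, class-name filtering is omitted, and regexes are a fragile way to process HTML in general.

```javascript
// Hypothetical, simplified version of the "Clean" step. Regex-based HTML
// handling is fragile; this only illustrates which parts are dropped or kept.
function cleanHtml(html) {
  return html
    // drop <style> and <noscript> blocks entirely
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, '')
    .replace(/<noscript\b[^>]*>[\s\S]*?<\/noscript>/gi, '')
    // drop inline <script> blocks, but keep <script src="...">
    .replace(/<script\b(?![^>]*\bsrc\s*=)[^>]*>[\s\S]*?<\/script>/gi, '')
    // drop style="..." attributes; id, data-*, aria-* and the rest stay
    .replace(/\sstyle\s*=\s*("[^"]*"|'[^']*')/gi, '');
}

const rawHtml =
  '<div id="a" style="color:red"><style>.x{}</style><noscript>n</noscript>' +
  '<script>track()</script><script src="a.js"></script>Hi</div>';
console.log(cleanHtml(rawHtml));
// <div id="a"><script src="a.js"></script>Hi</div>
```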
## Storage state
For Playwright, `captureSnapshot` calls `helper.grabStorageState()`. For Puppeteer / WebDriver, it falls back to `helper.grabCookie()` plus an `executeScript` walking `window.localStorage`. Both produce the same shape (`{ cookies: [...], origins: [{ origin, localStorage: [...] }] }`).
Storage capture is **enabled** for `run_code`, `snapshot`, the `run_step_by_step` fallback, and `pageInfo`. It is **disabled per step in aiTrace**, because cookies and localStorage rarely change between actions and per-step files would mostly be noise.
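
The shared shape can be sketched as a small conversion from the fallback inputs. `toStorageState` and `countStorage` are hypothetical helpers, and the `{ name, value }` entry format is taken from Playwright's storage-state layout rather than from this document.

```javascript
// Hypothetical sketch of the Puppeteer / WebDriver fallback: combine a cookie
// list and a plain localStorage object into the shared storage-state shape.
function toStorageState(cookies, origin, localStorageItems) {
  const localStorage = Object.entries(localStorageItems).map(
    ([name, value]) => ({ name, value })
  );
  return { cookies, origins: [{ origin, localStorage }] };
}

// Counts like `cookieCount` / `localStorageCount` fall out of the same shape.
function countStorage(state) {
  return {
    cookieCount: state.cookies.length,
    localStorageCount: state.origins.reduce((n, o) => n + o.localStorage.length, 0),
  };
}

const state = toStorageState(
  [{ name: 'sid', value: 'abc' }],
  'http://localhost:8000',
  { theme: 'dark', lang: 'en' }
);
console.log(JSON.stringify(countStorage(state)));
// {"cookieCount":1,"localStorageCount":2}
```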
## Architecture
- **In-process.** No subprocess, no IPC. The MCP tool calls and the running test share one container, one helper, one browser.
- **Synthetic test scope.** On first init the server emits `suite.before` + `test.before` and calls each helper's `_beforeSuite` + `_before`, so `run_code` / `snapshot` have a live `helper.page` to act on.
- **Mocha is reused.** `cleanReferencesAfterRun` is forced to `false` (Mocha 11's constructor ignores the option, so the setter is called explicitly). `stop_browser` closes the browser but keeps Mocha alive — re-running `run_test` after `start_browser` works without ESM cache invalidation tricks.
- **Locking.** `run_test` / `run_step_by_step` use a single-call lock so concurrent runs can't trample each other.
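
A single-call lock of the kind mentioned in the last bullet might look like the sketch below. Whether the real server rejects or queues a second call is not stated here, so the rejecting behaviour, `createRunLock`, and the error message are all assumptions.

```javascript
// Hypothetical single-call lock: a second run is refused while one is active.
// Assumption: the server rejects rather than queues overlapping runs.
function createRunLock() {
  let holder = null;
  return function withLock(name, fn) {
    if (holder !== null) {
      throw new Error(`${name} rejected: ${holder} is still running`);
    }
    holder = name;
    // Release the lock whether `fn` resolves or rejects.
    return Promise.resolve()
      .then(fn)
      .finally(() => {
        holder = null;
      });
  };
}

const withLock = createRunLock();
withLock('run_test', () => new Promise((resolve) => setTimeout(resolve, 10)));
try {
  withLock('run_step_by_step', () => {});
} catch (err) {
  console.log(err.message); // run_step_by_step rejected: run_test is still running
}
```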
## Troubleshooting
### Server doesn't start
- Node 18+ recommended.
- Verify the path / `npx` resolution in your client config.
### Config not found
- Set `CODECEPTJS_CONFIG` to the absolute path of `codecept.conf.js` (or `.cjs`).
- Set `CODECEPTJS_PROJECT_DIR` if your config lives outside cwd.
### Tests not found
- Confirm the project's `tests:` glob in `codecept.conf.js` matches your files.
- `list_tests` resolves tests from the same project config; if it returns an empty `tests` array (`count: 0`), the problem is in the config, not in MCP.
### Browser launch issues
- Playwright requires its browsers installed (`npx playwright install`).
- For visible runs use `start_browser` with `plugins={ browser: { show: true } }` — the default is headless.
### Tests stuck or timing out
- Bump `timeout` per call.
- Check that the app under test is actually reachable.
- For long re-renders that confuse `snapshot` / `run_code`'s artifact capture, raise `settleMs` (default `300`).
## Security
- The MCP server runs with the same permissions as the calling process.
- `run_code` runs arbitrary JavaScript in the project context — only expose to trusted agents / environments.
- Environment variables may contain absolute project paths; treat them like any other config.
## Contributing
When changing the MCP server:
1. Add coverage in `test/mcp/mcp_server_test.js`.
2. Update this doc with new tools / parameters.
3. Verify against a real project (e.g. the `examples/playwright/` setup) — the in-process recorder + lifecycle integration is sensitive to ordering.
4. Test with both Playwright and Puppeteer.
## License
MIT