browse
Version:
Unified Browserbase CLI for browser automation and cloud APIs.
291 lines (216 loc) • 12.5 kB
Markdown
<div align="center">
# >_ browse
### The single CLI your AI agents need to access the internet reliably.
[](https://www.npmjs.com/package/browse)
[](https://www.npmjs.com/package/browse)
[](https://github.com/browserbase/stagehand/blob/main/packages/cli/LICENSE)
```bash
npm install -g browse
```
<img src="https://github.com/browserbase/stagehand/blob/main/packages/cli/media/browse.gif?raw=true" alt="browse CLI demo" width="100%" />
</div>
---
`browse` gives any agent — or any terminal — a reliable way to drive a real browser, learn how to use specific websites, and tap into Browserbase's cloud. One command to navigate the open web, capture telemetry while you do it, and reuse skills the community has already built.
## Why browse
- **Browser interactions** — Navigate tricky, complex websites with `browse click`, `browse mouse scroll`, `browse type`, `browse select`, and 30+ more DOM commands.
- **Open web skills catalog** — `browse` is the official CLI for [browse.sh](https://browse.sh), the largest open web catalog. Run `browse skills add apartments.com` and your agent learns how to use that site and its APIs.
- **Rich debugging** — Arm your agents with network, console, and other web telemetry.
- **Cloud features** — Optionally use Browserbase cloud: load cookies via saved [Contexts](https://docs.browserbase.com/platform/browser/core-features/contexts), use [Verified Browsers](https://www.browserbase.com/verified), and call the [Fetch and Search APIs](https://www.browserbase.com/search).
## Quick start
```bash
npm install -g browse
browse open https://example.com
browse snapshot --compact
browse click @0-12
browse fill @0-8 "hello"
browse get title
browse screenshot --path page.png
browse stop
```
## How it works
`browse` runs a lightweight per-session daemon. The first command starts it, and subsequent commands reuse the same browser — so cookies, tabs, and snapshot refs persist between invocations. Run multiple isolated browsers at once with `--session <name>` (or the `BROWSE_SESSION` env var), and shut a session down with `browse stop`.
### Browser targets
Every driver command accepts the same flags to pick where the browser runs. Mix and match per command:
| Flag | Target |
|------|--------|
| _(default)_ | Managed local browser, or remote when `BROWSERBASE_API_KEY` is set |
| `--local` | Managed local browser (add `--headed` / `--headless`) |
| `--remote` | Remote Browserbase session (uses `BROWSERBASE_API_KEY`) |
| `--auto-connect` | Auto-discover and attach to a local Chrome with remote debugging enabled |
| `--cdp <url\|port>` | Attach directly to a CDP endpoint (port, `http(s)://`, or `ws(s)://`) |
| `--target-id <id>` | Select a specific CDP target when attaching to an existing browser |
```bash
browse open https://example.com # default target
browse open https://example.com --local --headed
browse open https://example.com --remote
browse open https://example.com --auto-connect
browse open https://example.com --cdp 9222
browse open https://example.com --cdp ws://127.0.0.1:9222/devtools/browser/<id> --target-id <target-id>
```
Run `browse doctor` to diagnose session and browser-connection prerequisites for any target.
## Commands
### Navigation
```bash
browse open <url> # Open a URL (--wait load|domcontentloaded|networkidle, --timeout <ms>)
browse reload # Reload the active page
browse back # Navigate backward
browse forward # Navigate forward
```
### Snapshot & refs
The accessibility snapshot is the recommended way for agents to discover elements. It prints a tree of refs like `@0-12` that the element commands accept directly.
```bash
browse snapshot # Accessibility snapshot + cached refs
browse snapshot --compact # Tree only, no ref maps
browse snapshot --filter submit # Filter lines by text or /regex/, keeping ancestors
browse snapshot --max-depth 4 # Trim output deeper than this depth
browse refs # Show refs cached from the last snapshot
```
### Element actions
Targets accept a snapshot ref (`@0-12`), an XPath, or a CSS selector.
```bash
browse click @0-12 # Click (also accepts selectors)
browse fill @0-8 "hello" # Fill an input (--press-enter to submit)
browse select @0-9 "CA" # Select an option (--value for <option value>)
browse type "hello world" # Type at the current focus (--delay, --mistakes)
browse press Enter # Press a key (alias: browse key) e.g. Meta+K, Escape
browse upload @0-4 ./resume.pdf # Upload file(s) (repeat --file for more)
browse highlight @0-12 # Highlight an element (--duration <ms>)
```
### Mouse (raw coordinates)
Use these when you need pixel coordinates instead of a ref. Add `--return-xpath` to get the XPath under the cursor.
```bash
browse mouse click 240 320 # Click coordinates (--button, --click-count)
browse mouse hover 240 320 # Move the mouse
browse mouse scroll 400 500 0 600 # Scroll from a point by (dx, dy)
browse mouse drag 100 100 400 400 # Drag between two points (--steps, --delay)
```
### Page info & state
```bash
browse get url # Read page data / element state:
browse get title # url, title, text, html, value, box,
browse get text @0-12 # visible, checked, markdown
browse get markdown body
browse get box 'button[type=submit]'
browse is visible @0-12 # Check element state: visible, checked
browse eval 'document.title' # Evaluate JavaScript in the page
browse viewport 1280 720 # Set viewport size (--scale for device pixel ratio)
browse cursor # Enable a visible cursor overlay
browse screenshot --path page.png # Screenshot (--full-page, --type, --quality, --clip)
```
### Waiting
```bash
browse wait load # Wait for a load state
browse wait load networkidle --timeout 45000
browse wait selector @0-12 --state visible # visible|hidden|attached|detached
browse wait timeout 1000 # Wait a fixed number of ms
```
### Tabs
```bash
browse tab list # List tabs (with stable targetIds)
browse tab new [url] # Open a new tab and make it active
browse tab switch <target-id> # Switch active tab (index or targetId)
browse tab close [target-id] # Close a tab (defaults to the active tab)
```
Prefer the `targetId` from `browse tab list` over the index for stable agent workflows.
### Network capture
Capture request/response traffic for the active session to a local directory.
```bash
browse network on # Start capturing
browse network off # Stop capturing
browse network path # Print the capture directory
browse network clear # Clear captured requests
```
> [!NOTE]
> `browse network on` writes request/response headers and bodies to a local owner-only capture directory. These files can include cookies, authorization headers, and other secrets — use network capture only on trusted machines and run `browse network clear` when done.
### Session & daemon
```bash
browse status # Show daemon status for a session
browse stop # Stop the daemon (--force to kill an unresponsive browser)
browse doctor # Diagnose session and browser-connection prerequisites
browse cdp 9222 # Attach to a CDP endpoint and stream DevTools events (--domain, --pretty)
```
### Global flags
These apply across driver commands:
| Flag | Description |
|------|-------------|
| `-s, --session <name>` | Named browser session (or `BROWSE_SESSION` env var) |
| `--local` / `--remote` | Choose a managed local or remote Browserbase browser |
| `--headed` / `--headless` | Window visibility for managed local sessions |
| `--auto-connect` | Attach to a local Chrome with remote debugging enabled |
| `--cdp <url\|port>` | Attach directly to a CDP endpoint |
| `--target-id <id>` | Select a specific CDP target |
| `--json` | Emit machine-readable JSON (available on most commands) |
## Open web skills catalog
Use [browse.sh](https://browse.sh), the largest open-source catalog of skills to reliably perform any task on the internet. Find a specialized skill to navigate `apartments.com`, for example, and drastically reduce your agent's time and token costs.
```bash
browse skills install # install the bundled browse CLI skill
browse skills list # list the public Browse.sh catalog
browse skills list --all # include every catalog entry
browse skills find reviews # search by slug, domain, title, tag…
browse skills find yelp.com/extract-reviews
browse skills add yelp.com/extract-reviews # install a catalog skill
browse skills add mcdonalds.order.online/order-delivery-42q71n
```
## Browserbase cloud commands
Manage projects, sessions, contexts, and extensions, or call the Fetch and Search APIs directly. These commands use `BROWSERBASE_API_KEY`.
```bash
# Projects
browse cloud projects list
browse cloud projects get <project-id>
browse cloud projects usage <project-id>
# Sessions
browse cloud sessions list # --limit, --status, --json
browse cloud sessions get <session-id>
browse cloud sessions create # --proxies, --verified, --region, --solve-captchas…
browse cloud sessions update <session-id> --status REQUEST_RELEASE
browse cloud sessions debug <session-id> # live debugger URLs
browse cloud sessions logs <session-id>
browse cloud sessions downloads get <session-id> # --output ./downloads.zip
browse cloud sessions uploads create <session-id> ./file.pdf
# Contexts
browse cloud contexts create
browse cloud contexts get <context-id>
browse cloud contexts update <context-id> # refresh the upload URL
browse cloud contexts delete <context-id>
# Extensions
browse cloud extensions upload ./extension.zip
browse cloud extensions get <extension-id>
browse cloud extensions delete <extension-id>
# Fetch & Search APIs
browse cloud fetch <url> # markdown by default
browse cloud search <query>
```
`browse cloud fetch` returns markdown-formatted page content by default. Use `--format raw` for the original response body, or `--format json --schema <schema>` for structured extraction.
## Functions
Browserbase [Functions](https://docs.browserbase.com/platform/runtime/overview) let you deploy browser agents and automation scripts directly onto Browserbase's infrastructure. Build locally, test instantly, and deploy as APIs.
```bash
browse functions init my-function # scaffold a new project (--package-manager)
browse functions dev index.ts # local development server (--port, --verbose)
browse functions publish index.ts # package and upload (--dry-run to preview)
browse functions invoke <function-id> --params '{"url":"https://example.com"}'
browse functions invoke --check-status <invocation-id>
```
## Templates
Discover and scaffold ready-to-run Browserbase example projects.
```bash
browse templates list # --tag, --source, --wide, --json
browse templates find amazon # search by slug, title, category, or tag
browse templates clone google-trends-keywords
browse templates clone amazon-product-scraping --language python ./my-scraper
```
## Configuration
Set your Browserbase API key to enable remote sessions and cloud commands:
```bash
export BROWSERBASE_API_KEY=bb_live_...
```
Local driver commands (`--local`) work without an API key.
| Variable | Description |
|----------|-------------|
| `BROWSERBASE_API_KEY` | Enables `--remote` sessions and all `browse cloud` / `functions` commands |
| `BROWSE_SESSION` | Default session name (alternative to `-s, --session`) |
## Links
- [browse.sh](https://browse.sh) — open web skills catalog
- [Browserbase docs](https://docs.browserbase.com)
- [GitHub](https://github.com/browserbase/stagehand/tree/main/packages/cli) · [Issues](https://github.com/browserbase/stagehand/issues)
## License
[MIT](https://github.com/browserbase/stagehand/blob/main/packages/cli/LICENSE)