crawl4ai-mcp-sse-stdio
Version:
MCP (Model Context Protocol) server for Crawl4AI - Universal web crawling and data extraction. Supports STDIO, SSE, and HTTP transports.
241 lines (180 loc) β’ 5.65 kB
Markdown
# π·οΈ Crawl4AI MCP Server
[](https://badge.fury.io/js/crawl4ai-mcp-sse-stdio)
[](https://www.npmjs.com/package/crawl4ai-mcp-sse-stdio)
[](https://opensource.org/licenses/MIT)
[](https://www.npmjs.com/package/crawl4ai-mcp-sse-stdio)
[](https://t.me/ii_pomogator)
**MCP (Model Context Protocol) server for Crawl4AI** - Universal web crawling and data extraction for AI agents.
Integrate powerful web scraping capabilities into Claude, ChatGPT, and any MCP-compatible AI assistant.
## π Table of Contents
- [π Quick Start](#-quick-start)
- [π³ Docker Usage](#-docker-usage)
- [π οΈ Available Tools](#οΈ-available-tools)
- [βοΈ Configuration](#οΈ-configuration)
- [π€ Contributing](#-contributing)
- [π License](#-license)
## π Quick Start
### NPM Installation (Recommended)
```bash
# Install globally
npm install -g crawl4ai-mcp-sse-stdio
# Run in different modes
npx crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com
npx crawl4ai-mcp --sse --port 3001 --endpoint https://your-crawl4ai-server.com
npx crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com
# With optional bearer token
npx crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com --bearer-token your-token
```
### With Claude Desktop
Add to your `claude_desktop_config.json`:
```json
{
"mcpServers": {
"crawl4ai": {
"command": "npx",
"args": [
"crawl4ai-mcp-sse-stdio",
"--stdio",
"--endpoint", "https://your-crawl4ai-server.com",
"--bearer-token", "your-optional-token"
]
}
}
}
```
## π³ Docker Usage
### Docker Hub
Official Docker image available on Docker Hub:
[](https://hub.docker.com/r/stgmt/crawl4ai-mcp)
```bash
# Pull and run the official Docker image
docker pull stgmt/crawl4ai-mcp:latest
docker run -p 3000:3000 stgmt/crawl4ai-mcp:latest
```
### Docker Compose
Create a `docker-compose.yml`:
```yaml
version: '3.8'
services:
crawl4ai-mcp:
image: stgmt/crawl4ai-mcp:latest
ports:
- "3000:3000"
environment:
- CRAWL4AI_ENDPOINT=https://your-crawl4ai-server.com
- CRAWL4AI_BEARER_TOKEN=your-optional-token
restart: unless-stopped
```
Run with:
```bash
docker-compose up -d
```
## π οΈ Available Tools
### 1. `crawl` - Full Web Crawling
Extract complete content from any webpage.
```json
{
"name": "crawl",
"arguments": {
"url": "https://example.com",
"wait_for": "css:.content",
"timeout": 30000
}
}
```
### 2. `md` - Markdown Extraction
Get clean markdown content from webpages.
```json
{
"name": "md",
"arguments": {
"url": "https://docs.example.com",
"clean": true
}
}
```
### 3. `html` - Raw HTML
Retrieve raw HTML content.
```json
{
"name": "html",
"arguments": {
"url": "https://example.com"
}
}
```
### 4. `screenshot` - Visual Capture
Take screenshots of webpages.
```json
{
"name": "screenshot",
"arguments": {
"url": "https://example.com",
"full_page": true
}
}
```
### 5. `pdf` - PDF Generation
Convert webpages to PDF.
```json
{
"name": "pdf",
"arguments": {
"url": "https://example.com",
"format": "A4"
}
}
```
### 6. `execute_js` - JavaScript Execution
Execute JavaScript on webpages.
```json
{
"name": "execute_js",
"arguments": {
"url": "https://example.com",
"script": "document.title"
}
}
```
## βοΈ Configuration
### Environment Variables
```bash
# REQUIRED: Crawl4AI endpoint URL
export CRAWL4AI_ENDPOINT="https://your-crawl4ai-server.com"
# OPTIONAL: Bearer authentication token
export CRAWL4AI_BEARER_TOKEN="your-api-token"
```
### Command Line Options
```bash
crawl4ai-mcp --help
Options:
--stdio Run in STDIO mode for MCP clients
--sse Run in SSE mode for web interfaces
--http Run in HTTP mode
--endpoint ENDPOINT Crawl4AI API endpoint URL (REQUIRED)
--bearer-token TOKEN Bearer authentication token (OPTIONAL)
--port PORT HTTP server port (default: 3000)
--sse-port PORT SSE server port (default: 9001)
--version, -v Show version
```
### Basic Commands
```bash
# HTTP mode (recommended for testing)
crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com
# SSE mode (Server-Sent Events)
crawl4ai-mcp --sse --port 3001 --endpoint https://your-crawl4ai-server.com
# STDIO mode (for MCP clients)
crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com
# With optional bearer token
crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com --bearer-token your-token
```
## π€ Contributing
We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
## π License
MIT License - see [LICENSE](LICENSE) file for details.
## π Links
- **NPM Package**: [https://www.npmjs.com/package/crawl4ai-mcp-sse-stdio](https://www.npmjs.com/package/crawl4ai-mcp-sse-stdio)
- **GitHub Repository**: [https://github.com/stgmt/crawl4ai-mcp](https://github.com/stgmt/crawl4ai-mcp)
- **Docker Hub**: [https://hub.docker.com/r/stgmt/crawl4ai-mcp](https://hub.docker.com/r/stgmt/crawl4ai-mcp)
---
**Made with β€οΈ for the AI community**