mcp-webresearch-stealthified
MCP server for web research, stealthified, improved, forked from mzxrai
# MCP Web Research Server - Stealthified
A Model Context Protocol (MCP) server for web research with enhanced stealth capabilities to avoid detection.
Bring real-time web information into Claude and research any topic with a lower chance of being blocked by CAPTCHAs.
## Features
- Google search integration with anti-blocking measures
- Google Scholar academic research capabilities
- Webpage content extraction with clean markdown formatting
- Screenshot capture with automatic sizing optimization
- Research session tracking (search queries, visited pages, etc.)
- Bot detection avoidance techniques
## Prerequisites
- [Node.js](https://nodejs.org/) >= 18 (includes `npm` and `npx`)
- [Claude Desktop app](https://claude.ai/download)
## Installation
First, ensure you've downloaded and installed the [Claude Desktop app](https://claude.ai/download) and you have npm installed.
Next, add this entry to your `claude_desktop_config.json` (location varies by platform):
- **Mac**: `~/Library/Application\ Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
```json
{
"mcpServers": {
"webresearch": {
"command": "npx",
"args": ["-y", "mcp-webresearch-stealthified@latest"]
}
}
}
```
This config allows Claude Desktop to automatically start the web research MCP server when needed.
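If your `claude_desktop_config.json` already lists other servers, merge the entry in rather than pasting over the file. A minimal TypeScript sketch, assuming the file has the shape shown above; `addWebresearchServer`, the two interfaces, and the `some-other-server` placeholder are hypothetical names used for illustration, not part of this package:

```typescript
// Hypothetical helper: merge the webresearch entry into an existing config
// object without dropping servers that are already configured.
interface McpServerEntry {
  command: string;
  args?: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // keep any unrelated top-level settings
}

function addWebresearchServer(config: ClaudeDesktopConfig): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      // Same entry as in the JSON snippet above.
      webresearch: {
        command: "npx",
        args: ["-y", "mcp-webresearch-stealthified@latest"],
      },
    },
  };
}

// Example: a config that already has one (placeholder) server.
const existing: ClaudeDesktopConfig = {
  mcpServers: { other: { command: "npx", args: ["-y", "some-other-server"] } },
};
const merged = addWebresearchServer(existing);
console.log(JSON.stringify(merged, null, 2));
```

The function returns a new object instead of mutating its input, so the original config can be kept as a backup before writing the result to disk.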
## Usage
Simply start a chat with Claude and send a prompt that would benefit from web research. For a prebuilt prompt optimized for deeper web research, use the `agentic-research` prompt that comes with this package. Access that prompt in Claude Desktop by clicking the Paperclip icon in the chat input and then selecting `Choose an integration` → `webresearch` → `agentic-research`.
<img src="https://i.ibb.co/N6Y3C0q/Screenshot-2024-12-05-at-11-01-27-PM.png" alt="Example screenshot of web research" width="400"/>
### Tools
1. `search_google`
- Performs Google searches with anti-detection measures
- Arguments: `{ query: string }`
2. `search_scholar`
- Searches Google Scholar for academic papers and scholarly content
- Arguments: `{ query: string }`
3. `visit_page`
- Visits a webpage and extracts its content in clean markdown format
- Arguments: `{ url: string, takeScreenshot?: boolean }`
4. `take_screenshot`
- Takes a screenshot of the current page with automatic resizing
- No arguments required
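Under MCP, a client invokes a tool by sending a JSON-RPC `tools/call` request whose `params` carry the tool name and its arguments. The sketch below types the argument shapes listed above and builds such a request. The `ToolArgs` map and `buildToolCall` helper are illustrative names, not code from this package, and Claude Desktop normally constructs these messages for you:

```typescript
// Argument shapes taken from the tool list above.
type ToolArgs = {
  search_google: { query: string };
  search_scholar: { query: string };
  visit_page: { url: string; takeScreenshot?: boolean };
  take_screenshot: Record<string, never>; // no arguments
};

interface ToolCallRequest<N extends keyof ToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: N; arguments: ToolArgs[N] };
}

// Hypothetical helper: build a type-checked tools/call request.
function buildToolCall<N extends keyof ToolArgs>(
  id: number,
  name: N,
  args: ToolArgs[N],
): ToolCallRequest<N> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const visitRequest = buildToolCall(1, "visit_page", {
  url: "https://example.com",
  takeScreenshot: true,
});
console.log(JSON.stringify(visitRequest));
```

Keying the argument types by tool name means a call such as `buildToolCall(2, "search_google", { url: "..." })` fails at compile time rather than at the server.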
### Prompts
#### `agentic-research`
A guided research prompt that helps Claude conduct thorough web research. The prompt instructs Claude to:
- Start with broad searches to understand the topic landscape
- Prioritize high-quality, authoritative sources
- Iteratively refine the research direction based on findings
- Keep you informed and let you guide the research interactively
- Always cite sources with URLs
### Resources
This server exposes two types of MCP resources:
#### Screenshots
When you take a screenshot, it's saved as an MCP resource. You can access captured screenshots in Claude Desktop via the Paperclip icon.
#### Research Session
The server maintains a research session that includes:
- Search queries
- Visited pages
- Extracted content
- Screenshots
- Timestamps
You can review this information through the MCP resources interface.
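To make the list above concrete, here is one way such a session could be modeled in TypeScript. This is a hypothetical shape only: every field name below is an assumption made for illustration and may not match the server's actual data model.

```typescript
// Hypothetical shape only: field names are assumptions for illustration,
// not the server's actual data model.
interface VisitedPage {
  url: string;
  content: string;      // extracted markdown
  screenshot?: string;  // identifier of a captured screenshot, if any
  timestamp: string;    // ISO 8601
}

interface ResearchSession {
  queries: { query: string; timestamp: string }[];
  pages: VisitedPage[];
}

// Produce a one-line summary of what a session contains.
function summarize(session: ResearchSession): string {
  const shots = session.pages.filter((p) => p.screenshot !== undefined).length;
  return `${session.queries.length} queries, ${session.pages.length} pages, ${shots} screenshots`;
}

const demoSession: ResearchSession = {
  queries: [{ query: "mcp servers", timestamp: "2024-12-05T23:00:00Z" }],
  pages: [
    { url: "https://example.com/a", content: "# A", timestamp: "2024-12-05T23:01:00Z" },
    {
      url: "https://example.com/b",
      content: "# B",
      screenshot: "shot-1",
      timestamp: "2024-12-05T23:02:00Z",
    },
  ],
};
console.log(summarize(demoSession));
```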
### Advanced Usage Tips
1. **Optimized Search Queries**: For general topics, suggest high-quality sources in your query (e.g., "news today from reuters or AP" instead of just "news today").
2. **Academic Research**: Use the `search_scholar` tool specifically for academic or scientific topics to get scholarly articles.
3. **Sequential Research**: For complex topics, guide Claude through a step-by-step research process, focusing on one aspect at a time.
4. **Reading Depth**: When Claude finds a relevant page, you can ask it to visit the page and analyze the content in depth.
5. **Visual Information**: Use the screenshot capability when understanding a page's layout or visual elements is important.
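The first tip can be applied mechanically if you script your prompts. A small sketch follows; `withSources` is a hypothetical helper, not something this package provides:

```typescript
// Hypothetical helper: append preferred sources to a search query,
// following the "news today from reuters or AP" pattern from tip 1.
function withSources(query: string, sources: string[]): string {
  if (sources.length === 0) return query; // nothing to add
  return `${query} from ${sources.join(" or ")}`;
}

console.log(withSources("news today", ["reuters", "AP"]));
```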
## Troubleshooting
This is pre-alpha code with potential issues. If you run into problems:
1. Check Claude Desktop's MCP logs:
```bash
# macOS (the ~/Library path below is macOS-specific)
tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
# Windows (PowerShell)
Get-Content -Path "$env:APPDATA\Claude\logs\mcp*.log" -Tail 20 -Wait
```
2. Common issues:
- CAPTCHAs: This fork includes anti-detection measures, but some sites may still show CAPTCHAs
- Content extraction: Some sites may not extract properly due to complex layouts
- Rate limiting: Excessive searches may trigger rate limiting from Google
## Development
```bash
# Clone the repository
git clone https://github.com/phialsbasement/mcp-webresearch-stealthified.git
cd mcp-webresearch-stealthified
# Install dependencies
pnpm install
# Build the project
pnpm build
# Watch for changes
pnpm watch
# Run in development mode
pnpm dev
```
## Requirements
- Node.js >= 18
- Playwright (automatically installed as a dependency)
## Verified Platforms
- [x] macOS
- [x] Linux
- [x] Windows
## License
MIT
## Credits
Originally created by [mzxrai](https://github.com/mzxrai)
Enhanced with stealth capabilities by [Phiality](https://github.com/phialsbasement)