# MCP Web Research Server - Stealthified

A Model Context Protocol (MCP) server for web research with enhanced stealth capabilities to avoid detection.

Bring real-time web information into Claude and research any topic with a lower chance of being blocked by CAPTCHAs.

## Features

- Google search integration with anti-blocking measures
- Google Scholar academic research capabilities
- Webpage content extraction with clean markdown formatting
- Screenshot capture with automatic sizing optimization
- Research session tracking (search queries, visited pages, etc.)
- Bot detection avoidance techniques

## Prerequisites

- [Node.js](https://nodejs.org/) >= 18 (includes `npm` and `npx`)
- [Claude Desktop app](https://claude.ai/download)

## Installation

First, ensure you've downloaded and installed the [Claude Desktop app](https://claude.ai/download) and that you have npm installed.

Next, add this entry to your `claude_desktop_config.json` (location varies by platform):

- **Mac**: `~/Library/Application\ Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "webresearch": {
      "command": "npx",
      "args": ["-y", "mcp-webresearch-stealthified@latest"]
    }
  }
}
```

This config allows Claude Desktop to automatically start the web research MCP server when needed.

## Usage

Start a chat with Claude and send a prompt that would benefit from web research.

For a prebuilt prompt optimized for deeper web research, use the `agentic-research` prompt that comes with this package. Access that prompt in Claude Desktop by clicking the Paperclip icon in the chat input and then selecting `Choose an integration` → `webresearch` → `agentic-research`.

<img src="https://i.ibb.co/N6Y3C0q/Screenshot-2024-12-05-at-11-01-27-PM.png" alt="Example screenshot of web research" width="400"/>

### Tools

1. `search_google`
   - Performs Google searches with anti-detection measures
   - Arguments: `{ query: string }`

2. `search_scholar`
   - Searches Google Scholar for academic papers and scholarly content
   - Arguments: `{ query: string }`

3. `visit_page`
   - Visits a webpage and extracts its content in clean markdown format
   - Arguments: `{ url: string, takeScreenshot?: boolean }`

4. `take_screenshot`
   - Takes a screenshot of the current page with automatic resizing
   - No arguments required

### Prompts

#### `agentic-research`

A guided research prompt that helps Claude conduct thorough web research. The prompt instructs Claude to:

- Start with broad searches to understand the topic landscape
- Prioritize high-quality, authoritative sources
- Iteratively refine the research direction based on findings
- Keep you informed and let you guide the research interactively
- Always cite sources with URLs

### Resources

This server exposes two types of MCP resources:

#### Screenshots

When you take a screenshot, it's saved as an MCP resource. You can access captured screenshots in Claude Desktop via the Paperclip icon.

#### Research Session

The server maintains a research session that includes:

- Search queries
- Visited pages
- Extracted content
- Screenshots
- Timestamps

You can review this information through the MCP resources interface.

### Advanced Usage Tips

1. **Optimized Search Queries**: For general topics, suggest high-quality sources in your query (e.g., "news today from reuters or AP" instead of just "news today").

2. **Academic Research**: Use the `search_scholar` tool specifically for academic or scientific topics to get scholarly articles.

3. **Sequential Research**: For complex topics, guide Claude through a step-by-step research process, focusing on one aspect at a time.

4. **Reading Depth**: When Claude finds a relevant page, you can ask it to visit the page and analyze the content in depth.

5. **Visual Information**: Use the screenshot capability when understanding a page's layout or visual elements is important.

## Troubleshooting

This is pre-alpha code with potential issues. If you run into problems:

1. Check Claude Desktop's MCP logs:

   ```bash
   # macOS (on Linux, the log directory depends on how Claude Desktop was installed)
   tail -n 20 -f ~/Library/Logs/Claude/mcp*.log

   # Windows (PowerShell)
   Get-Content -Path "$env:APPDATA\Claude\logs\mcp*.log" -Tail 20 -Wait
   ```

2. Common issues:
   - CAPTCHAs: This fork includes anti-detection measures, but some sites may still show CAPTCHAs
   - Content extraction: Some sites may not extract properly due to complex layouts
   - Rate limiting: Excessive searches may trigger rate limiting from Google

## Development

```bash
# Clone the repository
git clone https://github.com/phialsbasement/mcp-webresearch-stealthified.git
cd mcp-webresearch-stealthified

# Install dependencies
pnpm install

# Build the project
pnpm build

# Watch for changes
pnpm watch

# Run in development mode
pnpm dev
```

## Requirements

- Node.js >= 18
- Playwright (automatically installed as a dependency)

## Verified Platforms

- [x] macOS
- [x] Linux
- [x] Windows

## License

MIT

## Credits

Originally created by [mzxrai](https://github.com/mzxrai)

Enhanced with stealth capabilities by [Phiality](https://github.com/phialsbasement)
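
## Appendix: Merging the Config Entry

As a supplement to the Installation section: if your `claude_desktop_config.json` already lists other MCP servers, the `webresearch` entry needs to be merged into the existing `mcpServers` object rather than pasted over the whole file. The following is a minimal TypeScript sketch of that merge, not code shipped with this package. The helper name `addWebResearchServer`, the type names, and the `filesystem` example server are all hypothetical and only serve the illustration.

```typescript
// Hypothetical helper for illustration; not part of mcp-webresearch-stealthified.
type ServerEntry = { command: string; args: string[] };
type DesktopConfig = {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown;
};

// The entry documented in the Installation section.
const WEBRESEARCH_ENTRY: ServerEntry = {
  command: "npx",
  args: ["-y", "mcp-webresearch-stealthified@latest"],
};

// Returns a new config object with the `webresearch` server added,
// keeping every other top-level key and every other server as it was.
function addWebResearchServer(config: DesktopConfig): DesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      webresearch: WEBRESEARCH_ENTRY,
    },
  };
}

// Example: a config that already defines another (made-up) server.
const existing: DesktopConfig = {
  mcpServers: {
    filesystem: { command: "npx", args: ["-y", "some-other-server"] },
  },
};

const updated = addWebResearchServer(existing);
console.log(JSON.stringify(updated, null, 2));
```

The sketch only works on an in-memory object. Reading and writing the actual file is left out on purpose, since its location varies by platform as listed under Installation. An existing `webresearch` entry would be overwritten by the documented one.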