# @knowcode/screenshotfetch

[![npm version](https://img.shields.io/npm/v/@knowcode/screenshotfetch.svg)](https://www.npmjs.com/package/@knowcode/screenshotfetch)
[![npm downloads](https://img.shields.io/npm/dm/@knowcode/screenshotfetch.svg)](https://www.npmjs.com/package/@knowcode/screenshotfetch)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Node.js Version](https://img.shields.io/node/v/@knowcode/screenshotfetch.svg)](https://nodejs.org/)

A comprehensive web application spider with screenshot capture and customer journey documentation, built on Puppeteer. Automatically crawl web applications, handle authentication, and generate step-by-step user flow documentation with screenshots.

## 🚀 Features

- 📸 **High-quality screenshot capture** - Crystal clear screenshots with full customization
- 🕷️ **Web application spidering** - Intelligent crawling with flow documentation
- 🔐 **Automated authentication** - Username/password login with smart form detection
- 📋 **Customer journey mapping** - Step-by-step flows with visual documentation
- 🔗 **URL tracking & cross-referencing** - Complete traceability for all screenshots
- 🍪 **Advanced cookie consent handling** - Multiple strategies for banner removal
- 🚀 **Fast and reliable** - Built on Puppeteer with robust error handling
- 💻 **CLI and programmatic API** - Use via command line or integrate into your code
- 📦 **Batch processing** - Process multiple URLs efficiently
- 🎯 **Smart filtering** - Avoids destructive actions like logout/delete

## 📦 Installation

### Global Installation (CLI Usage)

```bash
npm install -g @knowcode/screenshotfetch
```

### Local Installation (Programmatic Usage)

```bash
npm install @knowcode/screenshotfetch
```

## ⚡ Quick Start

### Spider a Web Application

```bash
# Spider with authentication
screenshotfetch spider https://app.example.com/login -u username -p password

# Generate documentation in custom directory
screenshotfetch spider https://app.example.com -u user@example.com -p pass123 -o ./my-docs
```

### Single Screenshot

```bash
# Basic screenshot
screenshotfetch capture https://example.com -o screenshot.png

# Full page screenshot
screenshotfetch capture https://example.com -o fullpage.png --fullpage
```

## CLI Usage

### Web Application Spider (NEW)

```bash
# Spider a web application with authentication
screenshotfetch spider https://app.example.com/login -u username -p password

# Custom output directory and flow limits
screenshotfetch spider https://app.example.com -u user@example.com -p pass123 -o ./my-docs --max-flows 3

# Debug mode (visible browser)
screenshotfetch spider https://app.example.com -u username -p password --headless false

# Custom viewport and timing
screenshotfetch spider https://app.example.com -u user -p pass -w 1280 -h 720 --wait 3000
```

### Capture Single Screenshot

```bash
# Basic usage
screenshotfetch capture https://example.com -o ./screenshot.png

# Full page capture
screenshotfetch capture https://example.com -o ./fullpage.png --fullpage

# Custom viewport
screenshotfetch capture https://example.com -w 1280 -h 720

# Different cookie strategies
screenshotfetch capture https://example.com -s none    # No cookie handling
screenshotfetch capture https://example.com -s click   # Only try clicking
screenshotfetch capture https://example.com -s remove  # Only remove banners
screenshotfetch capture https://example.com -s block   # Only block services
screenshotfetch capture https://example.com -s all     # Try all strategies (default)
```

### Batch Capture

Create a JSON file with your screenshots:

```json
[
  {
    "url": "https://example.com",
    "output": "./screenshots/home.png",
    "fullPage": false
  },
  {
    "url": "https://example.com/about",
    "output": "./screenshots/about.png",
    "fullPage": true
  }
]
```

Then run:

```bash
screenshotfetch batch screenshots.json -d 3000
```

### View Example Format

```bash
screenshotfetch example
```

## Programmatic Usage
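
### Generating a Batch File (Sketch)

The batch file described under Batch Capture does not have to be written by hand. The sketch below is one possible way to build entries in that format (`url`, `output`, `fullPage`) from a list of URLs using only Node.js built-ins. The helpers `slugFromUrl` and `buildBatchEntries` are hypothetical names introduced for illustration and are not part of `@knowcode/screenshotfetch`; the file-naming scheme is likewise an assumption.

```javascript
// Hypothetical helper (not part of @knowcode/screenshotfetch): derive a
// file-system-friendly name from a URL's hostname and path.
function slugFromUrl(url) {
  const { hostname, pathname } = new URL(url);
  const slug = (hostname + pathname)
    .replace(/^www\./, '')            // drop a leading "www."
    .replace(/[^a-zA-Z0-9.]+/g, '-')  // collapse slashes and other separators
    .replace(/^-+|-+$/g, '');         // trim leading/trailing dashes
  return slug || 'page';
}

// Hypothetical helper: build entries in the batch format shown above.
function buildBatchEntries(urls, outputDir = './screenshots', fullPage = false) {
  return urls.map((url) => ({
    url,
    output: `${outputDir}/${slugFromUrl(url)}.png`,
    fullPage
  }));
}

const entries = buildBatchEntries([
  'https://example.com',
  'https://example.com/about'
]);

// Print the JSON so it can be redirected into a batch file.
console.log(JSON.stringify(entries, null, 2));
```

Saved under a name of your choosing (for example `make-batch.js`), the script's output can be redirected with `node make-batch.js > screenshots.json` and the resulting file passed to `screenshotfetch batch`.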

### Web Application Spider API

```javascript
const { ApplicationSpider } = require('@knowcode/screenshotfetch');

async function spiderApplication() {
  const spider = new ApplicationSpider({
    viewport: { width: 1920, height: 1080 },
    cookieStrategy: 'all',
    maxFlows: 5,
    maxDepth: 10,
    outputDir: './docs'
  });

  // Spider with authentication
  const result = await spider.spiderApplication(
    'https://app.example.com/login',
    'username',
    'password'
  );

  console.log(`Discovered ${result.summary.completedFlows} customer journeys`);
  console.log(`Generated ${result.summary.screenshotCount} screenshots`);
  console.log(`Documentation in: ${result.summary.outputDirectory}`);
}

spiderApplication().catch(console.error);
```

### Screenshot Capture API

```javascript
const { ScreenshotCapture } = require('@knowcode/screenshotfetch');

async function captureScreenshots() {
  const capture = new ScreenshotCapture({
    viewport: { width: 1920, height: 1080 },
    cookieStrategy: 'all',
    waitTime: 3000
  });

  await capture.init();

  try {
    // Single screenshot
    const result = await capture.captureScreenshot(
      'https://example.com',
      './screenshot.png',
      { fullPage: false }
    );

    // Batch screenshots
    const screenshots = [
      { url: 'https://example.com', output: './home.png' },
      { url: 'https://example.com/about', output: './about.png' }
    ];
    const results = await capture.captureMultiple(screenshots);
  } finally {
    // Close the browser even if a capture fails
    await capture.close();
  }
}

captureScreenshots().catch(console.error);
```

## 📋 Generated Documentation Structure

When using the spider functionality, the tool generates comprehensive documentation:

```
docs/                          # Output directory
├── index.md                   # Overview with flow summary
├── flows/                     # Customer journey documentation
│   ├── flow-1/
│   │   ├── flow-1.md          # Step-by-step journey with screenshots
│   │   ├── _images/           # Screenshots for this flow
│   │   │   ├── 01-login.png
│   │   │   ├── 02-dashboard.png
│   │   │   └── 03-settings.png
│   │   └── flow-1.json        # Metadata and URL mappings
│   └── flow-2/
│       ├── flow-2.md          # Another customer journey
│       ├── _images/           # Flow-specific screenshots
│       └── flow-2.json        # Flow metadata
└── metadata/
    ├── url-index.json         # Complete URL-to-screenshot mapping
    └── flow-summary.json      # Summary of all discovered flows
```

Each flow markdown file contains:

- Step-by-step customer journey with screenshots
- URL tracking for each step
- Action descriptions and navigation flow
- Metadata for programmatic access

## Cookie Handling Strategies

The tool provides multiple strategies for handling cookie consent banners:

1. **`all`** (default) - Tries all strategies in sequence
2. **`click`** - Attempts to click accept/agree buttons
3. **`remove`** - Removes banner elements from the DOM
4. **`block`** - Blocks cookie consent service requests
5. **`none`** - No cookie handling

## Advanced Options

### Constructor Options

```javascript
{
  viewport: {
    width: 1920,
    height: 1080,
    deviceScaleFactor: 1
  },
  timeout: 30000,          // Navigation timeout
  headless: 'new',         // 'new' or false
  cookieStrategy: 'all',   // Cookie handling strategy
  waitTime: 3000           // Wait after page load
}
```

### Screenshot Options

```javascript
{
  fullPage: false,         // Capture full scrollable page
  clip: {                  // Capture specific region
    x: 0,
    y: 0,
    width: 800,
    height: 600
  },
  omitBackground: false,   // Transparent background
  encoding: 'binary',      // 'base64' or 'binary'
  type: 'png',             // 'png' or 'jpeg'
  quality: 90              // JPEG quality (0-100)
}
```

## Examples

### Capture Competitor Screenshots

```javascript
const { ScreenshotCapture } = require('@knowcode/screenshotfetch');

async function captureCompetitors() {
  const capture = new ScreenshotCapture({
    cookieStrategy: 'all'
  });

  await capture.init();

  const competitors = [
    'https://mailchimp.com',
    'https://activecampaign.com',
    'https://convertkit.com'
  ];

  try {
    for (const url of competitors) {
      // Use the hostname (without a leading "www.") as the file name
      const name = new URL(url).hostname.replace(/^www\./, '');
      await capture.captureScreenshot(
        url,
        `./competitors/${name}.png`
      );
    }
  } finally {
    // Close the browser even if a capture fails
    await capture.close();
  }
}

captureCompetitors().catch(console.error);
```

### Custom Cookie Handler

```javascript
const { ScreenshotCapture, CookieHandler } = require('@knowcode/screenshotfetch');

const cookieHandler = new CookieHandler();

// Add custom selectors for specific sites
cookieHandler.addCustomSelectors([
  'button[id="my-custom-accept"]',
  '.my-site-cookie-accept'
]);

// Add domains to block
cookieHandler.addBlockedDomains([
  'mycookieservice.com'
]);
```

## Troubleshooting

### Navigation Timeout

If you're getting timeout errors, try:

- Using `domcontentloaded` instead of `networkidle2`
- Increasing the `timeout` value
- Using `headless: false` to see what's happening

### Cookie Banners Still Visible

Try different strategies:

- Use `-s all` to try all methods
- Add custom selectors for specific sites
- Use `headless: false` to debug

### Memory Issues

For large batches:

- Increase the delay between captures
- Process in smaller batches
- Monitor system resources

## 🤝 Use Cases

### UX Research & Documentation

- Document user flows for analysis and improvement
- Create visual user journey maps
- Generate training materials automatically

### Quality Assurance & Testing

- Automate UI regression testing with screenshots
- Document application state changes
- Validate user experience flows

### Competitive Analysis

- Document competitor application flows
- Compare user experience patterns
- Generate competitive intelligence reports

### Process Documentation

- Create step-by-step user guides
- Document internal workflows
- Generate training materials

## 📋 Requirements

- **Node.js**: 16.0.0 or higher
- **Operating System**: Windows, macOS, or Linux
- **Memory**: 512 MB+ available RAM
- **Disk Space**: Varies based on screenshot quantity

## 🔧 Advanced Configuration

### Spider Options

```javascript
{
  maxFlows: 5,               // Maximum customer journeys to discover
  maxDepth: 10,              // Maximum steps per journey
  maxPages: 100,             // Maximum total pages to visit
  waitTime: 2000,            // Wait between actions (ms)
  includeQueryParams: true,  // Include URL query parameters
  cookieStrategy: 'all'      // Cookie consent handling strategy
}
```

### Screenshot Options

```javascript
{
  fullPage: false,           // Capture full scrollable page
  viewport: {                // Browser viewport size
    width: 1920,
    height: 1080,
    deviceScaleFactor: 1
  },
  type: 'png',               // Image format ('png' or 'jpeg')
  quality: 90                // JPEG quality (0-100)
}
```

## 📈 Contributing

We welcome contributions! Please see our contributing guidelines for details on how to:

- Report bugs and request features
- Submit pull requests
- Improve documentation

## 📄 License

MIT © [KnowCode](https://github.com/knowcode)

## 🔗 Links

- [NPM Package](https://www.npmjs.com/package/@knowcode/screenshotfetch)
- [GitHub Repository](https://github.com/knowcode/screenshotfetch)
- [Issue Tracker](https://github.com/knowcode/screenshotfetch/issues)
- [Documentation](https://github.com/knowcode/screenshotfetch#readme)