# 🚀 CSV Bulk Processor - Data Import/Export Suite

**Powerful, memory-efficient bulk data processing for CSV, Excel, and JSON files with streaming, validation, transformation, and performance monitoring.**

> **SEO Keywords**: CSV bulk processor, data import export, file streaming, data validation, data transformation, bulk data processing, Node.js data processing, enterprise data pipeline, scalable data processing, production-ready data processor

[![npm version](https://badge.fury.io/js/@prathammahajan%2Fcsv-bulk-processor.svg)](https://badge.fury.io/js/@prathammahajan%2Fcsv-bulk-processor)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Node.js](https://img.shields.io/badge/Node.js-Ready-green.svg)](https://nodejs.org/)
[![GitHub stars](https://img.shields.io/github/stars/prathammahajan13/csv-bulk-processor.svg?style=social&label=Star)](https://github.com/prathammahajan13/csv-bulk-processor)
[![GitHub forks](https://img.shields.io/github/forks/prathammahajan13/csv-bulk-processor.svg?style=social&label=Fork)](https://github.com/prathammahajan13/csv-bulk-processor/fork)
[![GitHub watchers](https://img.shields.io/github/watchers/prathammahajan13/csv-bulk-processor.svg?style=social&label=Watch)](https://github.com/prathammahajan13/csv-bulk-processor)

## ✨ Features

- 🔄 **Memory-Efficient Streaming** - Process large files without memory issues
- 📊 **Multi-Format Support** - CSV, Excel, and JSON file processing
- ✅ **Data Validation** - Schema, format, and custom validation rules
- 🔧 **Data Transformation** - Cleaning, mapping, and conversion
- 📈 **Progress Tracking** - Real-time progress monitoring and resumable processing
- 🛡️ **Error Handling** - Comprehensive error detection and recovery
- ⚡ **Performance Monitoring** - Built-in performance analytics and optimization
- 🎯 **Batch Processing** - Optimized batch operations for large datasets

## 🎯 Why Choose This Bulk Processor?
**Perfect for developers who need:**

- **Enterprise-grade data processing** with memory-efficient streaming for large files
- **Production-ready data pipelines** with comprehensive validation and transformation
- **Scalable data processing** for handling millions of records efficiently
- **Real-time progress monitoring** with resumable processing capabilities
- **Multi-format data support** for CSV, Excel, and JSON files
- **Advanced error handling** with rollback and recovery mechanisms
- **Performance optimization** with built-in monitoring and analytics
- **Developer-friendly APIs** with intuitive configuration and event handling

## 📦 Installation

```bash
npm install @prathammahajan/csv-bulk-processor
```

## 🚀 Quick Start

```javascript
const BulkProcessor = require('@prathammahajan/csv-bulk-processor');

// Create processor with configuration
const processor = new BulkProcessor({
  streaming: { enabled: true, chunkSize: 1000 },
  validation: { enabled: true },
  transformation: { enabled: true },
  progress: { enabled: true }
});

// Process a file with progress tracking
processor.on('progress', (data) => {
  console.log(`Processed ${data.recordsProcessed} records`);
});

const result = await processor.processFile('data.csv');

console.log(`✅ Processed ${result.result.recordsProcessed} records`);
console.log(`⏱️ Processing time: ${result.processingTime}ms`);
console.log(`📊 Performance: ${result.analytics.metrics.throughput.recordsPerSecond} records/sec`);
```

## 📋 Basic Usage

### Simple File Processing

```javascript
const processor = new BulkProcessor();

// Process CSV file
const csvResult = await processor.processFile('data.csv');

// Process JSON file
const jsonResult = await processor.processFile('data.json');

// Process Excel file
const excelResult = await processor.processFile('data.xlsx');
```

### Progress Tracking

```javascript
const processor = new BulkProcessor();

processor.on('progress', (data) => {
  console.log(`Progress: ${data.recordsProcessed} records processed`);
});

processor.on('complete', (data) => {
  console.log('Processing completed!');
});

const result = await processor.processFile('large-file.csv');
```

### Data Validation

```javascript
const processor = new BulkProcessor({
  validation: {
    enabled: true,
    schema: {
      name: { type: 'string', required: true },
      email: { type: 'email', required: true },
      age: { type: 'number', min: 0, max: 120 }
    }
  }
});

const result = await processor.processFile('data.csv');
// Validation errors will be in result.result.errors
```

### Data Transformation

```javascript
const processor = new BulkProcessor({
  transformation: {
    enabled: true,
    mapping: {
      'Full Name': 'name',
      'Email Address': 'email',
      'User Age': 'age'
    },
    cleaning: {
      trimStrings: true,
      normalizeDates: true
    }
  }
});

const result = await processor.processFile('data.csv');
```

## 🔧 Configuration

```javascript
const processor = new BulkProcessor({
  streaming: {
    enabled: true,        // Enable streaming for large files
    chunkSize: 1000,      // Records per chunk
    memoryLimit: '100MB'  // Memory limit
  },
  validation: {
    enabled: true,        // Enable validation
    schema: { /* schema */ },
    format: true,         // Format validation
    business: true        // Business rules
  },
  transformation: {
    enabled: true,        // Enable transformation
    mapping: { /* mapping */ },
    cleaning: true,       // Data cleaning
    conversion: true      // Type conversion
  },
  progress: {
    enabled: true,        // Enable progress tracking
    tracking: true,       // Real-time tracking
    resumable: true       // Resumable processing
  },
  error: {
    enabled: true,        // Enable error handling
    recovery: true,       // Error recovery
    rollback: true        // Rollback on errors
  },
  performance: {
    enabled: true,        // Enable performance monitoring
    monitoring: true,     // Real-time monitoring
    optimization: true    // Performance optimization
  }
});
```

## 🎯 Advanced Examples

### Large File Processing with Streaming

```javascript
const processor = new BulkProcessor({
  streaming: {
    enabled: true,
    chunkSize: 5000,
    memoryLimit: '500MB'
  },
  progress: { enabled: true }
});

// Process a 10GB CSV file efficiently
const result = await processor.processFile('huge-dataset.csv');
```

### Complex Data Transformation

```javascript
const processor = new BulkProcessor({
  transformation: {
    enabled: true,
    mapping: {
      'Customer Name': 'name',
      'Email': 'email',
      'Phone Number': 'phone'
    },
    cleaning: {
      trimStrings: true,
      normalizeDates: true,
      removeEmpty: true
    },
    conversion: {
      'age': 'number',
      'isActive': 'boolean',
      'createdAt': 'date'
    }
  }
});

const result = await processor.processFile('customer-data.csv');
```

### Error Handling and Recovery

```javascript
const processor = new BulkProcessor({
  error: {
    enabled: true,
    recovery: true,
    rollback: true,
    retryAttempts: 3
  }
});

processor.on('error', (error) => {
  console.error('Processing error:', error);
});

processor.on('validation-error', (error) => {
  console.error('Validation error:', error);
});

const result = await processor.processFile('data.csv');
```

## 📊 Performance Monitoring

```javascript
const processor = new BulkProcessor({
  performance: { enabled: true }
});

const result = await processor.processFile('data.csv');

console.log('Performance Metrics:', result.analytics.metrics);
// {
//   throughput: { recordsPerSecond: 1500 },
//   performance: { averageProcessingTime: 2.5 },
//   memory: { peakUsage: '45MB' },
//   error: { errorRate: 0.02 }
// }
```

## 🛠️ Supported File Formats

| Format | Extension | Features |
|--------|-----------|----------|
| **CSV** | `.csv` | Headers, custom delimiters, encoding |
| **Excel** | `.xlsx`, `.xls` | Multiple sheets, formulas, formatting |
| **JSON** | `.json` | Arrays, objects, streaming support |

## 📝 API Reference

### BulkProcessor Methods

| Method | Description |
|--------|-------------|
| `processFile(filePath, options?)` | Process a file with full pipeline |
| `getSupportedFormats()` | Get list of supported file formats |
| `getMemoryUsage()` | Get current memory usage |
| `getPerformanceMetrics()` | Get performance metrics |
| `updateConfiguration(config)` | Update processor configuration |
### Events

| Event | Description | Data |
|-------|-------------|------|
| `progress` | Processing progress update | `{ recordsProcessed, batchSize }` |
| `complete` | Processing completed | `{ result, analytics, processingTime }` |
| `error` | Processing error occurred | `{ error, record, timestamp }` |
| `validation-error` | Validation error occurred | `{ error, field, record }` |

## 🤝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🚀 Use Cases & Applications

**Ideal for:**

- **Data migration projects** requiring efficient processing of large datasets
- **ETL pipelines** with validation, transformation, and error handling
- **Data import/export systems** for enterprise applications
- **Analytics platforms** processing large volumes of data
- **API development** with bulk data processing capabilities
- **Microservices architecture** with data processing components
- **Real-time data processing** with streaming and progress tracking
- **Startup MVPs** needing production-ready data processing solutions

## 🔍 SEO & Discoverability

**Search Terms**: CSV bulk processor, data import export, file streaming, data validation, data transformation, bulk data processing, Node.js data processing, enterprise data pipeline, scalable data processing, production-ready data processor, CSV parser, Excel processor, JSON processor, data pipeline, ETL processing, data migration, file processing, streaming data, memory efficient processing, data validation engine, data transformation engine, progress tracking, error handling, performance monitoring, batch processing

## 🙏 Support

- 📧 **Issues**: [GitHub Issues](https://github.com/prathammahajan13/csv-bulk-processor/issues)
- 📖 **Documentation**: [GitHub Wiki](https://github.com/prathammahajan13/csv-bulk-processor/wiki)
- 💬 **Discussions**: [GitHub Discussions](https://github.com/prathammahajan13/csv-bulk-processor/discussions)

---

**Made with ❤️ by [Pratham Mahajan](https://github.com/prathammahajan13)**