# @bernierllc/csv-import

Service package that orchestrates CSV core packages to provide higher-level CSV import functionality with schema management, error handling, and import workflows.

## Installation

```bash
npm install @bernierllc/csv-import
```

## Usage

### Basic Import

```typescript
import { CSVImport } from '@bernierllc/csv-import';

// Create a CSV import instance
const csvImport = new CSVImport();

// Create a schema describing the expected columns
const schemaId = await csvImport.createSchema({
  name: 'Users Import',
  description: 'Schema for importing user data',
  fields: [
    { name: 'email', type: 'email', required: true, validation: { pattern: /^[^\s@]+@[^\s@]+\.[^\s@]+$/ } },
    { name: 'firstName', type: 'string', required: true, validation: { minLength: 1, maxLength: 50 } },
    { name: 'lastName', type: 'string', required: true, validation: { minLength: 1, maxLength: 50 } },
    { name: 'age', type: 'number', required: false }
  ]
});

// Create an import job
const job = await csvImport.createImport({
  schemaId,
  options: {
    batchSize: 100,
    validateOnly: false,
    autoMap: true,
    skipErrors: false,
    maxErrors: 10
  },
  callbacks: {
    onProgress: (progress) => {
      console.log(`Progress: ${progress.progress}%`);
    },
    onComplete: (result) => {
      console.log(`Import completed: ${result.validRows} valid rows`);
    },
    onError: (error) => {
      console.error('Import error:', error.message);
    }
  }
});

// Describe the file to import
const fileData = {
  name: 'users.csv',
  size: 1024,
  type: 'text/csv',
  content: 'email,firstName,lastName,age\njohn@example.com,John,Doe,30',
  buffer: Buffer.from('email,firstName,lastName,age\njohn@example.com,John,Doe,30')
};

// Start the import
const result = await csvImport.startImport(job.id, fileData);

if (result.success) {
  console.log(`Successfully imported ${result.validRows} rows`);
} else {
  console.log(`Import failed with ${result.errors.length} errors`);
  result.errors.forEach(error => {
    console.log(`Row ${error.row}: ${error.message}`);
  });
}
```

### Advanced Usage with Retry Logic

```typescript
import { CSVImport } from '@bernierllc/csv-import';

const csvImport = new CSVImport();

// Create an import with retry configuration
const job = await csvImport.createImport({
  schemaId: 'your-schema-id',
  options: {
    batchSize: 50,
    retryFailed: true,
    retryOptions: {
      maxRetries: 3,
      initialDelayMs: 1000,
      maxDelayMs: 30000,
      backoffFactor: 2
    }
  }
});

const result = await csvImport.startImport(job.id, fileData);
```

### Progress Tracking

```typescript
// Subscribe to progress updates
const unsubscribe = csvImport.progressTracker.subscribeToProgress(
  job.id,
  (progress) => {
    console.log(`Status: ${progress.status}`);
    console.log(`Progress: ${progress.progress}%`);
    console.log(`Processed: ${progress.currentRow}/${progress.totalRows}`);
    console.log(`Valid: ${progress.validRows}, Invalid: ${progress.invalidRows}`);
    if (progress.estimatedTimeRemaining) {
      console.log(`ETA: ${progress.estimatedTimeRemaining}ms`);
    }
  }
);

// Start the import
await csvImport.startImport(job.id, fileData);

// Clean up the subscription
unsubscribe();
```

### Schema Management

```typescript
// Create a basic schema from a list of column names
const basicSchemaId = await csvImport.schemaManager.createBasicSchema(
  'Simple CSV',
  ['name', 'email', 'phone']
);

// Create an email-specific schema
const emailSchemaId = await csvImport.schemaManager.createEmailSchema(
  'Email List Import'
);

// Update an existing schema
await csvImport.updateSchema(schemaId, {
  description: 'Updated description',
  fields: [
    // ... updated fields
  ]
});

// Validate a schema (a full CSVSchema object)
const validation = await csvImport.validateSchema(schema);
if (!validation.isValid) {
  console.log('Schema errors:', validation.errors);
}

// List all schemas
const allSchemas = await csvImport.schemaManager.listSchemas();
```
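### Validate-Only Dry Runs

The `validateOnly` option (see Configuration below) makes it possible to check a file against a schema without importing anything. The sketch below is illustrative rather than taken from the package's own examples: it reuses the `csvImport` instance and `fileData` object from Basic Import, and the schema ID is a placeholder.

```typescript
// Dry run: validate every row against the schema without importing anything.
const dryRunJob = await csvImport.createImport({
  schemaId: 'your-schema-id', // placeholder
  options: { validateOnly: true }
});
const dryRun = await csvImport.startImport(dryRunJob.id, fileData);

if (dryRun.invalidRows === 0) {
  // The file is clean, so run the real import.
  const job = await csvImport.createImport({ schemaId: 'your-schema-id' });
  await csvImport.startImport(job.id, fileData);
} else {
  console.log(`Dry run found ${dryRun.invalidRows} invalid rows; fix the file and retry.`);
}
```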
### Import Management

```typescript
// Pause import
await csvImport.pauseImport(job.id);

// Resume import
await csvImport.resumeImport(job.id);

// Cancel import
await csvImport.cancelImport(job.id);

// Get import history
const history = await csvImport.getImportHistory({
  status: 'completed',
  limit: 10,
  offset: 0
});

console.log(`Found ${history.length} completed imports`);
```

### Error Handling and Recovery

```typescript
import { ErrorHandler } from '@bernierllc/csv-import';

const errorHandler = new ErrorHandler();

// Get error suggestions
result.errors.forEach(error => {
  const suggestion = errorHandler.getSuggestion(error);
  if (suggestion) {
    console.log(`Error: ${error.message}`);
    console.log(`Suggestion: ${suggestion}`);
  }
});

// Generate error report
const errorReport = errorHandler.generateErrorReport(result.errors);
console.log('Error Summary:', errorReport.summary);
console.log('Errors by Field:', errorReport.byField);
console.log('Errors by Code:', errorReport.byCode);

// Get recovery options
result.errors.forEach(error => {
  const options = errorHandler.getRecoveryOptions(error);
  console.log(`Recovery options for ${error.code}:`, options);
});
```

## API Reference

### CSVImport

Main service class that orchestrates CSV import operations.

#### Methods

- `createImport(config: ImportConfig): Promise<ImportJob>` - Create a new import job
- `startImport(jobId: string, fileData: FileData): Promise<ImportResult>` - Start processing an import
- `pauseImport(jobId: string): Promise<void>` - Pause an active import
- `resumeImport(jobId: string): Promise<void>` - Resume a paused import
- `cancelImport(jobId: string): Promise<void>` - Cancel an import
- `createSchema(schema: SchemaData): Promise<SchemaId>` - Create a new schema
- `updateSchema(schemaId: SchemaId, updates: Partial<CSVSchema>): Promise<void>` - Update an existing schema
- `validateSchema(schema: CSVSchema): Promise<SchemaValidationResult>` - Validate a schema
- `getImportProgress(jobId: string): Promise<ImportProgress>` - Get import progress
- `getImportHistory(options?: HistoryOptions): Promise<ImportHistory[]>` - Get import history

### Types

#### ImportConfig

```typescript
interface ImportConfig {
  schemaId?: SchemaId;
  options?: ImportOptions;
  callbacks?: ImportCallbacks;
}
```

#### ImportOptions

```typescript
interface ImportOptions {
  batchSize?: number;
  validateOnly?: boolean;
  autoMap?: boolean;
  skipErrors?: boolean;
  maxErrors?: number;
  retryFailed?: boolean;
  retryOptions?: RetryOptions;
}
```

#### ImportResult

```typescript
interface ImportResult {
  jobId: string;
  success: boolean;
  totalRows: number;
  processedRows: number;
  validRows: number;
  invalidRows: number;
  errors: ImportError[];
  warnings: ImportWarning[];
  processingTime: number;
  completedAt: Date;
}
```

#### CSVSchema

```typescript
interface CSVSchema {
  id: SchemaId;
  name: string;
  description?: string;
  fields: SchemaField[];
  validationRules?: ValidationRule[];
  mappingRules?: MappingRule[];
  createdAt: Date;
  updatedAt: Date;
}
```

## Configuration

### Import Options

- **batchSize** (default: 100) - Number of rows to process in each batch
- **validateOnly** (default: false) - Only validate data without processing
- **autoMap** (default: true) - Automatically map CSV columns to schema fields
- **skipErrors** (default: false) - Skip rows with errors and continue processing
- **maxErrors** (default: undefined) - Maximum number of errors before stopping
- **retryFailed** (default: false) - Enable retry logic for failed rows

### Retry Options

- **maxRetries** (default: 3) - Maximum number of retry attempts
- **initialDelayMs** (default: 1000) - Initial delay between retries
- **maxDelayMs** (default: 30000) - Maximum delay between retries
- **backoffFactor** (default: 2) - Exponential backoff multiplier
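For intuition, these options describe a conventional capped exponential backoff: with the defaults, a failed row would be retried after roughly 1000 ms, 2000 ms, and 4000 ms. The helper below is a minimal sketch of that formula for illustration only; it is not part of the package API.

```typescript
// Illustrative only: capped exponential backoff implied by the retry options.
// delay(attempt) = min(initialDelayMs * backoffFactor^attempt, maxDelayMs)
function retryDelayMs(
  attempt: number, // 0-based retry attempt
  initialDelayMs = 1000,
  backoffFactor = 2,
  maxDelayMs = 30000
): number {
  return Math.min(initialDelayMs * Math.pow(backoffFactor, attempt), maxDelayMs);
}

console.log([0, 1, 2].map((n) => retryDelayMs(n))); // [1000, 2000, 4000]
```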
## Error Codes

### Common Error Codes

- `REQUIRED_FIELD` - Required field is missing or empty
- `INVALID_EMAIL` - Invalid email format
- `INVALID_NUMBER` - Invalid numeric value
- `INVALID_DATE` - Invalid date format
- `VALUE_TOO_LONG` - Value exceeds maximum length
- `VALUE_TOO_SHORT` - Value is below minimum length
- `DUPLICATE_VALUE` - Duplicate value found
- `OUT_OF_RANGE` - Value is outside the acceptable range
- `PARSING_ERROR` - Error parsing CSV data
- `SYSTEM_ERROR` - System or processing error

## Dependencies

This package orchestrates the following core packages:

- `@bernierllc/csv-parser` - CSV parsing functionality
- `@bernierllc/csv-validator` - Data validation
- `@bernierllc/csv-mapper` - Column mapping
- `@bernierllc/file-handler` - File operations
- `@bernierllc/retry-policy` - Retry logic

## Performance

### Benchmarks

- **Small files** (< 1MB) - < 5s processing time
- **Large files** (> 100MB) - < 300s processing time
- **Memory usage** - < 50MB for a 1GB file
- **Optimal batch size** - 1000 rows per batch

### Optimization Tips

1. Use a batch size appropriate for your data volume
2. Enable auto-mapping to reduce processing time
3. Set reasonable error limits to avoid processing invalid data
4. Use retry logic only when necessary
5. Subscribe to progress updates for a better user experience

## See Also

- [@bernierllc/csv-parser](../csv-parser) - Core CSV parsing functionality
- [@bernierllc/csv-validator](../csv-validator) - Data validation utilities
- [@bernierllc/csv-mapper](../csv-mapper) - Column mapping utilities
- [@bernierllc/file-handler](../file-handler) - File handling utilities
- [@bernierllc/retry-policy](../retry-policy) - Retry logic utilities

## License

Copyright (c) 2025 Bernier LLC. All rights reserved.