# @bernierllc/csv-import
Service package that orchestrates CSV core packages to provide higher-level CSV import functionality with schema management, error handling, and import workflows.
## Installation
```bash
npm install @bernierllc/csv-import
```
## Usage
### Basic Import
```typescript
import { CSVImport, createSchema } from '@bernierllc/csv-import';
// Create CSV import instance
const csvImport = new CSVImport();
// Create schema
const schemaId = await csvImport.createSchema({
  name: 'Users Import',
  description: 'Schema for importing user data',
  fields: [
    {
      name: 'email',
      type: 'email',
      required: true,
      validation: {
        pattern: /^[^\s@]+@[^\s@]+\.[^\s@]+$/
      }
    },
    {
      name: 'firstName',
      type: 'string',
      required: true,
      validation: {
        minLength: 1,
        maxLength: 50
      }
    },
    {
      name: 'lastName',
      type: 'string',
      required: true,
      validation: {
        minLength: 1,
        maxLength: 50
      }
    },
    {
      name: 'age',
      type: 'number',
      required: false
    }
  ]
});

// Create import job
const job = await csvImport.createImport({
  schemaId,
  options: {
    batchSize: 100,
    validateOnly: false,
    autoMap: true,
    skipErrors: false,
    maxErrors: 10
  },
  callbacks: {
    onProgress: (progress) => {
      console.log(`Progress: ${progress.progress}%`);
    },
    onComplete: (result) => {
      console.log(`Import completed: ${result.validRows} valid rows`);
    },
    onError: (error) => {
      console.error('Import error:', error.message);
    }
  }
});

// Start import
const csvContent = 'email,firstName,lastName,age\njohn@example.com,John,Doe,30';
const fileData = {
  name: 'users.csv',
  size: Buffer.byteLength(csvContent),
  type: 'text/csv',
  content: csvContent,
  buffer: Buffer.from(csvContent)
};

const result = await csvImport.startImport(job.id, fileData);

if (result.success) {
  console.log(`Successfully imported ${result.validRows} rows`);
} else {
  console.log(`Import failed with ${result.errors.length} errors`);
  result.errors.forEach(error => {
    console.log(`Row ${error.row}: ${error.message}`);
  });
}
```
### Advanced Usage with Retry Logic
```typescript
import { CSVImport } from '@bernierllc/csv-import';
const csvImport = new CSVImport();
// Create an import job with retry configuration
const job = await csvImport.createImport({
  schemaId: 'your-schema-id',
  options: {
    batchSize: 50,
    retryFailed: true,
    retryOptions: {
      maxRetries: 3,
      initialDelayMs: 1000,
      maxDelayMs: 30000,
      backoffFactor: 2
    }
  }
});

// fileData prepared as in the Basic Import example above
const result = await csvImport.startImport(job.id, fileData);
```
### Progress Tracking
```typescript
// Subscribe to progress updates
const unsubscribe = csvImport.progressTracker.subscribeToProgress(
  job.id,
  (progress) => {
    console.log(`Status: ${progress.status}`);
    console.log(`Progress: ${progress.progress}%`);
    console.log(`Processed: ${progress.currentRow}/${progress.totalRows}`);
    console.log(`Valid: ${progress.validRows}, Invalid: ${progress.invalidRows}`);
    if (progress.estimatedTimeRemaining) {
      console.log(`ETA: ${progress.estimatedTimeRemaining}ms`);
    }
  }
);
// Start import
await csvImport.startImport(job.id, fileData);
// Clean up subscription
unsubscribe();
```
### Schema Management
```typescript
// Create basic schema
const basicSchemaId = await csvImport.schemaManager.createBasicSchema(
  'Simple CSV',
  ['name', 'email', 'phone']
);

// Create email-specific schema
const emailSchemaId = await csvImport.schemaManager.createEmailSchema(
  'Email List Import'
);

// Update schema
await csvImport.updateSchema(schemaId, {
  description: 'Updated description',
  fields: [
    // ... updated fields
  ]
});

// Validate a schema (a full CSVSchema object, as returned by the schema manager)
const validation = await csvImport.validateSchema(schema);
if (!validation.isValid) {
  console.log('Schema errors:', validation.errors);
}
// List all schemas
const allSchemas = await csvImport.schemaManager.listSchemas();
```
### Import Management
```typescript
// Pause import
await csvImport.pauseImport(job.id);
// Resume import
await csvImport.resumeImport(job.id);
// Cancel import
await csvImport.cancelImport(job.id);
// Get import history
const history = await csvImport.getImportHistory({
  status: 'completed',
  limit: 10,
  offset: 0
});
console.log(`Found ${history.length} completed imports`);
```
### Error Handling and Recovery
```typescript
import { ErrorHandler } from '@bernierllc/csv-import';
const errorHandler = new ErrorHandler();
// Get error suggestions
result.errors.forEach(error => {
  const suggestion = errorHandler.getSuggestion(error);
  if (suggestion) {
    console.log(`Error: ${error.message}`);
    console.log(`Suggestion: ${suggestion}`);
  }
});

// Generate error report
const errorReport = errorHandler.generateErrorReport(result.errors);
console.log('Error Summary:', errorReport.summary);
console.log('Errors by Field:', errorReport.byField);
console.log('Errors by Code:', errorReport.byCode);

// Get recovery options
result.errors.forEach(error => {
  const options = errorHandler.getRecoveryOptions(error);
  console.log(`Recovery options for ${error.code}:`, options);
});
```
## API Reference
### CSVImport
Main service class that orchestrates CSV import operations.
#### Methods
- `createImport(config: ImportConfig): Promise<ImportJob>` - Create a new import job
- `startImport(jobId: string, fileData: FileData): Promise<ImportResult>` - Start processing an import
- `pauseImport(jobId: string): Promise<void>` - Pause an active import
- `resumeImport(jobId: string): Promise<void>` - Resume a paused import
- `cancelImport(jobId: string): Promise<void>` - Cancel an import
- `createSchema(schema: SchemaData): Promise<SchemaId>` - Create a new schema
- `updateSchema(schemaId: SchemaId, updates: Partial<CSVSchema>): Promise<void>` - Update an existing schema
- `validateSchema(schema: CSVSchema): Promise<SchemaValidationResult>` - Validate a schema definition
- `getImportProgress(jobId: string): Promise<ImportProgress>` - Get the current progress of an import
- `getImportHistory(options?: HistoryOptions): Promise<ImportHistory[]>` - Get import history
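For callers that prefer polling to callbacks, `getImportProgress` can be paired with `startImport`. A minimal sketch, assuming `csvImport`, `job`, and `fileData` are set up as in the Basic Import example:

```typescript
// Poll progress once per second while the import runs.
// Assumes csvImport, job, and fileData from the Basic Import example.
const importPromise = csvImport.startImport(job.id, fileData);

const timer = setInterval(async () => {
  const progress = await csvImport.getImportProgress(job.id);
  console.log(`${progress.status}: ${progress.progress}% (row ${progress.currentRow}/${progress.totalRows})`);
}, 1000);

const result = await importPromise;
clearInterval(timer);
console.log(`Finished: ${result.validRows}/${result.totalRows} rows valid`);
```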
### Types
#### ImportConfig
```typescript
interface ImportConfig {
  schemaId?: SchemaId;
  options?: ImportOptions;
  callbacks?: ImportCallbacks;
}
```
#### ImportOptions
```typescript
interface ImportOptions {
  batchSize?: number;
  validateOnly?: boolean;
  autoMap?: boolean;
  skipErrors?: boolean;
  maxErrors?: number;
  retryFailed?: boolean;
  retryOptions?: RetryOptions;
}
```
#### ImportResult
```typescript
interface ImportResult {
  jobId: string;
  success: boolean;
  totalRows: number;
  processedRows: number;
  validRows: number;
  invalidRows: number;
  errors: ImportError[];
  warnings: ImportWarning[];
  processingTime: number;
  completedAt: Date;
}
```
#### CSVSchema
```typescript
interface CSVSchema {
  id: SchemaId;
  name: string;
  description?: string;
  fields: SchemaField[];
  validationRules?: ValidationRule[];
  mappingRules?: MappingRule[];
  createdAt: Date;
  updatedAt: Date;
}
```
## Configuration
### Import Options
- **batchSize** (default: 100) - Number of rows to process in each batch
- **validateOnly** (default: false) - Only validate data without processing
- **autoMap** (default: true) - Automatically map CSV columns to schema fields
- **skipErrors** (default: false) - Skip rows with errors and continue processing
- **maxErrors** (default: undefined) - Maximum number of errors before stopping
- **retryFailed** (default: false) - Enable retry logic for failed rows
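Putting these together, a validation-only dry run that tolerates some bad rows but bails out early might look like this (a sketch; the values are illustrative, and it assumes the `ImportOptions` type shown above is exported from the package):

```typescript
import { ImportOptions } from '@bernierllc/csv-import';

// Dry run: validate the whole file, skip bad rows, stop after 25 errors.
// csvImport and schemaId as in the Usage examples above.
const options: ImportOptions = {
  batchSize: 100,     // the documented default
  validateOnly: true, // report errors without importing anything
  autoMap: true,      // map CSV headers onto schema field names
  skipErrors: true,   // keep processing past invalid rows
  maxErrors: 25,      // abort once 25 errors accumulate
};

const job = await csvImport.createImport({ schemaId, options });
```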
### Retry Options
- **maxRetries** (default: 3) - Maximum number of retry attempts
- **initialDelayMs** (default: 1000) - Initial delay between retries
- **maxDelayMs** (default: 30000) - Maximum delay between retries
- **backoffFactor** (default: 2) - Exponential backoff multiplier
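Our reading of these options: the wait before retry *n* is `initialDelayMs × backoffFactor^(n−1)`, capped at `maxDelayMs`. A quick sketch of the resulting schedule (illustrative, not code from the package):

```typescript
// Delay schedule implied by the defaults above: 1000ms, 2000ms, 4000ms.
const retry = { maxRetries: 3, initialDelayMs: 1000, maxDelayMs: 30000, backoffFactor: 2 };

for (let attempt = 1; attempt <= retry.maxRetries; attempt++) {
  const delay = Math.min(
    retry.initialDelayMs * retry.backoffFactor ** (attempt - 1),
    retry.maxDelayMs
  );
  console.log(`retry ${attempt}: wait ${delay}ms`);
}
```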
## Error Codes
### Common Error Codes
- `REQUIRED_FIELD` - Required field is missing or empty
- `INVALID_EMAIL` - Invalid email format
- `INVALID_NUMBER` - Invalid numeric value
- `INVALID_DATE` - Invalid date format
- `VALUE_TOO_LONG` - Value exceeds maximum length
- `VALUE_TOO_SHORT` - Value is below minimum length
- `DUPLICATE_VALUE` - Duplicate value found
- `OUT_OF_RANGE` - Value is outside acceptable range
- `PARSING_ERROR` - Error parsing CSV data
- `SYSTEM_ERROR` - System or processing error
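Each `ImportError` carries one of these codes, so callers can branch on them. A hypothetical triage helper (the grouping is illustrative and assumes `ImportError` is exported alongside `ImportResult`):

```typescript
import type { ImportError } from '@bernierllc/csv-import';

// Split errors into ones a user can fix in the file vs. ones to escalate.
function triageErrors(errors: ImportError[]) {
  const userFixable = new Set([
    'REQUIRED_FIELD', 'INVALID_EMAIL', 'INVALID_NUMBER',
    'INVALID_DATE', 'VALUE_TOO_LONG', 'VALUE_TOO_SHORT',
  ]);
  return {
    fixInFile: errors.filter(e => userFixable.has(e.code)),
    escalate: errors.filter(e => e.code === 'PARSING_ERROR' || e.code === 'SYSTEM_ERROR'),
  };
}
```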
## Dependencies
This package orchestrates the following core packages:
- `@bernierllc/csv-parser` - CSV parsing functionality
- `@bernierllc/csv-validator` - Data validation
- `@bernierllc/csv-mapper` - Column mapping
- `@bernierllc/file-handler` - File operations
- `@bernierllc/retry-policy` - Retry logic
## Performance
### Benchmarks
- **Small files** (< 1MB) - < 5s processing time
- **Large files** (> 100MB) - < 300s processing time
- **Memory usage** - < 50MB for 1GB file
- **Optimal batch size** - 1000 rows per batch
### Optimization Tips
1. Use appropriate batch sizes for your data volume
2. Enable auto-mapping to reduce processing time
3. Set reasonable error limits to avoid processing invalid data
4. Use retry logic only when necessary
5. Subscribe to progress updates for better user experience
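Taken together, the tips above might translate into a configuration like this for a large file (a sketch; the values are illustrative):

```typescript
// csvImport and schemaId as in the Usage examples above.
const job = await csvImport.createImport({
  schemaId,
  options: {
    batchSize: 1000,    // tip 1: the benchmarked optimal batch size
    autoMap: true,      // tip 2: skip manual column mapping
    maxErrors: 100,     // tip 3: stop early on a clearly bad file
    retryFailed: false, // tip 4: leave retries off unless failures are transient
  },
  callbacks: {
    onProgress: (p) => console.log(`${p.progress}%`), // tip 5: surface progress
  },
});
```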
## See Also
- [@bernierllc/csv-parser](../csv-parser) - Core CSV parsing functionality
- [@bernierllc/csv-validator](../csv-validator) - Data validation utilities
- [@bernierllc/csv-mapper](../csv-mapper) - Column mapping utilities
- [@bernierllc/file-handler](../file-handler) - File handling utilities
- [@bernierllc/retry-policy](../retry-policy) - Retry logic utilities
## License
Copyright (c) 2025 Bernier LLC. All rights reserved.