# Document Extraction Service

`document-extraction-service` is a Node.js library for integrating with document processing APIs. It prepares requests, validates responses, and handles callbacks for document extraction workflows.

---

## Installation

To install the package, run:

```bash
npm install document-extraction-service
```

---

## Quick Start

### Configure the Request Validator

```javascript
const { createRequestValidator } = require('document-extraction-service');

const config = {
  endpoint: 'https://your-extraction-api.com',
  headers: {
    'Content-Type': 'multipart/form-data',
    'callback_url_pattern': 'https://your-service.com/callback/{{streamId}}/{{extractionStrategyId}}',
    'trace_id': '{{traceId}}'
  },
  requestBody: {
    'strategies_batch_id': '{{strategiesBatchId}}',
    'doc_id': '{{docId}}',
    'url': '{{file_url}}',
    'document_meta': '{{content}}',
  },
  timeout_days: 2,
  max_retries: 3
};

const requestValidator = createRequestValidator(config);
```

### Create the Callback Validator

```javascript
const { createCallbackValidator } = require('document-extraction-service');

const callbackValidator = createCallbackValidator();
```

### Processing a Document

```javascript
// Any HTTP client works; axios is used here as an example
const axios = require('axios');

const processDocument = async () => {
  const docId = 'doc123';
  const content = { text: 'Your document content' };
  const streamId = 'stream456';

  try {
    // Prepare request parameters
    const requestParams = requestValidator.prepareRequest(docId, content, streamId);

    // Make API call (using your preferred HTTP client, e.g., axios)
    const response = await axios(requestParams);

    // Handle response
    const result = requestValidator.handleResponse(response, requestParams.headers['X-Trace-ID']);
    console.log('Document processing initiated:', result);
  } catch (error) {
    console.error('Error processing document:', error);
  }
};
```

### Handling Callback

```javascript
const handleCallback = async (callbackData) => {
  try {
    const result = await callbackValidator.handleCallback(callbackData);
    if (!result.success) {
      console.error(result.error);
    } else {
      console.log('Callback processed:', result);
    }
  } catch (error) {
    console.error('Error processing callback:', error);
  }
};
```

---

## API Documentation

### Configuration Object

```javascript
const config = {
  endpoint: 'https://api.example.com', // Required - API endpoint
  headers: {
    'Authorization': 'Bearer token',
    'callback_url_pattern': 'https://callback.com/{{docId}}/{{streamId}}' // Required
  },
  timeout_days: 2, // Optional - default: 2
  max_retries: 3 // Optional - default: 3
};
```

### Request Preparation

```javascript
const params = requestValidator.prepareRequest(docId, content, streamId);
```

#### Returns:

```javascript
{
  url: string,
  'callback_url_pattern': 'https://your-service.com/callback/{{docId}}/{{streamId}}',
  method: 'POST',
  headers: {
    'X-Document-ID': string,
    'X-Trace-ID': string,
    'X-Callback-URL': string,
    ...other headers
  },
  data: {
    content: any,
    streamId: string
  }
}
```

### Response Handling

```javascript
const result = requestValidator.handleResponse(response, traceId);
```

#### Returns:

```javascript
{
  success: boolean,
  docId: string,
  traceId: string,
  message?: string,
  error?: string
}
```

### Callback Handling

```javascript
const result = await callbackValidator.handleCallback(callbackData);
```

#### Input Format:

```javascript
{
  doc_id: string,
  trace_id: string,
  chunk_data: Array<{
    content: string,
    index: number,
    chunkId: string,
    chunkText: string
  }>,
  last_batch: boolean
}
```

#### Returns:

```javascript
{
  success: boolean,
  docId: string,
  traceId: string,
  isLastBatch: boolean,
  chunks: Array<ProcessedChunk>,
  metadata: {
    processedAt: string,
    chunksCount: number
  }
}
```

---

## Additional Utilities

### Chunk Validation

```javascript
const { ChunkData, ExtractionConfig, CustomExtractorFactory } = require('document-extraction-service');

// Validate chunks
ChunkData.validateResponse(chunksData);
ChunkData.validateChunk(chunk);
```

### Custom Configurations

```javascript
const config = new ExtractionConfig({...});
const factory = new CustomExtractorFactory();

const customRequestValidator = factory.createRequestValidator(config);
const customCallbackValidator = factory.createCallbackValidator();
```

---

## Error Handling

### Validating Input

```javascript
try {
  const result = await requestValidator.prepareRequest(docId, content, streamId);
} catch (error) {
  if (error.message.includes('Missing required field')) {
    // Handle validation error
  } else {
    // Handle other errors
  }
}
```

### Handling Callback Validation Errors

```javascript
try {
  const result = await callbackValidator.handleCallback(callbackData);
  if (!result.success) {
    // Handle validation failure
    console.error(result.error);
  }
} catch (error) {
  // Handle unexpected errors
  console.error(error);
}
```

---

## Testing

Run the provided test suite:

```bash
npm test
```

---

## Features

- **Request Preparation**: Builds API requests with the configured headers and parameters.
- **Response Validation**: Checks that API responses have the expected format.
- **Callback Processing**: Validates and processes callback data.
- **Customizable Configuration**: Supports configurable timeouts, retry limits, and callback URL patterns.

---

## License

This project is licensed under the MIT License. Contributions are welcome!
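The callback input format documented under Callback Handling can also be checked by hand, for example in a webhook route before the payload reaches `callbackValidator.handleCallback`. The following is a minimal sketch under that assumption: `checkCallbackPayload` is a hypothetical helper written for illustration, not part of `document-extraction-service`, and it only tests the field names and types listed in the input format above.

```javascript
// Hypothetical helper (not part of document-extraction-service).
// Returns a list of problems found in a callback payload; an empty
// list means the payload matches the documented input format.
function checkCallbackPayload(payload) {
  if (payload === null || typeof payload !== 'object') {
    return ['payload must be an object'];
  }

  const problems = [];
  if (typeof payload.doc_id !== 'string') problems.push('doc_id must be a string');
  if (typeof payload.trace_id !== 'string') problems.push('trace_id must be a string');
  if (typeof payload.last_batch !== 'boolean') problems.push('last_batch must be a boolean');

  if (!Array.isArray(payload.chunk_data)) {
    problems.push('chunk_data must be an array');
    return problems;
  }

  payload.chunk_data.forEach((chunk, i) => {
    if (chunk === null || typeof chunk !== 'object') {
      problems.push(`chunk_data[${i}] must be an object`);
      return;
    }
    // String fields listed in the documented chunk shape
    for (const key of ['content', 'chunkId', 'chunkText']) {
      if (typeof chunk[key] !== 'string') {
        problems.push(`chunk_data[${i}].${key} must be a string`);
      }
    }
    if (typeof chunk.index !== 'number') {
      problems.push(`chunk_data[${i}].index must be a number`);
    }
  });

  return problems;
}

// Usage: a payload shaped like the documented input format
const samplePayload = {
  doc_id: 'doc123',
  trace_id: 'trace789',
  chunk_data: [
    { content: 'Hello', index: 0, chunkId: 'chunk-0', chunkText: 'Hello' }
  ],
  last_batch: true
};
console.log(checkCallbackPayload(samplePayload));
```

Returning a list of problems instead of throwing on the first one makes it easier to log everything that is wrong with a rejected callback in a single pass.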