# @opensubtitles/video-metadata-extractor

A comprehensive NPM package for video metadata extraction and subtitle processing using FFmpeg WASM. Supports metadata extraction, individual subtitle extraction, batch subtitle extraction with ZIP downloads, and memory-safe processing of files of any size using chunked streaming.

## Features

- 🎥 **Comprehensive Metadata Extraction**: Extract detailed video information including duration, resolution, bitrate, codecs, and stream details
- 📄 **Individual Subtitle Extraction**: Extract single subtitle tracks with quick or full processing modes
- 📦 **Batch Subtitle Extraction**: Extract all subtitle tracks at once as a downloadable ZIP file
- 🧠 **Memory-Safe Processing**: Handle files of any size using 500MB chunked streaming (tested up to 10GB, theoretically unlimited)
- ⚡ **Smart File Processing**: Automatic format detection with optimized processing strategies
- 🌐 **Browser-Native**: No server required, all processing happens in the browser using FFmpeg WASM
- 📱 **TypeScript Support**: Full type safety with comprehensive interfaces
- 🎯 **Multiple Formats**: Support for SRT, ASS, VTT subtitle formats
- 🗂️ **Smart Filename Generation**: Automatic filename generation with language codes and duplicate handling

## Installation

```bash
npm install @opensubtitles/video-metadata-extractor
```

## Quick Start

```typescript
import { VideoMetadataExtractor } from '@opensubtitles/video-metadata-extractor';

// Initialize the extractor
const extractor = new VideoMetadataExtractor({
  debug: true,
  onProgress: (progress) => console.log(`Progress: ${progress.progress}% - ${progress.text}`),
  onError: (error) => console.error('Error:', error.message)
});

// Initialize FFmpeg (required before any operations)
await extractor.initialize();

// Extract metadata
const metadata = await extractor.extractMetadata(file);
console.log('Video duration:', metadata.format?.duration);
console.log('Subtitle tracks:', metadata.streams?.filter(s => s.codec_type === 'subtitle').length);

// Extract a single subtitle track (quick mode)
const subtitle = await extractor.extractSubtitle(file, 2, { format: 'srt', quick: true });
console.log('Subtitle preview:', subtitle.preview);
extractor.downloadFile(subtitle.data, subtitle.filename);

// Extract all subtitles as ZIP
const batch = await extractor.extractAllSubtitles(file);
console.log(`Extracted ${batch.successfulExtractions}/${batch.totalStreams} subtitle tracks`);
extractor.downloadZip(batch.zipBlob, batch.zipFilename);

// Clean up when done
await extractor.terminate();
```

## API Reference

### VideoMetadataExtractor

The main class for video processing operations.

#### Constructor Options

```typescript
interface VideoMetadataExtractorOptions {
  /** Custom FFmpeg core URL */
  ffmpegCoreURL?: string;
  /** Custom FFmpeg WASM URL */
  ffmpegWasmURL?: string;
  /** Progress callback */
  onProgress?: (progress: ProgressState) => void;
  /** Error callback */
  onError?: (error: ErrorState) => void;
  /** Enable debug logging */
  debug?: boolean;
  /** Timeout for FFmpeg operations in milliseconds */
  timeout?: number;
  /** Chunk size for large file processing in bytes */
  chunkSize?: number;
}
```

#### Methods

##### `initialize(): Promise<void>`

Initialize FFmpeg WASM. Must be called before any processing operations.

##### `extractMetadata(file: File): Promise<VideoMetadata>`

Extract comprehensive metadata from a video file.

**Parameters:**

- `file: File` - The video file to process

**Returns:** `VideoMetadata` object containing format info and stream details

##### `extractSubtitle(file: File, streamIndex: number, options?: ExtractionOptions): Promise<SubtitleExtractionResult>`

Extract a single subtitle track.

**Parameters:**

- `file: File` - The video file
- `streamIndex: number` - Index of the subtitle stream to extract
- `options?: ExtractionOptions` - Extraction configuration

**Options:**

```typescript
interface ExtractionOptions {
  /** Output format: 'srt' | 'ass' | 'vtt' | 'webvtt' */
  format?: string;
  /** Use quick extraction (faster but may be incomplete) */
  quick?: boolean;
  /** Timeout in milliseconds */
  timeout?: number;
  /** Custom filename */
  filename?: string;
}
```

**Returns:** `SubtitleExtractionResult` with subtitle data and metadata

##### `extractAllSubtitles(file: File): Promise<BatchExtractionResult>`

Extract all subtitle tracks and create a ZIP file.

**Parameters:**

- `file: File` - The video file

**Returns:** `BatchExtractionResult` with all extracted subtitles and ZIP blob

##### `downloadFile(data: Uint8Array, filename: string, progressCallback?: (progress: number) => void): void`

Download a subtitle file (handles large files automatically).

##### `downloadZip(zipBlob: Blob, filename: string): void`

Download a ZIP file containing multiple subtitles.

##### `isInitialized(): boolean`

Check if FFmpeg is initialized and ready for use.

##### `getSupportedFormats(): string[]`

Get list of supported video file formats.

##### `terminate(): Promise<void>`

Clean up resources and terminate FFmpeg.

## Data Types

### VideoMetadata

Contains comprehensive video file information:

```typescript
interface VideoMetadata {
  format?: {
    filename: string;
    format_name: string;
    duration: string;
    size: string;
    bit_rate: string;
    fps: string;
    movietimems: string;
    movieframes: string;
  };
  streams?: Array<{
    codec_type: string;
    codec_name: string;
    width?: number;
    height?: number;
    language?: string;
    forced?: boolean;
    default?: boolean;
    index?: number;
    // ... additional stream properties
  }>;
}
```

### SubtitleExtractionResult

Result from individual subtitle extraction:

```typescript
interface SubtitleExtractionResult {
  data: Uint8Array;
  filename: string;
  extension: string;
  size: number;
  preview: string;
}
```

### BatchExtractionResult

Result from batch subtitle extraction:

```typescript
interface BatchExtractionResult {
  extractedFiles: Array<{
    filename: string;
    data: Uint8Array;
    size: number;
    language?: string;
    forced?: boolean;
    streamIndex: number;
  }>;
  zipBlob: Blob;
  zipFilename: string;
  totalStreams: number;
  successfulExtractions: number;
}
```

## Supported Formats

The library supports a wide range of video formats including:

**Video Formats:** MP4, AVI, MOV, MKV, WebM, FLV, WMV, MPG, MPEG, TS, M2TS, OGV, GIF, and many more

**Subtitle Formats:** SRT, ASS, VTT, WebVTT

## Memory Management

The library uses advanced chunked processing to handle files of any size:

- **Chunked Reading**: Files are processed in 500MB chunks to maintain consistent memory usage
- **Progressive Extraction**: Large files use progressive chunk extraction for subtitle processing
- **Automatic Cleanup**: Temporary files are automatically cleaned up after operations
- **Memory-Safe**: Peak memory usage stays around 500MB regardless of file size

## Browser Compatibility

Requires modern browsers with WebAssembly support:

- Chrome 57+
- Firefox 52+
- Safari 11+
- Edge 79+

## React Integration

For React applications, you can also use the included React hook:

```typescript
import { useVideoMetadata } from '@opensubtitles/video-metadata-extractor';

function VideoProcessor() {
  const {
    metadata,
    progress,
    error,
    isLoaded,
    handleFileSelect,
    extractSubtitle,
    extractAllSubtitles
  } = useVideoMetadata();

  const handleFile = (file: File) => {
    handleFileSelect(file);
  };

  // ... component logic
}
```

## Examples

### Basic Metadata Extraction

```typescript
import { VideoMetadataExtractor } from '@opensubtitles/video-metadata-extractor';

const extractor = new VideoMetadataExtractor();
await extractor.initialize();

// Read the selected file from a file input; bail out if nothing is selected
const input = document.querySelector<HTMLInputElement>('input[type="file"]');
const file = input?.files?.[0];
if (!file) throw new Error('No file selected');

const metadata = await extractor.extractMetadata(file);

// Report the resolution of the first video stream (stream 0 is not always video)
const video = metadata.streams?.find(s => s.codec_type === 'video');

console.log(`Duration: ${metadata.format?.duration} seconds`);
console.log(`Resolution: ${video?.width}x${video?.height}`);
console.log(`Subtitle tracks: ${metadata.streams?.filter(s => s.codec_type === 'subtitle').length}`);
```

### Subtitle Extraction with Progress

```typescript
const progressEl = document.getElementById('progress');

const extractor = new VideoMetadataExtractor({
  onProgress: (progress) => {
    // Skip the update if the progress element is missing from the page
    if (progressEl) {
      progressEl.textContent = `${progress.progress}% - ${progress.text}`;
    }
  }
});

await extractor.initialize();

// Quick extraction for fast results
const quickSubtitle = await extractor.extractSubtitle(file, 2, { format: 'srt', quick: true });

// Full extraction for complete results
const fullSubtitle = await extractor.extractSubtitle(file, 2, { format: 'srt', quick: false });
```

### Batch Processing

```typescript
const extractor = new VideoMetadataExtractor({ debug: true });
await extractor.initialize();

// Extract all subtitles
const batch = await extractor.extractAllSubtitles(file);

console.log(`Successfully extracted ${batch.successfulExtractions} out of ${batch.totalStreams} subtitle tracks:`);
batch.extractedFiles.forEach(subtitle => {
  console.log(`- ${subtitle.filename} (${subtitle.language || 'unknown'}) - ${subtitle.size} bytes`);
});

// Download the ZIP file
extractor.downloadZip(batch.zipBlob, batch.zipFilename);
```

## Error Handling

```typescript
const extractor = new VideoMetadataExtractor({
  onError: (error) => {
    console.error('Extraction error:', error.message);
    // Handle error in UI
  }
});

try {
  await extractor.initialize();
  const metadata = await extractor.extractMetadata(file);
} catch (error) {
  // Caught values are `unknown` under strict TypeScript, so narrow before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  console.error('Failed to process video:', message);
}
```

## License

MIT

## Contributing

This package is part of the [OpenSubtitles](https://www.opensubtitles.org) ecosystem. Contributions are welcome!

## Links

- [GitHub Repository](https://github.com/opensubtitles/video-metadata-extractor-js)
- [Live Demo](https://opensubtitles.github.io/video-metadata-extractor-js/)
- [OpenSubtitles.org](https://www.opensubtitles.org)
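
## Choosing a Stream Index (Sketch)

The examples in this README pass a fixed stream index (`2`) to `extractSubtitle`. In practice the index probably needs to come from the metadata. The following is a minimal sketch of one way to pick it, not part of the package: `pickSubtitleStream`, `StreamInfo`, and `MetadataLike` are hypothetical names, the stream shape is copied from the documented `VideoMetadata` fields, and it assumes that a stream's `index` is the value `extractSubtitle` expects and that `language` holds codes such as `'eng'`.

```typescript
// Local copy of the stream fields documented under "VideoMetadata".
interface StreamInfo {
  codec_type: string;
  codec_name: string;
  language?: string;
  forced?: boolean;
  default?: boolean;
  index?: number;
}

interface MetadataLike {
  streams?: StreamInfo[];
}

// Hypothetical helper: choose a subtitle stream for a preferred language.
// Order of preference: non-forced track in that language, any track in that
// language, then the first subtitle track found.
function pickSubtitleStream(
  metadata: MetadataLike,
  preferredLanguage: string
): StreamInfo | undefined {
  // Keep only subtitle streams that actually carry an index.
  const subtitles = (metadata.streams ?? []).filter(
    (s) => s.codec_type === 'subtitle' && typeof s.index === 'number'
  );
  return (
    subtitles.find((s) => s.language === preferredLanguage && !s.forced) ??
    subtitles.find((s) => s.language === preferredLanguage) ??
    subtitles[0]
  );
}

// Made-up sample data for illustration only.
const sample: MetadataLike = {
  streams: [
    { codec_type: 'video', codec_name: 'h264', index: 0 },
    { codec_type: 'audio', codec_name: 'aac', language: 'eng', index: 1 },
    { codec_type: 'subtitle', codec_name: 'subrip', language: 'eng', forced: true, index: 2 },
    { codec_type: 'subtitle', codec_name: 'subrip', language: 'eng', index: 3 },
    { codec_type: 'subtitle', codec_name: 'ass', language: 'fre', index: 4 }
  ]
};

const chosen = pickSubtitleStream(sample, 'eng');
console.log('Chosen stream index:', chosen?.index);
```

If the assumption about `index` holds, the result could be passed on as `extractor.extractSubtitle(file, chosen.index, { format: 'srt' })` after checking that `chosen` and `chosen.index` are defined. Preferring non-forced tracks is a design choice of this sketch, since forced tracks usually contain only a subset of the dialogue.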