# t-youtube-transcript-fetcher

A TypeScript library for fetching transcripts from YouTube videos. This is an enhanced version of [youtube-transcript](https://github.com/Kakulukian/youtube-transcript) with added proxy support and improved TypeScript integration.

## Features

- Fetch transcripts from YouTube videos using URL or video ID
- Support for multiple languages
- Proxy support with flexible configuration options
- Comprehensive error handling
- TypeScript support with full type definitions
- Promise-based API

[youtube-transcript test](https://thanhphuchuynh.github.io/youtube-transcript-fetcher)

## Installation

```bash
npm install t-youtube-transcript-fetcher
```

## Usage

```typescript
import { YoutubeTranscript } from 't-youtube-transcript-fetcher';
import { HttpsProxyAgent } from 'https-proxy-agent';

// Basic usage: Fetch transcript with default language
const transcript = await YoutubeTranscript.fetchTranscript(
  'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
);

// Fetch transcript in specific language
const spanishTranscript = await YoutubeTranscript.fetchTranscript(
  'dQw4w9WgXcQ', // Can use video ID directly
  { lang: 'es' }
);

// Using pre-configured proxy agent
const proxyAgent = new HttpsProxyAgent('https://username:password@proxy.example.com:8080');
const transcriptWithProxy = await YoutubeTranscript.fetchTranscript('VIDEO_ID', {
  proxyAgent: proxyAgent
});

// Alternative: Using proxy configuration object
const transcriptWithProxyConfig = await YoutubeTranscript.fetchTranscript('VIDEO_ID', {
  proxy: {
    host: 'http://proxy.example.com:8080',
    auth: {
      username: 'your-username',
      password: 'your-password'
    }
  }
});
```

## Example

Check out [examples/fetch-transcript.ts](examples/fetch-transcript.ts) for a complete example showing:

- Fetching transcripts with default language
- Fetching transcripts in specific languages
- Using proxy configuration
- Error handling

## API

### `YoutubeTranscript.fetchTranscript(videoId: string, config?: TranscriptConfig)`

Fetches the transcript for a YouTube video.

#### Parameters

- `videoId`: Video URL or ID
- `config` (optional): Configuration options
  - `lang`: ISO language code (e.g., 'en', 'es', 'fr')
  - `proxyAgent`: Pre-configured HttpsProxyAgent instance (takes precedence over proxy config)
  - `proxy`: Proxy configuration object
    - `host`: Proxy server URL (e.g., 'http://proxy.example.com:8080')
    - `auth`: Optional proxy authentication
      - `username`: Proxy username
      - `password`: Proxy password

#### Returns

Promise resolving to an array of transcript segments:

```typescript
interface TranscriptSegment {
  text: string;     // The text content
  duration: number; // Duration in seconds
  offset: number;   // Start time in seconds
  lang: string;     // Language code
}
```

#### Errors

The library throws specific errors for different cases:

- `TranscriptError`: Base error class
- `RateLimitError`: YouTube's rate limit exceeded
- `VideoUnavailableError`: Video doesn't exist or is private
- `TranscriptDisabledError`: Transcripts disabled for the video
- `NoTranscriptError`: No transcripts available
- `LanguageNotFoundError`: Requested language not available

## Testing

This project uses Jest for testing. The test suite includes:

- Unit tests for core functionality
- Error message formatting tests
- Proxy configuration tests

To run tests:

```bash
# Run tests
npm test

# Run tests with coverage report
npm run test:coverage
```

Test reports and coverage information are generated in the `coverage` directory:

- Test Report: `coverage/test-report.html`
- Coverage Report: `coverage/index.html`

## Project Structure

```
.
├── src/
│   ├── constants.ts    # Constants and regex patterns
│   ├── errors.ts       # Error classes
│   ├── types.ts        # TypeScript interfaces
│   ├── transcript.ts   # Main YoutubeTranscript class
│   └── index.ts        # Public exports
└── examples/
    └── fetch-transcript.ts  # Usage examples
```

## Error Handling

```typescript
try {
  const transcript = await YoutubeTranscript.fetchTranscript(videoId);
  // Process transcript
} catch (error) {
  if (error instanceof RateLimitError) {
    // Handle rate limiting
  } else if (error instanceof VideoUnavailableError) {
    // Handle unavailable video
  } else if (error instanceof TranscriptDisabledError) {
    // Handle disabled transcripts
  } else {
    // Handle other errors
  }
}
```

## Proxy Support

The library provides two ways to configure proxy support:

1. Using a pre-configured proxy agent (recommended for custom setups):

```typescript
import { HttpsProxyAgent } from 'https-proxy-agent';

const proxyAgent = new HttpsProxyAgent('https://username:password@proxy.example.com:8080');
const transcript = await YoutubeTranscript.fetchTranscript('VIDEO_ID', {
  proxyAgent: proxyAgent
});
```

2. Using the proxy configuration object:

```typescript
const transcript = await YoutubeTranscript.fetchTranscript('VIDEO_ID', {
  proxy: {
    host: 'http://proxy.example.com:8080',
    auth: { // Optional
      username: 'your-username',
      password: 'your-password'
    }
  }
});
```

The `proxyAgent` option takes precedence over the `proxy` configuration if both are provided. Both methods will apply the proxy to all HTTP requests made by the library.

## Credits

This project is an enhanced version of [youtube-transcript](https://github.com/Kakulukian/youtube-transcript) by [Kakulukian](https://github.com/Kakulukian), with added features including:

- Proxy support with flexible configuration
- Enhanced TypeScript support
- Improved error handling
- More detailed documentation

## License

MIT
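
## Formatting Transcript Segments

The API section describes the result as an array of segments with `text`, `duration`, `offset`, and `lang` fields. As a minimal sketch of what one might do with that array, the snippet below turns segments into timestamped lines. It assumes offsets are in seconds, as documented above. The helpers `formatTimestamp` and `toTimestampedText` are hypothetical and not part of the library, and the `sample` array stands in for a real `fetchTranscript` result.

```typescript
// Local copy of the segment shape documented in the API section.
interface TranscriptSegment {
  text: string;
  duration: number; // seconds
  offset: number;   // seconds
  lang: string;
}

// Hypothetical helper (not provided by the library): seconds -> "m:ss".
function formatTimestamp(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const minutes = Math.floor(total / 60);
  const secs = total % 60;
  return `${minutes}:${secs < 10 ? '0' + secs : String(secs)}`;
}

// Hypothetical helper: one "[m:ss] text" line per segment.
function toTimestampedText(segments: TranscriptSegment[]): string {
  return segments
    .map((s) => `[${formatTimestamp(s.offset)}] ${s.text.trim()}`)
    .join('\n');
}

// Sample data standing in for the array returned by fetchTranscript().
const sample: TranscriptSegment[] = [
  { text: 'Hello there', duration: 2.5, offset: 0, lang: 'en' },
  { text: 'Welcome back', duration: 3, offset: 65.2, lang: 'en' },
];

console.log(toTimestampedText(sample));
```

In real use, the `sample` array would be replaced by the value awaited from `YoutubeTranscript.fetchTranscript(...)`; keeping the formatting logic in plain functions makes it easy to test without network access.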