# AudioPod SDK for Node.js and React

Professional Audio Processing SDK powered by AI. Easy-to-use JavaScript/TypeScript client for the AudioPod API.

[![npm version](https://badge.fury.io/js/%40audiopod%2Fsdk.svg)](https://badge.fury.io/js/%40audiopod%2Fsdk) [![TypeScript](https://img.shields.io/badge/%3C%2F%3E-TypeScript-%230074c1.svg)](http://www.typescriptlang.org/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- 🎤 **Voice Cloning & TTS** - Clone voices, multi-voice TTS, voice collections, and voice conversion
- 🎵 **Music Generation** - Complete music production suite with all AudioPod AI tools
- 📝 **Transcription** - Convert speech to text with speaker diarization and timestamps
- 🌍 **Translation** - Speech-to-speech translation in 20+ languages
- 👥 **Speaker Analysis** - Speaker diarization, identification, and extraction
- 🔧 **Audio Enhancement** - Denoise, stem extraction, and professional audio processing
- 🎬 **Karaoke Generation** - Create karaoke videos with lyric timing
- 🎙️ **Professional Audio** - Mixing, mastering, DDSP synthesis, and audio restoration
- 🤖 **AI Copilot** - Intelligent content creation, document processing, and workflow automation
- 📹 **Video Download** - Multi-platform video download and conversion
- 🔑 **API Key Management** - Create, manage, and revoke API keys
- 💰 **Credit Management** - Monitor usage, credits, and pay-as-you-go purchases

## Installation

```bash
npm install audiopod-sdk
```

## Quick Start

### Node.js

```javascript
const { AudioPodClient } = require('audiopod-sdk');

// Initialize client
const client = new AudioPodClient({
  apiKey: 'ap_your_api_key_here' // Get from https://app.audiopod.ai/dashboard
});

// Generate voice (unified approach for both cloning and TTS)
const result = await client.voice.generateVoice({
  text: 'Hello from AudioPod!',
  voiceFile: 'path/to/voice.wav', // For voice cloning
  // OR voiceId: 'profile-id',    // For existing voice profiles
  language: 'en',
  audioFormat: 'mp3',
  generationParams: { speed: 1.0, denoise: true },
  waitForCompletion: true
});

console.log('Generated audio:', result.outputUrl);
```

### React

```jsx
import { useVoiceGeneration, useCredits } from 'audiopod-sdk/react';

function VoiceApp() {
  const apiKey = process.env.REACT_APP_AUDIOPOD_API_KEY;

  // Use voice generation hook (unified approach)
  const { generateVoice, isLoading, progress, result, error } = useVoiceGeneration(apiKey, {
    onProgress: (progress) => console.log('Progress:', progress),
    onSuccess: (result) => console.log('Completed:', result)
  });

  // Check credits
  const { credits } = useCredits(apiKey);

  const handleGenerate = async () => {
    await generateVoice({
      text: 'Hello from React!',
      voiceFile: selectedFile, // For voice cloning
      language: 'en',
      audioFormat: 'mp3',
      generationParams: { speed: 1.0, denoise: true }
    });
  };

  return (
    <div>
      <p>Credits: {credits?.totalAvailableCredits}</p>
      <button onClick={handleGenerate} disabled={isLoading}>
        {isLoading ? `Generating... ${progress}%` : 'Generate Voice'}
      </button>
      {result && <audio src={result.outputUrl} controls />}
    </div>
  );
}
```

## API Reference

### AudioPodClient

Main client class for interacting with the AudioPod API.
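Since every call requires an API key, it can help to resolve and sanity-check the key once at startup. A minimal sketch — the `resolveApiKey` helper is hypothetical and not part of the SDK; it only assumes the `AUDIOPOD_API_KEY` environment variable and the `ap_` key prefix shown elsewhere in this README:

```javascript
// Hypothetical helper (not part of the SDK): read the key from the
// environment and fail fast if it is missing or malformed.
function resolveApiKey(env = process.env) {
  const key = env.AUDIOPOD_API_KEY;
  if (typeof key !== 'string' || !key.startsWith('ap_')) {
    throw new Error(
      'Set AUDIOPOD_API_KEY to a key starting with "ap_" ' +
      '(create one at https://app.audiopod.ai/dashboard)'
    );
  }
  return key;
}

// Usage: const client = new AudioPodClient({ apiKey: resolveApiKey() });
```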
```javascript
const client = new AudioPodClient({
  apiKey: 'ap_your_api_key',
  baseURL: 'https://api.audiopod.ai', // Optional
  timeout: 30000,                     // Optional (30s default)
  debug: false                        // Optional
});
```

### Voice Generation (Unified TTS & Cloning)

```javascript
// Generate voice with file cloning (unified endpoint)
const result = await client.voice.generateVoice({
  text: 'Text to generate',
  voiceFile: 'voice.wav',  // For voice cloning - file path, File object, or Blob
  language: 'en',          // Optional - source language
  audioFormat: 'mp3',      // Optional - 'mp3', 'wav', 'ogg'
  generationParams: {      // Optional - provider-specific parameters
    speed: 1.0,            // Speech speed (0.25-4.0)
    pitch: 1.0,            // Voice pitch adjustment (0.5-2.0)
    denoise: true,         // Preprocess voice sample for better quality
    temperature: 0.75,     // Generation randomness (0.1-1.0)
    topK: 30,              // Token selection diversity
    topP: 0.85             // Nucleus sampling threshold
  },
  waitForCompletion: true, // Optional
  timeout: 300000          // Optional
});

// Generate speech with existing voice profile (unified endpoint)
const speech = await client.voice.generateVoice({
  text: 'Hello world!',
  voiceId: 'profile-id-or-uuid', // For existing voice profiles
  language: 'en',
  audioFormat: 'mp3',
  generationParams: { speed: 1.0, temperature: 0.75, topK: 30, topP: 0.85 },
  waitForCompletion: true
});

// Create reusable voice profile (unchanged)
const profile = await client.voice.createVoiceProfile(
  'My Voice',
  'voice.wav',
  'Description', // Optional
  false,         // isPublic
  true           // waitForCompletion
);

// Backward compatibility methods (deprecated - use generateVoice instead)
const legacyClone = await client.voice.cloneVoice({
  voiceFile: 'voice.wav',
  text: 'Text to generate',
  // ... other options (internally uses generateVoice)
});

const legacySpeech = await client.voice.generateSpeech(
  'profile-id',
  'Hello world!',
  // ... other options (internally uses generateVoice)
);

// Voice conversion - change voice characteristics
const conversion = await client.voice.convertVoice({
  audioFile: 'source-audio.wav',
  targetVoiceId: 'voice-uuid-or-id',
  waitForCompletion: true
});

// Multi-voice TTS for dialogues
const dialogue = await client.voice.generateMultiVoiceTTS({
  segments: [
    { text: 'Hello there!', voiceId: 'voice1' },
    { text: 'How are you?', voiceId: 'voice2' },
    { text: 'I am fine, thank you!', voiceId: 'voice1' }
  ],
  mixMode: 'sequential', // 'sequential' or 'overlapping'
  silenceDuration: 0.5,  // seconds between segments
  normalizeVolume: true
});

// Stream voice generation for real-time audio
await client.voice.streamVoiceGeneration(
  voiceId,
  'Long text to generate...',
  {
    language: 'en',
    generationParams: { speed: 1.0, temperature: 0.75, topK: 30, topP: 0.85 },
    onProgress: (progress) => console.log(`Progress: ${progress}%`),
    onAudioChunk: (chunk) => {
      // Process audio chunk in real-time
      audioStream.write(chunk);
    },
    onComplete: (result) => console.log('Streaming complete'),
    onError: (error) => console.error('Stream error:', error)
  }
);

// Voice collections for organizing voices
const collection = await client.voice.createVoiceCollection({
  name: 'Family Voices',
  description: 'Collection of family member voices',
  isPublic: false
});
await client.voice.addVoicesToCollection(collection.id, [voice1.id, voice2.id]);
```

#### Voice Generation Parameters

The `generationParams` object allows fine-tuning of voice generation:

```javascript
const generationParams = {
  // Core parameters
  speed: 1.0,        // Speech speed (0.5-2.0, default: 1.0)
  temperature: 0.75, // Generation randomness (0.1-1.0, default: 0.75)

  // Audio quality parameters
  denoise: true,     // Preprocess voice sample (default: false)
  pitch: 1.0,        // Voice pitch adjustment (0.5-2.0, default: 1.0)

  // Advanced sampling parameters
  topK: 30,          // Token selection diversity (1-100, default: 30)
  topP: 0.85,        // Nucleus sampling threshold (0.1-1.0, default: 0.85)

  // Provider-specific parameters
  // Additional parameters may be available depending on the voice provider
};
```

### Music Generation

```javascript
// Text-to-music generation (with lyrics)
const music = await client.music.generateMusic({
  prompt: 'upbeat electronic dance music',
  lyrics: 'Verse 1: Dancing under lights\nChorus: Feel the beat tonight',
  duration: 120.0,       // seconds
  guidanceScale: 7.5,    // Optional (1.0-20.0)
  numInferenceSteps: 50, // Optional (20-100)
  seed: 12345,           // Optional
  waitForCompletion: true
});

// Generate rap music
const rap = await client.music.generateRap({
  lyrics: 'Your rap lyrics here',
  style: 'modern', // 'modern', 'classic', 'trap'
  tempo: 120,      // BPM (80-200)
  duration: 90.0,
  waitForCompletion: true
});

// Generate instrumental only
const instrumental = await client.music.generateInstrumental({
  prompt: 'jazz piano and saxophone',
  duration: 60,
  instruments: ['piano', 'saxophone'],
  key: 'C',
  tempo: 120,
  guidanceScale: 7.5
});

// Generate vocals from lyrics
const vocals = await client.music.generateVocals({
  lyrics: 'Amazing vocals with emotion',
  genre: 'pop',
  emotion: 'happy',
  gender: 'female',
  waitForCompletion: true
});

// Generate audio samples/loops
const samples = await client.music.generateSamples({
  prompt: 'heavy 808 drum pattern',
  duration: 8.0, // short loops
  sampleType: 'drums',
  tempo: 140
});

// Audio-to-audio transformation
const transformed = await client.music.transformAudio({
  audioFile: audioFile,   // File, Blob, or path
  prompt: 'make this sound like a rock song',
  referenceStrength: 0.7, // How much to keep from the original
  audioDuration: 120.0,
  guidanceScale: 15.0
});

// SongBloom - reference audio based generation
const songbloom = await client.music.generateWithReference({
  audioFile: referenceFile,
  lyrics: 'New lyrics for the melody',
  duration: 120.0,
  guidanceScale: 7.5,
  preserveMelody: true
});

// Music editing operations
// Retake - generate variations
const retake = await client.music.retakeMusic({
  sourceJobId:
    originalJob.id,
  variationStrength: 0.8, // How different the retake should be
  waitForCompletion: true
});

// Repaint - modify specific sections
const repainted = await client.music.repaintMusic({
  sourceJobId: originalJob.id,
  startTime: 30.0, // seconds
  endTime: 60.0,   // seconds
  newPrompt: 'make this section more energetic',
  waitForCompletion: true
});

// Extend existing music
const extended = await client.music.extendMusic({
  sourceJobId: originalJob.id,
  extendDuration: 30.0, // seconds to add
  direction: 'end',     // 'start' or 'end'
  waitForCompletion: true
});

// Advanced editing
const edited = await client.music.editMusic({
  sourceJobId: originalJob.id,
  edits: [
    { startTime: 0, endTime: 30, operation: 'fade_in' },
    { startTime: 60, endTime: 90, operation: 'replace', newPrompt: 'guitar solo' }
  ],
  waitForCompletion: true
});

// Social features
// Like/unlike tracks
await client.music.likeTrack(jobId);
await client.music.unlikeTrack(jobId);
await client.music.dislikeTrack(jobId);

// Share tracks
const shareInfo = await client.music.shareTrack({
  jobId: musicJob.id,
  platform: 'public',
  message: 'Check out my AI-generated music!'
});

// Get track statistics
const stats = await client.music.getTrackStats(jobId);
console.log(`Likes: ${stats.likes}, Shares: ${stats.shares}`);

// Comments
await client.music.addComment(jobId, 'Amazing track!');
const comments = await client.music.getComments(jobId);

// Get shared track (public access, no auth required)
const sharedTrack = await client.music.getSharedTrack(shareToken);
```

### Transcription

```javascript
// Transcribe audio file
const transcript = await client.transcription.transcribeAudio({
  audioFile: 'speech.wav',
  language: 'en',                 // Optional (auto-detect if not specified)
  modelType: 'whisperx',          // 'whisperx' or 'faster-whisper'
  enableSpeakerDiarization: true, // Optional
  enableWordTimestamps: true,     // Optional
  waitForCompletion: true
});

console.log('Transcript:', transcript.transcript);
console.log('Confidence:', transcript.confidenceScore);
console.log('Segments:', transcript.segments);

// Transcribe from URL
const urlTranscript = await client.transcription.transcribeUrl(
  'https://example.com/audio.mp3',
  { language: 'en', enableSpeakerDiarization: true }
);
```

### Translation

```javascript
// Translate audio
const translation = await client.translation.translateAudio({
  audioFile: 'english-speech.wav',
  targetLanguage: 'es', // Spanish
  sourceLanguage: 'en', // Optional (auto-detect)
  waitForCompletion: true
});

console.log('Translated audio:', translation.audioOutputUrl);
console.log('Video with subtitles:', translation.videoOutputUrl);
```

### Credit Management

```javascript
// Check credit balance
const credits = await client.credits.getCreditBalance();
console.log('Available credits:', credits.totalAvailableCredits);
console.log('Subscription credits:', credits.balance);
console.log('Pay-as-you-go credits:', credits.paygBalance);

// Get usage history
const usage = await client.credits.getUsageHistory();
console.log('Recent usage:', usage);

// Get credit multipliers for different services
const multipliers = await client.credits.getCreditMultipliers();
console.log('Voice cloning cost multiplier:', multipliers.voice_cloning);
console.log('Music generation cost multiplier:', multipliers.music_generation);

// Pay-as-you-go credit purchasing
// Get pricing information
const paygInfo = await client.credits.getPayAsYouGoInfo();
console.log(`${paygInfo.creditsPerDollar} credits per $1`);
console.log(`Min purchase: $${paygInfo.minAmountUsd}, Max: $${paygInfo.maxAmountUsd}`);

// Create checkout session for credit purchase
const checkout = await client.credits.createPayAsYouGoCheckout({
  amountUsd: 50 // Purchase $50 worth of credits
});

// Redirect user to checkout.sessionUrl or use checkout.sessionId with Stripe
console.log('Checkout URL:', checkout.sessionUrl);
console.log('Will receive:', checkout.creditsToReceive, 'credits');
```

### Professional Audio Processing

```javascript
// Create mixing project
const project = await client.audio.createMixingProject({
  projectName: 'My Mix',
  sampleRate: 48000,
  bitDepth: 24
});

// Add tracks to project
await client.audio.addTrackToProject(project.id, {
  trackName: 'Vocals',
  audioFile: vocalsFile,
  startTime: 0.0
});
await client.audio.addTrackToProject(project.id, {
  trackName: 'Instruments',
  audioFile: instrumentsFile,
  startTime: 0.0
});

// Add effects to tracks
await client.audio.addEffectToTrack(project.id, 'vocals-track-id', {
  effectType: 'compressor',
  parameters: { threshold: -12.0, ratio: 4.0, attack: 10.0, release: 100.0 }
});

// Mix project to final audio
const mixResult = await client.audio.mixProject(project.id, {
  outputFormat: 'wav',
  normalize: true,
  waitForCompletion: true
});

// Comprehensive audio analysis
const analysis = await client.audio.analyzeAudio({
  audioFile: audioFile,
  analysisTypes: ['spectral', 'loudness', 'pitch', 'rhythm'],
  waitForCompletion: true
});
console.log('Audio analysis:', analysis.results);

// Audio mastering with reference
const mastered = await client.audio.masterWithReference({
  targetAudio: unmasteredFile,
  referenceAudio: referenceFile,
  outputFormats: ['wav_24bit', 'mp3_320'],
  waitForCompletion: true
});

// Audio mastering with preset
const masteredPreset = await client.audio.masterWithPreset({
  targetAudio: unmasteredFile,
  preset: 'spotify_loudness', // 'spotify_loudness', 'youtube_loudness', 'apple_music'
  outputFormats: ['wav_16bit', 'wav_24bit'],
  waitForCompletion: true
});

// Audio restoration
const restored = await client.audio.restoreAudio({
  audioFile: damagedFile,
  restorationType: 'denoise', // 'denoise', 'declip', 'dehiss', 'decrackle'
  strength: 0.7,     // How aggressive the restoration should be
  preservation: 0.8, // How much to preserve original characteristics
  waitForCompletion: true
});

// DDSP neural synthesis
const ddspFeatures = await client.audio.extractDDSPFeatures({
  audioFile: audioFile,
  waitForCompletion: true
});

const synthesized = await client.audio.ddspSynthesize({
  synthType: 'harmonic', // 'harmonic', 'filtered_noise', 'additive'
  synthesisParams: {
    fundamental_frequency: 440.0,
    harmonics: [1.0, 0.5, 0.3, 0.2],
    noise_level: 0.1
  },
  waitForCompletion: true
});

// Timbre transfer using DDSP
const timbreTransfer = await client.audio.ddspTimbreTransfer({
  sourceAudio: sourceFile,
  targetAudio: targetFile,
  transferStrength: 0.8,
  waitForCompletion: true
});
```

### Supersmart AI Copilot

```javascript
// Create AI copilot session
const session = await client.copilot.createSession();
console.log('Session ID:', session.sessionId);
console.log('Capabilities:', session.capabilities);

// Send query to AI copilot
const response = await client.copilot.sendQuery({
  sessionId: session.sessionId,
  userQuery: 'I want to create an audiobook from my PDF document',
  attachments: [
    { type: 'document', file: pdfFile, description: 'Novel to convert to audiobook' }
  ]
});

// Process documents for content creation
const processed = await client.copilot.processDocuments({
  files: [pdfFile, docxFile],
  sessionId: session.sessionId,
  processingType: 'audiobook_preparation'
});

// Extract content from web URLs
const webContent = await client.copilot.extractWebContent({
  urls: [
    'https://example.com/article1',
    'https://example.com/article2'
  ],
  sessionId: session.sessionId,
  extractionType: 'podcast_script'
});

// Generate content with AI
const content = await client.copilot.generateContent({
  contentType: 'audiobook', // 'audiobook', 'podcast', 'meditation', 'educational'
  parameters: {
    chapters: 5,
    narratorStyle: 'professional',
    genre: 'fiction',
    targetAudience: 'adults'
  },
  sessionId: session.sessionId
});

// Execute complete workflow
const workflowExecution = await client.copilot.executeWorkflow({
  sessionId: session.sessionId,
  // Returns a stream of progress updates
  onProgress: (update) => {
    console.log('Workflow progress:', update.step, update.progress);
  }
});

// List user workflows
const workflows = await client.copilot.listWorkflows({ limit: 10, offset: 0 });

// Get workflow status
const workflowStatus = await client.copilot.getWorkflowStatus(workflowId);
```

### Video Download & Processing

```javascript
// Get video information
const videoInfo = await client.video.getVideoInfo({
  url: 'https://youtube.com/watch?v=example'
});
console.log('Title:', videoInfo.title);
console.log('Duration:', videoInfo.duration);
console.log('Available formats:', videoInfo.formats);

// Download video
const download = await client.video.downloadVideo({
  url: 'https://youtube.com/watch?v=example',
  format: 'mp4',        // 'mp4', 'webm', 'mkv'
  quality: '720p',      // '144p', '360p', '720p', '1080p', '4k'
  audioQuality: 'high', // 'low', 'medium', 'high'
  extractAudio: false,  // Set to true to get audio only
  waitForCompletion: true
});
console.log('Download URL:', download.downloadUrl);
console.log('File size:', download.fileSize);

// Bulk download multiple videos
const bulkDownload = await client.video.bulkDownload({
  urls: [
    'https://youtube.com/watch?v=video1',
    'https://youtube.com/watch?v=video2',
    'https://tiktok.com/@user/video/123'
  ],
  format: 'mp4',
  quality: '720p',
  waitForCompletion: false // Get job IDs for tracking
});

// List download jobs
const downloads = await client.video.listDownloads({
  status: 'completed', // 'pending', 'processing', 'completed', 'failed'
  platform: 'youtube', // 'youtube', 'tiktok', 'instagram', 'vimeo'
  page: 1,
  perPage: 20
});

// Get download job status
const downloadJob = await client.video.getDownload(jobId);

// Get supported platforms and formats
const formats = await client.video.getSupportedFormats();
console.log('Supported platforms:', Object.keys(formats));
console.log('Video formats:', formats.video);
console.log('Audio formats:', formats.audio);

// Public download (no authentication required)
const publicDownload = await AudioPodClient.downloadVideoPublic({
  url: 'https://youtube.com/watch?v=example',
  format: 'mp3', // For audio extraction
  quality: 'high'
});
```

### API Key Management

```javascript
// Create new API key
const apiKey = await client.auth.createApiKey({
  name: 'Production Key',
  description: 'API key for production environment',
  scopes: ['voice:read', 'voice:write', 'music:read', 'music:write'], // Optional
  expiresInDays: 365 // Optional, defaults to never expire
});
console.log('New API key:', apiKey.key); // Only shown once!
console.log('Key ID:', apiKey.id);

// List API keys
const apiKeys = await client.auth.listApiKeys({
  status: 'active' // 'active', 'revoked', 'all'
});
apiKeys.forEach(key => {
  console.log(`${key.name}: ${key.status} (Created: ${key.createdAt})`);
});

// Revoke API key
await client.auth.revokeApiKey(keyId);

// Re-enable revoked API key
await client.auth.unrevokeApiKey(keyId);

// Permanently delete API key
await client.auth.deleteApiKey(keyId);

// Update API key (name, description only)
const updatedKey = await client.auth.updateApiKey(keyId, {
  name: 'Updated Production Key',
  description: 'Updated description'
});
```

### Job Management

```javascript
// Check job status
const job = await client.getJobStatus(jobId);
console.log('Status:', job.status);     // 'pending', 'processing', 'completed', 'failed'
console.log('Progress:', job.progress); // 0-100

// Wait for job completion with progress monitoring
const result = await client.waitForJobCompletion(
  jobId,
  300000, // 5 minutes timeout
  5000,   // Poll every 5 seconds
  (job) => { console.log(`Progress: ${job.progress}%`); }
);

// Cancel a job
await client.cancelJob(jobId);
```

## React Hooks

The SDK provides React hooks for easy integration with all services:

### useVoiceGeneration

```jsx
const {
  generateVoice,
  cancel,
  isLoading,
  progress,
  result,
  error,
  isInitialized
} = useVoiceGeneration(apiKey, {
  onProgress: (progress) => console.log(progress),
  onSuccess: (result) => console.log(result),
  onError: (error) => console.error(error),
  pollInterval: 5000 // Optional
});

// Usage for voice cloning
const handleClone = async () => {
  await generateVoice({
    text: 'Text to generate',
    voiceFile: selectedFile,
    language: 'en',
    audioFormat: 'mp3'
  });
};

// Usage for TTS with existing voice
const handleTTS = async () => {
  await generateVoice({
    text: 'Text to generate',
    voiceId: 'existing-voice-id',
    language: 'en',
    audioFormat: 'mp3'
  });
};

// Backward compatibility hook (deprecated)
const {
  cloneVoice, // Now uses generateVoice internally
  // ... other properties
} = useVoiceCloning(apiKey, options);
```

### useMusicGeneration

```jsx
const {
  generateMusic,
  generateRap,
  generateInstrumental,
  generateVocals,
  generateSamples,
  retakeMusic,
  extendMusic,
  cancel,
  isLoading,
  progress,
  result,
  error
} = useMusicGeneration(apiKey, {
  onProgress: (progress) => console.log(progress),
  onSuccess: (result) => console.log(result)
});

// Usage
const handleGenerateMusic = async () => {
  await generateMusic({
    prompt: 'Epic orchestral music',
    duration: 120,
    waitForCompletion: true
  });
};
```

### useTranscription

```jsx
const {
  transcribeAudio,
  transcribeUrl,
  cancel,
  isLoading,
  progress,
  result,
  error
} = useTranscription(apiKey, {
  onProgress: (progress) => console.log(progress),
  onSuccess: (result) => console.log(result)
});
```

### useCredits

```jsx
const { credits, isLoading, error, refetch, purchaseCredits } = useCredits(apiKey, {
  onCreditsUpdated: (newCredits) => console.log('Credits updated:', newCredits)
});

// Usage
const handlePurchase = async () => {
  const checkout = await purchaseCredits(50); // $50 purchase
  window.location.href = checkout.sessionUrl;
};
```

### useProfessionalAudio

```jsx
const {
  createMixingProject,
  analyzeAudio,
  masterAudio,
  restoreAudio,
  isLoading,
  progress,
  result,
  error
} = useProfessionalAudio(apiKey, {
  onProgress: (progress) => console.log(progress),
  onSuccess: (result) => console.log(result)
});
```

### useSupersmart

```jsx
const {
  createSession,
  sendQuery,
  executeWorkflow,
  uploadDocuments,
  isLoading,
  progress,
  result,
  error,
  sessionId
} = useSupersmart(apiKey, {
  onSessionCreated: (session) => console.log('Session created:', session.sessionId),
  onWorkflowProgress: (update) => console.log('Workflow progress:', update)
});
```

### useVideoDownload

```jsx
const {
  getVideoInfo,
  downloadVideo,
  bulkDownload,
  isLoading,
  progress,
  result,
  error
} = useVideoDownload(apiKey, {
  onProgress: (progress) => console.log(progress),
  onSuccess: (result) => console.log(result)
});
```

### useApiKeys

```jsx
const { apiKeys, createApiKey, revokeApiKey, isLoading, error, refetch } = useApiKeys(apiKey, {
  onKeyCreated: (newKey) => console.log('New API key created:', newKey.id),
  onKeyRevoked: (keyId) => console.log('API key revoked:', keyId)
});
```

### Hook Composition Example

```jsx
function AudioProductionApp() {
  const apiKey = process.env.REACT_APP_AUDIOPOD_API_KEY;

  // Multiple hooks for complete audio production
  const { credits } = useCredits(apiKey);
  const voiceCloning = useVoiceCloning(apiKey);
  const musicGeneration = useMusicGeneration(apiKey);
  const audioProcessing = useProfessionalAudio(apiKey);
  const supersmart = useSupersmart(apiKey);

  const createCompleteProduction = async () => {
    // Step 1: Create AI content with Supersmart
    const { sessionId } = await supersmart.createSession();
    const contentPlan = await supersmart.sendQuery({
      sessionId,
      userQuery: 'Create a 3-minute song with vocals'
    });

    // Step 2: Generate music
    const musicResult = await musicGeneration.generateMusic({
      prompt: contentPlan.musicPrompt,
      duration: 180,
      waitForCompletion: true
    });

    // Step 3: Generate vocals
    const vocalsResult = await musicGeneration.generateVocals({
      lyrics: contentPlan.lyrics,
      waitForCompletion: true
    });

    // Step 4: Professional mixing
    const mixingProject = await audioProcessing.createMixingProject({
      projectName: 'AI Song Production'
    });

    // Add tracks and mix
    // ...
    // mixing logic
    return mixingProject;
  };

  return (
    <div>
      <h1>AI Audio Production Studio</h1>
      <p>Credits Available: {credits?.totalAvailableCredits}</p>
      <button
        onClick={createCompleteProduction}
        disabled={voiceCloning.isLoading || musicGeneration.isLoading}
      >
        Create Complete Production
      </button>

      {/* Progress indicators for all services */}
      {voiceCloning.isLoading && <div>Voice processing: {voiceCloning.progress}%</div>}
      {musicGeneration.isLoading && <div>Music generation: {musicGeneration.progress}%</div>}
      {audioProcessing.isLoading && <div>Audio processing: {audioProcessing.progress}%</div>}
    </div>
  );
}
```

## Error Handling

The SDK provides comprehensive error handling with detailed error codes:

```javascript
try {
  const result = await client.voice.cloneVoice({
    voiceFile: 'voice.wav',
    text: 'Hello world!',
    generationParams: { speed: 1.0, denoise: true }
  });
} catch (error) {
  switch (error.code) {
    case 'AUTHENTICATION_ERROR':
      console.log('Invalid API key or authentication failed');
      break;
    case 'INSUFFICIENT_CREDITS':
      console.log('Not enough credits to complete this operation');
      break;
    case 'RATE_LIMIT_ERROR':
      console.log('Rate limit exceeded, please try again later');
      break;
    case 'FILE_TOO_LARGE':
      console.log('File exceeds maximum size limit');
      break;
    case 'INVALID_AUDIO_FORMAT':
      console.log('Unsupported audio format');
      break;
    case 'PROCESSING_ERROR':
      console.log('Error during audio processing');
      break;
    case 'QUOTA_EXCEEDED':
      console.log('API quota exceeded');
      break;
    case 'VALIDATION_ERROR':
      console.log('Input validation failed:', error.details);
      break;
    case 'NETWORK_ERROR':
      console.log('Network connectivity issue');
      break;
    case 'TIMEOUT_ERROR':
      console.log('Operation timed out');
      break;
    case 'NOT_FOUND':
      console.log('Resource not found');
      break;
    case 'PERMISSION_DENIED':
      console.log('Permission denied for this operation');
      break;
    default:
      console.log('Unknown error:', error.message);
  }

  // Additional error information
  if (error.statusCode) {
    console.log('HTTP Status:', error.statusCode);
  }
  if (error.details) {
    console.log('Error details:', error.details);
  }
}
```

### Common Error Codes

| Code | Description | Common Causes |
|------|-------------|---------------|
| `AUTHENTICATION_ERROR` | Invalid or missing API key | Wrong API key, expired key |
| `INSUFFICIENT_CREDITS` | Not enough credits | Low credit balance, operation cost exceeds balance |
| `RATE_LIMIT_ERROR` | Too many requests | Exceeded rate limits, need to slow down requests |
| `FILE_TOO_LARGE` | File size exceeds limits | Audio >100MB, Video >200MB |
| `INVALID_AUDIO_FORMAT` | Unsupported file format | File format not in supported list |
| `PROCESSING_ERROR` | Audio processing failed | Corrupted file, incompatible audio |
| `VALIDATION_ERROR` | Input validation failed | Invalid parameters, missing required fields |
| `TIMEOUT_ERROR` | Operation timed out | Long processing time, network issues |
| `NOT_FOUND` | Resource not found | Invalid job ID, deleted resource |
| `PERMISSION_DENIED` | Access denied | Trying to access another user's resources |

### Error Recovery Strategies

```javascript
// Retry logic for transient errors
async function retryOperation(operation, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await operation();
    } catch (error) {
      if (error.code === 'RATE_LIMIT_ERROR' && i < maxRetries - 1) {
        // Wait and retry for rate limit errors
        const delay = Math.pow(2, i) * 1000; // Exponential backoff
        console.log(`Rate limited, retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      if (error.code === 'NETWORK_ERROR' && i < maxRetries - 1) {
        // Retry network errors
        console.log(`Network error, retrying attempt ${i + 2}...`);
        await new Promise(resolve => setTimeout(resolve, 1000));
        continue;
      }
      // Don't retry for these error types
      if (['AUTHENTICATION_ERROR', 'INSUFFICIENT_CREDITS', 'VALIDATION_ERROR'].includes(error.code)) {
        throw error;
      }
      // Last attempt or non-retryable error
      if (i === maxRetries - 1) {
        throw error;
      }
    }
  }
}

// Usage
try {
  const result = await retryOperation(() =>
    client.music.generateMusic({ prompt: 'Epic orchestral music', duration: 120 })
  );
} catch (error) {
  console.error('Operation failed after retries:', error);
}
```

## Configuration

### Environment Variables

Set your API key using environment variables:

```bash
# For Node.js
export AUDIOPOD_API_KEY=ap_your_api_key_here

# For React
REACT_APP_AUDIOPOD_API_KEY=ap_your_api_key_here
```

### Client Configuration

```javascript
const client = new AudioPodClient({
  apiKey: process.env.AUDIOPOD_API_KEY,
  baseURL: 'https://api.audiopod.ai', // Custom API endpoint
  timeout: 60000,                     // Request timeout (60s)
  maxRetries: 3,                      // Max retry attempts
  debug: true                         // Enable debug logging
});

// Update API key
client.updateApiKey('new_api_key');

// Enable/disable debug mode
client.setDebug(true);

// Check health
const health = await client.checkHealth();
console.log('API Status:', health.status);
```

## File Upload Support

The SDK supports various file input methods with proper validation:

```javascript
// File path (Node.js)
await client.voice.cloneVoice({
  voiceFile: '/path/to/voice.wav',
  text: 'Hello',
  generationParams: { speed: 1.0, denoise: true }
});

// File object (Browser)
const fileInput = document.querySelector('input[type="file"]');
await client.voice.cloneVoice({
  voiceFile: fileInput.files[0],
  text: 'Hello',
  generationParams: { speed: 1.0, temperature: 0.75 }
});

// Blob object
const blob = new Blob([audioData], { type: 'audio/wav' });
await client.voice.cloneVoice({
  voiceFile: blob,
  text: 'Hello',
  generationParams: { denoise: true, pitch: 1.0 }
});
```

### File Specifications

#### Audio Files

- **Maximum size**: 100MB
- **Supported formats**: WAV, MP3, OGG, FLAC, M4A
- **Recommended format**: WAV (best quality)
- **Sample rate**: 16kHz or higher recommended
- **Bit depth**: 16-bit or 24-bit
- **Channels**: Mono or stereo

#### Video Files

- **Maximum size**: 200MB
- **Supported formats**:
MP4, MOV, AVI, MKV, WEBM - **Use case**: Translation with video output, karaoke generation #### Voice Cloning Recommendations - **Duration**: 10-30 seconds for optimal quality - **Quality**: Clean, noise-free audio - **Content**: Clear speech without background music - **Language**: Single language per sample - **Speaker**: Single speaker per sample #### Music Generation Recommendations - **Reference audio**: High-quality stereo preferred - **Duration**: Any length supported - **Formats**: WAV or high-quality MP3 (320kbps) ### Upload Progress Tracking ```javascript // Track upload progress await client.voice.createVoiceProfile( 'My Voice', largeAudioFile, 'High quality voice sample', false, true, { onProgress: (progress) => { console.log(`Upload progress: ${progress.percentage}%`); console.log(`${progress.loaded} / ${progress.total} bytes`); } } ); ``` ## TypeScript Support The SDK is written in TypeScript and provides full type definitions: ```typescript import { AudioPodClient, VoiceCloneRequest, VoiceCloneResult } from 'audiopod-sdk'; const client = new AudioPodClient({ apiKey: 'your-key' }); const request: VoiceCloneRequest = { voiceFile: 'voice.wav', text: 'Hello TypeScript!', language: 'en', generationParams: { speed: 1.0, denoise: true, temperature: 0.75 } }; const result: VoiceCloneResult = await client.voice.cloneVoice(request); ``` ## Advanced Examples ### Batch Processing ```javascript // Process multiple files const files = ['file1.wav', 'file2.mp3', 'file3.m4a']; const jobs = []; // Start all jobs for (const file of files) { const result = await client.transcription.transcribeAudio({ audioFile: file, waitForCompletion: false }); jobs.push({ file, jobId: result.job.id }); } // Wait for all to complete for (const { file, jobId } of jobs) { const result = await client.waitForJobCompletion(jobId); console.log(`${file}: ${result.transcript}`); } ``` ### Complete Audiobook Production Pipeline ```javascript // Complete workflow: Document Audiobook with 
multiple voices
async function createAudiobook(documentFile, narratorSample, characterSample) {
  // Step 1: Use Supersmart to analyze and structure content
  const session = await client.copilot.createSession();
  const contentAnalysis = await client.copilot.processDocuments({
    files: [documentFile],
    sessionId: session.sessionId,
    processingType: 'audiobook_preparation'
  });

  // Step 2: Generate narrator and character voices
  const narratorVoice = await client.voice.createVoiceProfile(
    'Narrator Voice',
    narratorSample,
    'Professional audiobook narrator',
    false,
    true
  );

  const characterVoice = await client.voice.createVoiceProfile(
    'Character Voice',
    characterSample,
    'Character dialogue voice',
    false,
    true
  );

  // Step 3: Generate background music
  const backgroundMusic = await client.music.generateInstrumental({
    prompt: 'soft ambient background music for audiobook',
    duration: 300, // 5 minutes base track
    tempo: 60,
    instruments: ['piano', 'strings'],
    waitForCompletion: true
  });

  // Step 4: Generate audio segments with multiple voices
  const audioSegments = [];

  for (const chapter of contentAnalysis.chapters) {
    const narrativeAudio = await client.voice.generateSpeech(
      narratorVoice.id,
      chapter.narrativeText,
      {
        speed: 0.9, // Slightly slower for audiobooks
        audioFormat: 'wav',
        waitForCompletion: true
      }
    );

    const dialogueAudio = await client.voice.generateSpeech(
      characterVoice.id,
      chapter.dialogueText,
      {
        speed: 1.0,
        audioFormat: 'wav',
        waitForCompletion: true
      }
    );

    audioSegments.push({ narrative: narrativeAudio, dialogue: dialogueAudio });
  }

  // Step 5: Professional mixing and mastering
  const mixingProject = await client.audio.createMixingProject({
    projectName: `Audiobook: ${contentAnalysis.title}`,
    sampleRate: 44100,
    bitDepth: 24
  });

  // Add all segments to the mixing project
  for (let i = 0; i < audioSegments.length; i++) {
    const segment = audioSegments[i];

    await client.audio.addTrackToProject(mixingProject.id, {
      trackName: `Chapter ${i + 1} Narrative`,
      audioFile: segment.narrative.outputUrl,
      startTime: i * 600 // 10 minutes per chapter
    });

    if (segment.dialogue) {
      await client.audio.addTrackToProject(mixingProject.id, {
        trackName: `Chapter ${i + 1} Dialogue`,
        audioFile: segment.dialogue.outputUrl,
        startTime: i * 600 + 300 // Offset dialogue
      });
    }
  }

  // Add background music track
  await client.audio.addTrackToProject(mixingProject.id, {
    trackName: 'Background Music',
    audioFile: backgroundMusic.outputUrl,
    startTime: 0
  });

  // Apply professional effects (use the track ID returned by addTrackToProject)
  await client.audio.addEffectToTrack(mixingProject.id, 'narrative-track-id', {
    effectType: 'compressor',
    parameters: { threshold: -18.0, ratio: 3.0 }
  });

  // Final mix and master
  const finalAudiobook = await client.audio.mixProject(mixingProject.id, {
    outputFormat: 'mp3',
    normalize: true,
    waitForCompletion: true
  });

  // Step 6: Apply professional mastering
  const masteredAudiobook = await client.audio.masterWithPreset({
    targetAudio: finalAudiobook.outputUrl,
    preset: 'audiobook_loudness',
    outputFormats: ['mp3_320', 'wav_24bit'],
    waitForCompletion: true
  });

  return {
    audiobook: masteredAudiobook,
    metadata: contentAnalysis,
    chapters: audioSegments.length,
    duration: audioSegments.length * 600 // Estimated duration
  };
}
```

### AI-Powered Podcast Creation

```javascript
// Create a complete podcast episode from web articles
async function createPodcastFromUrls(urls, hostVoiceFile, guestVoiceFile) {
  // Step 1: Extract and process web content
  const session = await client.copilot.createSession();
  const webContent = await client.copilot.extractWebContent({
    urls: urls,
    sessionId: session.sessionId,
    extractionType: 'podcast_script'
  });

  // Step 2: Generate podcast script with AI
  const podcastScript = await client.copilot.generateContent({
    contentType: 'podcast',
    parameters: {
      format: 'interview',
      duration: 30, // 30 minutes
      tone: 'conversational',
      includeIntro: true,
      includeOutro: true
    },
    sessionId: session.sessionId
  });

  // Step 3: Create voice profiles for host and guest
  const [hostVoice, guestVoice] = await Promise.all([
client.voice.createVoiceProfile('Podcast Host', hostVoiceFile, 'Professional podcast host voice'), client.voice.createVoiceProfile('Podcast Guest', guestVoiceFile, 'Expert guest voice') ]); // Step 4: Generate intro/outro music const introMusic = await client.music.generateMusic({ prompt: 'upbeat podcast intro music with energy', duration: 15, guidanceScale: 8.0, waitForCompletion: true }); const outroMusic = await client.music.generateMusic({ prompt: 'professional podcast outro music, fade out style', duration: 20, guidanceScale: 7.5, waitForCompletion: true }); // Step 5: Generate multi-voice dialogue const podcastAudio = await client.voice.generateMultiVoiceTTS({ segments: podcastScript.segments.map(segment => ({ text: segment.text, voiceId: segment.speaker === 'host' ? hostVoice.id : guestVoice.id })), mixMode: 'sequential', silenceDuration: 0.5, normalizeVolume: true, waitForCompletion: true }); // Step 6: Professional audio post-processing const enhancedAudio = await client.audio.restoreAudio({ audioFile: podcastAudio.outputUrl, restorationType: 'denoise', strength: 0.3, // Light denoising preservation: 0.9, waitForCompletion: true }); // Step 7: Final mixing with music const mixingProject = await client.audio.createMixingProject({ projectName: 'Podcast Episode', sampleRate: 44100, bitDepth: 16 // Standard podcast quality }); // Add intro music await client.audio.addTrackToProject(mixingProject.id, { trackName: 'Intro Music', audioFile: introMusic.outputUrl, startTime: 0 }); // Add main podcast content await client.audio.addTrackToProject(mixingProject.id, { trackName: 'Main Content', audioFile: enhancedAudio.outputUrl, startTime: 15 // After intro }); // Add outro music const mainDuration = podcastScript.estimatedDuration * 60; // Convert to seconds await client.audio.addTrackToProject(mixingProject.id, { trackName: 'Outro Music', audioFile: outroMusic.outputUrl, startTime: 15 + mainDuration }); // Final mix const finalPodcast = await 
client.audio.mixProject(mixingProject.id, { outputFormat: 'mp3', normalize: true, waitForCompletion: true }); return { podcast: finalPodcast, script: podcastScript, duration: mainDuration + 35, // Include intro/outro metadata: { title: podcastScript.title, description: podcastScript.description, sources: urls } }; } ``` ### Music Album Production ```javascript // Create a complete music album with consistent style async function createMusicAlbum(albumConcept) { const album = { tracks: [], metadata: albumConcept }; // Step 1: Use AI to plan the album structure const session = await client.copilot.createSession(); const albumPlan = await client.copilot.generateContent({ contentType: 'music_album', parameters: { genre: albumConcept.genre, trackCount: albumConcept.trackCount || 10, theme: albumConcept.theme, duration: albumConcept.duration || 45 // 45 minutes }, sessionId: session.sessionId }); // Step 2: Generate each track with consistent style for (let i = 0; i < albumPlan.tracks.length; i++) { const trackPlan = albumPlan.tracks[i]; let trackAudio; if (trackPlan.type === 'instrumental') { trackAudio = await client.music.generateInstrumental({ prompt: trackPlan.prompt, duration: trackPlan.duration, instruments: trackPlan.instruments, key: albumPlan.musicalKey, tempo: trackPlan.tempo, waitForCompletion: true }); } else if (trackPlan.type === 'vocal') { trackAudio = await client.music.generateMusic({ prompt: trackPlan.prompt, lyrics: trackPlan.lyrics, duration: trackPlan.duration, guidanceScale: 7.5, waitForCompletion: true }); } else if (trackPlan.type === 'rap') { trackAudio = await client.music.generateRap({ lyrics: trackPlan.lyrics, style: albumConcept.rapStyle || 'modern', tempo: trackPlan.tempo, duration: trackPlan.duration, waitForCompletion: true }); } // Apply consistent mastering to each track const masteredTrack = await client.audio.masterWithPreset({ targetAudio: trackAudio.outputUrl, preset: 'album_loudness', outputFormats: ['wav_24bit'], 
      waitForCompletion: true
    });

    album.tracks.push({
      number: i + 1,
      title: trackPlan.title,
      duration: trackPlan.duration,
      audio: masteredTrack,
      metadata: trackPlan
    });

    // Add a variation if needed
    if (trackPlan.needsVariation) {
      const variation = await client.music.retakeMusic({
        sourceJobId: trackAudio.job.id,
        variationStrength: 0.6,
        waitForCompletion: true
      });

      album.tracks.push({
        number: i + 1.5,
        title: `${trackPlan.title} (Alternative Version)`,
        duration: trackPlan.duration,
        audio: variation,
        metadata: { ...trackPlan, isVariation: true }
      });
    }
  }

  // Step 3: Create album transitions and continuous mix
  if (albumConcept.createContinuousMix) {
    const continuousMix = await createContinuousAlbumMix(album.tracks);
    album.continuousMix = continuousMix;
  }

  // Step 4: Generate album artwork (if integrated with image generation)
  // This would require additional image generation capabilities

  return album;
}

async function createContinuousAlbumMix(tracks) {
  const mixingProject = await client.audio.createMixingProject({
    projectName: 'Album Continuous Mix',
    sampleRate: 44100,
    bitDepth: 24
  });

  let currentTime = 0;

  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];

    // Add track to project
    await client.audio.addTrackToProject(mixingProject.id, {
      trackName: track.title,
      audioFile: track.audio.outputUrl,
      startTime: currentTime
    });

    // Overlap adjacent tracks by 3 seconds to allow a crossfade
    if (i < tracks.length - 1) {
      currentTime += track.duration - 3; // 3-second overlap for crossfade
    } else {
      currentTime += track.duration;
    }
  }

  // Render the final continuous mix
  const continuousMix = await client.audio.mixProject(mixingProject.id, {
    outputFormat: 'wav',
    normalize: true,
    waitForCompletion: true
  });

  return continuousMix;
}
```

### Streaming Audio Generation

```javascript
// Stream voice generation (Node.js with WebSocket)
const fs = require('fs');
const audioStream = fs.createWriteStream('output.wav'); // writable destination for streamed audio chunks

await client.voice.streamVoiceGeneration(
  voiceId,
  'Long text to generate...',
  {
    onProgress: (progress) => console.log(`Progress: ${progress}%`),
    onAudioChunk:
(chunk) => { // Process audio chunk in real-time audioStream.write(chunk); }, onComplete: (result) => { console.log('Streaming complete'); audioStream.end(); }, onError: (error) => console.error('Stream error:', error) } ); ``` ### Progress Monitoring ```javascript // Monitor job with custom polling const result = await client.music.generateMusic({ prompt: 'epic orchestral music', duration: 180, waitForCompletion: false }); const monitorJob = async (jobId) => { while (true) { const job = await client.getJobStatus(jobId); console.log(`Status: ${job.status}, Progress: ${job.progress}%`); if (job.status === 'completed') { return job.result; } else if (job.status === 'failed') { throw new Error(job.errorMessage); } await new Promise(resolve => setTimeout(resolve, 3000)); } }; const finalResult = await monitorJob(result.job.id); ``` ## Browser Support The SDK works in modern browsers with the following features: - **File Upload**: Native File API support - **Progress Tracking**: XMLHttpRequest progress events - **Async/Await**: ES2017+ support required - **WebSocket**: For real-time features (optional) For older browsers, use appropriate polyfills. ## Node.js Support - **Node.js 16+**: Required for modern JavaScript features - **File System**: Native fs module support - **Streams**: Node.js streams for large files - **WebSocket**: ws package for real-time features ## Contributing We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details. ## License MIT License - see [LICENSE](LICENSE) file for details. ## Links - [AudioPod Website](https://audiopod.ai) - [API Documentation](https://docs.audiopod.ai/) - [GitHub Repository](https://github.com/AudiopodAI/audiopod) - [npm Package](https://www.npmjs.com/package/audiopod-sdk) ## Support - 📧 Email: [support@audiopod.ai](mailto:support@audiopod.ai) - 💬 Discord: [AudioPod Community](https://discord.gg/audiopod)