# AudioPod SDK for Node.js and React
Professional Audio Processing SDK powered by AI. Easy-to-use JavaScript/TypeScript client for the AudioPod API.
[npm](https://badge.fury.io/js/%40audiopod%2Fsdk) · [TypeScript](https://www.typescriptlang.org/) · [MIT License](https://opensource.org/licenses/MIT)
## Features
- 🎤 **Voice Cloning & TTS** - Clone voices, multi-voice TTS, voice collections, and voice conversion
- 🎵 **Music Generation** - Complete music production suite with all AudioPod AI tools
- 📝 **Transcription** - Convert speech to text with speaker diarization and timestamps
- 🌍 **Translation** - Speech-to-speech translation in 20+ languages
- 👥 **Speaker Analysis** - Speaker diarization, identification, and extraction
- 🔧 **Audio Enhancement** - Denoise, stem extraction, and professional audio processing
- 🎬 **Karaoke Generation** - Create karaoke videos with lyric timing
- 🎙️ **Professional Audio** - Mixing, mastering, DDSP synthesis, and audio restoration
- 🤖 **AI Copilot** - Intelligent content creation, document processing, and workflow automation
- 📹 **Video Download** - Multi-platform video download and conversion
- 🔑 **API Key Management** - Create, manage, and revoke API keys
- 💰 **Credit Management** - Monitor usage, credits, and pay-as-you-go purchases
## Installation
```bash
npm install audiopod-sdk
```
## Quick Start
### Node.js
```javascript
const { AudioPodClient } = require('audiopod-sdk');

// Initialize client
const client = new AudioPodClient({
  apiKey: 'ap_your_api_key_here' // Get from https://app.audiopod.ai/dashboard
});

// Generate voice (unified approach for both cloning and TTS)
const result = await client.voice.generateVoice({
  text: 'Hello from AudioPod!',
  voiceFile: 'path/to/voice.wav', // For voice cloning
  // OR voiceId: 'profile-id',    // For existing voice profiles
  language: 'en',
  audioFormat: 'mp3',
  generationParams: {
    speed: 1.0,
    denoise: true
  },
  waitForCompletion: true
});

console.log('Generated audio:', result.outputUrl);
```
### React
```jsx
import { useVoiceGeneration, useCredits } from 'audiopod-sdk/react';

function VoiceApp() {
  const apiKey = process.env.REACT_APP_AUDIOPOD_API_KEY;

  // Use voice generation hook (unified approach)
  const {
    generateVoice,
    isLoading,
    progress,
    result,
    error
  } = useVoiceGeneration(apiKey, {
    onProgress: (progress) => console.log('Progress:', progress),
    onSuccess: (result) => console.log('Completed:', result)
  });

  // Check credits
  const { credits } = useCredits(apiKey);

  const handleGenerate = async () => {
    await generateVoice({
      text: 'Hello from React!',
      voiceFile: selectedFile, // For voice cloning
      language: 'en',
      audioFormat: 'mp3',
      generationParams: {
        speed: 1.0,
        denoise: true
      }
    });
  };

  return (
    <div>
      <p>Credits: {credits?.totalAvailableCredits}</p>
      <button onClick={handleGenerate} disabled={isLoading}>
        {isLoading ? `Generating... ${progress}%` : 'Generate Voice'}
      </button>
      {result && <audio src={result.outputUrl} controls />}
    </div>
  );
}
```
## API Reference
### AudioPodClient
Main client class for interacting with the AudioPod API.
```javascript
const client = new AudioPodClient({
  apiKey: 'ap_your_api_key',
  baseURL: 'https://api.audiopod.ai', // Optional
  timeout: 30000, // Optional (30s default)
  debug: false // Optional
});
```
### Voice Generation (Unified TTS & Cloning)
```javascript
// Generate voice with file cloning (unified endpoint)
const result = await client.voice.generateVoice({
text: 'Text to generate',
voiceFile: 'voice.wav', // For voice cloning - File path, File object, or Blob
language: 'en', // Optional - source language
audioFormat: 'mp3', // Optional - 'mp3', 'wav', 'ogg'
generationParams: { // Optional - provider-specific parameters
speed: 1.0, // Speech speed (0.5-2.0)
pitch: 1.0, // Voice pitch adjustment (0.5-2.0)
denoise: true, // Preprocess voice sample for better quality
temperature: 0.75, // Generation randomness (0.1-1.0)
topK: 30, // Token selection diversity
topP: 0.85 // Nucleus sampling threshold
},
waitForCompletion: true, // Optional
timeout: 300000 // Optional
});
// Generate speech with existing voice profile (unified endpoint)
const speech = await client.voice.generateVoice({
text: 'Hello world!',
voiceId: 'profile-id-or-uuid', // For existing voice profiles
language: 'en',
audioFormat: 'mp3',
generationParams: {
speed: 1.0,
temperature: 0.75,
topK: 30,
topP: 0.85
},
waitForCompletion: true
});
// Create reusable voice profile (unchanged)
const profile = await client.voice.createVoiceProfile(
'My Voice',
'voice.wav',
'Description', // Optional
false, // isPublic
true // waitForCompletion
);
// Backward compatibility methods (deprecated - use generateVoice instead)
const legacyClone = await client.voice.cloneVoice({
voiceFile: 'voice.wav',
text: 'Text to generate',
// ... other options (internally uses generateVoice)
});
const legacySpeech = await client.voice.generateSpeech(
'profile-id',
'Hello world!',
// ... other options (internally uses generateVoice)
);
// Voice conversion - change voice characteristics
const conversion = await client.voice.convertVoice({
audioFile: 'source-audio.wav',
targetVoiceId: 'voice-uuid-or-id',
waitForCompletion: true
});
// Multi-voice TTS for dialogues
const dialogue = await client.voice.generateMultiVoiceTTS({
segments: [
{ text: 'Hello there!', voiceId: 'voice1' },
{ text: 'How are you?', voiceId: 'voice2' },
{ text: 'I am fine, thank you!', voiceId: 'voice1' }
],
mixMode: 'sequential', // 'sequential' or 'overlapping'
silenceDuration: 0.5, // seconds between segments
normalizeVolume: true
});
// Stream voice generation for real-time audio
await client.voice.streamVoiceGeneration(
voiceId,
'Long text to generate...',
{
language: 'en',
generationParams: {
speed: 1.0,
temperature: 0.75,
topK: 30,
topP: 0.85
},
onProgress: (progress) => console.log(`Progress: ${progress}%`),
onAudioChunk: (chunk) => {
// Process audio chunk in real-time
audioStream.write(chunk);
},
onComplete: (result) => console.log('Streaming complete'),
onError: (error) => console.error('Stream error:', error)
}
);
// Voice collections for organizing voices
const collection = await client.voice.createVoiceCollection({
name: 'Family Voices',
description: 'Collection of family member voices',
isPublic: false
});
await client.voice.addVoicesToCollection(collection.id, [voice1.id, voice2.id]);
```
#### Voice Generation Parameters
The `generationParams` object allows fine-tuning of voice generation:
```javascript
const generationParams = {
// Core parameters
speed: 1.0, // Speech speed (0.5-2.0, default: 1.0)
temperature: 0.75, // Generation randomness (0.1-1.0, default: 0.75)
// Audio quality parameters
denoise: true, // Preprocess voice sample (default: false)
pitch: 1.0, // Voice pitch adjustment (0.5-2.0, default: 1.0)
// Advanced sampling parameters
topK: 30, // Token selection diversity (1-100, default: 30)
topP: 0.85, // Nucleus sampling threshold (0.1-1.0, default: 0.85)
// Provider-specific parameters
// Additional parameters may be available depending on the voice provider
};
```
### Music Generation
```javascript
// Text-to-music generation (with lyrics)
const music = await client.music.generateMusic({
prompt: 'upbeat electronic dance music',
lyrics: 'Verse 1: Dancing under lights\nChorus: Feel the beat tonight',
duration: 120.0, // seconds
guidanceScale: 7.5, // Optional (1.0-20.0)
numInferenceSteps: 50, // Optional (20-100)
seed: 12345, // Optional
waitForCompletion: true
});
// Generate rap music
const rap = await client.music.generateRap({
lyrics: 'Your rap lyrics here',
style: 'modern', // 'modern', 'classic', 'trap'
tempo: 120, // BPM (80-200)
duration: 90.0,
waitForCompletion: true
});
// Generate instrumental only
const instrumental = await client.music.generateInstrumental({
prompt: 'jazz piano and saxophone',
duration: 60,
instruments: ['piano', 'saxophone'],
key: 'C',
tempo: 120,
guidanceScale: 7.5
});
// Generate vocals from lyrics
const vocals = await client.music.generateVocals({
lyrics: 'Amazing vocals with emotion',
genre: 'pop',
emotion: 'happy',
gender: 'female',
waitForCompletion: true
});
// Generate audio samples/loops
const samples = await client.music.generateSamples({
prompt: 'heavy 808 drum pattern',
duration: 8.0, // short loops
sampleType: 'drums',
tempo: 140
});
// Audio-to-audio transformation
const transformed = await client.music.transformAudio({
audioFile: audioFile, // File, Blob, or path
prompt: 'make this sound like a rock song',
referenceStrength: 0.7, // How much to keep from original
audioDuration: 120.0,
guidanceScale: 15.0
});
// SongBloom - reference audio based generation
const songbloom = await client.music.generateWithReference({
audioFile: referenceFile,
lyrics: 'New lyrics for the melody',
duration: 120.0,
guidanceScale: 7.5,
preserveMelody: true
});
// Music editing operations
// Retake - generate variations
const retake = await client.music.retakeMusic({
sourceJobId: originalJob.id,
variationStrength: 0.8, // How different the retake should be
waitForCompletion: true
});
// Repaint - modify specific sections
const repainted = await client.music.repaintMusic({
sourceJobId: originalJob.id,
startTime: 30.0, // seconds
endTime: 60.0, // seconds
newPrompt: 'make this section more energetic',
waitForCompletion: true
});
// Extend existing music
const extended = await client.music.extendMusic({
sourceJobId: originalJob.id,
extendDuration: 30.0, // seconds to add
direction: 'end', // 'start' or 'end'
waitForCompletion: true
});
// Advanced editing
const edited = await client.music.editMusic({
sourceJobId: originalJob.id,
edits: [
{
startTime: 0,
endTime: 30,
operation: 'fade_in'
},
{
startTime: 60,
endTime: 90,
operation: 'replace',
newPrompt: 'guitar solo'
}
],
waitForCompletion: true
});
// Social features
// Like/unlike tracks
await client.music.likeTrack(jobId);
await client.music.unlikeTrack(jobId);
await client.music.dislikeTrack(jobId);
// Share tracks
const shareInfo = await client.music.shareTrack({
jobId: musicJob.id,
platform: 'public',
message: 'Check out my AI-generated music!'
});
// Get track statistics
const stats = await client.music.getTrackStats(jobId);
console.log(`Likes: ${stats.likes}, Shares: ${stats.shares}`);
// Comments
await client.music.addComment(jobId, 'Amazing track!');
const comments = await client.music.getComments(jobId);
// Get shared track (public access, no auth required)
const sharedTrack = await client.music.getSharedTrack(shareToken);
```
### Transcription
```javascript
// Transcribe audio file
const transcript = await client.transcription.transcribeAudio({
audioFile: 'speech.wav',
language: 'en', // Optional (auto-detect if not specified)
modelType: 'whisperx', // 'whisperx' or 'faster-whisper'
enableSpeakerDiarization: true, // Optional
enableWordTimestamps: true, // Optional
waitForCompletion: true
});
console.log('Transcript:', transcript.transcript);
console.log('Confidence:', transcript.confidenceScore);
console.log('Segments:', transcript.segments);
// Transcribe from URL
const urlTranscript = await client.transcription.transcribeUrl(
'https://example.com/audio.mp3',
{ language: 'en', enableSpeakerDiarization: true }
);
```
### Translation
```javascript
// Translate audio
const translation = await client.translation.translateAudio({
audioFile: 'english-speech.wav',
targetLanguage: 'es', // Spanish
sourceLanguage: 'en', // Optional (auto-detect)
waitForCompletion: true
});
console.log('Translated audio:', translation.audioOutputUrl);
console.log('Video with subtitles:', translation.videoOutputUrl);
```
### Credit Management
```javascript
// Check credit balance
const credits = await client.credits.getCreditBalance();
console.log('Available credits:', credits.totalAvailableCredits);
console.log('Subscription credits:', credits.balance);
console.log('Pay-as-you-go credits:', credits.paygBalance);
// Get usage history
const usage = await client.credits.getUsageHistory();
console.log('Recent usage:', usage);
// Get credit multipliers for different services
const multipliers = await client.credits.getCreditMultipliers();
console.log('Voice cloning cost multiplier:', multipliers.voice_cloning);
console.log('Music generation cost multiplier:', multipliers.music_generation);
// Pay-as-you-go credit purchasing
// Get pricing information
const paygInfo = await client.credits.getPayAsYouGoInfo();
console.log(`${paygInfo.creditsPerDollar} credits per $1`);
console.log(`Min purchase: $${paygInfo.minAmountUsd}, Max: $${paygInfo.maxAmountUsd}`);
// Create checkout session for credit purchase
const checkout = await client.credits.createPayAsYouGoCheckout({
amountUsd: 50 // Purchase $50 worth of credits
});
// Redirect user to checkout.sessionUrl or use checkout.sessionId with Stripe
console.log('Checkout URL:', checkout.sessionUrl);
console.log('Will receive:', checkout.creditsToReceive, 'credits');
```
### Professional Audio Processing
```javascript
// Create mixing project
const project = await client.audio.createMixingProject({
projectName: 'My Mix',
sampleRate: 48000,
bitDepth: 24
});
// Add tracks to project
await client.audio.addTrackToProject(project.id, {
trackName: 'Vocals',
audioFile: vocalsFile,
startTime: 0.0
});
await client.audio.addTrackToProject(project.id, {
trackName: 'Instruments',
audioFile: instrumentsFile,
startTime: 0.0
});
// Add effects to tracks
await client.audio.addEffectToTrack(project.id, 'vocals-track-id', {
effectType: 'compressor',
parameters: {
threshold: -12.0,
ratio: 4.0,
attack: 10.0,
release: 100.0
}
});
// Mix project to final audio
const mixResult = await client.audio.mixProject(project.id, {
outputFormat: 'wav',
normalize: true,
waitForCompletion: true
});
// Comprehensive audio analysis
const analysis = await client.audio.analyzeAudio({
audioFile: audioFile,
analysisTypes: ['spectral', 'loudness', 'pitch', 'rhythm'],
waitForCompletion: true
});
console.log('Audio analysis:', analysis.results);
// Audio mastering with reference
const mastered = await client.audio.masterWithReference({
targetAudio: unmasteredFile,
referenceAudio: referenceFile,
outputFormats: ['wav_24bit', 'mp3_320'],
waitForCompletion: true
});
// Audio mastering with preset
const masteredPreset = await client.audio.masterWithPreset({
targetAudio: unmasteredFile,
preset: 'spotify_loudness', // 'spotify_loudness', 'youtube_loudness', 'apple_music'
outputFormats: ['wav_16bit', 'wav_24bit'],
waitForCompletion: true
});
// Audio restoration
const restored = await client.audio.restoreAudio({
audioFile: damagedFile,
restorationType: 'denoise', // 'denoise', 'declip', 'dehiss', 'decrackle'
strength: 0.7, // How aggressive the restoration should be
preservation: 0.8, // How much to preserve original characteristics
waitForCompletion: true
});
// DDSP neural synthesis
const ddspFeatures = await client.audio.extractDDSPFeatures({
audioFile: audioFile,
waitForCompletion: true
});
const synthesized = await client.audio.ddspSynthesize({
synthType: 'harmonic', // 'harmonic', 'filtered_noise', 'additive'
synthesisParams: {
fundamental_frequency: 440.0,
harmonics: [1.0, 0.5, 0.3, 0.2],
noise_level: 0.1
},
waitForCompletion: true
});
// Timbre transfer using DDSP
const timbreTransfer = await client.audio.ddspTimbreTransfer({
sourceAudio: sourceFile,
targetAudio: targetFile,
transferStrength: 0.8,
waitForCompletion: true
});
```
### Supersmart AI Copilot
```javascript
// Create AI copilot session
const session = await client.copilot.createSession();
console.log('Session ID:', session.sessionId);
console.log('Capabilities:', session.capabilities);
// Send query to AI copilot
const response = await client.copilot.sendQuery({
sessionId: session.sessionId,
userQuery: 'I want to create an audiobook from my PDF document',
attachments: [
{
type: 'document',
file: pdfFile,
description: 'Novel to convert to audiobook'
}
]
});
// Process documents for content creation
const processed = await client.copilot.processDocuments({
files: [pdfFile, docxFile],
sessionId: session.sessionId,
processingType: 'audiobook_preparation'
});
// Extract content from web URLs
const webContent = await client.copilot.extractWebContent({
urls: [
'https://example.com/article1',
'https://example.com/article2'
],
sessionId: session.sessionId,
extractionType: 'podcast_script'
});
// Generate content with AI
const content = await client.copilot.generateContent({
contentType: 'audiobook', // 'audiobook', 'podcast', 'meditation', 'educational'
parameters: {
chapters: 5,
narratorStyle: 'professional',
genre: 'fiction',
targetAudience: 'adults'
},
sessionId: session.sessionId
});
// Execute complete workflow
const workflowExecution = await client.copilot.executeWorkflow({
sessionId: session.sessionId,
// Returns a stream of progress updates
onProgress: (update) => {
console.log('Workflow progress:', update.step, update.progress);
}
});
// List user workflows
const workflows = await client.copilot.listWorkflows({
limit: 10,
offset: 0
});
// Get workflow status
const workflowStatus = await client.copilot.getWorkflowStatus(workflowId);
```
### Video Download & Processing
```javascript
// Get video information
const videoInfo = await client.video.getVideoInfo({
url: 'https://youtube.com/watch?v=example'
});
console.log('Title:', videoInfo.title);
console.log('Duration:', videoInfo.duration);
console.log('Available formats:', videoInfo.formats);
// Download video
const download = await client.video.downloadVideo({
url: 'https://youtube.com/watch?v=example',
format: 'mp4', // 'mp4', 'webm', 'mkv'
quality: '720p', // '144p', '360p', '720p', '1080p', '4k'
audioQuality: 'high', // 'low', 'medium', 'high'
extractAudio: false, // Set to true to get audio only
waitForCompletion: true
});
console.log('Download URL:', download.downloadUrl);
console.log('File size:', download.fileSize);
// Bulk download multiple videos
const bulkDownload = await client.video.bulkDownload({
urls: [
'https://youtube.com/watch?v=video1',
'https://youtube.com/watch?v=video2',
'https://tiktok.com/@user/video/123'
],
format: 'mp4',
quality: '720p',
waitForCompletion: false // Get job IDs for tracking
});
// List download jobs
const downloads = await client.video.listDownloads({
status: 'completed', // 'pending', 'processing', 'completed', 'failed'
platform: 'youtube', // 'youtube', 'tiktok', 'instagram', 'vimeo'
page: 1,
perPage: 20
});
// Get download job status
const downloadJob = await client.video.getDownload(jobId);
// Get supported platforms and formats
const formats = await client.video.getSupportedFormats();
console.log('Supported platforms:', Object.keys(formats));
console.log('Video formats:', formats.video);
console.log('Audio formats:', formats.audio);
// Public download (no authentication required)
const publicDownload = await AudioPodClient.downloadVideoPublic({
url: 'https://youtube.com/watch?v=example',
format: 'mp3', // For audio extraction
quality: 'high'
});
```
### API Key Management
```javascript
// Create new API key
const apiKey = await client.auth.createApiKey({
name: 'Production Key',
description: 'API key for production environment',
scopes: ['voice:read', 'voice:write', 'music:read', 'music:write'], // Optional
expiresInDays: 365 // Optional, defaults to never expire
});
console.log('New API key:', apiKey.key); // Only shown once!
console.log('Key ID:', apiKey.id);
// List API keys
const apiKeys = await client.auth.listApiKeys({
status: 'active' // 'active', 'revoked', 'all'
});
apiKeys.forEach(key => {
console.log(`${key.name}: ${key.status} (Created: ${key.createdAt})`);
});
// Revoke API key
await client.auth.revokeApiKey(keyId);
// Re-enable revoked API key
await client.auth.unrevokeApiKey(keyId);
// Permanently delete API key
await client.auth.deleteApiKey(keyId);
// Update API key (name, description only)
const updatedKey = await client.auth.updateApiKey(keyId, {
name: 'Updated Production Key',
description: 'Updated description'
});
```
### Job Management
```javascript
// Check job status
const job = await client.getJobStatus(jobId);
console.log('Status:', job.status); // 'pending', 'processing', 'completed', 'failed'
console.log('Progress:', job.progress); // 0-100
// Wait for job completion with progress monitoring
const result = await client.waitForJobCompletion(
jobId,
300000, // 5 minutes timeout
5000, // Poll every 5 seconds
(job) => {
console.log(`Progress: ${job.progress}%`);
}
);
// Cancel a job
await client.cancelJob(jobId);
```
## React Hooks
The SDK provides React hooks for easy integration with all services:
### useVoiceGeneration
```jsx
const {
generateVoice,
cancel,
isLoading,
progress,
result,
error,
isInitialized
} = useVoiceGeneration(apiKey, {
onProgress: (progress) => console.log(progress),
onSuccess: (result) => console.log(result),
onError: (error) => console.error(error),
pollInterval: 5000 // Optional
});
// Usage for voice cloning
const handleClone = async () => {
await generateVoice({
text: 'Text to generate',
voiceFile: selectedFile,
language: 'en',
audioFormat: 'mp3'
});
};
// Usage for TTS with existing voice
const handleTTS = async () => {
await generateVoice({
text: 'Text to generate',
voiceId: 'existing-voice-id',
language: 'en',
audioFormat: 'mp3'
});
};
// Backward compatibility hook (deprecated)
const {
cloneVoice, // Now uses generateVoice internally
// ... other properties
} = useVoiceCloning(apiKey, options);
```
### useMusicGeneration
```jsx
const {
generateMusic,
generateRap,
generateInstrumental,
generateVocals,
generateSamples,
retakeMusic,
extendMusic,
cancel,
isLoading,
progress,
result,
error
} = useMusicGeneration(apiKey, {
onProgress: (progress) => console.log(progress),
onSuccess: (result) => console.log(result)
});
// Usage
const handleGenerateMusic = async () => {
await generateMusic({
prompt: 'Epic orchestral music',
duration: 120,
waitForCompletion: true
});
};
```
### useTranscription
```jsx
const {
transcribeAudio,
transcribeUrl,
cancel,
isLoading,
progress,
result,
error
} = useTranscription(apiKey, {
onProgress: (progress) => console.log(progress),
onSuccess: (result) => console.log(result)
});
```
### useCredits
```jsx
const {
credits,
isLoading,
error,
refetch,
purchaseCredits
} = useCredits(apiKey, {
onCreditsUpdated: (newCredits) => console.log('Credits updated:', newCredits)
});
// Usage
const handlePurchase = async () => {
const checkout = await purchaseCredits(50); // $50 purchase
window.location.href = checkout.sessionUrl;
};
```
### useProfessionalAudio
```jsx
const {
createMixingProject,
analyzeAudio,
masterAudio,
restoreAudio,
isLoading,
progress,
result,
error
} = useProfessionalAudio(apiKey, {
onProgress: (progress) => console.log(progress),
onSuccess: (result) => console.log(result)
});
```
### useSupersmart
```jsx
const {
createSession,
sendQuery,
executeWorkflow,
uploadDocuments,
isLoading,
progress,
result,
error,
sessionId
} = useSupersmart(apiKey, {
onSessionCreated: (session) => console.log('Session created:', session.sessionId),
onWorkflowProgress: (update) => console.log('Workflow progress:', update)
});
```
### useVideoDownload
```jsx
const {
getVideoInfo,
downloadVideo,
bulkDownload,
isLoading,
progress,
result,
error
} = useVideoDownload(apiKey, {
onProgress: (progress) => console.log(progress),
onSuccess: (result) => console.log(result)
});
```
### useApiKeys
```jsx
const {
apiKeys,
createApiKey,
revokeApiKey,
isLoading,
error,
refetch
} = useApiKeys(apiKey, {
onKeyCreated: (newKey) => console.log('New API key created:', newKey.id),
onKeyRevoked: (keyId) => console.log('API key revoked:', keyId)
});
```
### Hook Composition Example
```jsx
function AudioProductionApp() {
  const apiKey = process.env.REACT_APP_AUDIOPOD_API_KEY;

  // Multiple hooks for complete audio production
  const { credits } = useCredits(apiKey);
  const voiceCloning = useVoiceCloning(apiKey);
  const musicGeneration = useMusicGeneration(apiKey);
  const audioProcessing = useProfessionalAudio(apiKey);
  const supersmart = useSupersmart(apiKey);

  const createCompleteProduction = async () => {
    // Step 1: Create AI content with Supersmart
    const { sessionId } = await supersmart.createSession();
    const contentPlan = await supersmart.sendQuery({
      sessionId,
      userQuery: 'Create a 3-minute song with vocals'
    });

    // Step 2: Generate music
    const musicResult = await musicGeneration.generateMusic({
      prompt: contentPlan.musicPrompt,
      duration: 180,
      waitForCompletion: true
    });

    // Step 3: Generate vocals
    const vocalsResult = await musicGeneration.generateVocals({
      lyrics: contentPlan.lyrics,
      waitForCompletion: true
    });

    // Step 4: Professional mixing
    const mixingProject = await audioProcessing.createMixingProject({
      projectName: 'AI Song Production'
    });

    // Add tracks and mix
    // ... mixing logic

    return mixingProject;
  };

  return (
    <div>
      <h1>AI Audio Production Studio</h1>
      <p>Credits Available: {credits?.totalAvailableCredits}</p>
      <button
        onClick={createCompleteProduction}
        disabled={voiceCloning.isLoading || musicGeneration.isLoading}
      >
        Create Complete Production
      </button>

      {/* Progress indicators for all services */}
      {voiceCloning.isLoading && <div>Voice processing: {voiceCloning.progress}%</div>}
      {musicGeneration.isLoading && <div>Music generation: {musicGeneration.progress}%</div>}
      {audioProcessing.isLoading && <div>Audio processing: {audioProcessing.progress}%</div>}
    </div>
  );
}
```
## Error Handling
The SDK provides comprehensive error handling with detailed error codes:
```javascript
try {
  const result = await client.voice.cloneVoice({
    voiceFile: 'voice.wav',
    text: 'Hello world!',
    generationParams: {
      speed: 1.0,
      denoise: true
    }
  });
} catch (error) {
  switch (error.code) {
    case 'AUTHENTICATION_ERROR':
      console.log('Invalid API key or authentication failed');
      break;
    case 'INSUFFICIENT_CREDITS':
      console.log('Not enough credits to complete this operation');
      break;
    case 'RATE_LIMIT_ERROR':
      console.log('Rate limit exceeded, please try again later');
      break;
    case 'FILE_TOO_LARGE':
      console.log('File exceeds maximum size limit');
      break;
    case 'INVALID_AUDIO_FORMAT':
      console.log('Unsupported audio format');
      break;
    case 'PROCESSING_ERROR':
      console.log('Error during audio processing');
      break;
    case 'QUOTA_EXCEEDED':
      console.log('API quota exceeded');
      break;
    case 'VALIDATION_ERROR':
      console.log('Input validation failed:', error.details);
      break;
    case 'NETWORK_ERROR':
      console.log('Network connectivity issue');
      break;
    case 'TIMEOUT_ERROR':
      console.log('Operation timed out');
      break;
    case 'NOT_FOUND':
      console.log('Resource not found');
      break;
    case 'PERMISSION_DENIED':
      console.log('Permission denied for this operation');
      break;
    default:
      console.log('Unknown error:', error.message);
  }

  // Additional error information
  if (error.statusCode) {
    console.log('HTTP Status:', error.statusCode);
  }
  if (error.details) {
    console.log('Error details:', error.details);
  }
}
```
### Common Error Codes
| Code | Description | Common Causes |
|------|-------------|---------------|
| `AUTHENTICATION_ERROR` | Invalid or missing API key | Wrong API key, expired key |
| `INSUFFICIENT_CREDITS` | Not enough credits | Low credit balance, operation cost exceeds balance |
| `RATE_LIMIT_ERROR` | Too many requests | Exceeded rate limits, need to slow down requests |
| `FILE_TOO_LARGE` | File size exceeds limits | Audio >100MB, Video >200MB |
| `INVALID_AUDIO_FORMAT` | Unsupported file format | File format not in supported list |
| `PROCESSING_ERROR` | Audio processing failed | Corrupted file, incompatible audio |
| `VALIDATION_ERROR` | Input validation failed | Invalid parameters, missing required fields |
| `TIMEOUT_ERROR` | Operation timed out | Long processing time, network issues |
| `NOT_FOUND` | Resource not found | Invalid job ID, deleted resource |
| `PERMISSION_DENIED` | Access denied | Trying to access other user's resources |
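The table above can be folded into a small retryability predicate. This helper is illustrative only (the SDK does not export it); it simply classifies the documented codes into transient errors worth retrying and permanent errors that should surface immediately:

```javascript
// Codes that typically succeed on retry vs. those that never will,
// per the "Common Error Codes" table above.
const TRANSIENT_CODES = new Set(['RATE_LIMIT_ERROR', 'NETWORK_ERROR', 'TIMEOUT_ERROR']);
const PERMANENT_CODES = new Set([
  'AUTHENTICATION_ERROR', 'INSUFFICIENT_CREDITS', 'VALIDATION_ERROR',
  'FILE_TOO_LARGE', 'INVALID_AUDIO_FORMAT', 'NOT_FOUND', 'PERMISSION_DENIED'
]);

function isRetryable(error) {
  if (TRANSIENT_CODES.has(error.code)) return true;
  if (PERMANENT_CODES.has(error.code)) return false;
  // Unknown codes (e.g. PROCESSING_ERROR for a corrupted file): don't retry
  // automatically; surface the error so the caller can inspect error.details.
  return false;
}
```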
### Error Recovery Strategies
```javascript
// Retry logic for transient errors
async function retryOperation(operation, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await operation();
    } catch (error) {
      if (error.code === 'RATE_LIMIT_ERROR' && i < maxRetries - 1) {
        // Wait and retry for rate limit errors
        const delay = Math.pow(2, i) * 1000; // Exponential backoff
        console.log(`Rate limited, retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      if (error.code === 'NETWORK_ERROR' && i < maxRetries - 1) {
        // Retry network errors
        console.log(`Network error, retrying attempt ${i + 2}...`);
        await new Promise(resolve => setTimeout(resolve, 1000));
        continue;
      }
      // Don't retry these error types
      if (['AUTHENTICATION_ERROR', 'INSUFFICIENT_CREDITS', 'VALIDATION_ERROR'].includes(error.code)) {
        throw error;
      }
      // Last attempt or non-retryable error
      if (i === maxRetries - 1) {
        throw error;
      }
    }
  }
}
// Usage
try {
const result = await retryOperation(() =>
client.music.generateMusic({
prompt: 'Epic orchestral music',
duration: 120
})
);
} catch (error) {
console.error('Operation failed after retries:', error);
}
```
## Configuration
### Environment Variables
Set your API key using environment variables:
```bash
# For Node.js
export AUDIOPOD_API_KEY=ap_your_api_key_here
# For React
REACT_APP_AUDIOPOD_API_KEY=ap_your_api_key_here
```
### Client Configuration
```javascript
const client = new AudioPodClient({
  apiKey: process.env.AUDIOPOD_API_KEY,
  baseURL: 'https://api.audiopod.ai', // Custom API endpoint
  timeout: 60000, // Request timeout (60s)
  maxRetries: 3, // Max retry attempts
  debug: true // Enable debug logging
});
// Update API key
client.updateApiKey('new_api_key');
// Enable/disable debug mode
client.setDebug(true);
// Check health
const health = await client.checkHealth();
console.log('API Status:', health.status);
```
## File Upload Support
The SDK supports various file input methods with proper validation:
```javascript
// File path (Node.js)
await client.voice.cloneVoice({
voiceFile: '/path/to/voice.wav',
text: 'Hello',
generationParams: {
speed: 1.0,
denoise: true
}
});
// File object (Browser)
const fileInput = document.querySelector('input[type="file"]');
await client.voice.cloneVoice({
voiceFile: fileInput.files[0],
text: 'Hello',
generationParams: {
speed: 1.0,
temperature: 0.75
}
});
// Blob object
const blob = new Blob([audioData], { type: 'audio/wav' });
await client.voice.cloneVoice({
voiceFile: blob,
text: 'Hello',
generationParams: {
denoise: true,
pitch: 1.0
}
});
```
### File Specifications
#### Audio Files
- **Maximum size**: 100MB
- **Supported formats**: WAV, MP3, OGG, FLAC, M4A
- **Recommended format**: WAV (best quality)
- **Sample rate**: 16kHz or higher recommended
- **Bit depth**: 16-bit or 24-bit
- **Channels**: Mono or stereo
#### Video Files
- **Maximum size**: 200MB
- **Supported formats**: MP4, MOV, AVI, MKV, WEBM
- **Use case**: Translation with video output, karaoke generation
#### Voice Cloning Recommendations
- **Duration**: 10-30 seconds for optimal quality
- **Quality**: Clean, noise-free audio
- **Content**: Clear speech without background music
- **Language**: Single language per sample
- **Speaker**: Single speaker per sample
#### Music Generation Recommendations
- **Reference audio**: High-quality stereo preferred
- **Duration**: Any length supported
- **Formats**: WAV or high-quality MP3 (320kbps)
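The size and format limits above can be checked client-side before an upload starts, saving a round trip on obviously invalid files. The `validateUpload` helper below is a sketch and not part of the SDK; the API still enforces its own validation server-side:

```javascript
// Hypothetical pre-upload check based on the documented limits:
// audio up to 100MB (WAV/MP3/OGG/FLAC/M4A), video up to 200MB (MP4/MOV/AVI/MKV/WEBM).
const AUDIO_FORMATS = ['wav', 'mp3', 'ogg', 'flac', 'm4a'];
const VIDEO_FORMATS = ['mp4', 'mov', 'avi', 'mkv', 'webm'];
const MAX_AUDIO_BYTES = 100 * 1024 * 1024;
const MAX_VIDEO_BYTES = 200 * 1024 * 1024;

function validateUpload(filename, sizeBytes) {
  const ext = filename.split('.').pop().toLowerCase();
  if (AUDIO_FORMATS.includes(ext)) {
    return sizeBytes <= MAX_AUDIO_BYTES
      ? { ok: true, kind: 'audio' }
      : { ok: false, reason: 'audio file exceeds 100MB' };
  }
  if (VIDEO_FORMATS.includes(ext)) {
    return sizeBytes <= MAX_VIDEO_BYTES
      ? { ok: true, kind: 'video' }
      : { ok: false, reason: 'video file exceeds 200MB' };
  }
  return { ok: false, reason: `unsupported format: ${ext}` };
}
```

In the browser, pass `file.name` and `file.size` from a file input; in Node.js, use `fs.statSync(path).size`.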
### Upload Progress Tracking
```javascript
// Track upload progress
await client.voice.createVoiceProfile(
'My Voice',
largeAudioFile,
'High quality voice sample',
false,
true,
{
onProgress: (progress) => {
console.log(`Upload progress: ${progress.percentage}%`);
console.log(`${progress.loaded} / ${progress.total} bytes`);
}
}
);
```
## TypeScript Support
The SDK is written in TypeScript and provides full type definitions:
```typescript
import { AudioPodClient, VoiceCloneRequest, VoiceCloneResult } from 'audiopod-sdk';
const client = new AudioPodClient({ apiKey: 'your-key' });
const request: VoiceCloneRequest = {
voiceFile: 'voice.wav',
text: 'Hello TypeScript!',
language: 'en',
generationParams: {
speed: 1.0,
denoise: true,
temperature: 0.75
}
};
const result: VoiceCloneResult = await client.voice.cloneVoice(request);
```
## Advanced Examples
### Batch Processing
```javascript
// Process multiple files
const files = ['file1.wav', 'file2.mp3', 'file3.m4a'];
const jobs = [];
// Start all jobs
for (const file of files) {
const result = await client.transcription.transcribeAudio({
audioFile: file,
waitForCompletion: false
});
jobs.push({ file, jobId: result.job.id });
}
// Wait for all to complete
for (const { file, jobId } of jobs) {
const result = await client.waitForJobCompletion(jobId);
console.log(`${file}: ${result.transcript}`);
}
```
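Starting jobs one at a time is simple, but with many files a small concurrency cap keeps throughput up without flooding the API. A generic pool helper (not part of the SDK) that runs a worker over a list with a fixed number of tasks in flight, preserving result order:

```javascript
// Run `worker` over `items` with at most `limit` tasks in flight.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  // Each runner repeatedly claims the next index until the list is drained.
  async function run() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, run);
  await Promise.all(runners);
  return results;
}
```

For example: `await mapWithLimit(files, 3, (f) => client.transcription.transcribeAudio({ audioFile: f }))`.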
### Complete Audiobook Production Pipeline
```javascript
// Complete workflow: Document → Audiobook with multiple voices
async function createAudiobook(documentFile) {
// Step 1: Use Supersmart to analyze and structure content
const session = await client.copilot.createSession();
const contentAnalysis = await client.copilot.processDocuments({
files: [documentFile],
sessionId: session.sessionId,
processingType: 'audiobook_preparation'
});
// Step 2: Generate narrator and character voices
const narratorVoice = await client.voice.createVoiceProfile(
'Narrator Voice',
narratorSample,
'Professional audiobook narrator',
false,
true
);
const characterVoice = await client.voice.createVoiceProfile(
'Character Voice',
characterSample,
'Character dialogue voice',
false,
true
);
// Step 3: Generate background music
const backgroundMusic = await client.music.generateInstrumental({
prompt: 'soft ambient background music for audiobook',
duration: 300, // 5 minutes base track
tempo: 60,
instruments: ['piano', 'strings'],
waitForCompletion: true
});
// Step 4: Generate audio segments with multiple voices
const audioSegments = [];
for (const chapter of contentAnalysis.chapters) {
const narrativeAudio = await client.voice.generateSpeech(
narratorVoice.id,
chapter.narrativeText,
{
speed: 0.9, // Slightly slower for audiobooks
audioFormat: 'wav',
waitForCompletion: true
}
);
const dialogueAudio = await client.voice.generateSpeech(
characterVoice.id,
chapter.dialogueText,
{
speed: 1.0,
audioFormat: 'wav',
waitForCompletion: true
}
);
audioSegments.push({ narrative: narrativeAudio, dialogue: dialogueAudio });
}
// Step 5: Professional mixing and mastering
const mixingProject = await client.audio.createMixingProject({
projectName: `Audiobook: ${contentAnalysis.title}`,
sampleRate: 44100,
bitDepth: 24
});
// Add all segments to mixing project
for (let i = 0; i < audioSegments.length; i++) {
const segment = audioSegments[i];
await client.audio.addTrackToProject(mixingProject.id, {
trackName: `Chapter ${i + 1} Narrative`,
audioFile: segment.narrative.outputUrl,
startTime: i * 600 // 10 minutes per chapter
});
if (segment.dialogue) {
await client.audio.addTrackToProject(mixingProject.id, {
trackName: `Chapter ${i + 1} Dialogue`,
audioFile: segment.dialogue.outputUrl,
startTime: i * 600 + 300 // Offset dialogue
});
}
}
// Add background music track
await client.audio.addTrackToProject(mixingProject.id, {
trackName: 'Background Music',
audioFile: backgroundMusic.outputUrl,
startTime: 0
});
// Apply professional effects
await client.audio.addEffectToTrack(mixingProject.id, 'narrative-track-id', {
effectType: 'compressor',
parameters: { threshold: -18.0, ratio: 3.0 }
});
// Final mix and master
const finalAudiobook = await client.audio.mixProject(mixingProject.id, {
outputFormat: 'mp3',
normalize: true,
waitForCompletion: true
});
// Step 6: Apply professional mastering
const masteredAudiobook = await client.audio.masterWithPreset({
targetAudio: finalAudiobook.outputUrl,
preset: 'audiobook_loudness',
outputFormats: ['mp3_320', 'wav_24bit'],
waitForCompletion: true
});
return {
audiobook: masteredAudiobook,
metadata: contentAnalysis,
chapters: audioSegments.length,
duration: audioSegments.length * 600 // Estimated duration
};
}
```
### AI-Powered Podcast Creation
```javascript
// Create a complete podcast episode from web articles
async function createPodcastFromUrls(urls, hostVoiceFile, guestVoiceFile) {
// Step 1: Extract and process web content
const session = await client.copilot.createSession();
const webContent = await client.copilot.extractWebContent({
urls: urls,
sessionId: session.sessionId,
extractionType: 'podcast_script'
});
// Step 2: Generate podcast script with AI
const podcastScript = await client.copilot.generateContent({
contentType: 'podcast',
parameters: {
format: 'interview',
duration: 30, // 30 minutes
tone: 'conversational',
includeIntro: true,
includeOutro: true
},
sessionId: session.sessionId
});
// Step 3: Create voice profiles for host and guest
const [hostVoice, guestVoice] = await Promise.all([
client.voice.createVoiceProfile('Podcast Host', hostVoiceFile, 'Professional podcast host voice'),
client.voice.createVoiceProfile('Podcast Guest', guestVoiceFile, 'Expert guest voice')
]);
// Step 4: Generate intro/outro music
const introMusic = await client.music.generateMusic({
prompt: 'upbeat podcast intro music with energy',
duration: 15,
guidanceScale: 8.0,
waitForCompletion: true
});
const outroMusic = await client.music.generateMusic({
prompt: 'professional podcast outro music, fade out style',
duration: 20,
guidanceScale: 7.5,
waitForCompletion: true
});
// Step 5: Generate multi-voice dialogue
const podcastAudio = await client.voice.generateMultiVoiceTTS({
segments: podcastScript.segments.map(segment => ({
text: segment.text,
voiceId: segment.speaker === 'host' ? hostVoice.id : guestVoice.id
})),
mixMode: 'sequential',
silenceDuration: 0.5,
normalizeVolume: true,
waitForCompletion: true
});
// Step 6: Professional audio post-processing
const enhancedAudio = await client.audio.restoreAudio({
audioFile: podcastAudio.outputUrl,
restorationType: 'denoise',
strength: 0.3, // Light denoising
preservation: 0.9,
waitForCompletion: true
});
// Step 7: Final mixing with music
const mixingProject = await client.audio.createMixingProject({
projectName: 'Podcast Episode',
sampleRate: 44100,
bitDepth: 16 // Standard podcast quality
});
// Add intro music
await client.audio.addTrackToProject(mixingProject.id, {
trackName: 'Intro Music',
audioFile: introMusic.outputUrl,
startTime: 0
});
// Add main podcast content
await client.audio.addTrackToProject(mixingProject.id, {
trackName: 'Main Content',
audioFile: enhancedAudio.outputUrl,
startTime: 15 // After intro
});
// Add outro music
const mainDuration = podcastScript.estimatedDuration * 60; // Convert to seconds
await client.audio.addTrackToProject(mixingProject.id, {
trackName: 'Outro Music',
audioFile: outroMusic.outputUrl,
startTime: 15 + mainDuration
});
// Final mix
const finalPodcast = await client.audio.mixProject(mixingProject.id, {
outputFormat: 'mp3',
normalize: true,
waitForCompletion: true
});
return {
podcast: finalPodcast,
script: podcastScript,
duration: mainDuration + 35, // Include intro/outro
metadata: {
title: podcastScript.title,
description: podcastScript.description,
sources: urls
}
};
}
```
### Music Album Production
```javascript
// Create a complete music album with consistent style
async function createMusicAlbum(albumConcept) {
const album = {
tracks: [],
metadata: albumConcept
};
// Step 1: Use AI to plan the album structure
const session = await client.copilot.createSession();
const albumPlan = await client.copilot.generateContent({
contentType: 'music_album',
parameters: {
genre: albumConcept.genre,
trackCount: albumConcept.trackCount || 10,
theme: albumConcept.theme,
duration: albumConcept.duration || 45 // 45 minutes
},
sessionId: session.sessionId
});
// Step 2: Generate each track with consistent style
for (let i = 0; i < albumPlan.tracks.length; i++) {
const trackPlan = albumPlan.tracks[i];
let trackAudio;
if (trackPlan.type === 'instrumental') {
trackAudio = await client.music.generateInstrumental({
prompt: trackPlan.prompt,
duration: trackPlan.duration,
instruments: trackPlan.instruments,
key: albumPlan.musicalKey,
tempo: trackPlan.tempo,
waitForCompletion: true
});
} else if (trackPlan.type === 'vocal') {
trackAudio = await client.music.generateMusic({
prompt: trackPlan.prompt,
lyrics: trackPlan.lyrics,
duration: trackPlan.duration,
guidanceScale: 7.5,
waitForCompletion: true
});
} else if (trackPlan.type === 'rap') {
trackAudio = await client.music.generateRap({
lyrics: trackPlan.lyrics,
style: albumConcept.rapStyle || 'modern',
tempo: trackPlan.tempo,
duration: trackPlan.duration,
waitForCompletion: true
});
}
// Apply consistent mastering to each track
const masteredTrack = await client.audio.masterWithPreset({
targetAudio: trackAudio.outputUrl,
preset: 'album_loudness',
outputFormats: ['wav_24bit'],
waitForCompletion: true
});
album.tracks.push({
number: i + 1,
title: trackPlan.title,
duration: trackPlan.duration,
audio: masteredTrack,
metadata: trackPlan
});
// Add variation if needed
if (trackPlan.needsVariation) {
const variation = await client.music.retakeMusic({
sourceJobId: trackAudio.job.id,
variationStrength: 0.6,
waitForCompletion: true
});
album.tracks.push({
number: i + 1.5,
title: `${trackPlan.title} (Alternative Version)`,
duration: trackPlan.duration,
audio: variation,
metadata: { ...trackPlan, isVariation: true }
});
}
}
// Step 3: Create album transitions and continuous mix
if (albumConcept.createContinuousMix) {
const continuousMix = await createContinuousAlbumMix(album.tracks);
album.continuousMix = continuousMix;
}
// Step 4: Generate album artwork (if integrated with image generation)
// This would require additional image generation capabilities
return album;
}
async function createContinuousAlbumMix(tracks) {
const mixingProject = await client.audio.createMixingProject({
projectName: 'Album Continuous Mix',
sampleRate: 44100,
bitDepth: 24
});
let currentTime = 0;
for (let i = 0; i < tracks.length; i++) {
const track = tracks[i];
// Add track to project
await client.audio.addTrackToProject(mixingProject.id, {
trackName: track.title,
audioFile: track.audio.outputUrl,
startTime: currentTime
});
// Add crossfade transition between tracks
if (i < tracks.length - 1) {
currentTime += track.duration - 3; // 3-second overlap for crossfade
} else {
currentTime += track.duration;
}
}
// Apply final mastering to the continuous mix
const continuousMix = await client.audio.mixProject(mixingProject.id, {
outputFormat: 'wav',
normalize: true,
waitForCompletion: true
});
return continuousMix;
}
```
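The start-time arithmetic in `createContinuousAlbumMix` can be factored into a small pure helper, which makes the crossfade overlap easy to test and tweak. This function is illustrative, not part of the SDK:

```javascript
// Compute each track's start time (in seconds) given track durations and a
// fixed crossfade overlap between consecutive tracks.
function computeStartTimes(durations, crossfadeSeconds = 3) {
  const startTimes = [];
  let t = 0;
  for (let i = 0; i < durations.length; i++) {
    startTimes.push(t);
    // Overlap the next track by crossfadeSeconds, except after the last one.
    t += durations[i] - (i < durations.length - 1 ? crossfadeSeconds : 0);
  }
  return startTimes;
}
```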
### Streaming Audio Generation
```javascript
// Stream voice generation (Node.js with WebSocket)
const fs = require('fs');
const audioStream = fs.createWriteStream('streamed-output.wav');
await client.voice.streamVoiceGeneration(
voiceId,
'Long text to generate...',
{
onProgress: (progress) => console.log(`Progress: ${progress}%`),
onAudioChunk: (chunk) => {
// Process audio chunk in real-time
audioStream.write(chunk);
},
onComplete: (result) => {
console.log('Streaming complete');
audioStream.end();
},
onError: (error) => console.error('Stream error:', error)
}
);
```
### Progress Monitoring
```javascript
// Monitor job with custom polling
const result = await client.music.generateMusic({
prompt: 'epic orchestral music',
duration: 180,
waitForCompletion: false
});
const monitorJob = async (jobId) => {
while (true) {
const job = await client.getJobStatus(jobId);
console.log(`Status: ${job.status}, Progress: ${job.progress}%`);
if (job.status === 'completed') {
return job.result;
} else if (job.status === 'failed') {
throw new Error(job.errorMessage);
}
await new Promise(resolve => setTimeout(resolve, 3000));
}
};
const finalResult = await monitorJob(result.job.id);
```
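For long-running jobs, the fixed-interval loop above can be hardened with exponential backoff and a timeout. A sketch; `getStatus` abstracts `client.getJobStatus` so the helper stays SDK-agnostic, and the option names are illustrative:

```javascript
// Poll a job with exponential backoff until it completes, fails, or times out.
async function pollJob(getStatus, jobId, options = {}) {
  const {
    timeoutMs = 10 * 60 * 1000, // give up after 10 minutes
    initialDelayMs = 1000,
    maxDelayMs = 15000
  } = options;
  const deadline = Date.now() + timeoutMs;
  let delay = initialDelayMs;
  while (Date.now() < deadline) {
    const job = await getStatus(jobId);
    if (job.status === 'completed') return job.result;
    if (job.status === 'failed') throw new Error(job.errorMessage || 'Job failed');
    await new Promise((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 2, maxDelayMs); // back off between polls
  }
  throw new Error(`Job ${jobId} did not finish within ${timeoutMs}ms`);
}
```

Usage: `await pollJob((id) => client.getJobStatus(id), result.job.id)`.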
## Browser Support
The SDK works in modern browsers and relies on the following features:
- **File Upload**: Native File API support
- **Progress Tracking**: XMLHttpRequest progress events
- **Async/Await**: ES2017+ support required
- **WebSocket**: For real-time features (optional)
For older browsers, use appropriate polyfills.
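The requirements above can be checked at runtime before initializing the client. A hypothetical detection helper (not part of the SDK); `env` stands in for the browser's `window` object so the check is testable anywhere:

```javascript
// Report which SDK-relevant capabilities an environment provides.
function detectBrowserFeatures(env) {
  return {
    fileUpload: typeof env.File !== 'undefined' && typeof env.FormData !== 'undefined',
    progressEvents: typeof env.XMLHttpRequest !== 'undefined',
    webSocket: typeof env.WebSocket !== 'undefined'
  };
}
```

In a browser you would call `detectBrowserFeatures(window)` and load polyfills for any capability reported as `false`.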
## Node.js Support
- **Node.js 16+**: Required for modern JavaScript features
- **File System**: Native fs module support
- **Streams**: Node.js streams for large files
- **WebSocket**: ws package for real-time features
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
## License
MIT License - see [LICENSE](LICENSE) file for details.
## Links
- [AudioPod Website](https://audiopod.ai)
- [API Documentation](https://docs.audiopod.ai/)
- [GitHub Repository](https://github.com/AudiopodAI/audiopod)
- [npm Package](https://www.npmjs.com/package/audiopod-sdk)
## Support
- 📧 Email: [support@audiopod.ai](mailto:support@audiopod.ai)
- 💬 Discord: [AudioPod Community](https://discord.gg/audiopod)