@snehal96/unimail

Unified email fetching & document extraction layer for modern web apps

# Unimail

> 📬 Unified Node.js SDK to fetch and parse emails from Gmail, Outlook, and IMAP servers.

**Status: Beta Version**

Unimail is an open-source Node.js library that provides a standardized way to fetch emails, parse attachments, and normalize metadata across multiple email providers. With its unified API, you can seamlessly integrate with Gmail, Outlook/Microsoft 365, and IMAP servers using the same interface.

---

## Features

* **🔧 Unified Interface:** Consistent `fetchEmails()` API across all email providers
* **📧 Multiple Provider Support:**
    * **Gmail Integration:** Full OAuth2 support with Gmail API
    * **Outlook/Microsoft 365 Integration:** Complete integration using Microsoft Graph API
    * **IMAP Support:** Direct IMAP server connection for Yahoo, custom mail servers
* **🔐 Integrated OAuth Flow:** Built-in OAuth authentication with browser-based flows
* **📎 Advanced Attachment Handling:**
    * Extract attachments as Buffers with metadata (filename, MIME type, size)
    * Automatic inline image filtering based on content IDs
    * Support for large attachment processing
* **🏷️ Email Normalization:**
    * Standardized schema across all providers
    * Gmail labels and Outlook categories unified as `labels` field
    * Consistent date, sender, recipient handling
* **📄 Enhanced Pagination Support:**
    * **Cross-Provider Compatibility:** Identical pagination API across Gmail, Outlook, and IMAP
    * Complete pagination with `nextPageToken` for handling large email volumes
    * New `PaginationHelper` utility for easy navigation and state management
    * Automatic all-pages fetching with `getAllPages` option
    * Rich pagination metadata (current page, total pages, navigation flags)
    * Async iterators for memory-efficient processing of large datasets
    * Format optimization for performance across all providers
* **🔍 Advanced Search & Filtering:**
    * Date range queries (`since`, `before`)
    * Provider-specific search queries
    * Unread-only filtering
    * Custom label/category filtering
* **⚡ Performance
Options:**
    * Multiple fetch formats (`raw`, `full`, `metadata`)
    * Configurable inclusion of body content and attachments
    * Batch processing capabilities
* **🛡️ TypeScript Support:** Full type definitions for better development experience
* **🔌 Extensible Architecture:** Plugin-based adapter system for new providers

---

## Installation

```bash
npm install @snehal96/unimail
# or
yarn add @snehal96/unimail
```

---

## Quick Start Guide

### Gmail Integration

#### Prerequisites

1. **Create a Google Cloud Platform (GCP) Project:**
    * Enable the **Gmail API**
    * Create **OAuth 2.0 Client ID** credentials
    * Add `http://localhost:3000/oauth/callback` to "Authorized redirect URIs"

2. **Environment Variables:**

    ```env
    GOOGLE_CLIENT_ID=your_google_client_id
    GOOGLE_CLIENT_SECRET=your_google_client_secret
    GOOGLE_REFRESH_TOKEN=your_refresh_token_after_oauth
    GOOGLE_REDIRECT_URI=http://localhost:3000/oauth/callback
    ```

#### Method 1: Using Built-in OAuth Flow (Recommended)

```typescript
import { GmailAdapter } from '@snehal96/unimail';
import dotenv from 'dotenv';

dotenv.config();

async function authenticateGmail() {
  const { GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, GOOGLE_REDIRECT_URI } = process.env;

  // Start OAuth flow - opens browser automatically
  await GmailAdapter.startOAuthFlow(
    GOOGLE_CLIENT_ID!,
    GOOGLE_CLIENT_SECRET!,
    GOOGLE_REDIRECT_URI!
  );

  // Save the displayed refresh token to your .env file
}

authenticateGmail();
```

#### Method 2: Using Existing Refresh Token

```typescript
import { GmailAdapter, FetchOptions } from '@snehal96/unimail';
import dotenv from 'dotenv';

dotenv.config();

async function fetchGmailEmails() {
  const gmailAdapter = new GmailAdapter();

  await gmailAdapter.initialize({
    clientId: process.env.GOOGLE_CLIENT_ID!,
    clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
    refreshToken: process.env.GOOGLE_REFRESH_TOKEN!,
  });

  const { emails, nextPageToken, totalCount } = await gmailAdapter.fetchEmails({
    limit: 20,
    query: 'has:attachment',
    since: '2024-01-01',
    includeBody: true,
    includeAttachments: true,
  });

  console.log(`Fetched ${emails.length} emails`);
  if (nextPageToken) {
    console.log('More emails available');
  }

  emails.forEach(email => {
    console.log(`Subject: ${email.subject}`);
    console.log(`From: ${email.from}`);
    console.log(`Labels: ${email.labels?.join(', ')}`);
    console.log(`Attachments: ${email.attachments.length}`);
  });
}

fetchGmailEmails();
```

### Outlook/Microsoft 365 Integration

#### Prerequisites

1. **Register Application in Azure Portal:**
    * Go to [Azure Portal > App Registrations](https://portal.azure.com/#blade/Microsoft_AAD_RegisteredApps/ApplicationsListBlade)
    * Create new registration with redirect URI: `http://localhost:3000/oauth/oauth2callback`
    * Add API permissions: `Mail.Read`, `offline_access`, `User.Read`
    * Create client secret

2.
**Environment Variables:**

```env
MICROSOFT_CLIENT_ID=your_client_id
MICROSOFT_CLIENT_SECRET=your_client_secret
MICROSOFT_REFRESH_TOKEN=your_refresh_token_after_oauth
MICROSOFT_REDIRECT_URI=http://localhost:3000/oauth/oauth2callback
MICROSOFT_TENANT_ID=optional_tenant_id
```

#### OAuth Flow Example

```typescript
import { OutlookAdapter } from '@snehal96/unimail';
import dotenv from 'dotenv';

dotenv.config();

async function authenticateOutlook() {
  const {
    MICROSOFT_CLIENT_ID,
    MICROSOFT_CLIENT_SECRET,
    MICROSOFT_REDIRECT_URI,
    MICROSOFT_TENANT_ID
  } = process.env;

  // Start OAuth flow
  await OutlookAdapter.startOAuthFlow(
    MICROSOFT_CLIENT_ID!,
    MICROSOFT_CLIENT_SECRET!,
    MICROSOFT_REDIRECT_URI!,
    MICROSOFT_TENANT_ID // Optional
  );
}

authenticateOutlook();
```

#### Fetching Outlook Emails

```typescript
import { OutlookAdapter, FetchOptions } from '@snehal96/unimail';
import dotenv from 'dotenv';

dotenv.config();

async function fetchOutlookEmails() {
  const outlookAdapter = new OutlookAdapter();

  await outlookAdapter.initialize({
    clientId: process.env.MICROSOFT_CLIENT_ID!,
    clientSecret: process.env.MICROSOFT_CLIENT_SECRET!,
    refreshToken: process.env.MICROSOFT_REFRESH_TOKEN!,
  });

  const { emails } = await outlookAdapter.fetchEmails({
    limit: 15,
    includeBody: true,
    includeAttachments: true,
    since: new Date('2024-01-01'),
  });

  emails.forEach(email => {
    console.log(`Subject: ${email.subject || '(No subject)'}`);
    console.log(`From: ${email.from}`);
    console.log(`Categories: ${email.labels?.join(', ') || 'None'}`);
    console.log(`Date: ${email.date.toLocaleString()}`);
  });
}

fetchOutlookEmails();
```

### IMAP Integration

Suitable for Yahoo Mail, custom mail servers, and other IMAP-compatible providers:

```typescript
import { ImapAdapter } from '@snehal96/unimail';

async function fetchImapEmails() {
  const imapAdapter = new ImapAdapter({
    host: 'imap.mail.yahoo.com',
    port: 993,
    secure: true,
    auth: {
      user: 'your-email@yahoo.com',
      pass: 'your-app-password', // Use app-specific password
    },
  });

  const emails = await imapAdapter.fetchEmails({
    since: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000), // Last 7 days
    limit: 10,
    mailbox: 'INBOX'
  });

  emails.forEach(email => {
    console.log(`Subject: ${email.subject}`);
    console.log(`From: ${email.from}`);
    console.log(`Attachments: ${email.attachments.length}`);
  });
}

fetchImapEmails();
```

---

## Advanced Usage Examples

### Working with Gmail Labels

```typescript
import { GmailAdapter } from '@snehal96/unimail';

async function workWithLabels() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  // Search by specific labels
  const { emails } = await gmailAdapter.fetchEmails({
    query: 'label:important label:inbox',
    limit: 20
  });

  // Group emails by label
  const emailsByLabel = new Map();
  emails.forEach(email => {
    email.labels?.forEach(label => {
      if (!emailsByLabel.has(label)) {
        emailsByLabel.set(label, []);
      }
      emailsByLabel.get(label).push(email);
    });
  });

  // Find emails with multiple specific labels
  const importantInboxEmails = emails.filter(email =>
    email.labels?.includes('INBOX') && email.labels?.includes('IMPORTANT')
  );

  console.log(`Found ${importantInboxEmails.length} important inbox emails`);
}
```

### Working with Outlook Categories

```typescript
import { OutlookAdapter } from '@snehal96/unimail';

async function workWithCategories() {
  const outlookAdapter = new OutlookAdapter();
  await outlookAdapter.initialize(credentials);

  const { emails } = await outlookAdapter.fetchEmails({
    limit: 30,
    includeBody: false, // Faster fetching
  });

  // Group by categories (normalized as labels)
  const emailsByCategory = new Map();
  emails.forEach(email => {
    const categories = email.labels || ['UNCATEGORIZED'];
    categories.forEach(category => {
      if (!emailsByCategory.has(category)) {
        emailsByCategory.set(category, []);
      }
      emailsByCategory.get(category).push(email);
    });
  });

  // Display category distribution
  for (const [category, categoryEmails] of emailsByCategory.entries()) {
    console.log(`${category}: ${categoryEmails.length} emails`);
  }
}
```

### Advanced Pagination

Unimail provides comprehensive pagination support with multiple approaches for different use cases. **All pagination features work identically across Gmail, Outlook, and IMAP adapters**, ensuring consistent behavior regardless of email provider.

#### Cross-Provider Compatibility

All pagination examples below work with any adapter (Gmail, Outlook, IMAP) - simply replace `GmailAdapter` with `OutlookAdapter` or `ImapAdapter` as needed. The API remains identical across all providers.

#### 1. Basic Manual Pagination

```typescript
async function basicPagination() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  let pageToken = undefined;
  let pageNumber = 1;

  do {
    const { emails, nextPageToken, totalCount } = await gmailAdapter.fetchEmails({
      pageSize: 20,        // Emails per page
      pageToken,           // Token for current page
      query: 'has:attachment',
      includeBody: false,  // Faster processing
    });

    console.log(`Page ${pageNumber}: ${emails.length} emails`);
    if (totalCount) {
      console.log(`Total available: ~${totalCount} emails`);
    }

    pageToken = nextPageToken;
    pageNumber++;
  } while (pageToken);
}
```

#### 2. Using PaginationHelper (Recommended)

```typescript
import { GmailAdapter, createPaginationHelper } from '@snehal96/unimail';

async function paginationHelperExample() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  // Create pagination helper
  const paginationHelper = createPaginationHelper(gmailAdapter, {
    pageSize: 15,
    query: 'label:inbox',
    includeBody: false,
    includeAttachments: false,
  });

  // Navigate through pages
  const page1 = await paginationHelper.fetchCurrentPage();
  console.log(`Page 1: ${page1.data.length} emails`);
  console.log(`Has next page: ${page1.pagination.hasNextPage}`);

  if (page1.pagination.hasNextPage) {
    const page2 = await paginationHelper.fetchNextPage();
    console.log(`Page 2: ${page2?.data.length} emails`);

    // Go back to previous page
    if (page2?.pagination.hasPreviousPage) {
      const backToPage1 = await paginationHelper.fetchPreviousPage();
      console.log(`Back to Page 1: ${backToPage1?.data.length} emails`);
    }
  }
}
```

#### 3. Automatic All-Pages Fetching

```typescript
async function fetchAllPages() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  const { emails, totalCount } = await gmailAdapter.fetchEmails({
    limit: 100,          // Maximum total emails to fetch
    pageSize: 25,        // Emails per API call
    getAllPages: true,   // Automatically fetch all pages
    query: 'has:attachment',
    includeBody: false,
  });

  console.log(`Fetched ${emails.length} emails total across all pages`);
  console.log(`Total available: ~${totalCount} emails`);
}
```

#### 4. Async Iterator for Large Datasets

```typescript
async function processLargeDataset() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  const paginationHelper = createPaginationHelper(gmailAdapter, {
    pageSize: 50,
    query: 'label:inbox',
    includeBody: false,
  });

  let totalProcessed = 0;

  // Process each page as it's fetched
  for await (const page of paginationHelper.iterateAllPages()) {
    console.log(`Processing page ${page.pagination.currentPage}: ${page.data.length} emails`);

    // Process emails in this page
    page.data.forEach(email => {
      // Your processing logic here
      totalProcessed++;
    });
  }

  console.log(`Processed ${totalProcessed} emails total`);
}
```

#### 5. Building Paginated API Responses

```typescript
import { GmailAdapter, PaginationUtils } from '@snehal96/unimail';

async function buildPaginatedAPI(request: {
  page?: number;
  pageSize?: number;
  query?: string;
  pageToken?: string;
}) {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  const { emails, nextPageToken, totalCount } = await gmailAdapter.fetchEmails({
    pageSize: request.pageSize || 20,
    pageToken: request.pageToken,
    query: request.query || '',
    includeBody: true,
    includeAttachments: false,
  });

  // Calculate pagination metadata
  const currentPage = request.page || 1;
  const pagination = PaginationUtils.calculatePaginationMetadata(
    currentPage,
    request.pageSize || 20,
    totalCount,
    !!nextPageToken,
    currentPage > 1
  );

  return {
    status: 'success',
    data: emails,
    pagination: {
      ...pagination,
      nextPageToken,
    },
    metadata: {
      query: request.query,
      fetchTime: new Date().toISOString(),
      totalFetched: emails.length,
    },
  };
}
```

#### Pagination Options

| Option | Description | Default |
|--------|-------------|---------|
| `pageSize` | Number of emails per page | 20 |
| `pageToken` | Token for fetching specific page | `undefined` |
| `limit` | Maximum total emails to fetch | No limit |
| `getAllPages` | Automatically fetch all pages up to limit | `false` |

#### Pagination Response

```typescript
interface PaginationMetadata {
  currentPage: number;
  pageSize: number;
  totalCount?: number;
  estimatedTotalPages?: number;
  nextPageToken?: string;
  previousPageToken?: string;
  hasNextPage: boolean;
  hasPreviousPage: boolean;
  isFirstPage: boolean;
  isLastPage: boolean;
}
```

### Date Range Queries

```typescript
async function dateRangeExample() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  // Method 1: Using since/before parameters
  const { emails: recentEmails } = await gmailAdapter.fetchEmails({
    since: '2024-01-01',
    before: '2024-03-31',
    limit: 100
  });

  // Method 2: Using Gmail search syntax
  const { emails: searchEmails } = await gmailAdapter.fetchEmails({
    query: 'after:2024/01/01 before:2024/03/31 has:attachment',
    limit: 100
  });

  // Method 3: Using Date objects
  const { emails: dateEmails } = await gmailAdapter.fetchEmails({
    since: new Date('2024-01-01'),
    before: new Date('2024-03-31'),
    limit: 100
  });
}
```

### Performance Optimization

```typescript
async function performanceExample() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  // Metadata only - Fastest for listing
  const { emails: metadataOnly } = await gmailAdapter.fetchEmails({
    format: 'metadata',
    limit: 100,
    includeBody: false,
    includeAttachments: false
  });

  // Full format - Structured but faster than raw
  const { emails: fullFormat } = await gmailAdapter.fetchEmails({
    format: 'full',
    limit: 20,
    includeBody: true,
    includeAttachments: false
  });

  // Raw format - Most complete but slower
  const { emails: rawFormat } = await gmailAdapter.fetchEmails({
    format: 'raw',
    limit: 10,
    includeBody: true,
    includeAttachments: true
  });
}
```

### Attachment Processing

```typescript
import * as fs from 'fs';
import * as path from 'path';

async function processAttachments() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  const { emails } = await gmailAdapter.fetchEmails({
    query: 'has:attachment filename:pdf',
    limit: 10,
    includeAttachments: true
  });

  const attachmentsDir = './downloads';
  if (!fs.existsSync(attachmentsDir)) {
    fs.mkdirSync(attachmentsDir);
  }

  emails.forEach(email => {
    email.attachments.forEach(attachment => {
      if (attachment.buffer && attachment.mimeType === 'application/pdf') {
        // path.basename() strips any directory components from the sender-supplied filename
        const safeName = path.basename(attachment.filename);
        const filePath = path.join(attachmentsDir, `${email.id}_${safeName}`);
        fs.writeFileSync(filePath, attachment.buffer);
        console.log(`Saved PDF: ${filePath}`);
      }
    });
  });
}
```

---

## 🚀 Email Streaming (Recommended for Large Datasets)

### ⚠️ Memory Issues with Traditional Approach

```typescript
// ❌ This can crash your application with large datasets
const { emails } = await gmailAdapter.fetchEmails({
  limit: 10000,        // 😱 10,000 emails loaded into memory
  getAllPages: true,   // 😱 loads everything at once
  includeAttachments: true
});

// Problems:
// - 2GB+ memory usage for 10k emails
// - Potential out-of-memory crashes
// - No progress feedback
// - All-or-nothing processing
```

### ✅ Streaming Solutions

Unimail's streaming functionality processes emails in small batches, so memory usage stays roughly **constant** regardless of dataset size.

#### Method 1: Simple Streaming with AsyncIterator (Recommended)

```typescript
async function processEmailsStream() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  let totalProcessed = 0;

  // ✅ Process emails in batches of 50 - only ~10MB memory usage
  for await (const emailBatch of gmailAdapter.streamEmails({
    batchSize: 50,
    query: 'has:attachment',
    maxEmails: 10000 // Optional limit
  })) {
    console.log(`Processing batch of ${emailBatch.length} emails...`);

    // Process each email in this batch
    for (const email of emailBatch) {
      await processEmail(email);
      totalProcessed++;
    }

    console.log(`Processed ${totalProcessed} emails so far...`);
    // Memory automatically freed after each batch
  }

  console.log(`✅ Completed! Processed ${totalProcessed} emails total`);
}
```

#### Method 2: Progress Tracking with Callbacks

```typescript
async function processWithProgress() {
  const gmailAdapter = new GmailAdapter();
  await gmailAdapter.initialize(credentials);

  await gmailAdapter.fetchEmailsStream({
    batchSize: 30,
    query: 'is:unread',
    includeBody: false // Faster processing
  }, {
    onBatch: async (emails, progress) => {
      console.log(`Batch ${progress.batchCount}: ${emails.length} emails`);
      // Only print a percentage when the total is known, to avoid "NaN%"
      if (progress.total) {
        const percent = Math.round((progress.current / progress.total) * 100);
        console.log(`Progress: ${progress.current}/${progress.total} (${percent}%)`);
      }

      // Process this batch
      for (const email of emails) {
        await processEmail(email);
      }
    },

    onProgress: async (progress) => {
      // Real-time progress updates
      if (progress.current % 100 === 0) {
        console.log(`📊 Processed ${progress.current} emails so far...`);
      }
    },

    onError: async (error, progress) => {
      console.error(`❌ Error at batch ${progress.batchCount}:`, error);
      // Implement retry logic or continue with next batch
    },

    onComplete: async (summary) => {
      console.log(`✅ Completed! Processed ${summary.totalProcessed} emails in ${summary.duration}ms`);
      console.log(`Average rate: ${(summary.totalProcessed / (summary.duration / 1000)).toFixed(1)} emails/sec`);
    }
  });
}
```

### Real-World Integration Patterns

#### Database Batch Operations

```typescript
async function syncEmailsToDatabase() {
  const emailsToInsert = [];
  let totalSynced = 0;

  for await (const emailBatch of gmailAdapter.streamEmails({ batchSize: 50 })) {
    // Accumulate emails for bulk insert
    emailsToInsert.push(...emailBatch.map(email => ({
      id: email.id,
      subject: email.subject,
      from: email.from,
      date: email.date
    })));

    // Bulk insert every 100 emails
    if (emailsToInsert.length >= 100) {
      await db.emails.insertMany(emailsToInsert);
      totalSynced += emailsToInsert.length;
      console.log(`Synced ${totalSynced} emails to database`);
      emailsToInsert.length = 0; // Clear memory
    }
  }

  // Insert remaining emails
  if (emailsToInsert.length > 0) {
    await db.emails.insertMany(emailsToInsert);
    totalSynced += emailsToInsert.length;
  }

  console.log(`Total synced: ${totalSynced} emails`);
}
```

#### Express.js API with Real-time Updates

```typescript
app.post('/api/sync-emails', async (req, res) => {
  // Set up Server-Sent Events for real-time progress
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });

  let totalProcessed = 0;

  try {
    for await (const emailBatch of gmailAdapter.streamEmails({ batchSize: 25 })) {
      // Process batch
      for (const email of emailBatch) {
        await processEmailForUser(email, req.body.userId);
        totalProcessed++;
      }

      // Send progress update to frontend
      res.write(`data: ${JSON.stringify({
        type: 'progress',
        processed: totalProcessed,
        batch: emailBatch.length
      })}\n\n`);
    }

    // Send completion
    res.write(`data: ${JSON.stringify({ type: 'complete', total: totalProcessed })}\n\n`);
  } catch (error) {
    // In strict TypeScript the caught value is `unknown`, so narrow it first
    const message = error instanceof Error ? error.message : String(error);
    res.write(`data: ${JSON.stringify({ type: 'error', message })}\n\n`);
  } finally {
    res.end();
  }
});
```

### Memory Usage Comparison

| Dataset Size | Traditional Approach | Streaming Approach |
|--------------|---------------------|-------------------|
| **1,000 emails** | ~200MB memory | ~10MB memory ✅ |
| **10,000 emails** | ~2GB memory ⚠️ | ~10MB memory ✅ |
| **100,000 emails** | ~20GB memory 💥 | ~10MB memory ✅ |

### Configuration Options

```typescript
interface EmailStreamOptions {
  batchSize?: number;           // Emails per batch (default: 50)
  maxEmails?: number;           // Max total emails to process
  query?: string;               // Provider-specific search
  since?: Date | string;        // Emails after this date
  before?: Date | string;       // Emails before this date
  includeBody?: boolean;        // Include email body (default: true)
  includeAttachments?: boolean; // Include attachments (default: true)
  format?: 'raw' | 'full' | 'metadata'; // Fetch format
  pageToken?: string;           // Resume from specific point
}
```

### Performance Recommendations

```typescript
// ⚡ For metadata only (fastest)
{
  batchSize: 100,
  includeBody: false,
  includeAttachments: false,
  format: 'metadata'
}

// ⚖️ Balanced performance
{
  batchSize: 50,
  includeBody: true,
  includeAttachments: false,
  format: 'full'
}

// 🔍 Complete data (slower)
{
  batchSize: 25,
  includeBody: true,
  includeAttachments: true,
  format: 'raw'
}
```

### Migration Guide

#### Before (Memory Problems)

```typescript
// ❌ Can cause out-of-memory errors
const { emails } = await adapter.fetchEmails({
  limit: 10000,
  getAllPages: true
});

for (const email of emails) {
  await processEmail(email);
}
```

#### After (Memory Efficient)

```typescript
// ✅ Constant memory usage, real-time progress
for await (const emailBatch of adapter.streamEmails({
  batchSize: 50,
  maxEmails: 10000
})) {
  for (const email of emailBatch) {
    await processEmail(email);
  }
}
```

### Error Handling & Recovery

```typescript
// getLastCheckpoint() and saveCheckpoint() are application-defined helpers.
// getLastCheckpoint() must return a value that is usable as a pageToken.
async function processWithRecovery() {
  let lastProcessedId: string | undefined;

  try {
    for await (const emailBatch of gmailAdapter.streamEmails({
      batchSize: 50,
      pageToken: getLastCheckpoint() // Resume from saved position
    })) {
      try {
        for (const email of emailBatch) {
          await processEmail(email);
          lastProcessedId = email.id;
          await saveCheckpoint(email.id); // Save progress
        }
      } catch (batchError) {
        console.error('Batch failed:', batchError);
        // Continue with next batch or implement retry logic
      }
    }
  } catch (error) {
    console.error('Stream failed, can resume from:', lastProcessedId);
    // Save checkpoint for resumption
  }
}
```

### Key Benefits

- ✅ **Memory Efficient** - Constant ~10MB usage regardless of dataset size
- ✅ **Real-time Progress** - Live updates and percentage completion
- ✅ **Error Resilient** - Continue processing after failures
- ✅ **Scalable** - Handle millions of emails without memory issues
- ✅ **Flexible** - Multiple integration patterns for different use cases

> **💡 Tip:** For large datasets (>1,000 emails), always use streaming methods instead of `getAllPages: true` to avoid memory issues.

---

## 🔄 Gmail Sync & Real-time Updates

### Overview

Gmail sync capabilities enable real-time synchronization with Gmail accounts using the Gmail History API and Push Notifications. This allows you to:

- **Track Changes**: Monitor additions, deletions, and label changes in real-time
- **Efficient Sync**: Only fetch changes since the last sync, not all emails
- **Push Notifications**: Receive instant webhooks when changes occur
- **Persistent State**: Maintain sync state across application restarts

### Quick Start

```typescript
import { GmailAdapter } from '@snehal96/unimail';

const gmailAdapter = new GmailAdapter();
await gmailAdapter.initialize(credentials);

// 1. Get starting point for sync
const historyId = await gmailAdapter.getCurrentHistoryId();
console.log(`Starting sync from history ID: ${historyId}`);

// 2. Later, check for changes
const syncResult = await gmailAdapter.processSync({
  startHistoryId: historyId,
  maxResults: 50
});

console.log(`Found ${syncResult.addedEmails.length} new emails`);
console.log(`${syncResult.deletedEmailIds.length} emails deleted`);
console.log(`${syncResult.updatedEmails.length} emails updated`);
```

### Core Sync Methods

#### `getCurrentHistoryId(): Promise<string>`

Get the current history ID to start tracking changes.

```typescript
const historyId = await gmailAdapter.getCurrentHistoryId();
// Store this ID to track future changes
```

#### `getHistory(startHistoryId: string, options?: SyncOptions): Promise<HistoryResponse>`

Get raw history records since a specific history ID.

```typescript
const historyResponse = await gmailAdapter.getHistory(startHistoryId, {
  maxResults: 100,
  labelIds: ['INBOX'],
  includeDeleted: true
});

// Process individual history records
for (const record of historyResponse.history) {
  if (record.messagesAdded) {
    console.log(`${record.messagesAdded.length} messages added`);
  }
  if (record.messagesDeleted) {
    console.log(`${record.messagesDeleted.length} messages deleted`);
  }
}
```

#### `getEmailById(id: string): Promise<NormalizedEmail | null>`

Fetch a specific email by ID (useful for processing history records).

```typescript
const email = await gmailAdapter.getEmailById('18c2e1b2d4f5a3b1');
if (email) {
  console.log(`Email: ${email.subject} from ${email.from}`);
}
```

#### `processSync(options: SyncOptions): Promise<SyncResult>`

High-level method that processes history and returns structured results.
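
As a rough illustration of what "structured results" means in practice, the sketch below folds one sync result into an in-memory copy of the mailbox. It assumes only the result fields that appear in this README (`addedEmails`, `updatedEmails`, `deletedEmailIds`, `newHistoryId`); the `SyncedEmail`, `SyncResultLike`, and `LocalState` types and the `applySyncResult` function are hypothetical names introduced here and are not part of the Unimail API.

```typescript
// Hypothetical, simplified shapes modelled on the fields used in this README.
// The library's real SyncResult and NormalizedEmail types may carry more fields.
interface SyncedEmail {
  id: string;
  subject?: string;
  labels?: string[];
}

interface SyncResultLike {
  addedEmails: SyncedEmail[];
  updatedEmails: SyncedEmail[];
  deletedEmailIds: string[];
  newHistoryId: string;
}

interface LocalState {
  historyId: string;
  emails: Map<string, SyncedEmail>;
}

// Returns a new state object; the input state is left untouched.
function applySyncResult(state: LocalState, result: SyncResultLike): LocalState {
  const emails = new Map(state.emails);

  // 1. Store newly added emails.
  for (const email of result.addedEmails) {
    emails.set(email.id, email);
  }

  // 2. Updates are treated as label changes: keep stored fields, replace labels.
  for (const email of result.updatedEmails) {
    const existing = emails.get(email.id);
    if (existing) {
      emails.set(email.id, { ...existing, labels: email.labels });
    }
  }

  // 3. Apply deletions last so a message that was added and then deleted
  //    within the same history window does not linger locally.
  for (const id of result.deletedEmailIds) {
    emails.delete(id);
  }

  // 4. Advance the checkpoint only after every change has been applied.
  return { historyId: result.newHistoryId, emails };
}
```

Returning a new state rather than mutating the old one means a failure partway through leaves the previous checkpoint intact, so the same history window can be retried.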
```typescript
const syncResult = await gmailAdapter.processSync({
  startHistoryId: lastKnownHistoryId,
  maxResults: 50,
  includeDeleted: true
});

// Process new emails
for (const email of syncResult.addedEmails) {
  await processNewEmail(email);
}

// Handle deletions
for (const deletedId of syncResult.deletedEmailIds) {
  await removeEmailFromDatabase(deletedId);
}

// Handle updates (label changes)
for (const email of syncResult.updatedEmails) {
  await updateEmailLabels(email.id, email.labels);
}

// Update your sync state
lastKnownHistoryId = syncResult.newHistoryId;
```

### Push Notifications

Set up real-time push notifications to receive instant updates when changes occur.

#### Prerequisites

1. **Google Cloud Pub/Sub Topic**: Create a topic in Google Cloud Console
2. **Service Account**: Grant the Gmail API service account (`gmail-api-push@system.gserviceaccount.com`) permission to publish to the topic
3. **Webhook Endpoint**: Set up an HTTPS endpoint to receive notifications

#### Setup Push Notifications

```typescript
const pushSetup = await gmailAdapter.setupPushNotifications({
  topicName: 'projects/your-project-id/topics/gmail-push',
  webhookUrl: 'https://your-domain.com/webhook/gmail',
  labelIds: ['INBOX'], // Optional: only watch specific labels
  labelFilterAction: 'include'
});

console.log(`Push notifications active until: ${new Date(pushSetup.expiration * 1000)}`);
```

#### Webhook Handler

```typescript
import express from 'express';

const app = express();

// Parse JSON request bodies; without this, req.body is undefined
app.use(express.json());

app.post('/webhook/gmail', async (req, res) => {
  try {
    // Parse Pub/Sub message
    const message = req.body.message;
    const data = JSON.parse(Buffer.from(message.data, 'base64').toString());

    const { emailAddress, historyId } = data;
    console.log(`Changes detected for ${emailAddress} since ${historyId}`);

    // Trigger sync for this user
    await performSyncForUser(emailAddress, historyId);

    res.status(200).send('OK');
  } catch (error) {
    console.error('Webhook error:', error);
    res.status(500).send('Error');
  }
});
```

### Real-World Sync Patterns

#### Continuous Background Sync

```typescript
class GmailSyncService {
  private syncState: Map<string, string> = new Map(); // userId -> historyId

  // `database` is a placeholder for your own MongoDB-style client
  constructor(private database: any) {}

  async startContinuousSync(userId: string, gmailAdapter: GmailAdapter) {
    // Initialize sync state
    if (!this.syncState.has(userId)) {
      const historyId = await gmailAdapter.getCurrentHistoryId();
      this.syncState.set(userId, historyId);
    }

    // Sync loop
    while (true) {
      try {
        const lastHistoryId = this.syncState.get(userId)!;
        const syncResult = await gmailAdapter.processSync({
          startHistoryId: lastHistoryId,
          maxResults: 50
        });

        // Process changes
        await this.processChanges(userId, syncResult);

        // Update state
        this.syncState.set(userId, syncResult.newHistoryId);

        // Wait before next sync if no more changes
        if (!syncResult.hasMoreChanges) {
          await this.sleep(30000); // 30 seconds
        }
      } catch (error) {
        console.error(`Sync error for user ${userId}:`, error);
        await this.sleep(60000); // Wait 1 minute on error
      }
    }
  }

  private async processChanges(userId: string, syncResult: SyncResult) {
    // Save new emails to database
    if (syncResult.addedEmails.length > 0) {
      await this.database.emails.insertMany(
        syncResult.addedEmails.map(email => ({
          userId,
          gmailId: email.id,
          subject: email.subject,
          from: email.from,
          date: email.date,
          labels: email.labels
        }))
      );
    }

    // Remove deleted emails
    if (syncResult.deletedEmailIds.length > 0) {
      await this.database.emails.deleteMany({
        userId,
        gmailId: { $in: syncResult.deletedEmailIds }
      });
    }

    // Update labels for changed emails
    for (const email of syncResult.updatedEmails) {
      await this.database.emails.updateOne(
        { userId, gmailId: email.id },
        { $set: { labels: email.labels } }
      );
    }
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
```

#### Express.js Integration with Real-time Updates

```typescript
import express from 'express';
import { GmailAdapter } from '@snehal96/unimail';

const app = express();

// Server-Sent Events for real-time sync updates
app.get('/api/sync-status/:userId', async (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });

  const userId = req.params.userId;
  const gmailAdapter = await getGmailAdapterForUser(userId);

  // Start sync process
  const syncInterval = setInterval(async () => {
    try {
      const lastHistoryId = await getLastHistoryId(userId);
      const syncResult = await gmailAdapter.processSync({
        startHistoryId: lastHistoryId
      });

      // Send real-time updates
      res.write(`data: ${JSON.stringify({
        type: 'sync_complete',
        addedEmails: syncResult.addedEmails.length,
        deletedEmails: syncResult.deletedEmailIds.length,
        updatedEmails: syncResult.updatedEmails.length,
        timestamp: new Date()
      })}\n\n`);

      // Update stored history ID
      await saveHistoryId(userId, syncResult.newHistoryId);
    } catch (error) {
      res.write(`data: ${JSON.stringify({
        type: 'sync_error',
        error: error instanceof Error ? error.message : String(error),
        timestamp: new Date()
      })}\n\n`);
    }
  }, 30000); // Check every 30 seconds

  // Cleanup on client disconnect
  req.on('close', () => {
    clearInterval(syncInterval);
  });
});

// Manual sync trigger
app.post('/api/sync/:userId', async (req, res) => {
  try {
    const userId = req.params.userId;
    const gmailAdapter = await getGmailAdapterForUser(userId);
    const lastHistoryId = await getLastHistoryId(userId);

    const syncResult = await gmailAdapter.processSync({
      startHistoryId: lastHistoryId,
      maxResults: 100
    });

    await saveHistoryId(userId, syncResult.newHistoryId);

    res.json({
      success: true,
      addedEmails: syncResult.addedEmails.length,
      deletedEmails: syncResult.deletedEmailIds.length,
      updatedEmails: syncResult.updatedEmails.length,
      newHistoryId: syncResult.newHistoryId
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error instanceof Error ? error.message : String(error)
    });
  }
});
```

### Configuration Options

```typescript
interface SyncOptions {
  startHistoryId?: string;  // Start from specific history ID
  maxResults?: number;      // Max history records per request (default: 100)
  labelIds?: string[];      // Filter by specific labels
  includeDeleted?: boolean; // Include deleted messages (default: true)
}

interface PushNotificationConfig {
  topicName: string;        // Google Cloud Pub/Sub topic
  webhookUrl: string;       // Your webhook endpoint
  labelIds?: string[];      // Optional: only watch specific labels
  labelFilterAction?: 'include' | 'exclude'; // How to apply label filter
}
```

### Error Handling & Recovery

```typescript
// `attempt` counts rate-limit retries so the function cannot recurse forever
async function robustSync(gmailAdapter: GmailAdapter, lastHistoryId: string, attempt = 0) {
  try {
    return await gmailAdapter.processSync({
      startHistoryId: lastHistoryId
    });
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);

    // Handle expired history ID
    if (message.includes('too old or invalid')) {
      console.log('History ID expired, starting fresh sync...');
      const newHistoryId = await gmailAdapter.getCurrentHistoryId();

      // You might want to do a full re-sync here
      await performFullResync(gmailAdapter);

      return {
        processedHistoryRecords: 0,
        addedEmails: [],
        deletedEmailIds: [],
        updatedEmails: [],
        newHistoryId,
        hasMoreChanges: false
      };
    }

    // Handle rate limiting with exponential backoff, giving up after 5 retries
    if (message.includes('Rate limit') && attempt < 5) {
      const delayMs = 60000 * 2 ** attempt;
      console.log(`Rate limited, waiting ${delayMs / 1000}s before retry...`);
      await new Promise(resolve => setTimeout(resolve, delayMs));
      return robustSync(gmailAdapter, lastHistoryId, attempt + 1);
    }

    throw error;
  }
}
```

### Performance Considerations

- **History Retention**: Gmail history is retained for ~1 week. Store history IDs promptly
- **Rate Limits**: Gmail API has quotas. Implement exponential backoff for retries
- **Batch Processing**: Process sync results in batches to avoid overwhelming your database
- **Push Notifications**: Expire after 7 days.
Set up monitoring to renew automatically ### Key Benefits ✅ **Real-time Sync** - Get notified instantly when emails change ✅ **Efficient** - Only fetch changes, not all emails ✅ **Scalable** - Handle multiple users with separate sync states ✅ **Resilient** - Built-in error handling and recovery ✅ **Flexible** - Works with webhooks or polling patterns --- ## API Reference ### Common Interfaces #### `FetchOptions` ```typescript interface FetchOptions { limit?: number; // Max emails to fetch (default: 10) since?: Date | string; // Emails after this date before?: Date | string; // Emails before this date query?: string; // Provider-specific search includeBody?: boolean; // Include text/HTML body (default: true) includeAttachments?: boolean; // Include attachment buffers (default: true) unreadOnly?: boolean; // Only unread emails format?: 'raw' | 'full' | 'metadata'; // Fetch format pageToken?: string; // Pagination token pageSize?: number; // Items per page getAllPages?: boolean; // Auto-fetch all pages } ``` #### `NormalizedEmail` ```typescript interface NormalizedEmail { id: string; // Provider-specific ID threadId?: string; // Thread/conversation ID provider: 'gmail' | 'outlook' | 'imap' | 'unknown'; from: string; // Sender email to: string[]; // Recipients cc?: string[]; // CC recipients bcc?: string[]; // BCC recipients subject?: string; // Email subject date: Date; // Received date bodyText?: string; // Plain text content bodyHtml?: string; // HTML content attachments: Attachment[]; // Attachments array labels?: string[]; // Labels/categories raw?: any; // Raw provider response } ``` #### `Attachment` ```typescript interface Attachment { filename: string; // File name mimeType: string; // MIME type size: number; // Size in bytes buffer?: Buffer; // File content contentId?: string; // For inline attachments } ``` ### GmailAdapter #### Static Methods ```typescript // Start OAuth flow static async startOAuthFlow( clientId: string, clientSecret: string, redirectUri: 
string, port?: number, callbackPath?: string ): Promise<string> // Handle OAuth callback manually static async handleOAuthCallback( code: string, clientId: string, clientSecret: string, redirectUri: string ): Promise<{accessToken: string, refreshToken?: string}> ``` #### Instance Methods ```typescript // Initialize with credentials async initialize(credentials: GmailCredentials): Promise<void> // Authenticate (called automatically by fetchEmails) async authenticate(): Promise<void> // Fetch emails async fetchEmails(options: FetchOptions): Promise<PaginatedEmailsResponse> // Streaming methods streamEmails(options: EmailStreamOptions): AsyncGenerator<NormalizedEmail[], void, unknown> fetchEmailsStream(options: EmailStreamOptions, callbacks: EmailStreamCallbacks): Promise<void> // Sync capabilities getCurrentHistoryId(): Promise<string> getHistory(startHistoryId: string, options?: SyncOptions): Promise<HistoryResponse> getEmailById(id: string): Promise<NormalizedEmail | null> setupPushNotifications(config: PushNotificationConfig): Promise<PushNotificationSetup> stopPushNotifications(): Promise<void> processSync(options?: SyncOptions): Promise<SyncResult> ``` ### OutlookAdapter #### Static Methods ```typescript // Start OAuth flow static async startOAuthFlow( clientId: string, clientSecret: string, redirectUri: string, tenantId?: string, port?: number, callbackPath?: string ): Promise<string> // Handle OAuth callback manually static async handleOAuthCallback( code: string, clientId: string, clientSecret: string, redirectUri: string, tenantId?: string ): Promise<{accessToken: string, refreshToken?: string}> ``` #### Instance Methods ```typescript // Initialize with credentials async initialize(credentials: OutlookCredentials): Promise<void> // Authenticate (called automatically by fetchEmails) async authenticate(): Promise<void> // Fetch emails async fetchEmails(options: FetchOptions): Promise<PaginatedEmailsResponse> ``` ### ImapAdapter ```typescript // Constructor 
constructor(config: ImapFlowOptions)

// Fetch emails
async fetchEmails(options: {
  since?: Date;
  limit?: number;
  mailbox?: string;
}): Promise<NormalizedEmail[]>

// Close connection
async close(): Promise<void>
```

---

## OAuth Service (Advanced)

For more control over the OAuth flow:

```typescript
import { OAuthService, GoogleOAuthProvider, OutlookOAuthProvider } from '@snehal96/unimail';

// Google OAuth
const googleOAuth = new OAuthService(new GoogleOAuthProvider());
const authUrl = await googleOAuth.startOAuthFlow({
  clientId: 'your-client-id',
  clientSecret: 'your-client-secret',
  redirectUri: 'your-redirect-uri',
  scopes: ['https://mail.google.com/'],
  accessType: 'offline',
  prompt: 'consent'
});

// Microsoft OAuth
const outlookOAuth = new OAuthService(new OutlookOAuthProvider());
const outlookAuthUrl = await outlookOAuth.startOAuthFlow({
  clientId: 'your-client-id',
  clientSecret: 'your-client-secret',
  redirectUri: 'your-redirect-uri',
  scopes: ['Mail.Read', 'offline_access'],
  prompt: 'consent'
});
```

---

## Provider-Specific Features

### Gmail

- **System Labels:** `INBOX`, `SENT`, `DRAFT`, `SPAM`, `TRASH`, `IMPORTANT`
- **Category Labels:** `CATEGORY_PERSONAL`, `CATEGORY_SOCIAL`, `CATEGORY_PROMOTIONS`, `CATEGORY_UPDATES`, `CATEGORY_FORUMS`
- **Search Operators:** `has:attachment`, `filename:pdf`, `from:sender@example.com`, `label:important`
- **Advanced Queries:** `after:2024/01/01 before:2024/12/31 has:attachment filename:pdf`

### Outlook

- **Categories:** User-defined categories appear in the `labels` field
- **Search:** Text-based search across subject, body, and sender
- **Date Filtering:** Native support for `since` and `before` parameters
- **Graph API:** Full Microsoft Graph API features available

### IMAP

- **Mailbox Support:** Access different mailboxes (INBOX, Sent, Drafts, etc.)
- **Date-based Fetching:** Efficient server-side date filtering
- **Universal Compatibility:** Works with Yahoo, custom servers, and most email providers

---

## Error Handling

```typescript
try {
  const { emails } = await gmailAdapter.fetchEmails(options);
} catch (error) {
  if (error.message.includes('invalid_grant')) {
    console.error('Refresh token expired or revoked');
    // Re-run OAuth flow
  } else if (error.message.includes('quota')) {
    console.error('API quota exceeded');
    // Implement rate limiting
  } else {
    console.error('General error:', error.message);
  }
}
```

---

## Best Practices

### 1. Authentication Management

```typescript
// Store refresh tokens securely
// Implement token refresh handling
// Use environment variables for credentials
```

### 2. Performance Optimization

```typescript
// Use appropriate format for your needs
format: 'metadata',  // Fastest - headers only
format: 'full',      // Balanced - structured content
format: 'raw',       // Complete - full email parsing

// Disable unnecessary features
includeBody: false,         // Skip body parsing
includeAttachments: false,  // Skip attachment processing
```

### 3. Pagination for Large Datasets

```typescript
// Always implement pagination for production
const getAllEmails = async () => {
  let allEmails = [];
  let pageToken = undefined;

  do {
    const { emails, nextPageToken } = await adapter.fetchEmails({
      pageToken,
      limit: 100
    });

    allEmails.push(...emails);
    pageToken = nextPageToken;
  } while (pageToken);

  return allEmails;
};
```

### 4. Attachment Handling

```typescript
// Filter attachments by type and size
const processAttachments = (emails) => {
  emails.forEach(email => {
    const pdfs = email.attachments.filter(att =>
      att.mimeType === 'application/pdf' &&
      att.size < 10 * 1024 * 1024 // < 10MB
    );

    // Process filtered attachments
    pdfs.forEach(pdf => {
      if (pdf.buffer) {
        // Save or process PDF
      }
    });
  });
};
```

---

## Contributing

Contributions are welcome!
Please feel free to submit issues, fork the repository, and create pull requests.

### Development Setup

```bash
git clone https://github.com/snehal96/unimail.git
cd unimail
npm install
npm run build
npm test
```

---

## License

[MIT](LICENSE)

---

## Support

- 📧 **Issues:** [GitHub Issues](https://github.com/snehal96/unimail/issues)
- 📖 **Documentation:** [GitHub Repository](https://github.com/snehal96/unimail)
- 💬 **Discussions:** [GitHub Discussions](https://github.com/snehal96/unimail/discussions)
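
---

The Performance Considerations section recommends exponential backoff for retries when Gmail API quotas are hit, while the `robustSync` example waits a fixed 60 seconds and retries without a limit. The following is a minimal sketch of a capped exponential backoff wrapper, not part of Unimail itself: `backoffDelayMs`, `isRateLimitError`, and `withBackoff` are hypothetical helper names, and the error check simply mirrors the message matching this README already uses (`'Rate limit'`, `'quota'`), which may not cover every error the library raises.

```typescript
// Hypothetical helpers for illustration; none of these names come from Unimail.

/** Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

/** Assumption: rate-limit errors can be recognised by their message text. */
function isRateLimitError(error: unknown): boolean {
  return error instanceof Error && /rate limit|quota/i.test(error.message);
}

/**
 * Runs `operation`, retrying on rate-limit errors with exponential backoff.
 * Any other error, or a rate-limit error after `maxRetries` retries, is rethrown.
 */
async function withBackoff<T>(
  operation: () => Promise<T>,
  maxRetries = 5,
  baseMs = 1000,
  maxMs = 60000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (!isRateLimitError(error) || attempt >= maxRetries) {
        throw error;
      }
      const delay = backoffDelayMs(attempt, baseMs, maxMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A sync call could then be wrapped as `withBackoff(() => gmailAdapter.processSync({ startHistoryId: lastHistoryId }))`. Capping both the delay and the retry count keeps a persistent quota failure from retrying forever.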