@agility/cli

# Batch-Optimized Upload Strategy & Architecture Design **Phase 19 - Task 19.1.1 Implementation** **Status**: ✅ **FOUNDATION VALIDATED** - Ready for Implementation **Created**: Phase 19 Development Session **Validated**: Phase 19.5-19.6 - 100% entity reconciliation across 24,164+ entities from 3 customer instances --- ## 🎉 **VALIDATION SUCCESS SUMMARY** **Foundation Proven Ready:** - ✅ **24,164 entities validated** across 3 diverse customer production instances - ✅ **100% reconciliation rate** - zero broken chains across all instances - ✅ **Perfect dependency resolution** - every entity and its dependencies accounted for - ✅ **Modular service architecture** - 12 focused components ready for upload orchestration - ✅ **Debug tooling proven** - `--test` flag enables authentication bypass for testing **Implementation Readiness:** - ✅ **Data integrity validated** - Pull and analysis systems working perfectly - ✅ **Dependency mapping proven** - Complete relationship resolution working - ✅ **Error detection reliable** - Broken chain identification accurate - ✅ **Service architecture ready** - Modular, maintainable codebase established --- ## 🎯 Strategic Objective Design optimal upload strategy leveraging Management SDK batch operations combined with dependency chain traversal to maximize throughput while maintaining data integrity and dependency relationships. **Foundation**: This strategy is built on a rock-solid foundation of proven pull and analysis capabilities validated across 24,164+ real-world entities. --- ## 📊 Available Management SDK Batch Operations ### **Confirmed Batch Methods:** - ✅ **`saveContentItems(contentItems: ContentItem[], guid, locale)`** → `Promise<Batch | number[]>` - ✅ **`upload(formData, agilityFolderPath, guid, groupingID?)`** → `Promise<Media[]>` - ✅ **`savePage(pageItem, guid, locale, parentPageID?, placeBeforePageItemID?)`** → `Promise<Batch | number[]>` ### **Batch Operation Types (from SDK):** ```typescript enum BatchOperationType { Publish = 1, Unpublish = 2, Approve = 3, Decline = 4, RequestApproval = 5, Copy = 6, ClonePageTemplate = 7, Sync = 8, TagExport = 9, TagImport = 10, ContentExport = 11, ContentImport = 12, InstanctTemplateImport = 13, InsertContent = 14, DeleteContent = 15, SavePage = 16, DeletePage = 17 } ``` --- ## 🏗️ Entity Dependency Analysis ### **Dependency Chain Patterns (from Phase 18 Analysis):** **6-Step Analysis Framework:** 1. **Page Chains**: Template → Container → Model → Content → Asset → Gallery 2. **Container Chains**: Independent containers not in page chains 3. **Model-to-Model Chains**: Content Definition field relationships 4. **Broken Chains**: Missing source data validation 5. **Items Outside Chains**: Non-chained entities by type 6. **Reconciliation**: Entity accounting and sync readiness ### **Entity Independence Matrix:** | Entity Type | Dependencies | Batch Capable | Strategy | |-------------|--------------|---------------|----------| | **Models** | None (independent) | ❌ Individual only | Sequential Individual | | **Templates** | Models (weak) | ❌ Individual only | Sequential Individual | | **Containers** | Models (strong) | ❌ Individual only | Sequential Individual | | **Assets** | None (independent) | ✅ **`upload()`** | **BATCH UPLOAD** | | **Galleries** | Assets (strong) | ❌ Individual only | Chain Traversal | | **Content** | Models + Assets (strong) | ✅ **`saveContentItems()`** | **DEPENDENCY-LEVEL BATCHING** | | **Pages** | Templates + Content + Assets (complex) | ⚠️ Batch but complex | **CHAIN TRAVERSAL** | --- ## 🚀 Strategic Decision Matrix ### **Phase 1: Independent Entity Batching** **Batch Strategy Entities:** - ✅ **Assets** → Use `upload()` batch method - ✅ **Content** → Use `saveContentItems()` with dependency-level grouping **Sequential Strategy Entities:** - 🔄 **Models** → Individual `saveModel()` calls (fast, independent) - 🔄 **Templates** → Individual `saveTemplate()` calls (minimal model deps) - 🔄 **Containers** → Individual `saveContainer()` calls (model deps resolved) ### **Phase 2: Complex Dependency Chain Traversal** **Chain Traversal Entities:** - 🔗 **Pages** → Complex zone dependencies require real-time ID mapping - 🔗 **Galleries** → Asset dependencies must be resolved first ### **Phase 3: Post-Upload Batch Operations** **Publishing Batches:** - 📢 **Content Publishing** → Batch publish operations - 📢 **Page Publishing** → Batch publish operations --- ## 🧠 Dependency-Level Batching Strategy ### **Content Batching by Dependency Level:** **Level 0 - No Content Dependencies:** ```typescript // Content items that don't reference other content items const level0Content = content.filter(item => !hasContentReferences(item)); await saveContentItems(level0Content, guid, locale); ``` **Level 1 - References Level 0 Content:** ```typescript // Content items that only reference level 0 content (now uploaded with known IDs) const level1Content = content.filter(item => referencesOnlyLevel0(item, level0Mappings)); // Update references with new IDs from level0Mappings const updatedLevel1 = updateContentReferences(level1Content, level0Mappings); await saveContentItems(updatedLevel1, guid, locale); ``` **Level N - Recursive Dependency Resolution:** ```typescript // Continue until all content items are uploaded with proper ID mapping ``` ### **Asset Batching Strategy:** **Gallery-Independent Assets:** ```typescript // Assets not in any gallery - can be uploaded immediately const independentAssets = assets.filter(asset => !asset.mediaGroupingID); const uploadedAssets = await upload(buildFormData(independentAssets), path, guid); ``` **Gallery-Dependent Assets:** ```typescript // Assets in galleries - upload by gallery group to maintain grouping const galleryGroups = groupAssetsByGallery(assets); for (const [galleryID, galleryAssets] of galleryGroups) { const uploadedAssets = await upload(buildFormData(galleryAssets), path, guid, galleryID); } ``` --- ## ⚡ Parallel Execution Strategy ### **Thread Coordination Plan:** **Thread 1: Independent Entity Upload** ```typescript async function independentEntityThread() { // Parallel execution of independent operations await Promise.all([ uploadIndependentAssets(), uploadModelsSequentially(), uploadTemplatesSequentially() ]); } ``` **Thread 2: Dependency Chain Processing** ```typescript async function dependencyChainThread() { // Wait for models/templates, then process content in batches await waitForModelsAndTemplates(); await uploadContentByDependencyLevels(); } ``` **Thread 3: Complex Entity Processing** ```typescript async function complexEntityThread() { // Wait for content/assets, then process pages and galleries await waitForContentAndAssets(); await Promise.all([ uploadPagesViaChainTraversal(), uploadGalleriesViaChainTraversal() ]); } ``` ### **Coordination Mechanisms:** **Progress Synchronization:** ```typescript interface UploadProgress { modelsComplete: boolean; templatesComplete: boolean; assetsComplete: boolean; contentLevelsComplete: Map<number, boolean>; idMappings: { models: Map<number, number>; templates: Map<number, number>; assets: Map<string, number>; content: Map<number, number>; }; } ``` **Dependency Gates:** ```typescript // Thread coordination points await waitForGate('models-complete'); await waitForGate('content-level-0-complete'); await waitForGate('all-assets-complete'); ``` --- ## 💾 Real-Time ID Mapping System ### **ID Mapping Service Design:** ```typescript interface IdMappingService { // Store new ID mappings as entities are uploaded mapEntity(entityType: string, sourceId: number, targetId: number): void; // Retrieve mapped ID for use in dependent entities getMappedId(entityType: string, sourceId: number): number | null; // Update entity references with mapped IDs updateReferences<T>(entity: T, mappings: EntityMappings): T; // Check if all dependencies are resolved areDependenciesResolved(entity: any): boolean; } ``` ### **Reference Update Patterns:** **Content Item Reference Updates:** ```typescript function updateContentReferences(contentItem: ContentItem, mappings: IdMappings): ContentItem { // Deep scan content fields for entity references const updatedFields = deepUpdateFields(contentItem.fields, mappings); return { ...contentItem, fields: updatedFields }; } ``` **Page Zone Reference Updates:** ```typescript function updatePageZoneReferences(page: PageItem, mappings: IdMappings): PageItem { // Update content IDs in page zones const updatedZones = updateZoneContentIds(page.zones, mappings.content); return { ...page, zones: updatedZones }; } ``` --- ## 🔄 Error Handling & Retry Strategy ### **Batch Operation Error Handling:** **Partial Batch Failures:** ```typescript interface BatchResult { successful: EntityUpload[]; failed: EntityUpload[]; batchId?: number; errors: UploadError[]; } async function handleBatchResult(result: BatchResult) { // Update mappings for successful uploads result.successful.forEach(upload => idMappingService.mapEntity(upload.type, upload.sourceId, upload.targetId) ); // Retry failed uploads individually for (const failed of result.failed) { await retryIndividualUpload(failed); } } ``` **Chain Traversal Error Recovery:** ```typescript async function uploadPageChain(chain: PageChain) { try { for (const page of chain.pages) { await uploadPageWithDependencies(page); } } catch (error) { // Mark chain as broken, continue with other chains await markChainAsBroken(chain, error); await reportBrokenChain(chain, error); } } ``` ### **Rollback Strategy:** **Partial Upload Rollback:** ```typescript async function rollbackPartialUpload(uploadSession: UploadSession) { // Delete successfully uploaded entities in reverse dependency order const deletionOrder = reverseTopologicalSort(uploadSession.uploadedEntities); for (const entity of deletionOrder) { await deleteEntity(entity.type, entity.targetId); } } ``` --- ## 📈 Performance Optimization Strategies ### **Batch Size Optimization:** **Dynamic Batch Sizing:** ```typescript interface BatchConfig { contentBatchSize: number; // 50-100 items per batch assetBatchSize: number; // 10-20 assets per batch (file size dependent) maxConcurrentBatches: number; // 3-5 concurrent batches retryAttempts: number; // 3 attempts with exponential backoff } function calculateOptimalBatchSize(entityType: string, entityCount: number): number { // Dynamic calculation based on entity type and network conditions switch (entityType) { case 'content': return Math.min(100, Math.max(10, entityCount / 10)); case 'assets': return Math.min(20, Math.max(5, entityCount / 5)); default: return 10; } } ``` ### **Progress Tracking Optimization:** **Real-Time Progress Updates:** ```typescript interface ProgressTracker { updateBatchProgress(batchId: string, completed: number, total: number): void; updateChainProgress(chainId: string, currentStep: number, totalSteps: number): void; calculateEstimatedTimeRemaining(): number; getOverallProgress(): ProgressSummary; } ``` --- ## 🧪 Mock Delivery System Specifications ### **Mock Orchestrator Requirements:** **Console Output Specifications:** - Real-time progress bars with Unicode characters - Color-coded status indicators (green=complete, yellow=in-progress, red=error) - Parallel thread status dashboard - Estimated time remaining calculations - Error count and retry status **Timing Simulation:** - Realistic API call delays (models: 100-200ms, content: 200-500ms, assets: 1-3s) - Batch operation timing (proportional to batch size) - Network latency simulation (50-100ms base latency) - Error injection for testing error handling paths **Validation Logic:** - Dependency order validation (no entity uploaded before its dependencies) - ID mapping validation (all references properly updated) - Parallel execution collision detection - Resource contention simulation --- ## 🎯 Implementation Priority Order ### **Phase 1: Foundation (Task 19.2-19.3)** 1. Create mock delivery system with console visualization 2. Implement core batch uploader classes 3. Implement ID mapping service 4. Implement progress tracking system ### **Phase 2: Integration (Task 19.4-19.5)** 1. Integrate batch uploaders with mock operations 2. Implement parallel execution manager 3. Add error handling and retry logic 4. Create comprehensive testing suite ### **Phase 3: SDK Integration (Future Phase)** 1. Replace mock operations with real Management SDK calls 2. Handle batch response processing 3. Implement production error handling 4. Add performance monitoring and optimization --- ## 📊 Success Metrics ### **Performance Targets:** - **Upload Speed**: 80% faster than sequential approach - **Memory Usage**: <50MB peak memory usage - **Error Recovery**: 99% success rate with retry logic - **Parallel Efficiency**: 90%+ CPU utilization during upload ### **Quality Targets:** - **Data Integrity**: 100% dependency relationship preservation - **ID Mapping**: 100% reference accuracy - **Error Handling**: Graceful degradation with detailed error reporting - **User Experience**: Real-time progress with accurate time estimates --- **Status**: ✅ Strategic Architecture Design Complete **Next**: Task 19.1.2 - Design dependency-level batching strategy implementation