@agility/cli
Version:
Agility CLI for working with your content. (Public Beta)
441 lines (349 loc) โข 14 kB
Markdown
# Batch-Optimized Upload Strategy & Architecture Design
**Phase 19 - Task 19.1.1 Implementation**
**Status**: โ
**FOUNDATION VALIDATED** - Ready for Implementation
**Created**: Phase 19 Development Session
**Validated**: Phase 19.5-19.6 - 100% entity reconciliation across 24,164+ entities from 3 customer instances
## ๐ **VALIDATION SUCCESS SUMMARY**
**Foundation Proven Ready:**
- โ
**24,164 entities validated** across 3 diverse customer production instances
- โ
**100% reconciliation rate** - zero broken chains across all instances
- โ
**Perfect dependency resolution** - every entity and its dependencies accounted for
- โ
**Modular service architecture** - 12 focused components ready for upload orchestration
- โ
**Debug tooling proven** - `--test` flag enables authentication bypass for testing
**Implementation Readiness:**
- โ
**Data integrity validated** - Pull and analysis systems working perfectly
- โ
**Dependency mapping proven** - Complete relationship resolution working
- โ
**Error detection reliable** - Broken chain identification accurate
- โ
**Service architecture ready** - Modular, maintainable codebase established
## ๐ฏ Strategic Objective
Design optimal upload strategy leveraging Management SDK batch operations combined with dependency chain traversal to maximize throughput while maintaining data integrity and dependency relationships.
**Foundation**: This strategy is built on a rock-solid foundation of proven pull and analysis capabilities validated across 24,164+ real-world entities.
## ๐ Available Management SDK Batch Operations
### **Confirmed Batch Methods:**
- โ
**`saveContentItems(contentItems: ContentItem[], guid, locale)`** โ `Promise<Batch | number[]>`
- โ
**`upload(formData, agilityFolderPath, guid, groupingID?)`** โ `Promise<Media[]>`
- โ
**`savePage(pageItem, guid, locale, parentPageID?, placeBeforePageItemID?)`** โ `Promise<Batch | number[]>`
### **Batch Operation Types (from SDK):**
```typescript
enum BatchOperationType {
Publish = 1,
Unpublish = 2,
Approve = 3,
Decline = 4,
RequestApproval = 5,
Copy = 6,
ClonePageTemplate = 7,
Sync = 8,
TagExport = 9,
TagImport = 10,
ContentExport = 11,
ContentImport = 12,
InstanctTemplateImport = 13,
InsertContent = 14,
DeleteContent = 15,
SavePage = 16,
DeletePage = 17
}
```
## ๐๏ธ Entity Dependency Analysis
### **Dependency Chain Patterns (from Phase 18 Analysis):**
**6-Step Analysis Framework:**
1. **Page Chains**: Template โ Container โ Model โ Content โ Asset โ Gallery
2. **Container Chains**: Independent containers not in page chains
3. **Model-to-Model Chains**: Content Definition field relationships
4. **Broken Chains**: Missing source data validation
5. **Items Outside Chains**: Non-chained entities by type
6. **Reconciliation**: Entity accounting and sync readiness
### **Entity Independence Matrix:**
| Entity Type | Dependencies | Batch Capable | Strategy |
|-------------|--------------|---------------|----------|
| **Models** | None (independent) | โ Individual only | Sequential Individual |
| **Templates** | Models (weak) | โ Individual only | Sequential Individual |
| **Containers** | Models (strong) | โ Individual only | Sequential Individual |
| **Assets** | None (independent) | โ
**`upload()`** | **BATCH UPLOAD** |
| **Galleries** | Assets (strong) | โ Individual only | Chain Traversal |
| **Content** | Models + Assets (strong) | โ
**`saveContentItems()`** | **DEPENDENCY-LEVEL BATCHING** |
| **Pages** | Templates + Content + Assets (complex) | โ ๏ธ Batch but complex | **CHAIN TRAVERSAL** |
## ๐ Strategic Decision Matrix
### **Phase 1: Independent Entity Batching**
**Batch Strategy Entities:**
- โ
**Assets** โ Use `upload()` batch method
- โ
**Content** โ Use `saveContentItems()` with dependency-level grouping
**Sequential Strategy Entities:**
- ๐ **Models** โ Individual `saveModel()` calls (fast, independent)
- ๐ **Templates** โ Individual `saveTemplate()` calls (minimal model deps)
- ๐ **Containers** โ Individual `saveContainer()` calls (model deps resolved)
### **Phase 2: Complex Dependency Chain Traversal**
**Chain Traversal Entities:**
- ๐ **Pages** โ Complex zone dependencies require real-time ID mapping
- ๐ **Galleries** โ Asset dependencies must be resolved first
### **Phase 3: Post-Upload Batch Operations**
**Publishing Batches:**
- ๐ข **Content Publishing** โ Batch publish operations
- ๐ข **Page Publishing** โ Batch publish operations
## ๐ง Dependency-Level Batching Strategy
### **Content Batching by Dependency Level:**
**Level 0 - No Content Dependencies:**
```typescript
// Content items that don't reference other content items
const level0Content = content.filter(item => !hasContentReferences(item));
await saveContentItems(level0Content, guid, locale);
```
**Level 1 - References Level 0 Content:**
```typescript
// Content items that only reference level 0 content (now uploaded with known IDs)
const level1Content = content.filter(item => referencesOnlyLevel0(item, level0Mappings));
// Update references with new IDs from level0Mappings
const updatedLevel1 = updateContentReferences(level1Content, level0Mappings);
await saveContentItems(updatedLevel1, guid, locale);
```
**Level N - Recursive Dependency Resolution:**
```typescript
// Continue until all content items are uploaded with proper ID mapping
```
### **Asset Batching Strategy:**
**Gallery-Independent Assets:**
```typescript
// Assets not in any gallery - can be uploaded immediately
const independentAssets = assets.filter(asset => !asset.mediaGroupingID);
const uploadedAssets = await upload(buildFormData(independentAssets), path, guid);
```
**Gallery-Dependent Assets:**
```typescript
// Assets in galleries - upload by gallery group to maintain grouping
const galleryGroups = groupAssetsByGallery(assets);
for (const [galleryID, galleryAssets] of galleryGroups) {
const uploadedAssets = await upload(buildFormData(galleryAssets), path, guid, galleryID);
}
```
## โก Parallel Execution Strategy
### **Thread Coordination Plan:**
**Thread 1: Independent Entity Upload**
```typescript
async function independentEntityThread() {
// Parallel execution of independent operations
await Promise.all([
uploadIndependentAssets(),
uploadModelsSequentially(),
uploadTemplatesSequentially()
]);
}
```
**Thread 2: Dependency Chain Processing**
```typescript
async function dependencyChainThread() {
// Wait for models/templates, then process content in batches
await waitForModelsAndTemplates();
await uploadContentByDependencyLevels();
}
```
**Thread 3: Complex Entity Processing**
```typescript
async function complexEntityThread() {
// Wait for content/assets, then process pages and galleries
await waitForContentAndAssets();
await Promise.all([
uploadPagesViaChainTraversal(),
uploadGalleriesViaChainTraversal()
]);
}
```
### **Coordination Mechanisms:**
**Progress Synchronization:**
```typescript
interface UploadProgress {
modelsComplete: boolean;
templatesComplete: boolean;
assetsComplete: boolean;
contentLevelsComplete: Map<number, boolean>;
idMappings: {
models: Map<number, number>;
templates: Map<number, number>;
assets: Map<string, number>;
content: Map<number, number>;
};
}
```
**Dependency Gates:**
```typescript
// Thread coordination points
await waitForGate('models-complete');
await waitForGate('content-level-0-complete');
await waitForGate('all-assets-complete');
```
## ๐พ Real-Time ID Mapping System
### **ID Mapping Service Design:**
```typescript
interface IdMappingService {
// Store new ID mappings as entities are uploaded
mapEntity(entityType: string, sourceId: number, targetId: number): void;
// Retrieve mapped ID for use in dependent entities
getMappedId(entityType: string, sourceId: number): number | null;
// Update entity references with mapped IDs
updateReferences<T>(entity: T, mappings: EntityMappings): T;
// Check if all dependencies are resolved
areDependenciesResolved(entity: any): boolean;
}
```
### **Reference Update Patterns:**
**Content Item Reference Updates:**
```typescript
function updateContentReferences(contentItem: ContentItem, mappings: IdMappings): ContentItem {
// Deep scan content fields for entity references
const updatedFields = deepUpdateFields(contentItem.fields, mappings);
return {
...contentItem,
fields: updatedFields
};
}
```
**Page Zone Reference Updates:**
```typescript
function updatePageZoneReferences(page: PageItem, mappings: IdMappings): PageItem {
// Update content IDs in page zones
const updatedZones = updateZoneContentIds(page.zones, mappings.content);
return {
...page,
zones: updatedZones
};
}
```
## ๐ Error Handling & Retry Strategy
### **Batch Operation Error Handling:**
**Partial Batch Failures:**
```typescript
interface BatchResult {
successful: EntityUpload[];
failed: EntityUpload[];
batchId?: number;
errors: UploadError[];
}
async function handleBatchResult(result: BatchResult) {
// Update mappings for successful uploads
result.successful.forEach(upload =>
idMappingService.mapEntity(upload.type, upload.sourceId, upload.targetId)
);
// Retry failed uploads individually
for (const failed of result.failed) {
await retryIndividualUpload(failed);
}
}
```
**Chain Traversal Error Recovery:**
```typescript
async function uploadPageChain(chain: PageChain) {
try {
for (const page of chain.pages) {
await uploadPageWithDependencies(page);
}
} catch (error) {
// Mark chain as broken, continue with other chains
await markChainAsBroken(chain, error);
await reportBrokenChain(chain, error);
}
}
```
### **Rollback Strategy:**
**Partial Upload Rollback:**
```typescript
async function rollbackPartialUpload(uploadSession: UploadSession) {
// Delete successfully uploaded entities in reverse dependency order
const deletionOrder = reverseTopologicalSort(uploadSession.uploadedEntities);
for (const entity of deletionOrder) {
await deleteEntity(entity.type, entity.targetId);
}
}
```
## ๐ Performance Optimization Strategies
### **Batch Size Optimization:**
**Dynamic Batch Sizing:**
```typescript
interface BatchConfig {
contentBatchSize: number; // 50-100 items per batch
assetBatchSize: number; // 10-20 assets per batch (file size dependent)
maxConcurrentBatches: number; // 3-5 concurrent batches
retryAttempts: number; // 3 attempts with exponential backoff
}
function calculateOptimalBatchSize(entityType: string, entityCount: number): number {
// Dynamic calculation based on entity type and network conditions
switch (entityType) {
case 'content':
return Math.min(100, Math.max(10, entityCount / 10));
case 'assets':
return Math.min(20, Math.max(5, entityCount / 5));
default:
return 10;
}
}
```
### **Progress Tracking Optimization:**
**Real-Time Progress Updates:**
```typescript
interface ProgressTracker {
updateBatchProgress(batchId: string, completed: number, total: number): void;
updateChainProgress(chainId: string, currentStep: number, totalSteps: number): void;
calculateEstimatedTimeRemaining(): number;
getOverallProgress(): ProgressSummary;
}
```
## ๐งช Mock Delivery System Specifications
### **Mock Orchestrator Requirements:**
**Console Output Specifications:**
- Real-time progress bars with Unicode characters
- Color-coded status indicators (green=complete, yellow=in-progress, red=error)
- Parallel thread status dashboard
- Estimated time remaining calculations
- Error count and retry status
**Timing Simulation:**
- Realistic API call delays (models: 100-200ms, content: 200-500ms, assets: 1-3s)
- Batch operation timing (proportional to batch size)
- Network latency simulation (50-100ms base latency)
- Error injection for testing error handling paths
**Validation Logic:**
- Dependency order validation (no entity uploaded before its dependencies)
- ID mapping validation (all references properly updated)
- Parallel execution collision detection
- Resource contention simulation
## ๐ฏ Implementation Priority Order
### **Phase 1: Foundation (Task 19.2-19.3)**
1. Create mock delivery system with console visualization
2. Implement core batch uploader classes
3. Implement ID mapping service
4. Implement progress tracking system
### **Phase 2: Integration (Task 19.4-19.5)**
1. Integrate batch uploaders with mock operations
2. Implement parallel execution manager
3. Add error handling and retry logic
4. Create comprehensive testing suite
### **Phase 3: SDK Integration (Future Phase)**
1. Replace mock operations with real Management SDK calls
2. Handle batch response processing
3. Implement production error handling
4. Add performance monitoring and optimization
## ๐ Success Metrics
### **Performance Targets:**
- **Upload Speed**: 80% faster than sequential approach
- **Memory Usage**: <50MB peak memory usage
- **Error Recovery**: 99% success rate with retry logic
- **Parallel Efficiency**: 90%+ CPU utilization during upload
### **Quality Targets:**
- **Data Integrity**: 100% dependency relationship preservation
- **ID Mapping**: 100% reference accuracy
- **Error Handling**: Graceful degradation with detailed error reporting
- **User Experience**: Real-time progress with accurate time estimates
**Status**: โ
Strategic Architecture Design Complete
**Next**: Task 19.1.2 - Design dependency-level batching strategy implementation