# Transformation Development Task

This task guides the design and implementation of data transformation logic for ETL processes, ensuring accurate data mapping, conversion, and enrichment in Salesforce integrations.

## Purpose

Enable ETL developers to:

- Design transformation workflows
- Implement data mapping logic
- Create conversion algorithms
- Build data enrichment processes
- Optimize transformation performance

## Prerequisites

- Source and target data models understood
- Transformation requirements documented
- Mapping specifications defined
- Test data sets available
- Performance benchmarks established

## Transformation Architecture

### 1. Transformation Pattern Categories

**Pattern Classification Framework**

```yaml
Simple Transformations:
  Description: Direct field-to-field mappings
  Complexity: Low
  Examples:
    - Field renaming
    - Type casting
    - Format standardization
    - Value trimming
  Processing: In-line transformation

Complex Transformations:
  Description: Multi-field logic and calculations
  Complexity: Medium
  Examples:
    - Concatenation with logic
    - Conditional mappings
    - Lookup transformations
    - Date calculations
  Processing: Function-based transformation

Business Logic Transformations:
  Description: Rule-based data processing
  Complexity: High
  Examples:
    - Territory assignment
    - Lead scoring
    - Duplicate merging
    - Hierarchy building
  Processing: Rule engine transformation

Enrichment Transformations:
  Description: Data augmentation from external sources
  Complexity: Very High
  Examples:
    - Address standardization
    - Company data enrichment
    - Currency conversion
    - Geocoding
  Processing: Service-based transformation
```

### 2. Transformation Design Principles

**Design Strategy Framework**

```yaml
Data Integrity Principles:
  - Maintain referential integrity
  - Preserve audit trails
  - Handle null values explicitly
  - Validate transformed data
  - Log all transformations

Performance Principles:
  - Minimize transformation passes
  - Use bulk operations
  - Implement caching strategies
  - Parallelize when possible
  - Monitor resource usage

Maintainability Principles:
  - Use configuration over code
  - Create reusable components
  - Document transformation logic
  - Version control mappings
  - Test comprehensively

Error Handling Principles:
  - Graceful degradation
  - Detailed error logging
  - Recovery mechanisms
  - Data quarantine
  - Notification systems
```

### 3. Transformation Components

**Component Architecture**

```yaml
Input Stage:
  Components:
    - Data readers
    - Format parsers
    - Validation filters
    - Schema validators
  Responsibilities:
    - Data ingestion
    - Initial validation
    - Format normalization
    - Error detection

Transformation Stage:
  Components:
    - Mapping engine
    - Rule processor
    - Function library
    - Lookup cache
  Responsibilities:
    - Apply mappings
    - Execute business logic
    - Perform calculations
    - Enrich data

Output Stage:
  Components:
    - Data writers
    - Format converters
    - Validation checkers
    - Error handlers
  Responsibilities:
    - Data formatting
    - Final validation
    - Error management
    - Result logging
```

## Transformation Development Process

### Phase 1: Analysis and Design

**Mapping Specification**

```yaml
Field Mapping Document:
  Source_Field_1:
    target: Target_Field_1
    transformation: direct
    null_handling: default_value
    validation: required

  Source_Field_2:
    target: Target_Field_2
    transformation: uppercase
    null_handling: skip_record
    validation: email_format

  Source_Field_3 + Source_Field_4:
    target: Target_Field_5
    transformation: concatenate_with_space
    null_handling: partial_concatenation
    validation: max_length_255

  Source_Field_6:
    target: Target_Field_7
    transformation: lookup_mapping
    lookup_table: Country_Codes
    null_handling: use_default_country
    validation: valid_country_code
```

**Transformation Logic Definition**

```yaml
Business Rules:
  Lead_Score_Calculation:
    inputs:
      - Company_Size
      - Industry
      - Web_Activity
      - Email_Engagement
    logic: |
      score = 0
      if Company_Size > 1000: score += 30
      if Industry in ['Technology', 'Finance']: score += 20
      if Web_Activity > 10: score += 25
      if Email_Engagement == 'High': score += 25
      return score
    output: Lead_Score__c

  Territory_Assignment:
    inputs:
      - State
      - Company_Revenue
      - Product_Interest
    logic: |
      if State in ['CA', 'OR', 'WA']:
        if Company_Revenue > 10M:
          return 'West_Enterprise'
        else:
          return 'West_SMB'
      # Additional logic...
    output: Territory__c
```

### Phase 2: Implementation

**Transformation Pipeline Structure**

```yaml
pipeline:
  name: customer_data_transformation
  version: 1.0

  stages:
    - stage: extract
      steps:
        - read_source_data
        - validate_schema
        - filter_invalid_records

    - stage: transform
      parallel: true
      steps:
        - apply_field_mappings
        - execute_business_rules
        - perform_enrichment
        - calculate_derived_fields

    - stage: load
      steps:
        - final_validation
        - prepare_bulk_load
        - execute_load
        - verify_results
```

**Transformation Functions Library**

```yaml
Standard Functions:
  String Operations:
    - trim()
    - uppercase()
    - lowercase()
    - substring()
    - replace()
    - concatenate()

  Date Operations:
    - format_date()
    - add_days()
    - date_diff()
    - fiscal_quarter()
    - business_days()

  Numeric Operations:
    - round()
    - currency_convert()
    - percentage()
    - sum()
    - average()

  Lookup Operations:
    - vlookup()
    - fuzzy_match()
    - hierarchical_lookup()
    - cached_lookup()

Custom Functions:
  Salesforce Specific:
    - convert_15_to_18_char_id()
    - validate_salesforce_id()
    - check_field_permissions()
    - apply_sharing_rules()
    - calculate_fiscal_period()
```

### Phase 3: Testing

**Transformation Test Strategy**

```yaml
Unit Testing:
  Purpose: Test individual transformation functions
  Coverage:
    - All transformation functions
    - Edge cases
    - Error conditions
    - Null handling
  Tools:
    - Function test harness
    - Mock data generators
    - Assertion frameworks

Integration Testing:
  Purpose: Test complete transformation pipeline
  Coverage:
    - End-to-end data flow
    - Performance benchmarks
    - Error propagation
    - Recovery mechanisms
  Tools:
    - Pipeline test framework
    - Test data sets
    - Monitoring tools

Regression Testing:
  Purpose: Ensure changes don't break existing transformations
  Coverage:
    - All transformation paths
    - Historical test cases
    - Performance baselines
    - Output validation
  Tools:
    - Automated test suites
    - Comparison tools
    - Regression databases
```

## Advanced Transformation Techniques

### 1. Conditional Transformations

**Multi-Path Logic**

```yaml
Conditional Mapping:
  Account_Type_Determination:
    conditions:
      - if: Revenue > 1000000 AND Employees > 100
        then:
          Type: 'Enterprise'
          Service_Level: 'Platinum'
      - if: Revenue > 100000 AND Employees > 10
        then:
          Type: 'Mid-Market'
          Service_Level: 'Gold'
      - else:
          Type: 'SMB'
          Service_Level: 'Silver'

  Dynamic_Field_Assignment:
    source: Product_Code
    conditions:
      - if: starts_with("ENT")
        then:
          map_to: Enterprise_Product__c
          apply: enterprise_pricing_rules
      - if: starts_with("SMB")
        then:
          map_to: SMB_Product__c
          apply: smb_discount_rules
```

### 2. Hierarchical Transformations

**Parent-Child Relationships**

```yaml
Hierarchy_Building:
  Account_Hierarchy:
    steps:
      1. Identify root accounts (Parent_ID is null)
      2. Build hierarchy levels
      3. Calculate rollup values
      4. Set hierarchy paths
      5. Apply inheritance rules

  Territory_Hierarchy:
    steps:
      1. Load territory structure
      2. Assign accounts to territories
      3. Roll up metrics
      4. Apply overlay rules
      5. Handle exceptions
```

### 3. Temporal Transformations

**Time-Based Logic**

```yaml
Historical_Data_Handling:
  Slowly_Changing_Dimensions:
    Type_1: Overwrite old values
    Type_2: Maintain history with effective dates
    Type_3: Keep current and previous values

  Effective_Dating:
    - Set start_date for new records
    - Update end_date for replaced records
    - Handle overlapping periods
    - Maintain temporal integrity
```

## Performance Optimization

### Optimization Strategies

**Processing Efficiency**

```yaml
Bulk Processing:
  - Batch size optimization (10K-100K records)
  - Parallel stream processing
  - Memory-efficient algorithms
  - Lazy evaluation techniques

Caching Strategies:
  - Lookup table caching
  - Computed value caching
  - Session-based caching
  - Distributed cache usage

Resource Management:
  - Connection pooling
  - Thread pool optimization
  - Memory allocation tuning
  - Garbage collection optimization
```

### Performance Monitoring

**Metrics and KPIs**

```yaml
Transformation Metrics:
  Throughput:
    - Records per second
    - Batches per hour
    - Total processing time

  Quality:
    - Transformation accuracy
    - Error rates
    - Data loss metrics

  Resource Usage:
    - CPU utilization
    - Memory consumption
    - I/O operations
    - Network bandwidth
```

## Error Handling and Recovery

### Error Management Framework

**Error Categories and Handling**

```yaml
Data Errors:
  Invalid_Format:
    Action: Log and skip
    Recovery: Manual correction

  Missing_Required_Field:
    Action: Quarantine record
    Recovery: Request missing data

  Constraint_Violation:
    Action: Apply default/fix
    Recovery: Automated correction

System Errors:
  Connection_Failure:
    Action: Retry with backoff
    Recovery: Circuit breaker

  Resource_Exhaustion:
    Action: Pause and recover
    Recovery: Scale resources

  Timeout:
    Action: Checkpoint and retry
    Recovery: Resume from checkpoint
```

## Success Criteria

- ✅ All transformations accurately mapped
- ✅ Performance targets achieved
- ✅ Error handling comprehensive
- ✅ Testing coverage complete
- ✅ Documentation current
- ✅ Monitoring operational
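
## Appendix: Business Rule Sketch

The `Lead_Score_Calculation` and `Territory_Assignment` rules from Phase 1 are written as pseudocode inside YAML. The following is a minimal Python sketch of how they might be made executable, not the framework's own implementation. The function names (`lead_score`, `assign_territory`, `apply_business_rules`), the dict-based record shape, and the treatment of missing values as "no points" are assumptions made for illustration. The source writes the revenue threshold as `10M`; it is read here as 10,000,000. Because the source elides the territory logic for other regions, the sketch returns `None` for them.

```python
from typing import Any, Dict, Optional

# Assumption: the source's "10M" threshold means 10,000,000 in the record's currency.
ENTERPRISE_REVENUE_THRESHOLD = 10_000_000
WEST_STATES = {"CA", "OR", "WA"}

Record = Dict[str, Any]


def lead_score(record: Record) -> int:
    """Score a lead using the four inputs named in Lead_Score_Calculation.

    Missing or null inputs contribute no points (an assumption; the source
    only says that null values should be handled explicitly).
    """
    score = 0
    if (record.get("Company_Size") or 0) > 1000:
        score += 30
    if record.get("Industry") in ("Technology", "Finance"):
        score += 20
    if (record.get("Web_Activity") or 0) > 10:
        score += 25
    if record.get("Email_Engagement") == "High":
        score += 25
    return score


def assign_territory(record: Record) -> Optional[str]:
    """Assign a territory for the West region, the only branch the source defines."""
    if record.get("State") in WEST_STATES:
        if (record.get("Company_Revenue") or 0) > ENTERPRISE_REVENUE_THRESHOLD:
            return "West_Enterprise"
        return "West_SMB"
    # The source elides the remaining regions ("# Additional logic...").
    return None


def apply_business_rules(record: Record) -> Record:
    """Return a copy of the record with the two output fields populated."""
    enriched = dict(record)  # leave the input untouched to preserve an audit trail
    enriched["Lead_Score__c"] = lead_score(record)
    enriched["Territory__c"] = assign_territory(record)
    return enriched


if __name__ == "__main__":
    sample = {
        "Company_Size": 5000,
        "Industry": "Technology",
        "Web_Activity": 12,
        "Email_Engagement": "High",
        "State": "CA",
        "Company_Revenue": 25_000_000,
    }
    print(apply_business_rules(sample))
```

Keeping each rule as a small pure function makes it straightforward to cover the edge cases and null handling listed under Unit Testing, and returning a copy instead of mutating the input fits the principle of preserving audit trails.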