# Transformation Development Task
This task guides the design and implementation of data transformation logic for
ETL processes, ensuring accurate data mapping, conversion, and enrichment in
Salesforce integrations.
## Purpose
Enable ETL developers to:
- Design transformation workflows
- Implement data mapping logic
- Create conversion algorithms
- Build data enrichment processes
- Optimize transformation performance
## Prerequisites
- Source and target data models understood
- Transformation requirements documented
- Mapping specifications defined
- Test data sets available
- Performance benchmarks established
## Transformation Architecture
### 1. Transformation Pattern Categories
**Pattern Classification Framework**
```yaml
Simple Transformations:
  Description: Direct field-to-field mappings
  Complexity: Low
  Examples:
    - Field renaming
    - Type casting
    - Format standardization
    - Value trimming
  Processing: In-line transformation

Complex Transformations:
  Description: Multi-field logic and calculations
  Complexity: Medium
  Examples:
    - Concatenation with logic
    - Conditional mappings
    - Lookup transformations
    - Date calculations
  Processing: Function-based transformation

Business Logic Transformations:
  Description: Rule-based data processing
  Complexity: High
  Examples:
    - Territory assignment
    - Lead scoring
    - Duplicate merging
    - Hierarchy building
  Processing: Rule engine transformation

Enrichment Transformations:
  Description: Data augmentation from external sources
  Complexity: Very High
  Examples:
    - Address standardization
    - Company data enrichment
    - Currency conversion
    - Geocoding
  Processing: Service-based transformation
```
### 2. Transformation Design Principles
**Design Strategy Framework**
```yaml
Data Integrity Principles:
  - Maintain referential integrity
  - Preserve audit trails
  - Handle null values explicitly
  - Validate transformed data
  - Log all transformations

Performance Principles:
  - Minimize transformation passes
  - Use bulk operations
  - Implement caching strategies
  - Parallelize when possible
  - Monitor resource usage

Maintainability Principles:
  - Use configuration over code
  - Create reusable components
  - Document transformation logic
  - Version control mappings
  - Test comprehensively

Error Handling Principles:
  - Graceful degradation
  - Detailed error logging
  - Recovery mechanisms
  - Data quarantine
  - Notification systems
```
### 3. Transformation Components
**Component Architecture**
```yaml
Input Stage:
  Components:
    - Data readers
    - Format parsers
    - Validation filters
    - Schema validators
  Responsibilities:
    - Data ingestion
    - Initial validation
    - Format normalization
    - Error detection

Transformation Stage:
  Components:
    - Mapping engine
    - Rule processor
    - Function library
    - Lookup cache
  Responsibilities:
    - Apply mappings
    - Execute business logic
    - Perform calculations
    - Enrich data

Output Stage:
  Components:
    - Data writers
    - Format converters
    - Validation checkers
    - Error handlers
  Responsibilities:
    - Data formatting
    - Final validation
    - Error management
    - Result logging
```
## Transformation Development Process
### Phase 1: Analysis and Design
**Mapping Specification**
```yaml
Field Mapping Document:
  Source_Field_1:
    target: Target_Field_1
    transformation: direct
    null_handling: default_value
    validation: required

  Source_Field_2:
    target: Target_Field_2
    transformation: uppercase
    null_handling: skip_record
    validation: email_format

  Source_Field_3 + Source_Field_4:
    target: Target_Field_5
    transformation: concatenate_with_space
    null_handling: partial_concatenation
    validation: max_length_255

  Source_Field_6:
    target: Target_Field_7
    transformation: lookup_mapping
    lookup_table: Country_Codes
    null_handling: use_default_country
    validation: valid_country_code
```
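The mapping document above can be read as data that a small engine interprets. The following is a minimal sketch under that assumption, not the framework's implementation: `FIELD_MAPPINGS`, `TRANSFORMS`, `apply_mappings`, and the `default` key are hypothetical names introduced for illustration, and the `validation` rules are left out.

```python
from typing import Optional

# Hypothetical in-memory form of three entries from the mapping document.
# The "default" key is an assumption; the specification does not say where
# a default value comes from.
FIELD_MAPPINGS = [
    {"sources": ["Source_Field_1"], "target": "Target_Field_1",
     "transformation": "direct", "null_handling": "default_value",
     "default": "N/A"},
    {"sources": ["Source_Field_2"], "target": "Target_Field_2",
     "transformation": "uppercase", "null_handling": "skip_record"},
    {"sources": ["Source_Field_3", "Source_Field_4"], "target": "Target_Field_5",
     "transformation": "concatenate_with_space",
     "null_handling": "partial_concatenation"},
]

# Each transformation receives the list of (non-null) source values.
TRANSFORMS = {
    "direct": lambda values: values[0],
    "uppercase": lambda values: values[0].upper(),
    "concatenate_with_space": lambda values: " ".join(values),
}


def apply_mappings(record: dict, mappings: list) -> Optional[dict]:
    """Return the mapped record, or None when a skip_record policy fires."""
    out = {}
    for mapping in mappings:
        values = [record.get(name) for name in mapping["sources"]]
        if any(v is None for v in values):
            policy = mapping["null_handling"]
            if policy == "skip_record":
                return None
            if policy == "default_value":
                out[mapping["target"]] = mapping["default"]
                continue
            if policy != "partial_concatenation":
                raise ValueError(f"unknown null_handling: {policy}")
            # partial_concatenation: keep whatever source values are present
            values = [v for v in values if v is not None]
            if not values:
                out[mapping["target"]] = None
                continue
        out[mapping["target"]] = TRANSFORMS[mapping["transformation"]](values)
    return out
```

Keeping the null policy next to each mapping makes the "handle null values explicitly" principle visible in one place: every source field either has a declared fallback or causes the record to be skipped.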
**Transformation Logic Definition**
```yaml
Business Rules:
  Lead_Score_Calculation:
    inputs:
      - Company_Size
      - Industry
      - Web_Activity
      - Email_Engagement
    logic: |
      score = 0
      if Company_Size > 1000: score += 30
      if Industry in ['Technology', 'Finance']: score += 20
      if Web_Activity > 10: score += 25
      if Email_Engagement == 'High': score += 25
      return score
    output: Lead_Score__c

  Territory_Assignment:
    inputs:
      - State
      - Company_Revenue
      - Product_Interest
    logic: |
      if State in ['CA', 'OR', 'WA']:
        if Company_Revenue > 10M:
          return 'West_Enterprise'
        else:
          return 'West_SMB'
      # Additional logic...
    output: Territory__c
```
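The two rules above translate almost directly into plain functions. This is a sketch only: the function names are hypothetical, null inputs are treated as zero or no match (an assumption, since the rules do not say), the `10M` threshold is read as 10,000,000, and `Product_Interest` is omitted because the visible logic never uses it.

```python
from typing import Optional


def lead_score(company_size, industry, web_activity, email_engagement) -> int:
    """Additive score following the Lead_Score_Calculation rule."""
    score = 0
    if (company_size or 0) > 1000:
        score += 30
    if industry in ("Technology", "Finance"):
        score += 20
    if (web_activity or 0) > 10:
        score += 25
    if email_engagement == "High":
        score += 25
    return score


def assign_territory(state, company_revenue) -> Optional[str]:
    """West-region branch of Territory_Assignment; other regions are elided
    in the rule definition, so this sketch returns None for them."""
    if state in ("CA", "OR", "WA"):
        if (company_revenue or 0) > 10_000_000:
            return "West_Enterprise"
        return "West_SMB"
    return None
```

For example, `lead_score(1500, "Finance", 5, "High")` adds 30, 20, and 25 for a total of 75, and `assign_territory("OR", 2_000_000)` yields `"West_SMB"`.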
### Phase 2: Implementation
**Transformation Pipeline Structure**
```yaml
pipeline:
  name: customer_data_transformation
  version: 1.0

  stages:
    - stage: extract
      steps:
        - read_source_data
        - validate_schema
        - filter_invalid_records

    - stage: transform
      parallel: true
      steps:
        - apply_field_mappings
        - execute_business_rules
        - perform_enrichment
        - calculate_derived_fields

    - stage: load
      steps:
        - final_validation
        - prepare_bulk_load
        - execute_load
        - verify_results
```
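One way to execute a stage like `transform` is to treat each step as a function from record to record. The sketch below assumes that `parallel: true` means records are processed concurrently while the steps for a single record still run in order; the pipeline definition does not spell this out. `run_stage` and the two example steps are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(records, steps, parallel=False, max_workers=4):
    """Apply every step, in order, to each record of the stage."""

    def apply_steps(record):
        for step in steps:
            record = step(record)
        return record

    if parallel:
        # Executor.map returns results in input order, so output order is stable.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(apply_steps, records))
    return [apply_steps(record) for record in records]


# Hypothetical steps standing in for apply_field_mappings and
# calculate_derived_fields from the pipeline definition.
def apply_field_mappings(record):
    return {"Name": record["name"].strip()}


def calculate_derived_fields(record):
    return {**record, "Name_Length": len(record["Name"])}


transformed = run_stage(
    [{"name": " Acme "}, {"name": "Globex"}],
    [apply_field_mappings, calculate_derived_fields],
    parallel=True,
)
```

Threads suit steps that wait on I/O, such as enrichment calls. CPU-bound steps in Python would likely need a process pool instead.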
**Transformation Functions Library**
```yaml
Standard Functions:
  String Operations:
    - trim()
    - uppercase()
    - lowercase()
    - substring()
    - replace()
    - concatenate()

  Date Operations:
    - format_date()
    - add_days()
    - date_diff()
    - fiscal_quarter()
    - business_days()

  Numeric Operations:
    - round()
    - currency_convert()
    - percentage()
    - sum()
    - average()

  Lookup Operations:
    - vlookup()
    - fuzzy_match()
    - hierarchical_lookup()
    - cached_lookup()

Custom Functions:
  Salesforce Specific:
    - convert_15_to_18_char_id()
    - validate_salesforce_id()
    - check_field_permissions()
    - apply_sharing_rules()
    - calculate_fiscal_period()
```
```
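Of the Salesforce-specific functions, the ID conversion has a well-known algorithm: each block of five characters in the 15-character ID contributes one suffix character that encodes which positions are uppercase. The sketch below implements that checksum. The function names follow the library listing, but the signatures and the strict validation behaviour are assumptions.

```python
import re

# Alphabet indexed by the 5-bit uppercase mask of each 5-character block.
SUFFIX_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def convert_15_to_18_char_id(sf_id: str) -> str:
    """Append the 3-character case checksum to a 15-character ID."""
    if not re.fullmatch(r"[A-Za-z0-9]{15}", sf_id):
        raise ValueError(f"not a 15-character Salesforce ID: {sf_id!r}")
    suffix = ""
    for start in (0, 5, 10):
        block = sf_id[start:start + 5]
        # Bit i is set when the i-th character of the block is uppercase.
        mask = sum(1 << i for i, ch in enumerate(block) if "A" <= ch <= "Z")
        suffix += SUFFIX_ALPHABET[mask]
    return sf_id + suffix


def validate_salesforce_id(sf_id: str) -> bool:
    """Accept a 15-character ID, or an 18-character ID whose suffix matches.

    Assumption: the 18-character form is checked case-sensitively, so an ID
    that was upper- or lower-cased in transit is rejected.
    """
    if re.fullmatch(r"[A-Za-z0-9]{15}", sf_id):
        return True
    if re.fullmatch(r"[A-Za-z0-9]{18}", sf_id):
        return convert_15_to_18_char_id(sf_id[:15]) == sf_id
    return False
```

For `001D000000IqhSL` the three blocks produce masks 8, 0, and 25, which map to `I`, `A`, and `Z`, so the 18-character form is `001D000000IqhSLIAZ`.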
### Phase 3: Testing
**Transformation Test Strategy**
```yaml
Unit Testing:
  Purpose: Test individual transformation functions
  Coverage:
    - All transformation functions
    - Edge cases
    - Error conditions
    - Null handling
  Tools:
    - Function test harness
    - Mock data generators
    - Assertion frameworks

Integration Testing:
  Purpose: Test complete transformation pipeline
  Coverage:
    - End-to-end data flow
    - Performance benchmarks
    - Error propagation
    - Recovery mechanisms
  Tools:
    - Pipeline test framework
    - Test data sets
    - Monitoring tools

Regression Testing:
  Purpose: Ensure changes don't break existing transformations
  Coverage:
    - All transformation paths
    - Historical test cases
    - Performance baselines
    - Output validation
  Tools:
    - Automated test suites
    - Comparison tools
    - Regression databases
```
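A unit test for a single transformation function might look like the following sketch, which uses Python's standard `unittest` module. The `fiscal_quarter` implementation and its `fiscal_year_start_month` parameter are assumptions made for illustration. The tests cover a normal case, a fiscal-year boundary, and null input.

```python
import unittest
from datetime import date


def fiscal_quarter(d, fiscal_year_start_month=1):
    """Return the fiscal quarter (1-4) of a date, or None for null input."""
    if d is None:
        return None
    return ((d.month - fiscal_year_start_month) % 12) // 3 + 1


class FiscalQuarterTest(unittest.TestCase):
    def test_calendar_year(self):
        self.assertEqual(fiscal_quarter(date(2024, 5, 15)), 2)

    def test_fiscal_year_starting_in_february(self):
        # January is the last month of a fiscal year that starts in February.
        self.assertEqual(fiscal_quarter(date(2024, 1, 31), 2), 4)
        self.assertEqual(fiscal_quarter(date(2024, 2, 1), 2), 1)

    def test_null_input(self):
        self.assertIsNone(fiscal_quarter(None))
```

Saved as a module, the tests run with `python -m unittest`.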
## Advanced Transformation Techniques
### 1. Conditional Transformations
**Multi-Path Logic**
```yaml
Conditional Mapping:
  Account_Type_Determination:
    conditions:
      - if: Revenue > 1000000 AND Employees > 100
        then:
          Type: 'Enterprise'
          Service_Level: 'Platinum'
      - if: Revenue > 100000 AND Employees > 10
        then:
          Type: 'Mid-Market'
          Service_Level: 'Gold'
      - else:
          Type: 'SMB'
          Service_Level: 'Silver'

  Dynamic_Field_Assignment:
    source: Product_Code
    conditions:
      - if: starts_with("ENT")
        then:
          map_to: Enterprise_Product__c
          apply: enterprise_pricing_rules
      - if: starts_with("SMB")
        then:
          map_to: SMB_Product__c
          apply: smb_discount_rules
```
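The `Account_Type_Determination` conditions behave like an ordered rule list in which the first matching condition wins. A minimal sketch of that evaluation follows. It assumes first-match semantics, which the configuration implies but does not state, and all names are hypothetical.

```python
# Ordered (predicate, outcome) pairs; order matters because evaluation
# stops at the first predicate that returns True.
ACCOUNT_TYPE_RULES = [
    (lambda r: r["Revenue"] > 1_000_000 and r["Employees"] > 100,
     {"Type": "Enterprise", "Service_Level": "Platinum"}),
    (lambda r: r["Revenue"] > 100_000 and r["Employees"] > 10,
     {"Type": "Mid-Market", "Service_Level": "Gold"}),
]

# Outcome of the final `else` branch.
DEFAULT_OUTCOME = {"Type": "SMB", "Service_Level": "Silver"}


def determine_account_type(record: dict) -> dict:
    """Return the outcome of the first matching rule, else the default."""
    for predicate, outcome in ACCOUNT_TYPE_RULES:
        if predicate(record):
            return dict(outcome)  # copy so callers cannot mutate the rule table
    return dict(DEFAULT_OUTCOME)
```

A record with revenue of 2,000,000 but only 50 employees fails the first rule and falls through to `Mid-Market`, which shows why rule order and the combined `AND` conditions both matter.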
### 2. Hierarchical Transformations
**Parent-Child Relationships**
```yaml
Hierarchy_Building:
  Account_Hierarchy:
    steps:
      - Identify root accounts (Parent_ID is null)
      - Build hierarchy levels
      - Calculate rollup values
      - Set hierarchy paths
      - Apply inheritance rules

  Territory_Hierarchy:
    steps:
      - Load territory structure
      - Assign accounts to territories
      - Roll up metrics
      - Apply overlay rules
      - Handle exceptions
```
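The first four `Account_Hierarchy` steps can be sketched as a depth-first walk from the root accounts. The record shape (`Id`, `Parent_ID`, `Revenue`) and the `build_hierarchy` function are assumptions for illustration. Inheritance rules are not covered, and accounts that are unreachable from a root (for example, because of a parent cycle) are silently left out of the result.

```python
def build_hierarchy(accounts):
    """Return {Id: {"level", "path", "rollup"}} for every reachable account."""
    by_id = {a["Id"]: a for a in accounts}
    children = {}
    for a in accounts:
        children.setdefault(a["Parent_ID"], []).append(a["Id"])

    result = {}

    def visit(account_id, level, parent_path):
        path = parent_path + [account_id]
        # Rollup is the account's own revenue plus that of all descendants.
        rollup = by_id[account_id]["Revenue"]
        for child_id in children.get(account_id, []):
            rollup += visit(child_id, level + 1, path)
        result[account_id] = {
            "level": level,
            "path": "/".join(path),
            "rollup": rollup,
        }
        return rollup

    # Step 1: root accounts are those whose Parent_ID is null.
    for root_id in children.get(None, []):
        visit(root_id, 0, [])
    return result


hierarchy = build_hierarchy([
    {"Id": "A", "Parent_ID": None, "Revenue": 100},
    {"Id": "B", "Parent_ID": "A", "Revenue": 50},
    {"Id": "C", "Parent_ID": "B", "Revenue": 25},
])
```

Very deep hierarchies would call for an iterative traversal, since this recursive version is bounded by Python's recursion limit.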
### 3. Temporal Transformations
**Time-Based Logic**
```yaml
Historical_Data_Handling:
  Slowly_Changing_Dimensions:
    Type_1: Overwrite old values
    Type_2: Maintain history with effective dates
    Type_3: Keep current and previous values

  Effective_Dating:
    - Set start_date for new records
    - Update end_date for replaced records
    - Handle overlapping periods
    - Maintain temporal integrity
```
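A Type 2 change combined with effective dating can be sketched as "close the open row, then append a new one". The row layout and the `apply_scd2` function below are hypothetical. Periods are treated as half-open (`start_date` inclusive, `end_date` exclusive), which is one common way to avoid overlaps, and the sketch assumes changes arrive in chronological order.

```python
from datetime import date


def apply_scd2(history, key, new_values, effective):
    """Record a Type 2 change for `key`, mutating and returning `history`.

    The currently open row (end_date is None) is closed at `effective`
    and a new open row is appended. Unchanged values are a no-op.
    """
    for row in history:
        if row["key"] == key and row["end_date"] is None:
            if row["values"] == new_values:
                return history  # nothing changed, keep the open row
            row["end_date"] = effective
    history.append({
        "key": key,
        "values": new_values,
        "start_date": effective,
        "end_date": None,
    })
    return history


history = []
apply_scd2(history, "001", {"Tier": "Gold"}, date(2024, 1, 1))
apply_scd2(history, "001", {"Tier": "Platinum"}, date(2024, 6, 1))
```

After these two calls the Gold row ends on 2024-06-01 and the Platinum row starts on the same date with an open end, so the periods touch without overlapping.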
## Performance Optimization
### Optimization Strategies
**Processing Efficiency**
```yaml
Bulk Processing:
  - Batch size optimization (10K-100K records)
  - Parallel stream processing
  - Memory-efficient algorithms
  - Lazy evaluation techniques

Caching Strategies:
  - Lookup table caching
  - Computed value caching
  - Session-based caching
  - Distributed cache usage

Resource Management:
  - Connection pooling
  - Thread pool optimization
  - Memory allocation tuning
  - Garbage collection optimization
```
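Two of these strategies, lazy batching and lookup caching, fit in a few lines of standard-library Python. The sketch below is illustrative only: `iter_batches`, `COUNTRY_CODES`, and this form of `cached_lookup` are assumptions, and the in-process `lru_cache` stands in for whatever cache a real deployment would use.

```python
from functools import lru_cache
from itertools import islice


def iter_batches(records, batch_size):
    """Lazily yield lists of up to batch_size records from any iterable."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical lookup table; in practice this would be loaded once from
# the source system or a reference file.
COUNTRY_CODES = {"United States": "US", "Germany": "DE"}


@lru_cache(maxsize=1024)
def cached_lookup(country_name):
    """Resolve a country code, memoizing repeated lookups."""
    return COUNTRY_CODES.get(country_name, "UNKNOWN")
```

Because `iter_batches` pulls from an iterator, only one batch is held in memory at a time. `cached_lookup.cache_info()` reports hits and misses, which helps when tuning `maxsize`.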
### Performance Monitoring
**Metrics and KPIs**
```yaml
Transformation Metrics:
  Throughput:
    - Records per second
    - Batches per hour
    - Total processing time

  Quality:
    - Transformation accuracy
    - Error rates
    - Data loss metrics

  Resource Usage:
    - CPU utilization
    - Memory consumption
    - I/O operations
    - Network bandwidth
```
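The throughput and quality metrics can be derived from four counters collected per run. The `RunStats` class below is a hypothetical sketch. In particular, it assumes "data loss" means records that were neither written nor reported as failed.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    records_in: int
    records_out: int
    records_failed: int
    elapsed_seconds: float

    @property
    def records_per_second(self) -> float:
        if self.elapsed_seconds <= 0:
            return 0.0
        return self.records_in / self.elapsed_seconds

    @property
    def error_rate(self) -> float:
        if self.records_in == 0:
            return 0.0
        return self.records_failed / self.records_in

    @property
    def records_lost(self) -> int:
        # Records that are unaccounted for: not written and not flagged.
        return self.records_in - self.records_out - self.records_failed


stats = RunStats(records_in=1000, records_out=980, records_failed=15,
                 elapsed_seconds=4.0)
```

For this run the throughput is 250 records per second, the error rate is 1.5%, and 5 records are unaccounted for, which would warrant investigation before the load is signed off.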
## Error Handling and Recovery
### Error Management Framework
**Error Categories and Handling**
```yaml
Data Errors:
  Invalid_Format:
    Action: Log and skip
    Recovery: Manual correction

  Missing_Required_Field:
    Action: Quarantine record
    Recovery: Request missing data

  Constraint_Violation:
    Action: Apply default/fix
    Recovery: Automated correction

System Errors:
  Connection_Failure:
    Action: Retry with backoff
    Recovery: Circuit breaker

  Resource_Exhaustion:
    Action: Pause and recover
    Recovery: Scale resources

  Timeout:
    Action: Checkpoint and retry
    Recovery: Resume from checkpoint
```
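Two of the actions above, "Retry with backoff" and "Quarantine record", are sketched below. The function names, default delays, and quarantine record shape are assumptions. The circuit breaker and checkpointing are out of scope for this example.

```python
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=0.5,
                       sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    Delays grow as base_delay * 2**n. The last failure is re-raised so the
    caller can escalate (for example, to a circuit breaker).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def quarantine_missing(records, required_fields):
    """Split records into (valid, quarantined) by required-field presence."""
    valid, quarantined = [], []
    for record in records:
        missing = [f for f in required_fields if record.get(f) is None]
        if missing:
            quarantined.append({"record": record, "missing": missing})
        else:
            valid.append(record)
    return valid, quarantined
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting. Quarantined entries keep the original record together with the list of missing fields, so the missing data can be requested later.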
## Success Criteria
- ✅ All transformations accurately mapped
- ✅ Performance targets achieved
- ✅ Error handling comprehensive
- ✅ Testing coverage complete
- ✅ Documentation current
- ✅ Monitoring operational