# Common Salesforce Data Issues
## Overview
This document catalogues common data issues encountered in Salesforce
implementations, their root causes, impact, and remediation strategies.
Understanding these patterns helps prevent and quickly resolve data quality
problems.
## Duplicate Records
### Issue Description
Multiple records representing the same entity exist in the system, causing
confusion, inaccurate reporting, and poor user experience.
### Common Causes
- No duplicate prevention rules configured
- Multiple data entry points (manual, integration, import)
- Inconsistent data entry standards
- Legacy data migrations
- Lead-to-Account conversion issues
### Detection Methods
```apex
// Find potential duplicate accounts by name
SELECT Name, COUNT(Id)
FROM Account
GROUP BY Name
HAVING COUNT(Id) > 1

// Find contacts with the same email
SELECT Email, COUNT(Id)
FROM Contact
WHERE Email != null
GROUP BY Email
HAVING COUNT(Id) > 1
```
### Prevention Strategies
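1. **Duplicate Rules**: Configure matching rules and duplicate rules that alert or block on save
2. **Unique Fields**: Mark external ID fields as unique so repeated keys are rejected
3. **Validation Rules**: Enforce required, well-formed key fields before save
4. **User Training**: Teach a search-first approach before creating records
5. **Integration Logic**: Use upsert operations keyed on an external ID instead of blind inserts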
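
The "Integration Logic" item can be sketched as follows. This is a minimal example, not a definitive implementation: it assumes a custom external ID field named `External_Id__c` on Account (the same field name used later in this document), and the class name `AccountUpserter` is hypothetical.

```apex
public class AccountUpserter {
    // Upserts accounts keyed on External_Id__c (assumed custom external ID field).
    // Returns one message per row that failed, so callers can log or retry.
    public static List<String> upsertByExternalId(List<Account> incoming) {
        List<String> errors = new List<String>();
        // allOrNone = false lets valid rows succeed even if other rows fail
        List<Database.UpsertResult> results =
            Database.upsert(incoming, Account.Fields.External_Id__c, false);
        for (Integer i = 0; i < results.size(); i++) {
            if (!results[i].isSuccess()) {
                for (Database.Error err : results[i].getErrors()) {
                    errors.add(incoming[i].External_Id__c + ': ' + err.getMessage());
                }
            }
        }
        return errors;
    }
}
```

Because the match is made on the external key, re-running the same integration batch updates existing accounts instead of creating new duplicates.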
### Remediation Approach
```apex
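// Automated merge process.
// Note: only Contacts and Opportunities are re-parented here; other child
// objects (Cases, Tasks, custom objects) need the same treatment before the delete.
public class DuplicateMerger {
    public static void mergeAccounts(Id masterId, List<Id> duplicateIds) {
        // Never treat the master as one of its own duplicates
        Set<Id> losingIds = new Set<Id>(duplicateIds);
        losingIds.remove(masterId);
        if (losingIds.isEmpty()) return;

        // Preserve child records
        List<Contact> contacts = [SELECT Id FROM Contact WHERE AccountId IN :losingIds];
        for (Contact c : contacts) {
            c.AccountId = masterId;
        }
        update contacts;

        // Preserve opportunities
        List<Opportunity> opps = [SELECT Id FROM Opportunity WHERE AccountId IN :losingIds];
        for (Opportunity o : opps) {
            o.AccountId = masterId;
        }
        update opps;

        // Delete duplicates
        delete [SELECT Id FROM Account WHERE Id IN :losingIds];
    }
}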
```
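
As an alternative to re-parenting children by hand, the platform merge operation moves related records to the master automatically. The sketch below assumes the same inputs as `DuplicateMerger`; the class name `NativeAccountMerger` is hypothetical. `Database.merge` accepts at most two duplicates per call, so the list is processed in pairs.

```apex
public class NativeAccountMerger {
    // Merges duplicates into the master with the platform merge operation,
    // which re-parents related records and deletes the losing accounts.
    public static void mergeAccounts(Id masterId, List<Id> duplicateIds) {
        Account master = [SELECT Id FROM Account WHERE Id = :masterId];
        List<Id> chunk = new List<Id>();
        for (Id dupId : duplicateIds) {
            if (dupId == masterId) continue; // skip the master itself
            chunk.add(dupId);
            if (chunk.size() == 2) {
                Database.merge(master, chunk);
                chunk = new List<Id>();
            }
        }
        // Merge a leftover single duplicate, if any
        if (!chunk.isEmpty()) {
            Database.merge(master, chunk);
        }
    }
}
```

Each merge call counts as a DML statement, so very long duplicate lists are better handled in a batch job.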
## Incomplete Records
### Issue Description
Records missing critical information needed for business processes, reporting,
or integration.
### Common Scenarios
- Contacts without email addresses
- Opportunities without close dates
- Accounts missing industry/size data
- Cases without proper categorization
- Leads with minimal information
### Impact Analysis
```apex
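// Measure completeness by object (samples at most 10,000 records)
public class CompletenessAnalyzer {
    public static Map<String, Decimal> analyzeObject(String objectName, List<String> criticalFields) {
        Map<String, Decimal> results = new Map<String, Decimal>();
        // objectName and criticalFields are concatenated into dynamic SOQL,
        // so they must come from trusted configuration, never from user input
        String query = 'SELECT Id';
        for (String field : criticalFields) {
            query += ', ' + field;
        }
        query += ' FROM ' + objectName + ' LIMIT 10000';
        List<SObject> records = Database.query(query);
        // Avoid dividing by zero when the object has no records
        if (records.isEmpty()) {
            return results;
        }
        for (String field : criticalFields) {
            Integer populated = 0;
            for (SObject record : records) {
                if (record.get(field) != null) {
                    populated++;
                }
            }
            // Percentage of sampled records with a value in this field
            results.put(field, (Decimal)populated / records.size() * 100);
        }
        return results;
    }
}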
```
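
For a single field, the same measurement can be pushed into the query, because `COUNT(fieldName)` counts only rows where that field is non-null. The following is a minimal sketch; the class name `ContactCompleteness` and its methods are hypothetical, and on very large objects an aggregate query like this may still run into query row limits.

```apex
public class ContactCompleteness {
    // Percentage of contacts that have an Email value; null when no contacts exist.
    public static Decimal emailCompleteness() {
        List<AggregateResult> rows =
            [SELECT COUNT(Id) total, COUNT(Email) withEmail FROM Contact];
        Integer total = (Integer) rows[0].get('total');
        Integer withEmail = (Integer) rows[0].get('withEmail');
        return percent(withEmail, total);
    }

    // Pure helper: populated / total as a percentage with one decimal place.
    public static Decimal percent(Integer populated, Integer total) {
        if (total == null || total == 0) return null;
        return ((Decimal) populated / total * 100).setScale(1);
    }
}
```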
### Progressive Data Capture
```apex
trigger ProgressiveDataCapture on Opportunity (before insert, before update) {
    for (Opportunity opp : Trigger.new) {
        // Calculate completeness score
        Integer score = 0;
        Integer totalFields = 10;
        if (opp.Amount != null) score++;
        if (opp.CloseDate != null) score++;
        if (opp.NextStep != null) score++;
        if (opp.LeadSource != null) score++;
        if (opp.Type != null) score++;
        if (opp.Probability != null) score++;
        if (opp.Description != null) score++;
        if (opp.Primary_Contact__c != null) score++;
        if (opp.Competitor__c != null) score++;
        if (opp.Loss_Reason__c != null || opp.IsWon) score++;
        opp.Data_Completeness__c = (Decimal)score / totalFields * 100;
    }
}
```
## Data Format Inconsistencies
### Issue Types
- Phone numbers in various formats
- Inconsistent address formatting
- Mixed case in names
- Non-standard date formats
- Currency without proper codes
### Standardization Solutions
```apex
public class DataStandardizer {
    // Phone number standardization (10-digit North American numbers only)
    public static String standardizePhone(String phone) {
        if (String.isBlank(phone)) return phone;
        // Remove all non-numeric characters
        String cleaned = phone.replaceAll('[^0-9]', '');
        // Format based on length
        if (cleaned.length() == 10) {
            return String.format('({0}) {1}-{2}', new List<String>{
                cleaned.substring(0, 3),
                cleaned.substring(3, 6),
                cleaned.substring(6)
            });
        } else if (cleaned.length() == 11 && cleaned.startsWith('1')) {
            // Drop the leading country code and format the remaining 10 digits
            return standardizePhone(cleaned.substring(1));
        }
        return phone; // Return original if can't standardize
    }

    // Name case standardization
    public static String standardizeName(String name) {
        if (String.isBlank(name)) return name;
        // Split on any run of whitespace so repeated spaces collapse
        List<String> parts = name.trim().toLowerCase().split('\\s+');
        List<String> result = new List<String>();
        for (String part : parts) {
            if (part.length() == 0) continue;
            Integer index = part.indexOf('\'');
            // Handle special cases (McDonald, O'Brien, etc.)
            if (part.startsWith('mc') && part.length() > 2) {
                result.add('Mc' + part.substring(2, 3).toUpperCase() + part.substring(3));
            } else if (index >= 0 && index < part.length() - 1) {
                // Only when a character follows the apostrophe; a trailing
                // apostrophe would otherwise run past the end of the string
                result.add(part.substring(0, 1).toUpperCase() +
                    part.substring(1, index + 1) +
                    part.substring(index + 1, index + 2).toUpperCase() +
                    part.substring(index + 2));
            } else {
                result.add(part.substring(0, 1).toUpperCase() + part.substring(1));
            }
        }
        return String.join(result, ' ');
    }
}
```
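
A few sample inputs make the behaviour of the standardizer concrete. The snippet below is meant to be run as anonymous Apex and assumes the `DataStandardizer` class above is deployed in the org; the phone numbers are made-up sample values.

```apex
// Separators are stripped and 10-digit numbers are reformatted
System.assertEquals('(415) 555-0134', DataStandardizer.standardizePhone('415.555.0134'));
// A leading 1 country code is dropped before formatting
System.assertEquals('(415) 555-0134', DataStandardizer.standardizePhone('+1 415 555 0134'));
// Anything else is returned unchanged rather than guessed at
System.assertEquals('+44 20 7946 0958', DataStandardizer.standardizePhone('+44 20 7946 0958'));
// Name casing, including the Mc and apostrophe special cases
System.assertEquals('McDonald', DataStandardizer.standardizeName('MCDONALD'));
System.assertEquals('O\'Brien', DataStandardizer.standardizeName('o\'brien'));
```

Returning unrecognized phone numbers untouched is a deliberate choice: it avoids corrupting international numbers that the 10-digit rule does not cover.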
## Invalid Relationships
### Common Issues
- Orphaned child records
- Circular references
- Invalid lookup values
- Broken hierarchies
- Mismatched record types
### Detection Queries
```sql
-- Orphaned contacts
SELECT Id, Name FROM Contact WHERE AccountId = null
-- Circular account hierarchies: SOQL cannot compare one field to another
-- (WHERE ParentId = Id is rejected), so fetch the parent chain and compare in Apex
SELECT Id, Name, ParentId, Parent.ParentId FROM Account WHERE ParentId != null
-- Accounts owned by inactive users
SELECT Id, Name FROM Account
WHERE Owner.IsActive = false
-- Opportunities without account
SELECT Id, Name FROM Opportunity WHERE AccountId = null
```
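
Because a SOQL filter compares a field to a value rather than to another field, hierarchy loops have to be detected in Apex. The sketch below walks each account's parent chain and reports accounts whose chain leads back to themselves. The class name `HierarchyCycleFinder` and the 10,000-row sample size are assumptions for illustration; chains that extend beyond the sampled rows are not followed.

```apex
public class HierarchyCycleFinder {
    // Returns the Ids whose parent chain eventually loops back to the same Id.
    public static Set<Id> findCycles(Map<Id, Id> parentById) {
        Set<Id> inCycle = new Set<Id>();
        for (Id startId : parentById.keySet()) {
            Set<Id> seen = new Set<Id>{ startId };
            Id current = parentById.get(startId);
            while (current != null) {
                if (current == startId) {
                    inCycle.add(startId); // walked back to where we started
                    break;
                }
                if (seen.contains(current)) {
                    break; // reached a loop that startId leads into but is not part of
                }
                seen.add(current);
                current = parentById.get(current);
            }
        }
        return inCycle;
    }

    // Loads a sample of the Account hierarchy and checks it for loops.
    public static Set<Id> findAccountCycles() {
        Map<Id, Id> parentById = new Map<Id, Id>();
        for (Account acc : [SELECT Id, ParentId FROM Account
                            WHERE ParentId != null LIMIT 10000]) {
            parentById.put(acc.Id, acc.ParentId);
        }
        return findCycles(parentById);
    }
}
```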
### Relationship Repair
```apex
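public class RelationshipRepairer {
    public static void fixOrphanedContacts() {
        List<Contact> orphans = [SELECT Id, Email, LastName FROM Contact WHERE AccountId = null];
        Map<String, Id> domainToAccount = new Map<String, Id>();
        // Build domain map (if several accounts share a domain, the last one wins)
        for (Account acc : [SELECT Id, Website FROM Account WHERE Website != null]) {
            String domain = extractDomain(acc.Website);
            if (domain != null) {
                domainToAccount.put(domain, acc.Id);
            }
        }
        // Match contacts by email domain; map keys are case-sensitive, so lowercase first
        List<Contact> matched = new List<Contact>();
        for (Contact con : orphans) {
            if (con.Email != null && con.Email.contains('@')) {
                String domain = con.Email.substring(con.Email.indexOf('@') + 1).toLowerCase();
                if (domainToAccount.containsKey(domain)) {
                    con.AccountId = domainToAccount.get(domain);
                    matched.add(con);
                }
            }
        }
        // Update only the contacts that were actually re-parented
        update matched;
    }

    // Assumed helper (not defined in the original): reduces a Website value
    // such as 'https://www.example.com/about' to 'example.com'
    private static String extractDomain(String website) {
        if (String.isBlank(website)) return null;
        String host = website.trim().toLowerCase()
            .replaceFirst('^[a-z][a-z0-9+.-]*://', '')
            .replaceFirst('^www\\.', '')
            .replaceFirst('[/?#:].*$', '');
        return String.isBlank(host) ? null : host;
    }
}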
```
## Stale Data
### Indicators
- Records not modified in more than a year
- Open opportunities past their close date
- Inactive contacts still marked active
- Outdated product information
- Historical data affecting performance
### Archival Strategy
```apex
// Identify stale records
public class StaleDataIdentifier {
    public static List<Id> findStaleOpportunities(Integer daysOld) {
        Date cutoffDate = Date.today().addDays(-daysOld);
        // Opportunities with no logged activity have a null LastActivityDate,
        // which a plain < comparison would silently exclude
        return new List<Id>(
            new Map<Id, Opportunity>([
                SELECT Id FROM Opportunity
                WHERE IsClosed = true
                AND CloseDate < :cutoffDate
                AND (LastActivityDate < :cutoffDate OR LastActivityDate = null)
            ]).keySet()
        );
    }

    public static void archiveRecords(List<Id> recordIds) {
        // Create archive records (Data_Archive__c is a custom object).
        // Only the record Id is stored here; copy any field values that
        // must survive into the archive before the originals are deleted.
        List<Data_Archive__c> archives = new List<Data_Archive__c>();
        for (Id recordId : recordIds) {
            Data_Archive__c archive = new Data_Archive__c();
            archive.Original_Record_Id__c = recordId;
            archive.Archive_Date__c = DateTime.now();
            archive.Archive_Reason__c = 'Stale Data';
            archives.add(archive);
        }
        insert archives;
        // Delete originals after successful archive
        delete [SELECT Id FROM Opportunity WHERE Id IN :recordIds];
    }
}
```
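
For large volumes, the archive step is usually run in chunks so that each chunk stays within per-transaction limits. The following is a minimal Batch Apex sketch that reuses `StaleDataIdentifier.archiveRecords` from above; the class name `StaleOpportunityArchiveBatch` and the 730-day threshold in the usage line are assumptions.

```apex
public class StaleOpportunityArchiveBatch implements Database.Batchable<SObject> {
    private final Date cutoffDate;

    public StaleOpportunityArchiveBatch(Integer daysOld) {
        this.cutoffDate = Date.today().addDays(-daysOld);
    }

    // Selects closed opportunities with no recent (or no) activity
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator([
            SELECT Id FROM Opportunity
            WHERE IsClosed = true
            AND CloseDate < :cutoffDate
            AND (LastActivityDate < :cutoffDate OR LastActivityDate = null)
        ]);
    }

    // Each scope runs in its own transaction with fresh governor limits
    public void execute(Database.BatchableContext bc, List<Opportunity> scope) {
        List<Id> ids = new List<Id>(new Map<Id, Opportunity>(scope).keySet());
        StaleDataIdentifier.archiveRecords(ids);
    }

    public void finish(Database.BatchableContext bc) {
        // Nothing to clean up in this sketch
    }
}
```

It could be started with `Database.executeBatch(new StaleOpportunityArchiveBatch(730), 200);`, where 200 is the number of records handed to each `execute` call.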
## Data Security Issues
### Common Problems
- Overly permissive sharing
- Sensitive data in wrong fields
- Unencrypted sensitive information
- Improper field-level security
- Audit trail gaps
### Security Audit
```apex
// Check for sensitive data exposure (name-based heuristic; field contents are not scanned)
public class SecurityAuditor {
    // Minimal issue record; its shape is assumed from the constructor call below
    public class SecurityIssue {
        public String issueType;
        public String location;
        public String severity;
        public SecurityIssue(String issueType, String location, String severity) {
            this.issueType = issueType;
            this.location = location;
            this.severity = severity;
        }
    }

    public static List<SecurityIssue> auditFieldSecurity() {
        List<SecurityIssue> issues = new List<SecurityIssue>();
        // Check for SSN in non-encrypted fields
        Map<String, Schema.SObjectType> schemaMap = Schema.getGlobalDescribe();
        for (String objName : schemaMap.keySet()) {
            Schema.SObjectType objType = schemaMap.get(objName);
            Map<String, Schema.SObjectField> fieldMap = objType.getDescribe().fields.getMap();
            for (String fieldName : fieldMap.keySet()) {
                Schema.DescribeFieldResult fieldDesc = fieldMap.get(fieldName).getDescribe();
                // Check for potential sensitive data
                if ((fieldDesc.getType() == Schema.DisplayType.STRING ||
                        fieldDesc.getType() == Schema.DisplayType.TEXTAREA) &&
                    (fieldDesc.getName().containsIgnoreCase('ssn') ||
                        fieldDesc.getName().containsIgnoreCase('social') ||
                        fieldDesc.getName().containsIgnoreCase('tax_id')) &&
                    !fieldDesc.isEncrypted()) {
                    issues.add(new SecurityIssue(
                        'Unencrypted Sensitive Field',
                        objName + '.' + fieldName,
                        'HIGH'
                    ));
                }
            }
        }
        return issues;
    }
}
```
## Integration Data Mismatches
### Sync Issues
- Field mapping errors
- Data type mismatches
- Timezone discrepancies
- Character encoding problems
- API limit violations
### Sync Validation
```apex
// Validate data synchronization.
// SyncDiscrepancy and getExternalChecksum are not defined in this document;
// they stand for a result wrapper and an HTTP callout to the external system.
public class SyncValidator {
    public static List<SyncDiscrepancy> validateSync(String externalSystem) {
        List<SyncDiscrepancy> discrepancies = new List<SyncDiscrepancy>();
        // Compare checksums
        List<Account> accounts = [
            SELECT Id, External_Id__c, Checksum__c, LastModifiedDate
            FROM Account
            WHERE External_Id__c != null
        ];
        for (Account acc : accounts) {
            // One callout per record: stop before the per-transaction callout limit
            if (Limits.getCallouts() >= Limits.getLimitCallouts()) {
                break;
            }
            // Call external system API to get checksum
            String externalChecksum = getExternalChecksum(acc.External_Id__c);
            if (acc.Checksum__c != externalChecksum) {
                discrepancies.add(new SyncDiscrepancy(
                    acc.Id,
                    'Checksum mismatch',
                    acc.Checksum__c,
                    externalChecksum
                ));
            }
        }
        return discrepancies;
    }
}
```
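
The validator above presumes that `Checksum__c` is already populated. One way to compute such a value is a SHA-256 digest over a fixed, ordered list of field values, as sketched below. The class name `ChecksumUtil`, the `|` separator, and the choice of fields in the usage line are assumptions; the external system would have to apply exactly the same ordering and normalization for the comparison to be meaningful.

```apex
public class ChecksumUtil {
    // Hex-encoded SHA-256 over the given values, joined with '|'.
    // Nulls become empty strings and surrounding whitespace is trimmed,
    // so cosmetic differences do not register as mismatches.
    public static String compute(List<String> values) {
        List<String> normalized = new List<String>();
        for (String v : values) {
            normalized.add(v == null ? '' : v.trim());
        }
        Blob digest = Crypto.generateDigest(
            'SHA-256', Blob.valueOf(String.join(normalized, '|')));
        return EncodingUtil.convertToHex(digest);
    }
}
```

A record's checksum could then be set with `acc.Checksum__c = ChecksumUtil.compute(new List<String>{ acc.Name, acc.Phone, acc.BillingPostalCode });`.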
## Performance-Related Data Issues
### Large Data Volume Problems
- Slow queries on large objects
- Skinny table violations
- Non-selective queries
- Missing indexes
- Data skew
### Performance Optimization
```apex
// Identify and fix data skew
public class DataSkewAnalyzer {
    public static Map<Id, Integer> analyzeOwnershipSkew(String objectName) {
        // objectName is concatenated into dynamic SOQL; pass trusted values only
        String query = 'SELECT OwnerId, COUNT(Id) cnt FROM ' + objectName +
            ' GROUP BY OwnerId HAVING COUNT(Id) > 10000';
        Map<Id, Integer> skewedOwners = new Map<Id, Integer>();
        for (AggregateResult ar : Database.query(query)) {
            skewedOwners.put((Id)ar.get('OwnerId'), (Integer)ar.get('cnt'));
        }
        return skewedOwners;
    }

    public static void redistributeRecords(Id skewedOwnerId, List<Id> newOwnerIds) {
        if (newOwnerIds == null || newOwnerIds.isEmpty()) return;
        // A single transaction can modify at most 10,000 rows via DML,
        // so move one slice per call and repeat (or run this from Batch Apex)
        List<Account> accounts = [
            SELECT Id FROM Account
            WHERE OwnerId = :skewedOwnerId
            LIMIT 10000
        ];
        // Round-robin assignment spreads records evenly, including any remainder,
        // and cannot divide by zero when there are fewer records than owners
        for (Integer i = 0; i < accounts.size(); i++) {
            accounts[i].OwnerId = newOwnerIds[Math.mod(i, newOwnerIds.size())];
        }
        update accounts;
    }
}
```
## Data Migration Issues
### Common Problems
- Data truncation
- Character encoding errors
- Relationship mapping failures
- Picklist value mismatches
- Date/time conversion errors
### Migration Validation
```apex
// Post-migration validation.
// ValidationReport and the Source_System__c custom field are assumed to exist in the org.
public class MigrationValidator {
    public static ValidationReport validateMigration(String sourceSystem) {
        ValidationReport report = new ValidationReport();
        // Record counts
        report.addMetric('Account Count',
            [SELECT COUNT() FROM Account WHERE Source_System__c = :sourceSystem]);
        // Data quality checks (Account has no standard Email field, so check Contact)
        report.addMetric('Contacts Missing Email',
            [SELECT COUNT() FROM Contact
             WHERE Source_System__c = :sourceSystem AND Email = null]);
        // Relationship integrity
        report.addMetric('Orphaned Contacts',
            [SELECT COUNT() FROM Contact
             WHERE Source_System__c = :sourceSystem AND AccountId = null]);
        return report;
    }
}
```
## Prevention Best Practices
### Data Quality Framework
1. **Validation Rules**: Implement comprehensive validation
2. **Triggers**: Real-time data standardization
3. **Batch Jobs**: Regular cleanup processes
4. **Monitoring**: Dashboard and alerts
5. **Training**: User education programs
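
The "Triggers" item can be illustrated with a short before-save trigger. This sketch assumes the `DataStandardizer` class from the Data Format Inconsistencies section is deployed; the trigger name `ContactStandardization` is hypothetical.

```apex
trigger ContactStandardization on Contact (before insert, before update) {
    for (Contact con : Trigger.new) {
        // Before-save assignments are persisted without an extra DML statement
        con.Phone = DataStandardizer.standardizePhone(con.Phone);
        con.FirstName = DataStandardizer.standardizeName(con.FirstName);
        con.LastName = DataStandardizer.standardizeName(con.LastName);
    }
}
```

Both standardizer methods return blank input unchanged, so records with missing values pass through untouched.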
### Automated Quality Checks
```apex
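// Scheduled data quality monitor.
// checkDuplicates() and the other helpers are placeholders for the checks
// described in the sections above; they are not defined in this document.
global class DataQualityMonitor implements Schedulable {
    global void execute(SchedulableContext ctx) {
        // Run quality checks
        checkDuplicates();
        checkCompleteness();
        checkRelationships();
        checkDataAge();
        // Send report
        sendQualityReport();
    }
}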
```
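
Once the monitor class is deployed, it still has to be scheduled. A minimal way to do that from anonymous Apex is shown below; the job name and the 02:00 run time are illustrative choices.

```apex
// Cron fields: seconds minutes hours day-of-month month day-of-week
// '0 0 2 * * ?' runs every day at 02:00 in the scheduling user's time zone
String jobId = System.schedule(
    'Data Quality Monitor - Daily',
    '0 0 2 * * ?',
    new DataQualityMonitor()
);
System.debug('Scheduled job Id: ' + jobId);
```

Job names must be unique among scheduled jobs, so scheduling the same name twice raises an error until the earlier job is removed.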
## Additional Resources
- [Salesforce Data Quality Best Practices](https://help.salesforce.com/s/articleView?id=sf.data_quality_best_practices.htm)
- [Large Data Volume Considerations](https://developer.salesforce.com/docs/atlas.en-us.salesforce_large_data_volumes_bp.meta/salesforce_large_data_volumes_bp/)
- [Data Loader Guide](https://developer.salesforce.com/docs/atlas.en-us.dataLoader.meta/dataLoader/)
- [Duplicate Management](https://help.salesforce.com/s/articleView?id=sf.duplicate_management_overview.htm)