# @coffeeandfun/remove-pii
**Protect privacy by removing personally identifiable information (PII) from text!**
@coffeeandfun/remove-pii is a Node.js package that automatically detects and removes personally identifiable information from text. Originally developed for Helperbird.com, it has grown into a general-purpose tool for protecting privacy in text processing.
Created by **Robert James Gabriel** at **Coffee & Fun LLC** - making the web more accessible and privacy-focused for everyone.
## Why Use This?
**For Privacy Protection:**
- ✅ Automatically removes sensitive information
- ✅ Prevents accidental data leaks
- ✅ GDPR and privacy compliance helper
- ✅ Customizable for different use cases
**For Developers:**
- ✅ Easy integration (just 1 line of code!)
- ✅ Lightweight and fast
- ✅ Comprehensive PII detection
- ✅ Detailed analysis and reporting
## Installation
```bash
npm install @coffeeandfun/remove-pii
```
## Quick Start
```javascript
import { removePII } from '@coffeeandfun/remove-pii';
const text = "John's email is john@example.com and his phone number is 123-456-7890.";
const cleanedText = removePII(text);
console.log(cleanedText);
// Output: "John's email is [email removed] and his phone number is [phone removed]."
```
## PII Types Detected
- **Email Addresses** - `john@example.com`
- **Phone Numbers** - `123-456-7890`, `(555) 123-4567`
- **Social Security Numbers** - `123-45-6789`
- **Credit Card Numbers** - `1234 5678 9012 3456`
- **Physical Addresses** - `123 Main Street`
- **Passport Numbers** - `AB1234567`
- **Driver's License Numbers** - `D123456789`
- **IP Addresses** - `192.168.1.1`
- **ZIP Codes** - `12345`, `12345-6789`
- **Bank Account Numbers** - `1234567890123456`
- **URLs** - `https://example.com`
- **Dates of Birth** - `01/15/1990`
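To give a feel for how pattern-based detection of types like these can work, here is a minimal sketch covering only emails and US-style phone numbers. The regular expressions and the `redactSketch` name are assumptions made for illustration; this README does not show the package's actual patterns, which are likely more thorough.

```javascript
// Illustrative patterns only; NOT the package's real implementation.
const SKETCH_PATTERNS = [
  {
    type: 'email',
    regex: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g,
    replacement: '[email removed]',
  },
  {
    type: 'phone',
    // Matches 123-456-7890 and (555) 123-4567 style numbers.
    regex: /(?:\(\d{3}\) ?|\b\d{3}[-. ])\d{3}[-. ]\d{4}\b/g,
    replacement: '[phone removed]',
  },
];

// Apply each pattern in turn, replacing every match with its placeholder.
function redactSketch(text) {
  return SKETCH_PATTERNS.reduce(
    (current, { regex, replacement }) => current.replace(regex, replacement),
    text
  );
}

console.log(redactSketch('Reach me at john@example.com or 123-456-7890.'));
// Reach me at [email removed] or [phone removed].
```

Simple patterns like these will miss some formats and may flag text that is not PII, which is one reason to prefer a maintained library over hand-rolled expressions.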
## API Reference
### `removePII(text, options)`
Main function that removes PII and returns cleaned text.
```javascript
const cleanedText = removePII("Email: john@example.com, Phone: 123-456-7890");
// Returns: "Email: [email removed], Phone: [phone removed]"
```
### `removePIIDetailed(text, options)`
Enhanced version with detailed information about what was removed.
```javascript
import { removePIIDetailed } from '@coffeeandfun/remove-pii';

const result = removePIIDetailed("Email: john@example.com");
console.log(result);
// {
//   cleanedText: "Email: [email removed]",
//   removedItems: [{ type: 'email', count: 1, items: ['john@example.com'] }],
//   originalLength: 23,
//   cleanedLength: 20,
//   reductionPercentage: 13
// }
```
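The `reductionPercentage` field appears to relate the two lengths. As a worked example, the sketch below assumes it is the rounded percentage of characters removed; that formula and the `reductionPercentageSketch` name are assumptions, not something this README confirms.

```javascript
// Assumed formula: share of characters removed, rounded to a whole percent.
function reductionPercentageSketch(originalLength, cleanedLength) {
  if (originalLength === 0) return 0; // avoid dividing by zero for empty input
  const removed = originalLength - cleanedLength;
  return Math.round((removed / originalLength) * 100);
}

// Using the lengths from the example output above: (23 - 20) / 23 ≈ 0.1304
console.log(reductionPercentageSketch(23, 20)); // 13
```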
### `detectPII(text, options)`
Detects PII without removing it - useful for analysis.
```javascript
import { detectPII } from '@coffeeandfun/remove-pii';

const analysis = detectPII("Email: john@example.com, Phone: 123-456-7890");
console.log(analysis);
// {
//   text: "Email: john@example.com, Phone: 123-456-7890",
//   hasPII: true,
//   totalMatches: 2,
//   types: ['email', 'phone'],
//   detectedItems: [...]
// }
```
### `analyzePII(text, options)`
Comprehensive analysis with statistics and risk assessment.
```javascript
import { analyzePII } from '@coffeeandfun/remove-pii';

const analysis = analyzePII("Email: john@example.com, SSN: 123-45-6789");
console.log(analysis);
// {
//   original: { text: "...", length: 45, wordCount: 6 },
//   cleaned: { text: "...", length: 35, wordCount: 6 },
//   pii: { detected: [...], totalCount: 2, types: ['email', 'ssn'] },
//   risk: { level: 'medium', score: 13 }
// }
```
### `validatePIICompliance(text, options)`
Check if text is PII-compliant with recommendations.
```javascript
import { validatePIICompliance } from '@coffeeandfun/remove-pii';

const compliance = validatePIICompliance("Email: john@example.com");
console.log(compliance);
// {
//   isCompliant: false,
//   violations: [{ type: 'email', count: 1 }],
//   riskLevel: 'low',
//   recommendations: ['📧 Email detected - Consider using hashed emails']
// }
```
## Customization
### Basic Configuration
```javascript
const options = {
  email: { remove: true, replacement: "[EMAIL HIDDEN]" },
  phone: { remove: false },
  ssn: { remove: true, replacement: "[SSN REDACTED]" }
};
const cleaned = removePII(text, options);
```
### Privacy Levels
#### High Privacy Mode
```javascript
const highPrivacy = {
  email: { remove: true },
  phone: { remove: true },
  ssn: { remove: true },
  creditCard: { remove: true },
  address: { remove: true },
  passport: { remove: true },
  driversLicense: { remove: true },
  ipAddress: { remove: true },
  zipCode: { remove: true },
  bankAccount: { remove: true },
  url: { remove: true },
  dateOfBirth: { remove: true }
};
```
#### Moderate Privacy Mode
```javascript
const moderatePrivacy = {
  ssn: { remove: true },
  creditCard: { remove: true },
  bankAccount: { remove: true },
  email: { remove: false },
  phone: { remove: false },
  address: { remove: true }
};
```
#### Custom Replacements
```javascript
const customReplacements = {
  email: { replacement: "📧 [CONTACT INFO]" },
  phone: { replacement: "📞 [PHONE NUMBER]" },
  address: { replacement: "📍 [LOCATION]" }
};
```
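If you maintain several privacy levels, writing each options object by hand gets repetitive. The helper below is a hypothetical convenience (not part of the package) that builds an options object from a list of types to remove. It uses the option keys shown in the High Privacy Mode example and sets `remove: false` explicitly for everything else, rather than relying on the package's defaults, which this README does not spell out.

```javascript
// Option keys as they appear in the High Privacy Mode example.
const PII_TYPE_KEYS = [
  'email', 'phone', 'ssn', 'creditCard', 'address', 'passport',
  'driversLicense', 'ipAddress', 'zipCode', 'bankAccount', 'url', 'dateOfBirth',
];

// Hypothetical helper: build an options object from a list of types to remove.
function buildPrivacyOptions(typesToRemove, replacements = {}) {
  const wanted = new Set(typesToRemove);
  for (const type of wanted) {
    if (!PII_TYPE_KEYS.includes(type)) {
      throw new Error(`Unknown PII type: ${type}`);
    }
  }
  const options = {};
  for (const key of PII_TYPE_KEYS) {
    options[key] = { remove: wanted.has(key) };
    // Attach a custom placeholder only for types that are actually removed.
    if (wanted.has(key) && replacements[key] !== undefined) {
      options[key].replacement = replacements[key];
    }
  }
  return options;
}

const financialOnly = buildPrivacyOptions(
  ['ssn', 'creditCard', 'bankAccount'],
  { ssn: '[SSN REDACTED]' }
);
console.log(financialOnly.ssn);   // { remove: true, replacement: '[SSN REDACTED]' }
console.log(financialOnly.email); // { remove: false }
```

The resulting object can then be passed as the `options` argument to `removePII`.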
## Advanced Features
### Batch Processing
```javascript
import { processBatch } from '@coffeeandfun/remove-pii';
const texts = [
  "Email: john@example.com",
  "Phone: 123-456-7890",
  "Regular text"
];
const results = processBatch(texts);
console.log(results);
// Array of results with success/failure status
```
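The exact shape of each batch result is not shown here. To illustrate the general idea of per-item success/failure reporting, the sketch below isolates errors so that one bad input does not abort the whole batch. The function name `processBatchSketch` and the result fields (`index`, `success`, `cleanedText`, `error`) are assumptions for this example, and the cleaning function is passed in so the sketch does not depend on the package.

```javascript
// Hypothetical sketch of batch processing with per-item error isolation.
// `removeFn` is any (text) => cleanedText function, e.g. the package's removePII.
function processBatchSketch(texts, removeFn) {
  return texts.map((text, index) => {
    try {
      return { index, success: true, cleanedText: removeFn(text) };
    } catch (error) {
      // Record the failure and keep going with the remaining items.
      return { index, success: false, error: error.message };
    }
  });
}

// Usage with a stand-in cleaner that rejects non-string input.
const standInCleaner = (text) => {
  if (typeof text !== 'string') throw new TypeError('Input must be a string');
  return text.replace(/\S+@\S+/g, '[email removed]');
};
console.log(processBatchSketch(['Email: john@example.com', 42], standInCleaner));
```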
### Risk Assessment
```javascript
const compliance = validatePIICompliance(text);
console.log(`Risk Level: ${compliance.riskLevel}`);
console.log(`Risk Score: ${compliance.riskScore}`);
console.log(`Recommendations: ${compliance.recommendations.join(', ')}`);
```
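A risk level is most useful when it drives a decision. The sketch below compares a compliance result against a maximum allowed level. The examples in this README show `'low'` and `'medium'`; the `'high'` level, the ordering, and the `exceedsRiskThreshold` name are assumptions made for illustration.

```javascript
// Ordering assumed for this sketch; 'high' does not appear in the examples above.
const RISK_ORDER = ['low', 'medium', 'high'];

// Returns true when the reported risk level is above the allowed maximum.
function exceedsRiskThreshold(compliance, maxAllowed = 'low') {
  const actual = RISK_ORDER.indexOf(compliance.riskLevel);
  const limit = RISK_ORDER.indexOf(maxAllowed);
  if (actual === -1 || limit === -1) {
    throw new Error(`Unknown risk level: ${compliance.riskLevel} / ${maxAllowed}`);
  }
  return actual > limit;
}

// Usage with a hand-written object shaped like a validatePIICompliance result.
console.log(exceedsRiskThreshold({ riskLevel: 'medium' }, 'low')); // true
console.log(exceedsRiskThreshold({ riskLevel: 'low' }, 'medium')); // false
```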
### Available PII Types
```javascript
import { getAvailableTypes } from '@coffeeandfun/remove-pii';
const types = getAvailableTypes();
types.forEach(type => {
  console.log(`${type.type}: ${type.description}`);
});
```
## Real-World Examples
### Data Cleaning Pipeline
```javascript
import { removePII, validatePIICompliance } from '@coffeeandfun/remove-pii';
function cleanUserData(userData) {
  const compliance = validatePIICompliance(userData);
  if (!compliance.isCompliant) {
    console.log(`⚠️ PII detected: ${compliance.violationCount} violations`);
    return removePII(userData);
  }
  return userData;
}
```
### Log Sanitization
```javascript
import { removePIIDetailed } from '@coffeeandfun/remove-pii';
function sanitizeLogs(logEntry) {
  const result = removePIIDetailed(logEntry);
  // Each removedItems entry describes one PII type, so sum the per-type counts.
  const removedCount = result.removedItems.reduce((sum, item) => sum + item.count, 0);
  if (removedCount > 0) {
    console.log(`Sanitized log: removed ${removedCount} PII items`);
  }
  return result.cleanedText;
}
```
### API Response Cleaning
```javascript
import { analyzePII } from '@coffeeandfun/remove-pii';
function sanitizeApiResponse(response) {
  const analysis = analyzePII(JSON.stringify(response));
  if (analysis.pii.totalCount > 0) {
    console.log(`⚠️ API response contains PII: ${analysis.pii.types.join(', ')}`);
    return JSON.parse(analysis.cleaned.text);
  }
  return response;
}
```
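One caveat with cleaning serialized JSON: if a replacement lands on an unquoted value (for example a bare number replaced by a bracketed placeholder), the cleaned text may no longer parse. An alternative is to walk the response and clean only its string values. The `sanitizeDeep` helper below is a hypothetical sketch of that approach; the cleaning function is passed in, so you could supply the package's `removePII`.

```javascript
// Hypothetical helper: recursively clean every string value in plain JSON data.
// `removeFn` is any (text) => cleanedText function, e.g. the package's removePII.
function sanitizeDeep(value, removeFn) {
  if (typeof value === 'string') {
    return removeFn(value);
  }
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeDeep(item, removeFn));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, sanitizeDeep(item, removeFn)])
    );
  }
  // Numbers, booleans, and null pass through unchanged.
  return value;
}

// Usage with a stand-in cleaner that only handles emails.
const emailOnlyCleaner = (text) => text.replace(/\S+@\S+/g, '[email removed]');
const response = { user: { contact: 'john@example.com', age: 30 }, tags: ['a@b.co', 'ok'] };
console.log(sanitizeDeep(response, emailOnlyCleaner));
```

This keeps the object structure intact, at the cost of skipping PII stored as numbers and of handling only plain JSON-style data (objects such as `Date` are not preserved).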
---
## Testing
```bash
npm test
```
We've included comprehensive tests covering:
- ✅ All PII types and patterns
- ✅ Edge cases and error handling
- ✅ Performance and consistency
- ✅ Batch processing
- ✅ Custom configurations
## Contributing
We welcome contributions!
1. **Report Issues** - Found a bug or missing PII type?
2. **Suggest Features** - Ideas for better privacy protection?
3. **Submit PRs** - Code improvements welcome!
### Development Setup
```bash
git clone https://github.com/RobertJGabriel/remove-pii
cd remove-pii
npm install
npm test
```
## License
MIT License - feel free to use in your projects!
## Credits
**Created with ❤️ by:**
- **Robert James Gabriel** - Lead Developer
**Originally developed for:**
- **Helperbird** - An accessibility extension making the web accessible for everyone
## Support
Need help protecting privacy in your applications?
- **Bug Reports**: [GitHub Issues](https://github.com/RobertJGabriel/remove-pii/issues)
- **Feature Requests**: [GitHub Discussions](https://github.com/RobertJGabriel/remove-pii/discussions)
**Stay privacy-focused!**