# Turkish Profanity Filter
A configurable library for detecting and censoring Turkish profanity in text content. This package properly handles Turkish special characters and provides flexible options for content moderation.
## Features
- ✅ Built-in Turkish profanity word list
- ✅ Proper handling of Turkish special characters (ğ, ü, ş, ı, ö, ç)
- ✅ Configurable word matching (whole words or partial)
- ✅ Case-sensitive or case-insensitive filtering
- ✅ Customizable replacement text
- ✅ Methods for detection, extraction, and censoring
- ✅ Dynamic word list management (add/remove words)
## Installation
```bash
# Using npm
npm install turkish-profanity-filter
# Using yarn
yarn add turkish-profanity-filter
# Using pnpm
pnpm add turkish-profanity-filter
```
## Quick Start
### CommonJS (Traditional Node.js)
```javascript
const TurkishProfanityFilter = require('turkish-profanity-filter');
// Create a new filter instance with default settings
const filter = new TurkishProfanityFilter();
// Check if text contains profanity
const hasProfanity = filter.check('Bu cümlede kötü bir kelime var mı?');
console.log('Contains profanity:', hasProfanity);
// Censor profanity in text
const censored = filter.censor('Bu cümlede kötü kelime sansürlenecek.');
console.log('Censored text:', censored);
// Get all profanity words found in text
const badWords = filter.getWords('Burada birkaç kötü kelime olabilir.');
console.log('Found profanity words:', badWords);
```
### ES6 Modules
```javascript
import TurkishProfanityFilter from 'turkish-profanity-filter';
// Create a new filter instance
const filter = new TurkishProfanityFilter();
// Wrap the filter in a helper for chat messages.
// check() and censor() are synchronous, so no async/await is needed.
const processChatMessage = (message) => {
  // Check once and reuse the result
  const containsProfanity = filter.check(message);
  return {
    original: message,
    censored: containsProfanity ? filter.censor(message) : message,
    containsProfanity
  };
};
// Sample messages to process
const messages = [
  'Merhaba nasılsın?',
  'Bu kötü bir mesaj.',
  'Güzel bir gün!'
];
// Map each message to its moderation result, calling check() once per message
const processedMessages = messages.map((msg) => {
  const isProfane = filter.check(msg);
  return {
    text: msg,
    isProfane,
    censored: isProfane ? filter.censor(msg) : msg
  };
});
console.log(processedMessages);
```
## Advanced Usage
### Custom Configuration
```javascript
// CommonJS
const TurkishProfanityFilter = require('turkish-profanity-filter');
// ES6
// import TurkishProfanityFilter from 'turkish-profanity-filter';
// Initialize with custom options
const filter = new TurkishProfanityFilter({
  // Use your own word list instead of the default
  wordList: ['kötü', 'çirkin', 'küfür'],
  // Match only whole words (default: true)
  wholeWords: true,
  // Make matching case-sensitive (default: false)
  caseSensitive: false,
  // Custom replacement string (default: '***')
  replacement: '[sansürlendi]'
});
```
### Modifying Word List Dynamically
```javascript
// ES6 module import
import TurkishProfanityFilter from 'turkish-profanity-filter';
const filter = new TurkishProfanityFilter();
// Add single word to the filter
filter.addWords('yeni-kötü-kelime');
// Add multiple words at once with ES6 array
const newBadWords = ['kelime1', 'kelime2', 'kelime3'];
filter.addWords(newBadWords);
// Remove a word from the filter
filter.removeWords('artık-kötü-değil');
// Remove multiple words at once
const wordsToRemove = ['temiz1', 'temiz2'];
filter.removeWords(wordsToRemove);
// Build the output with a template literal
const userName = 'Ahmet';
const userMessage = 'Bu bir kötü mesajdır';
const processedMessage = filter.check(userMessage)
  ? `${userName}: ${filter.censor(userMessage)}`
  : `${userName}: ${userMessage}`;
// Prints "Ahmet: Bu bir *** mesajdır" when 'kötü' is in the active word list
console.log(processedMessage);
```
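When extra words are kept in a plain-text file, they usually need some cleanup before being passed to `addWords`. The following is a minimal sketch of such a step. The helper name `parseWordList` and the file format (one word per line, `#` for comments) are assumptions for illustration and are not part of this package.

```javascript
// Hypothetical helper (not part of the package): turn the contents of a
// plain-text word file into a clean array of words.
// Assumed format: one word per line; blank lines and '#' comments are skipped.
const parseWordList = (rawText) => {
  const words = rawText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith('#'))
    // NFC makes 'o' + combining diaeresis equal to the single letter 'ö'
    .map((line) => line.normalize('NFC'));
  // A Set removes duplicates while keeping first-seen order
  return [...new Set(words)];
};

const sample = '# custom list\nkelime1\n  kelime2  \n\nkelime1\n';
console.log(parseWordList(sample)); // [ 'kelime1', 'kelime2' ]
```

The resulting array can then be handed to `filter.addWords(...)` or `filter.removeWords(...)`, both of which accept an array according to the API reference.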
### Integration with Express.js (ES6)
```javascript
import express from 'express';
import TurkishProfanityFilter from 'turkish-profanity-filter';
const app = express();
const filter = new TurkishProfanityFilter();
app.use(express.json());
// Middleware that censors profanity in the request body
app.use((req, res, next) => {
  // Optional chaining (?.) guards against a missing body (ES2020)
  if (req.body?.content) {
    // Check if content contains profanity
    if (filter.check(req.body.content)) {
      // Either reject the request:
      // return res.status(400).json({ error: 'Content contains inappropriate language' });
      // Or censor the content:
      req.body.content = filter.censor(req.body.content);
    }
  }
  next();
});
// Route handler
app.post('/comments', (req, res) => {
  // req.body.content has already been censored by the middleware above
  // Save to database, etc.
  res.json({ success: true });
});
// Expose the current filter configuration (nothing is awaited, so the handler is synchronous)
app.get('/filter-stats', (req, res) => {
  try {
    const stats = {
      wordListSize: filter.options.wordList.length,
      configuration: {
        caseSensitive: filter.options.caseSensitive,
        wholeWords: filter.options.wholeWords
      }
    };
    res.json(stats);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});
```
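The middleware above censors a single `content` field. When a request body carries several text fields, the same idea can be factored into a small helper. This is a sketch only: `censorFields` is a hypothetical name, and the stand-in filter exists just so the example runs without the package installed. Any object exposing `check(text)` and `censor(text)` works.

```javascript
// Hypothetical helper (not part of the package): censor several string
// fields of a request body and return a cleaned copy.
const censorFields = (filter, body, fields) => {
  const cleaned = { ...body }; // shallow copy, the input is left untouched
  for (const field of fields) {
    const value = cleaned[field];
    // Skip missing and non-string fields
    if (typeof value === 'string' && filter.check(value)) {
      cleaned[field] = filter.censor(value);
    }
  }
  return cleaned;
};

// Stand-in filter for demonstration; replace with a TurkishProfanityFilter instance
const demoFilter = {
  check: (text) => text.includes('kötü'),
  censor: (text) => text.split('kötü').join('***')
};

const body = { title: 'kötü başlık', content: 'temiz', views: 3 };
console.log(censorFields(demoFilter, body, ['title', 'content']));
// { title: '*** başlık', content: 'temiz', views: 3 }
```

Inside the middleware, `req.body = censorFields(filter, req.body, ['title', 'content'])` would then replace the single-field logic. The field names here are examples only.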
## Special Handling for Turkish Characters
This library is designed for Turkish text and handles the Turkish-specific letters (ğ, ü, ş, ı, ö, ç). JavaScript's standard `\b` word boundary treats only ASCII letters, digits, and the underscore as word characters, so it misplaces boundaries next to letters such as ş or ü. The implementation instead uses a custom approach to word boundary detection that works with non-ASCII characters.
```javascript
// ES6 example with Turkish characters
import TurkishProfanityFilter from 'turkish-profanity-filter';
const filter = new TurkishProfanityFilter();
// These will all be properly detected (with default case-insensitive setting)
console.log(filter.check('kötü')); // true
console.log(filter.check('KÖTÜ')); // true
console.log(filter.check('Çirkin')); // true
console.log(filter.check('KÜFÜR')); // true
// With whole word matching (default)
console.log(filter.check('kötülük')); // false - only matches whole words
// Sample texts
const textSamples = [
  'Güzel bir gün',
  'kötü bir söz',
  'ÇIRKIN davranış',
  'normal yazı'
];
// Keep only the texts that contain profanity, then censor them
const results = textSamples
  .filter(text => filter.check(text))
  .map(text => ({
    original: text,
    censored: filter.censor(text)
  }));
console.log(results);
// [
// { original: 'kötü bir söz', censored: '*** bir söz' },
// { original: 'ÇIRKIN davranış', censored: '*** davranış' }
// ]
```
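To see why `\b` is a poor fit for Turkish text, the sketch below compares it with a Unicode-aware alternative built from lookarounds and `\p{L}` / `\p{N}` property escapes. This is an illustration of one possible technique, not the library's actual implementation. `escapeRegExp` and `wholeWordRegExp` are hypothetical helper names.

```javascript
// \b only knows ASCII word characters: 'ü' is not one, so no boundary
// is found at the end of 'kötü' and the pattern fails to match.
console.log(/\bkötü\b/.test('kötü')); // false
// A false boundary is also reported between 'ç' and 'o'.
console.log(/\bok\b/.test('çok')); // true

// Escape characters that have a special meaning in a regular expression
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: match `word` only when it is not directly preceded
// or followed by a Unicode letter, a digit, or an underscore.
const wholeWordRegExp = (word) =>
  new RegExp(`(?<![\\p{L}\\p{N}_])${escapeRegExp(word)}(?![\\p{L}\\p{N}_])`, 'u');

console.log(wholeWordRegExp('kötü').test('kötü bir söz')); // true
console.log(wholeWordRegExp('kötü').test('kötülük'));      // false
console.log(wholeWordRegExp('ok').test('çok'));            // false
```

Lookbehind assertions and Unicode property escapes require a reasonably recent JavaScript engine (ES2018 or later).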
## API Reference
### Constructor
`new TurkishProfanityFilter(options)`
Creates a new filter instance with optional configuration.
**Options:**
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `wordList` | Array | Built-in list | Array of profanity words to detect |
| `wholeWords` | Boolean | `true` | Whether to match only whole words |
| `caseSensitive` | Boolean | `false` | Whether to match case-sensitively |
| `replacement` | String | `'***'` | String to replace profanity with |
### Methods
| Method | Parameters | Return Type | Description |
|--------|------------|-------------|-------------|
| `check(text)` | String | Boolean | Returns `true` if text contains profanity |
| `censor(text)` | String | String | Returns text with profanity replaced by the replacement string |
| `getWords(text)` | String | Array | Returns an array of all profanity words found in text |
| `addWords(words)` | String or Array | void | Add one word (string) or multiple words (array) to the filter |
| `removeWords(words)` | String or Array | void | Remove one word (string) or multiple words (array) from the filter |
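The three read-only methods in the table are often used together. The sketch below shows one way to combine them into a single moderation report. `moderate` is a hypothetical helper, and the stand-in object only mimics the documented method signatures so the example runs on its own.

```javascript
// Hypothetical helper (not part of the package): build one report from
// check(), getWords(), and censor().
const moderate = (filter, text) => {
  const flagged = filter.check(text);
  return {
    flagged,
    // Skip the extra passes over clean text
    words: flagged ? filter.getWords(text) : [],
    censored: flagged ? filter.censor(text) : text
  };
};

// Stand-in with the same method names as the table above;
// replace with a TurkishProfanityFilter instance in real code
const standIn = {
  check: (text) => text.includes('kötü'),
  getWords: (text) => (text.includes('kötü') ? ['kötü'] : []),
  censor: (text) => text.split('kötü').join('***')
};

console.log(moderate(standIn, 'kötü bir söz'));
// { flagged: true, words: [ 'kötü' ], censored: '*** bir söz' }
console.log(moderate(standIn, 'Güzel bir gün'));
// { flagged: false, words: [], censored: 'Güzel bir gün' }
```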
## Performance Considerations
For best performance, especially with large texts or high-traffic applications:
1. **Cache Results**: If the same text is checked repeatedly, cache the results
2. **Batch Processing**: When processing large volumes of text, consider batching
3. **Word List Size**: Larger word lists slow down matching; remove entries you do not need
### ES6 Performance Example
```javascript
import TurkishProfanityFilter from 'turkish-profanity-filter';
// Create an in-memory cache using Map
const cache = new Map();
const checkWithCache = (filter, text) => {
  // Return cached result if available
  if (cache.has(text)) {
    return cache.get(text);
  }
  // Calculate result and store in cache
  const result = filter.check(text);
  cache.set(text, result);
  return result;
};
// Batch processing example
const batchProcess = (filter, textArray) => {
  // check() and censor() are synchronous, so Promise.all adds no parallelism here;
  // it only gives callers a promise-based interface
  return Promise.all(
    textArray.map(async (text) => {
      // Check once (via the cache) and reuse the result
      const containsProfanity = checkWithCache(filter, text);
      return {
        original: text,
        containsProfanity,
        censored: containsProfanity ? filter.censor(text) : text
      };
    })
  );
};
// Usage
const filter = new TurkishProfanityFilter();
const messages = ['Merhaba', 'kötü kelime', 'Nasılsın?'];
// Process messages in batch
batchProcess(filter, messages)
  .then(results => console.log(results))
  .catch(error => console.error(error));
```
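A plain `Map` cache grows without limit, which can become a problem in a long-running, high-traffic process. The following sketch caps the cache size with a simple least-recently-used policy. `createCachedCheck` and `maxEntries` are hypothetical names. The wrapper works with any object that exposes `check(text)`.

```javascript
// Hypothetical helper (not part of the package): wrap filter.check with a
// size-limited, least-recently-used (LRU) cache.
const createCachedCheck = (filter, maxEntries = 1000) => {
  const cache = new Map(); // a Map iterates in insertion order, oldest first
  return (text) => {
    if (cache.has(text)) {
      const hit = cache.get(text);
      // Re-insert so this entry becomes the most recently used
      cache.delete(text);
      cache.set(text, hit);
      return hit;
    }
    const result = filter.check(text);
    cache.set(text, result);
    if (cache.size > maxEntries) {
      // Evict the least recently used entry (first key in iteration order)
      cache.delete(cache.keys().next().value);
    }
    return result;
  };
};

// Stand-in filter that counts how often check() really runs
let realChecks = 0;
const countingFilter = {
  check: (text) => {
    realChecks += 1;
    return text.includes('kötü');
  }
};

const cachedCheck = createCachedCheck(countingFilter, 2);
cachedCheck('Merhaba');
cachedCheck('Merhaba'); // served from the cache
console.log(realChecks); // 1
```

Cached results become stale when the word list changes, so a new wrapper should be created after calling `addWords` or `removeWords`.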
## Contributing
Contributions are welcome! Here's how you can help:
1. **Fork the repository**
2. **Create a feature branch**: `git checkout -b feature/amazing-feature`
3. **Make your changes**
4. **Run tests**: `npm test`
5. **Commit your changes**: `git commit -m 'Add amazing feature'`
6. **Push to your branch**: `git push origin feature/amazing-feature`
7. **Open a Pull Request**
### Contribution Guidelines
- Ensure all tests pass before submitting a PR
- Add tests for new features
- Follow the existing code style
- Update documentation for any changes
- Keep pull requests focused on a single feature/fix
### Word List Contributions
When contributing to the word list:
- Submit additions/removals as separate PRs
- Include reasoning for additions/removals
- Be mindful of cultural context and sensitivity
## Development
```bash
# Clone the repository
git clone https://github.com/derdogant/turkish-profanity-filter.git
cd turkish-profanity-filter
# Install dependencies
npm install
# Run tests
npm test
# Run the debug script
npm run debug
```
## License
MIT