peertube-plugin-german-ai-mod
Version:
Local AI moderation for German comments using ml6team DistilBERT and deepset BERT models
120 lines (90 loc) • 4.17 kB
Markdown
# PeerTube Plugin: German AI Moderation
AI-powered comment moderation plugin for PeerTube using two-stage German toxicity detection with ml6team DistilBERT and deepset BERT models.
## Features
- **Dual-Model System**:
- Primary: `ml6team/distilbert-base-german-cased-toxic-comments` - Fast general toxicity detection (67M, optimized for comments)
- Secondary: `deepset/bert-base-german-cased-hatespeech-GermEval18Coarse` - Specialized hate speech detection
- **German Language Detection**: Automatically detects comment language using `langdetect`
- **Two-Level Filtering**:
- General toxicity threshold (configurable, default: 0.7)
- Hate speech threshold (configurable, default: 0.5)
- **Non-German Comments**: Configurable action for comments in other languages (moderation queue or block)
- **Fail-Open Strategy**: Comments are allowed if AI service is unavailable (prevents site breakage)
## Requirements
- PeerTube >= 7.0.0
- AI moderation service (FastAPI) running in Docker network
- Docker container with AI service accessible at `http://ai-moderator:8000/analyze`
## Installation
1. Install via PeerTube admin panel:
- Go to **Administration → Plugins/Themes**
- Click **Install a plugin/the theme**
- Enter package name: `peertube-plugin-german-ai-mod`
- Or upload ZIP file
2. Or install via npm:
```bash
npm install peertube-plugin-german-ai-mod
```
## Configuration
After installation, configure the plugin in PeerTube admin panel:
- **AI Endpoint URL**: URL of the AI moderation service (default: `http://ai-moderator:8000/analyze`)
- **Toxizitäts-Schwelle (0–1)**: Threshold for general toxicity detection (default: `0.7`)
- **Hate Speech Schwelle (0–1)**: Threshold for hate speech detection (default: `0.5`)
- **Aktion für nicht-deutsche Kommentare**:
- `Zur Moderation zurückhalten` - Send to moderation queue (default)
- `Blockieren` - Block immediately
### Recommended Thresholds
- **Moderate (default)**: Toxicity 0.7, Hate Speech 0.5 - Blocks only explicit insults
- **Strict**: Toxicity 0.5, Hate Speech 0.4 - Blocks more aggressive comments
- **Soft**: Toxicity 0.8, Hate Speech 0.6 - Blocks only very explicit insults
## AI Service Setup
This plugin requires a separate AI moderation service. See the `ai-moderation` directory for FastAPI service setup with Docker.
The AI service should provide a POST endpoint `/analyze` that accepts:
```json
{
"text": "comment text here"
}
```
And returns:
```json
{
"toxic": false,
"score": 0.15,
"language": "de",
"is_german": true,
"requires_moderation": false,
"hate_score": 0.05
}
```
Where:
- `toxic`: boolean - whether comment should be blocked (based on thresholds)
- `score`: float (0-1) - general toxicity score from primary model
- `hate_score`: float (0-1) - hate speech score from secondary model
- `language`: string - detected language code
- `is_german`: boolean - whether comment is in German
- `requires_moderation`: boolean - whether comment requires manual review
## How It Works
1. When a comment is created, the plugin sends it to the AI service
2. AI service detects the language using `langdetect`
3. For non-German comments: sends to moderation or blocks (based on settings)
4. For German comments:
- Primary model checks general toxicity
- Secondary model checks for hate speech
- Comment is blocked if `score >= toxicity_threshold` OR `hate_score >= hate_threshold`
5. Toxic/hate speech comments are automatically blocked
## Hooks Used
- `filter:api.video-thread.create.accept.result` - Main comments
- `filter:api.video-comment-reply.create.accept.result` - Replies
- `filter:activity-pub.remote-video-comment.create.accept.result` - Federated comments
## Models Used
- **Primary**: `ml6team/distilbert-base-german-cased-toxic-comments`
- Lightweight (67M parameters)
- Trained on 5 German toxicity datasets
- Optimized for comment toxicity detection
- **Secondary**: `deepset/bert-base-german-cased-hatespeech-GermEval18Coarse`
- Specialized for hate speech detection
- Trained on GermEval18 dataset
- Coarse-grained classification
## License
AGPL-3.0
## Author
yarkolife