UNPKG

content-guard

Version:

🛡️ Advanced content analysis and moderation system with multi-variant optimization. Features context-aware detection, harassment prevention, and ML-powered toxicity analysis. Pre-1.0 development version.

716 lines (629 loc) 27.9 kB
/** * 🕵️ ContentGuard v3.0 - Subtle Harassment Detection Plugin * * Specialized plugin for detecting sophisticated harassment that doesn't rely * on obvious profanity or threats. Focuses on condescending language, * gaslighting, coded harassment, and power dynamics. */ /** * Condescending and dismissive language patterns */ const CONDESCENDING_PATTERNS = [ // Direct dismissal (Enhanced from benchmark failures) /i'?m not surprised you'?re (struggling|having trouble|confused)/i, /if you actually (knew|understood|had)/i, /maybe you should (stick to|try|consider)/i, /it'?s cute that you (think|believe)/i, /perhaps it would be better if (you|someone)/i, /clearly (lack|don'?t understand|have no)/i, /you obviously (don'?t|can'?t|aren'?t)/i, /someone (more qualified|who actually knows)/i, /leave (this|it) to (the professionals|someone)/i, /that'?s exactly (what|the kind of)/i, // NEW: Capability questioning from benchmark /this kind of work (typically|usually) requires people who/i, /requires someone with more (experience|expertise)/i, /it'?s becoming clear that you lack/i, /i think it would be better if we assigned this to someone/i, /your performance.*shows you'?re not quite ready/i, /perhaps someone from the senior team should handle/i, /since it'?s (clearly|obviously) beyond your capabilities/i, /you might want to stick to simpler tasks until you develop/i, /beyond your capabilities/i, /more qualified.*should handle/i, /assigned.*to someone/i, // NEW: Identity-based dismissal from benchmark /you'?re not (exactly|really) what we had in mind/i, /people like you (usually|typically|often) struggle with/i, /this (industry|field|role) really isn'?t designed for someone/i, /you'?d probably be more comfortable in a role/i, /our (company )?culture tends to work better with people who/i, /you might want to consider whether you'?re really cut out/i, /this field requires a certain type of person/i, /given your obvious limitations/i, /not exactly what we had in mind/i, // MISSING FROM BENCHMARK: Key workplace condescension patterns /maybe someone else should take the lead on this since it's pretty technical/i, /you probably don't understand the nuances of this industry like the rest of us do/i, /are you sure you can handle this level of responsibility given your experience/i, /this might be too much for you - do you want me to assign it to someone else/i, /you look tired - is everything okay at home affecting your work/i, /given your experience level, i think we should have someone else review your work/i, /you might want to consider whether you're really cut out for this type of role/i, /you seem stressed lately - are you having trouble keeping up with the workload/i, /maybe someone else should take the lead.*since it's.*technical/i, /you probably don't understand.*nuances.*industry/i, /handle this level of responsibility.*given your experience/i, /might be too much for you.*assign.*someone else/i, /look tired.*everything okay at home.*affecting.*work/i, /experience level.*someone else review.*work/i, /consider whether you're really cut out.*type of role/i, /seem stressed.*trouble keeping up.*workload/i, // Intellectual superiority /you might want to educate yourself/i, /requires a certain level of (expertise|knowledge)/i, /this (requires|needs) (someone|a person) who/i, /people at your level/i, /beyond your (capabilities|understanding)/i, /not exactly what we had in mind/i, /different type of person/i, // Patronizing language /i'?m not surprised you'?re confused/i, /let me explain this (simply|in terms you)/i, /you seem to (struggle|have trouble) with/i, /this might be (too advanced|difficult) for/i, /bless your heart/i, /how (adorable|sweet|precious)/i, // NEW: Professional boundary violations /any competent (person|engineer|professional) would have/i, /this is (exactly )?why i never trust (junior|new)/i, /i can'?t believe you'?re still employed here/i, /this makes me question your basic qualifications/i, /i'?m embarrassed for you that you submitted/i, /this is the worst (proposal|work|analysis) i'?ve seen/i, /still employed here/i, /question.*basic qualifications/i, /embarrassed for you/i, // NEW: Condescending management /maybe you should consider whether this role is really/i, /it'?s becoming clear that you lack the (critical thinking|skills)/i, /who actually understands the requirements/i, /lack.*critical thinking/i, /actually understands/i, // NEW: Missing subtle performance harassment patterns /your performance.*shows you'?re not quite ready for/i, /not quite ready for (responsibilities|tasks|this level)/i, /maybe someone else should (take the lead|handle this)/i, /you probably don'?t understand the nuances/i, /this might be (too much|overwhelming|too complex) for you/i, /are you sure you can handle this level of responsibility/i, /given your (experience level|background|limitations)/i, /perhaps.*someone.*more (qualified|experienced)/i, // NEW: Emotional invalidation and gaslighting that's still missing /your emotional response is making it difficult/i, /you'?re being way too (sensitive|emotional|dramatic)/i, /maybe you should take a step back and think/i, /you'?re making this about something it'?s not/i, /that'?s not how normal people (react|respond|handle)/i, /you always (react this way|do this|make it about)/i, // BENCHMARK MISSING: Coded identity harassment patterns /you bring such a unique perspective to our traditional workplace environment/i, /it's great to see diversity working out so well in our technical department/i, /you're so well-spoken and articulate - much better than i expected initially/i, /i'm impressed by how professional you've become since starting here/i, /you fit in surprisingly well with our established company culture and values/i, /your work ethic is quite impressive for someone with your background/i, /you've really adapted well to our professional standards and expectations/i, /it's refreshing to work with someone who defies the usual stereotypes/i, /bring such a.*unique perspective.*traditional workplace/i, /diversity working out so well.*technical department/i, /well-spoken.*articulate.*better than.*expected/i, /impressed.*professional.*become since starting/i, /fit in surprisingly well.*company culture/i, /work ethic.*impressive.*someone with your background/i, /adapted well.*professional standards/i, /refreshing.*defies.*usual stereotypes/i ] /** * Gaslighting and manipulation patterns */ const GASLIGHTING_PATTERNS = [ // Denying reality - ENHANCED FROM BENCHMARK ANALYSIS /you'?re (clearly|obviously) (overreacting|being dramatic)/i, /i never said that.*you (must be|are) (imagining|misremembering)/i, /(everyone else|no one else)( on the team| in the group)? (understands|has|agrees with|gets) (this|any|it|the)?( problem| issue| concept| point)?( perfectly fine| without issue)?/i, /you'?re being (way )?too sensitive/i, /you'?re (projecting|making this up)/i, /that'?s not what (happened|i said|i meant)/i, // MISSING CRITICAL BENCHMARK PATTERNS: Direct gaslighting /you're clearly overreacting to what was obviously just constructive feedback/i, /i never said that - you must be misremembering our conversation/i, /everyone else on the team understands this perfectly fine, so the problem must be with you/i, /you're being way too sensitive about this - no one else has any issues/i, /you're projecting your own insecurities onto this situation/i, /maybe you should take a step back and consider whether you're being rational/i, /your emotional response is making it difficult to have a productive conversation/i, /that's not what happened and you know it - you're distorting the facts/i, /clearly overreacting.*obviously.*constructive feedback/i, /never said that.*misremembering our conversation/i, /everyone else.*understands.*perfectly.*problem.*with you/i, /being way too sensitive.*no one else.*issues/i, /projecting.*insecurities.*situation/i, /step back.*consider.*being rational/i, /emotional response.*difficult.*productive conversation/i, /not what happened.*know it.*distorting.*facts/i, // Emotional invalidation /your emotional response (is|makes it)/i, /you need to (calm down|relax|chill)/i, /you'?re being (irrational|unreasonable|hysterical)/i, /maybe you should (take a step back|think about)/i, /you'?re making this (about|into)/i, // Reality distortion /you'?re reading too much into/i, /that'?s not how (normal|reasonable) people/i, /you have a habit of/i, /you always (do this|react this way)/i, /this is why (no one|people don'?t)/i ] /** * Coded threats and veiled intimidation */ const VEILED_THREAT_PATTERNS = [ // Consequences and implications - ENHANCED FROM BENCHMARK ANALYSIS /it would be (unfortunate|a shame) if/i, /i wonder what (people|your|others) would (think|say)/i, /accidents happen (all the time )?to people who/i, /you seem like the type who wouldn'?t want/i, /interesting how easy it is to find/i, /i hope nothing bad happens to/i, /people who cause problems.*find that problems/i, // MISSING CRITICAL BENCHMARK PATTERNS: Veiled threats /it would be unfortunate if certain information about your performance became widely known/i, /i wonder what your colleagues would think if they knew about your situation at home/i, /accidents happen all the time to people who don't know when to keep quiet/i, /you seem like the type who wouldn't want their personal information floating around/i, /it's interesting how easy it is to find someone's address and family details these days/i, /i hope nothing bad happens to disrupt your comfortable little life and career/i, /people who cause problems for others sometimes find that problems come back to them/i, /it would be a shame if your reputation in this industry were to be permanently damaged/i, /unfortunate.*certain information.*performance.*widely known/i, /wonder what.*colleagues.*think.*knew.*situation at home/i, /accidents happen.*people.*don't know.*keep quiet/i, /type who wouldn't want.*personal information.*floating around/i, /easy.*find.*address.*family details/i, /hope nothing bad happens.*disrupt.*comfortable.*life.*career/i, /cause problems.*others.*problems come back/i, /shame.*reputation.*industry.*permanently damaged/i, // Information threats /certain information about you/i, /your personal information.*floating around/i, /if they knew about your.*situation/i, /what would happen if.*became public/i, /some things are better kept (private|quiet)/i, // Workplace intimidation /might not be the right (fit|match) for/i, /this position.*different kind of/i, /your (position|future) here (depends|relies)/i, /others have been (let go|terminated) for less/i, /you'?d hate to (lose|jeopardize)/i ] /** * Identity-based coded attacks */ const IDENTITY_ATTACK_PATTERNS = [ // Coded discrimination /people like you (always|usually|tend to)/i, /this (field|industry|role) (really )?isn'?t (meant|designed) for/i, /you'?re not (exactly|really) what we (had in mind|were looking for)/i, /better suited to your (background|experience)/i, /fits our (company )?culture/i, /certain types of people/i, // NEW: Missing coded discrimination patterns from benchmark /you'?re surprisingly articulate for someone with your background/i, /i'?m impressed by how well you'?ve adapted to our professional environment/i, /you'?re different from what we usually see in this type of role/i, /it'?s refreshing to work with someone who'?s not like the typical people/i, /you'?ve exceeded our expectations.*which were honestly pretty low/i, /for someone like you.*you'?ve really managed to fit in/i, /you'?re not bad at this job.*which is surprising given your/i, /i have to admit.*you'?re more competent than i thought/i, /surprisingly articulate/i, /exceeded our expectations.*pretty low/i, /more competent than i thought/i, /adapted to our professional environment/i, /different from what we usually see/i, /not like the typical people in this field/i, // Othering language /you people (always|never|tend to)/i, /where you come from/i, /your kind of/i, /not from around here/i, /the way your people/i, // Competence questioning /surprised you (got this far|made it)/i, /how did you even (get|end up)/i, /must have been (diversity|quota)/i, /affirmative action/i, /token/i ] /** * Power dynamic exploitation patterns */ const POWER_DYNAMIC_PATTERNS = [ // Authority abuse (Enhanced for flexibility) /because i said so/i, /i don'?t have to explain/i, /you'?re in no position to/i, /know your place/i, /remember who (you'?re talking to|signs your)/i, /i can make (your life|things).*(difficult|hell)/i, // More flexible matching /make your life.*(difficult|hell|miserable)/i, // Additional pattern // MISSING CRITICAL BENCHMARK PATTERNS: Core power abuse /remember who signs your paycheck before you decide to push back on my decisions/i, /you work for me, so i suggest you adjust your attitude and follow my instructions/i, /i can make your life very difficult here if you continue to question my authority/i, /you're easily replaceable, and there are plenty of people who would love this job/i, /good luck finding another job in this industry after i'm done with you/i, /i own this department and everyone in it, including you/i, /you serve at my pleasure, and that pleasure is rapidly diminishing/i, /don't forget who's in charge here and what that means for your future/i, /remember who signs your paycheck.*push back.*decisions/i, /work for me.*adjust your attitude.*follow.*instructions/i, /make your life.*difficult.*continue.*question.*authority/i, /easily replaceable.*plenty of people.*love this job/i, /good luck finding.*job.*industry.*done with you/i, /own this department.*everyone.*including you/i, /serve at my pleasure.*rapidly diminishing/i, /don't forget.*in charge.*means.*future/i, // Hierarchical intimidation (Enhanced) /don'?t forget who'?s in charge/i, /you work for me/i, /i own (you|this department|everyone)/i, /you serve at (my|our) pleasure/i, /easily replaced/i, /(you'?re )?dime a dozen/i, // Economic threats (Enhanced) /good luck finding another/i, /no one else will hire/i, /i'?ll make sure you never/i, /blacklist/i, /reputation in this industry/i, // NEW: Missing patterns from benchmark failures /i can make.*(difficult|hell|miserable)/i, /you'?re easily replaceable/i, /be careful about how you speak to me/i, /watch your tone/i, /question my authority/i, /job security depends on/i, /keeping me happy/i, /insubordinate behavior/i, // Additional authority exploitation from benchmark analysis /serve at (my|our) pleasure/i, /that pleasure is rapidly diminishing/i, /good luck finding another (job|position) in this industry/i, /i'll make sure you never (work|get hired)/i, /your job security depends on keeping me happy/i, /insubordinate behavior/i, /remember who signs your paycheck/i, /adjust your attitude/i, /i own this department and everyone in it/i, /make your life.*difficult here if you continue/i ] /** * Social exclusion and isolation tactics */ const SOCIAL_EXCLUSION_PATTERNS = [ // Group dynamics (Enhanced for benchmark failures) /no one (here|else) (likes|wants|agrees with)/i, /(everyone|we all) (thinks|agrees|knows) you/i, /you don'?t belong here/i, /(maybe you should|you might want to|perhaps you should) find (somewhere|someone|something)( else| better)?/i, /not welcome here/i, /you\'?re not (really |truly |actually )?one of us/i, // MISSING CRITICAL BENCHMARK PATTERNS: Team exclusion /nobody on the team really wants to work with you on important projects/i, /everyone's been talking about your attitude problems behind your back/i, /you don't really fit in with our group dynamic here/i, /the team doesn't trust you with sensitive or important tasks/i, /you're not really one of us, and it shows in how you approach work/i, /colleagues have been complaining about having to collaborate with you/i, /people are starting to notice your behavior and it's affecting team morale/i, /you're making everyone uncomfortable with your approach to things/i, /nobody.*team.*wants to work.*important projects/i, /everyone.*talking.*attitude problems.*behind your back/i, /don't really fit.*group dynamic/i, /team doesn't trust.*sensitive.*important tasks/i, /not really one of us.*shows.*approach work/i, /colleagues.*complaining.*collaborate with you/i, /people.*starting to notice.*behavior.*team morale/i, /making everyone uncomfortable.*approach/i, // Professional isolation (Enhanced for benchmark failures) /no one wants to work with/i, /team doesn'?t (trust|respect)/i, /colleagues have been (talking|complaining)/i, /might want to consider (transferring|leaving)/i, /not a team player/i, /doesn'?t fit in/i, // NEW: Missing patterns from benchmark failures /nobody on the team.*wants to work with you/i, /everyone'?s been talking about.*(behind your back|attitude problems)/i, /you don'?t really fit in with.*group dynamic/i, /team.*doesn'?t want/i, /nobody.*wants to work with/i, /everyone.*talking.*behind/i, /attitude problems/i, /group dynamic.*team culture/i, /really fit in/i, /talking.*behind your back/i, /team culture here/i, // Team exclusion /nobody.*wants to work with you/i, /everyone.*talking.*behind your back/i, /you don'?t.*fit in with.*group/i, /maybe you should.*transfer.*different department/i, /team doesn'?t trust you/i, /you'?re not really one of us/i, /colleagues.*complaining.*about.*collaborating with you/i, /find somewhere.*your personality.*better fit/i, // NEW: Missing social exclusion patterns from benchmark /nobody on the team really wants to work with you/i, /everyone's been talking about your attitude problems/i, /you don't really fit in with our group dynamic/i, /maybe you should consider transferring to a different department/i, /the team doesn't trust you with important tasks/i, /colleagues have been complaining about having to collaborate/i, /you might want to find somewhere else where your personality/i, /not really one of us.*shows in how you approach/i, // Workplace ostracism /no one.*wants.*work.*with you/i, /people.*avoiding.*working.*with/i, /excluded.*from.*meetings/i, /left out.*discussions/i, /not invited.*team/i, /better.*somewhere else/i, /doesn'?t.*belong.*here/i ] /** * Microaggression patterns */ const MICROAGGRESSION_PATTERNS = [ // Backhanded compliments /you'?re (pretty|really) (smart|articulate) for/i, /surprisingly (good|competent|capable)/i, /better than (i|we) expected/i, /not bad for (a|someone)/i, /you'?re (different|not like)/i, // Competence assumptions /are you sure you (can|should)/i, /maybe someone else should (handle|do)/i, /this might be (too much|overwhelming)/i, /you probably don'?t (understand|know)/i, /let me (help|show) you/i, // Appearance/personal comments /you look (tired|stressed|overwhelmed)/i, /having a bad (day|week)/i, /you seem (upset|emotional|moody)/i, /time of the month/i, /is everything (okay|alright) at home/i ] class HarassmentPlugin { constructor() { this.initialized = false this.options = {} } async init(config = {}) { this.options = { // Scoring weights for different harassment types condescendingWeight: 8, gaslightingWeight: 10, veiledThreatWeight: 12, identityAttackWeight: 10, powerDynamicWeight: 11, socialExclusionWeight: 9, microaggressionWeight: 6, // Thresholds minimumConfidence: 0.3, contextAware: true, // Detection sensitivity requireMultipleIndicators: false, escalationBonus: 1.5, // Bonus for multiple harassment types ...config } this.initialized = true } async analyze(content, input, globalOptions = {}) { if (!this.initialized) { await this.init() } let score = 0 const flags = [] const detectedTypes = [] const evidence = [] // Analyze for different harassment patterns const results = [ this.analyzeCondescending(content), this.analyzeGaslighting(content), this.analyzeVeiledThreats(content), this.analyzeIdentityAttacks(content), this.analyzePowerDynamics(content), this.analyzeSocialExclusion(content), this.analyzeMicroaggressions(content) ] // Combine results results.forEach(result => { if (result.score > 0) { score += result.score flags.push(...result.flags) detectedTypes.push(result.type) evidence.push(...result.evidence) } }) // Apply escalation bonus for multiple harassment types if (detectedTypes.length >= 2) { const bonus = Math.floor(score * (this.options.escalationBonus - 1)) score += bonus flags.push(`Multiple harassment patterns detected (${detectedTypes.length} types) +${bonus} points`) } // Context-aware adjustments if (this.options.contextAware && content.context?.isProfessional) { // Workplace harassment is often more serious if (detectedTypes.includes('power_dynamics') || detectedTypes.includes('identity_attacks')) { const contextBonus = Math.floor(score * 0.3) score += contextBonus flags.push(`Professional context harassment detected +${contextBonus} points`) } } // Pattern analysis for sophisticated detection const sophisticationAnalysis = this.analyzeSophistication(content, evidence) if (sophisticationAnalysis.isSubtle) { score += sophisticationAnalysis.bonus flags.push(`Sophisticated harassment pattern detected +${sophisticationAnalysis.bonus} points`) } if (globalOptions.debug) { console.log('🕵️ HARASSMENT: Detected types:', detectedTypes) console.log('🕵️ HARASSMENT: Evidence count:', evidence.length) console.log('🕵️ HARASSMENT: Final score:', score) } return { score: Math.round(score), flags, detectedTypes, evidence: globalOptions.debug ? evidence : undefined, confidence: this.calculateConfidence(detectedTypes, evidence) } } analyzeCondescending(content) { const matches = this.findPatternMatches(content.allText, CONDESCENDING_PATTERNS) const score = matches.length * this.options.condescendingWeight return { type: 'condescending', score, flags: matches.map(match => `Condescending language: "${match}"`), evidence: matches } } analyzeGaslighting(content) { const matches = this.findPatternMatches(content.allText, GASLIGHTING_PATTERNS) const score = matches.length * this.options.gaslightingWeight return { type: 'gaslighting', score, flags: matches.map(match => `Gaslighting pattern: "${match}"`), evidence: matches } } analyzeVeiledThreats(content) { const matches = this.findPatternMatches(content.allText, VEILED_THREAT_PATTERNS) const score = matches.length * this.options.veiledThreatWeight return { type: 'veiled_threats', score, flags: matches.map(match => `Veiled threat: "${match}"`), evidence: matches } } analyzeIdentityAttacks(content) { const matches = this.findPatternMatches(content.allText, IDENTITY_ATTACK_PATTERNS) const score = matches.length * this.options.identityAttackWeight return { type: 'identity_attacks', score, flags: matches.map(match => `Identity-based attack: "${match}"`), evidence: matches } } analyzePowerDynamics(content) { const matches = this.findPatternMatches(content.allText, POWER_DYNAMIC_PATTERNS) const score = matches.length * this.options.powerDynamicWeight return { type: 'power_dynamics', score, flags: matches.map(match => `Power dynamic abuse: "${match}"`), evidence: matches } } analyzeSocialExclusion(content) { const matches = this.findPatternMatches(content.allText, SOCIAL_EXCLUSION_PATTERNS) const score = matches.length * this.options.socialExclusionWeight return { type: 'social_exclusion', score, flags: matches.map(match => `Social exclusion: "${match}"`), evidence: matches } } analyzeMicroaggressions(content) { const matches = this.findPatternMatches(content.allText, MICROAGGRESSION_PATTERNS) const score = matches.length * this.options.microaggressionWeight return { type: 'microaggressions', score, flags: matches.map(match => `Microaggression: "${match}"`), evidence: matches } } findPatternMatches(text, patterns) { const matches = [] patterns.forEach(pattern => { const match = text.match(pattern) if (match) { matches.push(this.cleanMatch(match[0])) } }) return matches } cleanMatch(match) { // Clean up match for display, limit length return match.length > 50 ? match.substring(0, 47) + '...' : match } analyzeSophistication(content, evidence) { // Check for sophisticated harassment patterns const text = content.allTextLower // Look for professional language mixed with harassment const professionalWords = ['colleague', 'professional', 'workplace', 'team', 'project', 'feedback'] const hasProfessionalLanguage = professionalWords.some(word => text.includes(word)) // Look for indirect language const indirectPhrases = ['it might be', 'perhaps', 'maybe you should consider', 'it would be wise'] const hasIndirectLanguage = indirectPhrases.some(phrase => text.includes(phrase)) // Look for false politeness const politenessPhrases = ['with all due respect', 'i mean no offense', 'just trying to help'] const hasFalsePoliteness = politenessPhrases.some(phrase => text.includes(phrase)) const sophisticationScore = (hasProfessionalLanguage ? 1 : 0) + (hasIndirectLanguage ? 1 : 0) + (hasFalsePoliteness ? 1 : 0) + (evidence.length >= 2 ? 1 : 0) return { isSubtle: sophisticationScore >= 2, bonus: sophisticationScore * 2, indicators: { hasProfessionalLanguage, hasIndirectLanguage, hasFalsePoliteness, multiplePatterns: evidence.length >= 2 } } } calculateConfidence(detectedTypes, evidence) { if (detectedTypes.length === 0) return 'No harassment detected' const typeCount = detectedTypes.length const evidenceCount = evidence.length // Calculate confidence based on multiple factors let confidence = 0 // Multiple harassment types increase confidence confidence += Math.min(typeCount * 0.3, 0.8) // Multiple pieces of evidence increase confidence confidence += Math.min(evidenceCount * 0.2, 0.6) // Sophisticated patterns increase confidence if (typeCount >= 2 && evidenceCount >= 3) { confidence += 0.2 } confidence = Math.min(1.0, confidence) if (confidence >= 0.8) return 'Very high confidence' if (confidence >= 0.6) return 'High confidence' if (confidence >= 0.4) return 'Moderate confidence' if (confidence >= 0.2) return 'Low confidence' return 'Very low confidence' } } module.exports = HarassmentPlugin