---
name: content-moderation-api
description: Content moderation API integration using OpenAI Moderation, Perspective API, and others
allowed-tools:
  - Read
  - Write
  - Edit
  - Bash
  - Glob
  - Grep
---

# Content Moderation API Skill

## Capabilities

- Integrate OpenAI Moderation API
- Set up Perspective API for toxicity detection
- Configure moderation thresholds
- Implement content filtering pipelines
- Design moderation response handling
- Create moderation logging and reporting

## Target Processes

- content-moderation-safety
- system-prompt-guardrails

## Implementation Details

### Moderation APIs

1. **OpenAI Moderation**: Hate, violence, self-harm, sexual content
2. **Perspective API**: Toxicity, insult, profanity, threat
3. **Azure Content Safety**: Text and image moderation
4. **LlamaGuard**: Open-source safety classifier

### Configuration Options

- API credentials and endpoints
- Category thresholds
- Action policies (block, warn, flag)
- Logging configuration
- Fallback behavior

### Best Practices

- Set thresholds per category rather than one global cutoff
- Handle edge cases gracefully (empty input, API errors, timeouts)
- Log every moderation decision
- Review thresholds regularly
- Layer multiple moderation checks instead of relying on a single API

### Dependencies

- openai
- google-api-python-client (Perspective)
- azure-ai-contentsafety
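
The configuration options above (category thresholds, action policies, logging, fallback behavior) could be wired together roughly as follows. This is a minimal sketch, not a definitive implementation: `Action`, `CategoryPolicy`, `ModerationConfig`, `Decision`, `decide`, and `moderate` are hypothetical names, the severity ordering of actions is an assumption, and `score_fn` stands in for whichever provider call returns per-category scores in the range 0 to 1.

```python
import logging
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, Optional

logger = logging.getLogger("moderation")


class Action(Enum):
    ALLOW = "allow"
    FLAG = "flag"
    WARN = "warn"
    BLOCK = "block"


# Assumed severity order, used to pick the strictest triggered action.
_SEVERITY = {Action.ALLOW: 0, Action.FLAG: 1, Action.WARN: 2, Action.BLOCK: 3}


@dataclass
class CategoryPolicy:
    threshold: float  # a score at or above this value triggers the action
    action: Action


@dataclass
class ModerationConfig:
    policies: Dict[str, CategoryPolicy]
    # Applied when the provider call fails (hypothetical default).
    fallback_action: Action = Action.FLAG


@dataclass
class Decision:
    action: Action
    triggered: Dict[str, float] = field(default_factory=dict)
    error: Optional[str] = None


def decide(scores: Dict[str, float], config: ModerationConfig) -> Decision:
    """Map per-category scores to the strictest action whose threshold is met."""
    action = Action.ALLOW
    triggered: Dict[str, float] = {}
    for category, score in scores.items():
        policy = config.policies.get(category)
        if policy is None or score < policy.threshold:
            continue  # unconfigured category or score below threshold
        triggered[category] = score
        if _SEVERITY[policy.action] > _SEVERITY[action]:
            action = policy.action
    return Decision(action, triggered)


def moderate(
    text: str,
    score_fn: Callable[[str], Dict[str, float]],
    config: ModerationConfig,
) -> Decision:
    """Score the text, apply the policy, and log the decision."""
    try:
        scores = score_fn(text)
    except Exception as exc:  # provider outage, timeout, bad credentials, ...
        decision = Decision(config.fallback_action, error=str(exc))
    else:
        decision = decide(scores, config)
    logger.info(
        "moderation action=%s triggered=%s error=%s",
        decision.action.value, decision.triggered, decision.error,
    )
    return decision
```

In practice `score_fn` would wrap one of the APIs listed above and normalize its category names to the keys used in `policies`. Keeping the policy logic separate from the provider call makes it possible to run several providers through the same `decide` step and to test thresholds without network access.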