---
name: ai-ethics
description: Responsible AI development and ethical considerations. Use when evaluating
  AI bias, implementing fairness measures, conducting ethical assessments, or ensuring
  AI systems align with human values.
author: Joseph OBrien
status: unpublished
updated: '2025-12-23'
version: 1.0.1
tag: skill
type: skill
---

# AI Ethics

Comprehensive AI ethics skill covering bias detection, fairness assessment, responsible AI development, and regulatory compliance.

## When to Use This Skill

- Evaluating AI models for bias
- Implementing fairness measures
- Conducting ethical impact assessments
- Ensuring regulatory compliance (EU AI Act, etc.)
- Designing human-in-the-loop systems
- Creating AI transparency documentation
- Developing AI governance frameworks

## Ethical Principles

### Core AI Ethics Principles

| Principle | Description |
|-----------|-------------|
| **Fairness** | AI should not discriminate against individuals or groups |
| **Transparency** | AI decisions should be explainable |
| **Privacy** | Personal data must be protected |
| **Accountability** | Clear responsibility for AI outcomes |
| **Safety** | AI should not cause harm |
| **Human Agency** | Humans should maintain control |

### Stakeholder Considerations

- **Users**: How does this affect people using the system?
- **Subjects**: How does this affect people the AI makes decisions about?
- **Society**: What are broader societal implications?
- **Environment**: What is the environmental impact?

## Bias Detection & Mitigation

### Types of AI Bias

| Bias Type | Source | Example |
|-----------|--------|---------|
| Historical | Training data reflects past discrimination | Hiring models favoring male candidates |
| Representation | Underrepresented groups in training data | Face recognition failing on darker skin |
| Measurement | Features or labels that are imperfect proxies for the concept of interest, or that stand in for protected attributes | ZIP code correlating with race |
| Aggregation | One model for diverse populations | A single clinical risk model applied to groups whose biomarkers relate to outcomes differently |
| Evaluation | Benchmarks or metrics that do not reflect the affected populations | Aggregate accuracy hiding disparate impact |

### Fairness Metrics

**Group Fairness:**

- Demographic Parity: Equal positive prediction rates across groups
- Equalized Odds: Equal true positive rate (TPR) and false positive rate (FPR) across groups
- Predictive Parity: Equal precision across groups

**Individual Fairness:**

- Similar individuals should receive similar predictions
- Counterfactual fairness: Would the outcome change if only the protected attribute differed?
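
The group metrics above reduce to per-group confusion-matrix rates. The following is a minimal sketch in plain Python, assuming binary labels and predictions (0/1) and a parallel list of group labels. The names `group_rates` and `max_gap` and the toy data are illustrative and not taken from any library.

```python
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, TPR, FPR, and precision for binary labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for yt, yp, g in zip(y_true, y_pred, groups):
        # "t"/"f": prediction correct or not; "p"/"n": predicted positive or negative
        key = ("t" if yt == yp else "f") + ("p" if yp == 1 else "n")
        counts[g][key] += 1

    def ratio(num, den):
        # NaN marks a rate that is undefined for this group (empty denominator)
        return num / den if den else float("nan")

    rates = {}
    for g, c in counts.items():
        total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
        rates[g] = {
            "selection_rate": ratio(c["tp"] + c["fp"], total),  # demographic parity
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),           # equalized odds
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),           # equalized odds
            "precision": ratio(c["tp"], c["tp"] + c["fp"]),     # predictive parity
        }
    return rates

def max_gap(rates, metric):
    """Largest between-group difference for one metric (0 means parity)."""
    values = [r[metric] for r in rates.values()]
    return max(values) - min(values)

# Toy data, made up for illustration
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
for metric in ("selection_rate", "tpr", "fpr", "precision"):
    print(metric, round(max_gap(rates, metric), 3))
```

In this toy data group `a` is selected at a rate of 0.75 and group `b` at 0.25, so the demographic parity gap is 0.5. When base rates differ between groups, these criteria generally cannot all be satisfied at once, so pick the one that matches the harm being prevented.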

### Bias Mitigation Strategies

**Pre-processing:**

- Resampling/reweighting training data
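- Removing or transforming features that encode bias, including proxies for protected attributes (dropping the protected attribute alone is rarely enough)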
- Data augmentation for underrepresented groups
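
As one possible form of reweighting, the sketch below assigns each training example the weight `P(group) * P(label) / P(group, label)`, a common scheme often called reweighing. It assumes categorical groups and labels. The function name and the toy data are hypothetical.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Weights that make group and label statistically independent when applied."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    # P(g) * P(y) / P(g, y), written with raw counts
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Toy data: group "a" has mostly positive labels, group "b" mostly negative
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)

def weighted_positive_rate(group):
    rows = [(w, y) for w, y, g in zip(weights, labels, groups) if g == group]
    return sum(w for w, y in rows if y == 1) / sum(w for w, _ in rows)

print(weighted_positive_rate("a"), weighted_positive_rate("b"))
```

After weighting, both groups have the same weighted positive rate. The weights can typically be passed to a trainer that accepts per-sample weights.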

**In-processing:**

- Fairness constraints in loss function
- Adversarial debiasing
- Fair representation learning

**Post-processing:**

- Threshold adjustment per group
- Calibration
- Reject option classification
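
Threshold adjustment per group can be sketched as follows. This is a simplified illustration that targets an equal selection rate per group; it assumes higher scores mean "more likely positive" and ignores tied scores. The function names are hypothetical.

```python
import math

def thresholds_for_target_rate(scores, groups, target_rate):
    """Pick one score threshold per group so each selects about target_rate."""
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    thresholds = {}
    for g, vals in by_group.items():
        vals.sort(reverse=True)
        k = round(target_rate * len(vals))  # number of items to select in this group
        # Threshold is the k-th highest score; infinity selects nothing
        thresholds[g] = vals[k - 1] if k > 0 else math.inf
    return thresholds

def predict(scores, groups, thresholds):
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]

# Toy scores: group "b" is scored systematically lower
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.5, 0.2, 0.1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

thresholds = thresholds_for_target_rate(scores, groups, target_rate=0.5)
preds = predict(scores, groups, thresholds)
print(thresholds, preds)
```

With a single global threshold of 0.7, no one in group `b` would be selected; the per-group thresholds select half of each group.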

## Explainability & Transparency

### Explanation Types

| Type | Audience | Purpose |
|------|----------|---------|
| Global | Developers | Understand overall model behavior |
| Local | End users | Explain specific decisions |
| Counterfactual | Affected parties | What would need to change for a different outcome |
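
For a linear scoring model, a counterfactual explanation can be computed directly. The sketch below assumes a score of the form `w·x + b` with approval at `score >= threshold`, and reports how much each feature alone would have to change. The feature names, weights, and applicant are made up for illustration.

```python
def single_feature_counterfactuals(weights, bias, x, threshold=0.0, mutable=None):
    """Return the current score and the per-feature change needed to reach threshold."""
    score = sum(weights[name] * x[name] for name in weights) + bias
    gap = threshold - score
    changes = {}
    for name in (mutable or weights):
        w = weights[name]
        if w != 0:
            # Changing only this feature by gap / w moves the score onto the threshold
            changes[name] = gap / w
    return score, changes

# Hypothetical credit model and applicant
weights = {"income_k": 0.04, "debt_k": -0.1, "years_employed": 0.2}
bias = -3.0
applicant = {"income_k": 50, "debt_k": 10, "years_employed": 2}

score, changes = single_feature_counterfactuals(weights, bias, applicant)
print(round(score, 2), {k: round(v, 1) for k, v in changes.items()})
```

The sketch does not check feasibility: here the required debt reduction exceeds the applicant's current debt, so that counterfactual should be filtered out before it is shown to anyone.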

### Explainability Techniques

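- **SHAP**: Additive per-feature attributions for individual predictions, based on Shapley values
- **LIME**: Local explanations from an interpretable surrogate model fitted around one prediction
- **Attention maps**: Show which inputs an attention-based network weights most; attention weights are not always faithful explanations
- **Decision trees**: Inherently interpretable when kept shallow
- **Feature importance**: Global ranking of the features that drive model behavior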
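
Global feature importance can be estimated without access to model internals by permutation: shuffle one feature column and measure how much accuracy drops. The sketch below is a plain-Python illustration, assuming the model is any callable that maps a feature row to a label. The toy model and data are hypothetical.

```python
import random

def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model that uses only feature 0 and ignores feature 1
model = lambda row: int(row[0] > 0.5)
X = [[0.1, 5], [0.9, 3], [0.2, 8], [0.8, 1], [0.3, 7], [0.7, 2]]
y = [0, 1, 0, 1, 0, 1]

importances = permutation_importance(model, X, y)
print(importances)
```

Feature 1 gets an importance of exactly zero because the toy model never reads it. Correlated features can share or hide importance under this method, so treat the ranking as a starting point.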

### Model Cards

Document for each model:

- Model purpose and intended use
- Training data description
- Performance metrics by subgroup
- Limitations and ethical considerations
- Version and update history
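
One way to keep these fields consistent across models is to hold them in a small data structure and render the card from it. The sketch below is an assumed layout, not a standard schema; the class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    name: str
    version: str
    intended_use: str
    training_data: str
    subgroup_metrics: dict = field(default_factory=dict)  # {subgroup: {metric: value}}
    limitations: list = field(default_factory=list)
    changelog: list = field(default_factory=list)

    def to_markdown(self):
        lines = [
            f"# Model Card: {self.name} (v{self.version})",
            "", "## Intended Use", self.intended_use,
            "", "## Training Data", self.training_data,
            "", "## Performance by Subgroup",
            "| Subgroup | Metric | Value |",
            "|----------|--------|-------|",
        ]
        for group, metrics in self.subgroup_metrics.items():
            for metric, value in metrics.items():
                lines.append(f"| {group} | {metric} | {value:.3f} |")
        lines += ["", "## Limitations and Ethical Considerations"]
        lines += [f"- {item}" for item in self.limitations]
        lines += ["", "## Version History"]
        lines += [f"- {entry}" for entry in self.changelog]
        return "\n".join(lines)

# Hypothetical example values
card = ModelCard(
    name="loan-screening-demo",
    version="0.1.0",
    intended_use="Illustrative example only.",
    training_data="Synthetic applications; no real data.",
    subgroup_metrics={"group_a": {"recall": 0.91}, "group_b": {"recall": 0.84}},
    limitations=["Recall gap between groups has not been investigated."],
    changelog=["0.1.0: initial draft"],
)
print(card.to_markdown())
```

Generating the card from the same evaluation output used in review helps keep the published subgroup metrics from drifting away from what was actually measured.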

## AI Governance

### AI Risk Assessment

**Risk Categories (EU AI Act):**

| Risk Level | Examples | Requirements |
|------------|----------|--------------|
| Unacceptable | Social scoring, manipulation | Prohibited |
| High | Healthcare, employment, credit | Risk management, data governance, technical documentation, human oversight, conformity assessment |
| Limited | Chatbots | Transparency obligations |
| Minimal | Spam filters | No mandatory requirements (voluntary codes of conduct) |
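
For an internal intake form, the tiers above can drive a first-pass triage. The sketch below is illustrative only and is not a legal classification: the mapping contains just the examples from the table, and the names `RiskTier`, `EXAMPLE_TIERS`, and `triage` are hypothetical.

```python
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"
    HIGH = "high"
    LIMITED = "limited"
    MINIMAL = "minimal"

# Built only from the example use cases in the table; extend with legal review
EXAMPLE_TIERS = {
    "social scoring": RiskTier.UNACCEPTABLE,
    "healthcare": RiskTier.HIGH,
    "employment": RiskTier.HIGH,
    "credit": RiskTier.HIGH,
    "chatbot": RiskTier.LIMITED,
    "spam filter": RiskTier.MINIMAL,
}

def triage(use_case):
    """Return (tier, needs_review). Unknown use cases always go to manual review."""
    tier = EXAMPLE_TIERS.get(use_case.strip().lower())
    if tier is None:
        return None, True
    return tier, tier in (RiskTier.UNACCEPTABLE, RiskTier.HIGH)

print(triage("Credit"))
print(triage("drone routing"))
```

Defaulting unknown use cases to review is a deliberate choice: a lookup table should never silently clear a system it does not recognize.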

### Governance Framework

1. **Policy**: Define ethical principles and boundaries
2. **Process**: Review and approval workflows
3. **People**: Roles and responsibilities (ethics board)
4. **Technology**: Tools for monitoring and enforcement

### Documentation Requirements

- Data provenance and lineage
- Model training documentation
- Testing and validation results
- Deployment and monitoring plans
- Incident response procedures

## Human Oversight

### Human-in-the-Loop Patterns

| Pattern | Use Case | Example |
|---------|----------|---------|
| Human-in-the-Loop | High-stakes decisions | Medical diagnosis confirmation |
| Human-on-the-Loop | Monitoring with intervention | Content moderation escalation |
| Human-out-of-the-Loop | Low-risk, high-volume | Spam filtering |

### Designing for Human Control

- Clear escalation paths
- Override capabilities
- Confidence thresholds for automation
- Audit trails
- Feedback mechanisms
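
Several of these elements (confidence thresholds, escalation, overrides, audit trails) can be combined in a small routing component. The sketch below is a minimal illustration; the class name, the 0.95 cutoff, and the log fields are assumptions to adapt per use case.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionRouter:
    auto_threshold: float = 0.95  # assumed cutoff; tune per use case and risk level
    audit_log: list = field(default_factory=list)

    def _log(self, **event):
        # Every decision and override is timestamped and appended to the trail
        event["timestamp"] = datetime.now(timezone.utc).isoformat()
        self.audit_log.append(event)

    def route(self, case_id, prediction, confidence):
        """Automate only high-confidence cases; escalate the rest to a person."""
        action = "auto" if confidence >= self.auto_threshold else "human_review"
        self._log(case_id=case_id, prediction=prediction,
                  confidence=confidence, action=action, actor="model")
        return action

    def override(self, case_id, new_decision, reviewer, reason):
        """Record a human overriding the model's output."""
        self._log(case_id=case_id, prediction=new_decision,
                  action="override", actor=reviewer, reason=reason)

router = DecisionRouter()
print(router.route("case-1", "approve", 0.99))
print(router.route("case-2", "deny", 0.60))
router.override("case-2", "approve", reviewer="analyst-7", reason="income verified manually")
print(len(router.audit_log))
```

Logged overrides double as a feedback mechanism: cases where reviewers disagree with the model are natural candidates for retraining data and threshold review.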

## Privacy Considerations

### Data Minimization

- Collect only necessary data
- Anonymize when possible
- Prefer aggregated data over individual-level records
- Delete data when no longer needed
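
The first and third points can be sketched as an allowlist filter followed by aggregation with small-cell suppression. This is an illustration under assumed field names and an assumed minimum cell size; it is not a guarantee of anonymity.

```python
from collections import Counter

def minimize(records, allowed_fields):
    """Keep only the fields that the stated purpose actually needs."""
    return [{k: r[k] for k in allowed_fields if k in r} for r in records]

def aggregate_counts(records, key, min_cell_size=3):
    """Count records per value of `key`, dropping cells too small to publish."""
    counts = Counter(r[key] for r in records)
    return {value: n for value, n in counts.items() if n >= min_cell_size}

# Made-up records for illustration
records = [
    {"user_id": 1, "email": "a@example.com", "age_band": "30-39", "region": "north"},
    {"user_id": 2, "email": "b@example.com", "age_band": "30-39", "region": "north"},
    {"user_id": 3, "email": "c@example.com", "age_band": "40-49", "region": "north"},
    {"user_id": 4, "email": "d@example.com", "age_band": "20-29", "region": "south"},
]

slim = minimize(records, allowed_fields=["age_band", "region"])
summary = aggregate_counts(slim, key="region", min_cell_size=3)
print(slim[0], summary)
```

The single record in the `south` region is suppressed because a count of one would point to an individual.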

### Privacy-Preserving Techniques

- Differential privacy
- Federated learning
- Secure multi-party computation
- Homomorphic encryption
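
As an illustration of differential privacy, the sketch below releases a count with Laplace noise scaled to `sensitivity / epsilon`. It assumes each person contributes at most one record, so the sensitivity of a count is 1. The function name and data are hypothetical, and this floating-point sampler is a teaching example rather than a hardened implementation.

```python
import random

def dp_count(values, predicate, epsilon, rng):
    """Noisy count of values satisfying predicate, via the Laplace mechanism."""
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1  # adding or removing one record changes a count by at most 1
    scale = sensitivity / epsilon
    # The difference of two independent exponentials is Laplace-distributed
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise

rng = random.Random(42)
ages = [23, 35, 41, 29, 52, 38, 61, 19, 44, 33]  # made-up data
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(noisy)
```

Smaller `epsilon` values add more noise and give stronger privacy. Repeated queries consume privacy budget, so track the total `epsilon` spent on a dataset.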

## Environmental Impact

### Considerations

- Training compute requirements
- Inference energy consumption
- Hardware lifecycle
- Data center energy sources
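
These factors combine into a rough back-of-the-envelope estimate: energy is accelerator count times average power times hours times the data center's power usage effectiveness (PUE), and emissions are energy times the grid's carbon intensity. In the sketch below all input values are placeholders chosen for illustration, not measurements.

```python
def training_footprint(num_accelerators, avg_power_kw, hours, pue, grid_kg_co2_per_kwh):
    """Return (energy in kWh, emissions in kg CO2e) for one training run."""
    energy_kwh = num_accelerators * avg_power_kw * hours * pue
    emissions_kg = energy_kwh * grid_kg_co2_per_kwh
    return energy_kwh, emissions_kg

# Placeholder inputs: replace with measured power draw and local grid data
energy, emissions = training_footprint(
    num_accelerators=8,
    avg_power_kw=0.4,
    hours=100,
    pue=1.2,
    grid_kg_co2_per_kwh=0.4,
)
print(round(energy, 1), "kWh", round(emissions, 1), "kg CO2e")
```

With these placeholder inputs the run comes to roughly 384 kWh and 154 kg CO2e. The estimate covers training compute only; inference and hardware lifecycle need separate accounting.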

### Mitigation

- Efficient architectures
- Model distillation
- Transfer learning
- Green hosting providers

## Reference Files

- **`references/bias_assessment.md`** - Detailed bias evaluation methodology
- **`references/regulatory_compliance.md`** - AI regulation requirements

## Integration with Other Skills

- **machine-learning** - For model development
- **testing** - For bias testing
- **documentation** - For model cards