--- name: mediapipe-pose-detection description: MediaPipe pose detection expertise. Use when debugging landmark tracking, adjusting confidence thresholds, fixing pose detection issues, working with pose.py and video_io.py, or validating pose detection with manual observation. --- # MediaPipe Pose Detection ## Key Landmarks for Jump Analysis ### Lower Body (Primary for Jumps) | Landmark | Left Index | Right Index | Use Case | | -------- | ---------- | ----------- | --------------------------- | | Hip | 23 | 24 | Center of mass, jump height | | Knee | 25 | 26 | Triple extension, landing | | Ankle | 27 | 28 | Ground contact detection | | Heel | 29 | 30 | Takeoff/landing timing | | Toe | 31 | 32 | Forefoot contact | ### Upper Body (Secondary) | Landmark | Left Index | Right Index | Use Case | | -------- | ---------- | ----------- | ------------------ | | Shoulder | 11 | 12 | Arm swing tracking | | Elbow | 13 | 14 | Arm action | | Wrist | 15 | 16 | Arm swing timing | ### Reference Points | Landmark | Index | Use Case | | --------- | ----- | ---------------- | | Nose | 0 | Head position | | Left Eye | 2 | Face orientation | | Right Eye | 5 | Face orientation | ## Confidence Thresholds ### Default Settings ```python min_detection_confidence = 0.5 # Initial pose detection min_tracking_confidence = 0.5 # Frame-to-frame tracking ``` ### Quality Presets (auto_tuning.py) | Preset | Detection | Tracking | Use Case | | ---------- | --------- | -------- | ---------------------------------- | | `fast` | 0.3 | 0.3 | Quick processing, tolerates errors | | `balanced` | 0.5 | 0.5 | Default, good accuracy | | `accurate` | 0.7 | 0.7 | Best accuracy, slower | ### Tuning Guidelines - **Increase thresholds** when: Jittery landmarks, false detections - **Decrease thresholds** when: Missing landmarks, tracking loss - **Typical adjustment**: ±0.1 increments ## Common Issues and Solutions ### Landmark Jitter **Symptoms**: Landmarks jump erratically between frames **Solutions**: 1. Apply Butterworth low-pass filter (cutoff 6-10 Hz) 2. Increase tracking confidence 3. Use One-Euro filter for real-time applications ```python # Butterworth filter (filtering.py) from kinemotion.core.filtering import butterworth_filter smoothed = butterworth_filter(landmarks, cutoff=8.0, fps=30) # One-Euro filter (smoothing.py) from kinemotion.core.smoothing import one_euro_filter smoothed = one_euro_filter(landmarks, min_cutoff=1.0, beta=0.007) ``` ### Left/Right Confusion **Symptoms**: MediaPipe swaps left and right landmarks mid-video **Cause**: Occlusion at 90° lateral camera angle **Solutions**: 1. Use 45° oblique camera angle (recommended) 2. Post-process to detect and correct swaps 3. Use single-leg tracking when possible ### Tracking Loss **Symptoms**: Landmarks disappear for several frames **Causes**: - Athlete moves out of frame - Fast motion blur - Occlusion by equipment/clothing **Solutions**: 1. Ensure full athlete visibility throughout video 2. Use higher frame rate (60+ fps) 3. Interpolate missing frames (up to 3-5 frames) ```python # Simple linear interpolation for gaps import numpy as np def interpolate_gaps(landmarks, max_gap=5): # Fill NaN gaps with linear interpolation for i in range(landmarks.shape[1]): mask = np.isnan(landmarks[:, i]) if mask.sum() > 0 and mask.sum() <= max_gap: landmarks[:, i] = np.interp( np.arange(len(landmarks)), np.where(~mask)[0], landmarks[~mask, i] ) return landmarks ``` ### Low Confidence Scores **Symptoms**: Visibility scores consistently below threshold **Causes**: - Poor lighting (backlighting, shadows) - Low contrast clothing vs background - Partial occlusion **Solutions**: 1. Improve lighting (front-lit, even) 2. Ensure clothing contrasts with background 3. Remove obstructions from camera view ## Video Processing (video_io.py) ### Rotation Handling Mobile videos often have rotation metadata that must be handled: ```python # video_io.py handles this automatically # Reads EXIF rotation and applies correction from kinemotion.core.video_io import read_video_frames frames, fps, dimensions = read_video_frames("mobile_video.mp4") # Frames are correctly oriented regardless of source ``` ### Manual Rotation (if needed) ```bash # FFmpeg rotation options ffmpeg -i input.mp4 -vf "transpose=1" output.mp4 # 90° clockwise ffmpeg -i input.mp4 -vf "transpose=2" output.mp4 # 90° counter-clockwise ffmpeg -i input.mp4 -vf "hflip" output.mp4 # Horizontal flip ``` ### Frame Dimensions Always read actual frame dimensions from first frame, not metadata: ```python # Correct approach cap = cv2.VideoCapture(video_path) ret, frame = cap.read() height, width = frame.shape[:2] # Incorrect (may be wrong for rotated videos) width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) ``` ## Coordinate Systems ### MediaPipe Output - Normalized coordinates: (0.0, 0.0) to (1.0, 1.0) - Origin: Top-left corner - X: Left to right - Y: Top to bottom - Z: Depth (relative, camera-facing is negative) ### Conversion to Pixels ```python def normalized_to_pixel(landmark, width, height): x = int(landmark.x * width) y = int(landmark.y * height) return x, y ``` ### Visibility Score Each landmark has a visibility score (0.0-1.0): - > 0.5: Likely visible and accurate - < 0.5: May be occluded or estimated - = 0.0: Not detected ## Debug Overlay (debug_overlay.py) ### Skeleton Drawing ```python # Key connections for jump visualization POSE_CONNECTIONS = [ (23, 25), (25, 27), (27, 29), (27, 31), # Left leg (24, 26), (26, 28), (28, 30), (28, 32), # Right leg (23, 24), # Hips (11, 23), (12, 24), # Torso ] ``` ### Color Coding | Element | Color (BGR) | Meaning | | -------------- | ------------- | ------------------------- | | Skeleton | (0, 255, 0) | Green - normal tracking | | Low confidence | (0, 165, 255) | Orange - visibility < 0.5 | | Key angles | (255, 0, 0) | Blue - measured angles | | Phase markers | (0, 0, 255) | Red - takeoff/landing | ## Performance Optimization ### Reducing Latency 1. Use `model_complexity=0` for fastest inference 2. Process every Nth frame for batch analysis 3. Use GPU acceleration if available ```python import mediapipe as mp pose = mp.solutions.pose.Pose( model_complexity=0, # 0=Lite, 1=Full, 2=Heavy min_detection_confidence=0.5, min_tracking_confidence=0.5, static_image_mode=False # False for video (uses tracking) ) ``` ### Memory Management - Release pose estimator after processing: `pose.close()` - Process videos in chunks for large files - Use generators for frame iteration ## Integration with kinemotion ### File Locations - Pose estimation: `src/kinemotion/core/pose.py` - Video I/O: `src/kinemotion/core/video_io.py` - Filtering: `src/kinemotion/core/filtering.py` - Smoothing: `src/kinemotion/core/smoothing.py` - Auto-tuning: `src/kinemotion/core/auto_tuning.py` ### Typical Pipeline ```text Video → read_video_frames() → pose.process() → filter/smooth → analyze ``` ## Manual Observation for Validation During development, use manual frame-by-frame observation to establish ground truth and validate pose detection accuracy. ### When to Use Manual Observation 1. **Algorithm development**: Validating new phase detection methods 2. **Parameter tuning**: Comparing detected vs actual frames 3. **Debugging**: Investigating pose detection failures 4. **Ground truth collection**: Building validation datasets ### Ground Truth Data Collection Protocol **Step 1: Generate Debug Video** ```bash uv run kinemotion cmj-analyze video.mp4 --output debug.mp4 ``` **Step 2: Manual Frame-by-Frame Analysis** Open debug video in a frame-stepping tool (QuickTime, VLC with frame advance, or video editor). **Step 3: Record Observations** For each key phase, record the frame number where the event occurs: ```text === MANUAL OBSERVATION: PHASE DETECTION === Video: ________________________ FPS: _____ Total Frames: _____ PHASE DETECTION (frame numbers) | Phase | Detected | Manual | Error | Notes | |-------|----------|--------|-------|-------| | Standing End | ___ | ___ | ___ | | | Lowest Point | ___ | ___ | ___ | | | Takeoff | ___ | ___ | ___ | | | Peak Height | ___ | ___ | ___ | | | Landing | ___ | ___ | ___ | | LANDMARK QUALITY (per phase) | Phase | Hip Visible | Knee Visible | Ankle Visible | Notes | |-------|-------------|--------------|---------------|-------| | Standing | Y/N | Y/N | Y/N | | | Countermovement | Y/N | Y/N | Y/N | | | Flight | Y/N | Y/N | Y/N | | | Landing | Y/N | Y/N | Y/N | | ``` ### Phase Detection Criteria **Standing End**: Last frame before downward hip movement begins - Look for: Hip starts descending, knees begin flexing **Lowest Point**: Frame where hip reaches minimum height - Look for: Deepest squat position, hip at lowest Y coordinate **Takeoff**: First frame where both feet leave ground - Look for: Toe/heel landmarks separate from ground plane - Note: May be 1-2 frames after visible liftoff due to detection lag **Peak Height**: Frame where hip reaches maximum height - Look for: Hip at highest Y coordinate during flight **Landing**: First frame where foot contacts ground - Look for: Heel or toe landmark touches ground plane - Note: Algorithm may detect 1-2 frames late (velocity-based) ### Landmark Quality Assessment For each landmark, observe: | Quality | Criteria | | ----------- | ---------------------------------------------------- | | **Good** | Landmark stable, positioned correctly on body part | | **Jittery** | Landmark oscillates ±5-10 pixels between frames | | **Offset** | Landmark consistently displaced from actual position | | **Lost** | Landmark missing or wildly incorrect | | **Swapped** | Left/right landmarks switched | ### Recording Observations Format When validating, provide structured data: ```text ## Ground Truth: [video_name] **Video Info:** - Frames: 215 - FPS: 60 - Duration: 3.58s - Camera: 45° oblique **Phase Detection Comparison:** | Phase | Detected | Manual | Error (frames) | Error (ms) | |-------|----------|--------|----------------|------------| | Standing End | 64 | 64 | 0 | 0 | | Lowest Point | 91 | 88 | +3 (late) | +50 | | Takeoff | 104 | 104 | 0 | 0 | | Landing | 144 | 142 | +2 (late) | +33 | **Error Analysis:** - Mean absolute error: 1.25 frames (21ms) - Bias detected: Landing consistently late - Accuracy: 2/4 perfect, 4/4 within ±3 frames **Landmark Issues Observed:** - Frame 87-92: Hip jitter during lowest point - Frame 140-145: Ankle tracking unstable at landing ``` ### Acceptable Error Thresholds At 60fps (16.67ms per frame): | Error Level | Frames | Time | Interpretation | | ----------- | ------ | ----- | --------------------------------- | | Perfect | 0 | 0ms | Exact match | | Excellent | ±1 | ±17ms | Within human observation variance | | Good | ±2 | ±33ms | Acceptable for most metrics | | Acceptable | ±3 | ±50ms | May affect precise timing metrics | | Investigate | >3 | >50ms | Algorithm may need adjustment | ### Bias Detection Look for systematic patterns across multiple videos: | Pattern | Meaning | Action | | -------------------- | ----------------------- | ------------------------ | | Consistent +N frames | Algorithm detects late | Adjust threshold earlier | | Consistent -N frames | Algorithm detects early | Adjust threshold later | | Variable ±N frames | Normal variance | No action needed | | Increasing error | Tracking degrades | Check landmark quality | ### Integration with basic-memory Store ground truth observations: ```python # Save validation results write_note( title="CMJ Phase Detection Validation - [video_name]", content="[structured observation data]", folder="biomechanics" ) # Search previous validations search_notes("phase detection ground truth") # Build context for analysis build_context("memory://biomechanics/*") ``` ### Example: CMJ Validation Study Reference See basic-memory for complete validation study: - `biomechanics/cmj-phase-detection-validation-45deg-oblique-view-ground-truth` - `biomechanics/cmj-landing-detection-bias-root-cause-analysis` - `biomechanics/cmj-landing-detection-impact-vs-contact-method-comparison` Key findings from validation: - Standing End: 100% accuracy (0 frame error) - Takeoff: ~0.7 frame mean error (excellent) - Lowest Point: ~2.3 frame mean error (variable) - Landing: +1-2 frame consistent bias (investigate)