--- name: audio-analyzer description: Comprehensive audio analysis with waveform visualization, spectrogram, BPM detection, key detection, frequency analysis, and loudness metrics. --- # Audio Analyzer A comprehensive toolkit for analyzing audio files. Extract detailed information about audio including tempo, musical key, frequency content, loudness metrics, and generate professional visualizations. ## Quick Start ```python from scripts.audio_analyzer import AudioAnalyzer # Analyze an audio file analyzer = AudioAnalyzer("song.mp3") analyzer.analyze() # Get all analysis results results = analyzer.get_results() print(f"BPM: {results['tempo']['bpm']}") print(f"Key: {results['key']['key']} {results['key']['mode']}") # Generate visualizations analyzer.plot_waveform("waveform.png") analyzer.plot_spectrogram("spectrogram.png") # Full report analyzer.save_report("analysis_report.json") ``` ## Features - **Tempo/BPM Detection**: Accurate beat tracking with confidence score - **Key Detection**: Musical key and mode (major/minor) identification - **Frequency Analysis**: Spectrum, dominant frequencies, frequency bands - **Loudness Metrics**: RMS, peak, LUFS, dynamic range - **Waveform Visualization**: Multi-channel waveform plots - **Spectrogram**: Time-frequency visualization with customization - **Chromagram**: Pitch class visualization for harmonic analysis - **Beat Grid**: Visual beat markers overlaid on waveform - **Export Formats**: JSON report, PNG/SVG visualizations ## API Reference ### Initialization ```python # From file analyzer = AudioAnalyzer("audio.mp3") # With custom sample rate analyzer = AudioAnalyzer("audio.wav", sr=44100) ``` ### Analysis Methods ```python # Run full analysis analyzer.analyze() # Individual analyses analyzer.analyze_tempo() # BPM and beat positions analyzer.analyze_key() # Musical key detection analyzer.analyze_loudness() # RMS, peak, LUFS analyzer.analyze_frequency() # Spectrum analysis analyzer.analyze_dynamics() # Dynamic range ``` ### Results Access ```python # Get all results as dict results = analyzer.get_results() # Individual results tempo = analyzer.get_tempo() # {'bpm': 120, 'confidence': 0.85, 'beats': [...]} key = analyzer.get_key() # {'key': 'C', 'mode': 'major', 'confidence': 0.72} loudness = analyzer.get_loudness() # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0} freq = analyzer.get_frequency() # {'dominant_freq': 440, 'spectrum': [...]} ``` ### Visualization Methods ```python # Waveform analyzer.plot_waveform( output="waveform.png", figsize=(12, 4), color="#1f77b4", show_rms=True ) # Spectrogram analyzer.plot_spectrogram( output="spectrogram.png", figsize=(12, 6), cmap="magma", # viridis, plasma, inferno, magma freq_scale="log", # linear, log, mel max_freq=8000 # Hz ) # Chromagram (pitch classes) analyzer.plot_chromagram( output="chromagram.png", figsize=(12, 4) ) # Onset strength / beat grid analyzer.plot_beats( output="beats.png", figsize=(12, 4), show_strength=True ) # Combined dashboard analyzer.plot_dashboard( output="dashboard.png", figsize=(14, 10) ) ``` ### Export ```python # JSON report with all analysis analyzer.save_report("report.json") # Summary text summary = analyzer.get_summary() print(summary) ``` ## Analysis Details ### Tempo Detection Uses beat tracking algorithm to detect: - **BPM**: Beats per minute (tempo) - **Beat positions**: Timestamps of detected beats - **Confidence**: Reliability score (0-1) ```python tempo = analyzer.get_tempo() # { # 'bpm': 128.0, # 'confidence': 0.89, # 'beats': [0.0, 0.469, 0.938, 1.406, ...], # seconds # 'beat_count': 256 # } ``` ### Key Detection Analyzes harmonic content to identify: - **Key**: Root note (C, C#, D, etc.) - **Mode**: Major or minor - **Confidence**: Detection confidence - **Key profile**: Correlation with each key ```python key = analyzer.get_key() # { # 'key': 'A', # 'mode': 'minor', # 'confidence': 0.76, # 'profile': {'C': 0.12, 'C#': 0.08, ...} # } ``` ### Loudness Metrics Comprehensive loudness analysis: - **RMS dB**: Root mean square level - **Peak dB**: Maximum sample level - **LUFS**: Integrated loudness (broadcast standard) - **Dynamic Range**: Difference between loud and quiet sections ```python loudness = analyzer.get_loudness() # { # 'rms_db': -14.2, # 'peak_db': -0.3, # 'lufs': -14.0, # 'dynamic_range_db': 12.5, # 'crest_factor': 8.2 # } ``` ### Frequency Analysis Spectrum analysis including: - **Dominant frequency**: Strongest frequency component - **Frequency bands**: Energy in bass, mid, treble - **Spectral centroid**: "Brightness" of audio - **Spectral rolloff**: Frequency below which 85% of energy exists ```python freq = analyzer.get_frequency() # { # 'dominant_freq': 440.0, # 'spectral_centroid': 2150.3, # 'spectral_rolloff': 4200.5, # 'bands': { # 'sub_bass': -28.5, # 20-60 Hz # 'bass': -18.2, # 60-250 Hz # 'low_mid': -12.1, # 250-500 Hz # 'mid': -10.8, # 500-2000 Hz # 'high_mid': -14.3, # 2000-4000 Hz # 'high': -22.1 # 4000-20000 Hz # } # } ``` ## CLI Usage ```bash # Full analysis with all visualizations python audio_analyzer.py --input song.mp3 --output-dir ./analysis/ # Just tempo and key python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json # Generate specific visualization python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png # Dashboard view python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png # Batch analyze directory python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/ ``` ### CLI Arguments | Argument | Description | Default | |----------|-------------|---------| | `--input` | Input audio file | Required | | `--input-dir` | Directory of audio files | - | | `--output` | Output file path | - | | `--output-dir` | Output directory | `.` | | `--analyze` | Analysis types: tempo, key, loudness, frequency, all | `all` | | `--plot` | Plot type: waveform, spectrogram, chromagram, beats, dashboard | - | | `--format` | Output format: json, txt | `json` | | `--sr` | Sample rate for analysis | `22050` | ## Examples ### Song Analysis ```python analyzer = AudioAnalyzer("track.mp3") analyzer.analyze() print(f"Tempo: {analyzer.get_tempo()['bpm']:.1f} BPM") print(f"Key: {analyzer.get_key()['key']} {analyzer.get_key()['mode']}") print(f"Loudness: {analyzer.get_loudness()['lufs']:.1f} LUFS") analyzer.plot_dashboard("track_analysis.png") ``` ### Podcast Quality Check ```python analyzer = AudioAnalyzer("podcast.mp3") analyzer.analyze_loudness() loudness = analyzer.get_loudness() if loudness['lufs'] > -16: print("Warning: Audio may be too loud for podcast standards") elif loudness['lufs'] < -20: print("Warning: Audio may be too quiet") else: print("Loudness is within podcast standards (-16 to -20 LUFS)") ``` ### Batch Analysis ```python import os from scripts.audio_analyzer import AudioAnalyzer results = [] for filename in os.listdir("./songs"): if filename.endswith(('.mp3', '.wav', '.flac')): analyzer = AudioAnalyzer(f"./songs/{filename}") analyzer.analyze() results.append({ 'file': filename, 'bpm': analyzer.get_tempo()['bpm'], 'key': f"{analyzer.get_key()['key']} {analyzer.get_key()['mode']}", 'lufs': analyzer.get_loudness()['lufs'] }) # Sort by BPM for DJ set results.sort(key=lambda x: x['bpm']) ``` ## Supported Formats Input formats (via librosa/soundfile): - MP3 - WAV - FLAC - OGG - M4A/AAC - AIFF Output formats: - JSON (analysis report) - PNG (visualizations) - SVG (visualizations) - TXT (summary) ## Dependencies ``` librosa>=0.10.0 soundfile>=0.12.0 matplotlib>=3.7.0 numpy>=1.24.0 scipy>=1.10.0 ``` ## Limitations - Key detection works best with melodic content (less accurate for drums/percussion) - BPM detection may struggle with free-tempo or complex time signatures - Very short clips (<5 seconds) may have reduced accuracy - LUFS calculation is simplified (not full ITU-R BS.1770-4)