Spectrogram Settings

From Audacity Development Manual
Jump to: navigation, search



It is possible to temporarily change the Spectrogram settings for any particular Spectrogram track, overriding whatever setting you have in Spectrograms Preferences.

Per track Spectrogram Settings

Open the Audio Track Dropdown Menu on the Spectrogram track you want to change, then choose Spectrogram Settings. This opens the dialog below:

Spectrogram Track Settings.png
The available settings are the same as those offered in Spectrograms Preferences with the addition of a Use Preferences checkbox.

Persistence of changes made here

Changes you make when you press the OK button only persist for that track while the project window is open. This is the case even if you save a project. Use Spectrograms Preferences instead to make permanent changes to the default Spectrogram settings with which a new Spectrogram track will open.

You can also preview what your changes will look like in the track by clicking the Apply button. If necessary you can tweak the settings and click Apply again to view the result.

If you do click Apply you cannot then discard changes made in the dialog, even if you click the Cancel button. If you do not click the Apply button, Cancel discards your settings changes.


Use Preferences

The Spectrogram Settings dialog defaults to the "Use Preferences" checkbox being enabled, so on first opening the settings for a track the settings will be as already set in Spectrograms Preferences. Changing a setting in Spectrogram Settings will automatically disable the "Use Preferences" checkbox. Re-enabling the "Use Preferences" checkbox changes the settings in the dialog back to as they are in Spectrograms Preferences.

Scale

  • Scale (in Spectrogram views):
    • Linear The linear vertical scale goes linearly from 0 kHz to 20 kHz frequency by default.
    • Logarithmic: This view is the same as the linear view except that the vertical scale is logarithmic. See Spectrogram View for a contrasting example of linear versus logarithmic spectrogram view.
    • Mel: The name Mel comes from the word melody to indicate that the scale is based on pitch comparisons. See this Wikipedia page. This is the default scale.
    • Bark: This is a psychoacoustical scale based on subjective measurements of loudness. It is related to, but somewhat less popular than, the Mel scale. See this Wikipedia page.
    • ERB: The Equivalent Rectangular Bandwidth scale or ERB is a measure used in psychoacoustics, which gives an approximation to the bandwidths of the filters in human hearing. It is implemented as a function ERBS(f) which returns the number of equivalent rectangular bandwidths below the given frequency "f". See this Wikipedia page.
    • Period: is the previously undocumented scale used by Pitch (EAC) view. It is for making the same displays of Pitch possible as in earlier versions of Audacity.
  • Minimum Frequency: This value corresponds to the bottom of the vertical scale in the spectrogram. Frequencies below this value will not be visible. The default value of "0" here will be treated as "1" when using "Spectrogram Logarithmic" view mode because a logarithmic scale cannot start at zero.
  • Maximum Frequency: This value corresponds to the top of the vertical scale. The value can be set to 100 Hz or any higher value. Irrespective of the entered value, the top of the scale will never exceed half the current sample rate of the track (for example, 22,050 Hz if the track rate is 44,100 Hz) because any given sample rate can only carry frequencies up to half that rate. A good use of this setting is in speech recognition or pitch extraction, where you can hide the visually unimportant highest frequencies and focus on the lower frequencies.

Colors

  • Gain (dB): Increases / decreases the brightness of the display. For small signals where the display is mostly "blue" (dark) you can increase this value to see brighter colors and give more detail. If the display has too much "white", decrease this value. The default is 20dB and corresponds to a -20 dB signal at a particular frequency being displayed as "white". This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.
  • Range (dB): Affects the range of signal sizes that will be displayed as colors. The default is 80 dB and means that you will not see anything for signals 80 dB below the value set for "Gain". This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.
  • Frequency Gain (dB/dec): A positive value here gives some extra gain to higher frequencies (above 1,000 Hz), as they tend to be smaller and so cannot be seen as well. You get less gain at lower frequencies as well. The default is 0 dB. This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.
  • Scheme: Choice of two colorways or two grayscale settings.

Algorithm

  • Algorithm:
    • Frequencies (default): Audio frequency determines the pitch of a sound. Measured in Hz, higher frequencies have higher pitch. See this Wikipedia article.
    • Reassignment: The method of reassignment sharpens blurry time-frequency data by relocating the data according to local estimates of instantaneous frequency and group delay. This mapping to reassigned time-frequency coordinates is very precise for signals that are separable in time and frequency with respect to the analysis window.
    • Pitch (EAC): Highlights the contour of the fundamental frequency (musical pitch) of the audio, using the Enhanced Autocorrelation (EAC) algorithm. The EAC Algorithm was developed to produce a mathematical representation of the changes of pitch in a piece of audio. The aim was to allow automated comparison of sound files so that two versions of the same tune could be recognized as being similar, even if played in different keys, or on different instruments.
  • Window Size: The dropdown menu lets you choose the size of the Fast Fourier Transform (FFT) window which affects how much vertical (frequency) detail you see. Larger FFT window sizes give more low frequency resolution and less temporal resolution, and are slower.
  • Window type: Determines precisely how the spectrogram is computed. Hann is the default setting. 'Rectangular' is slightly faster than other methods, but introduces some artifacts. All methods give broadly similar results.
  • Zero padding factor: Larger values give finer interpolation of the colors along the vertical axis, at the expense of more computation time. Does not affect the time vs. frequency resolution tradeoff. This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.

Enable Spectral Selection

Check this box "on" if you want to enable spectral selections. These are used to make selections that include a frequency range as well as a time range on tracks in one of the Spectrogram views. Spectral selection is used with special spectral edit effects to make changes to the frequency content of the selected audio. Among other purposes, spectral selection and editing can be used for cleaning up unwanted sound, enhancing certain resonances, changing the quality of a voice or removing mouth sounds from voice work.

Links

|< Spectrogram View