Spectral Derivatives



What are Spectral Derivatives? The traditional sonogram represents the power of sound in a time-frequency plane, while the spectral derivatives represent the change of power. For each point of the two-dimensional time-frequency plane of a sonogram, one can measure the change of power from left to right (in time), from bottom to top (in frequency), or in any arbitrary direction. Spectral derivatives are therefore derivatives of the spectrogram in an ‘appropriate’ direction in the time-frequency plane. They can be estimated using MultiTaper spectral methods; the resulting derivatives have the same resolution as the spectrogram and are not artificially broadened.

Sound Analysis Pro uses spectral derivatives to track frequency traces in the spectrogram as follows. As one cuts across a horizontal frequency trace, from low to high frequency, there is a sharp increase in power, then a plateau, then a decrease in power. The derivatives along the same cut are first positive and then negative, passing through zero at the peak power location. A useful property of these derivatives is that they show a sharp transition from positive to negative values, providing a contour that is more accurately defined than the frequency trace alone.
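The sign pattern described above can be illustrated with a minimal sketch. It is not Sound Analysis Pro's implementation: the Gaussian power profile, its centre (3000 Hz) and width, and the helper names are all assumptions chosen for illustration, and a plain finite difference stands in for the MultiTaper estimate.

```python
import math

def power_profile(freqs_hz, center_hz=3000.0, width_hz=200.0):
    """Hypothetical Gaussian power profile across one horizontal frequency trace."""
    return [math.exp(-0.5 * ((f - center_hz) / width_hz) ** 2) for f in freqs_hz]

def frequency_derivative(power):
    """Forward difference between adjacent frequency bins (a stand-in estimate)."""
    return [power[i + 1] - power[i] for i in range(len(power) - 1)]

def zero_crossings(deriv):
    """Indices i where the derivative is positive at i and non-positive at i + 1."""
    return [i for i in range(len(deriv) - 1) if deriv[i] > 0 and deriv[i + 1] <= 0]

# Cut through the trace from low to high frequency in 100 Hz steps.
freqs = [100.0 * i for i in range(61)]
power = power_profile(freqs)
deriv = frequency_derivative(power)

crossings = zero_crossings(deriv)
# deriv[i] spans bins i and i + 1, so the peak sits at bin i + 1.
peak_bin = crossings[0] + 1
peak_freq = freqs[peak_bin]
```

The derivative is positive on the rising flank and negative on the falling flank, so the single sign change marks the peak even though the power itself varies only slowly near its maximum.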

If the frequency trace is not horizontal, then the direction of maximum change in power is not along the frequency axis, but rather at an angle to both the time and frequency axes. To capture the direction of maximal power change in the frequency trace, it is then natural to take a directional derivative perpendicular to the direction of frequency modulation (think about detecting waves in the ocean by cutting through the surface in many arbitrary directions until you hit the wave). The directional derivative is easily computed as a linear combination of the derivatives in the time and frequency directions, and may be thought of as an edge detector in the time-frequency plane.
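As a rough sketch of this linear combination, the example below builds a hypothetical frequency-modulated ridge and compares derivatives taken along it, across it, and straight up the frequency axis. Several things here are assumptions rather than the manual's definitions: the angle theta is measured from the time axis, the weights are the textbook cos/sin form, both axes are in dimensionless bin units, and the ridge shape and function names are invented for illustration.

```python
import math

def ridge_power(t, f, f0=20.0, slope=1.5, width=2.0):
    """Hypothetical Gaussian ridge whose centre follows f = f0 + slope * t."""
    u = f - (f0 + slope * t)
    return math.exp(-0.5 * (u / width) ** 2)

def partials(power, t, f, h=1e-4):
    """Central-difference estimates of the time and frequency derivatives."""
    dp_dt = (power(t + h, f) - power(t - h, f)) / (2 * h)
    dp_df = (power(t, f + h) - power(t, f - h)) / (2 * h)
    return dp_dt, dp_df

def directional_derivative(dp_dt, dp_df, theta):
    """Linear combination of the two partial derivatives for direction theta."""
    return math.cos(theta) * dp_dt + math.sin(theta) * dp_df

slope = 1.5
t, f = 4.0, 25.0                      # one bin below the ridge centre (26.0)
dp_dt, dp_df = partials(ridge_power, t, f)

along = math.atan2(slope, 1.0)        # direction of frequency modulation
across = along + math.pi / 2          # perpendicular to the modulation

d_along = directional_derivative(dp_dt, dp_df, along)
d_across = directional_derivative(dp_dt, dp_df, across)
d_freq = directional_derivative(dp_dt, dp_df, math.pi / 2)
```

For this ridge the derivative along the modulation vanishes, while the perpendicular derivative exceeds the plain frequency derivative by a factor of sqrt(1 + slope**2), which is why the perpendicular cut gives the sharpest edge.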

We find the derivatives spectrogram an excellent means of visualizing the spectral information in a song. The derivative at each point is calculated at an angle that is perpendicular to the direction of frequency modulation. As a result of this edge detector technique, zero crossings (transitions from black to white in the middle of frequency traces) look equally sharp in the modulated and in the unmodulated portions of a note. The peak frequency contour is defined by the zero crossings of successive directional derivatives.
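To make the contour definition concrete, the sketch below follows the positive-to-negative zero crossing through successive time frames and interpolates between frequency bins. It is an illustration under assumptions, not the program's algorithm: the derivative values come from an analytic, hypothetical Gaussian ridge rather than from a MultiTaper estimate, and the function names and grid are invented.

```python
import math

def perpendicular_derivative(t, f, f0=20.0, slope=0.5, width=2.0):
    """Analytic derivative across a hypothetical ridge centred on f = f0 + slope * t."""
    u = f - (f0 + slope * t)
    p = math.exp(-0.5 * (u / width) ** 2)
    return -p * u * math.sqrt(1 + slope ** 2) / width ** 2

def contour_point(column, freqs):
    """Interpolated frequency of the first positive-to-negative crossing in one frame."""
    for i in range(len(column) - 1):
        a, b = column[i], column[i + 1]
        if a > 0 and b <= 0:
            frac = a / (a - b)        # linear interpolation between the two bins
            return freqs[i] + frac * (freqs[i + 1] - freqs[i])
    return None                       # no crossing found in this frame

freqs = [float(k) for k in range(41)]  # frequency bins
times = [float(k) for k in range(10)]  # successive time frames

contour = [
    contour_point([perpendicular_derivative(t, f) for f in freqs], freqs)
    for t in times
]
```

Because the crossing is located by interpolation, the recovered contour follows the modulated ridge between bin centres rather than jumping from one bin to the next.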
Fig 1: The Explore and Score Interface
Fig 2: A MultiTaper Sonogram of a bird song segment
Spectral derivatives of the same sound as in Fig 2. Here the frequency traces are more distinct. Since the derivatives are calculated in a different direction for each point, subtle modulations are also visible.
Fig 3: Spectral Derivatives of the same sound

Estimates of frequency and time derivatives of the spectrum may be robustly obtained using quadratic inverse techniques (Thomson, 1990, 1993).
These estimates have the general form:

An approximation of the above matrix is defined by:

Empirically:
Where:
is a directional derivative of the spectrogram in the time-frequency plane, the direction being specified by the angular parameter:
In particular, the time and frequency derivatives of the spectrogram may be obtained by setting: