Pitch-related measures


The term pitch describes the perceived tone of a sound (high, low, etc.). Quantitatively, pitch estimates are measures of the period of oscillation. Pitch is the only feature that requires careful adjustments depending on the species and acoustic structure. When the spectral structure is simple, as in a whistle, pitch can be estimated easily as the (only) peak in the power spectrum. The location of this peak can be assessed by one of two features: peak frequency, the frequency of highest power, or mean frequency, the center of gravity of the power spectrum. However, things get more complicated when the power spectrum is composed of several peaks.
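The two features can be computed directly from a sampled power spectrum. The following is a minimal Python sketch, not SAP code: the function names and the toy spectra are illustrative assumptions, and the spectrum is taken to be a list of power values at known frequencies.

```python
def peak_frequency(freqs, power):
    """Return the frequency of the bin with the highest power."""
    best = max(range(len(power)), key=lambda i: power[i])
    return freqs[best]


def mean_frequency(freqs, power):
    """Return the power-weighted mean frequency (center of gravity)."""
    total = sum(power)
    if total == 0:
        raise ValueError("power spectrum is all zeros")
    return sum(f * p for f, p in zip(freqs, power)) / total


freqs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]

# Toy whistle-like spectrum: one symmetric peak, so both features agree.
single_peak = [0.0, 1.0, 6.0, 1.0, 0.0]
print(peak_frequency(freqs, single_peak))  # 3000.0
print(mean_frequency(freqs, single_peak))  # 3000.0

# Toy spectrum with two separate peaks: the two features now disagree.
two_peaks = [4.0, 0.0, 0.0, 0.0, 6.0]
print(peak_frequency(freqs, two_peaks))  # 5000.0
print(mean_frequency(freqs, two_peaks))  # 3400.0
```

With a single peak the two features coincide. With several peaks they diverge, and neither one needs to match the perceived pitch.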

In the case of a pure sinusoid, all the energy is concentrated at one frequency and we see a sharp peak in the power spectrum:

[Figure: sine wave. Left: sound waveform. Right: power spectrum.]

However, a non-sinusoidal periodic signal appears as a harmonic series of peaks in the power spectrum. For example, if we create a short vector of random numbers, make several copies of that vector, concatenate the copies head to tail to create a long vector, and then compute the spectrum of that long vector, the result looks like this:

[Figure: harmonics]
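The construction described above can be reproduced in a few lines. This is a minimal Python sketch, not SAP code: the names `power_spectrum`, `period`, and `copies` are illustrative assumptions. A naive DFT is used so that the example needs only the standard library; an FFT routine would normally be used instead.

```python
import cmath
import random


def power_spectrum(x):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 (short signals only)."""
    n = len(x)
    # Precompute the N complex roots of unity used by the DFT.
    twiddle = [cmath.exp(-2j * cmath.pi * j / n) for j in range(n)]
    return [
        abs(sum(x[t] * twiddle[(k * t) % n] for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


random.seed(0)
period = 16   # length of the short random vector, in samples
copies = 8    # number of copies concatenated head to tail
short = [random.uniform(-1.0, 1.0) for _ in range(period)]
signal = short * copies

spec = power_spectrum(signal)

# Keep only bins that carry a non-negligible share of the power.
threshold = 1e-6 * max(spec)
active_bins = [k for k, p in enumerate(spec) if p > threshold]
print(active_bins)  # every entry is a multiple of `copies`
```

Bin k corresponds to frequency k * fs / N for sampling rate fs and N samples. The power is therefore confined to multiples of fs / period, which is the harmonic series of the repetition rate.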

 

In this case, the perceived pitch is usually the greatest common divisor of the frequencies of these peaks, called the fundamental frequency. Harmonic pitch is an estimate of the fundamental frequency of a complex sound composed of harmonically related frequencies.
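As a small worked example of this relation, the sketch below recovers the fundamental from a set of idealized peak frequencies. It assumes the peaks are exact integers in Hz, which real measurements are not. The function name `fundamental_from_peaks` is hypothetical and this is not how SAP estimates harmonic pitch.

```python
from functools import reduce
from math import gcd


def fundamental_from_peaks(peaks_hz):
    """Greatest common divisor of idealized integer peak frequencies (Hz)."""
    return reduce(gcd, peaks_hz)


# All harmonics present, including the fundamental itself.
print(fundamental_from_peaks([600, 1200, 1800]))   # 600

# The fundamental is absent from the spectrum, yet it is still recovered.
print(fundamental_from_peaks([1200, 1800, 2400]))  # 600
```

The second case shows why the fundamental cannot simply be read off as the lowest peak: the spacing of the harmonics determines it even when no energy is present at that frequency.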

 

The main challenge is to distinguish automatically between tonal pitch and harmonic pitch. In previous versions of SAP we used the cepstrum (the spectrum of the log spectrum) to detect harmonic pitch, and dynamically shifted our pitch estimate to the mean frequency when harmonic pitch was undetectable. The current version of SAP also includes the YIN method for fundamental frequency estimation (by Alain de Cheveigné), which is also used by the Michael Brainard lab to estimate pitch. In SAP2011 we use the term 'pitch' for our cepstrum-based estimate, and 'fundamental frequency' for the YIN estimate.
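To illustrate the cepstral idea (the spectrum of the log spectrum), here is a minimal Python sketch. It is not SAP's implementation and it does not implement YIN. The function names, the search range, the `floor` constant, and the synthetic test frame are all assumptions made for the example. The frame spans a whole number of periods and no window function is applied, and a naive DFT stands in for an FFT.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT; adequate for one short demonstration frame."""
    n = len(x)
    twiddle = [cmath.exp(-2j * cmath.pi * j / n) for j in range(n)]
    return [sum(x[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]


def cepstral_pitch(frame, fs, fmin, fmax, floor=1e-12):
    """Estimate pitch (Hz) from the largest cepstral peak in [fmin, fmax]."""
    # Log power spectrum; `floor` avoids taking log(0) in empty bins.
    log_power = [math.log(abs(c) ** 2 + floor) for c in dft(frame)]
    # The log power spectrum of a real signal is symmetric, so its DFT is
    # real and equals the inverse DFT up to a constant scale factor.
    cepstrum = [c.real for c in dft(log_power)]
    # Convert the frequency search range to a quefrency range in samples.
    q_lo = int(math.ceil(fs / fmax))
    q_hi = int(math.floor(fs / fmin))
    best_q = max(range(q_lo, q_hi + 1), key=lambda q: cepstrum[q])
    return fs / best_q


fs = 8000     # sampling rate in Hz (assumed)
f0 = 200      # fundamental of the synthetic harmonic stack, in Hz
n = 400       # frame length: exactly 10 periods of f0
frame = [
    sum(math.sin(2 * math.pi * f0 * m * t / fs) / m for m in range(1, 9))
    for t in range(n)
]

print(cepstral_pitch(frame, fs, fmin=150.0, fmax=1000.0))  # 200.0
```

The regularly spaced harmonics form a periodic pattern in the log spectrum, and the cepstrum turns that periodicity into a single peak at the period of the fundamental. The search range matters: the same pattern also produces peaks at integer multiples of that period, so a range that is too wide can return a subharmonic.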

In general, pitch is difficult to estimate. Nevertheless, it is a central feature of song, and with careful adjustments one can often obtain a good estimate. Furthermore, when we calculate the mean pitch of a syllable, we weight each time window by its goodness of pitch, which often stabilizes the mean pitch estimate for that syllable type.
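The effect of this weighting can be sketched as a plain weighted average. The exact formula SAP uses is not given here, so the code below is an assumption: goodness values are treated as non-negative weights, and the function name and sample numbers are illustrative.

```python
def weighted_mean_pitch(pitch, goodness):
    """Mean pitch over time windows, weighted by goodness of pitch."""
    if len(pitch) != len(goodness):
        raise ValueError("pitch and goodness must have the same length")
    total = sum(goodness)
    if total <= 0:
        raise ValueError("goodness weights must sum to a positive value")
    return sum(p * g for p, g in zip(pitch, goodness)) / total


# Four time windows of one syllable; the third is a spurious estimate
# that comes with zero goodness of pitch.
pitch = [600.0, 610.0, 3000.0, 590.0]
goodness = [1.0, 1.0, 0.0, 1.0]

print(sum(pitch) / len(pitch))               # 1200.0 (unweighted mean)
print(weighted_mean_pitch(pitch, goodness))  # 600.0
```

Windows with poor goodness of pitch contribute little to the result, so a single bad frame no longer pulls the syllable mean away from the typical value.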