Exploring similarity across features


The global similarity estimate is based on five acoustic features: pitch, FM, AM, goodness of pitch, and Wiener entropy. You can view the similarity for each feature separately by clicking one of the buttons in the similarity display group. As noted earlier, asymmetric similarity is estimated in two stages. First, global comparisons across 70 ms intervals are used to threshold the match and detect similarity sections. Then similarity is estimated locally, based on frame-by-frame scores. Both local and global distances can be viewed, and these views help you assess what might have gone wrong when the similarity results do not seem reasonable. For example, you might discover that pitch is not properly estimated, in which case the display will show similarity across all features except pitch.
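
The two stages described above can be illustrated with a small sketch for a single feature. It is an illustration under stated assumptions, not the program's actual computation: the function names, the RMS interval distance, the window length in frames, and the threshold value are all hypothetical, and the feature values are assumed to be already scaled.

```python
import numpy as np

def interval_distance(a, b, window):
    """Coarse (global) stage: entry (i, j) is the RMS difference between the
    `window`-frame interval starting at frame i of a and the one starting at
    frame j of b. RMS is an assumed distance, chosen for illustration."""
    wa = np.lib.stride_tricks.sliding_window_view(a, window)
    wb = np.lib.stride_tricks.sliding_window_view(b, window)
    return np.sqrt(np.mean((wa[:, None, :] - wb[None, :, :]) ** 2, axis=2))

def two_stage_scores(a, b, window, threshold):
    """Return frame-by-frame distances only where the interval comparison
    passed the threshold; all other entries are NaN."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Stage 1: threshold the interval distances to find candidate sections.
    coarse = interval_distance(a, b, window)
    passed = coarse <= threshold
    # Stage 2: frame-by-frame distance at the first frame of each interval pair.
    n, m = coarse.shape
    fine = np.abs(a[:n, None] - b[None, :m])
    return np.where(passed, fine, np.nan), passed

# Toy feature trace (hypothetical values): high, then flat, then high again.
trace = np.concatenate([np.ones(20), np.zeros(40), np.ones(20)])
scores, passed = two_stage_scores(trace, trace, window=50, threshold=0.1)
```

In this toy case the aligned intervals along the diagonal pass the threshold and receive a local score, whereas an interval pair such as (0, 30) is rejected at the first stage and never scored locally.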

The effect of global versus local estimates can be seen in the example below, which shows the FM and AM partial similarities on local and global scales. Locally, FM shows a similar area in the middle of the matrix, where the two sounds are not modulated, and four bulges emerging from the corners of this central rectangle. The bulges are the similarities between the modulated parts of the syllable. Since the syllable is frequency modulated at both its onset and its offset, the beginning and end parts are similar to each other. Now look at the global similarity and note how the rectangle has turned into a diagonal line, which captures the similarity in the transitions from high to low to high FM. In addition, we see short sidebands, indicating the shorter-scale similarity between the beginning of one syllable and the end of the other. Now examine the partial similarity of AM. Here the local similarity shows no similarity between the beginning of one sound and the end of the other, but it does show strong similarity between the two beginnings and between the two ends. This is because the sign of the amplitude modulation is positive at the onset and negative at the offset of each sound. Hence, the global AM matrix has no sidebands.
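
The onset-versus-offset argument can be checked on toy traces. The sketch below uses made-up feature values, a hypothetical 20-frame window, and an assumed RMS interval distance; it is not the program's computation and it only reproduces the sign argument, not the full shape of the matrices in the figure.

```python
import numpy as np

def local_distance(a, b):
    """Frame-by-frame distance: entry (i, j) compares frame i of a with frame j of b."""
    return np.abs(a[:, None] - b[None, :])

def global_distance(a, b, window):
    """Interval distance: RMS difference between `window`-frame intervals
    starting at frame i of a and frame j of b (an assumed distance)."""
    wa = np.lib.stride_tricks.sliding_window_view(a, window)
    wb = np.lib.stride_tricks.sliding_window_view(b, window)
    return np.sqrt(np.mean((wa[:, None, :] - wb[None, :, :]) ** 2, axis=2))

# Hypothetical 80-frame syllable: 20 frames onset, 40 frames flat, 20 frames offset.
# FM is high at both onset and offset; AM is positive at onset, negative at offset.
fm = np.concatenate([np.ones(20), np.zeros(40), np.ones(20)])
am = np.concatenate([np.ones(20), np.zeros(40), -np.ones(20)])

fm_local, am_local = local_distance(fm, fm), local_distance(am, am)
fm_global, am_global = global_distance(fm, fm, 20), global_distance(am, am, 20)

# Onset of one sound against offset of the other (frames 0 and 79,
# or intervals starting at frames 0 and 60).
print("FM local :", fm_local[0, 79], " FM global:", fm_global[0, 60])
print("AM local :", am_local[0, 79], " AM global:", am_global[0, 60])
```

For FM the onset-versus-offset distance is zero on both scales, which corresponds to the sidebands; for AM it is large on both scales because the sign flips, while onset against onset and offset against offset still match.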
Overall, the message is that comparing similarity across different features captures different aspects of the similarity. By taking all of these features into account, we can often obtain a reasonable overall assessment of how good the similarity is, and we might also develop some understanding of the meaningful articulatory variables that are similar or different across the two sounds. However, the similarity may be good in some features and poor in others, and in such cases it may be appropriate to omit some features from the global estimate (this is not something you want to do just in order to obtain a better match!). In the options (similarity tab), you can set different scales and exclude features: