Sound data: Wave files & features tables

 


The Wave Files
All the raw data generated by SAP2011 are saved as wave files of the following file naming format:
When multiple channels are recorded (in a master-slave mode), e.g., when recording electrophysiology data, the master channel (typically the sound) has the name annotation shown above, and the slave channels have the same name as the master but start with Sn~, where n is the channel number (e.g., S3~109_39013.... is the third slave channel of this bird, in this case an electrode located in RA).
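As a rough illustration of this naming convention, the following Python sketch separates a slave-channel prefix from the master file name. It assumes only what is stated here, namely that slave files start with Sn~ followed by the master name. The function name and the sample file names are hypothetical and are not part of SAP2011.

```python
import re

# Assumed pattern: "S", the channel number, "~", then the master file name.
SLAVE_PREFIX = re.compile(r"^S(\d+)~(.+)$")

def split_channel(file_name):
    """Return (channel_number, master_name); channel_number is None for a master file."""
    match = SLAVE_PREFIX.match(file_name)
    if match is None:
        return None, file_name          # no Sn~ prefix: treat as the master channel
    return int(match.group(1)), match.group(2)

# Hypothetical file names, for illustration only.
slave = split_channel("S3~example_master.wav")
master = split_channel("example_master.wav")
```

Grouping files by the returned master name would then collect a master recording together with all of its slave channels.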

When SAP2011 records and processes data from a certain animal (live recording), it identifies vocalization events (e.g., a song bout) and saves each one into a wave file. Over a few days of recording, the number of these files might reach tens of thousands. Files are automatically organized into daily folders. The name of each folder is the age of the bird, e.g., the file is from bird 111 when 62 days old. The age is calculated by subtracting the hatch date of the bird from the date of the file (the hatch date is automatically retrieved from the Settings table).
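The age calculation can be sketched in a few lines of Python: the age in days is the date of the file minus the hatch date. The function name and the two dates below are made up for illustration; in SAP2011 the hatch date comes from the Settings table.

```python
from datetime import date

def age_in_days(file_date, hatch_date):
    """Age of the bird, in days post hatch, on the day the file was recorded."""
    return (file_date - hatch_date).days

# Hypothetical dates, chosen so that the bird is 62 days old.
hatch_date = date(2011, 1, 1)
file_date = date(2011, 3, 4)
folder_name = str(age_in_days(file_date, hatch_date))
```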

Having many relatively short files could be difficult to maintain, but SAP2011 provides, in addition to the folder and name annotation, an important utility called the File Table. The File Table contains information about all the files recorded for a certain bird, and it can be used to query an arbitrary subset of the data, e.g., the first 100 files of the morning. This is why it is convenient to keep the SAP2 raw data in small packages.
The Syllable Table
The syllable table is perhaps the most useful table that SAP2011 generates (for example, see DVD maps). What makes it so useful is that it captures much of the dynamics of vocal changes at the syllabic level during an entire development or an entire experiment, all in a single file. It allows you to cluster syllables into types and watch how syllable types 'move' in their feature space until model match is achieved, and it provides an extensive summary of the features of each syllable, namely a summary of its structure. In fact, you already created a syllable table in the previous chapter and exported it to Excel.
recnum is the global index (and the primary key) of the syllable table. SAP2 uses recnum to retrieve data epochs from the table, e.g., when displaying a DVD map. Also, recnum tells you in what order the syllables were originally inserted into the table, and this order should match the order of the serial number (the date & time 'age' stamp) of each syllable.

serial_number is the date & time stamp of each syllable, and it is identical to the age of the sound (wave) file that the syllable belongs to. As in the file name encoding, the integer part is the number of days elapsed since 1900. The fraction, however, is not the milliseconds elapsed since midnight but the fraction of the day elapsed (both numbers are given by the FileAge Windows function).
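To make the two parts of serial_number concrete, here is a minimal Python sketch that splits a value into its day count and its time of day. It relies only on the description above (integer part = days, fraction = fraction of the day elapsed). It deliberately does not convert the day count into a calendar date, since that requires the exact starting day of the count. The function name and the sample value are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def split_serial_number(serial_number):
    """Return (day_count, (hours, minutes, seconds)) for a serial_number value."""
    day_count = int(serial_number)                 # whole days
    fraction = serial_number - day_count           # fraction of the day elapsed
    seconds = round(fraction * SECONDS_PER_DAY)    # seconds since midnight
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return day_count, (hours, minutes, secs)

# Made-up value: three quarters of the way through day 40000.
day_count, time_of_day = split_serial_number(40000.75)
```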

bird_ID is a number that identifies the bird. In Explore & Score it is zero by default, but in the Live mode it corresponds to the bird's name. In Batch mode, the user is always asked to provide a bird ID.

start_on is the time (in milliseconds) within the sound file where the syllable begins. It is an important feature that allows SAP2 (or you) to accurately identify the syllable in the raw data. For example, SAP2 can perform batch similarity measurements within and across syllable types, automatically retrieving sound files and outlining a specific syllable according to start_on and duration.
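If you want to locate a syllable in the raw data yourself, start_on and duration can be converted to sample positions. The Python sketch below assumes that duration is also given in milliseconds and that the file was sampled at 44,100Hz. The function name and the example values are hypothetical.

```python
def syllable_sample_range(start_on_ms, duration_ms, sample_rate=44_100):
    """Return (first, last) sample indices of a syllable; last is exclusive."""
    first = round(start_on_ms * sample_rate / 1000)
    last = round((start_on_ms + duration_ms) * sample_rate / 1000)
    return first, last

# Made-up syllable: starts 100 ms into the file and lasts 50 ms.
first, last = syllable_sample_range(start_on_ms=100, duration_ms=50)
```

With the samples of the wave file loaded into a sequence, samples[first:last] would then hold the syllable.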

The first and second order statistics fields of the features are duration, the mean of each feature, the minimum and maximum of each feature, and the variance of each feature.

Date & time fields: albeit redundant, it is often convenient to have date and time (month, day of month, hour, etc.) represented in separate fields, so as to allow simple querying of data subsets.

file_name is the name of the sound file that the syllable belongs to.
The Raw Features Table
SAP2011 performs spectral analysis and calculates acoustic (and other) features across partially overlapping spectral (FFT) windows. By default those windows are 9.27ms long and we advance the window 1ms in each step. Namely, we start from calculating features in the first interval of the sound file (1-9.27ms), then we calculate features in the interval 2-10.27, 3-11.27, and so forth. The raw features table presents those raw features as records. Let's make an example: open SAP2011, click “Explore & Score” and then, at the bottom right, check “save raw features”. In the popup window type “raw_features1” and click Ok. Now click “open sound” and open the sound “example1.wav”. This sound is about 820ms long. Next open the MySQL Control Center (if it is already open, right click the database SAP and click “refresh”). The new table you just created, raw_features1, should show up in the list of tables in SAP, and it should include 812 records, one for each 1ms in the file excluding edges (note that the number of records is determined by the “advance window”, not by the FFT window size!). Double click this table and you should see a display like this:
Note that the 'time' and 'file_index' fields have a little key above them. Together, those fields are the 'primary key' of the raw features table. Primary keys must be unique, namely the table cannot include a duplicate of the same time within the same file_index (each file_index identifies a wave file, and indeed, the same time should not occur twice within a file). The primary key is an index used by MySQL to sort the table and to accelerate access to different locations in the table.
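The behavior of such a two-field primary key can be tried out in a small Python sketch. To stay self-contained it uses the built-in sqlite3 module as a stand-in for MySQL. The pitch column, the order of the key fields and all values are assumptions made for illustration, not the actual SAP2011 table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute(
    "CREATE TABLE raw_features1 ("
    " time INTEGER NOT NULL,"
    " file_index INTEGER NOT NULL,"
    " pitch INTEGER,"                      # hypothetical feature column
    " PRIMARY KEY (file_index, time))"     # the pair must be unique
)

conn.execute("INSERT INTO raw_features1 VALUES (391740010, 1, 600)")
# The same time in a different file is allowed.
conn.execute("INSERT INTO raw_features1 VALUES (391740010, 2, 610)")

try:
    # The same time in the same file violates the primary key.
    conn.execute("INSERT INTO raw_features1 VALUES (391740010, 1, 620)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM raw_features1").fetchone()[0]
```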

We will now go through the fields:
Time: displays the time of day in units of 0.1 milliseconds elapsed since midnight. For example, the first number, 391,740,010, can be divided by 864,000,000 (the number of 0.1 millisecond units in each day) to obtain the fraction of the day elapsed, about 0.45. So in units of hours (24 x 0.45) the time is about 10.8 (roughly 10:48AM). In other words, we are saying that this sound was recorded at about 10:48AM. How can we tell? In this case (the default mode of the “Explore & Score” module) the time was retrieved by the Windows API utility FileAge(). FileAge() has the advantage of returning the time when the original file was created. It remains unchanged in any copy of the original file, which is nice. There are two issues you should keep in mind regarding FileAge():

1. If your original data are not wave files (e.g., mat files), then the generated wave files will carry a meaningless time stamp. In such cases, the solution is to generate file names with the appropriate annotation (see section 6b) and then instruct SAP2 to extract “time” from the file name.
To change the method SAP2 uses to extract time information from wave files, go to options -> input & output settings, and check the appropriate “extract time stamp” option.

2. The Windows time stamp is only accurate to about 2 seconds. That is, our time estimate can be very accurate within each file, but across files we have considerable time jitter. In the SAP2011 recorder, we overcame this limitation by implementing an accurate millisecond counter. The accurate millisecond count is included in the file names generated by the recorder, and live sound processing then uses this information to produce raw features tables of 1ms accuracy across files. Note that raw features tables generated using the new Recorder are indistinguishable in their time format from raw features tables generated using Explore & Score or Batch; it is the user's responsibility to keep track of the cross-file accuracy of the time field in any given data. For example, all the song development data generated in our lab were, unfortunately, created prior to SAP2, and therefore our cross-file time field is of 2 second accuracy and there is nothing we can do about it.
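The arithmetic used above to read the Time field can be written out as a short Python sketch. It assumes only that the field counts 0.1 ms units since midnight, as stated in the text; the function name is hypothetical. Without rounding the fraction, the example value 391,740,010 comes out a few minutes later than the rounded 10:48AM figure, at about 10:52:54.

```python
UNITS_PER_SECOND = 10_000                     # 0.1 ms units in one second
UNITS_PER_DAY = 86_400 * UNITS_PER_SECOND     # 864,000,000 units in one day

def decode_time_field(units):
    """Return (fraction_of_day, hours, minutes, seconds) for a Time value."""
    fraction_of_day = units / UNITS_PER_DAY
    whole_seconds, leftover_units = divmod(units, UNITS_PER_SECOND)
    hours, remainder = divmod(whole_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return fraction_of_day, hours, minutes, seconds + leftover_units / UNITS_PER_SECOND

fraction_of_day, hours, minutes, seconds = decode_time_field(391_740_010)
```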

Note that the raw features table does not contain any information about the date, only the time of day. The file_index field (see below) allows you to retrieve this information if needed; however, in most cases you will not need to. When the raw features table is generated in “live recording”, SAP2 creates one raw features table per day for each bird. The (automatically generated) name of the daily table (e.g., b109_76) will tell you the identity of the bird and its age (bird 109 on day 76 post hatch).
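Reading the bird and its age back from a daily table name is straightforward. The Python sketch below assumes that every daily table follows the pattern of the b109_76 example (the letter b, the bird number, an underscore, the age in days); the function name is hypothetical.

```python
import re

def parse_daily_table_name(table_name):
    """Return (bird_id, age_in_days) from a daily table name such as 'b109_76'."""
    match = re.fullmatch(r"b(\d+)_(\d+)", table_name)
    if match is None:
        raise ValueError(f"not a daily raw features table name: {table_name!r}")
    return int(match.group(1)), int(match.group(2))

bird_id, age = parse_daily_table_name("b109_76")
```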

The file_index field points to an automatically generated File Table (see below), which provides several details about the wave file, so as to make it easy to link the features to the sound data.

The features fields: In order to minimize storage space (Raw Features Tables are huge) and decrease computation time (for some features), we encoded some of the features in units that allow efficient storage and quick calculations. Here is a list of those encodings together with their decoding (namely, the procedure that will transform the features back to their appropriate units):

Feature                      Original units     Raw Features units     Decoding
Amplitude                    dB                 dB                     None
Mean Frequency Amplitude     dB                 dB                     None
Pitch                        Hz                 Hz                     None
Mean Frequency               Hz                 Hz                     None
FM                           Degrees (0-90)     Degrees x 10           /10
AM                           1/t                1/t x 100              /100
Goodness of pitch            None               None                   None
Wiener entropy               None               x 100                  /100
Peak frequency               Hz                 Hz                     -
DAS                          ms                 ms                     -
Continuity over time         ms                 ms x 100               /100
Continuity over frequency    Hz                 Hz x 100               /100
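The decoding column amounts to dividing a few of the stored values by a constant. A minimal Python sketch of that step follows. The divisors are taken from the table above, but the dictionary keys are made-up labels rather than the actual column names of the raw features table, and the sample record is invented.

```python
# Divisors from the decoding column; features not listed need no decoding.
DECODING_DIVISORS = {
    "FM": 10,
    "AM": 100,
    "entropy": 100,
    "continuity_t": 100,
    "continuity_f": 100,
}

def decode_record(record):
    """Return a copy of the record with encoded features back in their original units."""
    decoded = {}
    for name, value in record.items():
        divisor = DECODING_DIVISORS.get(name)
        decoded[name] = value if divisor is None else value / divisor
    return decoded

# Invented record: pitch is stored as is, FM and entropy are stored encoded.
decoded = decode_record({"pitch": 612, "FM": 453, "entropy": -215})
```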


A wave file of, say, 10s contains 441,000 samples of sound (the sampling rate is 44,100Hz). Each sample is a 16 bit number. When we analyze this file and save a raw features table, the number of records is only 10,000 (this is the number of milliseconds of sound we analyzed). However, each record contains several numbers, and therefore keeping the number of bits per record reasonably low can save much storage space. The field types we chose, in addition to some simple encoding of units, reduce the size of the raw features tables to about one third of the raw data, as described below.