Bengalese Finch Song Repository#
About this dataset#
This is a collection of song from four Bengalese finches recorded in the Sober lab at Emory University. The song has been annotated manually by two of the authors.
We share this collection as a means of testing different machine learning algorithms for classifying the elements of birdsong, known as syllables.
The dataset is housed on Figshare here: https://figshare.com/articles/dataset/Bengalese_Finch_song_repository/4805749. This site is provided as documentation.
Data collection#
This dataset consists of song from four adult (>100-day-old) male Bengalese finches (Lonchura striata var. domestica). Any file associated with one of the four birds is prefixed with that bird’s ID: “bl26lb16”, “gr41rd51”, “gy6or6”, “or60yw70”. Song was recorded from birds as baseline recordings for behavioral experiments (not included in this dataset). Birds were isolated in sound-attenuating chambers and maintained on a 14 h:10 h light/dark cycle, with lights on from 7 A.M. to 9 P.M. All recordings analyzed here are from undirected song (i.e., no other bird was present). All procedures were approved by the Emory University Institutional Animal Care and Use Committee.
Recordings were made with a lavalier microphone, fed into a preamplifier connected to a National Instruments DAQ board. An automated data acquisition program written in LabView (“evTAF”) detected when the audio amplitude stayed above a threshold for a target period, which triggered an automatic recording that continued until the amplitude dropped below threshold again for another fixed period. The files thus recorded were then sorted with a separate program so that the final dataset consisted only of audio files that contained bouts of song.
Annotation#
After the audio files were sorted so that they all contained bouts of song, the song was annotated. Annotation was done by two of the authors (David Nicholson and Jonah Queen) using a custom MATLAB GUI application (“evsonganaly”, originally written by Evren Tumer). Audio was segmented into syllables by setting an amplitude threshold. Segments were post-processed by setting a minimum syllable duration and a minimum duration for the silent gap (the brief quiet period between syllables). Silent gaps less than the minimum duration were removed so that the two neighboring segments merged, and syllables less than the minimum duration were thrown out. A Python implementation of the segmenting algorithm used by evsonganaly can be found here.
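To make these post-processing steps concrete, the sketch below shows the general threshold-then-clean-up logic in NumPy. It is only an illustration of the approach described above, not the evsonganaly or evfuncs implementation; the function name, its arguments, and the assumption that a smoothed amplitude envelope has already been computed are ours.

import numpy as np

def segment_syllables(amp, samp_freq, threshold, min_syl_dur, min_silent_dur):
    """Illustrative threshold segmentation with post-processing.

    amp : smoothed amplitude envelope (1-D array)
    samp_freq : sampling rate in Hz
    threshold : amplitude threshold, in the same units as amp
    min_syl_dur, min_silent_dur : minimum durations, in seconds
    """
    above = amp > threshold
    crossings = np.diff(above.astype(int))
    onsets = np.nonzero(crossings == 1)[0] + 1
    offsets = np.nonzero(crossings == -1)[0] + 1
    # handle audio that starts or ends above threshold
    if above[0]:
        onsets = np.insert(onsets, 0, 0)
    if above[-1]:
        offsets = np.append(offsets, above.size)
    if onsets.size == 0:
        return np.array([]), np.array([])

    onsets_s, offsets_s = onsets / samp_freq, offsets / samp_freq

    # merge neighboring segments when the silent gap between them
    # is shorter than the minimum silent gap duration
    keep_on, keep_off = [onsets_s[0]], []
    for next_on, prev_off in zip(onsets_s[1:], offsets_s[:-1]):
        if next_on - prev_off >= min_silent_dur:
            keep_off.append(prev_off)
            keep_on.append(next_on)
    keep_off.append(offsets_s[-1])
    onsets_s, offsets_s = np.array(keep_on), np.array(keep_off)

    # throw out syllables shorter than the minimum syllable duration
    long_enough = (offsets_s - onsets_s) >= min_syl_dur
    return onsets_s[long_enough], offsets_s[long_enough]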
After segmenting the audio into syllables separated by silent gaps, labels were applied to syllables using the GUI. A schema was created for each individual bird’s song (shown as images above and included as .tif files in the dataset). This schema consisted of syllable labels, chosen based on the acoustic structure of syllables as visualized in spectrograms, as well as their sequential context. The schema was typically settled on after labeling a few songs by hand. Syllable labels are arbitrary across birds, i.e., use of the label ‘e’ for a syllable in the song of bird gy6or6 and use of the same label ‘e’ for another syllable in the song of bird bl26lb16 is not meant to indicate that there is any resemblance between these two syllables. (In Bengalese finches, song can vary widely across individuals.) The one exception is that the label ‘i’ is usually used to annotate the squawky, low-amplitude, high-entropy syllables often referred to as “introductory notes”. (See for example https://royalsocietypublishing.org/doi/full/10.1098/rspb.2020.2796.)
While labeling the entire set of songs for each individual bird, any syllables that did not clearly fit into the schema were annotated with other characters that were not part of the set chosen for the schema, so these syllables could be identified later. Some audio files contained contact calls that birds produced before or after song. In some cases (when the bird emitted contact calls frequently) these were annotated, again with a character that was not in the set chosen for the labeling schema.
Usage#
To make it easy to work with the dataset, we have created a Python package, evfuncs, available at soberlab/evfuncs. You can install it with pip or conda:
pip install evfuncs
conda install evfuncs -c conda-forge
How to work with the files is described in the README of that library, but we briefly describe the file types here. The actual sound files have the extension .cbin and were created by EvTAF, an application that runs behavioral experiments and collects data. Each .cbin file has an associated annotation file with the extension .not.mat, created by a GUI for song annotation called evsonganaly, that contains song syllable onsets, offsets, labels, etc. Each .cbin file also has an associated .rec file, also created by EvTAF. The .rec files are included only because they contain the sampling rate, which evfuncs returns along with the raw audio from a .cbin file via its load_cbin function.
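As a quick sketch of what working with these files looks like (the filename here is hypothetical; the load_cbin and load_notmat functions are described in the evfuncs README):

import evfuncs

# load the raw audio and the sampling rate; the sampling rate
# is read from the .rec file associated with the .cbin file
rawsong, samp_freq = evfuncs.load_cbin('gy6or6_baseline_220312_0836.3.cbin')

# load the annotations from the associated .not.mat file
notmat_dict = evfuncs.load_notmat('gy6or6_baseline_220312_0836.3.cbin.not.mat')
labels = notmat_dict['labels']
onsets = notmat_dict['onsets']    # syllable onset times
offsets = notmat_dict['offsets']  # syllable offset times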
New audio and annotation format#
We have now added separate .tar.gz archives that contain the audio files in .wav format and the annotations in .csv format. The addition of a widely-used audio format makes it easier to load the audio files, e.g. with pysoundfile: https://pysoundfile.readthedocs.io/en/latest/. The annotation files are in the ‘simple-seq’ format and can easily be worked with in Python using the crowsetta library (https://crowsetta.readthedocs.io/en/latest/formats/seq/simple-seq.html#simple-seq) or, if you prefer, with any library that can read .csv files, such as pandas: https://pandas.pydata.org/.
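For example, one minimal way to load a .wav file and its annotations with those libraries (the filenames here are hypothetical; see the crowsetta documentation for the exact column names used by the ‘simple-seq’ format):

import pandas as pd
import soundfile as sf

# load the audio as a NumPy array along with its sampling rate
audio, samplerate = sf.read('gy6or6_baseline_220312_0836.3.wav')

# load the 'simple-seq' annotations: one row per syllable,
# with onset and offset times and a label
annots = pd.read_csv('gy6or6_baseline_220312_0836.3.wav.csv')
print(annots.head())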
Scripts to build the dataset and source code for this site#
Scripts used to organize the dataset, as well as the source code for this site, can be found at: NickleDave/bfsongrepo. These may provide helpful examples for working with the data and/or creating a similar project.
Download#
Through the terminal, using a script#
Choose the audio + annotation format you wish to use, then paste the corresponding command below into a terminal (hover the mouse cursor over the text for a copy button).
.wav audio + .csv annotations#
On macOS or Linux:
curl -sSL https://raw.githubusercontent.com/NickleDave/bfsongrepo/main/src/scripts/download_dataset.py | python3 - --audio-annot-type wav-csv
On Windows (PowerShell):
(Invoke-WebRequest -Uri https://raw.githubusercontent.com/NickleDave/bfsongrepo/main/src/scripts/download_dataset.py -UseBasicParsing).Content | py - --audio-annot-type wav-csv
.cbin audio + .not.mat annotations#
On macOS or Linux:
curl -sSL https://raw.githubusercontent.com/NickleDave/bfsongrepo/main/src/scripts/download_dataset.py | python3 - --audio-annot-type cbin-notmat
On Windows (PowerShell):
(Invoke-WebRequest -Uri https://raw.githubusercontent.com/NickleDave/bfsongrepo/main/src/scripts/download_dataset.py -UseBasicParsing).Content | py - --audio-annot-type cbin-notmat
Citation#
The dataset can be cited as follows:
Nicholson, D., et al. Bengalese Finch Song Repository. 7, figshare, 18 Oct. 2017, doi:10.6084/m9.figshare.4805749.v7.
Nicholson, D., Queen, J. E., & Sober, S. J. (2017). Bengalese Finch song repository (Version 7). figshare. https://doi.org/10.6084/m9.figshare.4805749.v7
@article{Nicholson2022,
    author = "David Nicholson and Jonah E. Queen and Samuel J. Sober",
    title = "{Bengalese Finch song repository}",
    year = "2022",
    month = "9",
    url = "https://figshare.com/articles/dataset/Bengalese_Finch_song_repository/4805749",
    doi = "10.6084/m9.figshare.4805749.v7"
}
Please be sure to include the DOI of the version you use with your citation.
If you are developing machine learning algorithms, we ask that you cite other relevant publications and software (see below) and also consider benchmarking against related algorithms. Our impression is that it will require a community of researchers working together to advance the state of the art in this area.
Cited by#
The following works use and/or cite this dataset. Works are listed in order of publication. If you are aware of a scholarly work that uses this dataset that is not cited here, please feel free to contact the maintainer. See Contact below.
Nicholson, D. (2016, July). Comparison of machine learning methods applied to birdsong element classification. In Proceedings of the 15th Python in Science Conference (Vol. 57, p. 61). https://conference.scipy.org/proceedings/scipy2016/david_nicholson.html
Sainburg, T., Theilman, B., Thielk, M., & Gentner, T. Q. (2019). Parallels in the sequential organization of birdsong and human speech. Nature Communications, 10(1), 1-11. https://www.nature.com/articles/s41467-019-11605-y
Sainburg, T., Thielk, M., & Gentner, T. Q. (2020). Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires. PLoS Computational Biology, 16(10), e1008228. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008228. Preprint: https://www.biorxiv.org/content/10.1101/870311v1.full.pdf
Steinfath, E., Palacios-Muñoz, A., Rottschäfer, J. R., Yuezak, D., & Clemens, J. (2021). Fast and accurate annotation of acoustic signals with deep neural networks. eLife, 10, e68837. https://elifesciences.org/articles/68837. Preprint: https://www.biorxiv.org/content/biorxiv/early/2021/03/29/2021.03.26.436927.full.pdf
Cohen, Y., Nicholson, D. A., Sanchioni, A., Mallaber, E. K., Skidanova, V., & Gardner, T. J. (2022). Automated annotation of birdsong with a neural network that segments spectrograms. eLife, 11, e63853. https://elifesciences.org/articles/63853. Preprint: https://www.biorxiv.org/content/biorxiv/early/2020/10/13/2020.08.28.272088.full.pdf
Linhart, P., Mahamoud-Issa, M., Stowell, D., & Blumstein, D. T. (2022). The potential for acoustic individual identification in mammals. Mammalian Biology, 1-17. https://link.springer.com/article/10.1007/s42991-021-00222-2. https://blumsteinlab.eeb.ucla.edu/wp-content/uploads/sites/104/2022/04/Linhart_etal_2022_MammBiol.pdf.
Contact#
Please feel free to contact David Nicholson (nicholdav at gmail dot com) with questions and feedback.
Acknowledgements#
This work was supported by National Institutes of Health National Institute of Neurological Disorders and Stroke grants R01 NS084844 and F31 NS089406.