Learning speech recognition from songbirds

Yildiz, Izzet B; von Kriegstein, Katharina; Kiebel, Stefan J

doi:10.1186/1471-2202-14-S1-P210

Volume 14 Supplement 1

Abstracts from the Twenty Second Annual Computational Neuroscience Meeting: CNS*2013

Poster presentation
Open access
Published: 08 July 2013

Izzet B Yildiz¹,
Katharina von Kriegstein¹ &
Stefan J Kiebel¹,²

BMC Neuroscience volume 14, Article number: P210 (2013)

Our knowledge of the computational mechanisms underlying human learning and recognition of speech is still very limited [1]. One difficulty in deciphering how humans recognize speech is that experimental findings at the neuronal, microscopic level are scarce. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at a different species, the songbird, which faces the same challenge as humans: to learn and decode, in an online fashion, complex auditory input partitioned into sequences of syllables [2]. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level [3, 4], we assumed that the human brain uses the same computational principles at the microscopic level and translated a birdsong model [5] into a model of human speech learning and recognition. The model performs a Bayesian version of dynamical predictive coding [6] based on an internal generative model of how speech dynamics are produced. This generative model consists of a two-level hierarchy of recurrent neural networks, similar to the song production hierarchy of songbirds [7]. In this predictive coding scheme, predictions about the future trajectory of the speech stimulus are formed dynamically from a learned repertoire and the ongoing stimulus. The hierarchical inference uses top-down and bottom-up messages, which aim to minimize an error signal, the so-called prediction error.
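
The recognition principle described above (predict the stimulus from a learned repertoire, then use the prediction error to correct the estimate) can be illustrated with a deliberately small Python sketch. It is not the model from the abstract: it swaps the Bayesian inference and the two-level hierarchy of recurrent networks for a single-level linear oscillator with a fixed error gain. The names `REPERTOIRE`, `GAIN`, `speak`, `accumulated_error`, and `recognize`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical repertoire: each "word" is the rotation rate (radians per step)
# of a two-dimensional oscillator. Values are illustrative assumptions.
REPERTOIRE = {"word_a": 0.3, "word_b": 0.7}
GAIN = 0.5  # assumed fixed weight given to the prediction error


def rotate(state, theta):
    """Advance the oscillator state by one time step."""
    x, v = state
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * v, s * x + c * v)


def speak(theta, n_steps, noise_sd=0.0, seed=0):
    """Generate an observed trajectory from one generative model."""
    rng = random.Random(seed)
    state, samples = (1.0, 0.0), []
    for _ in range(n_steps):
        samples.append(state[0] + rng.gauss(0.0, noise_sd))
        state = rotate(state, theta)
    return samples


def accumulated_error(stimulus, theta):
    """Track the stimulus online and sum the squared prediction errors."""
    estimate, total = (0.0, 0.0), 0.0
    for y in stimulus:
        error = y - estimate[0]  # bottom-up prediction error
        total += error * error
        corrected = (estimate[0] + GAIN * error, estimate[1])
        estimate = rotate(corrected, theta)  # top-down prediction of next sample
    return total


def recognize(stimulus, repertoire=REPERTOIRE):
    """Return the repertoire entry whose predictions fit the stimulus best."""
    errors = {name: accumulated_error(stimulus, theta)
              for name, theta in repertoire.items()}
    return min(errors, key=errors.get)
```

Calling `recognize(speak(REPERTOIRE["word_a"], 200))` compares the two candidate models on the same stimulus. The candidate whose dynamics match the stimulus accumulates only a short initial transient of error, whereas the mismatched candidate keeps producing error at every step.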

We show that the resulting neurobiologically plausible model can learn words rapidly and recognize them robustly, even in adverse conditions. The model can also deal with variations in speech rate and with competition from multiple speakers. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents, an everyday situation in which current state-of-the-art speech recognition models often fail. We use the model to provide computational explanations for inter-individual differences in accent adaptation, as well as for age-of-acquisition effects in second language learning. For the latter, we qualitatively modeled behavioral results from an experimental study [8].

References

1. Hickok G, Poeppel D: Opinion - The cortical organization of speech processing. Nat Rev Neurosci. 2007, 8 (5): 393-402. 10.1038/nrn2113.
2. Prather JF, Nowicki S, Anderson RC, Peters S, Mooney R: Neural correlates of categorical perception in learned vocal communication. Nat Neurosci. 2009, 12 (2): 221-228. 10.1038/nn.2246.
3. Bolhuis JJ, Okanoya K, Scharff C: Twitter evolution: converging mechanisms in birdsong and human speech. Nat Rev Neurosci. 2010, 11 (11): 747-759.
4. Doupe AJ, Kuhl PK: Birdsong and human speech: Common themes and mechanisms. Annu Rev Neurosci. 1999, 22: 567-631. 10.1146/annurev.neuro.22.1.567.
5. Yildiz IB, Kiebel SJ: A Hierarchical Neuronal Model for Generation and Online Recognition of Birdsongs. PLoS Comput Biol. 2011, 7 (12): e1002303. 10.1371/journal.pcbi.1002303.
6. Friston KJ, Trujillo-Barreto N, Daunizeau J: DEM: A variational treatment of dynamic systems. Neuroimage. 2008, 41 (3): 849-885. 10.1016/j.neuroimage.2008.02.054.
7. Fee MS, Kozhevnikov AA, Hahnloser RHR: Neural mechanisms of vocal sequence generation in the songbird. Annals of the New York Academy of Sciences. 2004, 1016: 153-170. 10.1196/annals.1298.022.
8. Meador D, Flege JE, Mackay IRA: Factors affecting the recognition of words in a second language. Bilingualism: Language and Cognition. 2000, 3: 55-67. 10.1017/S1366728900000134.

Author information

Authors and Affiliations

Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, 04103, Germany
Izzet B Yildiz, Katharina von Kriegstein & Stefan J Kiebel
Biomagnetic Center, Hans Berger Clinic for Neurology, University Hospital Jena, Friedrich-Schiller-University Jena, 07747, Germany
Stefan J Kiebel

Corresponding author

Correspondence to Izzet B Yildiz.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Yildiz, I.B., von Kriegstein, K. & Kiebel, S.J. Learning speech recognition from songbirds. BMC Neurosci 14 (Suppl 1), P210 (2013). https://doi.org/10.1186/1471-2202-14-S1-P210
