
  • Poster presentation
  • Open Access

Auditory object feature maps with a hierarchical network of independent components?

BMC Neuroscience 2014, 15 (Suppl 1): P66

  • Receptive Field
  • Sparse Representation
  • Inferior Colliculus
  • Cochlear Nucleus
  • Hierarchical Organization
How auditory objects are represented in the brain remains controversial [1, 2]. Kumar et al. [3] discuss the hierarchical organization of auditory object perception and observe that the planum temporale (PT) area of the cortex encodes invariant representations of the spectral envelopes of sounds. Many other studies find maps of representations elsewhere in the brain (cochlear nucleus, inferior colliculus, etc.). According to Barlow [4], sparse representations with minimum overlap could be considered. Griffiths and Warren [5] propose that auditory object representations might be segmented or segregated in the PT by increasing the independence between neural activities. We therefore use a computer simulation to explore the potential of a hierarchical neural assembly, whose layers increase feature independence during training, to represent parts of auditory objects. We observe that the learned features are organized into non-overlapping maps (Figure 1) and that the redundancy of the representation is indeed reduced. Learning was done on three categories of sounds with distinct acoustical statistics: speech, music and natural sounds. The learned feature maps differ markedly from one sound category to another and might be, to some extent, comparable to receptive fields measured in the brain. We discuss their potential similarity with receptive fields measured in the inferior colliculus of the guinea pig and how they might be part of a representation of auditory objects in the brain.
Figure 1

Learned representations illustrated for speech and natural sounds. FastICA training is first performed on patches of 80 ms × 32 cochlear channels taken from the envelopes of a 128-channel cochleagram (level L0). Patches of 160 ms × 64 cochlear channels are then created at level L1 by concatenating the learned L0 features through time and space. FastICA is then performed on these larger patches to generate the new L1 representations. The same procedure is repeated for level L2.
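
The procedure in the caption can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation. The frame period (10 ms), the non-overlapping tiling, the grouping of 2 × 2 neighbouring patches between levels, the number of components per level, and the random stand-in data are all hypothetical choices that the abstract does not specify. The `fastica` function is a minimal hand-written symmetric FastICA with a tanh nonlinearity, and the names `tile`, `orthonormalize` and `build_hierarchy` are introduced here for illustration only.

```python
import numpy as np

FRAME_MS = 10                    # ASSUMPTION: frame period, not given in the caption
L0_FRAMES = 80 // FRAME_MS       # an 80 ms patch spans 8 frames under that assumption
L0_CHANNELS = 32                 # from the caption: 32 cochlear channels per L0 patch


def tile(grid, bt, bc):
    """Group a (T, C, D) grid into non-overlapping bt x bc blocks.

    Returns an array of shape (T // bt, C // bc, bt * bc * D) in which each
    vector is the concatenation of the bt * bc vectors of one block.
    """
    T, C, D = grid.shape
    T2, C2 = T // bt, C // bc
    g = grid[:T2 * bt, :C2 * bc].reshape(T2, bt, C2, bc, D)
    return g.transpose(0, 2, 1, 3, 4).reshape(T2, C2, bt * bc * D)


def orthonormalize(W):
    """Symmetric decorrelation: returns (W W^T)^(-1/2) W, computed via the SVD."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt


def fastica(X, n_components, n_iter=100, seed=0):
    """Minimal symmetric FastICA (tanh nonlinearity) on the rows of X.

    Returns (filters, mean); (X - mean) @ filters.T gives the components.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA whitening down to n_components dimensions.
    eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / len(Xc))
    top = np.argsort(eigval)[::-1][:n_components]
    K = eigvec[:, top] / np.sqrt(eigval[top] + 1e-9)
    Z = Xc @ K
    rng = np.random.default_rng(seed)
    W = orthonormalize(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        # Fixed-point update: E[z g(w.z)] - E[g'(w.z)] w, for all rows at once.
        W = orthonormalize(G.T @ Z / len(Z) - (1 - G ** 2).mean(axis=0)[:, None] * W)
    return W @ K.T, mean


def build_hierarchy(envelopes, n_components=(16, 16, 8)):
    """Learn L0, L1 and L2 features from a (frames, channels) envelope array."""
    # L0 patches: 80 ms x 32 channels, flattened to one vector per patch.
    grid = tile(envelopes[:, :, None], L0_FRAMES, L0_CHANNELS)
    levels = []
    for level, k in enumerate(n_components):
        if level > 0:
            # ASSUMPTION: concatenate the features of 2 x 2 neighbouring patches
            # (time x channel), doubling the patch extent in both directions.
            grid = tile(grid, 2, 2)
        T, C, D = grid.shape
        X = grid.reshape(T * C, D)
        filters, mean = fastica(X, k)
        activations = (X - mean) @ filters.T
        levels.append({"filters": filters, "activations": activations})
        grid = activations.reshape(T, C, k)
    return levels


# Random stand-in for cochleagram envelopes (6400 frames x 128 channels);
# real envelopes would come from a cochlear filterbank.
envelopes = np.abs(np.random.default_rng(1).laplace(size=(6400, 128)))
levels = build_hierarchy(envelopes)
for i, lv in enumerate(levels):
    print(f"L{i}: filters {lv['filters'].shape}")
```

Under these assumptions the L1 inputs cover 160 ms × 64 channels, as in the caption, and the L2 inputs would cover 320 ms × 128 channels, which is an inference from "the same procedure is repeated" rather than a figure stated in the source.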

Authors’ Affiliations

NECOTIS, Département génie électrique, génie informatique, Université de Sherbrooke, Québec, Canada, J1K 2R1


  1. Bizley JK, Cohen YE: The what, where and how of auditory-object perception. Nat Rev Neurosci. 2013, 14 (10): 693-707. 10.1038/nrn3565.
  2. Griffiths TD, Warren JD: What is an auditory object?. Nat Rev Neurosci. 2004, 5: 887-892.
  3. Kumar S, Stephan KE, Warren JD, Friston KJ, Griffiths TD: Hierarchical processing of auditory objects in humans. PLoS Comput Biol. 2007, 3 (6): e100. 10.1371/journal.pcbi.0030100.
  4. Barlow H: Redundancy reduction revisited. Network: Comput Neural Syst. 2001, 12: 241-253. 10.1088/0954-898X/12/3/301.
  5. Griffiths TD, Warren JD: The planum temporale as a computational hub. Trends Neurosci. 2002, 25 (7): 348-353. 10.1016/S0166-2236(02)02191-4.


© Rouat et al; licensee BioMed Central Ltd. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.