Maturation of auditory temporal integration and inhibition assessed with event-related potentials (ERPs)
BMC Neuroscience volume 11, Article number: 49 (2010)
We examined development of auditory temporal integration and inhibition by assessing electrophysiological responses to tone pairs separated by interstimulus intervals (ISIs) of 25, 50, 100, 200, 400, and 800 ms in 28 children aged 7 to 9 years, and 15 adults.
In adults a distinct neural response was elicited to tones presented at ISIs of 25 ms or longer, whereas in children this was only seen in response to tones presented at ISIs above 100 ms. In adults, late N1 amplitude was larger for the second tone of the tone pair when separated by ISIs as short as 100 ms, consistent with the perceptual integration of successive stimuli within the temporal window of integration. In contrast, children showed enhanced negativity only when tone pairs were separated by ISIs of 200 ms. In children, the amplitude of the P1 component was attenuated at ISIs below 200 ms, consistent with a refractory process.
These results indicate that adults integrate sequential auditory information into smaller temporal segments than children. They also suggest that there are marked maturational changes from childhood to adulthood in the perceptual processes underpinning the grouping of incoming auditory sensory information, and that electrophysiological measures provide a sensitive, non-invasive method allowing further examination of these changes.
The maturation of auditory processing abilities shows a protracted developmental time course, with many specific processes not reaching adult levels until adolescence. This prolonged maturation has been shown with behavioral measures of performance on auditory tasks and by analysis of the electrical activity of the brain's response to auditorily presented information. Auditory temporal processing has been a focus of particular interest because it appears important for language and literacy acquisition. Behaviorally, temporal acuity has often been assessed using gap detection tasks, where participants are required to respond to brief gaps of varying duration embedded in the signal. Several studies have shown improvements in gap detection with age, but there are marked differences in the estimates of the gap detection thresholds and the sensitivity of the tasks to development, depending on the characteristics of the signal in which the gaps are embedded. When the gap is embedded between 2 ms tone pips of identical frequencies, young children are able to detect gaps at thresholds of 5.6 ms, whereas higher thresholds are reported when the gaps are embedded in tones of differing frequencies [5, 6]. Other tasks assessing auditory temporal processing, such as detection of frequency modulation and sensitivity to backward masking, show a protracted developmental course, with school-aged children performing below adult levels [7, 8].
The study of auditory maturation using ERPs is complicated by the fact that there are marked age effects on the morphology, amplitudes, and latencies of the obligatory sensory components elicited by auditory stimuli. In children younger than 10 years of age, the auditory evoked response to clicks or tones is dominated by a positive peak at a latency of approximately 100 ms (P1). At approximately 10 years of age, the first negative deflection (N1) becomes apparent in the ERP and there is a corresponding decrease in P1 latency and amplitude [9–11]. Ponton et al.  argued, based on the developmental trajectories of the amplitudes, latencies, and dipole sources identified for these two peaks, that the modulation of the ERP waveform morphology reflected the superimposition of a developing N1 source generated in superficial cortical layers of the supratemporal gyrus.
When two auditory stimuli are presented in succession, there are three potential effects we might expect to see in the ERP: summation of responses, attenuation of the response to the second stimulus, or enhancement of the response to the second stimulus. The simplest situation is where identical ERPs are elicited to each stimulus, and these summate. In such a case there is no influence of one evoked response on the other, but the resulting waveform may look very different to that for a single tone if the time interval is short, so that the response to the second stimulus begins before the response to the first stimulus is complete. If the response to the paired-tone sequence was the result of simple summation, alignment and subtraction of the response to the single stimulus from the waveform elicited by the paired-stimulus waveform would result in an identical response to that seen to stimulus one alone.
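The summation account lends itself to a simple numerical check. The following is a minimal sketch, assuming each waveform is a list of amplitudes sampled at a fixed rate and that `onset_delay` is the number of samples between the onsets of the two tones. The function names and the toy waveform are illustrative only and are not taken from the study.

```python
def delay(wave, n_samples):
    """Delay a waveform by n_samples, zero-padding the start and keeping the length."""
    if not 0 <= n_samples <= len(wave):
        raise ValueError("delay must lie between 0 and the epoch length")
    return [0.0] * n_samples + list(wave[: len(wave) - n_samples])


def summed_pair(single, onset_delay):
    """Pair response predicted by pure summation: single tone plus a delayed copy."""
    return [a + b for a, b in zip(single, delay(single, onset_delay))]


def second_tone_estimate(pair, single, onset_delay):
    """Subtract the single-tone response, then realign to the second tone's onset."""
    residual = [p - s for p, s in zip(pair, single)]
    return residual[onset_delay:]


# Toy single-tone response in arbitrary units (hypothetical values).
single = [0.0, 1.0, 3.0, 1.0, 0.0, -1.0, 0.0, 0.0]
pair = summed_pair(single, onset_delay=2)
estimate = second_tone_estimate(pair, single, onset_delay=2)
```

Under pure summation, `estimate` reproduces the start of `single`. With recorded data, any systematic departure of the estimate from the single-tone response would point to attenuation or enhancement rather than simple summation.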
A second possibility is that the response to the second stimulus in a pair or sequence may be attenuated relative to the first stimulus. Such attenuation has been described for responses to auditory stimulus sequences, but there is debate as to the underlying mechanism. One proposition is that attenuation is the consequence of neural refractoriness, a phenomenon well-understood at the level of individual neurons, which show reduced sensitivity to incoming stimuli immediately after firing. The notion of refractoriness can be scaled up to postulate refractory periods affecting the neural generators of the auditory ERP [12–15]. However, the notion of the refractory period has been questioned because the time course of response attenuation is not as expected. Refractoriness is greatest immediately after a neuron has responded, with sensitivity gradually increasing as time passes. However, attenuation of the response to the second stimulus in a tone pair or tone train is greater when the tones of the pair or train are separated by 400 ms than when they are separated by shorter ISIs of 100-300 ms. To account for this, Sable et al.  proposed a process of latent inhibition, derived from a physiological model developed by Loveless and colleagues [17, 18] to explain the effects of within-pair ISI on the neuromagnetic response to tone pairs. According to this model, the initial tone of the pair activates N1 generators and this activation then spreads to other areas where further excitatory and inhibitory circuits are activated. The inhibitory circuits act in a feedback loop that reduces the activation of the N1 generators to subsequent stimuli, although this process takes approximately 400 ms to become fully functional.
The third possibility, of response enhancement to the second tone of a pair, was also discussed by Loveless et al., with regard to situations where the second tone occurred after a short ISI. According to the latent inhibition model, we would not expect response attenuation if the ISI was less than 400 ms. Loveless et al. reported that in this case enhancement of the response to the second tone was observed. They identified two distinct neural generators contributing to the neuromagnetic response (N100m), and the amplitude enhancement seen at ISIs from 70-300 ms reflected activation of the later, more anteriorly-located source (N100mA).
Sable et al. examined amplitude modulation of the electrophysiological analogue of this neuromagnetic response (N1) following presentation of short trains of stimuli at ISIs of 100, 200, 300, and 400 ms to evaluate whether latent inhibition, refractoriness, or a combination of both processes best explained the N1 amplitude modulation. They found that N1 amplitude attenuation to the second tone of the sequence was maximal at ISIs of 400 ms, consistent with the latent inhibition model. However, they did not find enhancement of N1, compared to the first tone in the train, at ISIs less than 400 ms. These differences may reflect the use of short, five-tone trains of stimuli by Sable et al., rather than longer trains of paired-tone sequences, as used by Loveless et al. and Müller et al. Although they differ in terms of whether simple summation or enhancement occurs at ISIs of less than 400 ms, Müller et al. and Sable et al. note that the period prior to which response attenuation is seen may correspond to the temporal window over which successive auditory events are integrated as a single unit.
According to this account, development of auditory temporal integration can be studied by measuring the brain's response to tones separated by varying ISIs. Bishop and McArthur examined the neural processing of unattended auditory stimuli by comparing the event-related potentials (ERPs) elicited to single tones with those elicited to tone pairs separated by ISIs of 20, 50, and 150 ms. Their results indicated that the ERPs elicited by tones separated by 50 and 150 ms were distinguishable in the older participants aged between 14 and 19 years. In contrast, younger participants (aged 11-13 years) did not show distinguishable neural responses to tone pairs separated by intervals of 20 or 50 ms. Rather, a single neural response was elicited, similar to that seen when only one tone was presented. Delays in the maturation of this discrimination process have been linked to developmental disorders of language, and these authors reported that the neural processing of auditory stimuli in a group of children with specific language impairment resembled that of the younger control cohort. The mechanisms underpinning the normal developmental trajectory are not understood, but measurement of the neural activity elicited in response to auditory stimuli presented at varying ISIs allows examination of the processes contributing to this aspect of auditory processing maturation.
The study by Bishop and McArthur  included only eight typically-developing children in each of two broad age bands. Furthermore, the stimuli in their tone pairs differed in frequency and duration, and they did not attempt to distinguish a summation account of responses to tone pairs from hypotheses involving refractoriness or enhancement. The present study took this line of work further. We assessed a sample of younger children than those reported in Bishop and McArthur  and the auditory task included a broader range of ISIs (25, 50, 100, 200, 400, and 800 ms) to allow examination of refractory processes, temporal integration, and latent inhibition. In the present study both tones of the pair were the same frequency (1000 Hz) and duration (20 ms), making it possible to compare responses to tone 1 and tone 2 directly. Tones were presented at a lower intensity than was used by Bishop and McArthur , and no frequency-deviant stimuli were presented.
The aims of the study were twofold: the first aim was to identify an index of integration of rapidly presented auditory information, reflected in the evoked responses to tone pairs presented at increasing ISIs. It was predicted that adults would show N1 enhancement to the second tone of the pair at within-pair ISIs of 100-200 ms relative to that elicited by the single tone. The second aim was to examine whether children would show attenuation of the early sensory ERP peak (P1) to the second tone in a pair at shorter ISIs, in line with a refractory process, or latent inhibition as proposed by Sable et al. (2004). The patterns of ERP peak amplitude modulation expected based on models of latent inhibition, refractoriness, and facilitation, across the differing ISIs used in this study, are shown graphically in Figure 1.
Figure 2 shows the distribution across the scalp of the average amplitude during the first 200 ms following presentation of the single tone and the second tone of the pair, corrected for response overlap. Visual inspection of the scalp topography indicates that the N1 in adults and the P1 in children were maximal at fronto-central sites; the intra-class correlation coefficient was therefore assessed at Fz in subsequent analyses.
Figure 3 shows the grand-averaged waveforms at midline (Fz, Cz) and lateral sites (T7, T8) for the adult and the child samples. Adults showed a clear N1 peak to the first tone in the pair, with a smaller amplitude lateral response also seen at sites T7 and T8. A distinct neural response to the second tone of the pair was also observed at ISIs of 25 ms and longer, although identification of the N1 peaks to the second tone was obscured by overlapping responses to the first tone. In contrast, the children's auditory evoked response was dominated by frontally-distributed P1 and N2 peaks, and the temporally-distributed T-complex can be seen at T7 and T8 sites. In children, a distinct neural response to the second tone in the pair was observed only at ISIs greater than 100 ms.
Intra-class correlation coefficient (ICC) analyses
The similarity of the ERP waveforms elicited to the tone pairs presented at each ISI was compared to that elicited by a single tone by computing the individual ICCs over the 0-400 ms time range (see Methods), and the mean ICCs for the two age groups are summarised in Table 1. This method of analysis allowed examination of the early ERP waveforms elicited, and is particularly useful when characteristic peaks are not identifiable. A high ICC indicates that the waveform to the tone pair is similar to the waveform for a single tone, i.e. that the responses to the two tones in the pair are integrated. The auditory evoked responses elicited in the child sample were considerably noisier than those elicited in the adult sample, reflected in a lower ICC between the waveform elicited to the single tone and the waveform elicited to the 400 ms ISI tone-pair condition (t(39) = 2.32, p = .03). Given this overall age group difference, the effects of ISI were analysed within each age group. For the adult sample, the main effect of ISI was significant (F(4,56) = 5.78, p = .002, ε = .783, partial η2 = .29); the ICCs were significantly lower than the 400-ms ISI condition for tone pairs presented at all ISIs below 400 ms, and in excess of 29% of the variance in scores within each condition was explained by ISI. For the child sample, there was a significant main effect of ISI (F(4,100) = 3.19, p = .03, ε = .72, partial η2 = .11); ICCs were significantly lower than the 400-ms ISI condition for tone pairs presented at 200 ms ISI, but effect sizes were negligible at ISIs smaller than 200 ms (see Table 1). In sum, the waveforms of the adults indicated differentiation of the neural responses to rapid tone pairs and those to single tones, whereas for the children, this differentiation was not seen with ISIs below 200 ms.
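To make the waveform-similarity measure concrete, the sketch below computes an intra-class correlation between two waveforms that have already been cut to the same time range (for example 0-400 ms). The exact ICC variant used in the study is described in its Methods and is not reproduced here; this example assumes a one-way random-effects ICC in which each time point is a "target" measured twice, and the function name `icc_oneway` is illustrative.

```python
def icc_oneway(wave_a, wave_b):
    """One-way random-effects ICC for two equally long waveforms.

    Each time point is treated as a target with k = 2 measurements
    (one from each waveform). This is an assumed variant, not
    necessarily the one used in the study.
    """
    n = len(wave_a)
    if n != len(wave_b) or n < 2:
        raise ValueError("waveforms must have equal length of at least 2")
    k = 2
    point_means = [(a + b) / k for a, b in zip(wave_a, wave_b)]
    grand_mean = sum(point_means) / n
    # Variability between time points (shared waveform shape).
    ms_between = k * sum((m - grand_mean) ** 2 for m in point_means) / (n - 1)
    # Variability within time points (disagreement between the two waveforms).
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2
        for a, b, m in zip(wave_a, wave_b, point_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Identical waveforms give an ICC of 1, and a waveform compared with its mirror image gives -1. The function is undefined (division by zero) when both waveforms are the same constant, which should not arise with real recordings.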
Analyses of ERPs corrected for response overlap
When the tones were presented at short ISIs, the identification of the evoked response to the second tone of the tone pair is confounded by overlapping potentials elicited in response to the first tone. Therefore, the second set of analyses was conducted on the ERP waveforms corrected for response overlap as described in Methods. The corrected grand-averaged waveforms elicited in response to the second tone of the tone-pair are shown in Figure 4. For the adult sample, a clear N1 peak is identifiable in each ISI condition and the amplitude of the N1 peak appears smaller following the 800 ms ISI than following shorter ISIs. For the child sample, the grand-averaged waveform at midline sites is dominated by a positive deflection, peaking at 86 ms (P1), and a late negative deflection, peaking at 254 ms (N2). At temporal sites the waveform is dominated by a positive deflection, peaking at 126 ms (Ta), and a subsequent negativity, peaking at 206 ms (N1c).
Principal Components Analysis (PCA) to identify latency ranges
PCA of the subtracted waveforms from adults over the interval 78-166 ms was conducted based on the latency ranges over which the N1 deflection differed significantly from zero, using the correction for multiple comparisons as recommended by Guthrie & Buchwald. Four factors with eigenvalues greater than 1.0 were extracted, accounting for 92.7% of the variance in the data. Plots of the rotated component loadings are presented in Figure 5, graphically depicting the time course of each factor, as recommended by Picton et al. For ease of visual comparison, the grand averaged ERPs at each site, summed across participants and ISIs, are also presented on the same time-scale. The latency intervals determined for the PCA components were calculated as the interval over which the rotated factor loading for a specific component exceeded the loadings on any of the orthogonal components. The maximal loadings on the first PCA component (eigenvalue = 32.4, explaining 52.1% of the variance in the data), from 98-122 ms, corresponded in latency with the middle portion of the N1 peak in adults (labelled mid-N1). The maximal loadings on the second PCA component (eigenvalue = 14.0, explaining 22.4% of the variance in the data), from 74-94 ms, corresponded in latency with the early portion of the N1 peak (labelled early-N1). The maximal loadings on the third PCA component (eigenvalue = 8.4, explaining 13.6% of the variance in the data), from 126-146 ms, corresponded in latency with the late portion of the N1 peak (labelled late-N1). The fourth component (eigenvalue = 2.9) explained less than 10% of the variance in the data and has not been interpreted further. Based on these results, the early-N1, mid-N1, and late-N1 components in the adult sample were quantified at Fz and Cz over the latency windows 74-94 ms, 98-122 ms, and 126-146 ms respectively, for subsequent analyses.
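The rule for deriving latency windows from the rotated loadings can be sketched as follows. This is a minimal illustration, assuming the loadings are available as one list per component with one value per time point. The function name `dominant_windows`, the use of raw (not absolute) loadings, the choice of the longest contiguous run, and the toy numbers are all assumptions made for the example; the PCA and rotation steps themselves are not shown.

```python
def dominant_windows(times_ms, loadings):
    """Return, for each component, the longest contiguous interval over which
    its loading exceeds the loadings of every other component.

    times_ms: list of time points in ms.
    loadings: dict mapping component name to a list of loadings, one per time point.
    """
    names = list(loadings)
    # Component with the highest loading at each time point (ties go to the first name).
    winners = [
        max(names, key=lambda name: loadings[name][i])
        for i in range(len(times_ms))
    ]
    windows = {}
    start = 0
    for i in range(1, len(winners) + 1):
        # A run ends at the end of the data or when the winning component changes.
        if i == len(winners) or winners[i] != winners[start]:
            name = winners[start]
            run = (times_ms[start], times_ms[i - 1])
            best = windows.get(name)
            if best is None or run[1] - run[0] > best[1] - best[0]:
                windows[name] = run
            start = i
    return windows


# Hypothetical loadings for two components sampled every 4 ms.
times = [74, 78, 82, 86, 90, 94]
toy_loadings = {
    "early": [0.9, 0.8, 0.7, 0.2, 0.1, 0.1],
    "mid": [0.1, 0.2, 0.3, 0.8, 0.9, 0.7],
}
toy_windows = dominant_windows(times, toy_loadings)
```

With these toy values the "early" component dominates from 74 to 82 ms and the "mid" component from 86 to 94 ms.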
In the child sample, PCA of the corrected waveforms over the interval 58-310 ms (based on the interval over which the amplitude of the waveform differed from zero, as above) identified seven factors with eigenvalues greater than 1.0, explaining 91.5% of the variance in the data, although components 4 to 7 explained less than 10% of the variance in the data and have not been interpreted further. The maximal loadings on the first PCA component (eigenvalue = 92.6, explaining 42.2% of the variance in the data) occurred over the interval 150-202 ms, corresponding in latency to a negative deflection, seen most clearly at the lateral sites, T7 and T8. We have labelled this PCA component N1c, after the nomenclature used by Bruneau et al., Pang and Taylor, and Woods. The second PCA component (eigenvalue = 33.0, explaining 15.0% of the variance in the data) loaded maximally over the latency interval 102-146 ms and corresponds to the positive deflection at the lateral temporal sites, predominant over the right hemisphere (T8). We labelled this PCA component Ta, based on the nomenclature adopted by Woods. The third PCA component (eigenvalue = 28.6, explaining 13.0% of the variance in the data), with maximal factor loadings from 58-98 ms, corresponded in latency with the clearly identifiable frontally-distributed positive peak, labelled P1. Based on the results from this analysis, the ERP components were quantified at Fz, Cz, T7, and T8 as the mean amplitudes over the latency windows 58-98 ms (P1), 102-146 ms (Ta) and 150-202 ms (N1c).
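Quantifying a component as the mean amplitude over a latency window can be sketched as below. The window boundaries are those reported above for the child sample; the function name `mean_amplitude`, the 4 ms sampling step, and the toy waveform are assumptions made for illustration.

```python
# Latency windows (ms) reported for the child sample.
CHILD_WINDOWS = {"P1": (58, 98), "Ta": (102, 146), "N1c": (150, 202)}


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the samples whose time stamps fall within [start_ms, end_ms]."""
    samples = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    if not samples:
        raise ValueError("no samples fall inside the requested window")
    return sum(samples) / len(samples)


# Toy waveform (hypothetical): 1.0 inside the P1 window, 0.0 elsewhere.
times = list(range(50, 210, 4))
wave = [1.0 if 58 <= t <= 98 else 0.0 for t in times]
amplitudes = {
    name: mean_amplitude(wave, times, start, end)
    for name, (start, end) in CHILD_WINDOWS.items()
}
```

The same function would be applied per participant, site, and ISI condition before the amplitudes are entered into the analyses of variance.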
Given the marked differences in morphology of the waveforms for the adult and child sample, we have not attempted to directly relate the component structure for the adult fronto-central N1 to those identified in the analysis of the P1/T-complex observed in the child sample. Although source dipole analysis has identified similar neural generators in children and adults, it is not clear which deflections in the two waveforms represent activity from the same underlying neural generators (Albrecht et al., 2000).
Analyses of amplitude measures of ERPs corrected for response overlap
Mean amplitudes over the early, middle, and late N1 latency intervals for the adult sample are summarized in Table 2. Mean amplitudes over the P1 and Ta latency intervals for the child sample are summarized in Table 3.
For adults, early-N1 elicited by the second tone of the pair was significantly smaller in amplitude than following a single tone at longer ISIs, with the amplitude attenuation failing to reach statistical significance at ISIs below 400 ms (main effect of ISI, F(6, 84) = 2.50, p = .028, ε = .64, partial η2 = .15). This agrees with findings by Sable et al.  for brief trains of tones, and is consistent with attenuation resulting from an inhibitory process that does not become fully functional for approximately 400 ms after the first tone onset. Amplitude modulation of the mid-N1 latency interval as a function of ISI failed to reach statistical significance (F(6, 84) = 1.24, p = .30, ε = .56, partial η2 = .08). In contrast, amplitude over the late-N1 latency interval was significantly enhanced to the second tone, relative to the single tone, when separated by ISIs of 100-400 ms (main effect of ISI F(6, 84) = 6.37, p = .001, ε = .52, partial η2 = .31).
In the child sample, P1 amplitude elicited to the second tone of the pair was significantly reduced when the second tone was presented at shorter ISIs, but not at longer ISIs, consistent with attenuation resulting from a refractory effect rather than an inhibitory process. P1 amplitude was significantly reduced, compared to the single tone, when tone pairs were presented at ISIs shorter than 200 ms (main effect of ISI, F(6, 150) = 5.39, p < .001, ε = .76, partial η2 = 0.18). To assess whether P1 was elicited at the shorter ISIs, the amplitude of the waveform over the P1 latency range at Fz was compared with a test value of zero. No statistically significant P1 was elicited at 25 ms ISIs (t(25) = -0.23, p = .82), 50 ms ISIs (t(25) = 0.47, p = .64), or 100 ms ISIs (t(25) = 1.19, p = .25). Ta amplitude was more positive over the right hemisphere (T8) than the left hemisphere (T7) (main effect of site, F(3,75) = 3.30, p = .04, ε = .73, partial η2 = 0.12), and was significantly less positive for tones presented at ISIs shorter than 400 ms than for the single tone (main effect of ISI, F(6,150) = 5.49, p < .001, ε = .78, partial η2 = 0.18).
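The comparison of P1 amplitude with a test value of zero is a one-sample t-test. A minimal sketch using only the Python standard library follows; the function name `one_sample_t` and the toy amplitudes are illustrative, and the p-value lookup (which needs the t distribution) is deliberately left out.

```python
import math
import statistics


def one_sample_t(values, test_value=0.0):
    """Return (t, degrees of freedom) for a one-sample t-test against test_value."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(values)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (mean - test_value) / standard_error, n - 1


# Hypothetical per-participant mean amplitudes (microvolts) in one condition.
t_value, df = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0])
```

The 25 degrees of freedom reported for each test above correspond to 26 child participants contributing data to that analysis.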
ISI modulated N1c amplitude, and the effect of ISI was further qualified by a significant interaction with site (F(18,450) = 4.91, p < .001, ε = .48, partial η2 = 0.16). Separate analyses of the effect of ISI were conducted at each site, adjusting for the number of comparisons. Thus, the alpha level used for each test of significance was 0.0125. The mean amplitudes at each site across the differing ISI conditions are presented graphically in Figure 6. The amplitude over this latency interval was significantly enhanced at 200 ms ISIs relative to the single tone condition at Fz and Cz sites, with statistically non-significant amplitude modulation at the lateral sites, T7 and T8 (Fz main effect of ISI, F(6,150) = 4.61, p < .001, partial η2 = 0.16; Cz main effect of ISI, F(6,150) = 3.88, p = .001; partial η2 = 0.13; T7 main effect of ISI, F(6,150) = 1.41, p = .23, partial η2 = 0.05; T8 main effect of ISI, F(6,150) = 1.91, p = .08, partial η2 = 0.07). This result indicates that the fronto-centrally distributed negative enhancement over this interval is distinct from the temporally-distributed N1c component. In view of the latency of the component (150-202 ms), the topography of the ISI effect (fronto-central), and the effect of the experimental manipulation of ISI (amplitude enhancement), we have identified this component as equivalent to the late-N1 identified in the adult sample.
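The effect sizes and the adjusted alpha level reported here can be reproduced from the printed statistics. The sketch below uses the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error) and a Bonferroni division of alpha by the number of tests; the function names are illustrative. Because a sphericity correction multiplies both degrees of freedom by the same ε, it leaves partial η2 unchanged, so the uncorrected degrees of freedom can be used.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha level after Bonferroni adjustment."""
    return alpha / n_tests


# Values as reported for the per-site tests of the ISI effect.
site_effects = {
    "Fz": round(partial_eta_squared(4.61, 6, 150), 2),
    "Cz": round(partial_eta_squared(3.88, 6, 150), 2),
    "T7": round(partial_eta_squared(1.41, 6, 150), 2),
    "T8": round(partial_eta_squared(1.91, 6, 150), 2),
}
per_site_alpha = bonferroni_alpha(0.05, 4)
```

The computed values agree with the effect sizes given in the text (0.16, 0.13, 0.05, and 0.07) and with the alpha level of 0.0125 for four sites.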
The primary aim of the current study was to examine the development of auditory processing, as indexed by electrophysiological measures which are not influenced by response demands that can contribute to age-related differences on behavioral measures. The results from this study showed that there is a marked change in the neural responses to brief tone pairs from childhood to adulthood. In children a distinct neural response was elicited at ISIs of 200 ms and longer, but not when tone pairs were separated by ISIs of 25 ms, 50 ms, or 100 ms, whereas in early adulthood, a distinct neural response was elicited to tones presented at ISIs of 25 ms and longer. The results provide further validation for the use of the ICC in the study of auditory temporal processing and extend the previously reported findings by showing that this measure can provide a sensitive index of neural responsiveness to rapidly presented auditory information in children as young as 7 years. It is necessary to be cautious when directly comparing the results from the present study with those reported by Bishop and McArthur given the methodological differences between the two studies, such as differences in the specific ISIs used, differences in the duration and intensity of the tones, and differences in the tasks that children were engaged in during tone presentation. Nevertheless, the observed estimate of between 100 and 200 ms for the younger children (7-9 years) in our study and between 50 and 150 ms for the older children (11-13 years) reported in the earlier study leads us to speculate that developmental changes in auditory temporal processing may be identifiable over these age ranges with the ICC index, and warrants further investigation using this technique. However, it is noted that the signal-to-noise ratio was lower in the child waveforms than in the adult waveforms, and this factor may have contributed to the attenuation of the response to the second tone at ISIs of 25-100 ms.
The absolute magnitude of the temporal resolution in children estimated in the present study and by Bishop et al. is considerably larger (between 50 and 200 ms) than previous estimates using behavioral gap detection tasks (between 3 and 40 ms) [4–6, 26, 27]. In adults, the elicitation of the P1-N1-P2 complex has been shown to relate to behaviorally determined gap detection thresholds, with attenuation of the response to the second tone when presented at subthreshold gap durations. It does not therefore seem appropriate to regard the developmental trend seen here as indicative of an improvement in auditory temporal discrimination. An alternative way of interpreting the results is to see the distinctive neural responses to tones at differing ISIs as reflecting a process of temporal integration, whereby repeated sounds are grouped together in a single auditory object. The notion of auditory objects is a highly disputed one, and the specific features, such as the spectral and temporal composition of the incoming sound, which define an object have been the subject of recent neuroscientific research. The results from the present study highlight the importance of the contextual temporal features of the incoming sound source in auditory perception, as postulated by Krumbholz et al., and show that detailed analysis of the N1 sub-components that can be identified in electroencephalographic (EEG) and magnetoencephalographic (MEG) recordings can contribute to an understanding of the role of temporal processing in auditory perception. A listener may still be able to discriminate changes in stimulus features of an auditory object, as may be caused by presence of a gap, and so this explanation is compatible with gap detection thresholds being smaller than auditory integration thresholds.
An explanation in terms of auditory integration was proposed by Loveless following observation of an enhancement in the neuromagnetic response (N100mA) to tone pairs presented at stimulus onset asynchronies ranging from 70-300 ms. However, again, the time intervals obtained in this study are not entirely in line with expectation. Wang et al., for instance, compared children's and adults' neural responses to deviant tones in a mismatch negativity (MMN) paradigm, and noted that a sequence of two different deviants elicited a single MMN when tones were separated by inter-stimulus intervals of 100 ms for adults, 200 ms for 9- to 11-year-olds, and 300 ms for 5- to 8-year-olds. If we regard the enhancement of the fronto-central negativity culminating at approximately 145 ms in adults and 170 ms in children as a marker of the integration of the second stimulus with the preceding stimulus, then our results suggest a somewhat shorter window of temporal integration of 200 ms for 7-9 year-old children, and, as far as the morphology of the response to a tone pair is concerned, adults show a distinctive response to the second tone with ISIs as short as 25 ms. However, it should be noted that the estimates derived from the MMN paradigm indicate the earliest latency that is longer than the temporal window of integration, rather than the ISIs that fall within the temporal window of integration. An alternative explanation for the varying estimates of the temporal integration time could relate to the postulated neural mechanisms contributing to auditory integration associated with different psychoacoustic features. In the double deviant mismatch paradigm used by Wang et al., the first deviant stimulus differed from standard tones in frequency and the second deviant tone differed from the standard tones in intensity. Temporal integration time was assessed by determining the ISI at which the second deviant elicited a distinct MMN.
The finding of two distinct MMNs at ISIs exceeding 200 ms supports the contention that both features were independently coded despite the regular co-occurrence of the deviants in the sequence. Neural coding that contributes to auditory segregation based on frequency commences peripherally, with tonotopic representation at the cochlear level. Perceptual streaming following presentation of continuous sequences of tones increases with increases in the magnitude of the frequency difference between the tones . In the present study, auditory objects were defined by temporal separation of sequential tone pairs presented at the same frequency and intensity. Estimates of the temporal integration window may depend on the specific features underpinning auditory segregation. Alternatively, differing estimates of the temporal window of integration in these two studies could relate to the refractory period of the neural generators. In the current study, tone pairs were separated by a longer inter-trial interval (two seconds), allowing greater recovery of the neural responses to stimuli than was possible with the continuous tone sequences presented at the relatively short ISIs by Wang et al. . These results suggest that the notion of a constant temporal window of integration that applies across all stimulus types and methods may be misguided, and that both stimulus characteristics and methods of measurement may give different temporal estimates.
The second major aim of the present study was to examine the relative contributions of refractoriness and latent inhibition to the amplitude modulation observed at varying ISIs in adults and children. In adults, the early N1 was attenuated at longer ISIs, with the largest effect at ISIs of 800 ms, consistent with the operation of a latent inhibitory process, as suggested by Sable et al. . In children, the amplitude of P1 was attenuated at shorter ISIs, consistent with a refractory process, and there was no evidence for the operation of the latent inhibitory process that had been observed at longer ISIs in the adult sample. It is not possible to determine from the current results whether the latent inhibitory process is not yet functional in 7-9 year-old children, or whether it operates over a longer temporal interval than we examined. Our results raise the possibility that one factor influencing the length of the temporal window in children is the enhanced refractoriness of neurons in auditory cortex. Given that we found the same pattern of results using two different forms of analysis (comparison of amplitudes in the subtracted waveforms, and size of the intra-class correlation coefficient for the unsubtracted waveforms), it is unlikely that they are due to artefact introduced by the subtraction method. It is possible that the suppression of P1 at 100 ms ISIs observed in both analyses is due to overlap with the N2 elicited in response to the previous tone, although we think this explanation does not fully explain the results, given that the N2 component had been substantially reduced by application of the 2 Hz high-pass filter, and given that we observed a similar amplitude attenuation of the Ta peak at similar ISIs at lateral sites (T7 and T8) where N2 was not evident. 
The results from the current study are at odds with the findings reported by Dinces and Sussman, where a pronounced P1, modulated by the intensity of the stimuli, was elicited at 100 ms ISIs to deviant frequency tones in 9-11 year-old children. As noted by these authors, modulation of P1 amplitude in their study may have reflected attentional capture by relatively loud stimuli embedded within constantly varying intensity stimuli. Further research is required to determine whether the differing results are related to differences in the ages of the samples studied (7-9 years in the current study; 9-11 years in their study) and/or to differences in the acoustic deviance of the stimuli relative to the preceding train (same intensity for all tones in the current study; intensity deviants ranging from 66 dB to 86 dB following standard 70 dB tones in their study).
In terms of the physiological model developed by Loveless et al. and McEvoy et al., the enhancement of the N100mA observed at short ISIs reflects the activation of a neural generator located in auditory association areas summating with the response to successively presented auditory information. Other factors related to cortical maturation, such as synaptic efficacy, myelination and conduction velocity, could also have contributed to delaying activation of these association areas and altering the timing of the subsequent integration of successively presented tones, thus modulating the neural response to the second tone at faster rates. The results could also be interpreted in terms of a phase-resetting account, whereby the peaks and troughs in the averaged auditory ERP result from ongoing brain oscillations being synchronised at the onset of a stimulus; in this case we would need to postulate that the likelihood of phase resetting is a function of the interval between two stimuli.
Although we have identified late N1 enhancement in response to auditory stimuli in the present study, Wang et al. report a similar enhancement of the N1 elicited following presentation of somatosensory stimuli, raising the possibility that this effect may reflect modulation that is common across different sensory systems.
It is difficult to dissociate facilitation from inhibition in most experimental designs, and Sable et al. argued that inhibitory mechanisms more parsimoniously explained the postulated facilitation observed in studies where tones had been presented at varying ISIs. By comparing the neural response to tone pairs with that elicited by a single tone, and by dissociating the underlying N1 sub-components with PCA, the results from the current study suggest that both processes operate, as proposed by McEvoy et al.
These results indicate that adults integrate sequential auditory information into smaller temporal segments than children, and suggest that there are marked maturational changes from childhood to adulthood in the perceptual processes underpinning the grouping of incoming auditory sensory information. In future studies, it would be valuable to include behavioral measures of auditory temporal grouping; our prediction is that this measure will relate to the ERP indices studied here. In addition, future research investigating the relationship between the electrophysiological indices elicited in the present paradigm and individual differences in language proficiency is warranted.
Adult sample: Fifteen healthy participants ranging in age from 19-28 years were recruited. All volunteered to take part in the experiment after receiving information regarding the nature of the procedures and provided written informed consent. None reported hearing difficulties.
Child sample: Children participated in a two-day holiday activity program, designed to investigate the cognitive, emotional, and social development of children aged 7-9 years (Project K.I.D.S.). ERP data were excluded from individuals where a history of neurological disorders was reported, or where reliable auditory evoked responses were not elicited to the tones (see section on EEG acquisition and analysis for details). Auditory ERPs from a sample of 28 children ranging in age from 7 years 2 months to 9 years 11 months were available for analyses.
Auditory stimuli were 1000 Hz, 20 ms sinusoidal tones with 2 ms rise and fall times. In the paired-tone conditions, a second identical tone was presented following ISIs of 25 ms, 50 ms, 100 ms, 200 ms, 400 ms, or 800 ms. Sound intensity was calibrated using a 1-second continuous 75 dB SPL tone measured with a Bruel and Kjaer sound level meter.
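The stimulus description above can be illustrated with a minimal sketch of tone-pair synthesis. Several details are assumptions rather than reported values: the 48 kHz sample rate, the linear shape of the 2 ms onset and offset ramps, and the reading of ISI as the silent gap from the offset of the first tone to the onset of the second. The function names are hypothetical, and calibration to 75 dB SPL is not modelled.

```python
import math

def make_tone(freq_hz=1000.0, dur_ms=20.0, ramp_ms=2.0, fs=48000):
    """Return one sinusoidal tone as a list of floats in [-1, 1]."""
    n = round(fs * dur_ms / 1000.0)        # 960 samples at 48 kHz
    n_ramp = round(fs * ramp_ms / 1000.0)  # 96 samples at 48 kHz
    tone = []
    for i in range(n):
        # Linear rise and fall envelope (ramp shape is an assumption).
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        tone.append(gain * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return tone

def make_tone_pair(isi_ms, fs=48000):
    """Two identical tones separated by isi_ms of silence.

    Assumption: ISI is measured from first-tone offset to second-tone onset.
    """
    tone = make_tone(fs=fs)
    gap = [0.0] * round(fs * isi_ms / 1000.0)
    return tone + gap + tone

ISIS_MS = (25, 50, 100, 200, 400, 800)  # ISIs listed in the text
pairs = {isi: make_tone_pair(isi) for isi in ISIS_MS}
```

If ISI were instead defined onset to onset, the gap length would need to be reduced by the 20 ms tone duration.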
An electrode cap was fitted and participants were presented with the auditory stimuli while they concurrently completed a visual flanker task and during interspersed breaks as they silently read or played electronic games. They were instructed to ignore the tone sequences, but to remain quiet and still throughout the recording session. Trials were presented in blocks at an inter-trial interval of 2 s, with random selection of each of the seven stimulus conditions (the single tone and the six tone pairs) within each block. Delivery of the first tone of the pair was randomly jittered between 300 and 800 ms from the start of the trial to avoid anticipatory ERP effects and prevent synchronization of the auditory evoked responses with the visual evoked responses elicited during the cognitive task. The number of trials varied across participants, as the task was terminated at the end of the allocated scheduled timeslot and there was variation across individuals in the time taken to apply the electrode cap. The length of the interspersed breaks between the segments of the flanker task was increased for the children. For the adult sample, an average of 68 epochs were included in the individual averaged ERPs at each ISI (range 63-74), and for the child sample an average of 115 epochs were included in the individual averaged ERPs at each ISI (range 72-135).
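The trial structure described above might be sketched as follows. This is an illustration only: it assumes that each of the seven conditions occurs once per block in shuffled order, that the 2 s inter-trial interval corresponds to a fixed 2000 ms trial length, and that the jitter is drawn from a uniform distribution. None of these details is stated explicitly, and the names are hypothetical.

```python
import random

# Seven stimulus conditions: the single tone plus six tone-pair ISIs (ms).
CONDITIONS = ("single", 25, 50, 100, 200, 400, 800)
TRIAL_MS = 2000          # inter-trial interval given in the text
JITTER_MS = (300, 800)   # range for first-tone onset within a trial

def make_block(rng):
    """Return one block of trials with randomised order and onset jitter."""
    order = list(CONDITIONS)
    rng.shuffle(order)  # assumption: one trial per condition per block
    trials = []
    for k, condition in enumerate(order):
        jitter = rng.uniform(*JITTER_MS)  # assumption: uniform jitter
        trials.append({
            "condition": condition,
            "trial_start_ms": k * TRIAL_MS,
            "jitter_ms": jitter,
            "first_tone_ms": k * TRIAL_MS + jitter,
        })
    return trials

block = make_block(random.Random(0))  # seeded for reproducibility
```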
The protocol was approved by the University of Western Australia Human Research Ethics Committee (RA/4/1/1436).
EEG acquisition and analyses
The electroencephalogram (EEG) was recorded continuously from 33 scalp locations using an electrode cap (EasyCap, Montage 40, excluding TP9 and TP10). Electrodes were also placed above and below the left eye, and on the left and right mastoids with an averaged mastoid reference digitally computed online. The ground was located at site AFz. Data were amplified with a NuAmps 40-channel amplifier, and digitized at a sampling rate of 250 Hz. The analog signal was filtered online with a low pass 70 Hz filter and 50 Hz notch filter, and digitally filtered off-line with a 2-30 Hz, zero phase shift band-pass filter (12 dB down). The 2 Hz high-pass filter was applied to reduce overlap from slow responses to the first tone, and is within the range of high-pass filters applied in much of the literature cited (1 Hz to 5 Hz). Amplitude modulations of slow ERP components (e.g. N2, P3) cannot be sensibly examined in the current data set, and no specific hypotheses about the effects of ISI on these components had been made. Ocular artifact reduction was performed on the continuous EEG using the algorithm developed by Semlitsch et al., with regression-based subtraction of the averaged blink artifact identified in the bipolar VEOG channel. Epochs encompassing an interval from 50 ms prior to the onset of the first tone in the pair to 1200 ms post-stimulus were extracted, and trials contaminated by artifact exceeding 150 μV were rejected from the individual subject ERP averages.
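The epoch extraction and rejection step can be sketched for a single channel as below. This is not the acquisition software's procedure: it assumes the 150 μV criterion applies to absolute amplitude at any sample in the epoch, and it truncates the 50 ms pre-stimulus interval to 12 samples because 50 ms is 12.5 samples at 250 Hz. All function names are hypothetical.

```python
FS_HZ = 250          # sampling rate given in the text
PRE_MS = 50          # epoch start relative to first-tone onset
POST_MS = 1200       # epoch end relative to first-tone onset
REJECT_UV = 150.0    # rejection threshold in microvolts

def ms_to_samples(ms, fs=FS_HZ):
    """Convert milliseconds to whole samples (truncating)."""
    return ms * fs // 1000

def extract_epochs(eeg_uv, onsets, fs=FS_HZ):
    """Cut epochs around each onset index and drop contaminated ones.

    eeg_uv: samples in microvolts for one channel.
    onsets: sample indices of first-tone onsets.
    """
    pre, post = ms_to_samples(PRE_MS, fs), ms_to_samples(POST_MS, fs)
    kept = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(eeg_uv):
            continue  # epoch runs past the recording
        epoch = eeg_uv[start:stop]
        # Assumption: reject on absolute amplitude, not peak-to-peak.
        if max(abs(v) for v in epoch) > REJECT_UV:
            continue
        kept.append(epoch)
    return kept

def average(epochs):
    """Point-by-point mean across epochs (the individual ERP)."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]
```

With these settings each epoch holds 312 samples, with the first-tone onset at index 12.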
Reliability of the individual auditory ERPs was assessed by means of the intra-class correlation coefficient (ICC), computed over the interval from stimulus onset to 400 ms post-stimulus, between the individual's auditory evoked response to the single tone and the first tone of the tone pair presented at 800 ms ISI. One would expect these two conditions to give identical ERPs to the first tone and so they can be used to assess reliability of an individual's data.
Two methods for analysis of the ERP waveforms elicited by the tone pairs were used. In the first set of analyses, the similarity of the evoked responses to the tone pairs presented at increasing ISIs was compared by computing the ICC between the individual's single tone ERP and the ERP elicited to the tone pairs presented at increasing ISIs, consistent with the approach reported by Bishop and McArthur. The ICC provides a measure of the similarity of the waveform amplitude and shape, with higher values representing greater similarity between the ERPs elicited to the different tone-pair conditions. ICC should be low for conditions where two distinct neural responses were represented in the ERP. The ICC is a summary statistic, similar to the Pearson correlation coefficient, sensitive to differences in the magnitude of the values as well as to differences in the pattern of the numerical arrays (i.e. waveform shape). For the present analyses, the ICC between the two arrays, X and Y, containing N pairs of data points was computed as
\[ \mathrm{ICC} = \frac{MS_{\mathrm{between}} - MS_{\mathrm{within}}}{MS_{\mathrm{between}} + MS_{\mathrm{within}}} \]
where
\[ MS_{\mathrm{between}} = \frac{\dfrac{\sum X^{2} + \sum Y^{2} + 2\sum XY}{2} - \dfrac{\left(\sum X + \sum Y\right)^{2}}{2N}}{N - 1} \]
and
\[ MS_{\mathrm{within}} = \frac{\tfrac{1}{2}\left(\sum X^{2} + \sum Y^{2}\right) - \sum XY}{N} \]
(Bishop et al., 2005). ICCs were computed over a 0-400 ms interval and normalised by applying Fisher's z-transformation. Repeated measures ANOVA was used to assess the statistical significance of the ICC modulation across the five ISI conditions (25, 50, 100, 200, and 400 ms). The ICC at each ISI was compared with the ICC of the 400 ms tone pair condition, as any response to the second tone would not have been elicited within the 0-400 ms epoch. Given the predominance of the early negative deflection (N1) in the adult waveform in contrast to the P1-N2 complex evident in the child waveform, statistical analyses were conducted within each age group.
Post-hoc analyses were conducted by comparing the mean ICC at each ISI with the 400 ms ISI condition, and the overall family error rate adjusted for the number of post-hoc comparisons conducted.
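To make the ICC computation concrete, the following sketch implements a one-way ICC for two waveforms sampled at the same N time points, followed by Fisher's z-transformation. It assumes that the correction term in the between-points mean square is the usual one-way ANOVA form, (ΣX + ΣY)²/(2N); the function names are hypothetical and this is not the authors' code.

```python
import math

def icc(x, y):
    """One-way intra-class correlation between two equal-length waveforms."""
    n = len(x)
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    total = sum(x) + sum(y)
    # Assumed correction term: (sum X + sum Y)^2 / (2N).
    ms_between = ((sxx + syy + 2 * sxy) / 2 - total ** 2 / (2 * n)) / (n - 1)
    ms_within = (0.5 * (sxx + syy) - sxy) / n
    return (ms_between - ms_within) / (ms_between + ms_within)

def fisher_z(r):
    """Fisher's z-transformation; undefined for r = 1 or r = -1."""
    return math.atanh(r)

same = icc([1, 2, 3], [1, 2, 3])      # identical waveforms
shifted = icc([1, 2, 3], [3, 4, 5])   # same shape, different magnitude
```

The second example shows why the ICC differs from the Pearson coefficient: two waveforms with identical shape but a constant amplitude offset correlate perfectly under Pearson's r, yet here the ICC falls to zero.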
In the second set of analyses, responses to the second tone of the pair were examined by correcting for response overlap, consistent with the approach reported by Sable et al. Specifically, the ERP waveform elicited following presentation of the single tone condition was subtracted from each of the paired-tone ERP averaged waveforms. The resultant corrected ERP waveform was baseline adjusted around the 50 ms preceding the onset of the second tone, and the onset of the second tone in the pair aligned as the zero time-point. To assist in the identification and quantification of the N1 component structure [14, 25], temporal principal components analysis (PCA) based on the covariance matrix, with Varimax rotation of the factors [39, 40], was conducted on the individual ERP averaged data, computed for each ISI condition at the midline and lateral sites (Fz, Cz, Pz, T7, T8). The PCA was used to identify the latency ranges over which to quantify the ERP mean amplitudes, which were calculated relative to the mean amplitude of the 50 ms pre-stimulus epoch. To assess the reliability of the PCA components extracted, the Mahalanobis distance of each case was determined, and cases identified as multivariate outliers with respect to the solutions were removed from the dataset. Of the 525 cases entered into the PCA for the adult sample (5 scalp sites × 7 conditions × 15 participants), nine cases were identified as multivariate outliers, and excluded for the determination of the latency intervals over which the mean amplitudes were subsequently measured. Of the 980 cases entered into the PCA for the child sample, 24 were identified as multivariate outliers and similarly excluded from the PCA. Repeated measures ANOVAs were used to assess the statistical significance of the ERP mean amplitude modulation across the seven ISI conditions (single tone, 25, 50, 100, 200, 400, and 800 ms) and post-hoc analyses were conducted by comparing the amplitudes at each ISI with the amplitude elicited by the single tone.
This comparison allows examination of the independent effects of facilitation and attenuation, without the confound of response overlap, which affects only the short ISI conditions.
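The overlap correction described above can be sketched as a subtraction followed by re-baselining and a mean-amplitude measurement. The sketch assumes both averaged waveforms start 50 ms before first-tone onset and share a sampling rate. The caller supplies the second-tone onset time, so no particular ISI definition is built in. The measurement window in the usage example is hypothetical, since the actual latency ranges were derived from the PCA.

```python
PRE_MS = 50  # both ERPs are assumed to start 50 ms before first-tone onset

def second_tone_amplitude(pair_erp, single_erp, second_onset_ms,
                          window_ms, fs=250):
    """Mean amplitude of the overlap-corrected response to the second tone.

    second_onset_ms: second-tone onset relative to first-tone onset.
    window_ms: (start, end) latency range relative to second-tone onset.
    """
    # Remove the response to the first tone by subtracting the single-tone ERP.
    diff = [p - s for p, s in zip(pair_erp, single_erp)]

    def index(ms):
        # Time relative to first-tone onset -> sample index (truncating).
        return (ms + PRE_MS) * fs // 1000

    # Baseline: the 50 ms preceding the onset of the second tone.
    base = diff[index(second_onset_ms - 50):index(second_onset_ms)]
    baseline = sum(base) / len(base)
    lo, hi = window_ms
    segment = diff[index(second_onset_ms + lo):index(second_onset_ms + hi)]
    return sum(segment) / len(segment) - baseline

# Toy usage at 1000 Hz: a constant 1 uV offset plus a deflection of
# -5 uV relative to that offset, 80-120 ms after the second tone.
single = [0.0] * 1250
pair = [1.0] * 1250
pair[230:270] = [-4.0] * 40
amp = second_tone_amplitude(pair, single, 100, (80, 120), fs=1000)
```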
The distributions of variables within each condition and age group were inspected for skewness and kurtosis, and univariate outliers (z > 3.29). Data from two children were excluded from the analyses. The overall family error rate was adjusted for the number of post-hoc comparisons conducted. To account for violations of sphericity in analyses where there were greater than two levels of the repeated measures factor, probability levels based on the Greenhouse-Geisser adjustment to the degrees of freedom are reported [41–43], together with the uncorrected degrees of freedom and epsilon.
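The screening and sphericity steps above can be illustrated with two small functions: one flags univariate outliers with |z| > 3.29, and one computes the Greenhouse-Geisser epsilon from the double-centred covariance matrix of the repeated measures. These follow the standard textbook definitions, not the statistics package used in the study. Using the sample standard deviation (n - 1) for the z-scores is an assumption, and the names are hypothetical.

```python
import math

def flag_outliers(values, z_crit=3.29):
    """Return indices of values whose |z| exceeds z_crit."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > z_crit]

def gg_epsilon(data):
    """Greenhouse-Geisser epsilon; rows = participants, columns = levels."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data)
            / (n - 1) for j in range(k)] for i in range(k)]
    # Double-centre the covariance matrix.
    row_mean = [sum(cov[i]) / k for i in range(k)]
    grand = sum(row_mean) / k
    dc = [[cov[i][j] - row_mean[i] - row_mean[j] + grand for j in range(k)]
          for i in range(k)]
    trace = sum(dc[i][i] for i in range(k))
    sum_sq = sum(v * v for r in dc for v in r)
    return trace ** 2 / ((k - 1) * sum_sq)

def corrected_df(epsilon, n, k):
    """Adjusted degrees of freedom for a one-way repeated measures ANOVA."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)
```

Epsilon lies between 1/(k - 1) and 1. A value of 1 indicates no departure from sphericity, so the degrees of freedom are left unchanged.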
Eggermont JJ, Ponton CW: Auditory-evoked potential studies of cortical maturation in normal hearing and implanted children: Correlations with changes in structure and speech perception. Acta Otolaryngol. 2003, 123: 249-252. 10.1080/0036554021000028098.
Bishop DVM: Uncommon Understanding: Development and Disorders of Language Comprehension in Children. 1997, Hove: Psychology Press
Müller D, Widmann A, Schröger E: Auditory streaming affects the processing of successive deviant and standard sounds. Psychophysiology. 2005, 42: 668-676. 10.1111/j.1469-8986.2005.00355.x.
Trehub SE, Schneider BA, Henderson JL: Gap detection in infants, children, and adults. J Acoust Soc Am. 1995, 98: 2532-2541. 10.1121/1.414396.
Smith NA, Trainor LJ, Shore DI: The development of temporal resolution: Between-channel gap detection in infants and adults. J Speech Lang Hear Res. 2006, 49: 1104-1113. 10.1044/1092-4388(2006/079).
Walker KMM, Hall SE, Klein RM, Phillips DP: Development of perceptual correlates of reading performance. Brain Res. 2006, 1124: 126-141. 10.1016/j.brainres.2006.09.080.
Dawes P, Bishop DVM: Maturation of visual and auditory temporal processing in school-aged children. J Speech Lang Hear Res. 2008, 51: 1002-1015. 10.1044/1092-4388(2008/073).
Hartley DEH, Wright BA, Hogan SC, Moore DR: Age-related improvements in auditory backward and simultaneous masking in 6- to 10-year-old children. J Speech Lang Hear Res. 2000, 43: 1402-1415.
Albrecht R, Suchodoletz Wv, Uwer R: The development of auditory evoked dipole source activity from childhood to adulthood. Clin Neurophysiol. 2000, 111: 2268-2276. 10.1016/S1388-2457(00)00464-8.
Ponton C, Eggermont JJ, Khosla D, Kwong B, Don M: Maturation of human central auditory system activity: Separating auditory evoked potentials by dipole source modeling. Clin Neurophysiol. 2002, 113: 407-420. 10.1016/S1388-2457(01)00733-7.
Sussman E, Steinschneider M, Gumenyuk V, Grushko J, Lawson K: The maturation of human evoked brain potentials to sounds presented at different stimulus rates. Hear Res. 2008, 236: 61-79. 10.1016/j.heares.2007.12.001.
Coch D, Skendzel W, Neville HJ: Auditory and visual refractory period effects in children and adults: An ERP study. Clin Neurophysiol. 2005, 116: 2184-2203. 10.1016/j.clinph.2005.06.005.
Čeponienė R, Cheour M, Näätänen R: Interstimulus interval and auditory event-related potentials in children: Evidence for multiple generators. Electroencephalogr Clin Neurophysiol. 1998, 108: 345-354. 10.1016/S0168-5597(97)00081-6.
Näätänen R, Picton T: The N1 wave of the human electric and magnetic response to sound: A review and an analysis of the component structure. Psychophysiology. 1987, 24: 375-425. 10.1111/j.1469-8986.1987.tb00311.x.
Wang W, Datta H, Sussman E: The development of the length of the temporal window of integration for rapidly presented auditory information as indexed by MMN. Clin Neurophysiol. 2005, 116: 1695-1706. 10.1016/j.clinph.2005.03.008.
Sable JJ, Low KA, Maclin EL, Fabiani M, Gratton G: Latent inhibition mediates N1 attenuation to repeating sounds. Psychophysiology. 2004, 41: 636-642. 10.1111/j.1469-8986.2004.00192.x.
Loveless N, Levänen S, Jousmäki V, Sams M, Hari R: Temporal integration in auditory sensory memory: Neuromagnetic evidence. Electroencephalogr Clin Neurophysiol. 1996, 100: 220-228. 10.1016/0168-5597(95)00271-5.
McEvoy L, Levänen S, Loveless N: Temporal characteristics of auditory sensory memory: Neuromagnetic evidence. Psychophysiology. 1997, 34: 308-316. 10.1111/j.1469-8986.1997.tb02401.x.
Bishop DVM, McArthur GM: Immature cortical responses to auditory stimuli in specific language impairment: Evidence from ERPs to rapid tone sequences. Dev Sci. 2004, 7: F11-F18. 10.1111/j.1467-7687.2004.00356.x.
Bishop DVM, McArthur GM: Individual differences in auditory processing in specific language impairment: A follow-up study using event-related potentials and behavioural thresholds. Cortex. 2005, 41: 327-341. 10.1016/S0010-9452(08)70270-3.
Guthrie D, Buchwald JS: Significance testing of difference potentials. Psychophysiology. 1991, 28: 240-244. 10.1111/j.1469-8986.1991.tb00417.x.
Picton TW, Bentin S, Berg P, Donchin E, Hillyard SA, Johnson R, Miller GA, Ritter W, Ruchkin DS, Rugg MD, Taylor MJ: Guidelines for using human event-related potentials to study cognition: Recording standards and publication criteria. Psychophysiology. 2000, 37: 127-152. 10.1017/S0048577200000305.
Bruneau N, Roux S, Guerin P, Barthelemy C, Lelord G: Temporal prominence of auditory evoked potentials (N1 wave) in 4-8-year-old children. Psychophysiology. 1997, 34: 32-38. 10.1111/j.1469-8986.1997.tb02413.x.
Pang EW, Taylor MJ: Tracking the development of the N1 from age 3 to adulthood: An examination of speech and non-speech stimuli. Clin Neurophysiol. 2000, 111: 388-397. 10.1016/S1388-2457(99)00259-X.
Woods DL: The component structure of the N1 wave of the human auditory evoked potential. Electroencephalogr Clin Neurophysiol. 1995, 44: 102-109.
Hautus MJ, Setchell GJ, Waldie KE, Kirk IJ: Age-related improvements in auditory temporal resolution in reading-impaired children. Dyslexia. 2003, 9: 37-45. 10.1002/dys.234.
Irwin RJ, Ball AKR, Kay N, Stillman JA, Rosser J: The development of auditory temporal acuity in children. Child Dev. 1985, 56: 614-620. 10.2307/1129751.
Lister JJ, Maxfield ND, Pitt GJ: Cortical evoked response to gaps in noise: Within-channel and across-channel conditions. Ear Hear. 2007, 28: 862-878. 10.1097/AUD.0b013e3181576cba.
Griffiths TD, Warren JD: What is an auditory object?. Nat Rev Neurosci. 2004, 5: 887-892. 10.1038/nrn1538.
Krumbholz K, Patterson RD, Nobbe A, Fastl H: Microsecond temporal resolution in monaural hearing without spectral cues?. J Acoust Soc Am. 2003, 113: 2790-2800. 10.1121/1.1547438.
Sussman E, Steinschneider M: Attention effects on auditory scene analysis in children. Neuropsychologia. 2009, 47: 771-785. 10.1016/j.neuropsychologia.2008.12.007.
Okamoto H, Stracke H, Draganova R, Pantev C: Hemispheric asymmetry of auditory evoked fields elicited by spectral versus temporal stimulus change. Cereb Cortex. 2009, 19: 2290-2297. 10.1093/cercor/bhn245.
Snyder JS, Alain C, Picton TW: Effects of attention on neuroelectric correlates of auditory stream segregation. J Cogn Neurosci. 2006, 18: 1-13. 10.1162/089892906775250021.
Dinces E, Sussman E: Processing intensity at rapid rates: Evidence from auditory evoked potentials in 9-11-year-old children. Int J Pediatr Otorhinolaryngol. 2008, 72: 1317-1322. 10.1016/j.ijporl.2008.05.005.
Başar E, Başar-Eroglu C, Parnefjord R, Rahn E, Schürmann M: Evoked potentials: Ensembles of brain induced rhythmicities in the alpha, theta and gamma ranges. Induced rhythms in the brain. Edited by: Başar E, Bullock TH. 1992, Boston, MA: Birkhauser, 155-181.
Wang A, Mouraux A, Liang M, Iannetti G: The enhancement of the N1 wave elicited by sensory stimuli presented at very short inter-stimulus intervals is a general feature across sensory systems. PLoS One. 2008, 3: e3929. 10.1371/journal.pone.0003929.
Woldorff MG: Distortion of ERP averages due to overlap from temporally adjacent ERPs: Analysis and correction. Psychophysiology. 1993, 30: 98-119.
Semlitsch HV, Anderer P, Schuster P, Presslich O: A solution for reliable and valid reduction of ocular artifacts, applied to the P300 ERP. Psychophysiology. 1986, 23: 695-703. 10.1111/j.1469-8986.1986.tb00696.x.
Kayser J, Tenke CE: Optimizing PCA methodology for ERP component identification and measurement: Theoretical rationale and empirical evaluation. Clin Neurophysiol. 2003, 114: 2307-2325. 10.1016/S1388-2457(03)00241-4.
Kayser J, Tenke CE: Editorial: Trusting in or breaking with convention: Towards a renaissance of principal components analysis in electrophysiology. Clin Neurophysiol. 2005, 116: 1747-1753. 10.1016/j.clinph.2005.03.020.
Jennings JR: Editorial policy on analyses of variance with repeated measures. Psychophysiology. 1987, 24: 474-475. 10.1111/j.1469-8986.1987.tb00320.x.
Keselman HJ: Testing treatment effects in repeated measures designs: An update for psychophysiological researchers. Psychophysiology. 1998, 35: 470-478. 10.1017/S0048577298000237.
Vasey MW, Thayer JF: The continuing problem of false positives in repeated measures ANOVA in psychophysiology: A multivariate solution. Psychophysiology. 1987, 24: 479-486. 10.1111/j.1469-8986.1987.tb00324.x.
Funding support provided by Australian Research Council DP0665616 and the School of Psychology, University of Western Australia. DVMB is supported by a Principal Research Fellowship from the Wellcome Trust. Thanks to John Love for technical support, Aoibheann O'Brien and Catherine Campbell for co-ordination of Project K.I.D.S, and to Lucy Cragg, Claire Nulsen, and Pia Van Beek for assistance with data collection.
AMF, MA, CR and DVMB developed the experimental protocols, AMF and TS collected and averaged the ERP data; AMF and DVMB analysed the data and prepared the manuscript. All authors reviewed and approved the final manuscript.
Cite this article
Fox, A.M., Anderson, M., Reid, C. et al. Maturation of auditory temporal integration and inhibition assessed with event-related potentials (ERPs). BMC Neurosci 11, 49 (2010). https://doi.org/10.1186/1471-2202-11-49
- Single Tone
- Principal Component Analysis Component
- Short ISIs
- Tone Pair
- Long ISIs