Test subjects
Eleven right-handed subjects (5 females) aged between 22 and 30 years took part in this study. None of these test subjects had a history of otological or neurological disorders. Normal audiological status, defined as an air conduction hearing level threshold of less than 10 dB in the frequency range between 250 and 4000 Hz, was verified by means of pure tone audiometry. Participants gave written informed consent to participation in the study in accordance with procedures approved by the Ethics Commission of the University of Muenster.
Experimental design
Common experimental design
The study was composed of two separate parts, neurophysiological and behavioral measurements as illustrated in the graph in Figure 1AC. The acoustic-stimuli paradigms are demonstrated in Figure 1B for the MEG measurements and in Figure 1D for the behavioral measurements.
For all participants, a baseline MEG session was carried out prior to the first behavioral discrimination test (bef; Figure 1AC). Thereafter they carried out training sessions on five consecutive days (from Monday to Friday). Immediately after the last training session, a second behavioral discrimination test and MEG session were performed (aft). One week (1 w) and one month (1 m) after the training, a third and a fourth discrimination test and MEG session were carried out, as illustrated in Figure 1AC.
Stimuli
The stimuli were amplitude-modulated (AM) sinusoidal tones (duration 400 ms, rise and decay time 12.8 ms, 100% modulation depth). An AM tone with a carrier frequency (Fc) of 500 Hz and a modulation frequency (Fm) of 39 Hz was used as the standard stimulus in all sessions. The waveform and the power spectrum of the standard tone are illustrated in Figure 1B. The power spectrum is characterized by a peak at Fc and two side-band frequencies (Fc-Fm and Fc+Fm). An AM tone is perceived as a pitch at the carrier frequency fluctuating at the modulation frequency. Fluctuations at modulation frequencies around 40 Hz elicit a perception of roughness. The corresponding cortical steady-state response (SSR) to an AM tone is generated at the modulation frequency and is recorded from the primary auditory cortex [24]. Such a tone also elicits a transient response recorded from the non-primary auditory cortex [25].
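The spectral structure of the standard tone described above can be checked with a short sketch. The sampling rate (`fs`), the linear shape of the rise and decay ramps, and the phases of carrier and modulator are assumptions made for illustration; the text does not specify them. The helper names `am_tone` and `magnitude_at` are hypothetical.

```python
import math

def am_tone(fc=500.0, fm=39.0, dur=0.4, ramp=0.0128, depth=1.0, fs=8000):
    """Return samples of an amplitude-modulated tone with linear on/off ramps.

    fs, the linear ramp shape and the signal phases are assumptions.
    """
    n = round(dur * fs)
    n_ramp = round(ramp * fs)
    samples = []
    for i in range(n):
        t = i / fs
        # 100% modulation depth: the envelope swings between 0 and 2.
        envelope = 1.0 + depth * math.cos(2 * math.pi * fm * t)
        # Linear rise at the start and linear decay at the end of the tone.
        gate = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gate * envelope * math.sin(2 * math.pi * fc * t))
    return samples

def magnitude_at(samples, freq, fs=8000):
    """Magnitude of the single-frequency Fourier component at `freq` (Hz)."""
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

tone = am_tone()
for f in (461.0, 500.0, 539.0):  # Fc-Fm, Fc, Fc+Fm
    print(f, round(magnitude_at(tone, f), 2))
```

With full modulation depth, each side band carries about half the amplitude of the carrier, which is the three-peak pattern described for the power spectrum.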
Acoustical paradigm in MEG measurements
The stimuli were presented as standards randomly intermixed with deviants in an oddball sequence (probability of the deviants 20%) as demonstrated in Figure 1B. The sound intensity of all stimuli was equalized to 60 dB above individual sensation level. The inter-stimulus interval from onset to onset, the so-called sound onset asynchrony (SOA), was randomly assigned from 900, 1000 and 1100 ms.
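A minimal sketch of such an oddball sequence follows. The block length of 500 trials is taken from the approximate counts of 100 deviants and 400 standards per sub-condition reported in the MEG analysis; the function name, the fixed seed, and the absence of any constraint on consecutive deviants are assumptions.

```python
import random

def oddball_sequence(n_trials=500, p_deviant=0.2, soas_ms=(900, 1000, 1100), seed=0):
    """Return a list of (onset_ms, label) tuples for one block.

    Deviants make up exactly p_deviant of the trials and are shuffled among
    the standards; each SOA is drawn at random from soas_ms.
    """
    rng = random.Random(seed)
    n_dev = round(n_trials * p_deviant)
    labels = ["deviant"] * n_dev + ["standard"] * (n_trials - n_dev)
    rng.shuffle(labels)
    onset = 0
    trials = []
    for label in labels:
        trials.append((onset, label))
        onset += rng.choice(soas_ms)  # onset-to-onset interval to the next tone
    return trials

block = oddball_sequence()
print(sum(label == "deviant" for _, label in block), "deviants in", len(block), "trials")
```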
Each MEG session contained two spectral conditions (SC) and three temporal conditions (TC). They were performed in random order in five separate blocks. In each block, we presented only one deviant, as shown in Figure 1B, which belonged to either the SC or the TC. In the SC, the two deviant tones differed from the standard in carrier frequency (515 Hz (SC15) and 525 Hz (SC25)), whereas the modulation frequency was identical for the standard and deviant tones. In order to investigate the effects caused by different modulation rates, three TC were designed. Here the deviant tones had a fixed carrier frequency of 500 Hz and differed from the standard in modulation frequency by 8 Hz (TC8), 14 Hz (TC14) and 18 Hz (TC18).
Acoustical paradigm for the Discrimination Training
The behavioral training and the corresponding discrimination tests were run with a software package developed especially for this purpose (based on Presentation, Neurobehavioral Systems Inc., and MATLAB) on a standard personal computer.
Subjects performed discrimination training for 5 days, preceded and followed by a behavioral discrimination test. The discrimination training was divided into two blocks (spectral and temporal conditions) performed in one session. The two blocks were presented in random order across the days. Each training block lasted half an hour. Between the two training blocks, the subject had a break of 10 min. The training block for each condition was limited to 300 tone pairs, presented binaurally. Standard-Standard (S-S) pairs, presented with a probability of 70%, contained two AM standard tones with a duration of 200 ms and a 600 ms interval between stimulus onsets, as shown in Figure 1D. The Standard-Deviant (S-D) combination, consisting of one standard and one deviant tone, was presented with a probability of 30%; the interval between two trains was 2400 ms (Figure 1D). Ten pairs were presented at the beginning of each training block as a pre-test to adapt the subjects to the procedure and were excluded from further analysis. In each training block, five S-S pairs appeared prior to the first S-D pair. The subjects were instructed to concentrate on perceivable differences between the tones in a pair, starting with a carrier frequency difference of 25 Hz in the SC and a modulation frequency difference of 74 Hz in the TC. They were asked to press the right mouse button with the right hand when they perceived a difference and the left button when they did not notice a difference. A staircase two-alternative forced-choice procedure with a two-down, one-up adaptive rule was used to adjust the deviant stimuli during the training procedure [26]. The adjustment of the deviants followed the rules of the psychophysical method of limits for discrimination threshold detection [16, 27, 28]. The deviant frequency was changed exponentially depending on whether the response was correct (difference decreased) or incorrect (difference increased), according to the formula:
where S is the standard carrier or modulation frequency in the SC or TC, respectively, Δf is the (carrier or modulation) frequency difference between standard and deviant, Δf0 is the preceding difference and f is a factor that determines the step width of the learning curve. In this test, f was set to 0.05. The smallest frequency difference in S-D pairs was 0.96 Hz for the SC and 1.97 Hz for the TC. Within the training blocks the subjects received information about the correctness of their responses in the form of visual feedback on a monitor situated in front of them (green square for a correct response, red square for a wrong one), which appeared for 300 ms after the mouse button was pressed (Figure 1D).
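The two-down, one-up rule can be illustrated with a short sketch. The update formula itself is not reproduced in the text above, so the multiplicative step used here (Δf0 multiplied by 1 - f or 1 + f) is only an assumed form that yields an exponential change with step factor f = 0.05; it should not be read as the formula used in the study. The function name and the restriction to responses on S-D pairs are also assumptions.

```python
def run_staircase(responses, start_delta, f=0.05, min_delta=0.0):
    """Track the standard-deviant difference (Hz) over a list of responses.

    responses: booleans, True for a correct response to an S-D pair.
    Two-down, one-up: two consecutive correct responses shrink the
    difference, a single incorrect response enlarges it.
    The multiplicative update is an ASSUMED form, not the study's formula.
    """
    delta = start_delta
    streak = 0
    track = [delta]
    for correct in responses:
        if correct:
            streak += 1
            if streak == 2:  # two correct in a row: make the task harder
                delta = max(min_delta, delta * (1 - f))
                streak = 0
        else:  # one error: make the task easier
            delta = delta * (1 + f)
            streak = 0
        track.append(delta)
    return track

# Hypothetical run in the spectral condition, starting at 25 Hz.
print(run_staircase([True, True, True, False, True, True], start_delta=25.0))
```

A two-down, one-up track of this kind converges on the stimulus difference at which about 70.7% of the responses are correct.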
Four different responses were recorded in the behavioral data analysis. For an S-D pair, a Hit was registered when the difference between the two tones was recognized and a Miss when it was not. Correspondingly, for an S-S pair a Correct Rejection was registered when the pair was correctly recognized as identical and a False Alarm when a difference was wrongly reported (Figure 1D).
Acoustical paradigm of the behavioral Discrimination test
The discrimination test was performed once before the training session, then immediately after the training and in each post-training phase (after one week and after one month), as shown in Figure 1C. During the behavioral discrimination test, the presentation of the stimuli was the same as in the discrimination training, but without visual feedback. For each condition, 200 stimulus pairs were presented. The S-D pairs were randomly intermixed with 50% probability with the S-S pairs. The carrier frequency difference in the SC was either 2, 3, 5, 10, 15, 20, 25, 30, 40, or 50 Hz. The modulation frequency difference in the TC was set either at 2, 4, 6, 8, 14, 18, 22, 27, 32, or 37 Hz.
The subjects' responses (Hit, Miss, Correct Rejection and False Alarm) were collected after one correct or wrong mouse button press. Each discrimination block (SC or TC), containing 10 S-D pairs, was presented randomly in the oddball sequence without adjustment of the deviant tones (method of constants) [27, 28].
MEG Data Acquisition
The MEG recording was performed in a quiet magnetically shielded room using a 275-channel whole-head neuromagnetometer system (Omega2005, VSM-Medtech, Port Coquitlam, BC, Canada). The magnetic field data were sampled at a rate of 300 Hz after low-pass filtering at 100 Hz. The subjects were seated comfortably in an upright position, watching a soundless movie.
Data Analysis
Analysis of Behavioral Data
In the training session, behavioral performance (P(H) = number of Hits divided by the number of stimulus presentations) was calculated for each frequency of the deviant tone in an S-D pair. The discrimination threshold corresponding to P(H) = 0.5 was calculated for each day and plotted on a graph in which the x-axis represents the day of the training session and the y-axis the threshold in Hz [27].
The same calculations were done for the discrimination test, except that the discrimination threshold was derived for P = 0.75. For this type of test we used a more sensitive measure of discrimination, and the behavioral performance was calculated according to the formula P = [P(H)-P(FA)]/[1-P(FA)], where P(FA) is the probability of a False Alarm response. The formula was adopted from [16].
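As a sketch of this calculation, the corrected performance can be computed per frequency difference and the threshold read off where it reaches 0.75. The text does not state how the threshold is derived from the discrete test differences, so the linear interpolation below is an assumption, as are the function names and the example values.

```python
def corrected_performance(p_hit, p_fa):
    """Hit rate corrected for false alarms: P = [P(H) - P(FA)] / [1 - P(FA)]."""
    if p_fa >= 1.0:
        raise ValueError("P(FA) must be below 1")
    return (p_hit - p_fa) / (1.0 - p_fa)

def threshold(differences_hz, performance, criterion=0.75):
    """Frequency difference at which performance first crosses `criterion`.

    Uses linear interpolation between neighbouring test differences
    (an ASSUMPTION). Returns None if no upward crossing is found.
    """
    points = list(zip(differences_hz, performance))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 < criterion <= p1:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None

# Hypothetical hit rates for four spectral differences and one false alarm rate.
diffs = [2, 3, 5, 10]
hits = [0.28, 0.60, 1.0, 1.0]
p_fa = 0.20
perf = [corrected_performance(h, p_fa) for h in hits]
print(perf, threshold(diffs, perf))
```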
Data analysis of MEG Data
The MEG analysis was performed with the CTF Software Package. Stimulus-related magnetic field data for deviant and standard tones, including pre- and post-stimulus intervals (-100 ms to 600 ms), were collected after rejecting artifact-contaminated epochs in which magnetic field changes larger than 3 pT occurred. For each sub-condition (spectral or temporal), approximately 100 deviants and 400 standards were averaged for further analysis. The magnetic field data were first filtered between 0.1 and 20 Hz in order to extract only the transient responses. A subtraction "deviant minus standard" was calculated for visual inspection of the individual data. Assuming the model of an equivalent current dipole (ECD) in a spherical volume conductor, a spatio-temporal dipole fit was performed in the latency range of the N1m component elicited by the standard tones. The interval used for the fit (~30 ms) was placed around the local maximum of the global field power derived from the filtered magnetic field data. For each subject, two ECDs (one in each hemisphere) were determined, defined by their dipole moment, orientation and spatial coordinates.
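The rejection, averaging and subtraction steps can be sketched as follows. This is not the CTF software's implementation: the criterion "field changes larger than 3 pT" is interpreted here as the peak-to-peak range within any single channel of an epoch, which is an assumption, and the data layout (nested lists of channels by time samples, in pT) and function names are hypothetical.

```python
def peak_to_peak(epoch):
    """Largest within-channel field change in one epoch (channels x samples)."""
    return max(max(ch) - min(ch) for ch in epoch)

def average_epochs(epochs, reject_pt=3.0):
    """Average all epochs whose peak-to-peak change does not exceed reject_pt (pT)."""
    kept = [ep for ep in epochs if peak_to_peak(ep) <= reject_pt]
    if not kept:
        raise ValueError("all epochs were rejected")
    n_ch, n_t = len(kept[0]), len(kept[0][0])
    return [[sum(ep[c][t] for ep in kept) / len(kept) for t in range(n_t)]
            for c in range(n_ch)]

def difference_wave(deviant_avg, standard_avg):
    """Deviant minus standard, channel by channel and sample by sample."""
    return [[d - s for d, s in zip(dch, sch)]
            for dch, sch in zip(deviant_avg, standard_avg)]

# Hypothetical one-channel data: the third deviant epoch exceeds 3 pT and is dropped.
deviants = [[[0.0, 1.0, 0.0]], [[0.0, 2.0, 0.0]], [[0.0, 10.0, 0.0]]]
standards = [[[0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0]]]
print(difference_wave(average_epochs(deviants), average_epochs(standards)))
```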
The dipole location was determined in a head-based Cartesian coordinate system with the origin at the midpoint of the medio-lateral axis (y-axis), which joined the center points of the entrances to the ear canals (positive toward the left ear). The posterior-anterior axis (x-axis) was oriented from the origin to the nasion (positive toward the nasion), and the inferior-superior axis (z-axis) was perpendicular to the x-y plane (positive toward the vertex). Source locations fulfilling the following anatomical criteria characterizing the human auditory cortex area were included for further analysis: anterior-posterior value (x) within ± 3 cm, and medial-lateral value (y, the distance from the mid-sagittal plane) greater than 2.5 cm. Additionally, a goodness of fit larger than 85% was required for the source estimations. Median values of the x, y, and z coordinates of the ECDs and of the angles of the dipole orientation were calculated across all blocks, separately for SC and TC. The individual mean values of the source coordinates and orientations of the N1m source were averaged across all subjects in each condition. The average values were used to fix the dipole position and orientation in order to apply the source space projection method [29]. This procedure collapsed the data from the 275 MEG sensors into two source waveforms representing the activity of the sources in the left and right hemisphere, derived for all responses to deviant and to standard tones. These time series reach a maximum only for a typical dipolar magnetic field pattern of a single current source in an a priori specified brain region, and therefore this method is spatially sensitive. The source-space projection allows calculation of the grand averages of dipole moment time series across different subjects and measurement blocks, thereby enhancing the signal-to-noise ratio by canceling the uncorrelated system noise.
The method is maximally sensitive for brain activity from sources at selected origins and orientations. Other neuronal activity from more distant sources or sources having different orientation is combined less optimally and therefore the activity of these sources is reduced in the dipole moment waveforms.
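The spatial selectivity described above can be illustrated with a least-squares projection of the sensor data onto the field pattern of one fixed dipole. This is one common way to write such a projection and is not necessarily identical to the procedure in [29]. The vector `lead_field` is hypothetical: in practice it would be the forward field of the fixed ECD in the spherical volume conductor, with one value per MEG sensor.

```python
def source_waveform(sensor_data, lead_field):
    """Project sensor data (channels x samples) onto one fixed dipole pattern.

    Least-squares estimate of the dipole moment at each sample:
    q(t) = (L . b(t)) / (L . L), with L the lead field and b(t) the field map.
    """
    norm = sum(l * l for l in lead_field)
    n_times = len(sensor_data[0])
    return [sum(l * ch[t] for l, ch in zip(lead_field, sensor_data)) / norm
            for t in range(n_times)]

# Hypothetical 3-sensor example: a source with moment q(t) plus activity whose
# field pattern is orthogonal to the lead field of the selected source.
lead_field = [1.0, -2.0, 0.5]
other_pattern = [2.0, 1.0, 0.0]  # dot product with lead_field is zero
q = [0.0, 3.0, -1.0]
data = [[l * qt + 5.0 * o for qt in q] for l, o in zip(lead_field, other_pattern)]
print(source_waveform(data, lead_field))
```

Activity whose field pattern is orthogonal to the selected pattern is removed completely, and partially overlapping patterns are attenuated in proportion to their overlap.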
Subtracting the source waveform for the standards from that for the deviants (difference waveform) yielded the MMN response. A grand average across all subjects was calculated for the responses to the standard, the deviant and the subtracted waveform (MMN), and was further analyzed in terms of peak amplitudes, latencies and time courses.
Statistical analysis
Repeated-measures ANOVAs were conducted for the peak amplitudes and latencies of the MMN, N1 and P3a components within the time window from 100 ms to 350 ms after stimulus onset. The analysis of the auditory evoked responses included two factors. The first was "training conditions", consisting of four within-subject levels: (i) before training, (ii) immediately after training, (iii) one week after training and (iv) one month after training. The second factor, "hemispheres", contained two levels, right (RH) and left hemisphere (LH). The contrasts were evaluated by one-tailed t-tests and post hoc contrasts with the Least Significant Difference test. The comparison between spectral and temporal sub-conditions was evaluated by a repeated-measures ANOVA with two factors. The first, "conditions", contained five levels: (i) SC15, (ii) SC25, (iii) TC8, (iv) TC14 and (v) TC18. The second, "hemispheres", contained two levels, RH and LH.
The N1 component was measured from the response to the standard tone, the MMN response was obtained through the subtraction (deviant response - standard response), and the P3a component was measured from the response to the deviant tone. Data from subjects whose response time course did not contain the investigated components N1 (two subjects) or P3a (one subject) were rejected from the group analysis. The group averages of the source waveforms of the responses to the standard, deviant and subtracted waveforms were calculated across 11 subjects. The 99% confidence intervals for the grand-average subtracted waveforms were estimated by non-parametric bootstrap resampling in order to indicate the noise level.
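A minimal sketch of such a bootstrap is given below. It assumes a percentile bootstrap in which subjects are resampled with replacement and the confidence limits are taken at each time point; the number of resamples, the seed and the function name are assumptions, since the text does not give these details.

```python
import random

def bootstrap_ci(waveforms, n_boot=2000, level=0.99, seed=0):
    """Percentile bootstrap CI of the across-subject mean at each time point.

    waveforms: one list of samples per subject (e.g. difference waveforms).
    Returns (lower, upper), each a list with one value per time point.
    """
    rng = random.Random(seed)
    n_sub, n_t = len(waveforms), len(waveforms[0])
    boot_means = []
    for _ in range(n_boot):
        # Resample subjects with replacement and average the resampled set.
        sample = [waveforms[rng.randrange(n_sub)] for _ in range(n_sub)]
        boot_means.append([sum(w[t] for w in sample) / n_sub for t in range(n_t)])
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    lower, upper = [], []
    for t in range(n_t):
        column = sorted(m[t] for m in boot_means)
        lower.append(column[lo_idx])
        upper.append(column[hi_idx])
    return lower, upper

# Hypothetical data: 11 subjects, two time points.
subjects = [[float(i), float(-i)] for i in range(11)]
lower, upper = bootstrap_ci(subjects, n_boot=500)
print(lower, upper)
```

Time points at which the interval excludes zero indicate a deflection that exceeds the noise level of the grand average.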