
It's not what you say but the way that you say it: an fMRI study of differential lexical and non-lexical prosodic pitch processing



This study aims to identify the neural substrate involved in prosodic pitch processing. Functional magnetic resonance imaging was used to test the premise that prosodic pitch processing is primarily subserved by the right cerebral hemisphere.

Two experimental paradigms were used: firstly, pairs of spoken sentences in which the only variation was a single internal phrase pitch change; and secondly, a matched condition using pitch changes within analogous tone-sequence phrases, which removed the potential confounder of lexical evaluation. fMRI images were obtained using both paradigms.


Activation was significantly greater within the right frontal and temporal cortices during the tone-sequence stimuli relative to the sentence stimuli.


This study showed that pitch changes, stripped of lexical information, are mainly processed by the right cerebral hemisphere, whilst the processing of analogous, matched, lexical pitch change is preferentially left sided. These findings, showing hemispherical differentiation of processing based on stimulus complexity, are in accord with a 'task dependent' hypothesis of pitch processing.


Non-verbal components of language, included under the collective term prosody, play a central role in human communication [1]. First defined by Monrad-Krohn in 1947 [2], prosodic elements of speech can be subdivided into the two broad categories of linguistic and emotional prosody. Linguistic prosody conveys information about semantic meaning, such as pragmatic category - e.g. determining if a sentence is a statement, a question or a command - and syntactic relation - e.g. determining clause boundaries within sentences [3, 4]. Emotional prosody is the mechanism by which humans convey attitudes and emotions in speech. There has been debate about how clearly these two categories can be delineated.

Initial behavioural and lesion studies implicated both right [5-9] and left [10-12] hemispheric regions, findings likely confounded both by the inherent difficulties of comparing lesion studies [13, 14] and by the assessment of "global" prosodic function without consideration of its specific subcomponents.

PET data first suggested that prosodic content and judgement activated the prefrontal cortex bilaterally [15, 16], more so on the left, and hemispheric asymmetry has been demonstrated for most regions of activation [17]. Subsequent imaging studies have implicated right superior temporal regions [16, 18-21] - the most recent work suggesting particularly Brodmann's area 22 [22] - with additional, partially bilateral responses within the frontal cortex [18, 20, 23-25], the anterior insula [16, 23, 25], amygdalae [26], and the basal ganglia [27, 28]. Emotional speech produces greater cortical activation than prosodically neutral speech [16, 22, 29]. Electrophysiological work has supported neuroimaging findings that the right temporal cortex displays enhanced event-related potentials to emotional stimuli [30].

Variations in results, due in no small part to differing experimental paradigms, have failed to definitively clarify whether regional and hemispheric patterns of cerebral activation are specific to prosodic subcomponent analysis or to the functional demand of the task, known as the cue dependent [31, 32] and task dependent [17, 33-36] hypotheses respectively.

However, by far the majority of work has been on emotional prosody, and it is unclear how well such data can be applied to linguistic prosody, which in comparison has received little research attention. Furthermore, work on linguistic or semantic aspects of prosody has typically focused on psychometric measures of language conceptualisation and understanding [37, 38] rather than on the underlying neurobiology. Most authors have recognized the difficulties posed by the confounding influence of the lexical content of the stimuli and by the higher-level cognitive processes involved in the more global process of emotional prosody [39].

The neuroimaging data that exist for linguistic prosody typically favour hemispheric specialisation [40], with left fronto-temporal regions subserving 'simpler' short [41] syntactic and lexical segments of speech [42], and right hemispheric analogues processing larger suprasegmental elements at a sentence level [43], most in keeping with the task dependent hypothesis.

In light of this, this study set out to use fMRI to examine a single crucial element of linguistic prosodic comprehension: pitch change. We specifically looked at internal pitch changes, or "emphasis shift", as our earlier work suggested that these were more sensitive markers of subtle neurological deficits and less confounded by working memory primacy and recency phenomena [44]. As the name suggests, internal pitch changes occur within - as opposed to at the beginning or end of - a sentence. Furthermore, in an effort to eliminate the major confounder of lexical comprehension, following the work of Patel et al [45] we introduced an analogous tone-sequence paradigm that contained a delexicalised pitch pattern. By removing the lexical content while keeping the tone sequence otherwise matched, this design also allowed testing of the task dependent hypothesis: the same prosodic element, pitch, was tested at different levels, with the tone sequence requiring suprasegmental analysis.

We hypothesised that a) there are common cortical regions including bilateral prefrontal and temporal cortices associated with pitch processing in both speech and tone-sequence analogues; b) the more "pure" pitch processing associated with tone-sequence analogues would preferentially recruit right sided frontal and temporal cortices while more lexically loaded speech would preferentially recruit left temporal cortex; and c) increasing demands on prosodic comprehension would be associated with enhanced activation in the right frontal and temporal cortex.



Twelve subjects were recruited through advertisements in a city-wide newspaper. Inclusion criteria were: male sex, age between 18 and 55, right-handedness, and English as a first language. Exclusion criteria were: previous psychiatric or neurological illness, hearing or speech impairment, and illicit drug use in the previous six months. All subjects provided written informed consent. Mean age was 31 years (SD = 9.6). All subjects had completed secondary education; none had any formal training in playing musical instruments. The study was approved by the local ethics committee.

Stimuli and materials

A modified version of the tone-sequence and prosody discrimination task previously described by the authors [44] was used, based on the earlier protocol of Patel et al [45]. The recorded stimuli consisted of 12 lexically identical sentence pairs, spoken by an adult female native English speaker, and their non-verbal, tone-sequence analogue pairs; and 12 sentence and tone-sequence pairs that differed prosodically in internal pitch pattern on a single word or tone (e.g. two recordings of "I like blue ties on gentlemen" differing only in which internal word is emphasised). Tone-sequence stimuli were created by digitizing each sentence at 40,000 Hz, normalizing to the same amplitude, and converting it into a tone sequence that corresponded with the sentence's fundamental frequency in pitch and timing, with one level-pitch tone per syllable; a more detailed description is available in Patel et al [45]. An alternative method, low-pass filtering the sentence pairs to remove lexical information, was felt to be less satisfactory, as such filtering can leave residual phonological information, and previous studies [46] had validated the tone-sequence method.


Subjects were trained on the prosodic discrimination task, which consisted of six counterbalanced blocks. Each block was composed of twelve trials: four pairs of sentences, four pairs of tone-sequences, and four null trials (a silent period equal in length to four paired stimuli), presented in random order. Each trial consisted of a pair of stimuli separated by a one second interval. The paired stimuli differed in the pitch of an internal component in 50% of trials. As some sentences were longer than others, stimulus duration varied from 3432 to 6134 milliseconds, with a mean of 5036 ms. Following a visual cue at the end of each trial, subjects indicated whether the paired stimuli were the same or different with a button press using their right index finger. There was a variable intertrial interval of between 8.6 and 11.3 seconds before the onset of the next trial. Such a jittered design results in peristimulus distribution of MRI sampling, thus ensuring that all components of an event-related haemodynamic response are sampled, and avoids the bias of having stimulus presentation and data acquisition time-locked [46]. The total length of the six counterbalanced blocks was 17 minutes 39 seconds.
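The block structure above can be sketched in Python (an illustrative reconstruction using the timings quoted in the text; the function name and scheduling details are ours, not the authors' presentation code):

```python
import random

def build_block(seed=None):
    """Assemble one block of the discrimination task: four sentence pairs,
    four tone-sequence pairs and four null trials in random order, each
    trial followed by a jittered inter-trial interval of 8.6-11.3 s so
    that MRI sampling is spread across peristimulus time."""
    rng = random.Random(seed)
    trials = ["sentence"] * 4 + ["tone"] * 4 + ["null"] * 4
    rng.shuffle(trials)
    # Each non-null trial is a pair of stimuli separated by a 1 s gap;
    # stimulus durations ranged from 3432 to 6134 ms (mean 5036 ms).
    return [(trial, round(rng.uniform(8.6, 11.3), 2)) for trial in trials]
```

Six such counterbalanced blocks give the 17 minute 39 second run described above.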

fMRI Acquisition

Gradient echo echoplanar imaging (EPI) data were acquired on a neuro-optimised GE Signa 1.5 Tesla system (General Electric, Milwaukee WI, USA) at the Maudsley Hospital, London. A quadrature birdcage headcoil was used for radio frequency transmission and reception. Foam padding was placed around the subject's head in the coil to minimize head movement. One hundred and forty four T2*-weighted whole-brain volumes depicting blood oxygen level-dependent (BOLD) contrast were acquired at each of 24 near-axial non-contiguous planes parallel to the intercommissural (AC-PC) line (slice thickness = 5 mm; gap = 0.5 mm; TR = 2.1 seconds; echo time = 40 milliseconds; flip angle = 90°; matrix = 64 × 64). This EPI data set provided complete brain coverage. At the same session, a high-resolution gradient echo image of the whole brain was acquired in the intercommissural plane consisting of 43 slices (slice thickness = 3 mm; gap = 0.3 mm; TR = 3 seconds; flip angle = 90°; matrix = 128 × 128).

Scanner noise during stimulus presentation was minimised by using a partially silent acquisition [47]: the scanner was relatively quiet during the 6.3 seconds of stimulus presentation, while fMRI data (associated with prominent scanner noise) were collected during the following 8.4 seconds.

fMRI Analysis

The data were first realigned [48] to minimise motion-related artefacts and smoothed using a Gaussian filter (FWHM 7.2 mm). Responses to the experimental paradigm were then detected by time-series analysis, using Gamma variate functions (peak responses at 4 and 8 sec) to model the BOLD response. The analysis was implemented as follows. First, in each experimental condition, trial onsets were modelled as stick functions, which were convolved separately with the 4 and 8 sec kernels to yield two regressors of the expected haemodynamic response to that condition. The weighted sum of these two convolutions giving the best (least-squares) fit to the time series at each voxel was then computed, and a goodness-of-fit statistic, the SSQratio, was derived at each voxel. The null distribution of the SSQratio was obtained by permutation of the data; this permutation method has been shown to give very good type I error control with minimal distributional assumptions [49].
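A minimal sketch of this regressor construction and goodness-of-fit statistic, assuming a simple gamma-shaped kernel pair and taking the SSQratio as the model sum of squares over the residual sum of squares (the exact XBAM parameterisation may differ):

```python
import numpy as np

TR = 2.1      # repetition time in seconds, from the acquisition protocol
N_VOLS = 144  # number of T2*-weighted volumes acquired

def gamma_kernel(peak_s, length_s=20.0, dt=TR):
    """Unit-area gamma-variate kernel with its mode near `peak_s` seconds
    (an illustrative parameterisation, not the XBAM one)."""
    t = np.arange(0.0, length_s, dt)
    h = t ** (peak_s - 1) * np.exp(-t)
    return h / h.sum()

def condition_regressors(onsets_s):
    """Convolve a stick function of trial onsets with the 4 s and 8 s
    kernels, yielding two regressors of the expected BOLD response."""
    sticks = np.zeros(N_VOLS)
    sticks[(np.asarray(onsets_s) / TR).astype(int)] = 1.0
    r4 = np.convolve(sticks, gamma_kernel(4.0))[:N_VOLS]
    r8 = np.convolve(sticks, gamma_kernel(8.0))[:N_VOLS]
    return np.column_stack([r4, r8])

def ssq_ratio(y, X):
    """Least-squares fit of the weighted kernel pair at one voxel;
    goodness of fit as model sum of squares / residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit = X @ beta
    resid = y - fit
    return (fit ** 2).sum() / (resid ** 2).sum()
```

In the full method the same statistic is recomputed on permuted data to build the null distribution used for inference.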

In order to extend inference to the group level, the observed and randomized SSQratio maps were transformed into standard space by a two-stage process: first, a rigid-body transformation of the fMRI data onto a high-resolution structural image of the same subject, followed by an affine transformation onto a Talairach template [50]. In order to increase sensitivity and reduce the multiple comparison problem encountered in fMRI, hypothesis testing was carried out at the cluster level using the method developed by Bullmore et al. [48], shown to give excellent cluster-wise type I error control in fMRI analysis. All analyses were performed with fewer than one false positive cluster expected per image under the null hypothesis.

We examined regions of activation common to both sentence and tone-sequence prosodic comprehension with a conjunction analysis. As levels of activation vary across tasks, the statistical question is whether the minimum level of activation across the tasks differs significantly from zero; in parametric analysis this is done by testing the minimum t statistic. The statistical analysis program used (XBAM) identified which task had the smallest median level of activation and tested this median against the null distribution of activation, estimating the SSQratio for each subject at each voxel for each task [49, 51]. We then compared prosodic comprehension between the tone-sequence and sentence stimuli to clarify the effects of lexical processing. Subsequent analyses compared identical stimulus pairs with differing stimulus pairs (same versus different). During the pilot phase, volunteers had subjectively reported the appraisal of identical stimuli to be more demanding, so this contrast was used to examine the effects of a postulated increased demand on prosodic assessment. We employed a 2 × 2 factorial design to examine the interaction of factor condition (tone sequence, sentence) with factor pair type (same, different). SSQ values were extracted from whole clusters and plotted for regions demonstrating significant interaction effects between condition (tone sequence versus sentence) and task demand (same versus different stimulus pairs).
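The minimum-statistic logic of the conjunction analysis can be illustrated at a single voxel with a toy permutation test (hypothetical data and a simplified sign-flip null; XBAM's actual randomization scheme is more involved):

```python
import numpy as np

def conjunction_min_median(task_a, task_b, n_perm=2000, seed=0):
    """Toy conjunction test at one voxel: take the task with the smaller
    median subject-level activation (e.g. SSQratio) and test that median
    against a sign-flip permutation null, so the voxel counts as 'active
    in both tasks' only if even the weaker task is reliably above zero."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(task_a), np.asarray(task_b)
    weaker = a if np.median(a) <= np.median(b) else b
    observed = np.median(weaker)
    # Null hypothesis: subject activations are symmetric about zero,
    # so random sign flips generate the null distribution of the median.
    null = np.array([
        np.median(weaker * rng.choice([-1.0, 1.0], size=weaker.size))
        for _ in range(n_perm)
    ])
    p = (null >= observed).mean()
    return observed, p
```

With twelve subjects showing clearly positive activation in both tasks, the observed minimum median falls far into the tail of the sign-flip null; if either task hovers around zero, it does not.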

A confounder in all fMRI studies is intrinsic scanner noise; this is particularly the case in tasks with an auditory component such as this one. We minimized it by using a partially silent acquisition phase [47] during the presentation of stimuli. It has been shown that handedness and gender may affect the neural structures involved in the processing of language [52] and prosody [53]; accordingly, we examined only right-handed males.

Behavioural data were analyzed using the statistical package SPSS.


Behavioural data

There were no significant differences in response time or accuracy between the sentence and tone-sequence categories, either overall or when analysed in the subcategories of same and different tasks, using a two-tailed t-test (α = 0.05). Subjects were generally highly accurate (accuracy 0.75-0.98 on the tone-sequence task, 0.83-1.00 on the sentence task), with four individuals achieving 100% accuracy on the sentence task, suggesting a possible ceiling effect. However, subjects were more accurate overall during same tasks (mean accuracy 0.948) than during different tasks (mean accuracy 0.866).
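As a sketch of the behavioural comparison, a paired two-tailed t-test at α = 0.05 on hypothetical per-subject accuracies (illustrative values consistent with the ranges above, not the study's raw scores) might look like:

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject accuracy for n = 12 subjects (NOT the study's data).
sentence_acc = np.array([0.92, 0.95, 0.83, 1.00, 0.88, 0.97,
                         1.00, 0.85, 0.90, 1.00, 0.93, 1.00])
tone_acc     = np.array([0.90, 0.98, 0.80, 0.95, 0.92, 0.93,
                         0.96, 0.88, 0.87, 0.97, 0.95, 0.98])

# Paired two-tailed t-test: the same subjects performed both conditions.
t_stat, p_val = stats.ttest_rel(sentence_acc, tone_acc)
significant = p_val < 0.05  # alpha from the paper
```

With these illustrative values the difference does not reach significance, mirroring the null behavioural result reported above.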

Neuroimaging data

The conjunction analysis showed significant activation common to both sentence and tone sequence prosodic processing in the bilateral Inferior Frontal Gyri, Middle (MTG) and Superior Temporal Gyri (STG), in addition to bilateral Inferior Parietal lobule and the right Superior Frontal Gyrus (Figure 1; Table 1).

Figure 1

Conjunction analysis of regions of cerebral activation common to both the sentence and tone-sequence tasks. Five ascending transverse slices, with a sagittal section to the right of the image indicating where these are taken from. Exact cluster coordinates are given in Table 1.

Table 1 Areas of activation shown in Figure 1.

Activation was significantly greater within the right frontal and temporal cortices during the tone-sequence stimuli relative to the sentence stimuli (Figure 2, bottom half; Table 2). Regions of greater activation in the sentence task relative to the tone sequence task (Figure 2, top half; Table 3) were predominantly left hemispheric, including the cingulate gyrus, left MTG, STG, inferior parietal lobule as well as the basal ganglia; with additional activation in the right precuneus, right cingulate gyrus and right lingual gyrus.

Figure 2

ANOVA of regions of task-dependent differential activation. Ascending transverse slices with the sagittal section to the right indicating where they are taken from. The top half displays regions of relatively increased activation during the sentence task; the lower half displays those more active during the tone-sequence task. The exact cluster coordinates are provided in Table 3 and Table 2 respectively.

Table 2 Areas of activation shown in Figure 2, bottom half.
Table 3 Areas of activation shown in Figure 2, top half.

A statistically significant interaction between factor condition (tone sequence, sentence) and stimulus pair type (same, different) was evident in the right Inferior and Middle Frontal Gyri and right STG (Figure 3; Table 4).

Figure 3

Interaction analysis of cerebral regions that can differentiate task (sentence/tone-sequence) and trial type (same/different). Cluster coordinates are provided in Table 4.

Table 4 Areas of activation in Figure 3.

Discussion and Conclusions

As hypothesized, there was activation in bilateral MTG and STG common to prosodic pitch processing across both sentence and tone-sequence stimuli. Right inferior frontal cortex activation was more prominent, although left inferior frontal activation was also present (Figure 1; Table 1). This accords with previous imaging data on prosodic comprehension [18, 20, 23-25]. There was a large bilateral activation in the Inferior Parietal Lobule, a region associated with storage within the working memory system [1]. Such a role is consistent with our data, as the differential activation maps show no differences in parietal activation between the two tasks, in keeping with a purely working-memory role. Left precentral and postcentral gyral and basal ganglia activity were common to both conditions, as would be anticipated in an experimental paradigm involving a right-handed finger press.

Comparison between the tone-sequence and sentence stimuli aimed to clarify the relative contributions of cortical regions associated with a purer linguistic prosodic pitch analysis (tone sequence > sentence) and those associated with greater lexical or phonological analysis (sentence > tone sequence), the latter being recognised as a major confounder in such studies generally [45, 54-59]. Stripped of this lexical information, the tone-sequence task demonstrated significant activation in the right inferior and medial frontal cortices and right STG compared to the sentence task (Figure 2, bottom half; Table 2). Wildgruber et al [1] suggested that at lower levels of both linguistic and emotional prosody processing the same right hemispheric network is accessed, but that explicit judgment of linguistic aspects of speech prosody is more associated with left hemispheric language regions, while explicit evaluation of emotional prosody is related to bilateral orbitofrontal regions. Our data support this assertion, with evident overlap between the regions preferentially activated by the tone-sequence task and those elicited during emotional prosodic tasks. Explicit analysis of linguistic aspects preferentially evoked appraisal by left hemispheric regions, fitting with other work [34, 55, 60-62], and this may reflect the processing of the lexical content of the stimuli.

Our third hypothesis was that 'increased demand' would be associated with enhanced activation in the right frontal and temporal cortices. Subjects reported finding tone-sequence trials harder than sentence ones - fitting with Patel's notion of extra 'redundancy' cues in the lexical trials [45, 58] - and same pairs 'more difficult' than different ones. Interestingly, the behavioural data conflict with this subjective perception, demonstrating that subjects were more accurate on same tasks: in these, subjects needed to hold the entire trial pair in working memory and examine it for subtle (non-existent) differences, whereas with different pairs participants could discard the stimuli once any pitch difference was noted. As such, same and tone-sequence trials may have been proxy markers for cognitive demand rather than 'difficulty' per se, with the tone-sequence trials additionally probing 'purer' pitch processing.

During the tone-sequence task there was relatively increased right STG and left precuneus activation when paired tone stimuli were the same, compared to when they were different. The interaction analysis (Figure 3; Table 4) examined the effect of one factor (stimulus type: sentence or tone sequence) on another (trial type: same or different). The regions that differentiate between these factors are all right sided: the STG, Inferior Frontal Gyrus and Middle Frontal Gyrus. Each of these discriminatory regions shows increased activation during tone-sequence, as opposed to sentence, tasks, and during same compared to different stimuli (Figure 4).

Figure 4

Graphed differential activation in the right inferior frontal gyrus in the interaction analysis of Figure 3, demonstrating activation in the two task types (sentence/tone-sequence) and two trial types (same/different). SSQ, the "sum of squares ratio", is a statistical indicator of activity, as described in the Methods section.

Our interpretation of the data is that they best fit the task dependent hypothesis: the left hemisphere is specialized for lexical and short syntactic aspects of pitch, whilst the right hemisphere is superior at processing suprasegmental pitch. Subjects' reports placed tone-sequence and same trials as more difficult, and the interaction analysis of activation in the right inferior frontal gyrus (Figure 4) shows increasing activation for both: in each instance subjects are processing a larger, full-trial sequence at a suprasegmental level.

In conclusion, our data support the premise that prosodic pitch perception is subserved by the bilateral frontal and temporal cortices, specifically the Superior Temporal Gyrus, Inferior Frontal Gyrus and Middle Frontal Gyrus, with the degree of hemispheric involvement dependent upon the task. These areas were activated in both the tone-sequence and sentence paradigms, even when the confounding lexical content was removed, though the former preferentially activated right hemispheric regions and the latter left. There was a relative increase in activation in the right frontal and temporal cortices during 'same' stimulus tasks, which subjects reported as more demanding in terms of prosodic comprehension; in our opinion this reflects the need to analyse pitch at a broader 'sentence level'. Our data are in agreement with the assertion [40] of hemispheric specialisation, fitting the task dependent hypothesis, which would have predicted the lateralization [63] found in this study.

Language prosody processing is complex and consists of multiple components. Current understanding involves several competing theories, none of which has garnered consistent support. The vast majority of the literature focuses on emotional prosody: further work is needed to provide a more coherent and distinctive conceptualization of linguistic prosodic processing.


  1. Wildgruber D, Ackermann H, Kreifelts B, Ethofer T: Cerebral processing of linguistic and emotional prosody: fMRI studies. Prog Brain Res. 2006, 156: 249-268.

  2. Monrad-Krohn GH: The prosodic quality of speech and its disorders; a brief survey from a neurologist's point of view. Acta Psychiatrica Neurologica Scandinavica. 1947, 22: 255-269. 10.1111/j.1600-0447.1947.tb08246.x.

  3. Lehiste I: Rhythmic units and syntactic units in production and perception. J Acoust Soc Am. 1973, 54 (5): 1228-1234. 10.1121/1.1914379.

  4. Price PJ, Ostendorf M, Shattuck-Hufnagel S, Fong C: The use of prosody in syntactic disambiguation. J Acoust Soc Am. 1991, 90 (6): 2956-2970. 10.1121/1.401770.

  5. Borod JC, Andelman F, Obler LK, Tweedy JR, Welkowitz J: Right hemisphere specialization for the identification of emotional words and sentences: evidence from stroke patients. Neuropsychologia. 1992, 30 (9): 827-844. 10.1016/0028-3932(92)90086-2.

  6. Pell MD: The temporal organization of affective and non-affective speech in patients with right-hemisphere infarcts. Cortex. 1999, 35 (4): 455-477. 10.1016/S0010-9452(08)70813-X.

  7. Ross ED, Mesulam MM: Dominant language functions of the right hemisphere? Prosody and emotional gesturing. Arch Neurol. 1979, 36 (3): 144-148. 10.1001/archneur.1979.00500390062006.

  8. Shah AP, Baum SR, Dwivedi VD: Neural substrates of linguistic prosody: Evidence from syntactic disambiguation in the productions of brain-damaged patients. Brain Lang. 2006, 96 (1): 78-89. 10.1016/j.bandl.2005.04.005.

  9. Wunderlich A, Ziegler W, Geigenberger A: Implicit processing of prosodic information in patients with left and right hemisphere stroke. Aphasiology. 2003, 17 (9): 861-879. 10.1080/02687030344000283.

  10. Pell MD: Cerebral mechanisms for understanding emotional prosody in speech. Brain Lang. 2005, 20.

  11. Seron X, Van der Kaa MA, Vanderlinden M, Remits A, Feyereisen P: Decoding paralinguistic signals: effect of semantic and prosodic cues on aphasics' comprehension. J Commun Disord. 1982, 15 (3): 223-231. 10.1016/0021-9924(82)90035-1.

  12. Speedie LJ, Coslett HB, Heilman KM: Repetition of affective prosody in mixed transcortical aphasia. Arch Neurol. 1984, 41 (3): 268-270. 10.1001/archneur.1984.04050150046014.

  13. Rorden C, Brett M: Stereotaxic display of brain lesions. Behav Neurol. 2000, 12 (4): 191-200.

  14. Rorden C, Karnath HO: Using human brain lesions to infer function: a relic from a past era in the fMRI age?. Nat Rev Neurosci. 2004, 5 (10): 813-819.

  15. Ethofer T, Anders S, Erb M, Herbert C, Wiethoff S, Kissler J, Grodd W, Wildgruber D: Cerebral pathways in processing of affective prosody: a dynamic causal modeling study. Neuroimage. 2006, 30 (2): 580-587. 10.1016/j.neuroimage.2005.09.059.

  16. Ethofer T, Kreifelts B, Wiethoff S, Wolf J, Grodd W, Vuilleumier P, Wildgruber D: Differential influences of emotion, task, and novelty on brain regions underlying the processing of speech melody. J Cogn Neurosci. 2009, 21 (7): 1255-1268. 10.1162/jocn.2009.21099.

  17. Glasser MF, Rilling JK: DTI tractography of the human brain's language pathways. Cereb Cortex. 2008, 18 (11): 2471-2482. 10.1093/cercor/bhn011.

  18. Buchanan TW, Lutz K, Mirzazade S, Specht K, Shah NJ, Zilles K, Jancke L: Recognition of emotional prosody and verbal components of spoken language: an fMRI study. Brain Res Cogn Brain Res. 2000, 9 (3): 227-238. 10.1016/S0926-6410(99)00060-9.

  19. Mitchell RL, Elliott R, Barry M, Cruttenden A, Woodruff PW: The neural response to emotional prosody, as revealed by functional magnetic resonance imaging. Neuropsychologia. 2003, 41 (10): 1410-1421. 10.1016/S0028-3932(03)00017-4.

  20. Wildgruber D, Pihan H, Ackermann H, Erb M, Grodd W: Dynamic brain activation during processing of emotional intonation: influence of acoustic parameters, emotional valence, and sex. Neuroimage. 2002, 15 (4): 856-869. 10.1006/nimg.2001.0998.

  21. Wiethoff S, Wildgruber D, Kreifelts B, Becker H, Herbert C, Grodd W, Ethofer T: Cerebral processing of emotional prosody - influence of acoustic parameters and arousal. Neuroimage. 2008, 39 (2): 885-893. 10.1016/j.neuroimage.2007.09.028.

  22. Ethofer T, Bretscher J, Gschwind M, Kreifelts B, Wildgruber D, Vuilleumier P: Emotional voice areas: anatomic location, functional properties, and structural connections revealed by combined fMRI/DTI. Cereb Cortex. 2011.

  23. Imaizumi S, Mori K, Kiritani S, et al.: Vocal identification of speaker and emotion activates different brain regions. Neuroreport. 1997, 8 (12): 2809-2812. 10.1097/00001756-199708180-00031.

  24. George MS, Parekh PI, Rosinsky N, Ketter TA, Kimbrell TA, Heilman KM, Herscovitch P, Post RM: Understanding emotional prosody activates right hemisphere regions. Arch Neurol. 1996, 53 (7): 665-670. 10.1001/archneur.1996.00550070103017.

  25. Wildgruber D, Hertrich I, Riecker A, Erb M, Anders S, Grodd W, Ackermann H: Distinct frontal regions subserve evaluation of linguistic and emotional aspects of speech intonation. Cereb Cortex. 2004, 14 (12): 1384-1389. 10.1093/cercor/bhh099.

  26. Wiethoff S, Wildgruber D, Grodd W, Ethofer T: Response and habituation of the amygdala during processing of emotional prosody. Neuroreport. 2009, 20 (15): 1356-1360. 10.1097/WNR.0b013e328330eb83.

  27. Pell MD, Leonard CL: Processing emotional tone from speech in Parkinson's disease: a role for the basal ganglia. Cogn Affect Behav Neurosci. 2003, 3 (4): 275-288. 10.3758/CABN.3.4.275.

  28. Kotz SA, Meyer M, Alter K, Besson M, von Cramon DY, Friederici AD: On the lateralization of emotional prosody: an event-related functional MR investigation. Brain Lang. 2003, 86 (3): 366-376. 10.1016/S0093-934X(02)00532-1.

  29. Beaucousin V, Lacheret A, Turbelin MR, Morel M, Mazoyer B, Tzourio-Mazoyer N: FMRI study of emotional speech comprehension. Cereb Cortex. 2007, 17 (2): 339-352.

  30. Paulmann S, Pell MD, Kotz SA: Functional contributions of the basal ganglia to emotional prosody: evidence from ERPs. Brain Res. 2008, 1217: 171-178.

  31. Robin DA, Tranel D, Damasio H: Auditory perception of temporal and spectral events in patients with focal left and right cerebral lesions. Brain Lang. 1990, 39 (4): 539-555. 10.1016/0093-934X(90)90161-9.

  32. Van Lancker D, Sidtis JJ: The identification of affective-prosodic stimuli by left- and right-hemisphere-damaged subjects: all errors are not created equal. J Speech Hear Res. 1992, 35 (5): 963-970.

  33. Baum S, Pell M, Leonard C, Gordon J: Using prosody to resolve temporary syntactic ambiguities in speech production: Acoustic data on brain-damaged speakers. Clin Linguist Phon. 2001, 15: 441-456. 10.1080/02699200110044813.

  34. Gandour J, Wong D, Dzemidzic M, Lowe M, Tong Y, Li X: A cross-linguistic fMRI study of perception of intonation and emotion in Chinese. Hum Brain Mapp. 2003, 18 (3): 149-157. 10.1002/hbm.10088.

  35. Obleser J, Wise RJ, Alex Dresner M, Scott SK: Functional integration across brain regions improves speech perception under adverse listening conditions. J Neurosci. 2007, 27 (9): 2283-2289. 10.1523/JNEUROSCI.4663-06.2007.

  36. Schirmer A, Kotz SA: Beyond the right hemisphere: brain mechanisms mediating vocal emotional processing. Trends Cogn Sci. 2006, 10 (1): 24-30. 10.1016/j.tics.2005.11.009.

  37. Pell MD, Jaywant A, Monetta L, Kotz SA: Emotional speech processing: disentangling the effects of prosody and semantic cues. Cogn Emot. 2011, 25 (5): 834-853. 10.1080/02699931.2010.516915.

  38. Guo X, Zheng L, Zhu L, Yang Z, Chen C, Zhang L, Ma W, Dienes Z: Acquisition of conscious and unconscious knowledge of semantic prosody. Conscious Cogn. 2011, 20 (2): 417-425. 10.1016/j.concog.2010.06.015.

  39. Wong PC: Hemispheric specialization of linguistic pitch patterns. Brain Res Bull. 2002, 59 (2): 83-95. 10.1016/S0361-9230(02)00860-2.

  40. Sammler D, Kotz SA, Eckstein K, Ott DV, Friederici AD: Prosody meets syntax: the role of the corpus callosum. Brain. 2010, 133 (9): 2643-2655. 10.1093/brain/awq231.

  41. Hagoort P: On Broca, brain, and binding: a new framework. Trends Cogn Sci. 2005, 9 (9): 416-423. 10.1016/j.tics.2005.07.004.

  42. Shalom DB, Poeppel D: Functional anatomic models of language: assembling the pieces. Neuroscientist. 2008, 14 (1): 119-127.

  43. Meyer M, Steinhauer K, Alter K, Friederici AD, von Cramon DY: Brain activity varies with modulation of dynamic pitch variance in sentence melody. Brain Lang. 2004, 89 (2): 277-289. 10.1016/S0093-934X(03)00350-X.

  44. Matsumoto K, Samson GT, O'Daly OD, Tracy DK, Patel AD, Shergill SS: Prosodic discrimination in patients with schizophrenia. Br J Psychiatry. 2006, 189: 180-181. 10.1192/bjp.bp.105.009332.

  45. Patel AD, Peretz I, Tramo M, Labreque R: Processing prosodic and musical patterns: a neuropsychological investigation. Brain Lang. 1998, 61 (1): 123-144. 10.1006/brln.1997.1862.

    Article  CAS  PubMed  Google Scholar 

  46. 46.

    Veltman DJ, Mechelli A, Friston KJ, Price CJ: The importance of distributed sampling in blocked functional magnetic resonance imaging designs. Neuroimage. 2002, 17 (3): 1203-1206. 10.1006/nimg.2002.1242.

    Article  PubMed  Google Scholar 

  47. 47.

    Amaro E, Williams SC, Shergill SS, Fu CH, MacSweeney M, Picchioni MM, Brammer MJ, McGuire PK: Acoustic noise and functional magnetic resonance imaging: current strategies and future prospects. J Magn Reson Imaging. 2002, 16 (5): 497-510. 10.1002/jmri.10186.

    Article  PubMed  Google Scholar 

  48. 48.

    Bullmore ET, Brammer MJ, Rabe-Hesketh S, Curtis VA, Morris RG, Williams SC, Sharma T, McGuire PK: Methods for diagnosis and treatment of stimulus-correlated motion in generic brain activation studies using fMRI. Hum Brain Mapp. 1999, 7 (1): 38-48. 10.1002/(SICI)1097-0193(1999)7:1<38::AID-HBM4>3.0.CO;2-Q.

    Article  CAS  PubMed  Google Scholar 

  49. 49.

    Bullmore E, Long C, Suckling J, Fadili J, Calvert G, Zelaya F, Carpenter TA, Brammer M: Colored noise and computational inference in neurophysiological (fMRI) time series analysis: resampling methods in time and wavelet domains. Hum Brain Mapp. 2001, 12 (2): 61-78. 10.1002/1097-0193(200102)12:2<61::AID-HBM1004>3.0.CO;2-W.

    Article  CAS  PubMed  Google Scholar 

  50. 50.

    Brammer MJ, Bullmore ET, Simmons A, Williams SC, Grasby PM, Howard RJ, Woodruff PW, Rabe-Hesketh S: Generic brain activation mapping in functional magnetic resonance imaging: a nonparametric approach. Magn Reson Imaging. 1997, 15 (7): 763-770. 10.1016/S0730-725X(97)00135-5.

    Article  CAS  PubMed  Google Scholar 

  51. 51.

    Rubia K, Russell T, Overmeyer S, et al.: Mapping motor inhibition: conjunctive brain activations across different versions of go/no-go and stop tasks. Neuroimage. 2001, 13 (2): 250-261. 10.1006/nimg.2000.0685.

    Article  CAS  PubMed  Google Scholar 

  52. 52.

    Hagmann P, Cammoun L, Martuzzi R, Maeder P, Clarke S, Thiran JP, Meuli R: Hand preference and sex shape the architecture of language networks. Hum Brain Mapp. 2006, 27 (10): 828-835. 10.1002/hbm.20224.

    Article  PubMed  Google Scholar 

  53. 53.

    Rymarczyk K, Grabowska A: Sex differences in brain control of prosody. Neuropsychologia. 2006

    Google Scholar 

  54. 54.

    Luo H, Husain FT, Horwitz B, Poeppel D: Discrimination and categorization of speech and non-speech sounds in an MEG delayed-match-to-sample study. Neuroimage. 2005, 28 (1): 59-71. 10.1016/j.neuroimage.2005.05.040.

    Article  PubMed  Google Scholar 

  55. 55.

    Mitchell RL, Crow TJ: Right hemisphere language functions and schizophrenia: the forgotten hemisphere?. Brain. 2005, 128 (Pt 5): 963-978.

    Article  PubMed  Google Scholar 

  56. 56.

    Murray IR, Arnott JL: Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. J Acoust Soc Am. 1993, 93 (2): 1097-1108. 10.1121/1.405558.

    Article  CAS  PubMed  Google Scholar 

  57. 57.

    Nicholson KG, Baum S, Kilgour A, Koh CK, Munhall KG, Cuddy LL: Impaired processing of prosodic and musical patterns after right hemisphere damage. Brain Cogn. 2003, 52 (3): 382-389. 10.1016/S0278-2626(03)00182-9.

    Article  PubMed  Google Scholar 

  58. 58.

    Patel AD: Impaired speech intonation perception in musically tone-deaf individuals: the Melodic Contour Deafness Hypothesis. 2006, San Francisco, Paper presented at: Cognitive Neuroscience Meeting; April 8-11, 2006

    Google Scholar 

  59. 59.

    Paulmann S, Pell MD, Kotz SA: Comparative processing of emotional prosody and semantics following basal ganglia infarcts: ERP evidence of selective impairments for disgust and fear. Brain Res. 2009, 1295: 159-169.

    Article  CAS  PubMed  Google Scholar 

  60. 60.

    Gennari S, Poeppel D: Processing correlates of lexical semantic complexity. Cognition. 2003, 89 (1): B27-41. 10.1016/S0010-0277(03)00069-6.

    Article  PubMed  Google Scholar 

  61. 61.

    Hodge CJ, Huckins SC, Szeverenyi NM, Fonte MM, Dubroff JG, Davuluri K: Patterns of lateral sensory cortical activation determined using functional magnetic resonance imaging. J Neurosurg. 1998, 89 (5): 769-779. 10.3171/jns.1998.89.5.0769.

    Article  PubMed  Google Scholar 

  62. 62.

    Kotz SA, Meyer M, Paulmann S: Lateralization of emotional prosody in the brain: an overview and synopsis on the impact of study design. Prog Brain Res. 2006, 156: 285-294.

    Article  PubMed  Google Scholar 

  63. 63.

    Li X, Gandour JT, Talavage T, Wong D, Hoffa A, Lowe M, Dzemidzic M: Hemispheric asymmetries in phonological processing of tones versus segmental units. Neuroreport. 2010, 21 (10): 690-694.

    PubMed Central  PubMed  Google Scholar 



Acknowledgements

SSS was supported by an Advanced Clinical Training Fellowship from the Wellcome Trust and a Young Investigator Award from NARSAD. OOD was supported by the Psychiatry Research Trust. The authors gratefully acknowledge the provision of the original stimuli by Dr. A. Patel, The Neurosciences Institute, San Diego, CA 92121.

Author information



Corresponding author

Correspondence to Derek K Tracy.

Additional information

Authors' contributions

DKT participated in the study design, recruited participants, participated in fMRI data collection and analysis, interpreted the results and drafted the manuscript. DKH contributed to the drafting of the manuscript. OOD recruited participants, participated in fMRI data analysis and contributed to the drafting of the manuscript. PM participated in fMRI data collection and interpretation of the results. KM participated in the study design and coordination. LCL participated in fMRI analysis and contributed to the drafting of the manuscript. ED participated in fMRI analysis. SSS conceived the study and participated in its design and coordination. All authors read and approved the final manuscript.


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Tracy, D.K., Ho, D.K., O'Daly, O. et al. It's not what you say but the way that you say it: an fMRI study of differential lexical and non-lexical prosodic pitch processing. BMC Neurosci 12, 128 (2011).



Keywords

  • Temporal Cortex
  • Inferior Frontal Gyrus
  • Superior Temporal Gyrus
  • Inferior Parietal Lobule
  • Middle Frontal Gyrus