During all mapping sessions, and given our previously defined spatial limitations, rTMS was well tolerated by all volunteers. There were neither adverse events nor statistically significant differences concerning the experienced pain. Within our small study group, we conclude that intraprocedural discomfort is not related to the investigator.
The technique of rTMS is a promising new method for preoperative planning of surgery in eloquent cortical areas. In daily clinical routine, it is common for a given examination to be conducted by different investigators; therefore, high reliability of the method is necessary. We decided to conduct the remappings blinded to previous mapping setups (frequency, number of pulses, RMT) and results, both to mirror clinical reality and to include the entire rTMS procedure. Our data showed that rTMS language mapping has limited reliability in intraobserver as well as interobserver comparisons; although there is no perfect match, the range of differences is reasonably small.
The observed variability is ascribable to at least three causes, although the design of the present study does not allow us to determine which of them has the greatest influence on the observed intra- and interobserver variability. First, differences in how the mappings were performed, such as the use of different mapping parameters, probably lead to different language-positive points. Second, reproducibility can be impaired by the language analysis itself, because analyzing the recorded videos is not completely objective. Although our setup with baseline measurements, standardized protocols, and video recording for detailed post-hoc analysis of the language responses reduces much of the bias, the evaluation of language errors remains partly subjective. Third, we have to assume that the organization of human language function in the brain is not absolutely stable, which conceivably also has a significant influence on variability in language mapping.
With regard to the mapping parameters as a source of mapping differences, the investigators strictly followed the mapping protocol mentioned above during all examinations. Under this protocol, the investigator is free to choose between three setups (5 Hz/5 pulses, 5 Hz/7 pulses, or 7 Hz/7 pulses). Furthermore, the hand knob and the RMT were determined anew for each remapping. However, the chosen mapping parameters and the detected RMT showed no significant differences in either the intra- or the interobserver comparisons (Table 2). Hence, we assume that the choice of parameters is distributed randomly; in other words, none of the examiners seems to prefer a certain mapping setup. Thus, these data suggest that the optimal mapping setup varies considerably even for one subject across the three sessions, yielding a small variability in mapping outcomes.
Concerning the naming error analysis as a source of mapping differences, categorizing the produced errors into predefined error types gives the language analysis a certain degree of objectivity. As a result, both the intra- and interobserver variability that may arise at the stage of error analysis can be considered reasonably small. Nevertheless, there are still cases in which the investigators disagree. Consequently, it is plausible that our results demonstrate higher intraobserver than interobserver reliability (Figure
A closer look shows that intra- and interobserver variability was not uniform across error types. Hesitation errors, like no-response and semantic errors, showed high reproducibility. In the existing literature, hesitation errors are suspected to represent a rather untrustworthy error category, which some authors do not even include in their analysis. However, our data support the argument that hesitation errors evoked by rTMS should not be ignored; instead, they should be regarded as evidence of disrupted language processing in the brain, as is the case in non-navigated rTMS studies that routinely measure differences in naming latency to localize specific language functions in healthy adult brains.
Another important aspect is that little is known about the stability of language-eloquent cortical regions per se. Direct cortical stimulation (DCS), the current gold standard for language mapping, assumes that stimulating the same area does not consistently evoke errors in that region. In the standard DCS language mapping procedure, a cortical site is judged language positive when 2 out of 3 stimulations elicit language errors [9, 15]. This two-thirds criterion means that, even within a very short time lag, absolute reproducibility is impossible even for the gold standard, owing to the complex connectivity and therefore plasticity of language function. Furthermore, we have to keep in mind previous studies showing that reorganization of the brain occurs not only after strokes or in the course of tumor disease, but also in healthy subjects. This natural plasticity has been demonstrated not only for rolandic regions in short-term motor-learning experiments [30, 32] and for cognition and memory structures in a longitudinal, real-life, extensive learning situation, but also for perisylvian eloquent areas and the inferior parietal cortex, for language perception and memory, in a longitudinal code-deciphering learning study. Given the long time lag between the first and the second or third mapping, plasticity might indeed be a reason for the varying results in the present study. Thus, both intra- and interobserver variability are inevitable.
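The consequence of the 2-out-of-3 criterion discussed above can be illustrated with a small worked example. The sketch below assumes, purely for illustration and not on the basis of the study data, that each stimulation of a given site independently elicits an error with a fixed probability `p`. The function names and the chosen values of `p` are hypothetical. It computes the probability that the site is judged language positive and the probability that two independent sessions reach the same judgment.

```python
from math import comb


def prob_positive(p: float, n: int = 3, k: int = 2) -> float:
    """Probability that at least k of n independent stimulations elicit an error.

    Assumption (illustrative only): every stimulation of the site evokes an
    error with the same probability p, independently of the others.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_agreement(p: float) -> float:
    """Probability that two independent sessions give the same call
    (both positive or both negative) for the same site."""
    q = prob_positive(p)
    return q * q + (1 - q) * (1 - q)


# Hypothetical per-stimulation error probabilities.
for p in (0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  positive={prob_positive(p):.3f}  agreement={prob_agreement(p):.3f}")
```

Under these assumptions, a site with `p = 0.5` is judged positive in half of the sessions, so two sessions agree only half of the time; agreement approaches 1 only when `p` is close to 0 or 1.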
Additionally, a closer look at the variability by error type shows that, given our protocol, certain error types are more reproducible than others (Table 6). Altogether, in comparison with performance errors, neologisms, and phonological errors, the error categories “no response”, “hesitations”, and “semantic” tend to show higher reproducibility (Table 6). It is possible that errors associated with pronunciation and language production itself exhibit greater fluctuation.
Another explanation could be that, as reported earlier, rTMS following the abovementioned mapping protocol seems to be especially useful in anterior sites (Table 6). Compared with posterior sites, anterior language areas tend to demonstrate higher reproducibility and correlation with intraoperative DCS. According to the literature on human language processing, posterior language sites seem to be activated earlier than 300 ms [1, 6]. Therefore, rTMS pulses delivered 300 ms after picture presentation might be too late to disrupt posterior language processing. Thus, when improving the protocol in future investigations, we have to ask how more reliable errors can be evoked in posterior regions and whether other mapping parameters could provide more reliable results.
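The timing argument above can be made concrete with simple arithmetic. The following sketch assumes, as a simplification not stated in the protocol, that the first pulse of each train is delivered exactly 300 ms after picture presentation and that the remaining pulses are evenly spaced at the nominal frequency. The function name `pulse_times_ms` is hypothetical.

```python
def pulse_times_ms(onset_ms: float, freq_hz: float, n_pulses: int) -> list[float]:
    """Times (ms after picture presentation) of each pulse in one rTMS train.

    Assumption (illustrative only): the first pulse falls exactly at onset_ms
    and the following pulses are evenly spaced at 1000 / freq_hz ms.
    """
    interval_ms = 1000.0 / freq_hz
    return [onset_ms + i * interval_ms for i in range(n_pulses)]


# The three setups named in the mapping protocol: (frequency in Hz, pulses).
setups = [(5, 5), (5, 7), (7, 7)]
for freq, n in setups:
    times = pulse_times_ms(300.0, freq, n)
    print(f"{freq} Hz / {n} pulses: first={times[0]:.0f} ms, last={times[-1]:.0f} ms")
```

Under these assumptions, no pulse in any of the three setups falls before 300 ms, so processing that is completed earlier would not be covered by the train regardless of the setup chosen.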
The aim of this study was to gather data on the reproducibility of language mapping by rTMS. One source of bias in assessing reproducibility is the spatial inconsistency of the investigated language function itself, which cannot be separated from the inaccuracy of the method. Yet, since it is crucial to have data on the reproducibility of a new technique, this study still has its justification. Moreover, being the first study to investigate this question, it was intended to mirror the current practice of rTMS language mapping rather than to examine a new, rigid mapping protocol. These further aspects of reproducibility require investigation in future studies.