Fifteen healthy, right-handed male subjects (age range: 23-27 years) with no history of psychiatric or neurological disease gave written informed consent. The study was approved by the ethics committee of the University of Ulm. Before scanning, all subjects completed a practice version of the task.

### Task

We used a monetary incentive task (figure 1) with a parametric variation of the probability (25%, 50%, 75% and 100%) of winning an announced amount of money (120, 60, 40, 30, 20 or 15 Eurocent). The amounts of money (magnitude of reward) were assigned to the reward probabilities so as to create eight trial types that differed in probability but yielded only two stable levels of expected reward value (30 and 15 Eurocent, figure 1; expected reward value = probability × magnitude). This allowed us to analyze reward value and uncertainty independently, while probability and magnitude were defined as mutually dependent variables. Each session consisted of 96 trials (6250 ms each; 12 trials for each of the eight types: 25%-120¢, 25%-60¢, 50%-60¢, 50%-30¢, 75%-40¢, 75%-20¢, 100%-30¢, 100%-15¢). Each trial started with one of eight different cues (3750 ms) indicating the probability and the amount of money that could be won in that trial. After this expectation phase, subjects had to respond correctly with a left or right button press (index or middle finger of their right hand) to one of two symbols, a square or a triangle (target), within a fixed interval of 1 s. Subjects were informed in advance about the symbol-to-button mapping (square/right button, triangle/left button or vice versa). By responding correctly, they preserved the previously announced chance of winning the announced amount of money. Depending on the reward probability, subjects were not rewarded in a number of trials despite pressing the correct button (omission trials). Incorrect button presses resulted in a feedback of zero Eurocent. Win and omission trials as well as the eight trial types appeared in random order. To ensure that every trial included a button press of some kind, subjects were told that they would lose 1 Euro if no button press occurred.
Feedback (outcome, 1500 ms) followed the target's disappearance and informed the subjects of the amount of money (120, 60, 40, 30, 20, 15 or 0 Eurocent) they had won in the trial. Reaction times (see figure 2) and errors were recorded. Immediately before scanning, all subjects completed two sessions (60 trials/10 min each) of a practice version of the task. Contingencies between symbols and probabilities were explained beforehand. Subjects were told in advance that they could not win real money in the practice sessions. All subjects achieved more than 95% correct trials during practice and afterwards could easily name the meaning of each of the symbols used. In the scanner, during acquisition of the functional images, participants performed two sessions of the task (96 trials, 16 min each). The same stimuli as in the practice version were used, but in a different, randomized order. The fact that we did not counterbalance the two colors (blue and red) in the task may represent a possible, though very unlikely, source of confounding.
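The relation expected reward value = probability × magnitude can be checked directly for the eight trial types. The following is a minimal sketch, not part of the original analysis; the labels A to H follow the regressor naming used in the fMRI analysis, and the variable and function names are illustrative.

```python
from fractions import Fraction

# Eight trial types as listed in the text:
# label -> (win probability in percent, magnitude in Eurocent).
TRIAL_TYPES = {
    "A": (25, 120), "B": (25, 60),
    "C": (50, 60),  "D": (50, 30),
    "E": (75, 40),  "F": (75, 20),
    "G": (100, 30), "H": (100, 15),
}


def expected_value(prob_percent, magnitude_cents):
    """Expected reward value = probability x magnitude, in Eurocent."""
    return Fraction(prob_percent, 100) * magnitude_cents


# Expected value of every trial type (exact rational arithmetic).
expected_values = {
    label: expected_value(prob, mag) for label, (prob, mag) in TRIAL_TYPES.items()
}

# Group the trial types by value level to show that probability varies
# within each level while the expected value stays fixed.
by_level = {}
for label, value in expected_values.items():
    by_level.setdefault(int(value), []).append(label)

for level in sorted(by_level, reverse=True):
    print(f"{level} Eurocent: {', '.join(by_level[level])}")
```

Each value level contains one trial type per probability, which is what makes it possible to vary probability while holding expected value constant.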

### fMRI acquisition

A 3.0 Tesla Siemens ALLEGRA scanner (Siemens, Erlangen, Germany) equipped with a head coil was used to acquire T1 anatomical volume images (1 × 1 × 1 mm voxels) and functional MR images. Twenty-three sagittal slices were acquired with an image size of 64 × 64 pixels and a FoV of 192 mm. Slice thickness was 3 mm with a 0.75 mm gap, resulting in a voxel size of 3 × 3 × 3.75 mm. Images were centered on basal structures of the brain so as to include the regions of interest (basal ganglia and prefrontal regions). Functional images were recorded using a T2*-sensitive gradient echo planar sequence measuring changes in BOLD contrast. A total of 650 volumes were obtained during each of the two sessions at a TR of 1500 ms (TE 35 ms, flip angle 90°).
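The stated voxel size and session length follow from the acquisition parameters. The short sketch below only redoes this arithmetic; the slab coverage it prints is derived here and is not a figure reported in the text.

```python
# Acquisition parameters as stated in the text.
FOV_MM = 192.0
MATRIX = 64
SLICE_THICKNESS_MM = 3.0
SLICE_GAP_MM = 0.75
N_SLICES = 23
N_VOLUMES = 650
TR_S = 1.5

# In-plane resolution: field of view divided by matrix size.
in_plane_mm = FOV_MM / MATRIX

# Centre-to-centre slice spacing: thickness plus gap.
slice_spacing_mm = SLICE_THICKNESS_MM + SLICE_GAP_MM

# Slab coverage along the slice axis (no gap after the last slice).
coverage_mm = N_SLICES * SLICE_THICKNESS_MM + (N_SLICES - 1) * SLICE_GAP_MM

# Duration of one functional session.
session_s = N_VOLUMES * TR_S

print(f"voxel: {in_plane_mm} x {in_plane_mm} x {slice_spacing_mm} mm")
print(f"coverage: {coverage_mm} mm, session: {session_s / 60:.2f} min")
```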

### fMRI analysis

Image processing and statistical analysis were carried out using Statistical Parametric Mapping (SPM5, Friston, The Wellcome Department of Cognitive Neurology, London, UK). Preprocessing of the functional scans included realignment to correct for motion artifacts, slice-timing correction, spatial normalization to a standard template (Montreal Neurological Institute, MNI) and smoothing with a 6 mm Gaussian kernel. Intrinsic autocorrelations were accounted for by an AR(1) model, and low-frequency drifts were removed with a high-pass filter with a cutoff frequency of 1/128 Hz.
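To illustrate what a high-pass filter with a 1/128 Hz cutoff does to a voxel time series, the sketch below removes slow drifts by projecting out low-frequency discrete cosine basis functions. It is a simplified illustration, not the SPM5 implementation: the function names are hypothetical, and the rule for the number of basis functions is an assumption modeled on common practice rather than something stated in the text.

```python
import math


def dct_basis(n_scans, n_basis):
    """Return the first n_basis orthonormal DCT-II basis vectors (as columns)."""
    cols = []
    for k in range(n_basis):
        scale = math.sqrt(1.0 / n_scans) if k == 0 else math.sqrt(2.0 / n_scans)
        cols.append([
            scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n_scans))
            for t in range(n_scans)
        ])
    return cols


def n_lowfreq_basis(n_scans, tr_s, cutoff_s):
    """Number of cosine components slower than the cutoff (assumed rule)."""
    return int(2.0 * n_scans * tr_s / cutoff_s + 1)


def highpass(signal, tr_s, cutoff_s=128.0):
    """Remove low-frequency cosine components; the constant term is kept."""
    n = len(signal)
    basis = dct_basis(n, n_lowfreq_basis(n, tr_s, cutoff_s))[1:]  # drop constant
    out = list(signal)
    for col in basis:
        # The basis is orthonormal, so each weight is a plain dot product.
        weight = sum(c * y for c, y in zip(col, signal))
        out = [o - weight * c for o, c in zip(out, col)]
    return out


# With 650 scans at TR = 1.5 s and a 128 s cutoff:
print(n_lowfreq_basis(650, 1.5, 128.0))  # 16 components, including the constant
```

Because the cosine basis is orthonormal, the filter reduces to subtracting each component's projection, which keeps the example free of matrix inversion.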

After preprocessing, first-level analyses were performed for each subject, estimating the variance of each voxel according to the General Linear Model. Regressors (see figures 1, 3 and 4) for each of the eight trial types were defined in two different ways to capture early and late BOLD responses. In Model 1, each regressor modeled one of the eight trial types (A: 25%-120¢, B: 25%-60¢, C: 50%-60¢, D: 50%-30¢, E: 75%-40¢, F: 75%-20¢, G: 100%-30¢, H: 100%-15¢) and spanned the entire time interval of one trial, from presentation of the cue to the outcome phase. In this, we stayed close to the models used by Hsu et al. [4] and Tobler et al. [6], who did not model cues, responses and outcomes separately. Model 2 again had eight regressors to model the expectation phase (cue) of each trial (a: 25%-120¢, b: 25%-60¢, c: 50%-60¢, d: 50%-30¢, e: 75%-40¢, f: 75%-20¢, g: 100%-30¢, h: 100%-15¢), plus another 14 regressors to model the outcome phase depending on reward expectation (exp. 25%-100%) and actual outcome (win/omission): a-w: 25%-120¢/win, a-o: 25%-120¢/omission, b-w: 25%-60¢/win, b-o: 25%-60¢/omission, and so on. An additional regressor was defined to model the button press. Each phase was modeled as a boxcar function and convolved with the hemodynamic response function. The six realignment parameters modeling residual motion were also included as regressors in each of the two models.
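The construction of a single regressor (a boxcar convolved with the hemodynamic response function) can be sketched as follows. This is an illustration under stated assumptions, not the SPM5 code: the double-gamma shape with peak parameter 6, undershoot parameter 16 and ratio 1/6 is assumed because the text names only "the hemodynamic response function", and the onsets, the 0.25 s micro-time step and all function names are hypothetical.

```python
import math

DT = 0.25  # hypothetical micro-time resolution in seconds
TR = 1.5   # repetition time in seconds, as stated in the text


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return math.exp((shape - 1) * math.log(t) - t - math.lgamma(shape))


def double_gamma_hrf(duration_s=32.0):
    """Assumed double-gamma HRF sampled every DT seconds, scaled to unit sum."""
    n = int(duration_s / DT)
    h = [gamma_pdf(i * DT, 6.0) - gamma_pdf(i * DT, 16.0) / 6.0 for i in range(n)]
    total = sum(h)
    return [v / total for v in h]


def boxcar(onsets_s, duration_s, total_s):
    """Boxcar that equals 1 during each event and 0 elsewhere."""
    n = int(round(total_s / DT))
    box = [0.0] * n
    for onset in onsets_s:
        start = int(round(onset / DT))
        stop = min(n, start + int(round(duration_s / DT)))
        for i in range(start, stop):
            box[i] = 1.0
    return box


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


def regressor(onsets_s, duration_s, n_scans):
    """Convolve the boxcar with the HRF and sample once per TR."""
    hi_res = convolve(boxcar(onsets_s, duration_s, n_scans * TR), double_gamma_hrf())
    step = int(round(TR / DT))
    return hi_res[::step][:n_scans]


# Hypothetical cue regressor (Model 2): 3.75 s boxcars at two made-up onsets.
cue_regressor = regressor([0.0, 30.0], 3.75, 40)
```

A Model 1 regressor would use the full trial duration of 6.25 s instead of 3.75 s, and an outcome regressor would use 1.5 s with onsets shifted to the start of the feedback.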

The contrast images of parameter estimates from Models 1 and 2 were then combined in second-level group analyses, treating intersubject variability as a random effect. We computed three separate ANOVAs. The first ANOVA (ANOVA 1) comprised eight conditions corresponding to the regressors of Model 1 (entire trial duration). The second ANOVA (ANOVA 2) had as conditions the eight expectation regressors from Model 2. Finally, the third ANOVA (ANOVA 3) comprised the fourteen outcome conditions formulated in Model 2. Within each ANOVA, conditions were weighted with contrasts to model effects of reward magnitude (figure 3), probability, uncertainty (figure 4), expected value (regressors A/a, C/c, E/e > regressors B/b, D/d, F/f) and prediction error (difference between the reward expected and the reward actually received) according to our hypotheses. For statistical maps we used conservative thresholds of p < 0.05, FDR corrected for multiple comparisons, and an extent threshold at the cluster level of p < 0.05.
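The fourteen outcome conditions and their prediction errors can be enumerated from the trial types. The sketch below is illustrative only: the sign convention (received minus expected) and the mean-centering of the weights are assumptions, since the text does not give the exact contrast weights used in ANOVA 3.

```python
from fractions import Fraction

# Trial types of Model 2: label -> (win probability in percent, magnitude in Eurocent).
TRIAL_TYPES = {
    "a": (25, 120), "b": (25, 60),
    "c": (50, 60),  "d": (50, 30),
    "e": (75, 40),  "f": (75, 20),
    "g": (100, 30), "h": (100, 15),
}


def outcome_conditions(trial_types):
    """Map each outcome condition to its prediction error in Eurocent.

    Prediction error is taken as reward received minus expected value
    (assumed sign convention).
    """
    conditions = {}
    for label, (prob, magnitude) in trial_types.items():
        expected = Fraction(prob, 100) * magnitude
        conditions[f"{label}-w"] = magnitude - expected   # win
        if prob < 100:                                    # sure wins have no omission
            conditions[f"{label}-o"] = 0 - expected       # omission
    return conditions


prediction_errors = outcome_conditions(TRIAL_TYPES)

# Hypothetical parametric contrast: mean-centred prediction errors,
# so that the weights sum to zero across the conditions.
mean_pe = sum(prediction_errors.values()) / len(prediction_errors)
contrast_weights = {name: pe - mean_pe for name, pe in prediction_errors.items()}

for name, pe in prediction_errors.items():
    print(f"{name}: {int(pe):+d}")
```

The count of fourteen arises because the two 100% trial types can only end in a win, so they contribute one outcome condition each instead of two.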

To investigate the fMRI signal independently of any model, we analyzed signal time courses in functional regions of interest (ROIs) defined from group activations at thresholds ranging from p < 0.05 to p < 0.001 (FDR corrected). More conservative thresholds were chosen for large clusters, such as the one in the medial orbitofrontal cortex, so as to include only the most significant voxels. For each subject, the first eigenvariate of the signal intensities of all voxels within a predefined functional region was extracted to obtain fMRI signal time series. The event-related time courses depicted in figures 3 and 4 were obtained by first extracting series of 9 time points (TRs) starting with the onset of each trial, and then averaging over all trials of the same type, over the two runs and over subjects. t-tests were used to compute differential effects in the voxel time series using external software (Microsoft Excel, StatSoft Statistica).
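The event-related averaging step can be sketched as follows for a single ROI time series and one trial type. This is a minimal illustration, not the original procedure: since the trial length (6250 ms) is not a multiple of the TR (1500 ms), the sketch assumes that each onset is snapped to the nearest scan, which the text does not specify, and the function name is hypothetical.

```python
def event_related_average(timeseries, onsets_s, tr_s=1.5, n_points=9):
    """Average the n_points scans following each trial onset.

    timeseries: one value per scan (e.g. the first eigenvariate of an ROI).
    onsets_s:   trial onsets in seconds for one trial type.
    """
    epochs = []
    for onset in onsets_s:
        start = int(round(onset / tr_s))  # assumption: snap onset to nearest scan
        epoch = timeseries[start:start + n_points]
        if len(epoch) == n_points:        # skip trials cut off by the end of the run
            epochs.append(epoch)
    if not epochs:
        raise ValueError("no complete epochs for the given onsets")
    # Average across trials, separately for each of the n_points time points.
    return [sum(column) / len(epochs) for column in zip(*epochs)]


# Toy example: a ramp signal and three made-up onsets.
toy_series = [float(i) for i in range(100)]
toy_average = event_related_average(toy_series, [0.0, 15.0, 30.0])
```

Averaging over runs and subjects would then repeat the same operation on the per-run and per-subject averages.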