Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives

Background In neuroscience, experimental designs in which multiple measurements are collected in the same research object or treatment facility are common. Such designs result in clustered or nested data. When clusters include measurements from different experimental conditions, both the mean of the dependent variable and the effect of the experimental manipulation may vary over clusters. In practice, this type of cluster-related variation is often overlooked. Not accommodating cluster-related variation can result in inferential errors concerning the overall experimental effect.
Results The exact effect of ignoring the clustered nature of the data depends on the effect of clustering. Using simulation studies we show that cluster-related variation in the experimental effect, if ignored, results in a false positive rate (i.e., Type I error rate) that is appreciably higher (up to ~20–50 %) than the chosen α-level (e.g., α = 0.05). If the effect of clustering is limited to the intercept, the failure to accommodate clustering can result in a loss of statistical power to detect the overall experimental effect. This effect is most pronounced when both the magnitude of the experimental effect and the sample size are small (e.g., ~25 % less power given an experimental effect with effect size d of 0.20, and a sample size of 10 clusters and 5 observations per experimental condition per cluster).
Conclusions When data are collected from a research design in which observations from the same cluster are obtained in different experimental conditions, multilevel analysis should be used to analyze the data. The use of multilevel analysis not only ensures correct statistical interpretation of the overall experimental effect, but also provides a valuable test of the generalizability of the experimental effect over (intrinsically) varying settings, and a means to reveal the cause of cluster-related variation in the experimental effect.
Electronic supplementary material The online version of this article (doi:10.1186/s12868-015-0228-5) contains supplementary material, which is available to authorized users.


Additional file 1
Effect of neurite location (axon/dendrite) on traveling speed of intracellular vesicles: a worked example
To clarify the procedure of a multilevel analysis, we use a hypothetical example in which measurements of velocity of intracellular vesicles are nested within neurons (NeuronID). In this example, we investigate whether velocity differs between axonal and dendritic measurements (Location). The data were collected from 20 neurons, with on average 100 measurements per neuron (44 to 58 axonal measurements [Location = 0], and 43 to 56 dendritic measurements [Location = 1]), resulting in a total of 2000 measurements of velocity. The outcome variable velocity is standardized (ZVelocity; i.e., the variable is transformed such that it has a mean of 0 and a standard deviation of 1). Standardized variables are easily obtained in e.g. SPSS (Analyze → Descriptive Statistics → Descriptives: select the variables you want to standardize and tick the box "Save standardized values as variables"). Location is dummy coded 0 (axonal) and 1 (dendritic). The advantage of using data in which the outcome variable is standardized and the dummy indicator is coded as 0 and 1 is that the amount of neuron-related variation in the experimental effect, $\sigma^2_{u1}$, can be interpreted according to the guidelines of Raudenbush and Liu [1]. Using these conventions, values of $\sigma^2_{u1}$ equaling 0.05, 0.10, and 0.15 are considered small, medium, and large, respectively. An added advantage of using standardized data is that the intercept variance $\sigma^2_{u0}$ approximates the ICC, and an added advantage of using the dummy coding 0 and 1 is that the intercept variance $\sigma^2_{u0}$ equals the cluster-related variation in the mean value of axonal measures (i.e., the condition coded as 0). We illustrate multilevel analysis using the statistical package SPSS, and syntax is provided for each step. To illustrate how the same analyses can be run in R, corresponding R code is provided at the end of the document.
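The data preparation described above (standardizing the outcome and dummy coding Location) can be sketched outside SPSS as well. The following Python snippet is a minimal illustration only: the function names and the six data values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption about how the standardization is carried out.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD with n - 1 denominator
    return [(x - m) / s for x in values]


def dummy_code(labels, reference="axon"):
    """Code the reference category as 0 and every other category as 1."""
    return [0 if label == reference else 1 for label in labels]


# Hypothetical raw measurements, for illustration only (not the study data).
velocities = [1.2, 0.8, 1.5, 0.9, 1.1, 0.7]
locations = ["axon", "axon", "axon", "dendrite", "dendrite", "dendrite"]

z_velocity = standardize(velocities)  # plays the role of ZVelocity
location = dummy_code(locations)      # plays the role of Location (0/1)
```

After this step the standardized outcome has mean 0 and standard deviation 1, which is what allows the variance components to be read against the small/medium/large guidelines mentioned above.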

Assumptions
One of the assumptions of standard multilevel analysis is that the outcome variable is normally distributed. A visual inspection of the distribution of ZVelocity for the axonal and dendritic measurements separately shows that ZVelocity can be considered normally distributed. When data are non-normal, transformations can be considered, or a multilevel model for non-normal data can be used (e.g., SPSS also allows for multilevel analysis of dichotomous and Poisson distributed outcome variables). When the results of multilevel analysis of the transformed and untransformed data are similar, interpreting the results of the untransformed data can be easier, and is therefore recommended. Another assumption concerns the absence of outliers: observations with standardized values below -3.33 or above 3.33 (assuming standard normally distributed data) need to be excluded from the analysis.
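The outlier rule above amounts to a simple filter on the standardized outcome. The sketch below assumes the values are already z-scores; the function name and the example values are hypothetical.

```python
def exclude_outliers(z_values, cutoff=3.33):
    """Keep only standardized values within [-cutoff, +cutoff]."""
    return [z for z in z_values if abs(z) <= cutoff]


# Hypothetical standardized velocities, for illustration only.
z_scores = [-3.5, -1.2, 0.0, 0.4, 2.9, 3.4]
kept = exclude_outliers(z_scores)  # drops -3.5 and 3.4
```

In a real data set the filter would be applied to whole records, so that the neuron identifier and Location of each retained measurement stay attached to it.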

Analysis
Multilevel analysis is conducted in a stepwise manner, building up the model from simple to more complex. Before conducting the actual multilevel analysis, we first visually examine the variance between and within neurons of the measured velocities to get an idea of the degree of relative similarity between observations obtained from the same neuron, and of how much the difference between axonal and dendritic measurements varies over neurons. We plot the measured velocities for each neuron separately and color code the distinct measurements from the axon and dendrites: syntax and output are shown in Table S1. The figure shows that there is considerable variation in both axonal and dendritic measurements, both within and between neurons. In addition, the difference in velocity between axonal and dendritic measurements varies over neurons: in some neurons the measured velocities of axonal and dendritic vesicles overlap completely, and in others they do not. In general, the velocity seems slightly lower for dendritic measurements, but this of course needs to be tested.
Intercept only model
In order to perform the analysis, one additional variable has to be created: an artificial intercept (int), which is a variable that always has value 1.
Next, an estimate of the intracluster correlation (ICC) can be obtained by running an intercept only model (see equation 1 in Box 1), i.e., a model in which every neuron is allowed to have its own mean velocity, but that does not include Location as experimental variable: syntax and selected output are presented in Table S2.
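Because Box 1 is not reproduced in this file, the intercept only model can be written out in conventional multilevel notation as a reminder. This is a sketch under the assumption that the symbols match those used elsewhere in this example ($\sigma^2_{u0}$ for the intercept variance, $\sigma^2_e$ for the residual variance); the exact form of equation 1 in Box 1 may differ.

```latex
% Intercept only model: observation i nested in neuron j
\begin{align}
  Y_{ij}     &= \beta_{0j} + e_{ij},    & e_{ij} &\sim N(0, \sigma^2_e) \\
  \beta_{0j} &= \gamma_{00} + u_{0j},   & u_{0j} &\sim N(0, \sigma^2_{u0})
\end{align}
```

Here $\gamma_{00}$ is the overall mean velocity, $u_{0j}$ is the deviation of neuron $j$ from that mean, and $e_{ij}$ is the within-neuron residual.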
In the intercept only model, the intercept represents the overall mean value for velocity, i.e., velocity calculated across all cells and across both axonal and dendritic measures. As we standardized the variable velocity, the overall mean is zero. In the Table "Estimates of covariance parameters" we see that the variation in the mean velocity over neurons equals .505 (i.e., the intercept has a variance of .505, suggesting that the mean velocity shows variation between neurons). To obtain an estimate of the intracluster correlation (ICC; a standardized measure of the variation of the mean value over neurons), apply equation 3 in the main text: ICC = $\sigma^2_{u0} / (\sigma^2_{u0} + \sigma^2_e)$ = .505 / (.505 + .494) = .506. This means that when only considering the mean ZVelocity of each neuron (i.e., not distinguishing between Location), 50.6% of the variability in ZVelocity is due to differences between neurons, i.e., can be explained by neuron-membership. Note that because the outcome variable vesicle velocity is standardized in our model, the variance estimate of the intercept (.505411) can simply be interpreted as the ICC because the total variance adds up to 1 (the slight deviation in the fourth decimal is due to the fact that ZVelocity is not perfectly normally distributed).
Note that the residual variance (denoted as $\sigma^2_e$ in the equation to obtain the ICC and estimated at 0.494) represents the variation observed within each neuron, i.e., the variability in velocity measures taken from the same neuron.
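The ICC computation can be checked with a few lines of Python. The snippet below is an illustrative sketch: the function name is hypothetical, and the two inputs are the intercept variance (.505411) and residual variance (0.494) quoted in the text.

```python
def intracluster_correlation(var_intercept, var_residual):
    """ICC = between-cluster variance / (between-cluster + within-cluster variance)."""
    return var_intercept / (var_intercept + var_residual)


# Variance estimates quoted in the text for the intercept only model.
icc = intracluster_correlation(0.505411, 0.494)
print(round(icc, 3))  # 0.506
```

Because the two variance components sum to almost exactly 1 for a standardized outcome, the ICC is nearly identical to the intercept variance itself.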
The statistical significance of the variation in the intercept can also be assessed. However, the Wald test reported in the table is not appropriate to test the significance of variances (i.e., the asymptotic Wald test assumes normally distributed variance components, which is unrealistic [2]). Whether the variance component is significantly different from 0 can, however, be tested using a chi-square ($\chi^2$) test. If we square the Wald Z statistic in the table, we get approximately a chi-square value, $\chi^2 \approx Z^2$, with the number of degrees of freedom being 1 (i.e., we test only 1 parameter, namely the variance of the intercept; see footnote 1). Since a variance component cannot be negative and this parameter is thus subject to boundary constraints (see e.g. [3][4][5]), the accompanying p-value, which equals .002, needs to be divided by 2: p = .001. Assuming α = .05, this test is significant, i.e., the variation of the intercept over neurons is significantly different from 0. Note that in research design B, not accommodating the variation in the intercept results in a decreased power to detect the overall experimental effect (unlike in research design A, where not accommodating the variation in the intercept results in an inflated false positive rate).
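The test of a variance component described above (square the Wald Z, refer it to a chi-square distribution with 1 degree of freedom, then halve the p-value) can be sketched as follows. The function name is hypothetical, and the Wald Z of 3.0 is an assumed value for illustration, not the value from Table S2. The snippet uses only the Python standard library, relying on the identity that the upper tail of a chi-square distribution with 1 degree of freedom at x equals erfc(sqrt(x / 2)).

```python
import math


def variance_component_test(wald_z):
    """Chi-square test (1 df) of a variance component from its Wald Z.

    Returns the chi-square value, the ordinary p-value, and the halved
    p-value that accounts for the boundary constraint (variance >= 0).
    """
    chi2 = wald_z ** 2
    # Upper-tail probability of a chi-square(1) variable at chi2.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value / 2.0


# Hypothetical Wald Z, for illustration only.
chi2, p_full, p_halved = variance_component_test(3.0)
```

The halved p-value is the one to compare against the chosen α-level.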
Model including fixed effect of Location on vesicle velocity
After we estimated the intercept only model and the ICC, we add Location to our model to identify its effect on vesicle velocity, i.e., to test whether the velocity of vesicles differs between axons and dendrites. We first add Location only as a fixed variable to the model, and then extend the model to include the possible variation in the difference between axonal and dendritic measurements over neurons. The syntax and selected output for the model only including the fixed effect of location are presented in Table S3. We see that the overall effect of location equals -0.564. However, we cannot draw any conclusions on the significance of this effect, as we did not accommodate the possible variance of the effect of location over neurons. Note that adding the experimental variable Location results in a decreased residual error (i.e., from .494 to .414), i.e., ZLocation partly explains why the observations within neurons vary. Also note that the variation between neurons (i.e., the intercept variance of 0.506) remains almost the same (up to the third decimal place) compared to the model that does not include ZLocation. However, the interpretation of the variation in the intercept is now changed to neuron-related variation in the mean velocity of axonal measurements specifically, whereas before Location was included as predictor in the model it was interpreted as neuron-related variation in mean velocity in general (i.e., both axonal and dendritic measures).
Footnote 1: Note that SPSS prints -2 Log Likelihood information in the table with information criteria. Usually, this -2LL information is used to calculate the chi-square test. However, SPSS sometimes uses pseudo maximum likelihood estimation, and then the -2LL values of different models cannot be used to compute a chi-square value.

Model including both the fixed and random effect of Location on vesicle velocity
The syntax and selected output for the model including variance in the effect of Location are presented in Table S4. Note that we save some variables in the last line of the syntax; these are used later to plot the results. Also note that we set covtype to DIAG, i.e., the variance-covariance matrix of the random effects is diagonal, meaning that we assume that the intercept and Location effect parameters have variances (on the diagonal) but that they do not correlate (i.e., the covariance, which is given by the off-diagonal elements, is 0, i.e., the neuron-specific mean axonal velocity is not related to the neuron-specific effect of ZLocation on vesicle velocity). The overall effect of Location on vesicle velocity is approximately the same as in the previous analysis: -0.565, and is highly significant with p < .001. The effect size d of Location is obtained through $\gamma_{10} / \sigma^2_e$ [6], which corresponds to −0.565/0.387 = −1.460. By convention, effect sizes of 0.20, 0.50 and 0.80 are considered small, medium, and large, respectively [7]. As such, the overall effect of location corresponds to a (very) large effect. The 95% confidence interval (CI) assuming a normal distribution is obtained through $\gamma_{10} \pm Z_{1-\alpha/2} \times SE_{\gamma_{10}}$, which corresponds to -0.565 ± 1.96 × 0.079 = [-0.720, -0.410] (the deviation from the SPSS output arises because SPSS uses the t distribution with 20.05 degrees of freedom to obtain the 95% CI).
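The arithmetic behind the effect size and the normal-approximation confidence interval can be reproduced with the Python standard library. This is a sketch using the estimates quoted in the text; the variable names are hypothetical, and 0.387 is simply the denominator reported above for the effect size.

```python
from statistics import NormalDist

# Estimates quoted in the text for the model with a random Location effect.
gamma_10 = -0.565      # overall (fixed) effect of Location
se_gamma_10 = 0.079    # standard error of gamma_10
denominator = 0.387    # denominator used in the text for the effect size
alpha = 0.05

# Effect size d as computed in the text.
effect_size_d = gamma_10 / denominator

# Two-sided 95% CI under a normal approximation: critical value Z_{1 - alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci_lower = gamma_10 - z_crit * se_gamma_10
ci_upper = gamma_10 + z_crit * se_gamma_10

print(f"d = {effect_size_d:.3f}, 95% CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
```

With a t distribution on about 20 degrees of freedom, as used by SPSS, the critical value would be somewhat larger than 1.96 and the interval correspondingly wider.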