Meta-analysis of gene coexpression networks in the post-mortem prefrontal cortex of patients with schizophrenia and unaffected controls
© Mistry et al.; licensee BioMed Central Ltd. 2013
Received: 5 February 2013
Accepted: 23 September 2013
Published: 26 September 2013
Gene expression profiling of the postmortem human brain is part of the effort to understand the neuropathological underpinnings of schizophrenia. Existing microarray studies have identified a large number of genes as candidates, but efforts to generate an integrated view of molecular and cellular changes underlying the illness are few. Here, we have applied a novel approach to combining coexpression data across seven postmortem human brain studies of schizophrenia.
We generated separate coexpression networks for the control and schizophrenia prefrontal cortex and found that differences in global network properties were small. We analyzed gene coexpression relationships of previously identified differentially expressed ‘schizophrenia genes’. Evaluation of network properties revealed differences for the up- and down-regulated ‘schizophrenia genes’, with clustering coefficient displaying particularly interesting trends. We identified modules of coexpressed genes in each network and characterized them according to disease association and cell type specificity. Functional enrichment analysis of modules in each network revealed that genes with altered expression in schizophrenia associate with modules representing biological processes such as oxidative phosphorylation, myelination, synaptic transmission and immune function. Although a immune-function enriched module was found in both networks, many of the genes in the modules were different. Specifically, a decrease in clustering of immune activation genes in the schizophrenia network was coupled with the loss of various astrocyte marker genes and the schizophrenia candidate genes.
Our novel network-based approach for evaluating gene coexpression provides results that converge with existing evidence from genetic and genomic studies to support an immunological link to the pathophysiology of schizophrenia.
KeywordsSchizophrenia Microarray Gene coexpression network Postmortem brain
Schizophrenia is a severe psychiatric disorder with an elusive etiology. Gene expression profiling of the postmortem human brain has been frequently used as a means to investigate patterns of molecular disruption in the brains of patients with schizophrenia. One of the most common types of analysis applied to expression profiling data is differential expression; which is used to identify over- or under-expressed genes associated with the illness. Candidate genes identified from expression profiling studies in schizophrenia have implicated alterations in different cellular systems, including myelination, synaptic transmission, metabolism, and ubiquitination . These findings are not always replicated across studies, nor have they been successfully integrated into a comprehensive biological framework.
In our previous work, we used a large combined cohort to identify a meta-signature of genes which are consistently differentially expressed in the prefrontal cortex of patients with schizophrenia . The functions reflected in these genes are diverse and the interactions among them are largely unexplored. Because gene function is partly defined by interactions with other genes (at the biochemical, physical interaction, genetic or regulatory levels), it is attractive to apply gene coexpression network analysis to aid in interpretation.
In general, gene networks can be analyzed to identify higher-level features of gene-gene relationships based on graph theoretic considerations such as node degree or clustering coefficient [3–5]. Evaluating the broader network structure allows us to detect modularity in the graph, or groups of densely connected nodes with sparse connections between groups. Characterization of these ‘modules’ can convey useful information as they may be associated with specific molecular complexes or functions, yielding hypotheses that would be difficult to ascertain based on a gene-by-gene analysis. It is important to note that the terminology “gene coexpression network” refers to a sparse representation of the correlation structure among genes, and that such networks are not amenable to straightforward interpretation in the way that protein interaction or metabolic networks are. However, a key advantage of coexpression is that there is relatively abundant data, so “condition-specific networks” can be constructed. Thus one can evaluate differences between condition-specific networks to help elucidate systems level molecular dysfunction.
Such coexpression network analyses have recently been applied to a number of postmortem human brain expression profiling datasets for examining general transcriptome patterns of the CNS , and to interrogate the molecular basis of neuropsychiatric disease [7–10]. Torkamani and colleagues  conducted a network analysis by combining two independent schizophrenia expression profiling datasets. Expression data was merged across control and schizophrenia cohorts and modules of coexpressed genes were characterized according to disease characterization, cell type specificity and the effects of aging. A more recent cross-cortical network study was carried out by Roussos et al.  using control and schizophrenia samples across four different brain regions. Discrete modules of coexpressed genes displayed high preservation between control and schizophrenia networks for all but one module. Brain regional differences were assessed with an analysis of variance comparison of module eigengene expression, with changes only observed in the control network. Chen et al.  also explored networks using combined data from schizophrenics and controls. Two modules were associated with genes differentially expressed with disease across the datasets; one which was specific to cerebellar cortex and the other identified across all brain regions. They did not report any differences in networks between schizophrenics and controls. Although Chen et al. used four data sets, they were not independent as three of the datasets used samples from the same brain collection.
In this study, we applied coexpression network analysis to seven independent gene expression studies of the prefrontal cortex to demonstrate, in agreement with previous studies, an overall similarity in transcriptome organization between healthy controls and individuals with schizophrenia. We then examined network properties of genes we previously reported to be differentially expressed in schizophrenia  within each network to reveal features of these genes that are not observed with other functional gene groups or other brain-related disease gene sets. Finally, using a network clustering approach, we found evidence for functional dysregulation of immune-related processes in schizophrenia. Our results complement previous gene expression and genetics evidence supporting an immunological aspect of the disorder.
Summary of demographic variables across combined cohort
Number of Subjects
56.25 ± 20
55.27 ± 19
p = 0.67
101 M : 52 F
113 M : 40 F
p = 0.1
6.5 ± 0.28
6.39 ± 0.29
p = 0.001
21.95 ± 15.3
22.65 ± 15.2
p = 0.69
Whole network properties of the control and schizophrenia brain networks
Maximum node degree
Mean node degree
Shortest path length
log-log fit (R2)
Number of modules
Schizophrenia gene set network properties
Edges (within gene set)
Random gene set comparison
Random gene set comparison
The trends observed for SZUP and SZDOWN between the two networks were small and only marginally significant. To evaluate whether each of the individual gene set values were unusual in the networks we implemented three different methods of control. A first control was supplied by comparing observed network measures for SZUP and SZDOWN to a background distribution of 1000 randomly selected gene sets of matched size and node degree (see Methods). The difference between the observed values and background was assessed by computing z-scores and p-values, as reported in Table 3. Our strongest result was for the clustering coefficient of both gene sets. The p-values for the SZUP clustering coefficient indicate that the high value in SZ is significant when compared to a background distribution (p = 0.005). For SZDOWN, the high clustering coefficient in CTL showed a trend difference compared to the background distribution (p = 0.06). Together, these results converge to highlight that the gene neighbouring meta-signature gene sets are unusually highly coexpressed, with the SZUP genes displaying this property in the SZ network and the SZDOWN genes tending to display this in the CTL network.
We next evaluated whether our meta-signature gene sets share network properties with gene sets associated with other brain-related disorders. We assembled gene sets for five different disorders, mostly based on findings from genetic association studies. For each disease gene set, z-scores were computed based on a background distribution and compared against the GO group z-scores in each network (Additional file 1: Table S1). Of particular interest are the results observed for clustering coefficient in the two networks. Interestingly, the Alzheimer’s disease gene group (Figure 4, red arrows) exhibited strikingly similar properties to SZUP in both networks despite having only one overlapping gene. Notably, the Parkinson’s disease gene group (Figure 4, blue arrows) follows a similar but more subtle trend as SZDOWN.
One concern is that these features are particular properties of the data sets we analyzed. In the absence of sufficient independent data, we assessed the robustness of the network measures observed for SZUP and SZDOWN using a jackknife procedure. In this process, we removed one of the seven datasets and regenerated aggregate CTL and SZ networks on the remaining six, for each study in turn. This yields seven pairs of jackknife networks. For each jackknifed network, the average shortest path length and clustering coefficient was computed for SZUP and SZDOWN and values were compared between networks (Additional file 2: Figure S1). For SZUP, we observed a general agreement of increasing clustering coefficient and consistently decreasing path length between CTL and SZ across all iterations. For SZDOWN, we found that only the clustering coefficient effects were robust to removing single data sets; the path length results proved to be more sensitive. Taken as a whole, these results are consistent with subtle network property differences for the SZUP and SZDOWN genes between the two networks.
Disease characterization of coexpression modules in each network
Cell-type marker enrichment of coexpression modules in each network
Low stringency threshold ( > 4-fold)
Low stringency threshold ( > 4-fold)
The next two sets of modules that showed significant association with schizophrenia were also notable in high enrichment of cell-type marker genes. Module CTL25 and Module SZ20 were both significantly enriched for oligodendrocyte markers genes (e.g. OLIG2, MBP, MAG) with a 61 percent overlap in total gene membership, and each contained two SZUP genes. Highly ranked GO terms from functional enrichment analysis of each module included the presence of terms such as “ensheathment of neurons” and “regulation of action potential” suggesting this is was a myelination-related module. In contrast, modules CTL20 and SZ9 (which have 71 percent overlap in gene membership) were significantly enriched for astrocyte marker genes (Table 5; p < << 0.001). The common functional role suggested by GO enrichment centered around glutamine metabolism (e.g. GAD1, GAD2; see Additional file 5: Table S4 and Additional file 6: Table S5). SZUP genes also identified with this module, with a slightly higher number of genes identified in the SZ network (Table 3; p < < 0.001).
The final top disease-associated modules from each network, CTL18 and SZ2, were each associated with different cellular processes. Module CTL18 was enriched for neuronal marker genes and functional enrichment analysis revealed an association with genes related to neurotransmitter secretion. Furthermore, ten of the SZDOWN genes overlapped with this module. In the schizophrenia network, twelve SZDOWN genes (none of which overlap with the previous ten) were identified in Module SZ2. Module SZ2 is also neuron marker enriched but is associated with genes involved in ubiquitination. The two modules showed very little overlap with each other in terms of gene membership (with only six genes in common) and each highlighted different functional roles, yet both modules exhibited an association with schizophrenia in their respective networks.
We next evaluated the impact of covariates on our results by using expression changes known to be associated with age, brain pH and sex. Gene lists for these factors were compiled from a previous study of healthy control postmortem brain , and hypergeometric probabilities were computed to evaluate the significance of overlap with each module (Additional file 7: Table S6). No significant overlap was observed with the sex genes in either network. The age and pH genes displayed enrichment across the top modules which was mostly consistent between the control and schizophrenia networks. However, differential enrichment was observed for the immune response module, a trend similar to what we observed with SZUP genes and astrocyte marker genes. Expression data for the age and pH genes from the immune response module were plotted and fold change values were computed to examine the extent to which these genes might be affecting the network changes we observed (Additional file 8: Figure S2). We found expression to be variable within each cohort and differences in mean expression between cohorts were very small and not significant. Thus, it seems unlikely that there are large confounding effects of age and pH driving the changes we see in coexpression clustering between the two networks. We also addressed medication effects by cross-referencing lists of gene expression changes associated with lifetime antipsychotic use, to the top modules identified in our network analysis. The SMRI Online Genomics Database (https://www.stanleygenomics.org/) provides gene lists for several demographic variables which have been independently assessed using only the diseased subsets of each dataset. From the database we extracted significant gene lists (p < 0.001; FC > 1.2) pertaining to the effects of lifetime antipsychotics (69 genes) in subjects with schizophrenia. In the control network, a total of 34 medication-associated genes overlapped with our top modules, with the largest number found in the immune response module (Additional file 7: Table S6). In the schizophrenia network, roughly the same number of genes are re-distributed across the modules, but a significant overlap with the immune response module remains.
Our network-based approach for evaluating gene coexpression provides a novel assessment of coexpression patterns across seven large schizophrenia microarray datasets. We implemented a rank aggregation approach for network analysis revealing interesting patterns of molecular connectivity in the control and schizophrenia postmortem human brain. Overall, the two coexpression networks were very similar to one another. This is consistent with existing findings from network analysis in schizophrenia [8, 10, 11]. The control and schizophrenia networks shared a similar node degree distribution, and average values of path length and clustering coefficient taken across all nodes in the network were not significantly different. However, closer inspection revealed differences of potential biological significance.
To evaluate differences in gene-gene connectivity between networks, we initially focused on the network properties of 95 differentially expressed ‘schizophrenia genes’ as reported in our previous study of these same data sets . This gene list was divided into two groups: 1) genes which are up-regulated in schizophrenia and 2) genes which are down-regulated in schizophrenia. We examined the network properties of each gene set within the control and schizophrenia networks and identified distinguishing features of our ‘schizophrenia genes’. The clustering coefficient, a measure which gives us insight into the community structure of nodes, proved to be an interesting characteristic of both gene sets. Importantly, we applied control protocols to demonstrate that this differential coexpression among the neighbourhood of ‘schizophrenia genes’ is not observed with other groups of “functionally related” genes and most other brain-related disease gene groups.
Datasets used in coexpression network analysis
No. of Subjects
Prefrontal cortex (BA9)
Garbett K. et al., 2008 
Prefrontal cortex (BA46)
Katsel P. et al., 2005 
Maycox P. et al., 2009 
HG-U133 Plus 2.0
Anterior prefrontal cortex (BA10)
Narayan S. et al., 2008 
HG-U133 Plus 2.0
BAZ1A encodes a subunit of the chromatin assembly factor (ACF), which together with other proteins comprises the chromatin remodeling complex. Although it is not directly associated with the immune system, we found that it was coexpressed with a number of immune activation genes (Figure 6), suggesting a possible role in the transcriptional regulation of immune related genes. TMEM176A was also coexpressed with immune-related genes, complement component 4A (C4A) and complement component 4B (C4B), both of which are also not found in the SZ immune module (Figure 6). TMEM176A encodes a transmembrane protein, which together with TMEM176B when overexpressed has been shown to block dendritic cell maturation in rats . Dendritic cells (DC) are immune cells found in most major organs, and their ability to regulate immunity is dependent on DC maturation. The brain has long been considered devoid of DC in the absence of inflammation, with microglia charged with many functional attributes commonly ascribed to DC. However, recent evidence has illustrated that DC are found in various tissue reservoirs within the steady-state CNS and are also potential players in brain immune surveillance .
Genes differentially expressed between schizophrenia subjects and healthy controls have also been identified through a recent combined analysis conducted on six of the same datasets used in our study . Perez-Santiago et al. identified 117 up-regulated and 43 down-regulated probes of which 8 and 10 probes overlap with our up- and down- meta-signatures, respectively. The genes from Perez-Santiago et al. displayed a similar enrichment pattern as the Mistry et al. signature genes in the top modules in each of our networks. Specifically, the up-regulated genes showed a trend that coincides with trends observed with the astrocyte marker genes and SZUP genes. In the control network, the immune module contains 44 up-regulated genes and in the schizophrenia network the number of genes in the immune module is reduced to 19 genes. These findings provide additional support to our discussion on genes up-regulated in schizophrenia and alteration of immune-related processes.
The immunological link to the pathophysiology of schizophrenia suggested from our network analysis is not a new concept. Linkage and GWAS support an association of a broad section of markers in the major histocompatibility complex (MHC) region at 6p21.33 with schizophrenia [32, 33] and many immune-related genes, genetic variants and haplotypes are also implicated in schizophrenia [34–36]. Several studies of gene expression in the postmortem PFC have reported alterations in immune and stress-response genes in subjects with schizophrenia [37–39]. Additionally, the investigation of gene expression in peripheral blood mononuclear cells (PBMCs) of drug-naïve schizophrenia subjects also supports alteration of immune-related processes [40–42]. Using our panel of schizophrenia genes found from meta-analysis we were able to identify unique features of the coexpression network and highlight relevant areas of dysfunction which may contribute to the pathophysiology of schizophrenia. However, our interpretation of the network model is based on GO enrichment and further investigation at the individual gene level will provide an explanation of higher resolution. We also sought evidence to support the plausibility of schizophrenia genetic risk variants driving the changes we observed. From the SZGene (http://www.szgene.org/) database we compiled the top 45 most reliable genetic associations based on findings from a meta-analysis  . This gene list was cross-referenced with the top modules from each network and we found no overlap with the immune module in either network.
One conclusion of our study is the effects on gene coexpression patterns in schizophrenia, like those on expression levels, are subtle in the face of other sources of variability. This suggests that while seven datasets is a much larger cohort than used in previous studies, our study would benefit from having additional data. Aggregating across a larger number of datasets has been shown to result in networks more comparable to PPI networks , and is likely to give more reliable interactions, but saturation of some properties was not reported to be achieved until >20 networks were combined. To help ameliorate concerns of robustness, we used a jackknife approach. The results from the jackknife analysis also address concerns regarding the reuse of datasets for both differential expression  and coexpression. We established that our findings are not overly sensitive to influence of any single data set. Also, for our null model comparisons we applied stringent controls whereby node degree was controlled for. Because node degree is one of the most important features of a network, it can drive numerous effects on other topological properties such as clustering coefficient and shortest path length. Thus controlling for node degree in generating null distributions is critical for avoiding false positives. Similarly, because we wished to identify network features which are associated with schizophrenia, we controlled for “generic” network features by comparing our schizophrenia gene sets to the properties of other “functionally coherent” gene groups, not just randomly selected genes. This approach was motivated by recent work from our group showing that generic multifunctionality effects can strongly skew the interpretation of gene networks .
As is the case with most postmortem brain studies in schizophrenia, our results should be interpreted in the context of several caveats. Samples used in this study were taken from patients having lived with schizophrenia for various lengths of time, and often having received medications. We cannot be sure that the changes we have identified are direct effects of the illness or are secondary to an underlying pathology. We were unable to obtain medication information for all samples used in this study (specifically, GSE17612 and Mclean66), and therefore were unable to precisely identify the extent to which antipsychotic use affects the results of our network analysis. We found that medication-associated genes overlap to a similar degree with the immune response module in both networks, which suggests that coexpression clustering patterns in schizophrenia are not driven by medication effects. The effects of other confounding variables (i.e. age, pH and sex) were also addressed in a similar manner, however we cannot exclude with certitude the possibility that the network properties we have identified are still in some way influenced by these extraneous factors. Also, samples used in our study comprise a heterogeneous collection of cell types from the DLPFC. While the majority of our datasets utilized samples from Brodmann areas 10 and 46, we also included one dataset from Brodmann area 9. Focusing on samples within a specific region would be ideal, as we avoid the potential dilution of cell-specific biological signals associated with schizophrenia. However, we included samples from all three Brodmann areas to maximize total sample size in our study and increase the power of our analysis.
Finally, we wish to stress limitations to the interpretation of coexpression networks. In contrast with other biological networks (i.e. protein interaction networks or metabolic networks) whose edges represent well-defined biological interactions, the edges in a coexpression network are a reduced representation of the correlation structure of the data. The edges are related to values of the pairwise correlation coefficient that are calculated from the expression data of the genes, and are dependent on the threshold applied to infer those networks. A connection between two genes in a coexpression network does not necessarily correspond with a connection in PPI networks, pathway or regulatory networks . Thus, when studying gene coexpression networks it is important not to confuse the edges as direct physical interactions. Indeed, there is growing evidence that a substantial proportion of the variance in gene expression among replicate brain samples is explained by variance in cellular composition [5, 6]. Our work supports this as we identify “modules” that are strongly associated with different cell type marker genes. Thus rather than identifying “physical interactions” among gene products, coexpression patterns to a large degree seem to reflect cell-type enriched expression, that is, expression in the same cell, but not necessarily finer levels of granularity such as a pathway. Changes in the composition of such modules might reflect changes in cellular states in subpopulations of cells, or changes in the associations of cell types with one another . It is important not to interpret such differences as meaning that a physical interaction has been gained or lost among gene products.
In summary we have contributed the largest meta-analysis of gene coexpression in schizophrenia. We evaluated various topological properties of the control and schizophrenia networks to reveal a shared coexpression structure between them. Characterization of functional clusters in each network with cell-type marker genes displayed differences that link together disease-related processes. Differentially expressed genes in schizophrenia also associate with biologically relevant clusters providing evidence for systems level dysfunction. Further research is required to disentangle these network findings to distinguish primary from secondary disease phenomena, but we hope our study will encourage new directions in the network biology of schizophrenia. Finally, our work demonstrates novel methodological approaches that can assist in ensuring that coexpression analyses yield biologically justifiable and robust results.
Data processing and quality control
Expression profiling data sets were selected on the basis of microarray platform, use of prefrontal cortex (BA 9, 10 or 46), the availability of information on covariates such as age, and finally the availability of the raw data . Details on each of the seven datasets, including the source citation, can be found in Table 6. Data were preprocessed as described; briefly, expression levels were summarized, log2 transformed and normalized for each individual dataset using the R Bioconductor ‘affy’ package , with default settings for the RMA algorithm. Sample outliers were removed from each dataset based on an inter-sample correlation analysis, resulting in a total of 306 samples (153 from schizophrenia subjects, 153 from unaffected controls) across the seven data sets. Schizophrenia and control groups had no significant differences in age and PMI (Table 1). Sex differences were assessed by use of a chi-squared test for equality of proportion, and we observed no significant difference (p = 0.1). Brain pH was significantly different (t-test; p = 0.001). For each of the seven data sets, batch information was obtained using the ‘scan date’ stored in the CEL files; chips run on different days were considered different batches and batch effects for each dataset were removed using the ComBat algorithm .
Gene coexpression networks
For each dataset, samples were separated into control and schizophrenia cohorts. Probes were mapped to genes using annotations provided in Gemma (http://www.chibi.ubc.ca/Gemma), which are based on stringent methods described in . For genes mapping to multiple probes, the average expression value was retained. Only genes that were represented in all seven datasets were considered, leaving a total of 12,582 genes. This yielded seven expression data matrices for schizophrenia and seven for controls (one for each study). Separate networks were constructed for the schizophrenia and control groups based on previously described methods . Briefly, a gene expression profile similarity matrix was computed for each cohort by taking the absolute value of the Pearson correlation between all possible gene pairs. Correlation values in the similarity matrix were replaced by ranks. These similarity matrices were aggregated by cohort across datasets by taking the mean rank for each gene pair. We previously showed that this aggregation procedure is a robust method for producing high-quality coexpression networks . In keeping with previous work [44, 51], the aggregated matrix was thresholded at 0.5% sparsity, resulting in an adjacency matrix of 392,606 connections for each of the control and schizophrenia cohorts.
Random coexpression networks
To evaluate the significance of network measures across the whole network, formulation of appropriately randomized null models are required. We devised a procedure that results in a random network with the same number of genes and the same node degree distribution as the original data. Additionally, the node degree for each individual gene is preserved (i.e. each gene still has the same number of connections, but the specific genes to which it is connected to are scrambled). It has been previously shown that both PPI and coexpression networks show a correlation between node degree and gene multifunctionality . Thus, by constraining each gene by its node degree we can systematically assess the significance of other topological properties of the network (i.e. clustering coefficient and shortest path length) while controlling for any potential multifunctionality bias in the microarray data. To create random networks, all gene pairs were assembled into an adjacency list (2 columns, 392,606 rows) and genes on one side of the edge were permuted. The resulting edges that created self-connections and/or duplicate gene pairs were isolated and permutation was re-applied to them. This was done iteratively until the number of conflicts was reduced to ten or less. These remaining conflicting edges were removed from the final random network.
We explored three different network properties, each of which is briefly described below.
Each gene can be characterized by the number of connections it has, that is, the number of other genes it is significantly coexpressed with. This property is called the node degree. Node degrees were characterized by their distribution. For many biological networks the degree distribution has been characterized as ‘scale-free’, or at least ‘heavy tailed’. This can be observed by the quality of a linear fit of the distribution on log-log scale .
The shortest path length measures the shortest distance to get from one gene to another gene by traversing edges in the network. In an un-weighted network this is the least number of edges traversed to get between the two genes. We computed shortest paths using Dijkstra’s algorithm . A value is obtained for a gene against every other gene in the network, and presented as the mean shortest path length across all genes. Genes without any direct neighbours are treated as missing values.
The clustering coefficient of a gene indicates how connected the direct neighbours of a gene are to one another. It is the ratio of the number of connections in the neighbourhood of a node to the number of connections if the neighbourhood was fully connected. The clustering coefficient ranges from zero to one. A value of 1 would indicate that all the neighbours of a node are all connected to each other, or ‘cliquish’ in nature. A value of 0 would indicate that none of the neighbours of a node are connected to each other. This measure can only be computed for nodes that interact with more than one other node.
Schizophrenia meta-signature network analysis
The meta-signature gene set of 25 up- and 73 down-regulated schizophrenia genes were obtained from the results of our meta-analysis of differential expression on the same datasets . Four genes were removed from the down-regulated gene set as they were not present in the network, leaving a total of 94 ‘schizophrenia genes’. Throughout this chapter we will refer to these gene sets as SZUP and SZDOWN for the genes up- and down-regulated, respectively. Average values of shortest path length and clustering coefficient for the SZUP and SZDOWN gene sets were evaluated within each network. To estimate the relevance of the network measures for SZUP and SZDOWN, we implemented three important controls described below.
Random gene set comparison
For each meta-signature gene set, the average values of shortest path length and clustering coefficient were compared to a background distribution in each network. The background distribution was generated by randomly selecting 1000 gene sets. Random gene sets were constrained by size and node degree of the meta-signature gene set to control for multifunctional bias. To ensure a well-matched node degree for each random gene set, selection was done on a per-gene basis by choosing a random gene within ± 50 of its node degree rank. Z-scores were then computed to quantify the difference between the mean of the background distribution to the observed values for each network measure of SZUP and SZDOWN. For positive z-scores a p-value was computed reflecting how many random gene sets have values higher than the observed value. For negative z-scores a p-value was computed reflecting how many random gene sets have values less than the observed value.
Functional gene set comparison
Although our meta-signature of schizophrenia genes span a range of cellular functions, they possess a shared functional feature of altered expression in schizophrenia. Thus, it is important to assess whether the network properties we observe with our meta-signature gene sets are not just a property of gene groups that have lots of shared functional features. To control for this, we generated functionally characterized gene sets using the Gene Ontology (GO). From the GO database (http://www.geneontology.org/), we obtained 3,230 GO terms for which the associated gene set size ranged from 10–1000 genes. For each GO term we retrieved all human genes that were annotated with that term to compile a gene group for each GO term, also referred to as a functional gene set. Each of these functional gene sets were evaluated individually by comparison to a background distribution of randomly selected gene sets (of equivalent size and node degree), within each network. The distribution of z-scores obtained from 3,230 functional gene groups was plotted for each network and used to evaluate network properties of the meta-signature gene sets in reference to other functionally related gene sets.
Disease gene set comparison
To assess the network properties of our schizophrenia meta-signature genes in relation to other sets of disease-associated genes, we compiled disease gene lists for five different brain-related disorders. Gene sets were assembled for Alzheimer’s disease (http://www.alzgene.org/), Parkinson’s disease (http://www.pdgene.org/), multiple sclerosis (http://www.msgene.org/), and schizophrenia (http://www.szgene.org/) from their respective gene databases. Each database has been compiled based on findings from genetic association studies and provide gene lists on their website. The schizophrenia list obtained from SZGene (http://www.szgene.org/) comprised only the top 45 of the most reliable gene associations based on findings from a SZGene in-house meta-analysis. We also compiled an Autism spectrum disorder gene list from Toro et al. . Average values of shortest path length and clustering coefficient were computed for all five disease gene sets. Network measures were compared to a background distribution of randomly selected gene sets and z-scores were compared against functional gene set z-score distributions.
To extract modules (i.e. subset of nodes that are more densely connected to each other than to nodes outside the subset) from the control and schizophrenia networks we implemented a cluster-based algorithm based on methods described by . Each adjacency matrix was transformed into a distance matrix by computing the topological overlap between all probe pairs . Topological overlap measure (TOM) between two genes is calculated by comparing the direct connections of each. If two nodes connect to the same group of other nodes they are said to have ‘high topological overlap’. We used a generalization of this measure that enriches TOM’s sensitivity to longer ranging connections between nodes by incorporating the number of m-step neighbours (m = 2) that a pair of node share . The TOM matrices were subjected to WGCNA-based methods , whereby hierarchical clustering was applied with average linkage, and the resulting tree was used to define network modules.
In order to determine which modules were associated with schizophrenia we looked for enrichment of differentially expressed genes found from our previous meta-analysis of the same data . In addition to computing overlaps of SZUP and SZDOWN with each module, we also looked at enrichment by utilizing the t-statistic for the disease effect of each gene. T-statistics were entered into a Wilcoxon rank-sum test by module and resultant p-values were corrected for multiple testing using the Benjamini-Hochberg procedure.
To examine the functional roles encoded by these modules we used the Gene Ontology (GO) annotations and evaluated enrichment using the over-representation analysis (ORA) method in ErmineJ [44, 56]. The ORA method evaluates the genes that meet a selection criterion to determine if there are gene sets (GO groups) which are statistically over-represented. The ORA method requires the entire list of genes and their associated scores and a score threshold must be selected. Binary scores were used to evaluate enrichment of each module; a value of 1 was assigned to genes with membership in the module and a value of 0 assigned to the rest of the genes. P-values for this method are computed by using the binomial approximation to the hypergeometric distribution, and then corrected for multiple testing using the Benjamini-Hochberg procedure. Modules were also evaluated for CNS cell type enrichment by cross-referencing the genes in each module with published lists of neuron, oligodendrocyte and astrocyte marker genes . Cell type marker lists were compiled by extracting genes at a lower stringency threshold (as described in ) and a high stringency threshold (greater than 10-fold expression change). Hypergeometric probabilities were computed to evaluate the significance of overlap with cell type marker lists in each module.
Meta-signature genes up-regulated in schizophrenia
Meta-signature genes down-regulated in schizophrenia
Central nervous system
- Sequeira PA, Martin MV, Vawter MP: The first decade and beyond of transcriptional profiling in schizophrenia. Neurobiol Dis. 2012, 45 (1): 23-36. 10.1016/j.nbd.2011.03.001.PubMed CentralView ArticlePubMedGoogle Scholar
- Mistry M, Gillis J, Pavlidis P: Genome-wide expression profiling of schizophrenia using a large combined cohort. Mol Psychiatry. 2013, 18 (2): 215-225. 10.1038/mp.2011.172.PubMed CentralView ArticlePubMedGoogle Scholar
- Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003, 4: 2-10.1186/1471-2105-4-2.PubMed CentralView ArticlePubMedGoogle Scholar
- Spirin V, Mirny LA: Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci U S A. 2003, 100 (21): 12123-12128. 10.1073/pnas.2032324100.PubMed CentralView ArticlePubMedGoogle Scholar
- Zhang B, Horvath S: A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005, 4: Article17.PubMedGoogle Scholar
- Oldham MC, Konopka G, Iwamoto K, Langfelder P, Kato T, Horvath S, Geschwind DH: Functional organization of the transcriptome in human brain. Nat Neurosci. 2008, 11 (11): 1271-1282. 10.1038/nn.2207.PubMed CentralView ArticlePubMedGoogle Scholar
- Gaiteri C, Sibille E: Differentially expressed genes in major depression reside on the periphery of resilient gene coexpression networks. Frontiers in neuroscience. 2011, 5: 95.PubMed CentralView ArticlePubMedGoogle Scholar
- Torkamani A, Dean B, Schork NJ, Thomas EA: Coexpression network analysis of neural tissue reveals perturbations in developmental processes in schizophrenia. Genome Res. 2010, 20 (4): 403-412. 10.1101/gr.101956.109.PubMed CentralView ArticlePubMedGoogle Scholar
- Voineagu I, Wang X, Johnston P, Lowe JK, Tian Y, Horvath S, Mill J, Cantor RM, Blencowe BJ, Geschwind DH: Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature. 2011, 474 (7351): 380-384. 10.1038/nature10110.PubMed CentralView ArticlePubMedGoogle Scholar
- Roussos P, Katsel P, Davis KL, Siever LJ, Haroutunian V: A system-level transcriptomic analysis of schizophrenia using postmortem brain tissue samples. Arch Gen Psychiatry. 2012, 69 (12): 1-11.View ArticleGoogle Scholar
- Chen C, Cheng L, Grennan K, Pibiri F, Zhang C, Badner JA, Gershon ES, Liu C, Members of the Bipolar Disorder Genome Study C: Two gene co-expression modules differentiate psychotics and controls. Mol Psychiatry. 2012, doi:10.1038/mp.2012.146Google Scholar
- Stumpf MP, Porter MA: Mathematics. Critical truths about power laws. Science. 2012, 335 (6069): 665-666. 10.1126/science.1216142.View ArticlePubMedGoogle Scholar
- Tong AH, Lesage G, Bader GD, Ding H, Xu H, Xin X, Young J, Berriz GF, Brost RL, Chang M, et al: Global mapping of the yeast genetic interaction network. Science. 2004, 303 (5659): 808-813. 10.1126/science.1091317.View ArticlePubMedGoogle Scholar
- Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, et al: High-quality binary protein interaction map of the yeast interactome network. Science. 2008, 322 (5898): 104-110. 10.1126/science.1158684.PubMed CentralView ArticlePubMedGoogle Scholar
- Cahoy JD, Emery B, Kaushal A, Foo LC, Zamanian JL, Christopherson KS, Xing Y, Lubischer JL, Krieg PA, Krupenko SA, et al: A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J Neurosci. 2008, 28 (1): 264-278. 10.1523/JNEUROSCI.4178-07.2008.View ArticlePubMedGoogle Scholar
- Mistry M, Pavlidis P: A cross-laboratory comparison of expression profiling data from normal human postmortem brain. Neuroscience. 2010, 167 (2): 384-395. 10.1016/j.neuroscience.2010.01.016.PubMed CentralView ArticlePubMedGoogle Scholar
- Iwamoto K, Bundo M, Kato T: Altered expression of mitochondria-related genes in postmortem brains of patients with bipolar disorder or schizophrenia, as revealed by large-scale DNA microarray analysis. Hum Mol Genet. 2005, 14 (2): 241-253.View ArticlePubMedGoogle Scholar
- Prabakaran S, Swatton JE, Ryan MM, Huffaker SJ, Huang JT, Griffin JL, Wayland M, Freeman T, Dudbridge F, Lilley KS, et al: Mitochondrial dysfunction in schizophrenia: evidence for compromised brain metabolism and oxidative stress. Mol Psychiatry. 2004, 9 (7): 684-697, 643.View ArticlePubMedGoogle Scholar
- Garbett K, Gal-Chis R, Gaszner G, Lewis DA, Mirnics K: Transcriptome alterations in the prefrontal cortex of subjects with schizophrenia who committed suicide. Neuropsychopharmacol Hung. 2008, 10 (1): 9-14.PubMedGoogle Scholar
- Katsel P, Davis KL, Gorman JM, Haroutunian V: Variations in differential gene expression patterns across multiple brain regions in schizophrenia. Schizophr Res. 2005, 77 (2–3): 241-252.View ArticlePubMedGoogle Scholar
- Maycox PR, Kelly F, Taylor A, Bates S, Reid J, Logendra R, Barnes MR, Larminie C, Jones N, Lennon M, et al: Analysis of gene expression in two large schizophrenia cohorts identifies multiple changes associated with nerve terminal function. Mol Psychiatry. 2009, 14 (12): 1083-1094. 10.1038/mp.2009.18.View ArticlePubMedGoogle Scholar
- Narayan S, Tang B, Head SR, Gilmartin TJ, Sutcliffe JG, Dean B, Thomas EA: Molecular profiles of schizophrenia in the CNS at different stages of illness. Brain research. 2008, 1239: 235-248.PubMed CentralView ArticlePubMedGoogle Scholar
- Walterfang M, Wood SJ, Velakoulis D, Pantelis C: Neuropathological, neurogenetic and neuroimaging evidence for white matter pathology in schizophrenia. Neurosci Biobehav Rev. 2006, 30 (7): 918-948. 10.1016/j.neubiorev.2006.02.001.View ArticlePubMedGoogle Scholar
- Hansen T, Hemmingsen RP, Wang AG, Olsen L, Timm S, Soeby K, Jakobsen KD, Fenger M, Parnas J, Rasmussen HB, et al: Apolipoprotein D is associated with long-term outcome in patients with schizophrenia. Pharmacogenomics J. 2006, 6 (2): 120-125. 10.1038/sj.tpj.6500350.View ArticlePubMedGoogle Scholar
- Qin W, Gao J, Xing Q, Yang J, Qian X, Li X, Guo Z, Chen H, Wang L, Huang X, et al: A family-based association study of PLP1 and schizophrenia. Neurosci Lett. 2005, 375 (3): 207-210. 10.1016/j.neulet.2004.11.013.View ArticlePubMedGoogle Scholar
- Yang YF, Qin W, Shugart YY, He G, Liu XM, Zhou J, Zhao XZ, Chen Q, La YJ, Xu YF, et al: Possible association of the MAG locus with schizophrenia in a Chinese Han cohort of family trios. Schizophr Res. 2005, 75 (1): 11-19. 10.1016/j.schres.2004.11.013.View ArticlePubMedGoogle Scholar
- Bedard A, Tremblay P, Chernomoretz A, Vallieres L: Identification of genes preferentially expressed by microglia and upregulated during cuprizone-induced inflammation. Glia. 2007, 55 (8): 777-789. 10.1002/glia.20477.View ArticlePubMedGoogle Scholar
- Farina C, Aloisi F, Meinl E: Astrocytes are active players in cerebral innate immunity. Trends Immunol. 2007, 28 (3): 138-145. 10.1016/j.it.2007.01.005.View ArticlePubMedGoogle Scholar
- Condamine T, Le Texier L, Howie D, Lavault A, Hill M, Halary F, Cobbold S, Waldmann H, Cuturi MC, Chiffoleau E: Tmem176B and Tmem176A are associated with the immature state of dendritic cells. J Leukoc Biol. 2010, 88 (3): 507-515. 10.1189/jlb.1109738.View ArticlePubMedGoogle Scholar
- D’Agostino PM, Gottfried-Blackmore A, Anandasabapathy N, Bulloch K: Brain dendritic cells: biology and pathology. Acta Neuropathol. 2012, 124 (5): 599-614. 10.1007/s00401-012-1018-0.PubMed CentralView ArticlePubMedGoogle Scholar
- Perez-Santiago J, Diez-Alarcia R, Callado LF, Zhang JX, Chana G, White CH, Glatt SJ, Tsuang MT, Everall IP, Meana JJ, et al: A combined analysis of microarray gene expression studies of the human prefrontal cortex identifies genes implicated in schizophrenia. J Psychiatr Res. 2012, 46 (11): 1464-1474. 10.1016/j.jpsychires.2012.08.005.View ArticlePubMedGoogle Scholar
- Li T, Li Z, Chen P, Zhao Q, Wang T, Huang K, Li J, Li Y, Liu J, Zeng Z, et al: Common variants in major histocompatibility complex region and TCF4 gene are significantly associated with schizophrenia in Han Chinese. Biol Psychiatry. 2010, 68 (7): 671-673. 10.1016/j.biopsych.2010.06.014.View ArticlePubMedGoogle Scholar
- Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D, Werge T, Pietilainen OP, Mors O, Mortensen PB, et al: Common variants conferring risk of schizophrenia. Nature. 2009, 460 (7256): 744-747.PubMed CentralPubMedGoogle Scholar
- Lencz T, Morgan TV, Athanasiou M, Dain B, Reed CR, Kane JM, Kucherlapati R, Malhotra AK: Converging evidence for a pseudoautosomal cytokine receptor gene locus in schizophrenia. Mol Psychiatry. 2007, 12 (6): 572-580. 10.1038/sj.mp.4001983.View ArticlePubMedGoogle Scholar
- Ozbey U, Tug E, Namli M: Interleukin-10 gene promoter polymorphism in patients with schizophrenia in a region of East Turkey. World J Biol Psychiatry. 2009, 10 (4 Pt 2): 461-468.View ArticlePubMedGoogle Scholar
- Paul-Samojedny M, Owczarek A, Suchanek R, Kowalczyk M, Fila-Danilow A, Borkowska P, Kucia K, Kowalski J: Association study of interferon gamma (IFN-gamma) +874T/A gene polymorphism in patients with paranoid schizophrenia. J Mol Neurosci. 2011, 43 (3): 309-315. 10.1007/s12031-010-9442-x.View ArticlePubMedGoogle Scholar
- Arion D, Unger T, Lewis DA, Levitt P, Mirnics K: Molecular evidence for increased expression of genes related to immune and chaperone function in the prefrontal cortex in schizophrenia. Biol Psychiatry. 2007, 62 (7): 711-721. 10.1016/j.biopsych.2006.12.021.PubMed CentralView ArticlePubMedGoogle Scholar
- Saetre P, Emilsson L, Axelsson E, Kreuger J, Lindholm E, Jazin E: Inflammation-related genes up-regulated in schizophrenia brains. BMC Psychiatry. 2007, 6 (7): 46.View ArticleGoogle Scholar
- Shao L, Vawter MP: Shared gene expression alterations in schizophrenia and bipolar disorder. Biol Psychiatry. 2008, 64 (2): 89-97. 10.1016/j.biopsych.2007.11.010.PubMed CentralView ArticlePubMedGoogle Scholar
- de Jong S, Boks MP, Fuller TF, Strengman E, Janson E, de Kovel CG, Ori AP, Vi N, Mulder F, Blom JD, et al: A gene co-expression network in whole blood of schizophrenia patients is independent of antipsychotic-use and enriched for brain-expressed genes. PloS one. 2012, 7 (6): e39498-10.1371/journal.pone.0039498.PubMed CentralView ArticlePubMedGoogle Scholar
- Gardiner EJ, Cairns MJ, Liu B, Beveridge NJ, Carr V, Kelly B, Scott RJ, Tooney PA: Gene expression analysis reveals schizophrenia-associated dysregulation of immune pathways in peripheral blood mononuclear cells. J Psychiatr Res. 2012, doi:10.1016/j.jpsychires.2012.11.007Google Scholar
- Sainz J, Mata I, Barrera J, Perez-Iglesias R, Varela I, Arranz MJ, Rodriguez MC, Crespo-Facorro B: Inflammatory and immune response genes have significantly altered expression in schizophrenia. Mol Psychiatry. 2012, doi:10.1038/mp.2012.165Google Scholar
- Allen NC, Bagade S, McQueen MB, Ioannidis JP, Kavvoura FK, Khoury MJ, Tanzi RE, Bertram L: Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database. Nature genetics. 2008, 40 (7): 827-834. 10.1038/ng.171.View ArticlePubMedGoogle Scholar
- Gillis J, Pavlidis P: The role of indirect connections in gene networks in predicting function. Bioinformatics. 2011, 27 (13): 1860-1866. 10.1093/bioinformatics/btr288.PubMed CentralView ArticlePubMedGoogle Scholar
- Gillis J, Pavlidis P: The impact of multifunctional genes on “guilt by association” analysis. PloS one. 2011, 6 (2): e17258-10.1371/journal.pone.0017258.PubMed CentralView ArticlePubMedGoogle Scholar
- Xulvi-Brunet R, Li H: Co-expression networks: graph properties and topological comparisons. Bioinformatics. 2010, 26 (2): 205-214. 10.1093/bioinformatics/btp632.PubMed CentralView ArticlePubMedGoogle Scholar
- Tan PP, French L, Pavlidis P: Neuron-Enriched Gene Expression Patterns are Regionally Anti-Correlated with Oligodendrocyte-Enriched Patterns in the Adult Mouse and Human Brain. Frontiers in neuroscience. 2013, 7: 5.PubMed CentralView ArticlePubMedGoogle Scholar
- Team RDC: R: A Language and Environment for Statistical Computing. 2011, Vienna, Austria: In. Edited by Computing RFfSGoogle Scholar
- Johnson WE, Li C, Rabinovic A: Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007, 8 (1): 118-127. 10.1093/biostatistics/kxj037.View ArticlePubMedGoogle Scholar
- Barnes M, Freudenberg J, Thompson S, Aronow B, Pavlidis P: Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms. Nucleic Acids Res. 2005, 33 (18): 5914-5923. 10.1093/nar/gki890.PubMed CentralView ArticlePubMedGoogle Scholar
- Lee HK, Hsu AK, Sajdak J, Qin J, Pavlidis P: Coexpression analysis of human genes across many microarray data sets. Genome Res. 2004, 14 (6): 1085-1094. 10.1101/gr.1910904.PubMed CentralView ArticlePubMedGoogle Scholar
- Albert RJ H, Barabasi AL: Internet: Diameter of the World-Wide Web. Nature. 1999, 401: 2.Google Scholar
- Dijkstra EW: A note on two problems in connexion with graphs. Numerische Mathematik. 1959, 1: 269-271. 10.1007/BF01386390.View ArticleGoogle Scholar
- Toro R, Konyukh M, Delorme R, Leblond C, Chaste P, Fauchereau F, Coleman M, Leboyer M, Gillberg C, Bourgeron T: Key role for gene dosage and synaptic homeostasis in autism spectrum disorders. Trends Genet. 2010, 26 (8): 363-372. 10.1016/j.tig.2010.05.007.View ArticlePubMedGoogle Scholar
- Yip AM, Horvath S: Gene network interconnectedness and the generalized topological overlap measure. BMC bioinformatics. 2007, 8: 22-10.1186/1471-2105-8-22.PubMed CentralView ArticlePubMedGoogle Scholar
- Lee HK, Braynen W, Keshav K, Pavlidis P, Ermine J: tool for functional analysis of gene expression data sets. BMC Bioinformatics. 2005, 6: 269-10.1186/1471-2105-6-269.PubMed CentralView ArticlePubMedGoogle Scholar