Sox2 acts as a transcriptional repressor in neural stem cells

Background The transcription factor, Sox2, is central to the behaviour of neural stem cells. It is also one of the key embryonic stem cell factors that, when overexpressed, can convert somatic cells into induced pluripotent cells. Although generally studied as a transcriptional activator, recent evidence suggests that it might also repress gene expression. Results We show that in neural stem cells Sox2 represses as many genes as it activates. We found that Sox2 interacts directly with members of the groucho family of corepressors and that repression of several target genes required this interaction. Strikingly, where many of the genes activated by Sox2 encode transcriptional regulators, no such genes were repressed. Finally, we found that a mutant form of Sox2 that was unable to bind groucho was no longer able to inhibit differentiation of neural stem cells to the same extent as the wild type protein. Conclusions These data reveal a major new mechanism of action for this key transcription factor. In the context of our understanding of endogenous stem cells, this highlights the need to determine how such a central regulator can distinguish which genes to activate and which to repress. Electronic supplementary material The online version of this article (doi:10.1186/1471-2202-15-95) contains supplementary material, which is available to authorized users.


Background
Sox2 is a central player in animal development and one of only a few factors that together can initiate the formation of pluripotent cells (iPS cells) from somatic cell populations [1,2]. This reflects its central role as a 'node' in the gene regulatory network that controls embryonic stem cell biology, promoting stem cell self-renewal and inhibiting differentiation. Sox2 is also one of the first genes to be active in the neural ectoderm and its expression is maintained in proliferating neural stem cells (NSCs) of the CNS throughout development and in the mature brain [3,4]. The expression of Sox2 in these cells again seems to be associated with their self-renewal and inhibition of differentiation [5]. Sox2 is generally regarded as a transcriptional activator. However, in recent studies analysing the global response of cells to loss of Sox2 activity, the expression of many genes was seen to increase rapidly when Sox2 function was inhibited [6,7]. ChIP-seq analysis shows that several of the genes affected in these studies are directly bound by Sox2, implying that they are direct targets [8][9][10][11]. Given the central role of Sox2 in the biology of both ES cells and NSCs, the possibility that it might also possess such a major alternative mechanism of action is of great interest. We therefore set out to determine to what extent Sox2 represses genes in NSCs. Using an expression array approach, we found that Sox2 repressed approximately as many genes as it activated in NSCs.
We also investigated the mechanism by which Sox2 might achieve transcriptional repression. Tcf and Lef are closely related to the Sox gene family. These factors can act as either transcriptional activators or repressors. Like many other transcriptional repressors, these proteins achieve this effect by recruiting members of the groucho-related gene (Grg) family of co-repressors [12]. We considered this to be a likely route for the repressor activity of Sox2.
We found that Sox2 can indeed interact with Grg proteins and a mutation disrupting interaction between Sox2 and the Grgs resulted in loss of the ability of Sox2 to repress expression from the GFAP promoter whilst its ability to activate other promoters remained intact. We also found that, unlike wild type (WT) Sox2, the mutant version of Sox2 was unable to repress neuronal differentiation when overexpressed. This suggests a new model for the mechanisms by which Sox2 regulates NSC biology.

Elucidating the target genes of Sox2 repression
In order to ascertain the extent to which Sox2 activates and represses genes in NSCs, we carried out gene expression microarray analysis. Human NSCs were transfected with expression constructs encoding EGFP alone, or together with WT Sox2. After 14 hours in culture, EGFP-expressing cells were isolated by FACS and RNA extracted for analysis using Affymetrix Human Genome U133 Plus 2.0 chips (Figure 1A; see Additional file 1: Table S1).
RT-qPCR revealed that the level of Sox2 overexpression was approximately 6-fold greater than that of endogenous Sox2 and that expression of the endogenous Sox2 gene was unaffected in transfected cells (data not shown). Among the genes that were affected by Sox2 (>1.5-fold change as compared to cells transfected with EGFP alone), the number exhibiting repressed probe sets (650, 5% of genes on the array) was almost the same as the number activated (652) (Figure 1B, C). These numbers are strikingly similar to the number of genes whose expression increased (623) or decreased (648) >1.5-fold when Sox2 expression was lost (using inducible Sox2-null mice) in ES cells (Masui et al. [6]).
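The >1.5-fold criterion described above can be illustrated with a minimal sketch. It assumes that expression ratios are un-logged (Sox2-transfected signal divided by EGFP-only signal) and that "repressed" is read symmetrically as a ratio below 1/1.5; the function name and the example ratios are hypothetical and are not taken from the study.

```python
def classify_by_fold_change(ratios, threshold=1.5):
    """Split genes into activated and repressed sets by fold change.

    ratios maps a gene name to an un-logged expression ratio
    (Sox2-transfected signal / EGFP-only signal).
    Assumption: repression is called when the ratio is below 1/threshold.
    """
    activated = {gene for gene, r in ratios.items() if r > threshold}
    repressed = {gene for gene, r in ratios.items() if r < 1.0 / threshold}
    return activated, repressed


# Hypothetical ratios for illustration only (not data from the study).
example = {"geneA": 2.1, "geneB": 0.5, "geneC": 1.2, "geneD": 0.7}
activated, repressed = classify_by_fold_change(example)
print(sorted(activated), sorted(repressed))
```

In this toy input only geneA and geneB pass the cut-off; geneD (ratio 0.7) falls short of a 1.5-fold decrease and is left unclassified.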
Comparison of the genes activated by Sox2 to those repressed revealed a striking difference. According to their gene ontology (GO) terms (using the GOrilla tool [13]), genes that were activated by Sox2 were highly enriched for those listed under terms related to regulating transcription (see Additional file 2: Table S2 and Additional file 3: Table S3). Most of these represent transcription factors. Remarkably, the list of genes repressed by Sox2 included no terms in these same GO term categories. By contrast, among the genes repressed by Sox2, there was very strong enrichment for genes associated with the cell cycle and mitosis, including spindle organization and DNA replication and repair. Such a dramatic difference in the classes of gene activated or repressed by Sox2 provides additional assurance that the genes identified are a non-random selection of the genes on the array.
Previous studies in which Sox2 was knocked down were carried out in ES cells and so would not be expected to share much in common with our experiments in NSCs. However, comparison between these data revealed a small proportion of the genes identified as repressed by Sox2 in our study that were also repressed by Sox2 in ES cells. Some of these genes have also been shown to be bound by Sox2 by ChIP-seq (Masui et al. [6]; Greber et al. [7]) (Figure 1D).

Sox2 interacts with Grg proteins
Other HMG domain factors have been shown to interact with the Grg family of co-repressors [14,15]. Therefore, in order to determine if this might also be a potential mechanism by which Sox2 could act as a transcriptional repressor, we used two assays. First, we determined if Sox2 could alter the subcellular localization of Grg proteins. There are five Grg genes in vertebrates; Grg1-4 are long forms and Grg5 is equivalent to only the N-terminal half of those long forms. We analysed Grg5, and Grg3 as a representative full length Grg protein. When transfected into COS-7 cells, Grg3 forms nuclear bodies, Grg5 is seen in both the nucleus and the cytoplasm and Sox2 exhibits diffuse staining restricted to the nucleus (Figure 2A; see Additional file 4: Figure S1A). When co-transfected with Sox2, both Grg3 and Grg5 adopted exclusively diffuse nuclear staining matching the distribution of Sox2, suggesting an interaction between Sox2 and Grgs (Figure 2B, C). In a second assay, co-immunoprecipitation was carried out using human NSCs (which are known to express Sox2) transfected with MYC-tagged Grgs. This showed that endogenous Sox2 was co-precipitated with the MYC-tagged Grg3 and Grg5, implying that they exist in a protein complex (Figure 2D).

Grgs affect Sox2 function
To determine if Grgs affect the transcriptional regulation activity of Sox2, luciferase reporter assays were carried out using several promoter sequences. Co-transfection of COS-7 cells with Sox2 resulted in a twofold increase in luciferase expression from a 'generic' Sox promoter (pTl/3xSX, which has three Sox binding sites upstream of the luciferase reporter gene) and a 25-fold increase when the luciferase gene was under the control of the REX gene proximal promoter element, a known target of Sox2 [16]. When Grg3 or Grg5 constructs were also transfected alongside Sox2, these increases in luciferase activity were almost completely abrogated or severely reduced (Figure 3A, B). According to Cavellaro et al. [17], expression of the GFAP gene is directly repressed by Sox2. Here, we confirmed this in P19 EC cells, in which the basal level of luciferase activity was significantly repressed by co-transfected Sox2 (Figure 3C). This repression was even greater in the presence of co-transfected Grg3 or Grg5 (Figure 3C), whereas transfection of the same amount of Grg3 or Grg5 alone had no significant effect on luciferase expression (Figure 3D).

Mapping the Grg interaction domain in Sox2
We designed six C-terminal deletions of Sox2 (Figure 4A) and tested their ability to interact with Grgs (all the constructs produced Sox2 protein that still located to the nucleus; see Additional file 4: Figure S1). The subcellular translocation assay with Grg5 indicated that the C-terminal amino acids between 203 and 209 were essential for the interaction (Figure 4A). We therefore set out to generate a site-directed mutant that no longer bound Grgs, but would retain transcriptional activator activity.
Table legend: The genes most strongly repressed by Sox2 in our study that were also suggested to be repressed by Sox2 in mouse ES cells (Masui et al. [6]), or in human ES cells (Greber et al. [7]). Final column indicates the study in which these were also shown to be direct targets of Sox2.

Loss of Grg interaction in a 203-209 mutant
Comparison between Sox2 and the other SoxB1 members and between Sox2 orthologues from different species revealed that only four amino acids were conserved in the region from position 203-209. We therefore made a mutant, referred to as Sox2 M203-209, in which these four amino acids were altered (DxxxLQY converted to VxxxAAA; Figure 4A). Although this mutant still located to the nucleus (see Additional file 4: Figure S1A), in the subcellular translocation assay the Sox2 M203-209 mutant exhibited a dramatically reduced ability to change the subcellular localization of Grg5 (Figure 5A). Moreover, unlike WT Sox2, immunoprecipitation of MYC-Grg5 in COS-7 cells did not co-precipitate co-transfected Sox2 M203-209 mutant (Figure 5B). In order to determine whether this mutation had disrupted a direct protein-protein interaction between the Grg proteins and Sox2, an in vitro pull down assay was used. The ability of GST-fused Grg1 or Grg5 proteins to pull down radiolabelled Sox2 or Sox2 M203-209 was assessed. For both Grg proteins, the Sox2 M203-209 mutation resulted in a similar 4-fold decrease in the amount of Sox2 that was pulled down (Figure 5C). It was also noted that Grg1 was able to pull down approximately 4-fold more Sox2 than did Grg5 (data not shown). These data show that residues 203-209 of the Sox2 protein are necessary for a direct interaction with both Grg1 and Grg5.

The Sox2 M203-209 mutant loses repressor activity
Having established that amino acids 203-209 were required for Grg binding, we next asked whether Sox2 M203-209 was defective in its ability to act as a gene repressor. Using the luciferase reporter assay described above, we found that the Sox2 M203-209 mutant retained the ability to activate both the 3xSX and the REX-regulated reporter constructs (Figure 5D, E), indeed having a slightly stronger activator effect on the 3xSX promoter than WT Sox2. However, the mutant no longer significantly repressed luciferase expression of the construct driven from the GFAP promoter, even in the presence of ectopic Grg3 or Grg5 (Figure 5F). These results suggest that despite its inability to bind Grg co-repressors, Sox2 M203-209 retains the transcriptional activator activity of WT Sox2, but is significantly impaired in its ability to repress transcription from the GFAP regulatory sequence.
Consistent with this, when the effect of ectopic WT Sox2 or the Sox2 M203-209 mutant on expression of GFAP and six other genes in NSCs was assessed by qPCR, WT Sox2 repressed these genes to varying degrees, whereas repression by the Sox2 M203-209 mutant was reduced or absent (Figure 6A).

Groucho-binding mutant of Sox2 fails to repress neural differentiation
Overexpression of Sox2 in NSCs has been shown to interfere with their ability to differentiate [18,19]. We therefore compared the effect of overexpressing WT Sox2 to the Sox2 M203-209 mutant. Unlike control NSCs transfected with EGFP alone, cells transfected with WT Sox2 did not extend fine, MAP2-positive processes after 5 days in differentiation medium, but instead, fewer cells extended broader, shorter processes (Figure 6B, C). However, no such inhibition of differentiation was seen in cells transfected with the Sox2 M203-209 mutant.

Discussion
Sox2 is a central component of the gene regulatory network that controls a range of stem cells, most notably ESCs and NSCs. The realization that, in addition to its role as a transcriptional activator, Sox2 might also repress genes is relatively recent and little has been done to investigate this activity.
Here, we have shown that the number of genes repressed by Sox2 in NSCs is almost identical to the number activated and we have identified one mechanism (recruitment of groucho family corepressors) by which it can achieve this repression. We have consequently generated a version of Sox2 that acts as a transcriptional activator but now lacks that mechanism for repressor activity. This allows us to begin to dissect the full complexity of Sox2 activity in regulating cell behaviour.
Figure 5 legend (partial): In an in vitro pull down assay, immunoprecipitation of GST-fused Grg1 or Grg5 reproducibly pulled down 4-5 fold less Sox2 M203-209 than it did WT Sox2. (D) The Sox2 M203-209 mutant retained the ability to activate luciferase expression driven by the 3xSX promoter in COS-7 cells (the difference between this and the level induced by WT Sox2 was not statistically significant). (E) The ability of co-transfected Sox2 M203-209 mutant to induce luciferase expression driven by the REX promoter in P19 cells was very similar to that induced by WT Sox2. (F) Unlike WT Sox2, the Sox2 M203-209 mutant failed to repress luciferase expression driven from the GFAP promoter, and the addition of Grg3 or Grg5 had no effect on this. *p < 0.05; **p < 0.01; ns, not significant. Scale bar approximately 20 μm.

Sox2 mechanism of repression
We chose to investigate a potential interaction with Grgs since this interaction has already been shown to mediate repression by the HMG family protein, Tcf [14,15]. We have shown that Sox2 does indeed interact directly with both full length and short forms of the Grgs. We used a series of deletion constructs to map the putative Grg-interacting region of Sox2 to amino acids 203-209 (YDVSALQY), which shows similarity to the Eh1 Grg interacting domain, F/YxI/VxxI/L/V [20,21]. Targeted mutation of this region localized the interaction to the sequence, DxxxLQY, which is well conserved in the SoxB1 family.
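The similarity to the Eh1-like pattern noted above can be checked with a short sketch. It assumes the consensus F/YxI/VxxI/L/V is read as a simple regular expression in which x is any residue. The mutant string is built here by applying the DxxxLQY to VxxxAAA substitution to the quoted sequence YDVSALQY, and the variable and function names are hypothetical.

```python
import re

# Eh1-like consensus F/YxI/VxxI/L/V written as a regular expression
# (x = any residue). This reading of the consensus is an assumption.
EH1_LIKE = re.compile(r"[FY].[IV]..[ILV]")

# Sequence quoted in the text for the mapped region of Sox2.
wild_type = "YDVSALQY"
# Same region after the DxxxLQY -> VxxxAAA substitution (constructed here).
mutant = "YVVSAAAA"


def find_motif(sequence):
    """Return the first Eh1-like match in the sequence, or None."""
    match = EH1_LIKE.search(sequence)
    return match.group() if match else None


print(find_motif(wild_type))  # YDVSAL
print(find_motif(mutant))     # None
```

Under this reading the wild type region contains one match (YDVSAL), and replacing the leucine at the final motif position with alanine is enough to abolish it.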
However, alternative repressor mechanism(s) may also exist. This would provide multiple aspects to the mechanisms of action of Sox2 that could therefore be independently regulated to achieve a high level of complexity in its biological activities. It is also possible that some of the repressive effects of Sox2 are indirect via transcriptional activation of a repressor. Our results were over a short time scale so we do not feel that there was likely to be sufficient time for this to occur but it remains formally possible. This is supported by our earlier observations that the repressive effects of the very similar protein, Sox3, were mimicked by an HMG-engrailed repressor fusion protein [22].

Relative role of transcriptional repression in Sox2 functions
Since Sox2 has traditionally been regarded as a transcriptional activator, it is striking that our study revealed that the number of genes repressed by overexpression of Sox2 was similar to the number of genes that were activated. This implies that repression plays as big a part in its biological functions as does activation. Since the numbers of genes affected by Sox2 in our study closely resemble the numbers of genes affected when Sox2 was knocked down in ES cells [6,7], it seems probable that the effects in our study represent true targets of Sox2.
It is of note that the targets of Sox2 activation are highly enriched for regulators of Pol II transcription whereas no such genes are repressed. This implies that a large part of the activator function of Sox2 (approximately 25% of the genes affected by Sox2 overexpression) is to regulate the biological activity of NSCs indirectly through the function of downstream transcription factors, whereas its repressor function affects the cells directly through regulating effector genes. Since Sox2 is expressed in dividing progenitor cells, the enrichment for cell cycle related genes in those repressed by Sox2 looks at first to be counterintuitive. However, this observation suggests that its normal role may be in part to control the rate of mitosis in those stem cells.
Previous studies have shown that the effects of SoxB1s in inhibiting the differentiation of NSCs were mimicked by a constitutive activator form of SoxB1 protein, while an HMG-EnR construct caused cells to begin to differentiate, suggesting that the effects of the SoxB1s were entirely through their activity as transcriptional activators [18,19]. However, overexpression of the HMG-EnR construct did not elicit complete differentiation, as shown by the absence of neurofilament or beta-tubulin expression [18]. Indeed, close inspection of the published data shows that an HMG-VP16 construct inhibits expression of early neurogenic transcription factors, but does not appear to completely inhibit expression of beta-tubulin.
We therefore suggest a model in which the activator function of Sox2 promotes 'stem cell-ness' and so inhibits differentiation, but the repressor function of Sox2 is also required to inhibit differentiation, repressing those effector genes that would be activated soon after the cells were released from the NSC state ( Figure 6C). Consistent with this model and the published data, the gene encoding neurofilament light chain was amongst those genes revealed to be repressed greater than 1.5-fold by Sox2 in our microarray analysis.

Target specificity
Our observation that some genes are activated while others are repressed in the same transfected cell population suggests that it is the gene target sequence that determines whether Sox2 exerts its activator or repressor activity. Activation or repression is not dictated solely by the availability of cofactors for these functions in a 'cell context' dependent manner, but is dictated by the target gene, which presumably determines which Sox2 cofactors are available at that regulatory region to cause Sox2 to act as either an activator or repressor.

Conclusions
This study shows that transcriptional repression is a major part of the mechanism by which Sox2 acts in NSCs. In order to understand how Sox2 functions to regulate stem cell biology, we must therefore understand not only what is upstream and downstream of Sox2, but also which cell type-dependent cofactors are required for Sox2 to regulate each target and the gene sequence context that determines whether the target gene is activated or repressed.

Cell culture
The

Immunofluorescent staining
Cells were cultured on poly-D-lysine coated coverslips, fixed with 4% paraformaldehyde/PBS for 10 min and permeabilised with 0.2% Triton X100/PBS for 20 min. Blocking was carried out with 10% BSA in 0.1% Triton X100/PBS for 30 min. Primary antibodies (MYC antibody (9E10), Sox2 antibody (R&D, MAB2018), MAP-2 antibody (Abcam)) and the fluorescent-conjugated secondary antibodies were incubated at room temperature for 1 h. Staining was observed after mounting in mounting medium for fluorescence with DAPI (Vector). Error bars in Figure 1C are based on counting 15 cells, and those in Figures 3A and 4A on counting 100 cells.

Immunoprecipitation and immunoblotting
Two days after transfection, cells were treated with the cross-linker, 1 mM dithiobis(succinimidyl propionate) (DSP; Sigma) for 30 min. The cells were lysed, cleared by centrifugation and incubated with MYC antibody-conjugated beads (Sigma). Eluted proteins were analysed by SDS-PAGE and western blotting using anti-MYC and anti-Sox2 antibodies (see supplementary material for additional details).
In vitro pull down assays were performed using GST fusions of Grg1 and Grg5 (cloned into the pGEX4T1 vector). These proteins were induced in BL21 E. coli cells by the addition of 0.1 mM isopropyl-D-thiogalactopyranoside. After 16 h growth at 18°C, bacteria were harvested and lysed. Solubility of GST-Grg proteins was increased by an additional 1 h incubation (4°C) in lysis buffer with 1% Nonidet P-40 and 0.03% SDS. GST-fusion proteins were subsequently incubated with 35S-labelled Sox2 proteins produced in vitro using a TNT T7 kit (Promega), with (Amersham Biosciences). GST-fusion proteins were pulled down using Glutathione-sepharose 4B beads (GE Healthcare) and the presence of co-precipitated Sox proteins assessed by exposure of a PAGE gel on a phosphorimager.

Luciferase assay
The luciferase assay was carried out 24 h after transfection with the dual-luciferase-reporter assay system (Promega) following manufacturer's instructions. Error bars represent standard deviation based on three independent transfection experiments.

Expression profiling microarray
hNSCs were transfected with EGFP with or without pcDNA-Sox2 or pcDNA-Sox2 M203-209. After 14 h culture, GFP-positive cells were isolated by FACS. Total RNA was extracted using TRI reagent (Sigma) and further purified using an RNeasy kit (QIAGEN). The RNA samples were analysed using GeneChip® Human Genome U133 Arrays (Affymetrix) and data were processed using the statistical software R with packages provided by www.bioconductor.org. Data were preprocessed using the RMA method [23] and filtered such that probes which gave expression outputs below control background probes (recorded in the GeneChip) were excluded. Fold differences in expression were calculated, and annotation packages were used to assign gene information to each probe set. The data were exported as a .txt file in order to be read and analysed in Excel. The Excel tool PivotTable was used to assign average expression intensity values to each gene.
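The filtering, fold-difference and per-gene averaging steps described above can be sketched as follows. This is not the R/Bioconductor or Excel workflow used in the study; it is a plain Python illustration that assumes log2 (RMA-style) probe-set values, a single background level, and a filter that drops a probe set only when both conditions are at or below background. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarise_probes(probes, background_log2):
    """Filter probe sets, un-log them and average per gene.

    probes: list of (gene, log2_gfp, log2_sox2) tuples.
    Assumption: a probe set is excluded when both conditions are at or
    below the background level.
    """
    per_gene = defaultdict(list)
    for gene, log2_gfp, log2_sox2 in probes:
        if max(log2_gfp, log2_sox2) <= background_log2:
            continue  # below background in both conditions
        # Anti-log the values so that ratios are un-logged fold differences.
        per_gene[gene].append((2 ** log2_gfp, 2 ** log2_sox2))

    summary = {}
    for gene, values in per_gene.items():
        gfp = mean(v[0] for v in values)   # average intensity, EGFP only
        sox2 = mean(v[1] for v in values)  # average intensity, with Sox2
        summary[gene] = {"gfp": gfp, "sox2": sox2, "fold": sox2 / gfp}
    return summary


# Hypothetical probe-set values for illustration only.
probes = [
    ("geneA", 6.0, 7.0),
    ("geneA", 5.0, 6.0),
    ("geneB", 2.0, 2.5),
    ("geneC", 8.0, 7.0),
]
summary = summarise_probes(probes, background_log2=3.0)
print(summary)
```

Averaging the un-logged intensities per gene before taking the ratio mirrors the description of averaging intensities with a pivot table; averaging per-probe ratios instead would be an equally plausible design and can give different values.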

Site directed mutagenesis
Point mutants were generated using the QuikChange kit (Stratagene) according to manufacturer's instructions.

Quantitative RT-PCR
The extraction of total RNA from transfected hNSCs using TRI reagent (Sigma) was carried out according to the manufacturer's instructions. Total RNA samples were cleaned up using the RNeasy Mini Kit (Qiagen). Purified total RNA was used for the synthesis of cDNA using SuperScript III Reverse Transcriptase (Invitrogen). The primers used in the quantitative RT-PCR are listed below. Quantitative RT-PCR was carried out on a Rotor-Gene 6000 (Corbett/Qiagen) with Brilliant SYBR Green QPCR Master Mix (Agilent). The PCR programme was set at 95°C for 15 seconds, 60°C for 20 seconds and 72°C for 20 seconds. All the quantitative RT-PCR data were analysed by Rotor-Gene 6000 real-time rotary analyzer version 1.7. The relative expression level compared to cyclophilin B was calculated according to Pfaffl [24]. Error bars in Figure 6A represent standard deviation based on three replicate qPCR reactions.
Details of plasmids and primers available in supplementary material.
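The relative expression calculation according to Pfaffl [24] mentioned above can be sketched as below. The sketch uses the efficiency-corrected ratio, in which the target gene's amplification efficiency raised to its Ct difference (control minus sample) is divided by the same term for the reference gene (here cyclophilin B). The efficiencies and Ct values in the example are hypothetical, and the function name is an assumption.

```python
def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Efficiency-corrected relative expression ratio.

    e_target, e_ref: amplification efficiencies (2.0 = perfect doubling).
    Ct differences are taken as control minus sample.
    """
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Hypothetical values: the target crosses threshold two cycles later in the
# Sox2-transfected sample, while the reference gene is unchanged.
ratio = pfaffl_ratio(2.0, 24.0, 26.0, 2.0, 20.0, 20.0)
print(ratio)  # 0.25
```

With equal efficiencies of 2.0 the expression reduces to the familiar 2^-ΔΔCt form; a ratio of 0.25 corresponds to a 4-fold repression of the target relative to the reference.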

Additional files
Additional file 1: Table S1. Affymetrix expression data from GFP and Sox2 transfected human NSCs. Data from GeneChip® Human Genome U133 Arrays. Columns G and H are anti-logged values of columns E and F allowing un-logged ratios of expression to be clear, as shown in columns I and J.
Additional file 2: Table S2. List of GO terms of genes that increased in expression when Sox2 was over-expressed in NSCs. N is the total number of genes; B is the total number of genes associated with a specific GO term; n is the flexible cutoff, i.e. the automatically determined number of genes in the 'target set' (i.e. affected by Sox2 overexpression) and b is the number of genes in the 'target set' that are associated with a specific GO term. Enrichment is defined as (b/n)/(B/N). FDR q-value: False Discovery Rate analogue of the p-value.
Additional file 3: Table S3. List of GO terms of genes that decreased in expression when Sox2 was over-expressed in NSCs. N is the total number of genes; B is the total number of genes associated with a specific GO term; n is the flexible cutoff, i.e. the automatically determined number of genes in the 'target set' (i.e. affected by Sox2 overexpression) and b is the number of genes in the 'target set' that are associated with a specific GO term. Enrichment is defined as (b/n)/(B/N). FDR q-value: False Discovery Rate analogue of the p-value.
Additional file 4: Figure S1. Nuclear localization of various Sox2 deletion mutants (B) and the Sox2 M203-209 mutant (C) as compared to WT Sox2 (A). Sox2 proteins were overexpressed in COS-7 cells and visualized after 20 h using anti-Sox2 antibody (green). Nuclei were counterstained with DAPI (blue). All proteins localized exclusively to the nuclei. Scale bar approximately 10 μm.
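The enrichment value defined for Tables S2 and S3, (b/n)/(B/N), can be computed as in the following minimal sketch. The counts in the example are hypothetical and are not taken from the supplementary tables; the function name is an assumption.

```python
def go_enrichment(N, B, n, b):
    """Enrichment = (b/n) / (B/N).

    N: total number of genes
    B: genes associated with the GO term
    n: genes in the target set
    b: target-set genes associated with the GO term
    """
    if min(N, B, n) <= 0:
        raise ValueError("N, B and n must be positive")
    return (b / n) / (B / N)


# Hypothetical counts for illustration only.
enrichment = go_enrichment(N=1024, B=64, n=128, b=32)
print(enrichment)  # 4.0
```

In this example a quarter of the target set carries the term while only one sixteenth of all genes do, so the term is four times more frequent in the target set than expected by chance.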