An evolutionary conserved region (ECR) in the human dopamine receptor D4 gene supports reporter gene expression in primary cultures derived from the rat cortex

Background: Detecting functional variants contributing to diversity of behaviour is crucial for dissecting genetics of complex behaviours. At a molecular level, characterisation of variation in exons has been studied as they are easily identified in the current genome annotation although the functional consequences are less well understood; however, it has been difficult to prioritise regions of non-coding DNA in which genetic variation could also have significant functional consequences. Comparison of multiple vertebrate genomes has allowed the identification of non-coding evolutionary conserved regions (ECRs), in which the degree of conservation can be comparable with exonic regions suggesting functional significance.

Results: We identified ECRs at the dopamine receptor D4 gene locus, an important gene for human behaviours. The most conserved non-coding ECR (D4ECR1) supported high reporter gene expression in primary cultures derived from neonate rat frontal cortex. Computer aided analysis of the sequence of the D4ECR1 indicated the potential transcription factors that could modulate its function. D4ECR1 contained multiple consensus sequences for binding the transcription factor Sp1, a factor previously implicated in DRD4 expression. Co-transfection experiments demonstrated that overexpression of Sp1 significantly decreased the activity of the D4ECR1 in vitro.

Conclusion: Bioinformatic analysis complemented by functional analysis of the DRD4 gene locus has identified a) a strong enhancer that functions in neurons and b) a transcription factor that may modulate the function of that enhancer.


Background
Complex behaviours are likely to be generated in part by how, where, when and by how much proteins involved in neurotransmission are expressed. A key to understanding how such expression patterns are generated is to identify transcriptional regulatory domains for a particular gene. Comparative genomics allows the identification of such domains [1,2] and in this communication we have applied such a strategy to identify regulatory domains of the dopamine receptor D4 (DRD4). The DRD4 gene is involved in the regulation of social and cognitive phenotype of modern humans and it is preferentially expressed in the prefrontal and cingulate cortices [3][4][5], key centres for processing of complex behaviour and cognition. In these regions, the DRD4 protein acts as a regulator of dopamine levels, and mis-expression of DRD4 has also been associated with the onset of cognitive, behavioural and personality disorders, e.g., shyness, ADHD, addiction and Parkinson's disease [6][7][8][9]. Several polymorphisms exhibiting cis regulatory properties have been identified in this gene. However, in spite of the interest in the DRD4 gene and its potential importance for pharmacogenetics, the regulatory machinery behind appropriate spatial-temporal gene expression remains to be elucidated.
Genome sequencing of diverse vertebrate species has permitted comparisons that reveal strong conservation of non-coding regions (evolutionary conserved regions or ECRs) between distantly related vertebrate species (e.g., between human and mouse or human and fish). Such conservation has been suggested to indicate that a given ECR could act as a cis regulator of gene expression, alter post-transcriptional modifications, or both [1,2]. In vertebrates, ECRs have been identified in many genes involved in development [10][11][12] and behaviour [13].
To search for ECRs that may play a role in regulating the DRD4 gene expression, we conducted a multiple comparison of 28 vertebrate genomes at this locus using the UCSC browser (http://genome.ucsc.edu). The transcriptional activity of the most conserved ECR (D4ECR1) identified at the DRD4 locus was tested in vitro. Here we assessed the ability of this ECR fragment to support luciferase reporter gene expression in primary cultures of rat neonate cortical tissue. These cells were chosen for the analyses because prefrontal cortex (PFC) neurons have been reported to express the endogenous DRD4 gene during prenatal and early postnatal development in modern humans [3,4]. The expression of the DRD4 gene was also confirmed by RTPCR of cDNA extracted from tissue sections (additional file 1).
Furthermore, we assessed whether this D4ECR1 was subject to regulation by Sp1, a transcription factor which has multiple potential consensus sequences in this ECR and has been suggested as a candidate regulator for DRD4 gene expression [14]. Our data demonstrate that this bioinformatics approach is capable of identifying regulatory domains and the transcription factors that modulate them.

Results
Identification of an ECR in the DRD4 gene of mammals
Examination of the DRD4 gene and flanking regions using the 28-way most conserved option of the conservation tool on the UCSC browser (limited by neighbouring genes at 10 kb upstream and 3.5 kb downstream) revealed the presence of one strong ECR in its first intron (henceforth referred to as D4ECR1, located at chr11:628420-628493, 74 bp in the human genome [hg18 2006 assembly]). Its location is shown in Figure 1A.
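As a quick consistency check on the coordinates quoted for D4ECR1, the sketch below parses a UCSC-style position string and recomputes the fragment length. It assumes that positions displayed by the browser are 1-based and inclusive, and shows the equivalent 0-based, half-open interval used in BED files. The function names are illustrative only and are not part of any UCSC tool.

```python
def parse_ucsc_position(position):
    """Split a 'chrom:start-end' string into (chrom, start, end).

    Spaces and thousands separators are tolerated, since position
    strings are often copied with stray formatting.
    """
    chrom, span = position.replace(" ", "").split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    return chrom, start, end


def interval_length(start, end):
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


def to_bed(chrom, start, end):
    """Convert a 1-based inclusive interval to 0-based half-open (BED)."""
    return chrom, start - 1, end


if __name__ == "__main__":
    chrom, start, end = parse_ucsc_position("chr11: 628420 -628493")
    print(chrom, start, end, interval_length(start, end))
    print(to_bed(chrom, start, end))
```

Under these assumptions the interval spans 74 bp, in agreement with the length stated in the text.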
Examination of the peak of conservation generated showed that a highly homologous sequence to that of the human D4ECR1 was found in 7 other mammalian genomes: Pan troglodytes (chimpanzee), Macaca mulatta (rhesus macaque), Mus musculus (mouse), Rattus norvegicus (rat), Cavia porcellus (guinea pig), Echinops telfairi (tenrec) and Canis familiaris (dog). Due to the lack of sequencing across the D4ECR1 locus in other vertebrate genomes, it was impossible to determine if this sequence was conserved across vertebrates.
The TFBS in the mammalian D4ECR1 sequences were identified by AliBaba 2.1 (Table 1). Briefly, AliBaba is a program for predicting binding sites of transcription factors in an unknown DNA sequence; it uses the binding sites collected in TRANSFAC (4.0). This analysis identified 7 conserved binding sites for different types of TFs between the human and mouse D4ECR1 (Table 1). These TFs include: Sp1, GATA1, AP2alpha, Oct1, NF1, GR and T3Ralpha, whereby the most common binding site found in the D4ECR1 sequence was for the TF Sp1 (Figure 1B).

Functional analysis of the human D4ECR1 demonstrated its ability to support differential luciferase expression in cultures of neonate frontal cortex
The potential transcriptional activity of a DNA fragment including the human D4ECR1 was validated in a reporter gene assay (Figure 2). The D4ECR1p construct was able to act as an enhancer of pGL3p reporter gene expression in primary cultures of frontal cortex obtained from 2 and 5 day old male Wistar rats. In these cell cultures, the activity supported by the D4ECR1p (4.9 fold) was significantly higher than that supported by the unmodified pGL3p control plasmid (Student's T-test, pGL3p vs. D4ECR1 2+5 days, p < 0.001, indicated by *** in Figure 2).

D4ECR1 enhancer activity is regulated by overexpression of Sp1 in vitro
The identification of multiple Sp1 sites in the D4ECR1 sequence (Figure 1B) may be of importance because this TF has been previously found to bind other proposed cis regulators of the human DRD4 gene [14,15]. The co-transfection experiment (Figure 3) demonstrated that overexpression of Sp1 can modulate the enhancer activity of D4ECR1p in dissociated cultures of neonate rat frontal cortex at age 2 days, whereby both concentrations of Sp1 tested had a repressing effect on the transcriptional activity of the D4ECR1p.
In the graph the basal transcriptional activities of the D4ECR1p and pGL3p plasmids were normalised to 100% to express the effect of co-transfecting with Sp1 as percentages of repression. In brief, Sp1 was found to repress the activity of both the unmodified pGL3p and D4ECR1p constructs; however, this effect was only statistically significant for the D4ECR1p.
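The normalisation used for the co-transfection graph can be sketched as follows. Each construct's basal activity (without Sp1) is set to 100%, and the activity measured with Sp1 is expressed relative to it. The numbers in the example are hypothetical placeholders, not data from this study, and the function names are illustrative.

```python
def percent_of_basal(basal_activity, activity_with_sp1):
    """Activity with Sp1 as a percentage of the construct's own basal activity."""
    return 100.0 * activity_with_sp1 / basal_activity


def percent_repression(basal_activity, activity_with_sp1):
    """Reduction relative to basal activity, in percent (basal = 100%)."""
    return 100.0 - percent_of_basal(basal_activity, activity_with_sp1)


if __name__ == "__main__":
    # Hypothetical normalised luciferase ratios, for illustration only.
    basal = {"pGL3p": 1.0, "D4ECR1p": 4.0}
    with_sp1 = {"pGL3p": 0.8, "D4ECR1p": 2.0}
    for construct in basal:
        remaining = percent_of_basal(basal[construct], with_sp1[construct])
        repression = percent_repression(basal[construct], with_sp1[construct])
        print(f"{construct}: {remaining:.1f}% of basal, {repression:.1f}% repression")
```

Normalising each construct to its own basal value allows the effect of Sp1 on the enhancer construct to be compared directly with its effect on the vector backbone.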

Discussion
This study aimed to identify potential transcriptionally active non-coding ECRs in the DRD4 gene locus of mammals, using a comparative genomic analysis complemented and validated by reporter gene analysis. The analyses conducted demonstrated that there is at least one ECR in the DRD4 gene of mammals (Figure 1A) that exhibited transcriptional activity in frontal cortex cultures of neonate rat (Figure 2). Furthermore, in our analyses the activity of the D4ECR1p construct was reduced in these cells when the TF Sp1, predicted to bind to the D4ECR1, was over-expressed (Figure 3). The results suggest that D4ECR1 could act as a regulator of DRD4 gene expression in the CNS. Analysis of the in vitro regulatory role of the human D4ECR1p construct (Figure 2) showed that it supported high levels of reporter gene expression in cultures of neonate rat frontal cortex. We observed (data not shown) that the transcriptional effects of the D4ECR1p were distinct in the cultures obtained from rats at different ages. This might be relevant as dopaminergic activity and DRD4 mRNA levels fluctuate during the first days of postnatal development in the cerebral cortex of rats [4,16,17,18].
DRD4 expression in the prefrontal cortex is constantly being modified in vivo in early postnatal development, as demonstrated by previous reports (e.g. refs. [18,19]). Thus, it is possible that the D4ECR1 could be one domain that mediates such differences in regulation at this time; however, there are many other regulators and domains that are likely to contribute to DRD4 expression. The variation in the activity of the D4ECR1p at these times is likely to be caused by interaction with transcription and growth factors differentially expressed during this period of development. Coincidentally, the expression of Sp1 has also been documented to fluctuate in the first days of postnatal development in the rat cerebral cortex [19].
The factors potentially binding to the D4ECR1 sequence (Table 1) indicate potential regulatory pathways operating on D4ECR1. In the present study, the TF with the highest number of binding sites found in the D4ECR1 sequence by AliBaba 2.1 was Sp1 (Figure 1B). Potential Sp1 binding sites have previously been reported in various positions within the human DRD4 locus including: the 120 bp duplication and 27 bp deletion in the 5' region [20,21]; upstream of the promoter region [22]; and in the 48 bp VNTR of the 3rd exon [15]. The binding of Sp1 to some of these elements has been confirmed in vitro by EMSA [14,15]. Furthermore, Sp1 was found to be expressed in the frontal cortex of neonate rats and has been implicated in the regulation of dopaminergic systems in the rat brain [23]. For all these reasons we addressed Sp1 regulation of D4ECR1. The results indicate that the D4ECR1p described here is also responsive to Sp1 and could, either individually or in concert with the other predicted Sp1-regulated domains in the DRD4 gene, modulate its expression (Figure 3). However, Sp1 is not the only transcription factor that is predicted to modulate function directed by D4ECR1, and further work is required to address how these might interact to modulate the enhancer function.

Conclusion
The DRD4 gene harbours a non-coding ECR in its first intron which exhibits characteristics of a cis regulator of gene expression in mammalian neurons. Furthermore, the combination of comparative genomics and in vitro assays can be very helpful to identify novel regulatory elements in genes relevant to behavioural disorders and the transcription factors that regulate their function.

Identification of ECRs in the DRD4 gene
The UCSC browser (http://genome.ucsc.edu/) was used to analyse the entire dopamine receptor D4 gene locus (accession number L12397) and 10000 and 3500 bases of the 5' and 3' regions respectively (extent between DRD4 and flanking genes SCT and DEAF1, Figure 1A). The browser was set to identify regions of conservation amongst the genomes of 28 vertebrates using the Vertebrate Multiz Alignment & PhastCons Conservation track.

Figure 2. The human D4ECR1 exhibits transcriptional activity in dissociated cultures of neonate rat frontal cortex. The D4ECR1 construct (1 μg) was transfected into dissociated cultures of frontal cortex obtained from 2- and 5-day-old Wistar rats under basal conditions. The transcriptional activity of the D4ECR1 is different from that of pGL3p alone. This effect was found to be significant (Student's T-test, p < 0.001, ***). Values were obtained in three independent experiments, each in triplicate (n = 9).

In silico prediction of transcription factor binding sites (TFBS) in the D4ECR1 sequence
In order to investigate which transcription factors (TFs) were potentially interacting with identified conserved regions, the sequences were subjected to in silico analyses, using the publicly available AliBaba 2.1 program. AliBaba 2.1 was set to detect known consensus binding sequences for TFs based on the TRANSFAC 4.0 library (from the biological database webpage http://www.biobase-international.com) using the following parameters: minimum matrix conservation (similarity between the consensus binding site for a TF and a potential binding site in the query sequence) = 70%, minimum number of homologous sites (the minimum number of sites from which a matrix is built) = 4, factor class level (the classification of TFs in the TRANSFAC database is hierarchical and includes 6 levels, from family of transcription factors to splice variants) = 4 and similarity of the sequence to the matrix = 1.
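The idea of a minimum similarity threshold between a consensus site and a candidate site can be illustrated with a deliberately simplified scan. This is not the AliBaba 2.1 algorithm, which builds matrices from TRANSFAC sites; it is only a sketch that scores each window of a sequence by the fraction of positions matching a fixed consensus string, on both strands. The consensus used here is the canonical Sp1 GC-box core (GGGCGG), and the example sequence is hypothetical rather than the D4ECR1 sequence.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_consensus(seq, consensus, min_similarity=0.7):
    """Find windows whose identity to `consensus` meets the threshold.

    Returns a list of (start, strand, window, similarity) tuples, where
    `start` is the 0-based offset on the forward strand and `window` is
    the matched sequence as read on the reported strand.
    """
    seq = seq.upper()
    consensus = consensus.upper()
    k = len(consensus)
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(template) - k + 1):
            window = template[i:i + k]
            matches = sum(a == b for a, b in zip(window, consensus))
            similarity = matches / k
            if similarity >= min_similarity:
                # Map minus-strand offsets back to forward-strand coordinates.
                start = i if strand == "+" else len(template) - k - i
                hits.append((start, strand, window, similarity))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence for illustration only.
    example = "ATTGGGCGGAGCTCCGCCCTA"
    for hit in scan_consensus(example, "GGGCGG", min_similarity=0.7):
        print(hit)
```

Lowering `min_similarity` admits more degenerate sites at the cost of more false positives, which is the same trade-off that a matrix conservation threshold controls in matrix-based tools.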

Generation of the D4ECR1 reporter gene construct
In silico analysis of the DRD4 gene suggested presence of one strong ECR in intron 1 (termed D4ECR1). The D4ECR1 fragment identified was amplified by polymerase chain reaction (PCR) from H. sapiens genomic DNA (Novagen) using the following primers: DRDR4ECR1-f 5' ggggtacccct act cga ggt ttc ccc ttg at 3' and DRDR4ECR1-r 5' ccgctcgagcg tat gaa gac cgt gcc cag tg 3'. These primers were designed to encompass the D4ECR1 and incorporate flanking restriction sites for Xho I and Acc65 I (bold in the primer sequences), one at either end of the primers, to facilitate cloning into the reporter gene vector pGL3p (Promega, UK), thus yielding a PCR fragment of 239 bp. Briefly, conditions for PCR comprised: a reaction mix containing 100 ng of DNA template, 1 μM of each primer, 2.5 units Diamond DNA polymerase (Bioline), 1X Diamond polymerase buffer (Bioline), 0.2 mM of each dNTP, 2 mM MgCl2, 1 M betaine and dH2O to a final volume of 50 μl per reaction. The PCR was performed for 35 cycles consisting of a 55°C annealing step (1 min), a 72°C extension step (1 min), and a 95°C denaturing step (1 min) in a Px2 thermal cycler (Thermo Scientific). The fragment generated by PCR was subcloned into the pGEM-T vector (Promega, UK) by TA cloning and the sequence verified. It was then released by enzymatic digestion with Xho I and Acc65 I and subsequently cloned into the multiple cloning site of the pGL3p vector, which carries a reporter gene, Firefly luciferase, driven by a minimal SV40 promoter (Promega, UK).
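Because the bold highlighting of the restriction sites is not visible in plain text, the sites can be located in the primer sequences with a short scan. The sketch below uses the standard recognition sequences GGTACC (Acc65 I) and CTCGAG (Xho I) and the primer sequences as printed above. The function and variable names are illustrative.

```python
# Recognition sequences; both are palindromic, so scanning one strand suffices.
SITES = {"Acc65I": "GGTACC", "XhoI": "CTCGAG"}

# Primer sequences copied as printed in the text (5' to 3').
PRIMERS = {
    "DRDR4ECR1-f": "ggggtacccct act cga ggt ttc ccc ttg at",
    "DRDR4ECR1-r": "ccgctcgagcg tat gaa gac cgt gcc cag tg",
}


def find_sites(primer, sites=SITES):
    """Return {enzyme: [0-based offsets]} for each recognition sequence."""
    seq = primer.replace(" ", "").upper()
    found = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        found[enzyme] = positions
    return found


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Offsets are counted from the 5' end of each primer, starting at zero. Any site reported outside the 5' tail lies in the template-matching part of the primer as printed.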

Cell culture
All rats were used under local and national Schedule 1 guidelines. Primary rat frontal cortex cultures were Ltd.]). The tissue was mechanically and enzymatically dissociated using 2 Pasteur pipettes of decreasing pore size in 3 ml of trypsin/EDTA solution (0.025/0.02% [Sigma-Aldrich Ltd.]). Dissociated cells were pelleted at 500 rpm and washed (3 times) with medium I (DMEM, 10% FCS, penicillin/streptomycin 100 units/ml, 100 μg/ml), resuspended and plated in poly-D-lysine (100 μl of 200 mg/ml) coated 24 well plates (5 × 10^5 cells per well) containing medium I (1 ml, without antibiotics) for 7 hours at 37°C in a humidified CO2 environment. The culture medium was changed to culture medium II (Neurobasal-A medium [Invitrogen/Gibco], 2% B27 supplement, 2 mM GlutaMAX I and 1 μg/ml of gentamycin), and cells were incubated overnight prior to transfection.

Transfection and Co-transfection assays
The human D4ECR1-reporter gene plasmid (D4ECR1p) or unmodified pGL3p vector (1 μg each) and pmLuc2 (Sea pansy luciferase expression vector [Novagen], 10 ng per 1 μg of luciferase reporter gene plasmid) were transiently co-transfected into primary cultures using ExGen 500 reagent (Fermentas) following the manufacturer's instructions. The pmLuc2 vector was used to control for transfection efficiency and enable standardisation of firefly luciferase values. At 48 hours post transfection, cells were lysed with 40 μl of passive lysis buffer (Promega UK, Ltd.). Firefly luciferase was quantified using the Dual Luciferase kit (Promega Ltd UK) following the manufacturer's instructions. Average luciferase activity values were expressed as a ratio to the reporter gene expression supported by unmodified pGL3p. Standard deviation and standard errors were calculated based on 3 independent experiments (triplicate wells).
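The standardisation described in this paragraph can be sketched as a two-step calculation: each well's firefly reading is divided by its Sea pansy (Renilla) reading to correct for transfection efficiency, and the resulting ratios are then expressed as fold activity over the mean ratio of the unmodified pGL3p wells. The readings below are hypothetical placeholders, and the function names are illustrative rather than part of any kit software.

```python
from math import sqrt
from statistics import mean, stdev


def normalise(firefly, renilla):
    """Per-well firefly/Renilla ratio to correct for transfection efficiency."""
    return [f / r for f, r in zip(firefly, renilla)]


def fold_over_control(test_ratios, control_ratios):
    """Express test ratios relative to the mean control ratio.

    Returns (mean fold, sample standard deviation, standard error).
    """
    control_mean = mean(control_ratios)
    folds = [ratio / control_mean for ratio in test_ratios]
    sd = stdev(folds)
    sem = sd / sqrt(len(folds))
    return mean(folds), sd, sem


if __name__ == "__main__":
    # Hypothetical luminometer readings, for illustration only.
    control = normalise([100.0, 110.0, 90.0], [100.0, 100.0, 100.0])
    test = normalise([480.0, 510.0, 470.0], [100.0, 102.0, 98.0])
    fold, sd, sem = fold_over_control(test, control)
    print(f"fold over pGL3p: {fold:.2f} (SD {sd:.2f}, SEM {sem:.2f})")
```

In this sketch the spread is computed across the wells supplied to the function; pooling wells from independent experiments, as in the n = 9 design described for Figure 2, would follow the same pattern.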
To determine the effects of co-expression of a specific TF on the regulatory abilities of this D4ECR1, we transfected the D4ECR1p (1 μg per well) with the Renilla luciferase plasmid pmLuc2 (1:100 ratio) and two different concentrations (0.5 and 1 μg) of an expression vector carrying the full length cDNA of human Sp1, kindly donated by Michael Bannon [24]. These constructs were delivered into primary cultures of neonate rat frontal cortex obtained from 2-day-old rats as described above. In control experiments, the unmodified luciferase vector pGL3p (+pmLuc2) was also co-transfected with the two concentrations of Sp1 to assess possible effects of Sp1 on the backbone of the luciferase plasmid.

Additional material
Additional File 1: RTPCR gel of DRD4 mRNA from rat frontal cortex. Agarose gel showing the RTPCR products of the DRD4 receptor amplified from sections of Wistar neonate rat brain frontal cortex. Lane 1: 1 kb ladder, lane 2: negative control, lane 3: RTPCR conducted with rat genomic DNA, lane 4: RTPCR amplification from rat brain cortex.