A novel locus on mouse chromosome 7 that influences survival after infection with tick-borne encephalitis virus

Background Tick-borne encephalitis (TBE) is the main tick-borne viral infection in Eurasia. Its manifestations range from inapparent infections and fevers with complete recovery to debilitating or fatal encephalitis. The basis of this heterogeneity is largely unknown, but part of this variation is likely due to host genetics. We have previously found that BALB/c mice exhibit intermediate susceptibility to infection with TBE virus (TBEV), STS mice are highly resistant, whereas the recombinant congenic strain CcS-11, carrying 12.5% of the STS genome on the background of the BALB/c genome, is even more susceptible than BALB/c. Importantly, the mouse orthologs of human TBE-controlling genes Oas1b, Cd209, Tlr3, Ccr5, Ifnl3 and Il10 are in CcS-11 localized on segments derived from the strain BALB/c, so they are identical in BALB/c and CcS-11. As they cannot be responsible for the phenotypic difference between the two strains, we searched for the responsible STS-derived gene locus. The STS-derived genes in CcS-11 may nevertheless operate by regulating or epigenetically modifying these non-polymorphic genes of BALB/c origin.
Methods To determine the location of the STS genes responsible for the susceptibility of CcS-11, we analyzed survival of TBEV-infected F2 hybrids between BALB/c and CcS-11. CcS-11 carries STS-derived segments on eight chromosomes. These were genotyped in the F2 hybrid mice and their linkage with survival was tested by binary trait interval mapping. We sequenced the genomes of BALB/c and STS using next generation sequencing and performed bioinformatics analysis of the chromosomal segment exhibiting linkage with TBEV survival.
Results Linkage analysis revealed a novel suggestive survival-controlling locus on chromosome 7 linked to marker D7Nds5 (44.2 Mb). Analysis of this locus for polymorphisms between BALB/c and STS that change RNA stability and genes' functions led to detection of 9 potential candidate genes: Cd33, Klk1b22, Siglece, Klk1b16, Fut2, Grwd1, Abcc6, Otog, and Mkrn3. One of them, Cd33, carried a nonsense mutation in the STS strain.
Conclusions The robust genetic system of recombinant congenic strains of mice enabled detection of a novel suggestive locus on chromosome 7. This locus contains 9 candidate genes, which will be the focus of future studies not only in mice but also in humans.

Background
Tick-borne encephalitis (TBE) is the main tick-borne viral infection in Eurasia. It is prevalent across the entire continent from Japan to France [1]. The disease is caused by tick-borne encephalitis virus (TBEV), a flavivirus of the family Flaviviridae, which besides TBEV includes West Nile virus (WNV), dengue virus (DENV), Zika virus (ZIKV), yellow fever virus (YFV), Japanese encephalitis virus (JEV), and several other viruses causing extensive morbidity and mortality in humans. Ticks act as both the vector and reservoir for TBEV. The main hosts are small rodents, with humans being accidental hosts. In Europe and Russia, between 5000 and 13,000 clinical cases of TBE are reported annually, with large year-to-year fluctuation [2]. The highest incidence of TBE is reported in western Siberia, the Czech Republic, Estonia, Slovenia and Lithuania, but the prevalence of the disease is believed to be higher than actually reported [1,2]. TBEV may produce a variety of clinical outcomes, from asymptomatic infection to fever and acute or chronic progressive encephalitis. The outcome of infection depends on the strain of the virus [1], on the genotype [3], sex and age of the host [4], and on environmental and social factors [1]. Environmental and social factors also influence the risk of infection.
Genetic influence on susceptibility to TBEV-induced disease has been analyzed by two main strategies: a hypothesis-independent, phenotype-driven approach and a hypothesis-driven approach. Application of a genome-wide search (the hypothesis-independent approach) in mouse led to identification of the gene Oas1b (2′-5′-oligoadenylate synthetase gene) [5,6]. A stop codon in exon 4 of the gene Oas1b (a natural knockout), present in the majority of laboratory mouse strains, causes production of a protein lacking 30% of the C-terminal sequence [5]. This part of the molecule seems to be critical for the tetramerization required for OAS1B activity, which leads to degradation of viral RNA. Thus, this mutation makes the majority of laboratory mouse strains susceptible to flaviviruses [6,7]. The human ortholog of this gene (OAS1) also modifies susceptibility to another flavivirus, WNV [8,9], whereas OAS2 and OAS3, localized in the same cluster on chromosome 12q24.2, influence the response to TBEV [3]. The polymorphic sites in OAS2 and OAS3 associated with susceptibility to TBEV did not result in amino acid changes, so the mechanism of susceptibility control is not known [3]. The hypothesis-driven approach has focused on genes encoding molecules that mechanistic studies have indicated to be involved in the antiviral response [9]. These candidate gene studies revealed that polymorphisms in CD209/DC-SIGN [10], CCR5 [11,12], TLR3 [12,13], IL10 [14] and IFNL3/IL28B [14] influence susceptibility to TBEV in humans.
Our previous study showed that, after both subcutaneous and intracerebral inoculation of European prototypic TBEV, BALB/c mice exhibited intermediate susceptibility to the infection and STS mice were highly resistant, whereas the strain CcS-11, which carries 12.5% of the STS genome on the background of the genome of the strain BALB/c [15], was even more susceptible than its two parents, BALB/c and STS [16]. Importantly, the mouse orthologs of human TBEV-controlling genes Oas1b, Cd209, Tlr3, Ccr5, Il10 and Ifnl3 are in CcS-11 localized on segments derived from the strain BALB/c (Fig. 1), so they are identical in BALB/c and CcS-11 and hence cannot be responsible for the phenotypic difference between the two strains. Therefore, the difference must be due to a presently unknown locus, which could be detected by a linkage study of a cross between BALB/c and CcS-11. We therefore generated an F2 intercross between BALB/c and CcS-11 and performed linkage and bioinformatics analyses. These studies revealed a novel suggestive locus on mouse chromosome 7 containing 9 potential candidate genes.

Methods
Mice
A total of 417 female F2 offspring of an intercross between strains CcS-11 and BALB/c (mean and median age 9.5 and 9 weeks, respectively, at the time of infection) were produced at the Institute of Molecular Genetics AS CR. Mice were tested in three successive experimental groups at the Institute of Parasitology, AS CR. When used for these experiments, strain CcS-11 had undergone more than 90 generations of inbreeding. Experiments 1, 2 and 3 comprised 120, 121 and 176 F2 mice, respectively. Sterilized pellet diet and water were supplied ad libitum. The mice were housed in plastic cages with wood-chip bedding, situated in a specific pathogen-free room with a constant temperature of 22 °C and a relative humidity of 65%.

Virus infection and disease phenotype
Experiments were performed with the European prototypic TBEV strain Neudoerfl (a generous gift from Professor F. X. Heinz, Medical University of Vienna). This strain was passaged five times in brains of suckling mice before use in this study [16]. Mice were infected subcutaneously with 10⁴ pfu of the virus.
Mice were scored for mortality, and for the presence of ruffled fur and paresis, for a period of 35 days post-infection (p.i.) with TBEV in three independent successive experiments at the Institute of Parasitology AS CR.

Statistical analysis
Survival, ruffled fur and paresis were treated as binary phenotypes (death/survival; presence/absence of symptom), and binary trait interval mapping was performed [18,19]. A permutation test [20] was used to assess significance. This approach takes into account the limited genetic difference between the strains BALB/c and CcS-11. On the basis of 10,000 permutation replicates, the 5% significance LOD threshold was 2.56 and the 10% threshold was 2.23. The Pearson correlation coefficient between death and paresis was computed with the program Statistica for Windows 12.0 (StatSoft, Inc., Tulsa, OK).
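To illustrate how a permutation threshold of this kind can be obtained, the following Python sketch computes a LOD score for a binary trait at single markers and estimates a genome-wide threshold by shuffling the phenotypes. It is a simplification under stated assumptions: it tests only genotyped markers rather than performing interval mapping between them, and the function names (`marker_lod`, `permutation_threshold`) are hypothetical and are not taken from the software used in this study.

```python
import math
import random


def _loglik10(deaths, total):
    """Binomial log10-likelihood evaluated at the observed death rate."""
    if total == 0 or deaths == 0 or deaths == total:
        return 0.0  # likelihood is 1 when the estimated rate is 0 or 1
    p = deaths / total
    return deaths * math.log10(p) + (total - deaths) * math.log10(1 - p)


def marker_lod(genotypes, died):
    """LOD for a binary trait at one marker.

    genotypes: one genotype label per mouse (for example "CC", "CS", "SS")
    died:      1 if the mouse died, 0 if it survived
    """
    counts = {}
    for g, d in zip(genotypes, died):
        n, k = counts.get(g, (0, 0))
        counts[g] = (n + 1, k + d)
    # Alternative model: a separate death rate for every genotype class.
    alt = sum(_loglik10(k, n) for n, k in counts.values())
    # Null model: one common death rate for all mice.
    null = _loglik10(sum(died), len(died))
    return alt - null


def permutation_threshold(markers, died, n_perm=1000, alpha=0.05, seed=0):
    """LOD threshold from the permutation distribution of the maximum LOD.

    markers: list of genotype lists, one list per tested marker
    """
    rng = random.Random(seed)
    shuffled = list(died)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    idx = min(len(maxima) - 1, math.ceil((1 - alpha) * len(maxima)) - 1)
    return maxima[idx]
```

Because only the STS-derived segments segregate in this cross, the maximum is taken over a small set of markers, which is why the resulting thresholds are lower than those typical of a whole-genome scan.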

Detection of polymorphisms that change RNA stability and genes' functions
We sequenced the genomes of strains BALB/c and STS using the next generation sequencing (NGS) system HiSeq 2500 (Illumina) (12× coverage). NGS data were preprocessed using the software Trimmomatic [21], and overlapping paired reads were joined using the software Flash [22]. Alignment to the reference mouse sequence mm10 (build GRCm38) was performed using the BWA (Burrows-Wheeler Aligner) program [23]. Mapped reads were sorted and indexed, and duplicate reads were marked. Local realignment around indels, base recalibration and variant filtration were performed using the software GATK (The Genome Analysis Toolkit) [24]. Variant annotation and effect prediction were performed using the software SnpEff [25]. The segment covering the peak of linkage on chromosome 7, from 36.2 to 74.5 Mb, was inspected for polymorphisms between BALB/c and STS that change RNA stability and genes' functions. IGV (Integrative Genomics Viewer) was used for visualization of results [26].
Fig. 1 Genetic composition of the strain CcS-11. The regions of STS and BALB/c origin are represented as dark and white, respectively; the boundary regions of undetermined origin are shaded. Only the markers or SNPs defining the boundaries of STS-derived segments and the markers that were tested for linkage (underlined) are shown. Genes Oas1b, Cd209, Tlr3, Ccr5, Ifnl3 and Il10, known to control susceptibility to TBEV, are shown in green; potential candidate genes Cd33, Klk1b22, Siglece, Klk1b16, Fut2, Grwd1, Abcc6, Otog, and Mkrn3, detected in the current study, are shown in red.
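The inspection of the chromosome 7 segment for function-changing polymorphisms can be sketched in Python as a filter over SnpEff-annotated VCF records. This is a minimal illustration, not the procedure actually used: the function name `candidate_genes` is hypothetical, the restriction to the HIGH and MODERATE impact classes is an assumed cutoff, and the sketch relies only on the standard layout of the SnpEff `ANN` field (allele, annotation, impact and gene name as the first four `|`-separated subfields).

```python
# Linkage-peak segment on chromosome 7 given in the text (36.2 to 74.5 Mb).
REGION_CHROM = "7"
REGION_START = 36_200_000
REGION_END = 74_500_000

# Assumed cutoff for "function-changing" variants; not stated in the paper.
KEEP_IMPACTS = {"HIGH", "MODERATE"}


def candidate_genes(vcf_lines):
    """Map gene name -> set of variant effects found inside the region."""
    genes = {}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        if chrom.startswith("chr"):
            chrom = chrom[3:]  # accept both "7" and "chr7" naming
        if chrom != REGION_CHROM or not REGION_START <= pos <= REGION_END:
            continue
        for entry in info.split(";"):
            if not entry.startswith("ANN="):
                continue
            # One ANN entry may hold several comma-separated annotations.
            for ann in entry[4:].split(","):
                parts = ann.split("|")
                if len(parts) > 3 and parts[2] in KEEP_IMPACTS:
                    genes.setdefault(parts[3], set()).add(parts[1])
    return genes
```

Applied to a file with `open("variants.ann.vcf")` (a placeholder file name), the function would return a dictionary keyed by gene symbol, which could then be compared between the BALB/c and STS call sets.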

Results
Binary trait linkage analysis revealed a suggestive locus on chromosome 7 near D7Nds5 affecting the binary trait death/survival (LOD = 2.15), with a corresponding genome-scan-adjusted P value of 0.12 (Fig. 2a). The 1-LOD support interval spans from D7Mit25 to D7Nds1. The STS allele, in both homozygotes and heterozygotes, was associated with a higher death rate in each of the three separate experimental groups (Fig. 2b) and in the pooled data (Fig. 3a), so its presence in CcS-11 further enhances the overall susceptibility determined by the BALB/c background. Ruffled fur was observed in only 8% of mice, so it was not suitable for statistical analysis. Paresis was less frequent than mortality (n = 60 vs. 102) and not all paretic mice died, but the two phenotypes were positively correlated (Pearson correlation 0.53). Moreover, the frequency of paresis in the three D7Nds5 genotypes (Fig. 3b), although not significantly different, was biologically compatible with the survival data, because D7Nds5 CC homozygotes had both the highest survival rate and the lowest percentage of paresis.
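For readers unfamiliar with the 1-LOD support interval, the following Python sketch shows one simple way to derive it from a LOD profile: take the contiguous run of tested positions around the peak whose LOD stays within one unit of the maximum. The function name `one_lod_interval` is hypothetical, and mapping programs may instead extend the interval to the first flanking position that falls below the cutoff, so this is an illustration of the idea rather than the exact procedure used here.

```python
def one_lod_interval(positions, lods, drop=1.0):
    """Return (left, right) positions bounding the support interval.

    positions: ordered map positions along one chromosome
    lods:      LOD score at each position
    drop:      allowed decrease from the peak LOD (1.0 for a 1-LOD interval)
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1  # extend leftwards while the LOD stays above the cutoff
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1  # extend rightwards in the same way
    return positions[left], positions[right]
```

With a peak LOD of 2.15, as reported for D7Nds5, the cutoff would be 1.15, and the interval would be bounded by the outermost neighbouring positions whose LOD remains at or above that value.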

Discussion
CD33 and Siglec E belong to the family of CD33-related sialic-acid-binding immunoglobulin-like lectins (CD33r-Siglecs). They are ITIM-containing inhibitory receptors, which are involved in the regulation of inflammatory and immune responses [27]. In the strain STS, the gene Cd33 carried a nonsense mutation (Table 1). In mouse, the product of this gene is expressed on myeloid precursors and cells of myeloid origin [28] and on microglial cells [29]. It can inhibit the response to amyloid plaques, and its deletion leads to protection in the mouse model of Alzheimer disease (AD) [29]; in humans, some CD33 genetic variants are associated with late-onset AD [30]. Its potential role in the pathology of TBEV infection might be associated with its regulatory role in inflammatory responses. The gene Siglece carried a single amino acid change in the strain STS. Siglec E is expressed on microglia and inhibits neurotoxicity triggered by neural debris [31], which might have a protective role against damage induced by flaviviruses.
A single amino acid change was present in KLK1B22 and KLK1B16. Kallikreins are serine proteases that might both help to fight infection by activating the complement system [32] and aggravate disease symptoms by releasing bradykinin, which causes alterations in vascular permeability [33]. Their role in defense against flaviviruses has not been described. The kallikrein-bradykinin system has been described to contribute to protection against Leishmania [34] and Trypanosoma cruzi [35] parasites in mice. Interestingly, the loci Lmr21 and Tbbr3, which control susceptibility to L. major [36] and T. b. brucei [17], respectively, were mapped on mouse chromosome 7 in the strain CcS-11. However, both loci are mapped on a long chromosomal segment, so other gene(s) might be responsible for their effect.
FUT2 has been described to influence the control of a wide range of pathogens, such as noroviruses [37], rotaviruses [38], HIV [39], and Escherichia coli [40] in humans, and Helicobacter pylori in mouse [41], but its role in resistance to flaviviruses is not known.
Makorin 1 (MKRN1) induces degradation of the WNV capsid, which might protect host cells [42]. The E3 ligase domain responsible for the MKRN1 effect is also present in MKRN3 [43]. Thus, the gene Mkrn3 might be related to defense against flaviviruses. The possible roles of Otog, Grwd1 and Abcc6 in resistance to TBEV remain to be elucidated.
The public database BioGPS shows that, in uninfected mice, all the candidate genes are expressed in tissues such as brain, spleen and liver (Table 2). Brain is the main target for the virus; however, during the extraneural phase of the infection, several tissues and organs in the body are infected, including spleen and liver [44]. Among the candidate genes, Cd33 and Siglece exhibit the highest expression in these tissues, with expression in microglia more than ten times the median value (> 10M). Cd33 and Klk1b22 are highly expressed in spleen (> 3M), and > 10M expression of these two genes is also observed in bone marrow. Siglece is also highly expressed in bone (> 3M) and bone marrow macrophages (> 3M), whereas Cd33 is highly expressed […]; it was also highly expressed in large intestine (> 3M), prostate (> 3M) and testis (> 3M). GRWD1 was described to play a role in ribosome biogenesis and during myeloid differentiation [45]. Its high expression in hematopoietic stem cells (> 10M), mega erythrocyte progenitors (> 10M), granulocytes (> 10M) and common myeloid progenitors (> 3M) supports this finding, but it is also expressed in several T cell subpopulations (> 3M), B cells in the marginal zone (> 3M), and the lacrimal gland. Abcc6 is highly expressed in liver (≫ 30M) and in lens (> 10M), and Mkrn3 is highly expressed in retina (10M) and in olfactory bulb (> 3M). The expression data further support a potential role of the detected candidate genes in defense against TBEV, but in the future they must be complemented with data describing gene expression after TBEV infection.
We have found a susceptibility allele of a locus on chromosome 7 in the resistant strain STS. This apparent paradox likely arises because most inbred mouse strains were produced without intentional selective breeding for a specific quantitative phenotype (such as susceptibility to specific infections). They therefore randomly inherited from their non-inbred ancestors susceptibility alleles at some loci and resistance alleles at others, so that their overall susceptibility phenotype depends on the relative number of both types of alleles. Such a finding is not unique, as susceptibility alleles originating from resistant strains were found in susceptibility studies of other infectious diseases [17,46,47] and colon cancer [48]. Similarly, in different in vitro tests of immune responses, a low-responder allele was identified in a high-responding strain [49] or vice versa [50]. Another explanation might be the presence of a BALB/c allele interacting with the STS allele on chromosome 7. Demonstration of such an interaction would require further experiments. We have already observed interactions of STS and BALB/c alleles leading to extreme phenotypes in susceptibility to L. major [51] and L. tropica [47].

Conclusion
Mapping of TBEV-controlling genes in mice is not easy because of the presence of a strong TBEV-controlling gene, Oas1b, which is identical in BALB/c and CcS-11, as well as in the majority of laboratory mouse strains [6,7], and masks the effects of other controlling genes. We therefore used a powerful genetic system, recombinant congenic strains, and succeeded in mapping a novel TBEV susceptibility locus on chromosome 7 and identifying 9 potential candidate genes. Products of some of these genes have been described to participate in defense against flaviviruses; the role of the others is unknown. The genes detected here will be the focus of future studies that will include characterization of the products of candidate gene(s) in BALB/c and CcS-11, introduction of modifications into candidate genes and study of their influence on disease outcome in mouse, and study of the influence of polymorphisms in human orthologs of candidate genes on susceptibility to TBEV in humans.
According to current gene and protein nomenclature, mouse gene symbols are italicized, with only the first letter in upper-case (e.g. Cd33). Mouse protein symbols are not italicized, and all letters are in upper-case (e.g. CD33). Human gene symbols are in upper-case and are italicized (e.g. CD33). Human protein symbols are identical to their corresponding gene symbols except that they are not italicized (e.g. CD33).