Amyloid-β plaque formation and reactive gliosis are required for induction of cognitive deficits in App knock-in mouse models of Alzheimer’s disease

Background Knock-in (KI) mouse models of Alzheimer’s disease (AD) that endogenously overproduce Aβ without non-physiological overexpression of amyloid precursor protein (APP) provide important insights into the pathogenic mechanisms of AD. Previously, we reported that AppNL-G-F mice, which harbor three familial AD mutations (Swedish, Beyreuther/Iberian, and Arctic), exhibited emotional alterations before the onset of definitive cognitive deficits. To determine whether these mice exhibit deficits in learning and memory at more advanced ages, we compared the Morris water maze performance of AppNL-G-F and AppNL mice, which harbor only the Swedish mutation, with that of wild-type (WT) C57BL/6J mice at the age of 24 months. To correlate cognitive deficits and neuroinflammation, we also examined Aβ plaque formation and reactive gliosis in these mice.

Results In the Morris water maze, a spatial task, 24-month-old AppNL-G-F/NL-G-F mice exhibited significantly poorer spatial learning than WT mice during the hidden training sessions, but performed similarly to WT mice during the visible training sessions. Not surprisingly, AppNL-G-F/NL-G-F mice also exhibited spatial memory deficits both 1 and 7 days after the last training session. By contrast, 24-month-old AppNL/NL mice had intact spatial learning and memory relative to WT mice. Immunohistochemical analyses revealed that 24-month-old AppNL-G-F/NL-G-F mice developed massive Aβ plaques and reactive gliosis (microgliosis and astrocytosis) throughout the brain, including the cortex and hippocampus. By contrast, we observed no detectable brain pathology in AppNL/NL mice despite overproduction of human Aβ40 and Aβ42 in their brains.

Conclusions Aβ plaque formation, followed by sustained neuroinflammation, is necessary for the induction of definitive cognitive deficits in App-KI mouse models of AD. Our data also indicate that introduction of the Swedish mutation alone in endogenous APP is not sufficient to produce either AD-related brain pathology or cognitive deficits in mice.

Electronic supplementary material The online version of this article (10.1186/s12868-019-0496-6) contains supplementary material, which is available to authorized users.

Background
Authentic animal models for Alzheimer's disease (AD) research are of vital importance for investigating the molecular mechanisms of AD and testing potential therapeutic approaches [1,2]. Several transgenic mouse lines overexpressing amyloid precursor protein (APP) that recapitulate amyloid-β (Aβ) deposition and the accompanying behavioral deficits have been instrumental to AD research [3][4][5][6]. However, these mice may also exhibit phenotypes that arise because they overproduce various APP fragments in addition to Aβ [7][8][9].
To overcome this problem, alternative mouse models have been generated via knock-in (KI) of a humanized Aβ sequence harboring familial AD mutations (Swedish (NL), Beyreuther/Iberian (F), and Arctic (G)) in order to model Aβ amyloidosis without non-physiological overexpression of APP [10]. In App NL-G-F mice, which harbor all three mutations, Aβ amyloidosis is aggressive, and neuroinflammation is observed in subcortical structures as well as cortical regions [10][11][12]. By contrast, App NL mice, which carry only the Swedish mutation, produce significantly higher levels of Aβ40 and Aβ42 without overt AD-related brain pathology such as extracellular Aβ plaques or neuroinflammation [10,11,13]. None of these App-KI mice exhibit tau pathology or severe neuronal loss, suggesting that they are suitable models for preclinical AD [9].
Previously, we reported that App NL-G-F mice exhibit anxiolytic-like phenotypes in the elevated plus maze task at 6 months of age, as well as a subtle decline in spatial learning ability during the acquisition session of the Barnes maze task at 8 months [14]. These results suggest that App NL-G-F mice develop emotional alterations prior to the emergence of definitive cognitive deficits. By contrast, App NL mice do not undergo overt cognitive decline prior to 8 months of age [14]. It remains to be seen whether these App-KI mice exhibit learning and memory deficits at more advanced ages.
In this study, we assessed the performance of App NL-G-F , App NL , and wild-type (WT) C57BL/6J mice in a spatial task at the age of 24 months. To correlate cognitive deficits and neuroinflammation, we also examined Aβ plaque formation and reactive gliosis in these mice. Our results demonstrate that Aβ deposition, followed by sustained neuroinflammation, is required for induction of definitive cognitive deficits in App-KI mouse models of AD.

App NL-G-F/NL-G-F mice exhibit spatial learning deficits and reduced memory function in the Morris water maze task
The Morris water maze (MWM) is one of the most commonly used paradigms for assessing hippocampal-dependent spatial learning and memory in mouse models of AD [5,15]. In this task, mice are required to use extra-maze cues to learn the location of a platform submerged below the water surface. During the initial training sessions, when the platform was visible ("visible training"), the motor and visual capacities of the mice were assessed (Fig. 1a). Subsequently, the mice were trained to acquire the spatial location of the platform when it was not visible ("hidden training") (Fig. 1a). One day after the sixth session of hidden training, a probe test (Probe test 1) was conducted without a platform to determine whether mice had learned the location of the platform using extra-maze cues (Fig. 1a). The mice were also subjected to a second probe test (Probe test 2) 7 days after the seventh session of hidden training (Fig. 1a).
To assess motor and visual capabilities, mice were compared across 4-day visible training sessions, during which the platform was indicated by a black cubic landmark (Fig. 1a). App NL-G-F/NL-G-F , App NL/NL and WT mice performed equally well in these sessions: latency (Fig. 1b; F … indicate that swimming and visual abilities were comparable among App NL-G-F/NL-G-F , App NL/NL , and WT mice at 24 months of age.

Fig. 1 Learning performance in App NL-G-F/NL-G-F and App NL/NL mice during training sessions in the Morris water maze task. a Schematic timeline of the Morris water maze (MWM) task. Mice received visible training for 4 days (4 trials per day) to assess motor and visual capabilities, followed by hidden training for 7 days (4 trials per day) to assess spatial learning ability. Two probe tests were conducted, at 1 day after the sixth session (Probe test 1) and at 7 days after the seventh session of the hidden training (Probe test 2), to assess spatial memory performance. b-d In the visible training sessions, App NL-G-F/NL-G-F and App NL/NL mice performed as well as WT mice. e and f During the hidden training sessions, App NL-G-F/NL-G-F mice spent significantly more time and travelled a significantly longer distance to reach the submerged platform than WT mice. g App NL-G-F/NL-G-F mice exhibited significantly lower path efficiency than WT mice. h Swimming speed did not differ between App NL-G-F/NL-G-F and WT mice. e-h Spatial learning ability in App NL/NL mice was equivalent to that in WT mice. n = 17 WT (B6J), n = 11 App NL/NL , n = 16 App NL-G-F/NL-G-F
Following the visible training sessions, mice were trained to find a submerged platform during 7-day hidden training sessions (Fig. 1a). App NL-G-F/NL-G-F mice spent more time and travelled a longer distance (Fig. 1e and f; … App NL-G-F/NL-G-F , p < 0.001) to reach the platform than WT mice across the training. Moreover, relative to WT mice, App NL-G-F/NL-G-F mice exhibited lower path efficiency, as determined by the ratio of the ideal path that the mice could have taken to reach the platform to the actual distance travelled (Fig. 1g; F[2, 41] = 18.66, p < 0.001, post hoc, WT vs. App NL-G-F/NL-G-F , p < 0.001) [16,17], indicating that spatial accuracy was affected in these mice. Swimming speed did not differ between App NL-G-F/NL-G-F and WT mice (Fig. 1h; F[2, 41] = 0.36, p = 0.700), suggesting that the difference in latency was not due to a difference in the ability to swim. Our data also revealed that App NL-G-F/NL-G-F mice could still learn the location of the hidden platform, as these mice exhibited decreases in latency (Fig. 1e … . Taken together, these results suggest that 24-month-old App NL-G-F/NL-G-F mice exhibited a significant decline in the ability to learn the spatial location of a submerged platform.

In Probe test 1 (Fig. 1a), App NL-G-F/NL-G-F mice spent less time in the target quadrant than WT mice ( … App NL-G-F/NL-G-F , p = 0.005); this parameter is believed to provide a more sensitive evaluation of spatial memory performance [18,19]. The total distance travelled (Fig. 2d; F[2, 41] = 2.68, p = 0.081) and swimming speed (Fig. 2e; F[2, 41] = 2.49, p = 0.095) in Probe test 1 did not differ between WT and App NL-G-F/NL-G-F mice, suggesting that the two genotypes had similar motor capability and motivation to search for the platform. Together, these results suggest that App NL-G-F/NL-G-F mice had reduced spatial memory 1 day after the last training session, but still exhibited a spatial bias toward the former location of the platform.
In Probe test 2 (Fig. 1a), the percentage of time spent in the target quadrant was significantly lower in App NL-G-F/NL-G-F mice than in WT mice (Fig. 2f; F[2, 41] = 9.03, p < 0.001, post hoc, WT vs. App NL-G-F/NL-G-F , p < 0.001). In addition, App NL-G-F/NL-G-F mice did not exhibit a spatial bias toward the target quadrant over non-target quadrants (Additional file 1: Fig … were comparable between App NL-G-F/NL-G-F and WT mice, providing further confirmation that motor capability and motivation did not differ between genotypes. These results suggest that App NL-G-F/NL-G-F mice could not retain the acquired spatial memory 7 days after the last training session.
Taken together, our findings indicate that App NL-G-F/NL-G-F mice exhibit definitive deficits in spatial learning and memory at the age of 24 months.

App NL/NL mice exhibit normal spatial learning and memory in the MWM task, even at 24 months of age
App NL/NL mice exhibited motor and visual capabilities comparable to those of WT mice during the visible training sessions, as evidenced by the lack of a between-genotype difference in latency (Fig. 1b), distance (Fig. 1c) and swimming speed (Fig. 1d).
During the hidden training sessions, App NL/NL mice performed as well as WT mice: latency (Fig. 1e; post hoc, WT vs. App NL/NL , p = 0.362) and distance (Fig. 1f; post hoc, WT vs. App NL/NL , p = 0.450) did not differ between WT and App NL/NL mice. In addition, path efficiency was similar between App NL/NL and WT mice across the training (Fig. 1g; post hoc, WT vs. App NL/NL , p = 0.999), indicating that both genotypes reached the hidden platform with comparable spatial accuracy. Swimming speed in App NL/NL mice was also comparable to that in WT mice and decreased significantly across the sessions (Fig. 1h; App NL/NL , F[6, 60] = 5.40, p < 0.001). Taken together, these results suggest that spatial learning ability in App NL/NL mice was equivalent to that in WT mice at 24 months of age.
In Probe test 1, time spent in the target quadrant did not significantly differ between WT and App NL/NL mice (Fig. 2a; … Taken together, these results suggest that spatial memory at 1 day after the last training session was equivalent in App NL/NL and WT mice at 24 months of age.

In Probe test 2, the percentage of time spent in the target quadrant did not differ significantly between WT and App NL/NL mice (Fig. 2f; post hoc, WT vs. App NL/NL , p = 0.907). App NL/NL mice still spent a significantly higher percentage of time in the target quadrant than predicted by chance (Fig. 2f; App NL/NL , t(20) = 5.93, p < 0.001), and exhibited a spatial bias toward this quadrant over other quadrants (Additional file 1: Fig. S1b; App NL/NL , t(10) = −5.94, p < 0.001). The number of platform crossings (Fig. 2g) and proximity to the platform (Fig. 2h; post hoc, WT vs. App NL/NL , p = 0.293) were similar between WT and App NL/NL mice, as were total distance travelled (Fig. 2i; post hoc, WT vs. App NL/NL , p = 0.844) and swimming speed (Fig. 2j) during the test, indicating that motor capability and motivation were comparable between these genotypes. These results suggest that App NL/NL mice had normal memory function even 7 days after the last training session.

Taken together, these results suggest that App NL/NL mice do not exhibit any alterations in spatial learning and memory, even at the age of 24 months.

App NL-G-F/NL-G-F mice exhibit aggressive Aβ amyloidosis and reactive gliosis, whereas App NL/NL mice do not develop overt Aβ pathology in their brains
To determine whether cognitive deficits are correlated with Aβ-related brain pathology, we immunostained coronal brain sections from App NL-G-F/NL-G-F , App NL/NL and WT mice with the anti-Aβ antibody 82E1, anti-Iba1 antibody as a microglial marker, and anti-GFAP antibody as an astrocytic marker. Intense Iba1 immunoreactivity was clustered around regions of Aβ immunoreactivity in both the cortex (Fig. 3b) and hippocampus (Fig. 3d) of 24-month-old App NL-G-F/NL-G-F mice, and high-magnification images of the cortex confirmed accumulation of microglia around the Aβ plaques (Fig. 3b, lower panels). App NL-G-F/NL-G-F mice also exhibited intense GFAP immunoreactivity in the cortex (Fig. 3c) and hippocampus (Fig. 3e), suggestive of astrogliosis. Higher-magnification images revealed that many reactive astrocytes were present around the Aβ plaques (Fig. 3c, lower panels). Taken together, these findings indicate that App NL-G-F/NL-G-F mice developed extensive microgliosis and astrocytosis associated with the aggressive Aβ plaque formation in their brains.
In sharp contrast, despite extensive analysis, we did not detect any Aβ-related pathology in the brains of App NL/NL mice at 24 months of age. No Aβ deposits were detected in the cortex (Fig. 3b and c) or hippocampus (Fig. 3d and e), and Iba1 and GFAP immunoreactivities were similar between WT and App NL/NL mice in both regions (Fig. 3b and c for cortex; Fig. 3d and e for hippocampus), indicating the absence of reactive gliosis in the brains of App NL/NL mice. Taken together, these results suggest that App NL/NL mice developed neither Aβ deposition nor neuroinflammation in the brain, even at 24 months of age.

Discussion
In this study, we assessed the validity of App NL-G-F and App NL mice at an advanced age as models for investigating the early phase of AD pathogenesis, based on brain pathologies and performance in a spatial learning and memory task.
Several groups have reported behavioral alterations in cognitive and non-cognitive (social and emotional) domains in App-KI mice at various ages (summarized in Fig. 4 for App NL-G-F/NL-G-F mice, Fig. 5 for App NL-F/NL-F mice, and Fig. 6 for App NL/NL mice). Among these App-KI mice, App NL-G-F/NL-G-F mice exhibited the most severe brain pathology, including Aβ deposition and neuroinflammation (Fig. 3) [10,11,20]. However, the general behavioral phenotypes of App-KI mice described in many published studies are minimal [9], with the exception of robust alterations in emotional domains [14,21,22]. Accordingly, researchers have reached a consensus that the alterations in cognitive abilities of App-KI mice are very mild relative to the phenotypes of other overexpression models [11,13,21]. However, these findings are not consistent across studies: for example, in the MWM task, two groups have reported that App NL-G-F/NL-G-F mice develop no obvious deficits in spatial learning and memory up to 12 months of age [21,23], whereas another reported robust learning and memory deficits in App NL-G-F/NL-G-F mice at 6 months of age [12]. In this study, to investigate whether App NL-G-F mice exhibit deficits in learning and memory at an advanced age, we assessed the performance of App NL-G-F/NL-G-F mice in the MWM task at the age of 24 months. Our results revealed that 24-month-old App NL-G-F/NL-G-F mice exhibited a significant decline in spatial learning, and also exhibited spatial memory deficits relative to WT mice (Figs. 1 and 2), supporting the idea that sustained Aβ-related pathologies in the absence of APP overexpression are capable of inducing cognitive deficits in mice.
Another important finding from this study is that spatial learning and memory in App NL/NL mice were comparable to those in WT mice at ages up to 24 months. Moreover, in sharp contrast to App NL-G-F/NL-G-F mice, App NL/NL mice did not develop overt brain pathologies, including Aβ deposition and reactive gliosis (Fig. 3), despite overproducing human Aβ40 and Aβ42 in their brains [10,13]. These results indicate that introduction of the Swedish mutation in endogenous APP is not sufficient to produce brain pathology or cognitive deficits in mice, and suggest that Aβ-related pathology is required to induce these cognitive deficits. These results are consistent with recent studies reporting negligible cognitive deficits and the lack of Aβ plaque formation in App NL/NL mice (Fig. 6) [10,11,13,14,24].

Familial AD mutations in the APP and PSEN1 genes have been utilized to develop mouse models bearing Aβ pathologies, including amyloid plaques and cerebral amyloid angiopathy [2]. The first successful mouse models reproducing Aβ pathologies overexpressed human APP with either a single Swedish [25,26] or Indiana [27] mutation. Because Aβ aggregation strictly depends on time, concentration, and the Aβ42/Aβ40 ratio, dozens of mouse lines overexpressing several combinations of APP and PSEN1 mutations have been created to accelerate the onset of Aβ pathologies in the brain [28][29][30]. These mouse models have made significant contributions to AD research; however, they also suffer from potential off-target effects caused by non-physiological overexpression of APP/PSEN1 genes and/or by combinations of familial AD mutations that are not observed in AD patients [2,9]. To overcome these issues, KI mouse models with familial AD mutations have been generated [10,24,31,32]. To date, however, Aβ pathologies and cognitive deficits have been observed only when two or three familial AD mutations were combined [33][34][35]. This study provides further evidence that a single Swedish mutation in APP is not sufficient to reproduce Aβ pathologies and cognitive deficits in mice.
Our results also highlight the importance of age-associated factors in promoting Aβ deposition in the brain, and suggest that KI mouse models with a single familial AD mutation, such as App NL/NL mice, may be valuable tools for understanding the biology of APP mutations in AD pathogenesis. Such studies may reveal novel therapeutic targets not only for familial but also for sporadic cases of AD.

Conclusions
App NL-G-F/NL-G-F mice are an excellent model for investigating the mechanisms underlying cognitive deficits caused by Aβ-related pathologies, as well as for testing potential AD therapeutics. App NL/NL mice represent a valuable model for exploring the critical factors involved in Aβ plaque formation and associated brain pathologies.

Animals
The original lines of App-KI (App NL-G-F/NL-G-F and App NL/NL ) mice on a C57BL/6J genetic background [10] were obtained from the RIKEN Center for Brain Science (Wako, Japan) and maintained at the Institute for Animal Experimentation at the National Center for Geriatrics and Gerontology as described previously [14]. After weaning, all mice were housed socially in same-sex groups and only male mice were used for the experiments with mixed genotypes. All handling and experimental procedures were performed in accordance with the Guidelines for the Care of Laboratory Animals of the National Center for Geriatrics and Gerontology (Obu, Japan).

Morris water maze task
All experiments were performed in a white circular pool (1.0 m in diameter; O'hara & Co., Tokyo, Japan) with a light intensity at the center of the pool of approximately 80 lx. The water was maintained at 24 ± 1 °C and was made opaque using nontoxic white paint during hidden training sessions and probe tests.
First, 24-month-old mice (n = 11-17/genotype) were individually handled for 3 days before starting the experiment to acclimate them to being introduced to and removed from the pool. After the habituation period, mice were subjected to 4-day visible training sessions (four trials per day, with an intertrial interval of approximately 15 min) (Fig. 1a), in which a platform (10 cm in diameter) was made visible by attaching a black cubic landmark. The aim of the visible training was to exclude mice with motor, visual or motivational impairments. If the mouse found the platform within a 2-min time limit, the mouse remained there for 20 s. If not, the mouse was gently guided towards the platform and then stayed on it. After the 20-s period on the platform, mice were placed in a cage heated by a heating pad to dry and then transported back to their home cage. The location of the platform and the start position were changed randomly in each trial. For each trial, latency to reach the platform (s), swimming distance (cm) and swimming speed (cm/s) were automatically measured using TimeMWM software (O'hara & Co., Tokyo, Japan).

Following the visible training sessions, the mice were subjected to 7-day hidden training sessions (four trials per day, with an intertrial interval of approximately 20 min) (Fig. 1a), in which the platform was placed 0.8-1.0 cm below the water surface. Four distinct objects of different geometry were placed around the pool as spatial cues. At the end of the trial, either when the mouse had found the platform or when a 60-s time limit had elapsed, mice were allowed to rest on the platform for 15 s. Then, mice were placed in a cage heated by a heating pad to dry before returning to their home cage. The start position was changed randomly to avoid track memorization, while the location of the platform was fixed throughout the experiment. Latency (s) and distance travelled (cm) to reach the platform and swimming speed (cm/s) were also measured by the software. To quantify the efficiency of the strategy pursued in reaching the platform, path efficiency was calculated by dividing the straight-line distance between the first and last locations by the total distance travelled (Path efficiency = Distance between the starting point and the final point (the location of the platform)/Total distance travelled).
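The path efficiency ratio defined above can be sketched in a few lines of Python. This is a minimal illustration and not the TimeMWM implementation: the function name `path_efficiency` and the representation of a swim track as a time-ordered list of (x, y) coordinates in cm are assumptions made for this example, and the tracks are made-up values.

```python
import math


def path_efficiency(track):
    """Return straight-line distance / total distance travelled for one trial.

    `track` is a hypothetical list of (x, y) positions in cm, ordered in time,
    starting at the release point and ending at the platform.
    """
    if len(track) < 2:
        raise ValueError("track needs at least two points")
    # Total distance travelled: sum of the lengths of consecutive segments.
    total = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    if total == 0:
        raise ValueError("no movement recorded; efficiency is undefined")
    # Ideal path: straight line from the first to the last recorded position.
    direct = math.dist(track[0], track[-1])
    return direct / total


# Made-up tracks for illustration only.
direct_swim = [(0.0, 0.0), (30.0, 40.0)]
detour_swim = [(0.0, 0.0), (30.0, 0.0), (30.0, 40.0)]
print(path_efficiency(direct_swim))
print(path_efficiency(detour_swim))
```

Under this definition a perfectly direct swim gives a value of 1, and the value falls toward 0 as the route becomes more circuitous.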
To confirm that this spatial task was acquired based on navigation by distal cues, two probe tests were conducted as follows: the first probe test at 1 day after the sixth session (Probe test 1) and the second probe test at 7 days after the seventh session (Probe test 2) of the hidden training (Fig. 1a). In these tests, the platform was removed from the pool and mice were allowed to search for the platform for 60 s. Time spent in each quadrant (TQ, target quadrant; OQ, opposite quadrant; RQ, right quadrant; LQ, left quadrant) (s), number of crossings over the platform location and average proximity to the platform (cm) were measured by the software. Total distance travelled (cm) and swimming speed (cm/s) were also measured to rule out the involvement of motor function and motivation to search for the platform as confounding factors.
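As a rough illustration of two of the probe-test measures described above, the following Python sketch converts time per quadrant into percentages and averages the distance to the former platform location. The function names, the dictionary layout and the sample values are hypothetical and are not taken from the tracking software.

```python
import math


def quadrant_percentages(seconds_by_quadrant):
    """Convert seconds spent in each quadrant into percentages of the total."""
    total = sum(seconds_by_quadrant.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {q: 100.0 * s / total for q, s in seconds_by_quadrant.items()}


def mean_proximity(track, platform):
    """Average distance (cm) from each sampled position to the platform site."""
    if not track:
        raise ValueError("track is empty")
    return sum(math.dist(p, platform) for p in track) / len(track)


# Made-up values for a single 60-s probe test.
times = {"TQ": 30.0, "OQ": 10.0, "RQ": 10.0, "LQ": 10.0}
percentages = quadrant_percentages(times)
print(percentages["TQ"])

# Made-up sampled positions and platform coordinates, in cm.
print(mean_proximity([(0.0, 0.0), (0.0, 20.0)], (0.0, 0.0)))
```

With four equal quadrants, a mouse searching at random would be expected to spend about 25% of the time in the target quadrant, which is the chance level used in the statistical analysis.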
After the behavioral experiments, these mice received an intraperitoneal injection of a combined agent consisting of medetomidine (0.3 mg/kg), midazolam (4 mg/kg) and butorphanol (5 mg/kg). Whole brain tissues were then collected and subjected to immunohistochemistry and other experiments.

Statistical analysis
As previously described [14], statistical differences between genotypes in behavioral parameters with one dependent variable were determined by repeated-measures analysis of variance (ANOVA). When necessary, Greenhouse-Geisser estimates of sphericity were used to correct the degrees of freedom. Bonferroni post hoc comparisons were used to evaluate group differences. For comparisons of multiple means with genotype as one independent variable, one-way ANOVA followed by Tukey's post hoc test was used. A one-sample t-test was used to compare performance on the probe test of the MWM task against chance level (25%). Differences in the percentage of time spent between target and non-target quadrants during probe tests were evaluated using a paired t-test. Data are presented as mean ± SEM. All alpha levels were set at 0.05.
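The two t-tests described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the function names and the sample percentages are hypothetical, and the sketch returns only the t statistic and degrees of freedom, since converting these to a p value requires the t distribution from a statistics package.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return (t, degrees of freedom) for a one-sample t-test against mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("zero variance; t is undefined")
    sem = sd / math.sqrt(n)  # standard error of the mean
    return (statistics.mean(values) - mu0) / sem, n - 1


def paired_t(first, second):
    """Return (t, df) for a paired t-test, computed on per-animal differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(first, second)]
    return one_sample_t(diffs, 0.0)


# Made-up percentages of time in the target quadrant for three animals.
target = [30.0, 40.0, 50.0]
# Made-up percentages for the same animals in a non-target quadrant.
non_target = [20.0, 20.0, 20.0]

print(one_sample_t(target, 25.0))    # comparison against the 25% chance level
print(paired_t(target, non_target))  # target vs. non-target quadrant
```

Treating the paired test as a one-sample test on the differences keeps the two procedures consistent and makes the degrees of freedom (number of animals minus one) explicit.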