SUFI: an automated approach to spectral unmixing of fluorescent multiplex images captured in mouse and post-mortem human brain tissues

Background Multispectral fluorescence imaging coupled with linear unmixing is a form of image data collection and analysis that allows for measuring multiple molecular signals in a single biological sample. Multiple fluorescent dyes, each measuring a unique molecule, are simultaneously measured and subsequently “unmixed” to provide a read-out for each molecular signal. This strategy allows for measuring highly multiplexed signals in a single data capture session, such as multiple proteins or RNAs in tissue slices or cultured cells, but can often result in mixed signals and bleed-through problems across dyes. Existing spectral unmixing algorithms are not optimized for challenging biological specimens such as post-mortem human brain tissue, and often require manual intervention to extract spectral signatures. We therefore developed an intuitive, automated, and flexible package called SUFI: spectral unmixing of fluorescent images. Results This package unmixes multispectral fluorescence images by automating the extraction of spectral signatures using vertex component analysis, and then performs one of three unmixing algorithms derived from remote sensing. We evaluate these remote sensing algorithms’ performances on four unique biological datasets and compare the results to unmixing results obtained using ZEN Black software (Zeiss). We lastly integrate our unmixing pipeline into the computational tool dotdotdot, which is used to quantify individual RNA transcripts at single cell resolution in intact tissues and perform differential expression analysis, and thereby provide an end-to-end solution for multispectral fluorescence image analysis and quantification. Conclusions In summary, we provide a robust, automated pipeline to assist biologists with improved spectral unmixing of multispectral fluorescence images. Supplementary Information The online version contains supplementary material available at 10.1186/s12868-022-00765-1.

Background Multispectral fluorescence imaging with linear unmixing is a powerful approach for visualizing and quantifying multiple molecular properties of tissues and cells in a single experiment. In fluorescence microscopy, the intensity value at each pixel is proportional to the photoemission of fluorophores [1]. Spectral imaging extends this approach by recording pixel intensity values at multiple wavelength bands across the electromagnetic spectrum [2]. For each pixel, spectral unmixing (SU) aims to recover the material source (endmembers) and the proportion of each material (abundances). Since the fluorescent light emissions mix linearly [3], individual signals can be mathematically separated based on the relative contribution of each spectral signature (also known as a reference emission profile or "fingerprint" or "endmember") present in the image [3,4] in a process called "linear unmixing. " Linear unmixing can distinguish fluorophores with similar emission spectra [2,5] and effectively remove background noise and autofluorescence from the fluorophore signal [3,6]. One such autofluorescent material is lipofuscin, a yellow-brown pigment granule composed of lipid-containing residues of lysosomal digestion, which is abundant in post-mortem human brain tissue and poses a major challenge for fluorescent imaging [7].
However, existing approaches for spectral unmixing, such as linear unmixing [4], similarity unmixing [8] and LUMoS unmixing [9] are limited and have not been well optimized for assays that generate punctate signals, such as single molecule fluorescent in situ hybridization (smFISH), or in complex tissue specimens containing abundant lipofuscin autofluorescence, such as post-mortem human brain. Linear unmixing can be performed using proprietary software that accompanies the microscope used for image acquisition, for example LSM780 microscope/ZEN software (Zeiss) and Vectra Polaris Imaging System/Inform software (Akoya Biosciences), but this leads to several potential weaknesses as the exact algorithms used for unmixing are often proprietary, creating a potential "black box" in the data processing pipeline. Another major challenge of existing linear unmixing approaches is that users are often required to create individual reference spectra before unmixing, which is laborious, and may or may not be relevant for the particular image under study. Even if the creation of individual reference spectra is preferred or cannot be avoided, individual pixels still need to be selected by the experimenter for spectra generation, which is also prone to error and user bias especially for punctate signals. Lastly, and perhaps most practically, linear unmixing of large brain sections captured in 4 dimensions (x, y, z, lambda) across 6 fluorescent channels (e.g. 4-plex smFISH, nuclear stain, lipofuscin autofluorescence) is computationally intensive. Many softwares also do not allow for batch processing, and each image must be unmixed individually. This process is particularly cumbersome for large-scale datasets containing several hundred images that need to be unmixed.
Multiplex single-molecule fluorescence in situ hybridization (smFISH) using RNAscope technology (Advanced Cell Diagnostics) has emerged as a powerful approach for localizing and quantifying gene expression at cellular resolution in mouse and human brain tissues [10,11]. Complementary to other next generation sequencing technologies, smFISH allows measurement of cell-to-cell variability in gene expression [12,13] and intracellular localization of mRNA transcripts [14]. The widespread generation of single-cell RNA-sequencing (scRNA-seq) data sets in the brain has fueled a resurgence of multiplex smFISH to validate cell-type-specific molecular profiles by visualizing individual transcripts at cellular resolution in brain sections [15,16]. Combinatorial labeling and imaging with multiple probes are necessary to determine the presence or absence of specific transcripts in distinct neuronal and glia populations and characterize complex subpopulations. Imaging specimens with a large number of fluorescent labels and high autofluorescence often leads to bleed-through or cross-talk of their emission spectra. These artifacts generally complicate the interpretation of experimental findings. To this end, it is necessary to have well-separated fluorescence labels and robust spectral unmixing methods in fluorescence imaging of brain tissue. Given that these datasets are becoming increasingly large, and because precise quantification is critical [17], decoupling data collection from linear unmixing and integrating unmixing pipelines with segmentation and quantification tools would increase throughput in acquisition and analysis workflows.
In addition to linear unmixing, many other computational methods are used by biologists for spectral unmixing. Non-negative matrix factorization (NMF) is a feature extraction method that has been successfully applied to unmix multispectral images and does not require prior knowledge of spectral signatures [18,19]. However, on different runs, NMF produces equally valid yet significantly different solutions [20]. Spectral deconvolution [21] requires users to acquire and manually select each fluorophore in regions of interest. In this age of rapidly advancing machine learning techniques, several unsupervised machine learning methods have been applied to unmix spectral data. Particularly in the field of remote sensing, studies have used clustering methods to separate geological components, such as sand, water, vegetation, etc., in hyperspectral images [22][23][24]. In a recent biological study, McRae et al. used k-means clustering to unmix individual pixels in two-photon laser scanning microscopy images [9]. In this study, they assign each pixel in a multiplexed image to individual spectra, which is accomplished by initializing n clusters, where each cluster represents an individual spectrum. Although this method does not require prior information about reference spectra, the performance is sensitive to data pre-processing and robust initialization of cluster centers. Furthermore, this method fails to separate fluorophores that are substantially overlapping. Finally, plug-in options exist in FIJI/ImageJ [25,26] to unmix multispectral images [9,27], but these are limited in terms of handling autofluorescence and biologically similar structures, such as puncta belonging to different gene transcripts.
Linear unmixing assumes a Linear Mixture Model (LMM), where contributions from each endmember sum linearly [3]. Although LMM performs very well in most scenarios and reduces computational complexity, the assumption of linearity may be inappropriate in nonlinear effects such as quenching and photobleaching. Non-linear unmixing has also been explored in the field of remote sensing [28,29], but both groups of methods fail to account for spectral variability. Spectral signatures deviate from the reference spectra for various reasons, including interaction with surrounding pixels, scattering in complex tissues, autofluorescence, etc. Several methods have been developed to address the shortcomings of LMM in remote sensing data [30,31], but these strategies have not yet been applied to fluorescence microscopy data.
We developed an intuitive and more automated pipeline to address the need for improved flexibility, accuracy, and throughput of fluorescence imaging spectral unmixing of brain tissues. Notably, we automate the process of endmember selection using vertex component analysis (VCA) [32] such that users do not have to manually select pixels for generation of reference spectra. We demonstrate the utility of VCA on both single positive and multiplexed images. We then use the estimated endmembers to unmix fluorescence images with various unmixing methods derived from remote sensing and hyperspectral imaging. We validate the proposed methods' accuracy by comparing the unmixing results obtained using ZEN Black software from Zeiss. We also provide scripts to run unmixing in parallel on several images using a High-Performance Cluster (HPC). Finally, we integrate our unmixing pipeline with the computational tool dotdotdot [17] to provide an end-toend solution for analysis and quantification of smFISH fluorescence imaging data from mouse and post-mortem human brain specimens.

Sample preparations Animals
Wild-type mice were purchased from Jackson laboratories (Bar Harbor, ME, C57BL6/J; stock #000664). All mice were housed in a temperature-controlled environment with a 12:12 light/dark cycle and ad libitum access to standard laboratory chow and water. Euthanasia was performed by rapid cervical dislocation. All experimental animal procedures were approved by the JHU Institutional Animal Care and Use Committee.

RNAscope single molecule fluorescent in situ hybridization (smFISH) in cultured neurons
Mouse cortical neurons were cultured on a 96-well ibidi optical bottom culture plate as previously described [33,34]. Briefly, following rapid cervical dislocation of timedpregnant dams, embryos were removed and cortices dissected and incubated with papain (Worthington, catalog number LK003150). The tissue was triturated to obtain single cells, which were plated in FBS-containing media for 24 h. Following complete media replacement on day in vitro (DIV) 1, partial media exchanges were performed every 3 days. In situ hybridization assays were performed with RNAscope technology using the RNAscope Fluorescent Multiplex kit V2 and 4-plex Ancillary Kit (catalog numbers 323100, 323120 ACD, Hayward, CA) as above, with the exception of the protease step, where cultured cells were treated with a 1:15 dilution of protease III for 30 min. Cells were incubated with probes for Arc, Bdnf exon 1, Bdnf exon 4, and Fos (catalog number 316911, 457321-c2, 482981-c3 and 316921-c4, ACD, Hayward, CA) and stored overnight in a 4× saline sodium citrate (SSC) buffer. After amplification, probes were fluorescently labeled with Opal dyes (PerkinElmer; Opal520 was diluted 1:500 and assigned to Fos, Opal570 was diluted 1:500 and assigned to Bdnf exon 1, Opal620 was diluted 1:500 and assigned to Bdnf exon4 and Opal690 was diluted 1:500 and assigned to Arc) and stained with DAPI (4′,6-diamidino-2-phenylindole) to label nuclei, then stored in phosphate buffered saline (PBS) at 4 °C.

RNAscope smFISH in mouse brain tissue
Mice were sacrificed by rapid cervical dislocation as previously described, to ensure that transcription of activity-regulated gene programs were not triggered by anesthesia [17]. Mouse brain was extracted and rapidly frozen in 2-methylbutane (ThermoFisher), and stored at − 80 °C until slicing. Sixteen micrometer coronal sections were prepared using a Leica CM 1520 Cryostat (Leica Biosystems, Buffalo Grove, IL) and mounted onto glass slides (VWR, SuperFrost Plus). RNAscope was performed using the Fluorescent Multiplex Kit V2 (Cat # 323100, 323120 ACD, Hayward, California) according to manufacturer's instructions as previously described [17]. Sections were incubated with specific probes targeting Gal, Th, Bdnf exon 9, and Npy (Cat # 400961, 317621-C2, 482981-C3, 313321-C4, ACD, Hayward, California) and were incubated at 40 °C with a series of fluorescent Opal Dyes (Perkin Elmer; Opal690 diluted at 1:500 and assigned to Npy; Opal570 diluted at 1:500 and assigned to Th; Opal620 diluted at 1:500 and assigned to Bdnf; Opal520 diluted at 1:500 and assigned to Gal). DAPI was used to label nuclei and slides were coverslipped with FluoroGold (SouthernBiotech).

Immunofluorescence staining in post-mortem human Alzheimer's disease (AD) brain
Post-mortem human brain tissue was obtained by autopsy from the Offices of the Chief Medical Examiner of Maryland, all with informed consent from the legal next of kin collected under State of Maryland Department of Health and Mental Hygiene Protocol 12-24. Clinical characterization, diagnoses, and macro-and microscopic neuropathological examinations were performed on all samples using a standardized paradigm. Details of tissue acquisition, handling, processing, dissection, clinical characterization, diagnoses, neuropathological examinations, and quality control measures have been described previously [35]. Alzheimer's disease diagnosis comprise standard neuropathology ratings of Braak staging schema [36] evaluating neurofibrillary tangle burden, and the CERAD scoring measure of senile plaque burden [37]. An Alzheimer's likelihood diagnosis was then performed based on the published consensus recommendations for post-mortem diagnosis of Alzheimer's disease [2] as with prior publications [38,39].

Fluorescent imaging
Using a Zeiss LSM780 confocal microscope equipped with 20× (0.8 NA) and 63× (1.4NA) objectives, a GaAsP spectral detector, and 405, 488, 561, and 633 lasers, lambda stacks were acquired in z-series with the same settings and laser power intensities. Stacks were linearly unmixed in ZEN software using previously created reference emission spectral profiles [17] and saved as Carl Zeiss Image ".czi" files to retain image metadata. Raw lambda stacks were unmixed with SUFI and compared to ZEN unmixed results. Single-fluorophore positive fingerprints were generated from samples prepared as above.

Reference spectral profile creation in ZEN software for validation
Reference emission spectral profiles, referred to as 'fingerprints' or 'endmembers' , were created for each Opal dye in ZEN software and validated for specificity as previously described [17]. Briefly, a control probe against the housekeeping gene Polr2a was used to generate four single positive slides in mouse brain tissue according to manufacturer's instructions. Mouse tissue was used for the absence of confounding lipofuscin signals and therefore lower tissue autofluorescence. Polr2a was labeled with either Opal520, Opal570, Opal620, or Opal690 dye to generate single positive slides. For DAPI, a single positive slide was generated using identical pre-treatment conditions without probe hybridization. To create a fingerprint for lipofuscin autofluorescence, a negative control slide was generated using a 4-plex negative control probe against four bacterial genes (Cat #321831, ACD, Hayward, CA) in DLPFC tissue. All Opal dyes were applied to the slide, but no probe signal was amplified due to the absence of bacterial gene expression. Within a field of view, a single pure region of interest (i.e. any isolated strong puncta for Polr2a slides) was manually selected with the crosshair tool in ZEN software to generate a spectral reference profile.
In a similar approach, reference emission spectral profiles were generated for immunofluorescent staining in post-mortem human AD brain tissue. For amyloid plaques and tau tangles, each single positive slide was prepared by labeling β-amyloid (Abeta) or phospho-tau (pTau) with appropriate primary and secondary antibodies conjugated with Alexa fluor (AF) 488 and AF555, respectively. A lipofuscin fingerprint was created for human AD brain tissue using a negative control slide treated only with fluorescently labeled secondary antibodies in the absence of primary antibodies. For DAPI-stained nuclei and MAP2-positive neurites (labeled with AF633), single positive slides were generated using mouse brain tissue to avoid lipofuscin autofluorescence.
In Eq. (1), F denotes the fluorescence intensities of n pixels recorded in C different spectral channels. S is the spectral signatures of k fluorophores, and A is the abundance of each fluorophore in each pixel. To this end, the unmixing process is usually divided into three different steps: (i) estimation of the number of endmembers, (ii) extraction of endmembers, (iii) estimation of abundance. In fluorescence microscopy, the number of endmembers is known in advance. We discuss the latter two steps below.

Automated extraction of spectral signatures
An essential part of the proposed pipeline is the automated extraction of the spectral signatures, or 'endmembers, ' from the observed multispectral image. To achieve this, we use the Vector Component Analysis (VCA)-an Endmember Extraction Algorithm [32] that can be used to extract fingerprints (i.e. spectral signatures) from multiplex lambda stacks. We approach the extraction of fingerprints in two different ways, (i) Using lambda stacks acquired from the single positive slides discussed above to extract fingerprints for individual fluorophores. (ii) Using a multiplex lambda stack and extracting fingerprints for all fluorophores in one go. We discuss the pros and cons of each method and provide additional details in the Results section.

Estimation of abundance
This step involves the estimation of the proportion of different fluorophores in each pixel. Here we implement and compare three different methods derived from remote sensing and adapt them for unmixing in fluorescence microscopy: (i) fully constrained least square unmixing (FCLSU) algorithm [43] tries to minimize the squared error in the linear approximation of multispectral image, imposing the non-negative constraint and the sum-to-one constraint for the abundance calculations. (ii) extended linear mixing model (ELMM) algorithm [30] extends the idea of FCLSU unmixing by taking into account the spectral variability-particularly, scaling of reference spectra. (iii) generalized extended linear mixing model (GELMM) algorithm [31] extends ELMM to account for complex spectral distortions where different wavelength recordings are affected unevenly.

SUFI toolbox
SUFI is a MATLAB-based command line toolbox for automated spectral unmixing of fluorescent images. Briefly, the analysis pipeline involves data normalization, automated extraction of spectral signatures using VCA algorithm, and application of spectral unmixing algorithms ( Fig. 1). Bio-formats toolbox 'bfmatlab' , which is compatible with images acquired on multiple microscope systems, is used to read the image data into a MATLAB structure with fields containing fluorescent channels, DAPI and lipofuscin. SUFI toolbox is publicly available at https:// github. com/ Liebe rInst itute/ SUFI.

Performance metrics
The root mean squared error (RMSE) between a true image (y ref ), i.e. ZEN unmixed image and its estimate (y est ) i.e. FCLSU (or ELMM or GELMM) unmixed image is defined as, The structural similarity index (SSIM) is based on the computation of luminance, contrast and structure of true image vs. estimated image [44]. The range of values are between [0, 1] with a value of SSIM = 1 indicating 100 percent structural similarity. where y est represents the cardinal of y est .

Data segmentation with dotdotdot
Using previously published MATLAB scripts [17], we automatically segment and quantify nuclei and RNA transcripts using SUFI generated unmixed outputs. Briefly, the dotdotdot processing pipeline involves smoothing, thresholding, watershed segmentation, autofluorescence masking, and dot metrics extraction. Specifically, adaptive 3D segmentation is performed on image stacks using the CellSegm MATLAB toolbox, and nuclei are further separated using the DAPI channel and 3D watershed function. Single dots are detected using histogram-based thresholding and assigned to nuclei based on their 3D location in the image stack. Lipofuscin signal is used as a mask to remove pixels confounded by autofluorescence.

Results
In this section, we compare experimental results from four unique biological datasets (cultured mouse neurons, mouse brain tissue, post-mortem human DLPFC brain tissue, and post-mortem human AD brain tissue) using the FCLSU [43], ELMM [30], and GELMM [31] unmixing methods. These methods are derived from the field of remote sensing and optimized to our use case of complex brain tissues. Performance was evaluated in comparison to linearly unmixed images from ZEN Black software (ZEN) using three different metrics: Root Means Squared Error (RMSE), structural similarity index (SSIM), and Sørensen-Dice similarity coefficient (DICE). For all three metrics, values range between 0 and 1 with better performance indicated by lower RMSE values and higher SSIM and dice similarity values. Runtime values provided are for unmixing on an individual multiplex lambda stack.

Automated extraction of fingerprints using vertex component analysis
Given the widespread use of multiplex smFISH in brain tissues, we first introduce the automated extraction of (6) s y est , y est + y ref spectral signatures (i.e., fingerprints) using RNAscope smFISH in mouse brain and post-mortem human DLPFC (Fig. 2). Single positive images of DAPI, lipofuscin autofluorescence, and RNA transcripts labeled with Opal520, Opal570, Opal620, Opal690 dyes are run through the extraction pipeline. For each single positive lambda stack, two fingerprints are extracted-one representing the fluorophore's spectral signature and the second representing the background noise's spectral signature.
In order to validate these VCA extracted spectral fingerprints, we manually selected pure pixels from every single positive image and extracted spectral signatures using ZEN software. Plotting VCA-extracted fingerprints against manually extracted ones demonstrates that VCA was able to extract each fingerprint robustly (Fig. 2). RMSE values between these curves are calculated and presented within each subplot and further show that the fingerprints extracted through VCA are similar to manually extracted fingerprints.
Although it is common to generate single positive samples for spectral imaging experiments, limited reagents or specimen availability may prohibit the generation of single positive slides. To address this, we attempted to extract all fingerprints using the multiplex lambda stack instead of lambda stacks acquired from single positive slides. We use the lambda stack acquired from post-mortem human DLPFC in Fig. 2 to extract seven pure spectral signatures (i.e., DAPI, Opal520, Opal570, Opal620, Opal690, lipofuscin, and background noise). Plotting and comparing fingerprints extracted from the multiplex lambda stack against single positive extracted fingerprints shows that DAPI and Opal fingerprints are very similar, but the lambda stack approach fails at extracting the lipofuscin fingerprint (Additional file 1: Figure S1). Discrepancies for accurately extracting lipofuscin could be due to (i) a smaller fraction of lipofuscin pixels present in the multiplex lambda stack and (ii) the spectral signature of lipofuscin is similar to that of Opal520, thus adding more uncertainty while classifying individual pixels. In summary, we show that VCA can be used as an alternative approach to manual fingerprint generation as long as the fluorophores are spectrally distinct.

Spectral unmixing of RNAscope multiplex smFISH in cultured neurons
Having demonstrated successful automated extraction of fingerprints, we first applied reference profiles for unmixing of cultured mouse neuron smFISH data, which lacks strong autofluorescence that is often observed in tissue slices from human brain. These samples contained 5 fluorophores: nuclei stained with DAPI (blue), Fos transcripts labeled with Opal520 (green), Bdnf exon 1 transcripts labeled with Opal570 (yellow), Bdnf exon 4 transcripts labeled with Opal620 (red) and, Arc transcripts labeled with Opal690 (maroon). Due to a second peak in the DAPI emission spectrum (Fig. 2), the Fos signal in green overlaps with the spectrally neighboring blue nuclei signal. We see similar overlap in emission spectra for other combinations of Opal dyes, i.e., Bdnf exon 1 in yellow overlapping with Fos signal in green, and Arc signal in maroon overlapping with Bdnf exon 4 in red. Despite spectral overlap in reference emission profiles, all three spectral unmixing In each subplot, normalized pixel intensity is plotted against wavelength. (i) Solid lines represent the fingerprints extracted manually using ZEN Black software. (ii) Dotted lines with bubbles represent fingerprints extracted automatically using VCA. The color corresponds to peak wavelength for DAPI and Opal dyes. Lipofuscin is pseudo-colored to black. Root mean squared error (RMSE) between the two lines is calculated for each set of fingerprints algorithms (i.e., FCLSU, ELMM, and GELMM) were able to separate individual fluorophore channels from the multiplex lambda stack (Fig. 3). For DAPI, we found that each algorithm performed exceptionally well, scoring 95% + dice similarity coefficient and low RMSE values (Table 1). We observe similar results for Opal520 with a 97% dice similarity coefficient and even lower RMSE scores (Table 1). Due to the long tail of Opal570 emission spectra, there is some overlap between Bdnf exon 1 in yellow and Bdnf exon 4 in red (Fig. 2). Consequently, the performance of all unmixing algorithms is reduced for Opal570 and Opal620 fluorophores (Fig. 3C, D). Finally, both FCLSU and GELMM accurately unmixed Opal690 signals, scoring 88% dice similarity. On average, FCLSU performed the best, scoring 78% dice similarity with a run time of 10 min, GELMM

Spectral unmixing of RNAscope multiplex smFISH in mouse brain tissue slices
Next, we attempted to test the unmixing algorithms on a lower magnification dataset with more densely packed cells. To this end, we applied FLCSU, ELMM, and GELMM algorithms on RNAscope smFISH data collected from mouse brain tissue at 10× magnification. These samples contained 5 fluorophores: cell nuclei stained with DAPI, Gal transcripts labeled with Opal520, Th transcripts labeled with Opal570, Bdnf exon 9 transcripts labeled with Opal620, and Npy transcripts labeled with Opal690. Although the unmixed images using the three algorithms look qualitatively similar to unmixed ZEN images (Fig. 4), the performance parameters differ significantly ( Table 2). This is partly due to densely packed cells resulting in more mixed pixels, thereby increasing the ambiguity in labeling individual pixels. For instance, due to the second smaller peak in the DAPI emission spectrum (Fig. 2), the Gal signal in green spectrally overlaps with the nuclei signal in blue (Fig. 4B-B‴). This overlap results in DAPI pixels having spectral signatures of variable amplitude and shape. For DAPI, we see ELMM outperform FCLSU with a dice similarity score of 85% and RMSE of 0.0035 (Table 2). While performance is consistent across different algorithms for Opal520 and Opal570 channels, we see a drop in performance for Opal620 and Opal690 channels with dice similarity around 50% for both fluorophores. On average, ELMM performed the best, scoring 67% dice similarity with a run time of 296 min, followed by GELMM scoring 65% dice similarity with a run time of 613 min, and FCLSU scoring 64% dice similarity with a run time of 6 min.

Spectral unmixing of RNAscope multiplex smFISH in post-mortem human brain tissue
Autofluorescence is a common, but undesired, signal in fluorescence microscopy that can confound signals from labeled biological targets. This background signal often has relatively higher intensity and a broader emission spectrum than fluorophores labeling targets [45]. Autofluorescence can come from extracellular components or specific cell types and can be more pervasive in certain wavelength bands [46]. In human brain tissue, one such autofluorescent material is lipofuscin, a product of lysosomal digestion that accumulates with aging in brain cells [7] and also in a group of neurodegenerative disorders classified as neuronal ceroid lipofuscinoses, which are characterized by dementia, visual loss, and epilepsy. We next evaluated the ability of the different unmixing algorithms to separate lipofuscin autofluorescence in post-mortem human dorsolateral prefrontal cortex (DLPFC) using a previously published dataset [17]. This experiment included 5 fluorophores to label different cell types or cell compartments: cell nucleus stained with DAPI, MBP transcripts labeled with Opal520 (marking oligodendrocytes), SLC17A7 transcripts labeled with Opal570 (marking excitatory neurons), GAD1 transcripts labeled with Opal620 (marking inhibitory interneurons), and SNAP25 transcripts labeled with Opal690 (marking all neurons). Like other fluorophores, lipofuscin autofluorescence can be treated as an additional channel and unmixed with its own spectral signature (Fig. 2). We see that all unmixing algorithms were able to isolate the autofluorescence signal (Fig. 5F′-F‴) and were consistent with high dice similarity scores and low RMSE values (Fig. 5, Table 3). Overall, ELMM did best with a dice similarity score of 80% and a runtime of 182 min. FCLSU came second with a dice similarity score of 78% and a runtime of 6 min. GELMM scored a dice similarity score of 76% and a runtime of 434 min. To assess the variability in performance across a set of images, we unmixed 48 DLPFC RNAscope images from Maynard et al. [40] (https:// doi. org/ 10. 6084/ m9. figsh are. 12104 082. v1) with FCLSU. We focused on FCLSU for this batch analysis given its faster runtime. We compared FCLSU unmixing results to those from ZEN across images using RMSE, dice similarity, and SSIM (Table 4). For all metrics, Fig. 4 Spectral unmixing of smFISH data in a mouse brain tissue section. Nuclei stained with DAPI, Gal transcripts labeled with Opal520, Th transcripts labeled with Opal570, Bdnf exon 9 transcripts labeled with Opal620, and Npy transcripts labeled with Opal690. Images shown are 2D maximum intensity projections of 3D z-stacks. A-E Linear unmixing results using ZEN software. A′-E′ Spectral unmixing results using FCLSU algorithm. A″-E″ Spectral unmixing results using ELMM algorithm. A‴-E‴ Spectral unmixing results using GELMM algorithm standard deviation across channels and images was low, suggesting that FCLSU performs consistently over a large set of images.

Spectral unmixing of immunofluorescence data from post-mortem human Alzheimer's disease brain tissue
Having established that all algorithms performed robustly on different biological samples subjected to smFISH, we lastly wanted to explore the performance of the three algorithms on a different type of fluorescent data acquired using immunofluorescence staining of postmortem human brain tissue from donors with Alzheimer's disease. In this experiment, we also used a different set of fluorophores (Alexa dyes) that were assigned as follows to label nuclei, amyloid beta plaques, neurofibrillary tangles, and neuronal dendrites: the cell nuclei are stained with DAPI, Abeta protein is labeled with Alexa fluor (AF)488, phoso-Tau protein is labeled with AF555, and MAP2 protein is labeled with AF633. Spectral signatures for these dyes extracted using VCA are reported in Additional file 1: Figure S2. Similar to smFISH data acquired with Opal dyes, we found each algorithm to be robust at unmixing individual fluorophore channels and autofluorescence for immunofluorescence data acquired with Alexa dyes (Fig. 6). In terms of performance metrics, pixel-wise RMSE values are consistently under 0.01 (Table 5) and dice similarity scores are consistent across different channels and algorithms. Overall, FCLSU performed best with a dice similarity score of 70% and a runtime of 4 min. GELMM followed with a dice similarity score of 68% and a runtime of 140 min. Finally, ELMM scored a dice similarity score of 66% and a runtime of 41 min.

Data segmentation and quantitative analysis using dotdotdot framework
Finally, to evaluate the accuracy of other unmixing methods compared to ZEN software, we integrate our automated unmixing pipeline with our previously published dotdotdot image segmentation and analysis framework for smFISH data [17]. First, we apply FCLSU, ELMM, GELMM, and ZEN-based unmixing on raw smFISH images of post-mortem human dorsolateral prefrontal cortex (DLPFC) using the same dataset from Maynard et al. 2020 described in Fig. 5 [40]. dotdotdot automatically segments nuclei and individual transcript channels (Fig. 7) and provides several metrics including dot number, dot size, and dot intensity. We quantified the number of segmented objects for each channel in Table 6. For DAPI Opal520, and Opal620, we find the number of objects to be equal or consistent across all algorithms. However, we see an increase in the number of objects for Opal570 when unmixed with FCLSU, ELMM, or GELMM versus ZEN. On the other hand, we see a decrease in the number of segmented objects for Opal690 and lipofuscin when unmixed with FCLSU, ELMM, or GELMM versus ZEN. This is potentially due to the proximity of Opal570 and Opal690 puncta as seen in Fig. 7. Overall, we found the segmentation results of the three different algorithms consistent with those of ZEN-unmixed data [17].

Discussion
With increasing sophistication in the design and probing of biological targets, there is greater demand for novel imaging technologies that offer enhanced sensitivity, reliable acquisition, and the ability to resolve several targets  5 Spectral unmixing of smFISH data in post-mortem human DLPFC brain tissue. Cell nucleus is stained with DAPI, MBP transcripts labeled with Opal520, SLC17A7 transcripts labeled with Opal570, GAD1 transcripts labeled with Opal620, and SNAP25 transcripts labeled with Opal690. Images shown are 2D maximum intensity projections of 3D z-stacks. A-F Linear unmixing results using ZEN software. A′-F′ Spectral unmixing results using FCLSU algorithm. A″-F″ Spectral unmixing results using ELMM algorithm. A‴-F‴ Spectral unmixing results using GELMM algorithm simultaneously. One such technique, multispectral fluorescence imaging, allows for the observation and analysis of several elements within a sample-each tagged with a different fluorescent dye. Combining multiple fluorescent probes offers a higher level of information from the same sample [47,48], but may also lead to mixed signals [41]. Existing spectral unmixing methods solve this problem to some extent, but their accessibility and applicability is limited, especially for smFISH in rodent and human brain tissues, and often requires manual intervention. Our goal here was to introduce and systematically evaluate several spectral unmixing algorithms derived from the field of remote sensing that provide a flexible, robust, and automated approach to disentangle multiplex fluorescent images. We aimed to address several drawbacks to existing spectral unmixing methods for brain tissues and create a user-friendly, end-to-end pipeline that could generate reference spectra and perform unmixing. First, we aimed to address the challenge that spectral unmixing is often performed using proprietary software that typically accompanies the microscope used for data collection. This leads to several roadblocks regarding cost, throughput, and customization. For instance, it can be challenging to acquire and unmix data on the same computer due to the computational and time requirements, which often necessitates the purchase of additional expensive software licenses. While ZEN software offers a free "lite" version, the full license is required to run unmixing algorithms and each image must be unmixed individually. Also, unmixing algorithms are often not transparent, which leads to a "black-box" in the analysis pipeline that can limit optimization and troubleshooting capabilities. Additionally, for large lambda stacks across many z-dimensions, spectral unmixing is a computationally-intensive-yet entirely parallelizable-task. The time spent unmixing individual images in between acquisition sessions could instead be directed for additional data collection. For example, in a single imaging session, researchers could spend 3-4 h continuously imaging, which reduces to 2-3 h of imaging when needing to unmix individual images in real time in a typical RNAscope experiment. The time complexity and inability to unmix in batch mode is daunting when several hundred images need to be collected and unmixed. We address these analytical limitations for spectral imaging by providing a MATLAB-based unmixing solution that leverages the implicit parallelizability of unmixing each image independently. By decoupling the data collection and unmixing processes, we aim to improve analysis efficiency and consistency.  Secondly, unmixing methods based on linear mixing models (LMM), such as linear unmixing, spectral deconvolution, and similarity unmixing, requires the user to choose the individual reference spectrum before unmixing. Manually selecting pure pixels in a single positive or multiplex lambda stack for fingerprint generation is a labor-intensive task that is also prone to human bias since pure pixels are variable between images or even within the same image, which may result in the generation of variable reference profiles. Also, background noise and autofluorescence need their own spectral signatures, which are more challenging to estimate manually. We solve this by automating the extraction of spectral signatures using the vertex component analysis (VCA) algorithm, which alleviates the need to manually select pixels for fingerprint generation. Furthermore, Fig. 6 Spectral unmixing of immunofluorescence data from post-mortem Alzheimer's disease brain tissue. The cell nuclei are stained with DAPI, Abeta protein is labelled with AF488, pTau protein is labeled with AF555, MAP2 protein is labeled with AF633. Images shown are 2D maximum intensity projections of 3D z-stacks. A-E Linear unmixing results using ZEN software. A′-E′ Spectral unmixing results using FCLSU algorithm. A″-E″ Spectral unmixing results using ELMM algorithm. A‴-E‴ Spectral unmixing results using GELMM algorithm background noise and lipofuscin autofluorescence are separated as individual channels with their own spectral signatures using VCA (Fig. 2). We demonstrate spectral signature generation using both single positives as well as a multiplexed image if single positives are not available. Lastly, unmixing methods based on unsupervised machine learning algorithms, like clustering, have been introduced in multispectral fluorescence microscopy [9]. Although these methods do not require prior information about the reference spectrum, the performance is sensitive to data conditioning and cluster centers' initialization, which means that we could end up with different unmixed channels for different cluster center initialization. These approaches also tend to perform worse when substantial spectral overlap exists between fluorophores. On the other hand, the only pre-processing step for our pipeline is data normalization, and our approach handles overlapping fluorophores commonly used for smFISH successfully (Fig. 5).
In this age of interdisciplinary research, machine learning methods have bought several seemingly unrelated fields closer together. One such field that shares a similar conceptual framework to spectral unmixing is satellite imaging [22,49,50]. Particularly in the field of remote sensing, studies have used spectral unmixing methods to disentangle geological channels, such as sand, water, vegetation, etc., from hyperspectral images captured by satellites [23,24]. However, spectral unmixing in fluorescence microscopy has several advantages. First, the number of endmembers (fingerprints or spectral signatures) is known in advance based on the number of fluorophores or dyes utilized by the experimenter. Typically, with remote sensing images, the number of endmembers is not known and often calculated as the first step during the unmixing process. Second, due to the limited spatial resolution of satellite images, a large proportion of individual pixels are a mixture of two or more geological components. However, in fluorescence images, individual pixels specify a single fluorophore unless a colocalized label was designed. Conversely, remote sensing images have the upper hand when it comes to the number of spectral bands. Typically, satellite imagery results in several hundred spectral bands, whereas fluorescence microscopy has 30-50 bands. In our imaging setup, data is collected at 32 spectral bands spanning the visual spectrum ~ every 8 nm. Increasing the number of detectors and therefore increasing the number of spectral bands will improve unmixing performance in fluorescent biological samples.
We acknowledge that there are limitations to the methods introduced in this paper. First, these methods assume a linear mixture model (LMM) to disentangle mixed signals mathematically. Although LMM performs very well in most scenarios and reduces computational complexity, the assumption might be inappropriate when non-linear effects such as quenching and photobleaching are used. Second, our methods require that the number of fluorophores is equal to or lesser than the number of spectral bands. If the number of fluorophores is greater than the number of spectral bands, we end up with an underdetermined system that cannot be resolved. Third, any increase in the dimension of images leads to an exponential increase in runtime. We used 1024 × 1024 images in our experiments and ran all unmixing methods on a 40 member cluster node. On average, FCLSU took 6 min, ELMM took 180 min, and GELMM took 400 min. For Table 5 Unmixing results using a single image of immunofluorescence data from a post-mortem human brain tissue section from an AD donor Here we compare the performance of SUFI against the ZEN unmixed image using three metrics (RMSE, dice similarity, SSIM) for all three methods (FCLSU, ELMM, GELMM). For the three metrics, values range between 0 and 1 with better performance indicated by lower RMSE values and higher SSIM and dice similarity values. Mean over all channels is presented as last column   Figure S2) accurately. Typically, single positive images are generated to validate fluorophores before spectral imaging. Although we show that fingerprints can be extracted directly from the multiplex lambda stack (Additional file 1: Figure S1), this method fails to extract significantly overlapping fingerprints (i.e., Opal520 and lipofuscin). Future studies should evaluate data driven approaches to extract spectral signatures and explore deep learning-based methods for spectral unmixing.

Conclusion
We present a robust, flexible, and automated tool for spectral unmixing of fluorescent images acquired in brain tissues. Notably, we automate the process of endmember selection using vertex component analysis (VCA). We then provide several unmixing methods derived from remote sensing to disentangle multiplex lambda stacks. We demonstrate these algorithms using four biologically unique fluorescence imaging datasets. Finally, we provide a MATLAB toolbox that can be readily adopted by the scientific community for their unmixing needs and bundled this pipeline with our previous smFISH segmentation/quantification tool dotdotdot.