Imagine the fright of finding yourself in an unknown room, surrounded by unfamiliar people with unknown intentions, with no memory of how you got there. Imagine the frustration of not being able to control your body movements enough to get dressed. Imagine feeling so tired that you hallucinate, but nonetheless cannot obtain rejuvenating sleep. People with neurodegenerative diseases (NDs) face these challenges as their health gradually declines until their premature death. NDs not only have societal and economic consequences but also bring intense personal suffering. Sadly, the search for treatments is hindered by our limited knowledge of the diseases’ underlying molecular mechanisms.
2. Methods for Cell Type-Specific Gene Expression Analyses
The primary CST methods are RiboTag
[7] and bacTRAP
[8][9]. RiboTag uses Cre recombinase to restrict expression to cell types of interest, where an HA-tagged variant of Rpl22 (large subunit ribosomal protein 22) is expressed from its native location in the genome. In contrast, bacTRAP uses a bacterial artificial chromosome (BAC) vector to express a GFP-tagged Rpl10a (large subunit ribosomal protein 10a) from a random location in the genome. The nuances between the two methods have been discussed previously
[10], but here they are considered practically equivalent and collectively categorized as CST methods.
Essentially, CST methods function by capturing epitope-tagged ribosomes using antibodies. Ribosomes become tagged because a transgenically expressed, epitope-tagged ribosomal protein integrates into ribosomes during assembly (Figure 1). Transgenesis lets researchers restrict the expression of the tagged ribosomes to select cell types, where they function normally and translate mRNAs into proteins. The epitope tags facilitate the immunocapture of the ribosomes and their associated mRNAs from brain homogenates (Figure 1). This design offers several advantages:
- Tagged ribosomes are present only in target cells, providing an efficient separation of desired mRNAs from those of unwanted cell types.
- The captured mRNAs reflect the cell’s translatome, and thus provide insights into the proteins being synthesized by the cell [11].
- The method is compatible with frozen tissues, limiting batch variation between samples.
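The separation described in the first advantage is usually checked by comparing marker genes in the captured (immunoprecipitated, IP) fraction against the unpurified homogenate (input). The following is a minimal sketch of such a comparison, not the procedure of any cited study: the function name, the counts-per-million scaling, the pseudocount, and all read counts are illustrative assumptions, and Actb is included only as a hypothetical reference gene.

```python
import math


def log2_enrichment(ip_counts, input_counts, pseudocount=1.0):
    """Return log2(IP / input) per gene after scaling each sample to counts per million.

    Assumption: both arguments are dicts mapping gene name -> raw read count.
    """
    ip_total = sum(ip_counts.values())
    input_total = sum(input_counts.values())
    enrichment = {}
    for gene, ip_reads in ip_counts.items():
        if gene not in input_counts:
            continue  # skip genes that were not measured in both fractions
        ip_cpm = ip_reads / ip_total * 1e6
        input_cpm = input_counts[gene] / input_total * 1e6
        # The pseudocount avoids division by zero for undetected genes.
        enrichment[gene] = math.log2((ip_cpm + pseudocount) / (input_cpm + pseudocount))
    return enrichment


# Hypothetical read counts for an astrocyte-targeted capture (made-up numbers).
ip_counts = {"Slc1a3": 900, "Gad2": 20, "Actb": 1080}
input_counts = {"Slc1a3": 300, "Gad2": 400, "Actb": 1300}

enrichment = log2_enrichment(ip_counts, input_counts)
for gene, value in sorted(enrichment.items()):
    print(f"{gene}: {value:+.2f}")
```

With these made-up counts, the astrocyte marker Slc1a3 has a positive value (enriched) and the GABAergic marker Gad2 a negative value (depleted), which is the pattern a successful astrocyte-specific capture would be expected to show.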
An alternative to CST methods is scRNAseq, where cells from a tissue are physically separated and the transcriptome of each cell is analyzed. Popular methods encapsulate cells in oil droplets, which act as micro-reaction chambers for converting mRNA to cDNA (Figure 2). The process includes tagging each transcript (i.e., each molecule) to control for amplification artifacts, barcoding each transcript to assign it to a specific cell, and converting each mRNA into cDNA (Figure 2). Because the products of each cell carry a unique barcode, the products from all cells can be pooled before final amplification.
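The roles of the two tags can be illustrated with a small sketch. It assumes that each sequenced read has already been reduced to a tuple of cell barcode, gene, and molecule tag (often called a unique molecular identifier, UMI); the function name and the example reads are hypothetical, and real pipelines additionally correct sequencing errors in the tags.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count unique molecules per cell and gene.

    Assumption: each read is a (cell_barcode, gene, molecule_tag) tuple.
    Reads that share all three values are treated as amplification copies
    of one original molecule and are counted once.
    """
    unique_molecules = set(reads)  # collapses amplification duplicates
    counts = defaultdict(lambda: defaultdict(int))
    for cell_barcode, gene, _molecule_tag in unique_molecules:
        counts[cell_barcode][gene] += 1
    return {cell: dict(genes) for cell, genes in counts.items()}


# Hypothetical pooled reads from two droplets.
reads = [
    ("AAAC", "Gad2", "TTG"),
    ("AAAC", "Gad2", "TTG"),   # amplification duplicate of the read above
    ("AAAC", "Gad2", "CCA"),   # a second Gad2 molecule in the same cell
    ("GGTT", "Slc1a3", "ACG"),
    ("GGTT", "Slc1a3", "ACG"),
    ("GGTT", "Slc1a3", "ACG"),
]

counts = count_molecules(reads)
print(counts)
```

Because the cell barcode travels with every read, the reads can be pooled and amplified together and still be assigned to their cell of origin afterwards, while the molecule tag keeps heavily amplified transcripts from being overcounted.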
A key step in scRNAseq is isolating single cells, which is especially challenging for neurons in the brain, where fine processes such as dendrites and synapses are easily lost. Several dissociation methods exist that use physical or enzymatic dissociation mechanisms, but they can bias the sample quality and obscure mRNA level measurements. snRNAseq serves as a good alternative to scRNAseq, since nuclei are less prone to enzymatic and physical isolation artifacts than whole cells. Furthermore, snRNAseq has the advantage of being suitable for fixed and frozen samples, enabling studies of archived postmortem human brain samples, which are not possible with scRNAseq where fresh tissue is required. Several studies have shown significant similarity in transcriptomic analyses between scRNAseq and snRNAseq
[12][13][14][15], although microglial activation genes were depleted in snRNAseq compared to scRNAseq studies in AD
[16]. Furthermore, a comparison of different tissue dissociation protocols found differences in cell composition and further demonstrated the potential for biases in scRNAseq and snRNAseq
[17]. For example, twice the number of genes per cell were detected with snRNAseq compared to scRNAseq
[17]. A systematic analysis comparing different tissue dissociation protocols, cell lysis, sequencing, and data analysis led to the establishment of a toolbox that provides guidelines for customizing sc/snRNAseq protocols
[18].
Figure 1. (A) In RiboTag mice, the native Rpl22 gene has been engineered such that activation with Cre recombinase, expressed in specific cell types, results in the expression of Rpl22 fused with an HA epitope tag (cyan flag) specifically in those cells. Before Cre activation, the tagged exon 4 (ex4) is not expressed (ex4*). (B) A variant of this method employs a viral vector that delivers a RiboTag transgene that is also activated by Cre. The bacTRAP mice are practically the same as RiboTag mice regarding the capture of tagged ribosomes and are thus not shown. (C) illustrates the workflow, commencing with the homogenization of brain tissue, the capture of tagged ribosomes (marked with cyan-colored bubbles) with antibody-labeled paramagnetic beads, and then their subsequent analysis via RNA-seq (next-generation sequencing) and bioinformatics. (D) demonstrates a typical validation experiment wherein the purification of RiboTag-labeled ribosomes expressed in specific cell types (top label on each chart) results in the enrichment of marker genes of the desired cells and the depletion of markers of off-target cells (marker gene classes are on the left of each chart). For example, activation by Cre in cells expressing Slc1a3 (astrocyte marker) in the cerebellum or cerebrum (top two violin plots) facilitates the enrichment of astrocyte genes and the depletion of genes from neurons and other glial cells. Lower charts show the results when Cre is expressed by Cx43 (astrocytes), Gad2 (GABAergic neurons), PV (parvalbumin neurons), SST (somatostatin neurons), and vGluT2 (glutamatergic neurons). Marker gene classes include SST, PV, Panneur (pan-neuronal), Oligo (oligodendrocytes), Micro (microglia), Glut+PV (PV-expressing glutamatergic neurons), Glut (glutamatergic neurons), GABA+PV (PV-expressing GABAergic neurons), GABA (GABAergic neurons), and Astro (astrocytes). The Slc1a3 data were published in [19] and the rest were published in [20]. Mouse and neuron shapes were generously provided by https://smart.servier.com (accessed on 12 January 2024) under a CC 3.0 license.
Figure 2. Workflow for scRNAseq and snRNAseq analyses. From the top down, cells of different types are marked in different colors. Cells or cell nuclei are gently dissociated and passed through a device that orders them in single file (left side, colored dots), and then merges each cell or nucleus with a single oil droplet (grey bubbles on the right) that carries barcodes that are unique to each droplet. The contents of the droplets are then pooled, amplified, and sequenced together, which diminishes batch effects. Features and similarities of each cell are then compared using t-SNE (t-distributed stochastic neighbor embedding) plots.
How do CST and sc/snRNAseq compare? Both methods face challenges with contamination from unintended cells
[11]. CST analyzes a mix of related cell types, but captures approximately 15,000 genes per cell type, approaching the maximum number expected, since many genes are expressed only in certain organs or cell types. In contrast, sc/snRNAseq retrieves fewer genes but from distinct cells. A critical drawback of CST is the demand for artificial gene expression, usually delivered via transgenesis, rendering it unsuitable for human postmortem brain samples.
3. CST Studies of ALS
ALS primarily targets subsets of motor neurons (MNs) in the motor cortex, brainstem, and spinal cord
[21]. While most ALS cases are sporadic, there are many cases driven by mutations in several genes, of which the first to be discovered was superoxide dismutase 1 (SOD1)
[22]. Interestingly, different SOD1 mutations can have different effects on its enzymatic function and aggregation propensity, leading to mutation-specific disease changes
[23]. This, and the fact that historically SOD1 was the first gene associated with ALS, led to the production of several thoroughly studied SOD1 mutant mouse lines
[24][25].
The first application of CST in ND was with bacTRAP [26] in the loxSOD1G37R random integration mouse model of ALS [27], focusing on the MNs, astrocytes, and oligodendrocytes of spinal cords. Most of the study focused on the disease onset stage, marked at 8 months of a disease progression that reaches its terminal stage at 13.5 months. Transgene expression levels were measured and compared to the endogenous mouse SOD1 gene; mutant SOD1 was overexpressed 17-, 8-, and 21-fold in MNs, astrocytes, and oligodendrocytes, respectively. The authors then measured gene expression changes and found that MNs, the cells known to be most affected, had the highest number of DEGs at 260 (85% were increased), versus 108 for astrocytes and 23 for oligodendrocytes. Of the MN DEGs that were increased, 10% had pathway features related to synaptic structures and cell junctions [26]. Genes related to the unfolded protein response (UPR) were also upregulated, whereas genes associated with ribosome biogenesis were downregulated. This shift in gene expression might represent the cells’ adaptive strategy to reduce protein synthesis, potentially serving as a defense mechanism to counteract the detrimental effects of SOD1G37R misfolding. Notably, the study did not find alterations in mitochondrial genes, in contrast to other studies discussed later in this review, where changes to ribosome and mitochondria biogenesis appear to be coupled. Nonetheless, it is noteworthy that there were more than 10 times as many DEGs in MNs as in oligodendrocytes at this disease onset stage. According to the authors, this disparity suggests that the disease begins in MNs and subsequently impacts oligodendrocytes as it progresses [26].
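The proportions quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and the number of increased MN DEGs is an approximation derived from the rounded 85% figure rather than a value reported by the study.

```python
# DEG counts at disease onset, as quoted in the text.
deg_counts = {"motor neurons": 260, "astrocytes": 108, "oligodendrocytes": 23}

# Approximate number of MN DEGs that were increased (85% of 260).
increased_mn = round(deg_counts["motor neurons"] * 0.85)

# Ratio of DEGs in MNs to DEGs in oligodendrocytes.
mn_to_oligo_ratio = deg_counts["motor neurons"] / deg_counts["oligodendrocytes"]

# Fold overexpression of mutant SOD1 relative to endogenous SOD1, as quoted.
overexpression = {"motor neurons": 17, "astrocytes": 8, "oligodendrocytes": 21}

for cell_type, n_degs in deg_counts.items():
    print(f"{cell_type}: {n_degs} DEGs at {overexpression[cell_type]}-fold overexpression")
print(f"Increased MN DEGs (approx.): {increased_mn}")
print(f"MN-to-oligodendrocyte DEG ratio: {mn_to_oligo_ratio:.1f}")
```

Laid out this way, the numbers show that oligodendrocytes had the highest transgene overexpression but the fewest DEGs, so the DEG count does not simply track the expression level of the mutant protein.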
A more recent study of ALS employed bacTRAP to reveal differences between vulnerable and resistant MNs in layer 5b of the motor cortex [28] using SOD1G93A transgenic mice [19]. New bacTRAP lines were created with promoters that were specifically active in layer 5b, based on anatomical data. Interestingly, two of the new mouse lines, built into BAC transgenes of Colgalt2 and Gprin3, both expressed the bacTRAP protein in layer 5b of the motor cortex (M1). The bacTRAP-expressing neuronal populations had similar morphology and size, and both projected to the pons region of the brainstem, but Gprin3+ neurons also projected to the spinal cord. Importantly, with similar transgene expression levels in both cell types, by the time SOD1G93A transgenic mice reached 110 days of age, approximately 40% of the Gprin3+ neurons had died. Meanwhile, there was no noticeable decline in the number of Colgalt2+ neurons. This striking difference could be attributed to their intrinsic gene expression patterns. Gprin3+ neurons, even at baseline, exhibited an elevated propensity for oxidative phosphorylation but expressed notably low levels of genes that safeguard against oxidative damage, and the SOD1G93A transgene may have exaggerated this imbalance [28].
When expressing the SOD1G93A transgene, Gprin3+ neurons underwent dramatic gene expression changes, particularly increasing their expression of an array of mitochondrial proteins, potentially amplifying oxidative damage through the increased production of hazardous oxidation catabolites [28]. It is hard to envision how this potentially dangerous response could be beneficial to these vulnerable neurons, but it is also improbable that this coordinated response is random. Also increased were genes encoding ribosomal proteins, and a similar coordination between ribosome and mitochondria biogenesis has been reported in HD [20]. Due to the high energetic expense required to generate ribosomes, it is logical for the biogenesis of ribosomes and mitochondria to be coordinated [21]. In this context, it could be that the primary response was to increase ribosome biogenesis, leading to the increased need for mitochondrial biogenesis and metabolic activity that doomed these neurons. Interestingly, the Colgalt2+ MNs also increased the expression of both mitochondrial and ribosomal proteins, but the increase was smaller in scale (about 30% less) and their higher baseline of protection from oxidative damage likely enabled them to fare much better than Gprin3+ neurons. Notably, both neuron types decreased their expression of genes related to synapse and axon morphogenesis, with a larger effect in the Gprin3+ neurons. The authors concluded that intrinsic properties of the Colgalt2+ MNs (i.e., the higher baseline expression of antioxidant genes) determined their resistance.
To summarize the results from the ALS section, the first study [26] indicated that the biogenesis of ribosomes decreased in the most susceptible cells, despite no concurrent decrease in mitochondrial biogenesis. This reduction was accompanied by a pronounced activation of the unfolded protein response (UPR). The second study [28] compared two neuron types that are strikingly similar. While the more susceptible neurons exhibited tenfold more differentially expressed genes (DEGs) than their resistant counterparts, the overall gene expression alterations were similar. These changes in gene expression seemed to align with specific molecular pathways, most notably increased ribosome and mitochondrial biogenesis. It is important to note that when this analysis took place, 40% of Gprin3+ MNs had already succumbed. Hence, the observed changes in gene expression might either represent the adaptive strategies of the surviving neurons or be attributed to a change in the compared populations. Nonetheless, a consistent theme from both studies is that even the more resistant cells can exhibit as much disease-causing protein expression as the vulnerable ones, and the highest DEG counts typically arise in the latter. Interestingly, while the UPR was prominent in the first study, it was absent in the second, even when numerous neurons in the latter were evidently under stress.
4. CST Studies of Prion Diseases
Prion diseases are another class of rare neurodegenerative diseases. While they are most infamous for cases caused by infection (e.g., “mad cow” disease) with misfolded forms of the prion protein (PrP), they can also be caused by the inheritance of mutations in the gene encoding PrP (Prnp) or by spontaneous PrP misfolding [4]. Although rare in humans, acquired prion diseases (i.e., those caused by infection) are a problem in farmed sheep and goats in Europe and in wild cervids in North America [29][30], and are now emerging in Northern Europe with new properties [31][32]. The diseases’ infectious nature was exploited to develop some of the first mouse models of neurodegenerative diseases. Work on rodent models made biochemical purifications of the infectious agent possible and led to the discovery of PrP, and in turn, PRNP [33][34]. This led to the discovery that multiple inherited neurodegenerative diseases, including fatal familial insomnia (FFI), genetic Creutzfeldt–Jakob disease (gCJD), and Gerstmann–Sträussler–Scheinker syndrome (GSS), are caused by certain mutations in PRNP [35][36][37]. Prion diseases cause damage in specific brain regions depending on the disease subtype [4][38][39]. Furthermore, the shapes of PrP aggregates and the abundance and size of spongiform degeneration “holes”, a hallmark of prion diseases, also vary according to the disease subtype. Interestingly, PV+ neurons, a subset of GABAergic inhibitory neurons, were reported to be highly vulnerable to prion diseases in humans and rodents [40][41][42], though less severely affected in FFI [38]. Another study of acquired prion disease used a Cre-dependent TRAP method but detected changes only at the terminal stage [43]. Similarly, a scRNAseq study that focused on a near-terminal disease stage has been reported [44]. Although these reports provided novel insights into disease mechanisms, the results were focused on terminal disease stages, unlike all other studies described in this review, and thus will not be considered further.
A summary of the prion disease studies leads to several important conclusions. Since in both studies PrP was expressed from the native gene in its endogenous locus, differences in the PrP expression pattern cannot account for the different results reported in the two studies. Therefore, the first conclusion is that the same protein can unleash different gene expression responses within a specific cell type, depending on how the protein misfolds. Second, different cell types respond differently to the same misfolded protein form. Third, since glutamatergic neurons expressed PrP more highly than all other cell types studied, their mild response during prion disease suggests that the most affected cells are not necessarily those that express the protein most highly. Fourth, vulnerable cell types (i.e., PV+ neurons) can have a very mild response with surprisingly few DEGs. Fifth, the UPR is not always induced during the early stages of ND.
5. CST Studies of HD
HD is caused by the expansion of a CAG repeat in exon 1 of the huntingtin (HTT) gene, which is translated into a polyglutamine tract. While repeat lengths of 6 to 35 CAGs do not cause disease, repeats of 36 to 39 CAGs are linked to HD with incomplete penetrance, and repeats with 40 or more CAGs cause HD with high penetrance at an average age of 40 years [45][46][47]. Early HD neuropathology is characterized by the selective degeneration of GABAergic spiny projection neurons (SPNs, also known as medium spiny neurons) of the striatum, which constitute around 95% of all striatal neurons, while other striatal cell types are usually spared [48][49]. Striatal SPNs can be further subdivided into two main subtypes based on their expression of dopamine receptors (Drds). Drd2-expressing SPNs of the indirect pathway (iSPN) show higher vulnerability than Drd1-expressing SPNs of the direct pathway (dSPN) [50]. This interplay of the regional and cell type-specific vulnerability of neurons to mutant Htt (mHtt) is a subject of great interest in the HD field, but our understanding of the underlying mechanisms remains incomplete. Fortunately, recent studies employing bacTRAP, RiboTag, and snRNAseq methods have provided new details of cell type-specific responses to mHtt, as elaborated in the following section.
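The repeat-length thresholds quoted at the start of this paragraph amount to a simple lookup, sketched below. The function name and the returned labels are illustrative assumptions; lengths below 6 are not covered by the ranges given in the text and are flagged as such rather than classified.

```python
def classify_cag_repeats(n_repeats):
    """Map a CAG repeat count to the category given in the text.

    Ranges from the text: 6-35 do not cause disease, 36-39 are linked to HD
    with incomplete penetrance, and 40 or more cause HD with high penetrance.
    """
    if n_repeats < 6:
        return "outside the ranges given in the text"
    if n_repeats <= 35:
        return "not disease-causing"
    if n_repeats <= 39:
        return "HD with incomplete penetrance"
    return "HD with high penetrance"


# Example repeat lengths chosen to sit on both sides of each threshold.
categories = {n: classify_cag_repeats(n) for n in (20, 35, 36, 39, 40, 150)}
for n, label in categories.items():
    print(f"{n} CAGs: {label}")
```

The 150-repeat example falls in the same category as 40 repeats here, which is a reminder that these human thresholds say nothing about the much longer repeats used in mouse models.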
Since HD is strictly genetic, it is reasonable to assume that there is a common molecular mechanism in all patients, and knowledge derived from studies of genetically engineered mice may be relevant for HD patients. The first genetic mouse model with a disease-relevant phenotype, the R6/2 model, was generated by injecting a fragment of the mutant HTT gene from a human with HD into the pronucleus of a fertilized mouse oocyte, where it integrated into a random location [51]. These mice develop a severe neurodegenerative disease that leads to death in young adulthood, at approximately 20% of their normal lifespan. In pursuit of a more accurate model, researchers employed another strategy: a long CAG repeat was inserted into the mouse Htt gene in its native location in the genome, resulting in knock-in (KI) mice. This has been performed by multiple labs, with one of the key differences between models being that some included various amounts of human DNA sequences [52][53][54], whereas others used only the mouse Htt sequence [55][56]. Although these seemingly subtle differences can impact disease severity, the strongest determinant of severity is the length of the CAG repeat; in KI mice, 150 to 200 triplets result in only mild neurological disease, even at an advanced age, with little neuronal loss and only subtle markers of neurodegeneration. Despite these apparent shortcomings, the KI models cause gene expression changes detectable in crude tissue lysates reminiscent of those detected in similar preparations of human HD samples [57], offering valuable insights into the disease’s molecular mechanisms.
6. CST Studies of AD
AD diminishes cognition and memory by attacking brain regions linked to those functions, especially the cornu ammonis 1 (CA1) region of the hippocampus and layer II of the entorhinal cortex (ECII)
[58][59]. The locus coeruleus is also affected very early and with a high impact
[60][61][62][63], but is understudied, possibly since its connection to cognition and memory, the most studied clinical features of AD, is not well established. As with other NDs, AD neuropathology is marked by the aggregation of specific proteins. Amyloid beta (AB) peptides, typically consisting of 40 or 42 amino acids, are derived from proteolysis of the amyloid precursor protein and clump together into extracellular amyloid plaques when their stoichiometric balance is perturbed
[64][65][66]. Similarly, the microtubule-associated protein tau forms aggregates that are toxic. Tau has six typical isoforms due to alternative splicing, with zero, one, or two inserts at the N-terminus, while its C-terminal half typically contains either three or four repeating units (3R and 4R, respectively) of microtubule-binding domains
[2][67]. Furthermore, tau is subject to multiple post-translational modifications including acetylation, ubiquitylation, and phosphorylation
[2]. The imbalance of 3R/4R ratios and the subsequent aggregation and modification results in intracellular tau aggregates known as neurofibrillary tangles (NFTs), which are widely thought to be toxic in AD and related tauopathies
[2][68][69]. While amyloid plaques appear to precede NFTs, they tend to be widespread and not selective, whereas the location of NFTs more closely correlates with neuronal loss
[2][59][70].
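The count of six tau isoforms follows from combining the three N-terminal insert options with the two repeat options described above. The short sketch below enumerates them; the labels such as "0N3R" follow the shorthand commonly used for tau isoforms, which is not spelled out in the text, and the variable names are illustrative.

```python
from itertools import product

# Options described in the text: zero, one, or two N-terminal inserts,
# and either three or four microtubule-binding repeats (3R or 4R).
n_terminal_inserts = (0, 1, 2)
repeat_units = (3, 4)

# Every combination of the two features gives one isoform.
isoforms = [f"{n}N{r}R" for n, r in product(n_terminal_inserts, repeat_units)]

three_repeat = [name for name in isoforms if name.endswith("3R")]
four_repeat = [name for name in isoforms if name.endswith("4R")]

print(isoforms)
print(f"{len(three_repeat)} isoforms with 3R, {len(four_repeat)} isoforms with 4R")
```

The enumeration only shows that the splicing options yield equal numbers of 3R and 4R isoform types; it says nothing about their relative abundance, which is what the 3R/4R ratio discussed in the text refers to.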
To generate new knowledge about selective vulnerability in AD, bacTRAP mice were generated to selectively study two types of vulnerable neurons, namely excitatory neurons in CA1 and ECII, as well as five types of resistant neurons, including excitatory neurons in the CA2, CA3, and dentate gyrus of the hippocampus, and excitatory neurons in visual and somatosensory cortical areas
[71]. Translatome signatures were created for each neuron type in non-diseased mice at 5, 12, and 24 months of age, with the rationale that aging can inform about AD mechanisms, since it is an age-dependent disease. These signatures were validated by comparing each, one by one, to the transcriptome signatures of 205 human brain regions. Remarkably, neuronal signatures for each of the seven mouse regions most closely matched the signatures of the corresponding human region. The authors then built new computational models to combine the mouse signature data with human genome-wide association study (GWAS) data based on NFT pathology that had been sculpted with algorithms to include cell type-specific functional network information. They then created a new test data set composed of bacTRAP-derived translatomes of ECII neurons in 6-month-old mice genetically engineered to generate AB plaques
[72] (the authors studied only this cell type in the context of the disease). This analysis detected 1936 DEGs compared to non-transgenic controls. In contrast to many of the studies described above, ribosome-related genes were not changed; instead, several genes encoding cytoskeletal and synaptic proteins were altered, as in some of the studies described above. Beyond the cytoskeletal changes seen in many NDs, this study also yielded a potential explanation for a connection between alpha-synuclein (A-syn) aggregates and AD, as explained below.
Since the computational model included GWAS data tied to the NFT burden in AD patients, the authors were well-positioned to detect genes involved in tau processing. Indeed, they found that PTBP1 (polypyrimidine tract binding protein 1) regulates tau splicing and that its dysregulation causes an imbalance in 3R/4R tau levels. Interestingly, PTBP1 also regulated A-syn. A-syn is most infamous for its prominent role in a range of NDs known as synucleinopathies, including Parkinson’s disease, Lewy body dementia, and multiple system atrophy. However, it was also previously associated with AD
[73][74], although the molecular mechanism was poorly understood. A study by Roussarie et al. implicates A-syn’s established role at the synapse, where the neuronal cytoskeleton is most dynamic, as the key to its connection to AD
[71]. The authors concluded that high vulnerability may be anchored in ECII neurons because their processes are highly dynamic, which can easily become unbalanced with age or an AB plaque burden, leading to tau imbalance, toxic NFT formation, and its eventual spread to secondarily vulnerable regions
[2][59][70].
Interestingly, the snRNAseq studies revealed different information about selective vulnerability in AD
[75][76]. In one of the first such studies of the prefrontal cortex, it was concluded that early in disease there are many cell type-specific gene expression changes, especially downregulations, but the differences become diminished as the disease progresses
[75]. Furthermore, pathways related to myelination, in both oligodendrocytes and neurons, were prominently affected
[75]. In a later study using snRNAseq from human samples
[76], it was determined that RORB (Retinoic acid receptor-related Orphan Receptor B), a protein best characterized for its role in cortical development
[77][78][79], served as a highly useful marker to identify specifically those neurons in ECII that are most vulnerable
[76]. It was determined that these RORB+ cells in ECII have signatures very similar to those of a cell type in the superior frontal gyrus that is also vulnerable. Histological experiments revealed that vulnerable RORB+ cell types had at least two broadly different morphologies in both regions, indicating that cell shape alone (and thus probably also firing properties) is not sufficient to distinguish vulnerable from resistant cells. Interestingly, in both snRNAseq studies, inhibitory neurons showed little, if any, vulnerability [75][76].
Thus, through these three studies of AD, we have learned that functional weaknesses include the neuronal cytoskeleton and myelination, and that RORB is an excellent marker of vulnerable neurons. While these impressive results do not mean the selective vulnerability problem of AD is solved, they do outline the direction for additional experimentation. Indeed, all of these experiments neglected the locus coeruleus, which develops NFTs even before ECII and whose degeneration has been demonstrated to trigger the degeneration of hippocampal and cortical areas [60][61][62][63]. Using bacTRAP or RiboTag in RORB+ neurons in multiple regions, especially ECII, CA1, and the locus coeruleus, may build on the foundational knowledge provided by the studies reviewed here.