Approximately 70 human RNA-binding proteins (RBPs) contain a prion-like domain (PrLD). PrLDs are low-complexity domains that possess a similar amino acid composition to prion domains in yeast, which enable several proteins, including Sup35 and Rnq1, to form infectious conformers, termed prions. In humans, PrLDs contribute to RBP function and enable RBPs to undergo liquid–liquid phase transitions that underlie the biogenesis of various membraneless organelles. However, this activity appears to render RBPs prone to misfolding and aggregation connected to neurodegenerative disease. Indeed, numerous RBPs with PrLDs, including TDP-43 (transactivation response element DNA-binding protein 43), FUS (fused in sarcoma), TAF15 (TATA-binding protein-associated factor 15), EWSR1 (Ewing sarcoma breakpoint region 1), and heterogeneous nuclear ribonucleoproteins A1 and A2 (hnRNPA1 and hnRNPA2), have now been connected via pathology and genetics to the etiology of several neurodegenerative diseases, including amyotrophic lateral sclerosis, frontotemporal dementia, and multisystem proteinopathy. Here, we review the physiological and pathological roles of the most prominent RBPs with PrLDs. We also highlight the potential of protein disaggregases, including Hsp104, as a therapeutic strategy to combat the aberrant phase transitions of RBPs with PrLDs that likely underpin neurodegeneration.
Protein misfolding unites diverse neurodegenerative diseases
The problem of neurodegeneration remains a pressing public health concern and a biologic black box [1,2]. Age-related neurodegenerative diseases such as Alzheimer's disease (AD), Parkinson's disease (PD), amyotrophic lateral sclerosis (ALS), frontotemporal dementia [FTD, the clinical disorder resulting from frontotemporal lobar degeneration (FTLD)], and Huntington's disease (HD) lead to cell death within the central nervous system (CNS) and progressive CNS dysfunction [4–9]. ALS pathology also extends to the peripheral nervous system [5,10]. Our lack of understanding of the mechanisms and risk factors governing the development and progression of neurodegenerative diseases has largely precluded the development of disease-reversing therapeutics [4,11,12]. Symptomatic treatments are available for PD and AD, but the efficacy of these can be modest or limited by problematic side effects, and they do not address the root cause of disease [12,13].
Despite dramatic differences in characteristic age of onset, symptomatology, and regional involvement of CNS tissue, neurodegenerative disorders are united on a cellular and biochemical level by the accumulation of misfolded proteins in the brain [5–7,14–16]. Cytoplasmic inclusions of α-synuclein in the neurons of the substantia nigra pars compacta and other brain regions are a hallmark feature of PD [8,16,17]. In AD, intracellular tangles of misfolded tau protein in conjunction with extracellular plaques of aggregated amyloid-β are defining features found in the neocortex and hippocampus [9,16,18,19]. In HD, a genetic trinucleotide repeat expansion leads to an elongated polyglutamine tract in the protein huntingtin, causing it to form both nuclear and cytoplasmic amyloid inclusions [18,19]. In addition, repeat-associated non-ATG (RAN) translation occurs in several diseases caused by repeat expansions, including spinocerebellar ataxia type 8 (SCA8), myotonic dystrophy type 1, fragile X-associated tremor ataxia syndrome, ALS, and HD [20–23]. RAN translation in HD, which occurs in multiple reading frames from both sense and antisense transcripts, leads to the accumulation of aggregated polyalanine, polyserine, polyleucine, and polycysteine in the brains of HD patients .
ALS and FTD are related disorders
ALS, also known as Lou Gehrig's disease in homage to the prominent baseball player who was diagnosed in 1939 and died 2 years later, is a devastating neurodegenerative disorder that affects the upper and lower motor neurons of the brain and spinal cord . The widespread and relentlessly progressive destruction of motor neurons causes muscle weakness and atrophy with hyperreflexia and spasticity, ultimately leading to paralysis and death within 2–5 years of disease onset in most cases [5,24]. FTD is a leading cause of early-onset dementia, second only to AD . It results in the selective degeneration of the frontal and temporal lobes of the brain, which typically manifests as primarily behavioral dysfunction, including changes in personality and executive function or loss of volition, or language deficits [5,25]. It has become increasingly clear that there is a significant overlap between ALS and FTD clinically, genetically, and neuropathologically [3,5,10,25,26].
It is now estimated that up to 50% of ALS patients also suffer from cognitive impairment or behavioral changes associated with FTLD, and while in many cases these symptoms do not reach a clinical severity that meets criteria for dementia, ∼15–20% of those with ALS also carry a diagnosis of FTD [3,25,27,28]. Similarly, a study of FTD patients found that ∼50% had motor neuron involvement evident via examination or electromyography [3,27]. The idea that purely motor ALS and purely cognitive FTD exist at the two ends of a spectrum of disease is not surprising when it is considered that the two clinical entities are known to share genetic causes in their familial forms and have commonalities in their cellular signatures [15,29]. Like other neurodegenerative disorders, ALS and FTD are characterized by pathologic protein aggregation in the cytoplasm of affected neurons [15,30]. Among the proteins that have been genetically linked to these diseases and identified in cytoplasmic inclusions in patient neurons are several RNA-binding proteins (RBPs) that have low-complexity domains (LCDs), termed prion-like domains (PrLDs), because of their similarity in amino acid composition to yeast prion domains .
Prions are self-replicating protein conformers
Prions are the cause of devastating human neurodegenerative diseases including Creutzfeldt–Jakob disease, Gerstmann–Sträussler–Scheinker syndrome, and fatal familial insomnia, but confer heritable traits that can be beneficial in yeast [18,31–36]. Prions are infectious protein conformers capable of self-replication, which occurs as the prion templates the folding of soluble proteins comprised of the same amino acid sequence (Figure 1) [37,38]. In the prion conformation, these proteins typically form stable amyloid fibers that are often sodium dodecyl sulfate (SDS) insoluble and resistant to proteases and heat denaturation [18,37]. Amyloid is a polymeric ‘cross-β’ structure in which the strands of the β-sheets run perpendicular to the axis of the fiber [18,35]. The ability of yeast prions to form amyloid is dependent on a prion domain rich in glycine and uncharged polar amino acids, including glutamine, asparagine, tyrosine, and serine [31,39–41]. Deletion of this prion domain precludes access to the prion state , and addition of this region to otherwise innocuous proteins is sufficient to confer prion behavior [43–45]. Importantly, randomization of the primary amino acid sequence of the prion domain does not affect prion formation [46,47]. Identification of several prion domains that confer bona fide prion behavior has led to the development of bioinformatics algorithms that scan amino acid composition to screen the human genome for proteins with PrLDs [31,39,40,48].
Prions self-replicate their conformation by templating the folding of soluble protein into the prion conformation.
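The compositional screening idea described above can be illustrated with a minimal sketch. It is not the published algorithms [31,39,40,48]. It only assumes that a candidate region is a sequence window enriched in the residues named in the text (glycine plus glutamine, asparagine, tyrosine, and serine). The window size, threshold, function names, and toy sequence are all hypothetical choices made for illustration.

```python
# Hypothetical sketch: sliding-window scan for prion-like amino acid composition.
# The residue set follows the text (G plus uncharged polar Q, N, Y, S).
# Window size and threshold are illustrative assumptions, not published values.

PRION_LIKE_RESIDUES = frozenset("GQNYS")


def window_fractions(sequence, window=60):
    """Return the fraction of prion-like residues in each full-length window."""
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    if len(seq) < window:
        return []
    hits = [1 if aa in PRION_LIKE_RESIDUES else 0 for aa in seq]
    count = sum(hits[:window])
    fractions = [count / window]
    # Slide the window one residue at a time, updating the running count.
    for i in range(window, len(seq)):
        count += hits[i] - hits[i - window]
        fractions.append(count / window)
    return fractions


def candidate_regions(sequence, window=60, threshold=0.5):
    """Merge windows at or above the threshold into 0-based, half-open (start, end) spans."""
    regions = []
    for start, fraction in enumerate(window_fractions(sequence, window)):
        if fraction < threshold:
            continue
        end = start + window
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Toy sequence (invented): a G/Q/N/Y/S-rich core flanked by charged and aliphatic residues.
toy_protein = "AKELDRVAKE" * 3 + "QNGYSQNGYS" * 3 + "KEDLAVRKED" * 3
print(candidate_regions(toy_protein, window=20, threshold=0.75))
```

Because the score depends only on which residues fall inside a window, shuffling the residues within a domain leaves the score for a window spanning that domain unchanged. This mirrors the observation that randomizing the primary sequence of a prion domain does not affect prion formation [46,47].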
Human RBPs with PrLDs cause neurodegenerative diseases
Interestingly, a disproportionate number of the ∼240 human proteins with PrLDs are RNA- or DNA-binding proteins, many of which contain a canonical RNA recognition motif (RRM) [41,49]. Gene ontology (GO) annotations indicate that ∼30% of human proteins with PrLDs function in RNA binding and ∼33% function in DNA binding . While RRM-containing genes represent only ∼1% of the human protein-coding genome, they comprise >10% of all genes containing PrLDs . One by one, RNA-/DNA-binding proteins with PrLDs are being implicated in neurodegenerative disease [41,49]. This association began with the identification of a trinucleotide repeat expansion in the gene encoding ataxin 1 (ATXN1) that leads to a polyglutamine protein product and causes SCA1 [50,51]. The expansion is now recognized to occur within the PrLD and promotes aggregation of ATXN1 [41,52]. A similar expansion in ataxin 2 (ATXN2) causes SCA2 [51,53]. The SCAs are a group of autosomal dominantly inherited disorders characterized by ataxia, tremors, and dysarthria with profound cerebellar atrophy . It would be almost a decade before the misfolding of another RBP with a PrLD was linked to the pathogenesis of ALS and FTD.
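The enrichment figures quoted above can be checked with simple arithmetic. The sketch below uses only the rounded percentages given in the text, so the outputs are approximations. The function name is a hypothetical helper, and the RRM enrichment is a lower bound because the text gives ">10%".

```python
# Arithmetic check using the approximate values quoted in the text.

def fold_enrichment(percent_in_subset, percent_in_background):
    """Ratio of a category's share in a subset to its share in the background."""
    return percent_in_subset / percent_in_background


prld_proteins = 240                               # ~240 human proteins with PrLDs
rna_binding = round(0.30 * prld_proteins)         # ~30% annotated as RNA binding
dna_binding = round(0.33 * prld_proteins)         # ~33% annotated as DNA binding
rrm_enrichment = fold_enrichment(10.0, 1.0)       # >10% of PrLD genes vs ~1% of all genes

print(rna_binding, dna_binding, rrm_enrichment)
```

On these numbers, roughly 72 of the ∼240 PrLD-containing proteins are annotated as RNA binding, and RRM-containing genes are at least about 10-fold over-represented among genes with PrLDs.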
Transactivation response element DNA-binding protein 43
The first of the RRM- and PrLD-containing proteins to be implicated in neurodegeneration was TDP-43 (transactivation response element DNA-binding protein 43, see domain architecture in Figure 2) [29,54]. TDP-43 was identified in 2006 as the predominant protein component of the ubiquitinated inclusions observed in ALS patients and a subset of cases of FTD in which there was no observable tau or α-synuclein aggregation [29,55]. TDP-43 is a primarily nuclear protein that shuttles between the nucleus and the cytoplasm, and plays a role in mRNA transport, transcriptional repression, splicing regulation, miRNA biogenesis, stress granule formation, and the stabilization of long intron-containing RNA and long noncoding RNA [56,57]. TDP-43 favors binding to long UG repeats or UG-enriched RNA sequences [58–61]. We now know that TDP-43 is mislocalized to cytoplasmic aggregates in degenerating neurons and glia in roughly 97% of sporadic ALS cases and ∼45% of sporadic FTD cases [56,62]. Its mislocalization has been identified as the primary histologic abnormality in cases of inclusion body myositis and a familial form of parkinsonism known as Perry syndrome [29,63]. TDP-43 inclusions are also present in many cases of AD, PD, and HD . Mutations in the gene encoding TDP-43 (TARDBP) have been identified in cases of both familial and sporadic ALS, with mutations segregating with disease in the former, further implicating TDP-43 in the pathogenesis of neurodegeneration [29,64–68]. TARDBP mutations are also found in rare instances of FTD [56,69,70].
Mutations that cause ALS and FTD cluster in the PrLD of TDP-43.
The vast majority of these observed mutations are found in the C-terminal PrLD of TDP-43 (Figure 2) , which is critical for elements of normal protein function . The PrLD facilitates miRNA biogenesis by mediating interactions with the nuclear Drosha complex, which cleaves pri-miRNAs into pre-miRNAs, and the cytoplasmic Dicer complex, which then cleaves these pre-miRNAs into mature miRNAs . The TDP-43 PrLD mediates protein–protein interactions with other splicing factors, including heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1), hnRNPA2B1, and fused in sarcoma (FUS), and is essential for the regulation of splicing of certain mRNA transcripts [41,73,74]. The PrLD is essential for recruitment of TDP-43 to stress granules . The TDP-43 PrLD is also crucial for aberrant protein aggregation in vitro and in model systems, and select disease-linked mutations accelerate protein aggregation in vitro and in vivo [31,76–79]. Deletion of the PrLD eliminates protein toxicity in model organisms, as does disruption of the RNA-binding ability of TDP-43, suggesting roles for both misfolding and RNA engagement in disease pathogenesis [76,77,80,81].
Fused in sarcoma
Shortly after the connection was made between TDP-43 and disease, another protein with a canonical RRM and a low-complexity PrLD, FUS (see Figure 3 for domain architecture), was linked to both ALS and FTD. Similar to TDP-43 in many ways, FUS, also sometimes known as translocated in liposarcoma (TLS), is a primarily nuclear protein that functions in transcriptional regulation, pre-mRNA splicing, and other elements of mRNA processing and metabolism [56,82]. Notably, though, the most common FUS-binding motif is GUGGU, and the repertoires of RNAs bound by TDP-43 and FUS have little overlap . FUS-binding sites are enriched for 5′-untranslated regions (UTRs), and it has been suggested that FUS also preferentially binds 3′-UTRs and intronic sequences [83,84]. FUS participates in the shuttling of RNA between the nucleus and the cytoplasm, miRNA processing, and the stabilization of long intronic sequences and long noncoding RNAs . FUS interacts with RNA polymerase II and Transcription Factor II D, in addition to other transcription factors, and is thought to have both transcriptional activation and repression activity [56,83]. FUS is recruited to sites of DNA damage and plays an essential role in cellular recovery, including the recruitment of other DNA repair factors [56,83].
ALS- and FTD-causing mutations in FUS cluster in LC domains and the PrLD.
Mutations in FUS have been linked to sporadic and familial cases of ALS, and these patients demonstrate the accumulation of FUS-positive inclusions in the cytoplasm of degenerating neurons and glia, and decreased nuclear FUS [5,15,30,85–88]. FUS mutations account for the earliest reported onset of juvenile ALS, in children as young as 11 years old. Neuronal and glial FUS aggregates have also been observed in ∼9% of FTD cases, and rare mutations in FUS have been identified in FTD patients [5,30,56,90–94]. Of note, nuclear FUS inclusions have been identified in patient neurons in cases of polyglutamine diseases including HD, SCA1, and SCA2 without FUS mutations [56,71].
Putative pathogenic mutations in FUS cluster in the C-terminal proline-tyrosine nuclear localization signal (PY-NLS), the RGG-rich region, and the PrLD (Figure 3) [56,82,93,95]. Studies in model systems have indicated that RNA binding is essential for the toxic effect of FUS, as is the case for TDP-43, but in addition to the RRM and PrLD, a portion of the RGG-rich region is crucial for the aggregation and toxicity of FUS [82,93,96]. ALS-linked FUS mutations confer both gain- and loss-of-function phenotypes . FUS interacts with the U1 snRNP (small nuclear ribonucleoprotein) of the spliceosome and the survival motor neuron (SMN) protein, a component of the complex that enables snRNP biogenesis [71,97]. SMN deficiency causes a childhood motor neuron disease known as spinal muscular atrophy, which is characterized by a reduction in nuclear SMN-containing bodies known as Gems . Similarly, ALS-linked mutations in FUS increase the association of FUS with SMN, leading to a reduction in the abundance of Gems and altered snRNA levels . These pathologic mutations simultaneously decrease FUS binding to the U1 snRNP, resulting in splicing disruptions that phenocopy a partial loss of FUS activity .
TATA-binding protein-associated factor 15 and Ewing sarcoma breakpoint region 1
Studies of FUS and TDP-43 pathogenicity highlight not only the fact that ALS and FTD are closely related entities, but also the potential importance of other RBPs with PrLDs in the pathogenesis of neurodegeneration [31,49]. When all human proteins with RRMs were screened for cytoplasmic aggregation and toxicity in yeast, as is seen upon overexpression of TDP-43 or FUS in yeast, then filtered based on bioinformatically predicted PrLDs, two proteins, TAF15 (TATA-binding protein-associated factor 15) and EWSR1 (Ewing sarcoma breakpoint region 1), emerged with structural and functional similarities to TDP-43 and FUS [31,48]. TAF15 and EWSR1, along with FUS, belong to a family of proteins known as FET proteins (see Figure 4 for domain architecture) [31,98,99]. As their names imply, FET proteins were originally described as components of pathogenic fusion oncogenes in certain human cancers. Further investigation identified mutations in TAF15 and EWSR1 in patients with sporadic ALS (Figure 4) and revealed that either protein may be found depleted from the nucleus and mislocalized to cytoplasmic neuronal inclusions in ALS and FTD [48,98–100]. Additional evidence for pathogenicity came from in vitro studies demonstrating that both proteins are intrinsically aggregation prone, and ALS-linked TAF15 and EWSR1 mutations accelerate aggregation [48,98]. In addition, both proteins are toxic when overexpressed in the Drosophila nervous system and disease-associated TAF15 mutations cause a more severe phenotype [48,98]. Finally, in cultured mammalian neurons, disease-linked TAF15 and EWSR1 mutations induced formation of cytoplasmic TAF15 and EWSR1 inclusions [48,98].
FET proteins EWSR1 and TAF15 have domain architectures similar to the domain architecture of FUS.
Mutations in hnRNPA1 and hnRNPA2B1 cause multisystem proteinopathy
More recent information linking PrLDs in the context of RBPs to neurodegeneration has emerged from the study of a rare degenerative syndrome known as multisystem proteinopathy (MSP). This autosomal dominantly inherited disorder was formerly known as inclusion body myopathy with Paget's disease of bone, FTD, and ALS (IBMPFD/ALS) [101,102]. MSP is a heterogeneous, adult-onset disorder that is characterized by a variable presentation, even within the families that it affects [101–103]. Patients may suffer from degeneration of the muscle, bone, brain, motor neurons, or several of these tissues concurrently [101,104]. The most common feature of disease is inclusion body myopathy (IBM), which occurs in ∼80–90% of MSP patients and leads to progressive weakness and atrophy, primarily of proximal muscle groups [103,105]. Roughly half of MSP patients will develop Paget's disease of bone (PDB), a disorder of increased osteoclast activity and bone turnover that is clinically marked by bone pain, pathologic fractures, and skeletal deformities, most often of the skull, vertebrae, and pelvis [103,106]. Cognitive changes and language deficits that define FTD can be observed in a subset of MSP patients, as can the signs of upper and lower motor neuron dysfunction and electromyographic findings that are hallmarks of ALS [103,104].
There are currently three known genetic causes of MSP . The first identified was valosin-containing protein (VCP), a AAA+ protein (ATPase associated with diverse cellular activities) that participates in many cellular processes including the cell cycle, DNA damage repair, apoptosis, the proteotoxic stress response, post-mitotic Golgi reassembly, endoplasmic reticulum-associated degradation, and ubiquitin-dependent protein degradation [101,102,104,107]. VCP mutations have subsequently been identified in patients with isolated ALS, IBM, and PDB [101,105,106,108]. VCP plays a critical role in the clearance of stress granules via autophagy, and disease-associated VCP variants cause the constitutive formation of stress granules in cell culture, suggesting that aberrant stress granule persistence may contribute to neurodegenerative disease pathogenesis .
Exome sequencing and linkage analysis of two MSP-affected families without VCP mutations uncovered pathogenic mutations in the genes encoding heterogeneous nuclear ribonucleoproteins (hnRNPs) A1 and A2B1 (hnRNPA1 and hnRNPA2B1), two RBPs with PrLDs [101,104]. MSP can be caused by a D262V substitution in hnRNPA1 or a D290V substitution in hnRNPA2 . hnRNPA1 and hnRNPA2 (the shorter of two hnRNPA2B1 isoforms by 12 amino acids, which constitutes roughly 90% of hnRNPA2B1 expression in most human tissues) share a domain structure consisting of two N-terminal RRMs and a PY-NLS-containing C-terminal PrLD (Figure 5) [39,101].
MSP-causing mutations affect a conserved aspartate residue in the hnRNPA1 and hnRNPA2 PrLDs.
hnRNPA1 is an abundantly and ubiquitously expressed, primarily nuclear RBP that functions widely in nucleic acid processing . hnRNPA1 binds to promoter sequences or transcription factors to either activate or repress transcription and contributes to the regulation of alternative splicing and splice-site selection, often promoting exon skipping [110–113]. It can shuttle between the nucleus and cytoplasm, facilitating nuclear mRNA export . In addition to showing affinity for specific motifs including UAGGGA, UAGA, UAGG, and UGGGGU [110,114,115], hnRNPA1 binds AU-rich elements (containing AUUUA motifs) that are known to modulate the stability and degradation of mature mRNA transcripts [110,116]. hnRNPA1 also binds to internal ribosomal entry sites to regulate translation [117,118], is critical for telomere biogenesis and length maintenance [110,119], and participates in miRNA processing [120,121]. Like hnRNPA1, hnRNPA2B1 is one of the most abundantly expressed proteins in the cell and is predominantly nuclear with the ability to shuttle between the nucleus and cytoplasm . It has functional similarities to hnRNPA1, including roles in the regulation of alternative pre-mRNA splicing and translation [122–124], mRNA stability , and telomere maintenance [125,126]. Distinct from hnRNPA1, hnRNPA2B1 also plays a crucial role in mRNA trafficking in neurons and oligodendrocytes [127,128]. Like hnRNPA1, hnRNPA2B1 has a significant binding preference for UAG motifs .
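To make the notion of motif preference concrete, the sketch below scans an RNA sequence for the hnRNPA1-associated motifs listed above (UAGGGA, UAGA, UAGG, UGGGGU, and the AUUUA core of AU-rich elements). It is a simple exact-match scan for illustration only. It is not how binding sites were defined in the cited studies, and the function name and example transcript are invented.

```python
import re

# Motifs named in the text for hnRNPA1; the scanning approach is an illustrative assumption.
HNRNPA1_MOTIFS = ("UAGGGA", "UAGA", "UAGG", "UGGGGU", "AUUUA")


def find_motifs(rna, motifs=HNRNPA1_MOTIFS):
    """Return {motif: [0-based start positions]}, counting overlapping matches."""
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    positions = {}
    for motif in motifs:
        # A zero-width lookahead lets finditer report overlapping occurrences.
        pattern = re.compile("(?=" + re.escape(motif) + ")")
        positions[motif] = [match.start() for match in pattern.finditer(seq)]
    return positions


# Invented example transcript fragment.
example_rna = "GGUAGGGAUUUAUUUACC"
for motif, starts in find_motifs(example_rna).items():
    print(motif, starts)
```

Exact matching ignores RNA structure, binding affinity, and cooperative effects, so hits from a scan like this are candidates at best.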
Recent studies of hnRNPA2B1 function in mouse spinal cord, patient fibroblasts, and motor neurons derived from human induced pluripotent stem cells (iPSCs) identified an enriched UAGG-binding motif in CNS tissue . hnRNPA2B1-binding sites were particularly enriched in 3′-UTRs in vivo and in cultured cells, and hnRNPA2B1 was found to contribute to polyadenylation site selection . The importance of hnRNPA2B1 to pre-mRNA splicing was illustrated by altered proportions of the long and short isoforms of the murine protein Dao upon depletion of hnRNPA2B1 in the mouse CNS . The human homolog, DAO, which encodes d-amino acid oxidase, is highly expressed in the CNS and has been implicated in familial ALS [124,129,130]. Loss of hnRNPA2B1 expression in the mouse model causes increased proportional expression of a short Dao isoform that is degraded by the proteasome and has ∼85% less enzymatic activity than the longer isoform . Importantly, the splicing changes that result from the MSP-causing substitution, D290V, in hnRNPA2B1 in patient fibroblasts are distinct from those that occur due to loss of hnRNPA2B1 function . In contrast, the splicing changes caused by the D290V substitution in hnRNPA2B1 have a ∼66% overlap with splicing alterations observed in fibroblasts from patients with an MSP-causing mutation in VCP . This finding suggests a possible etiology for the shared disease phenotype caused by mutations in VCP and hnRNPA2B1.
MSP-linked hnRNPA1 and hnRNPA2B1 mutations enhance protein aggregation
hnRNPA1 and hnRNPA2 have a common domain architecture consisting of two N-terminal RRMs and a C-terminal PrLD containing a PY-NLS that mediates nuclear import (Figure 5) [101,110]. Interestingly, both MSP-linked mutations involve a valine substitution at a conserved gatekeeper aspartate residue in the PrLD that is computationally predicted, by two separate algorithms, to increase prionogenicity (Figure 5) [39,40,101]. Additionally, an algorithm that scores the ability of hexapeptides to form amyloid fibrils primarily based on structural information rather than amino acid sequence predicts that each of these mutations lies within a ‘steric-zipper’ motif (Figure 6) [101,131]. Steric zippers are defined as two self-complementary β-sheets with the ability to act as the backbone of an amyloid fibril . The aspartate-to-valine substitution in this region is predicted to strengthen a steric zipper, making the protein more prone to fibrillization (Figure 6) . Indeed, both hnRNPA1 and hnRNPA2 form fibrils in vitro that are self-seeding (i.e. can nucleate the aggregation of soluble protein), thereby reducing the lag phase of assembly, and the disease-associated mutations greatly accelerate fibrillization [101,132]. In vitro, the mutant proteins are capable of seeding their own assembly and the assembly of the corresponding wild-type protein , providing a potential explanation for the genetic dominance of MSP mutations. A heterozygous individual would produce both wild-type and mutant protein. However, if the presence of the aspartate-to-valine substitution accelerates the misfolding of the mutant protein, and the misfolding of the mutant protein can nucleate the misfolding of the wild-type protein, the presence of the wild-type allele would not be protective against the development of a disease phenotype.
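The logic of seeding and lag-phase reduction can be sketched with a toy two-step kinetic model in which soluble protein converts to fibril through slow spontaneous nucleation and faster fibril-templated growth. This is a generic model chosen for illustration, not one fitted to the hnRNPA1 or hnRNPA2 data. The rate constants are arbitrary, and representing a disease mutation as a higher nucleation rate is an assumption.

```python
# Toy model (assumption): f is the fraction of protein in fibrils, and
#   df/dt = (k_nucleation + k_growth * f) * (1 - f)
# Spontaneous nucleation is slow; growth is templated by existing fibril.

def time_to_half(k_nucleation, k_growth, seed_fraction=0.0, dt=0.01, t_max=1000.0):
    """Euler-integrate the model and return the first time at which f >= 0.5."""
    f = seed_fraction
    t = 0.0
    while t < t_max:
        if f >= 0.5:
            return t
        f += dt * (k_nucleation + k_growth * f) * (1.0 - f)
        t += dt
    return None  # half-conversion not reached within t_max


# Arbitrary illustrative parameters (dimensionless time units).
wild_type = time_to_half(k_nucleation=1e-4, k_growth=1.0)
seeded = time_to_half(k_nucleation=1e-4, k_growth=1.0, seed_fraction=0.05)
mutant = time_to_half(k_nucleation=1e-2, k_growth=1.0)

print(wild_type, seeded, mutant)
```

In this sketch, adding preformed fibril bypasses the slow nucleation step and shortens the time to half-conversion, as does raising the nucleation rate. Both behaviors are qualitatively consistent with the self-seeding and mutation-accelerated fibrillization described above.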
MSP- and ALS-associated mutations are predicted to increase the fibrillization propensity of hnRNPA1 and hnRNPA2.
Muscle biopsies from MSP patients with mutations in VCP, hnRNPA1, or hnRNPA2B1 share cytopathologic features including the cytoplasmic aggregation of TDP-43, which has also been observed in sporadic IBM in addition to ALS and FTD [101,103,105,133]. A biopsy from an affected individual in the family harboring the hnRNPA2 D290V variant also demonstrated mislocalization of hnRNPA2 from the nucleus to cytoplasmic inclusions, and in muscle fibers obtained from a patient expressing hnRNPA1 D262V, both hnRNPA1 and hnRNPA2 were cleared from myonuclei and localized to sarcoplasmic inclusions. Motor neurons differentiated from iPSCs from MSP patients with hnRNPA2 D290V or VCP R155H variants demonstrate nuclear hnRNPA2B1 aggregation. Concurrent mislocalization and partial colocalization of TDP-43 and hnRNPA1 or TDP-43 and hnRNPA2 could be observed in muscle fibers of MSP-affected patients. Cytoplasmic hnRNPA1- and hnRNPA2-positive aggregates have also been identified in sporadic cases of IBM [101,134]. The intersection of protein pathologies in MSP and IBM underscores the fact that there is much to be learned about common degenerative diseases from more rare, familial disorders.
Sequencing efforts to uncover pathogenic mutations in familial and sporadic ALS patients have identified additional mutations in hnRNPA1 and hnRNPA2 linked to ALS [101,135,136]. A substitution in hnRNPA1 (D262N) occurring in a familial case of ALS affects the same aspartate residue implicated in the pathogenesis of MSP. The D262N substitution in hnRNPA1 introduces a strong steric zipper and strengthens an existing steric zipper (Figure 6) [101,131]. Similar to the D262V substitution, D262N significantly reduced the lag phase of fibrillization and accelerated hnRNPA1 aggregation in vitro. Several other mutations in hnRNPA1 that have been identified in patients with ALS also introduce or strengthen steric zipper motifs (Figures 5 and 6). One of these, a substitution in the PY-NLS of hnRNPA1 (P288S), was recently identified as the cause of a familial case of flail-arm ALS (Figure 5). The location of this mutation suggests that hnRNPA1 P288S may have impaired nuclear import, leading to increased cytoplasmic mislocalization in addition to increased fibrillization propensity.
Many questions remain, however, about the extent and prevalence of hnRNPA1 and hnRNPA2 pathology in patients with MSP and sporadic forms of ALS and FTD. Mislocalized hnRNPA1 and hnRNPA2 inclusions have been observed in muscle fibers of patients with MSP, but can the clearance of these proteins from the nucleus to cytoplasmic foci be observed also in motor neurons of the brain and spinal cord and in the frontal and temporal cortical lobes of these patients? It remains unclear how this disease manifests in such a heterogeneous way among patients with the same mutation, and it would be informative to investigate, via postmortem biopsy, whether patients who developed muscle and bone pathology, for example, but no clinical dementia demonstrated evidence of asymptomatic protein pathology in the frontal cortex. Also of relevance would be a study of ALS patients with TDP-43 or FUS mutations and pathology to look for co-occurrence of hnRNPA1 or hnRNPA2 pathology. Wild-type TDP-43 aggregates along with hnRNPA1 and hnRNPA2 in MSP , suggesting the possibility that wild-type hnRNPA1 and hnRNPA2 may be present in the inclusions driven by mutations in other RBPs in ALS and FTD patients. A single study of frontal cortex from 10 patients with FTD and TDP-43 pathology showed no mislocalization of hnRNPA1 or hnRNPA2 . Importantly, one of these patients harbored a familial VCP mutation . Thus, VCP mutations are not always accompanied by hnRNPA1 and hnRNPA2 pathology as they can be in MSP.
Finally, the contribution of hnRNPA1 and hnRNPA2 mutations to the overall landscape of neurodegeneration is currently unknown in that we do not yet know how frequently these mutations occur or how penetrant they are. The discovery of hnRNPA1 and hnRNPA2 mutations in MSP was rapidly followed by the identification of additional hnRNPA1 and hnRNPA2 mutations in patients with sporadic and familial ALS [101,135], and we expect the number of patients suffering from neurodegenerative phenotypes with identified mutations in hnRNPA1 or hnRNPA2 to grow as our knowledge of disease increases. We also anticipate that additional RBPs with PrLDs will emerge in degenerative diseases [31,41]. Indeed, mutations in the PrLD of hnRNPDL, leading to D378N or D378H substitutions, have now been linked to limb-girdle muscular dystrophy type 1G .
Disease-associated RBPs are involved in the formation of RNP granules
An important shared feature of ATXN2, TDP-43, FUS, hnRNPA1, hnRNPA2, EWSR1, and TAF15 is their recruitment to stress granules upon cellular exposure to environmental stresses like heat shock, infection, ischemia, or oxidative stress [49,101,140]. Stress granules are RNP granules that assemble in the cytoplasm in stress conditions and incorporate nontranslating polyadenylated mRNA transcripts, translation initiation factors, small ribosomal subunits, and RBPs (Figure 7) [49,141]. They are sites of translation suppression, consisting of stalled translation–initiation complexes and translational-silencing proteins in addition to other regulators of RNA metabolism, and serve to redirect cellular energy and resources towards the production of cytoprotective proteins that will be essential for survival and recovery after stress [49,140,142,143]. Processing bodies (P bodies) are a related class of RNP granules that are constitutively assembled in addition to being induced by cellular stress (Figure 7). P bodies are cytosolic sites of mRNA decay that interact with stress granules, allowing for possible exchange of mRNAs and proteins between assemblies [49,140,142,144,145]. Crucial to the reversible assembly of RNP granules is the intermolecular association of PrLDs or other LCDs via multiple weak, transient interactions as target RNAs are engaged, primarily via RNA-binding domains [49,101,143,146]. In some cases, as with hnRNPA1, PrLDs can also bind RNA, frequently via RGG motifs [147,148]. In other cases, as with FUS, the PrLD does not bind to RNA directly. The PrLD of the mammalian stress granule protein T-cell intracellular antigen 1 (TIA1) is required for incorporation into chemically induced stress granules. In yeast, a reduction in the recruitment of prion-like proteins Lsm4 and Pop2 to P bodies is observed in the absence of their PrLDs.
Therefore, despite their propensity for misfolding events, PrLDs have likely been preserved throughout evolution in part because they enable essential protein–protein interactions that provide the fluid architecture of membraneless cellular compartments. In addition to stress granules and P bodies, germ granules are RNP bodies found in the cytoplasm. Membraneless organelles that contribute to nuclear organization include nucleoli, paraspeckles, gems, Cajal bodies, and promyelocytic leukemia (PML) bodies [41,125].
Cytoplasmic RNP granules include stress granules and P bodies.
Remarkably, many RBPs with PrLDs, which have not yet been connected to disease, are emerging as critical scaffolds for the formation of these membraneless organelles. For example, the PrLD of RBM14 (as well as FUS) is critical for paraspeckle formation . Likewise, the PrLD of hnRNPD plays an important role in Sam68 nuclear body formation , whereas the PrLD of Xvelo is critical for Balbiani body formation [156,157]. Finally, PrLDs in DAZ1-4 and DAZL are predicted to have important roles in the formation of amyloid-like structures that regulate key meiotic events [158,159]. We anticipate that PrLDs in RNA/DNA-binding proteins will continue to surface as key scaffolds for various membraneless organelles. PrLDs in proteins that do not bind nucleic acids will also likely serve as scaffolds in other contexts. For example, the PrLD of Pin2 can function as a trans-Golgi network retention motif by driving the assembly of higher order complexes .
A role for the alteration of RNP granule dynamics in neurodegenerative pathology is suggested by studies showing that disease-associated mutant proteins are recruited differently to RNP granules than their wild-type counterparts [95,101,141,150,161,162]. Moreover, changes in the expression of RNP granule components modify the effects of toxic neurodegenerative disease RBPs in model systems. hnRNPA1 and hnRNPA2 are nuclear when expressed in HeLa cells, but are incorporated into cytoplasmic stress granules upon arsenite stress, and recruitment of hnRNPA1 D262V and hnRNPA2 D290V occurs more rapidly than relocalization of the wild-type proteins. The D290V substitution also enhances hnRNPA2 recruitment to stress granules in motor neurons derived from MSP-patient iPSCs. A VCP mutation that also causes MSP has the same effect on hnRNPA2. The fact that these mutations promote the targeting of RBPs to stress granules, while VCP mutations can also decrease stress granule clearance, suggests a model in which MSP can be caused by any perturbation that shifts the equilibrium of dynamic stress granule formation and dissolution towards granule formation or persistence. In cultured cells, familial ALS mutations cause increased formation of TDP-43 inclusions that are also positive for stress granule markers after exposure to environmental stress [49,150]. FUS variants, too, show enhanced association with stress granule markers in cytoplasmic inclusions [49,95,141,161,162]. In a yeast model of TDP-43 proteinopathy, overexpression of several RNP granule components, including Tis11, Hrp1, Vts1, Kem1, and Pbp1, either enhanced or suppressed the toxicity of TDP-43 expression.
Pbp1 is a stress granule protein that interacts with Pab1, also a component of stress granules, and regulates mRNA polyadenylation [80,164]. Interestingly, Pbp1 is the yeast homolog of human ATXN2, which bears a polyglutamine expansion in SCA2. Deletion of Pbp1 diminishes stress granule formation and suppresses TDP-43 toxicity in yeast, whereas overexpression of Pbp1 enhances TDP-43 toxicity in yeast. The Drosophila homolog, Atx2, also has a dose-dependent effect on TDP-43 toxicity in the fly nervous system, with a reduction in Atx2 expression reducing the toxic TDP-43 phenotype. Further analysis revealed that TDP-43 and ATXN2 physically interact in yeast and humans in an RNA-dependent manner. Furthermore, ATXN2 forms abnormal cytoplasmic foci in ALS and FTD patient neurons, and TDP-43 inclusions can be found in cerebellar Purkinje cells and brainstem nuclei in SCA2. Genetically, mutations in ATXN2 are the most common known risk factor for ALS. Polyglutamine expansions of >34 repeats cause SCA2, but intermediate-length expansions of 27 to 33 glutamines were found to increase the likelihood of developing ALS by a factor of ∼2.8 [80,165–167].
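The repeat-length thresholds quoted above can be restated as a small lookup. The following is only an illustrative sketch in Python: the function name and category labels are hypothetical, the cut-offs are taken directly from the sentence above, and a length of exactly 34 is reported as unspecified because neither stated range covers it. It is not a clinical classifier.

```python
def classify_atxn2_polyq(repeats: int) -> str:
    """Map an ATXN2 polyglutamine repeat count onto the ranges quoted in the text.

    Hypothetical helper for illustration only; thresholds come from the
    text (27 to 33 = intermediate, >34 = SCA2-associated).
    """
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats > 34:
        # ">34 repeats cause SCA2"
        return "SCA2-associated expansion"
    if 27 <= repeats <= 33:
        # "intermediate-length expansions of 27 to 33 glutamines"
        return "intermediate-length expansion (ALS risk factor)"
    if repeats == 34:
        # Assumption: the text states neither range for exactly 34 repeats.
        return "not specified in the text"
    return "below the intermediate range"


if __name__ == "__main__":
    for n in (22, 27, 33, 34, 35):
        print(n, classify_atxn2_polyq(n))
```

Keeping the boundary case separate avoids silently assigning 34 repeats to either category, since the source does not say which applies.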
In Drosophila, increased expression of the stress granule protein polyA-binding protein (PABP) causes more severe TDP-43-induced retinal degeneration. The cytoplasmic human homolog PABPC1 was observed in cytoplasmic inclusions in the motor neurons of ALS patients, despite having a predominantly diffuse pattern of localization in healthy controls. RNP granule markers have also been found to modify FUS toxicity in model systems. Overexpression of stress granule proteins Pab1, Tif2, Tif3, and Tis11 in a yeast model suppressed the toxic effect of FUS overexpression [93,163,164]. The human homolog of Tif2, EIF4A1, is similarly able to suppress FUS toxicity in cultured mammalian cells. FUS toxicity in yeast is also mitigated by overexpression of the P body protein Edc3 or Sbp1, which localizes to both stress granules and P bodies [93,164,168]. Both Edc3 and Sbp1 promote mRNA decapping prior to 5′-to-3′ degradation [164,169]. Deletion of the stress granule protein Pub1 or the P body protein Lsm7 decreases FUS toxicity in yeast [93,164]. In P bodies, Lsm7 is part of a heteroheptameric complex consisting of Lsm proteins 1–7 [170,171]. The Lsm1–7 proteins activate mRNA decapping and protect mRNA from trimming, a process by which transcripts are shortened by 10–20 nucleotides at the 3′-end [170–172]. Lsm7 also participates in pre-mRNA splicing as a component of a nuclear complex consisting of Lsm proteins 2–8 [170,173,174]. This heptamer stabilizes newly synthesized U6 snRNA by binding to its 3′-end [170,173,174]. The Lsm2–8 complex also contributes to mRNA degradation in the nucleus by targeting nuclear RNAs for decapping [170,175].
Stress granule formation in yeast is diminished by deletion of Pbp1, the yeast homolog of ATXN2, or Pub1, the yeast homolog of the human protein TIA1. TIA1 is required for mammalian stress granule formation, and reduced ATXN2 expression results in reduced stress granule assembly [177,178]. TIA1 and another stress granule marker, EIF3, have been identified in the proteinaceous inclusions in the brain and spinal cord tissue of patients with ALS and FTD [95,150]. TIA1 is a protein containing RRMs and a PrLD that is essential for stress granule formation in cultured mammalian cells [49,151]. A mutation in TIA1 causes Welander distal myopathy, and mutant TIA1 expression leads to increased stress granule abundance in cultured cells, suggesting that altered stress granule dynamics may underpin this slowly progressive, adult-onset disorder [49,179,180].
Phase transitions underpin RNP granule formation and misregulation
It is now thought that RNP granule components coalesce into membraneless compartments through phase transitions that drive the reversible formation of liquid droplets or more solid hydrogel states [41,49,181–183]. Several RNP granules have been shown to have liquid-like properties, including P granules in Caenorhabditis elegans, P bodies in Saccharomyces cerevisiae, PML nuclear bodies, and mammalian stress granules and P bodies [184–186]. These compartments are spherical, can fuse with one another and relax into a new sphere, and undergo rapid internal rearrangement as demonstrated by half-bleaching experiments [184,185,187]. Liquid droplets form via liquid–liquid phase separation (LLPS), or the ‘demixing’ of the granule components and the cytoplasm, a process modeled by the separation of standing oil and vinegar. Recent work has shown that the liquid droplet environment promotes certain biochemical reactions, including the stabilization of RNA hairpins and the unwinding of double-stranded nucleic acids. The liquid interior is therefore a specialized microcosm for certain nucleic acid remodeling reactions. Liquid droplets create a controlled environment by permitting or restricting entry of proteins based on amino acid sequence.
The transition from soluble protein to liquid droplet is characteristically driven by intrinsically disordered proteins and can be mediated by a multitude of intermolecular interactions [146,189]. Disordered LCDs, including PrLDs, associate with each other via weak, nonspecific interactions in a manner that can be concentration-dependent [146,149,190]. RNA binding via RRMs or PrLDs can facilitate additional multivalent interactions, explaining the observation that the protein concentration required for the formation of hnRNPA1 droplets is decreased in the presence of RNA [146,190]. Interactions between disordered regions of the P-granule protein, Ddx4, are mediated by electrostatic interactions resulting from patterned blocks of residues of alternating net charge [146,191]. Structural analysis of the LCD of FUS in the liquid phase-separated state demonstrates that it retains a disordered character within droplets, suggesting that interactions among PrLDs within liquid droplets are likely to be transient with frequent reorientations [41,149].
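The concentration dependence of demixing described above can be illustrated with the textbook Flory–Huggins lattice model of polymer solutions. This is a generic polymer-physics sketch, not a model taken from the studies cited here: the chain length `n`, the interaction parameter `chi`, and all function names are assumptions introduced purely for illustration. In this model the free energy of mixing per lattice site (in units of kT) is f(φ) = (φ/n) ln φ + (1 − φ) ln(1 − φ) + χφ(1 − φ), and a homogeneous mixture is locally unstable to demixing wherever f''(φ) < 0.

```python
import math


def free_energy_curvature(phi: float, chi: float, n: float) -> float:
    """Second derivative of the Flory-Huggins mixing free energy (units of kT).

    phi: polymer volume fraction (0 < phi < 1)
    chi: effective polymer-solvent interaction parameter (assumed)
    n:   chain length in lattice segments (assumed)
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    return 1.0 / (n * phi) + 1.0 / (1.0 - phi) - 2.0 * chi


def critical_point(n: float) -> tuple[float, float]:
    """Return (phi_c, chi_c), the critical volume fraction and interaction."""
    root_n = math.sqrt(n)
    phi_c = 1.0 / (1.0 + root_n)
    chi_c = 0.5 * (1.0 + 1.0 / root_n) ** 2
    return phi_c, chi_c


def is_unstable(phi: float, chi: float, n: float) -> bool:
    """True if a homogeneous mixture at phi lies inside the spinodal region."""
    return free_energy_curvature(phi, chi, n) < 0.0


if __name__ == "__main__":
    for n in (1, 100):
        phi_c, chi_c = critical_point(n)
        print(f"n={n}: phi_c={phi_c:.3f}, chi_c={chi_c:.3f}")
```

In this simplified picture, longer chains demix at lower volume fractions and weaker effective attraction, and raising `chi` widens the unstable concentration range. That is loosely analogous to, but not a quantitative description of, additional multivalent interactions lowering the protein concentration required for droplet formation.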
Hydrogels have solid-like properties, a cross-linked structure, a high water content, and water-soluble components. Stress granules in yeast are gel-like, highlighting the biological relevance of this form of protein assembly [146,185]. LCDs can also facilitate the transition to the gel phase [181,192,193]. In vitro, the PrLDs of FUS, hnRNPA1, and hnRNPA2 all form hydrogels that are composed of amyloid-like fibrils [41,190,192]. These hydrogel structures are capable of trapping homotypic and heterotypic LCDs. FUS LCD hydrogels, for example, bind and retain, with varying avidities, soluble FUS LCDs and the LCDs of hnRNPA1, hnRNPA2, TDP-43, and TIA1. The role of hydrogel structures in normal RNP granule assembly in mammalian cells has been controversial [185,194]. One recent model of mammalian stress granules suggests that, rather than being pure liquid droplets, stress granules are composed of a liquid-like exterior containing an internal gel-like core [41,194].
Recent evidence suggests that inappropriate phase transitions nucleated by RNP granules may represent a crucial element of the pathogenesis of neurodegenerative disease [190,195]. In vitro experiments exploring LLPS of FUS and hnRNPA1 indicate that, over time, liquid droplets are prone to ‘mature’ and undergo a liquid-to-solid transition involving protein fibrillization [190,195]. This process is accelerated by pathologic PrLD mutations [190,195]. Mutations in the PrLD of FUS also reduce the reversibility of FUS hydrogel formation. This suggests a model in which disease-causing FUS mutations, which tend to cluster in the PrLD, RGG-rich regions, and NLS, enhance fiber formation within droplets via one of two mechanisms. First, PrLD mutations likely serve to directly increase the propensity of FUS liquids to transition into irreversible aggregates [193,195]. Second, NLS mutations, or frameshift mutations that disrupt the NLS (Figure 3), may function to increase cytoplasmic FUS concentration by decreased nuclear import, driving liquid droplet formation, persistence, and maturation to fibrous structures [41,195]. Importantly, though, mutations in the FUS NLS can also directly alter the dynamics of phase transitions. When purified FUS with and without mutations in the PY-NLS was induced by a temperature shift to form liquid droplets in vitro, mutant FUS droplets persisted longer than those composed of wild-type FUS. Thus, mutations in regions outside the LCD may contribute to pathologic persistence of RNP granules leading to aberrant fibril formation.
The most common cause of ALS and FTD is a hexanucleotide repeat expansion in a noncoding region of C9ORF72 [197,198]. This expansion leads to the RAN translation of several dipeptide repeat proteins, including poly-(Pro-Arg) (PR) and poly-(Gly-Arg) (GR), which form nuclear and cytoplasmic inclusions in the brain and spinal cord of ALS/FTD patients harboring this expansion. LCDs, such as those containing hnRNPA1, hnRNPA2, and other RNP granule components, are a preferred binding target of PR and GR, which can disrupt granule dynamics [198,199]. GR50 or PR50 expression in cultured cells caused spontaneous assembly of persistent stress granules. GR20 or PR20 reduced the concentration required for hnRNPA1 LLPS and led to the formation of droplets with reduced fluidity.
Therapeutic protein disaggregases to counter aberrant phase transitions
A therapeutic agent with the ability to counteract pathologic phase transitions could have tremendous utility across neurodegenerative diseases caused by misfolding events related to RNP granule dysfunction. One approach would identify a small molecule or RNA that could preserve the liquid–granule state by preventing the transition to solid aggregates. An agent that could actively reverse the liquid-to-solid phase transition would be especially appealing for patients with active disease. Hsp104 is a hexameric protein disaggregase in the AAA+ ATPase family [200–202]. It is found in yeast and has homologs across eubacteria and eukaryotic species, but no metazoan ortholog exists [200,203]. Hsp104 preserves proteostasis and promotes survival in S. cerevisiae by renaturing aggregated proteins and returning them to their native conformations after exposure to environmental stress [200,203,204]. It also has the ability to rapidly remodel amyloid fibers and prefibrillar oligomers and, in doing so, regulates prionogenesis and the propagation and elimination of yeast prion conformers [201,204–208]. In S. cerevisiae, Hsp104 also functions in the dissolution of stress granules and the maintenance of the liquid-like properties of P bodies. Hsp104 contributes to the proper targeting of P body components, which mislocalize to stress granules in its absence. As a potential therapeutic, Hsp104 has shown promise in several models of neurodegenerative disease [8,204,209]. In a rat model of PD, expression of Hsp104 decreased dopaminergic neuron loss and accumulation of α-synuclein aggregates in the substantia nigra of animals expressing a PD-linked α-synuclein variant. Hsp104 increased lifespan and reduced the number of cortical polyglutamine inclusions in a mouse model of HD. Potentiated Hsp104 variants with enhanced ATPase activity reduce protein aggregation and suppress toxicity of TDP-43, FUS, and α-synuclein in S. cerevisiae [203,204,210–212].
Enhanced Hsp104 variants also protect against dopaminergic neuron loss in a C. elegans model of PD. These studies suggest that Hsp104 has broad activity against neurodegenerative disease substrates, and its substrate repertoire can be expanded or sharpened using engineering strategies.
Finally, it will also be important to determine whether endogenous human protein disaggregases, including Hsp110, Hsp70, Hsp40, and small heat-shock proteins [213–215]; HtrA1; and NMNAT2 plus Hsp90, also display activity against disease-linked RBPs with PrLDs. These protein disaggregase systems could also be engineered to possess enhanced disaggregase activity against disease-linked RBPs with PrLDs. Moreover, small-molecule drugs that enhance the activity of these systems could be useful therapeutics aimed at restoring homeostasis of RBPs with PrLDs. We anticipate that harnessing the power of protein disaggregases could lead to important advances in treating several devastating diseases caused by aberrant phase transitions of RBPs with PrLDs.
Abbreviations
ALS, amyotrophic lateral sclerosis
CNS, central nervous system
EWSR1, Ewing sarcoma breakpoint region 1
FTLD, frontotemporal lobar degeneration
FUS, fused in sarcoma
hnRNPs, heterogeneous nuclear ribonucleoproteins
IBM, inclusion body myopathy
iPSCs, induced pluripotent stem cells
LLPS, liquid–liquid phase separation
P bodies, processing bodies
PDB, Paget's disease of bone
PY-NLS, proline-tyrosine nuclear localization signal
RRM, RNA recognition motif
SCA8, spinocerebellar ataxia type 8
SDS, sodium dodecyl sulfate
SMN, survival motor neuron
snRNP, small nuclear ribonucleoprotein
TAF15, TATA-binding protein-associated factor 15
TDP-43, transactivation response element DNA-binding protein 43
TIA1, T-cell intracellular antigen 1
TLS, translated in liposarcoma
We thank Korrie Mack, Zachary March, and Lin Guo for comments on the manuscript. We acknowledge the NHLBI GO Exome Sequencing Project and its ongoing studies which produced and provided exome variant calls for comparison: the Lung GO Sequencing Project [HL-102923], the WHI Sequencing Project [HL-102924], the Broad GO Sequencing Project [HL-102925], the Seattle GO Sequencing Project [HL-102926], and the Heart GO Sequencing Project [HL-103010]. A.F.H. was supported by a Center for Neurodegenerative Disease Research T32 Training Grant [National Institutes of Health/National Institute on Aging AG00255] and an F31 fellowship [National Institute of Neurological Disorders and Stroke F31NS087676]. J.S. is supported by the National Institutes of Health [R01GM099836 and R21NS090205], the Life Extension Foundation, a Sanofi Innovation award, the ALS Association, the Muscular Dystrophy Association, Target ALS, and the Robert Packard Center for ALS Research at Johns Hopkins.
The Authors declare that there are no competing interests associated with the manuscript.