The spliceosome is a multi-subunit RNA–protein complex involved in the removal of non-coding segments (introns) from between the coding regions (exons) in precursors of messenger RNAs (pre-mRNAs). Intron removal proceeds via two transesterification reactions, occurring between conserved sequences at intron–exon junctions. A tightly regulated, hierarchical assembly with a multitude of structural and compositional rearrangements posed a great challenge for structural studies of the spliceosome. Over the years, X-ray crystallography dominated the field, providing valuable high-resolution structural information that was mostly limited to individual proteins and smaller sub-complexes. Recent developments in the field of cryo-electron microscopy allowed the visualisation of fully assembled yeast and human spliceosomes, providing unprecedented insights into substrate recognition, catalysis, and active site formation. This has advanced our mechanistic understanding of pre-mRNA splicing enormously.
Most eukaryotic genes are transcribed as precursor messenger RNAs (pre-mRNAs), in which protein-coding segments, exons, are interrupted by non-coding regions, introns. To produce mature mRNA that can be subsequently translated, introns have to be removed from pre-mRNA. Early work in the splicing field established that intron removal requires two consecutive transesterification reactions that proceed via a lariat-intron intermediate [1–4]. The reaction occurs between splice sites — conserved sequences around exon–intron junctions (Figure 1). During the first step of splicing (branching), the branch point (BP) adenosine performs a nucleophilic attack at the 5′-splice site (5′-SS), leading to formation of a free 5′-exon and lariat-intron intermediate. In the second step reaction (exon ligation), the 5′-exon performs a nucleophilic attack at the 3′-splice site (3′-SS), which results in a spliced mRNA and lariat intron product.
Schematic representation of the pre-mRNA splicing reaction and intron structure.
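As a rough illustration of the two transesterification steps described above, the following Python sketch treats a pre-mRNA as a plain string and expresses branching and exon ligation as index operations. The function name, the field names, the toy sequence and the 0-based coordinates are assumptions made for this example only. The sketch tracks which segments are released and joined at each step, not the underlying chemistry.

```python
from dataclasses import dataclass


@dataclass
class SplicingProducts:
    five_exon: str            # free 5'-exon released by branching (step 1)
    lariat_intermediate: str  # intron still attached to the 3'-exon (step 1)
    mrna: str                 # ligated exons (step 2)
    lariat_intron: str        # excised intron, 2'-5' linked at the BP (step 2)


def splice(pre_mrna: str, five_ss: int, bp: int, three_ss: int) -> SplicingProducts:
    """Toy model of the two-step splicing reaction.

    five_ss  -- index of the first intron nucleotide (the cleavage point)
    bp       -- index of the branch point adenosine
    three_ss -- index of the first nucleotide of the 3'-exon
    All coordinates are hypothetical 0-based positions in pre_mrna.
    """
    if not 0 < five_ss < bp < three_ss < len(pre_mrna):
        raise ValueError("expected 5'-SS < BP < 3'-SS inside the transcript")
    if pre_mrna[bp] != "A":
        raise ValueError("the branch point must be an adenosine")

    # Step 1 (branching): the BP adenosine attacks the 5'-SS, releasing the
    # 5'-exon and leaving the intron joined to the 3'-exon.
    five_exon = pre_mrna[:five_ss]
    lariat_intermediate = pre_mrna[five_ss:]

    # Step 2 (exon ligation): the 5'-exon attacks the 3'-SS, joining the
    # exons and releasing the intron as a lariat.
    lariat_intron = pre_mrna[five_ss:three_ss]
    mrna = five_exon + pre_mrna[three_ss:]

    return SplicingProducts(five_exon, lariat_intermediate, mrna, lariat_intron)


# Toy transcript: exon / GUAUGU ... UACUAAC ... AG / exon
toy = "AAGG" + "GUAUGU" + "CCCC" + "UACUAAC" + "U" * 20 + "AG" + "GCUU"
products = splice(toy, five_ss=4, bp=19, three_ss=43)
print(products.mrna)  # AAGGGCUU
```

The lariat topology itself is not represented; only the linear sequences of the products are returned, which is enough to follow which segments end up in the mRNA and which in the excised intron.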
The above reactions are catalysed by a multi-subunit RNA–protein complex, known as the spliceosome. Among dozens of factors required for pre-mRNA splicing, five small nuclear ribonucleoprotein particles (snRNPs) are the primary building blocks of the spliceosome. Each of the five snRNPs (U1, U2, U4, U5, and U6) consists of an snRNA molecule, seven Sm proteins (or LSm proteins for U6), and other snRNP-specific factors. For each intron, they assemble de novo in a highly hierarchical manner [7,8], recruiting numerous protein factors, most notably the multi-subunit NineTeen (NTC) and Nineteen-Related (NTR) Complexes required for spliceosome activation. Proteomic analysis revealed a vast complexity of yeast and human spliceosomes [10–12]. To date, at least 10 different assembly intermediates of the spliceosome have been isolated (complexes E, A, pre-B, B, Bact, B*, C, C*, P, and ILS) using biochemical or genetic approaches (see previous reviews [13,14]).
The splicing cycle can be divided into four distinct phases: (1) substrate recognition and assembly; (2) activation; (3) catalysis; and (4) disassembly (Figure 2). The transition between these stages involves a remodelling of RNA–RNA and RNA–protein interactions, which is facilitated by eight conserved RNA helicases. These helicases additionally act as molecular timers, monitoring the reaction progress by so-called kinetic proofreading [16–18].
Schematic representation of splicing cycle.
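The ordered progression through the splicing cycle can be sketched as a simple lookup table. The Python example below lists the transitions between the isolated complexes together with the helicase that this review associates with each step in the sections that follow. The table layout, the function name and the final "disassembled" label are hypothetical. The linear chain is also a simplification: it ignores unproductive side paths such as tri-snRNP dissociation, and the role of Sub2 is described in the text only as a possibility.

```python
# (from_complex, to_complex, helicases named in this review for that step).
# Steps with an empty tuple are either recruitment events or the chemical
# steps themselves, for which no helicase is named in the text.
TRANSITIONS = [
    ("E", "A", ("Sub2", "Prp5")),         # U2 snRNP recruitment (Sub2 role tentative)
    ("A", "pre-B", ()),                   # tri-snRNP recruitment
    ("pre-B", "B", ("Prp28",)),           # 5'-SS handover from U1 to U6
    ("B", "Bact", ("Brr2",)),             # U4/U6 unwinding
    ("Bact", "B*", ("Prp2",)),            # release of SF3a/SF3b, branch helix docking
    ("B*", "C", ()),                      # branching (step 1 chemistry)
    ("C", "C*", ("Prp16",)),              # remodelling for exon ligation
    ("C*", "P", ()),                      # exon ligation (step 2 chemistry)
    ("P", "ILS", ("Prp22",)),             # mRNA release
    ("ILS", "disassembled", ("Prp43",)),  # lariat-intron release; label is hypothetical
]

# Linear order of states implied by the table above.
ORDER = [step[0] for step in TRANSITIONS] + [TRANSITIONS[-1][1]]


def helicases_between(start: str, end: str) -> list:
    """Return the helicases acting on the path from `start` to `end`."""
    i, j = ORDER.index(start), ORDER.index(end)
    if i > j:
        raise ValueError("`start` must come before `end` in the cycle")
    found = []
    for _, _, helicases in TRANSITIONS[i:j]:
        found.extend(helicases)
    return found


print(helicases_between("B", "C"))  # ['Brr2', 'Prp2']
print(len(helicases_between("E", "disassembled")))  # 8
```

Walking the full path from complex E to disassembly recovers eight helicases, matching the count given above.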
Similar to many other phosphoryl transfer reactions, pre-mRNA splicing relies on a two-metal-ion mechanism, as proposed by Steitz and Steitz. During pre-mRNA splicing, the ligands of the two catalytic magnesium ions are located exclusively in the RNA [20–22]. Therefore, the spliceosome can be regarded as a ribozyme, similar to group II introns, with which it shares evolutionary ancestry [23–25].
Owing to its complexity and highly dynamic nature, the structure of the spliceosome remained elusive until very recently. Early cryo-electron microscopy (cryo-EM) studies of snRNPs and spliceosomes provided the first glimpse into the overall shapes of these particles [26–29]. However, detailed architectural information, such as subunit localisation and their contacts, remained largely missing. This problem was partly overcome by the development of various labelling strategies, which allowed the approximate location of selected components within EM reconstructions [28,30,31]. However, the resolution limits imposed on these early EM studies encouraged the use of alternative methods (i.e. X-ray crystallography). Over the years, numerous insightful crystal structures have been solved [32–45], providing the first high-resolution insights into the design and working principles of the splicing machinery. Those structures have had a major impact on the interpretation of the newly emerging cryo-EM reconstructions.
Recent technological advances in cryo-EM [46,47], including the introduction of direct electron detectors and statistical image processing, paved the way to high-resolution reconstructions of dynamic spliceosomal complexes. In 2015, structures of the U4/U6.U5 tri-snRNP and the post-splicing ILS complex were reported, allowing a detailed interpretation and modelling of large splicing complexes for the first time. Since then, nearly two dozen structures have been reported (see previous reviews [50–52]), providing unprecedented insights into the mechanism of pre-mRNA splicing.
Substrate recognition and spliceosome assembly
5′-Splice site recognition
Correct substrate recognition is critical for splicing fidelity. In the yeast Saccharomyces cerevisiae, the 5′-SS and BP sequences are strictly conserved, with the consensus /GUAUGU and UACUAAC, respectively (‘/’ denotes the cleavage point; the BP adenosine is the final A of UACUAAC). The BP sequence is typically located 18–40 nt upstream of the conserved AG/ dinucleotide at the 3′-SS. The corresponding human sequences are more degenerate. During spliceosome assembly, the initial recognition of the 5′-SS occurs via base-pairing with the 5′-end of the U1 snRNA [54–56]. This interaction was first visualised in the structure of human U1 snRNP [36,57], which revealed a role for the U1C protein in a sequence-independent stabilisation of the U1 snRNA:5′-SS duplex.
The structure of yeast U1 snRNP revealed that its core architecture is very similar to that of human U1 snRNP, but it contains six additional proteins supported by a >300 nt-long yeast-specific stem loop 3 extension in U1 snRNA. Two of these proteins, Luc7 and Nam8, exhibit similarity to the known human alternative splicing factors LUC7-like and TIA-1, respectively. The 5′-SS and its complementary binding site in the U1 snRNA remain disordered in the isolated particles. Upon substrate binding and formation of complex A, the 5′-end of U1 becomes ordered, allowing the folding of previously disordered parts of Luc7 and U1C. This hints at a possible role for alternative splicing factors in substrate stabilisation.
Branch point and the 3′-splice site recognition
The branch point region is initially recognised by two proteins, Msl5 (mBBP/SF1 in human) [60,61] and Mud2 (U2AF65 in human), which bind to the BP and polypyrimidine tract of the intron. Together with the pre-bound U1 snRNP and cap-binding complex, they constitute the spliceosomal complex E, the earliest, ATP-independent assembly intermediate. In humans, U2AF35 additionally recognises the conserved AG dinucleotide at the 3′-SS [62–64]. The bridging interactions between the U1 snRNP and Msl5/Mud2 dimer provide a scaffold for the subsequent integration of U2 snRNP and the formation of complex A (pre-spliceosome).
In yeast, U2 snRNP recruitment and formation of spliceosomal complex A is mediated by two DEAD-box helicases: Sub2 and Prp5. Sub2 may be involved in the displacement of the Msl5/Mud2 dimer, which is necessary to make the branch region available for base-pairing with the BP-interacting stem-loop of the U2 snRNA. Prp5 likely triggers the conformational changes in the U2 snRNA that modulate its binding to the branch site region [67–69]. The U2 snRNP was visualised for the first time within complexes Bact and B [70–73], and can be divided into two distinct modules: the 5′-domain, comprising the 5′-end of the U2 snRNA together with the branch helix and the SF3b factor, and the 3′-domain, containing the SF3a complex, Sm ring, Lea1, Msl1, and the 3′-end of the U2 snRNA. This morphology is consistent with an early EM study of the U2 snRNP. The U2 snRNA:branch site duplex is in contact with the SF3b and SF3a factors. Overall, complex A exhibits a bipartite architecture, with only very few contacts between U1 and U2 snRNPs (Figure 3).
Initial assembly of the spliceosome.
Assembly of the pre-catalytic complex
Recognition of the 5′-SS and BP by U1 and U2 snRNPs is followed by recruitment of the U4/U6.U5 tri-snRNP, which leads to the formation of a fully assembled complex pre-B [76,77], containing all five snRNPs. Base-pairing interactions between the 5′-end of the U2 snRNA and 3′-end of the U6 snRNA (U2/U6 helix II) tether these two complexes together. The U4/U6.U5 tri-snRNP remains largely unchanged within complex pre-B when compared with the structure of the free particle [48,78,79]. The main body of the tri-snRNP is organised around Prp8, the largest and most highly conserved protein in the spliceosome. The two major parts of Prp8, the N-terminal and large domains, are connected via a flexible linker. The N-terminal domain firmly holds the U5 snRNA and binds the putative GTPase Snu114. The large domain, consisting of reverse transcriptase-like, linker and endonuclease-like domains (Figure 4), provides a platform for the binding of U4 snRNP and tethering of the Brr2 helicase to its U4/U6 snRNA substrate. The U5 snRNA loop 1 and U6 snRNA ACAGAGA stem-loop are located close together at the interface of the two domains of Prp8 in the so-called active-site cavity, which at later stages of splicing accommodates the RNA catalytic core and pre-mRNA substrate. Interactions between complex A components and tri-snRNP are weak in the pre-B structure, and the splice sites remain far apart and away from the active-site cavity. Stable incorporation of the tri-snRNP components into the spliceosome occurs in several steps during a process known as catalytic activation.
Domain architecture of Prp8 and its contacts with the RNA catalytic core of the spliceosome.
Spliceosome activation and formation of the active site
Complex pre-B contains a complete set of five spliceosomal snRNPs, but is not capable of performing catalysis for two main reasons: (1) the RNA-based active site is not yet formed and (2) the pre-mRNA substrates for branching are protected by the U1 and U2 snRNPs. The transition from complex pre-B to B initiates catalytic activation, during which the 5′-SS is transferred from the U1 snRNP to the ACAGAGA-box of the U6 snRNA [82,83], and the U1 snRNP is displaced from the complex. This process is ATP-dependent and requires the DEAD-box helicase Prp28, but the exact mechanism by which Prp28 promotes the transfer of the 5′-SS from U1 to U6 snRNA remains unclear. Interactions between the 5′-SS and ACAGAGA-box are necessary for stable incorporation of the tri-snRNP into newly formed complex B.
The structure of complex B [72,73] is largely similar to that of complex pre-B, although the displacement of U1 snRNP causes a substantial movement of the U2 snRNP and Brr2 helicase. Complex B-specific proteins (Spp381, Snu23, and Prp38) stabilise Brr2 and the ACAGAGA stem-loop of U6 snRNA [72,73] (Figure 3). The U2 snRNP binds to the tri-snRNP via two main interfaces: one between Brr2 and the LSm ring, and one formed by base-pairing interactions in the U2/U6 helix II. The branch helix is bound by the HEAT repeats of Hsh155 from the SF3b complex. After displacement of U1 snRNP, the 5′-SS is freed to interact with the ACAGAGA-box of the U6 snRNA. While this interaction has been observed in the human complex B, it is not fully formed in yeast complex B. This difference could reflect either a species-specific disparity or the different methods used for complex stalling (low magnesium concentration for the human complex vs. ATP depletion for the yeast complex).
The Brr2 helicase is loaded onto a single-stranded region of the U4 snRNA in complex B, ready to unwind the U4/U6 snRNA duplex [85,86] and promote the B to Bact transition (Figure 5). This major structural and compositional rearrangement includes the displacement of more than 20 proteins (U4 snRNP proteins: LSm, Prp6, Snu66, and Dib1) and recruitment of many factors including NTC, NTR, and RES complexes as well as Bact-specific proteins, Cwc24 and Cwc27. Single-molecule studies showed that spliceosome activation is irreversible and competes with an unproductive tri-snRNP dissociation. Unlike other transiently acting helicases, Brr2 is an integral component of the spliceosome, which imposes the necessity for a tight regulation to prevent premature unwinding of the U4/U6 snRNA duplex. Several different mechanisms of Brr2 inhibition have been proposed [42,89,90], but the temporal control of substrate loading and the signals that trigger helicase activity remain elusive.
Catalytic activation of the spliceosome.
Disruption of the U4/U6 snRNA duplex allows the single-stranded region of the U6 snRNA to fold back on itself and form an internal stem-loop (ISL) and base-pairing interactions with the U2 snRNA (helix Ia and Ib), consistent with earlier genetic experiments. These newly formed RNA structures include binding sites for the catalytic magnesium ions [20–22] and constitute the RNA catalytic core of the spliceosome.
The catalytic core is sandwiched between the N-terminal and large domains of Prp8 [40,49], which rotate ∼30° during the complex B to Bact transition to form the active-site cavity. The catalytic core is deeply embedded in Prp8 and further stabilised by the NTC components, in particular Cef1, Cwc15, and Clf1. In the complex Bact structure [70,71], the 5′-exon is bound to the U5 snRNA loop 1 [92,93], and further stabilised by Cwc21 and the remodelled Prp8 switch loop. The 5′-SS is base-paired with the ACAGAGA region of the U6 snRNA. The BP adenosine is sequestered away from the active site (∼50 Å) by the SF3b complex, and the 5′ scissile phosphate is protected by Cwc24. To create a complex competent for the first step of catalysis, the branch helix has to be docked into the active site to bring the necessary reagents into close proximity. This remodelling of complex Bact leads to the formation of a catalytically competent B* spliceosome.
Substrate positioning for the first step
The transition from complex Bact to catalytically competent complex B* is mediated by the DEAH-box helicase Prp2 (Figure 5). The location of Prp2 in yeast complex Bact [70,71] suggests that it could act from a distance by pulling on the intron downstream from the branch site, and triggering the displacement of SF3a and SF3b complexes together with Bact-specific factors, Cwc24 and Cwc27 [96,97]. This remodelling allows the docking of the branch helix into the active site (Figure 6) in a configuration that is further stabilised by the branching factors, Cwc25 and Yju2 [96,98,99]. In yeast, the BP region has a strictly conserved sequence, UACUAAC (the BP adenosine is the final A), which base-pairs with the U2 snRNA to form a helix with the BP adenosine bulged out. In complex C, the BP adenosine forms an unusual hydrogen bond with a uridine at position BP(−2) of the intron, helping to project its 2′-OH towards the 5′-SS [101,102]. In contrast with other spliceosome complexes, the geometry of the branch helix deviates from a canonical A-form RNA in complex C.
Remodelling of the spliceosome between the two catalytic steps.
The structure of complex B* is unknown, but it is expected to resemble the post-step 1 configuration found in complex C [101,102]. In this catalytic configuration, the 5′-exon is bound to the U5 snRNA loop 1, and the AUACAG region of U6 snRNA (residues 45–50) forms four Watson–Crick and two non-canonical base-pairs with the intron sequence downstream of the conserved GU dinucleotide. These interactions force a kinked geometry of the RNA backbone around the scissile phosphate, exposing it for nucleophilic attack from the branch point adenosine (Figure 7). Cleavage of the 5′-SS and formation of the 2′–5′ phosphodiester bond are hallmarks of complex C. The 5′-exon is base-paired with the conserved loop 1 of U5 snRNA in complex C, and its 3′-OH group remains very close to the 5′-SS, the binding site of the catalytic magnesium ions, and the newly formed lariat bond. This suggests that the complex was captured immediately after the first step of chemistry and that the reaction could be reverted with minimal structural alterations, consistent with previous biochemical studies.
Structure of pre-mRNA substrate within catalytic spliceosome.
The upstream region of the 5′-exon is guided inside the spliceosome, in a channel formed between the N-terminal and large domains of Prp8, both of which form stacking interactions with the −4 and −5 positions of the 5′-exon. Branching factors, Cwc25 and Yju2, together with Isy1, play a critical role in the positioning of the branch helix at the active site. The N-termini of these factors penetrate deep into the active site, forming multiple contacts with the region adjacent to the BP.
Remodelling for the second step
A wealth of experimental data [18,104–106] suggests that the spliceosome has to undergo remodelling in order to switch from a branching to an exon ligation configuration. The active site in complex C is sterically crowded, implying that the branch helix has to be repositioned to vacate the space necessary for 3′-exon incorporation. Consequently, the 3′-exon remains sensitive to nuclease digestion until after DEAH-box RNA helicase Prp16 action [107,108]. Prp16 interacts with the intron region downstream from the branch point (BP + 18), and in complex C the 3′-end of the intron projects from the active site towards Prp16. Translocation of Prp16 in the 3′–5′ direction would pull the branch helix out of its binding pocket, suggesting a plausible remodelling mechanism.
The structure of complex C* remodelled for exon ligation [109–112] indeed showed that the branch helix has been undocked from the active site, and that the branching factors Cwc25 and Isy1 have been displaced (Figure 6). This structural change is accompanied by a long-range movement of the NTC/NTR module. The RNaseH-like domain of Prp8 plays a crucial role in guiding the branch helix movement and stabilising it in a new position [52,113]. The exon ligation factors Prp18 and Slu7 [105,114] play an essential role in securing the RNaseH-like domain in its new position, suggesting that their role in maintaining a catalytic configuration is less direct when compared with branching factors. Despite active site remodelling, none of the complex C* structures reported to date allowed visualisation of the 3′-exon position docked at the active site prior to exon ligation, suggesting the transient nature of this state.
The most valuable insights into second step catalysis come from post-step 2 complex P structures [115–117]. In complex P, both exons are ligated together, but stay closely attached to the 5′- and 3′-splice sites, as a dominant-negative mutation in the DEAH-box helicase Prp22 [118,119] blocks mRNA release. The overall configuration of complex P is very similar to complex C*, with the main difference being the presence of the 3′-exon and 3′-SS at the active site. The relative positions of the 3′-exon and 3′-SS suggest that prior to exon ligation, the substrate adopts a highly kinked conformation (Figure 7), exposing the scissile phosphate for nucleophilic attack by the 3′-OH group of the 5′-exon. The 3′-SS is recognised via two non-canonical base-pairs: G–G between 5′-SS G(+1) and the Hoogsteen edge of 3′-SS G(−1); and A–A between Hoogsteen edges of the branch point adenosine and 3′-SS A(−2). The interaction between the first and last bases of the intron was previously proposed based on genetic experiments [120,121]. The 3′-SS is further stabilised by the Prp8 α finger (residues 1565–1610) and the conserved region of Prp18, which both bind the 3′-SS immediately upstream of the conserved AG dinucleotide. Similar to complex C*, the Prp8 RNaseH-like domain and its β finger, together with Slu7 and Prp18, guide the exact position of the branch helix.
Interestingly, 3D classification of the cryo-EM data shows that the docked 3′-exon configuration correlates with the presence of the C-terminal domain of the branching factor Yju2, suggesting that two distinct domains of Yju2 play critical roles at two different steps of splicing.
After exon ligation, the DEAH-box helicase Prp22 removes ligated mRNA from the spliceosome and promotes the transition from complex P to the intron-lariat spliceosome (ILS). The position of Prp22, as determined in complexes C* and P, suggests that it promotes remodelling from a distance by pulling on the mRNA downstream from the splice junction. These structural insights are consistent with previous biochemical data showing that spliceosome disassembly requires at least 13 ribonucleotides downstream from the 3′-SS. A similar mechanism has been proposed for the action of Prp2 and Prp16. The structure of the S. cerevisiae ILS complex revealed that removal of spliced mRNA is accompanied by a displacement of several factors (including Cwc21, Cwc22, Prp18, Slu7, Prp17 and Prp22) and further conformational changes. As a consequence, the Ntr1/Ntr2 complex binds in the region vacated after displacement of Prp22. Ntr1 was previously reported to act as a Prp43 activator [123,124] and is likely involved in its recruitment to the spliceosome. Indeed, the DEAH-box helicase Prp43 [125,126] is located in proximity to the disordered G-patch domain of Ntr1. Long helical repeats of Ntr1 form a bridge between the GTPase Snu114 and Prp43, which might be important for allosteric regulation of spliceosome disassembly. As Prp43 is positioned near the 3′-ends of the intron-lariat and U6 snRNA, it has been proposed that either of these might be a substrate for this helicase. An ATP-dependent activity of Prp43 leads to dissociation of the lariat-intron from the spliceosome and subsequent disassembly [127,128].
The ILS complex structure from S. cerevisiae revealed significant differences from the previously reported post-splicing ILS complex from Schizosaccharomyces pombe. The most striking difference is the presence of Ntr1, Ntr2, Cwc23 and Prp43 in S. cerevisiae, but not S. pombe. At the same time, the S. pombe complex contains Cwf19, a homologue of the debranching enzyme cofactor Drn1. Additionally, the global conformations of the two complexes show significant differences, in particular regarding the position of the U2 snRNP core. The functional relationship between these two different states remains unclear.
Dynamics and complexity of the spliceosome
Comparison of various spliceosome structures reported to date has revealed its extraordinary conformational dynamics. The most prominent is the complex B to Bact transition, which displaces more than two dozen proteins to allow the formation of the U2/U6 helix Ia/Ib and U6 snRNA ISL structures (i.e. RNA catalytic core). During this rearrangement, the N-terminal and large domains of Prp8 rotate with respect to each other (by ∼30°), forming the active-site cavity, which accommodates the RNA catalytic core. This movement brings U5 snRNA loop 1 close to the active site. Once formed, the active site remains essentially unchanged throughout the catalytic stages of splicing, except for the highly mobile branch helix (Figure 6).
Several regions of Prp8 exhibit dynamic changes during the splicing cycle. These include: (1) the RNaseH-like domain, which seems to guide precise movements of the branch helix during catalytic stages of the splicing cycle (see previous review, ); (2) the α finger (residues 1565–1610), which adopts multiple different conformations in various complexes and appears to be critical for stabilisation of the 3′-exon in complex P [115–117]; (3) the so-called switch loop, which toggles between two states upon 5′-exon binding (see previous review, ); and (4) the Jab1/MPN domain, which tightly binds the Brr2 helicase [41,42]; both move dramatically during catalytic activation and become completely disordered past the complex C stage.
The U2 snRNA has been shown to exist in two mutually exclusive conformations: stem IIc/IIb and stem IIa/IIb [66,131], and genetic evidence suggests that U2 toggles between these two states. While the catalytic stages of splicing require the IIc/IIb configuration, the IIa/IIb configuration promotes substrate binding and release. Indeed, both configurations have been observed in cryo-EM structures [72,73,101,102].
One striking feature of the spliceosome structure is the presence of extended α-helical repeat structures in Clf1 and Syf1 (NTC complex), and Ntr1. These repeats connect distant regions in the complex and likely play a role in its allosteric regulation. During the transition from complex C to C*, the undocking of the branch helix from the active site is accompanied by an orchestrated movement of the entire NTC module together with the U2 snRNP core domain, which are tethered to the complex via Syf1 helical repeats.
Complexity of yeast and human spliceosomes
Comparative proteomic analysis of yeast and human spliceosomes revealed that they share an evolutionarily conserved core. Accordingly, a phylogenetic analysis shows that the spliceosomes traced back to the last eukaryotic common ancestor had nearly unchanged complexity. Human spliceosomes tend to co-purify with a significantly larger number of factors, suggesting a more elaborate composition [10,11]. However, a semi-quantitative proteomic analysis of human spliceosomes showed that many of the factors are present in sub-stoichiometric quantities, and that their true composition might be simpler than previously expected. Indeed, comparison of human and yeast splicing complexes B and C shows a remarkable conservation of their core structure, with only a few additional subunits present in the human system (Figure 8).
Evolutionarily conserved core of the spliceosome.
In the human complex B, the most significant differences compared with the yeast complex are the three additional B-complex-specific proteins: RED, Smu1 and FBP21. Their location around the ACAGAGA:5′-SS helix, and their contacts with the Brr2 helicase, suggest that they may play a regulatory role in the activation process (Figure 8).
In the structure of the human complex C, several additional subunits were visualised when compared with the yeast complex [101,102]. These subunits include components of the exon junction complex [135–137], the Aquarius helicase, and four peptidyl prolyl isomerases (PPIs) (Figure 8). The same differences have been identified between human [111,112] and yeast [109,110] complex C*. Interestingly, the Brr2 helicase could be visualised in human complex C*, while remaining disordered in yeast.
The most striking difference between yeast and human spliceosomes relates to U1 snRNP. The human U1 snRNP [36,37,57] exhibits a much lower complexity than its yeast counterpart, which contains six additional proteins. Two of these proteins, Luc7 and Nam8, are homologues of known human alternative splicing factors.
The advent of cryo-EM unveiled a large number of high-resolution structures of splicing complexes, enormously advancing our mechanistic understanding of the basic splicing process in yeast. Nevertheless, there remain many challenges and new opportunities to explore.
Most of the currently available structures are restricted to states that can be captured using currently available biochemical methods, or the most stable steady states captured directly from cells. Given the magnitude of conformational and compositional changes during some spliceosome assembly transitions (for example, catalytic activation), it is likely that more intermediate states are yet to be discovered. Further dissection of the assembly process will provide a more detailed picture on the order and dynamics of these events.
Proteomic analyses of samples used for cryo-EM studies often show that not all the proteins present in sample preparations can be visualised in cryo-EM reconstructions. It remains unclear whether those components are truly disordered, or if their binding is just very transient and unstable. Visualisation of these loosely bound factors and transient states remains challenging and will surely attract more attention in the future.
The spliceosome is a very dynamic complex with a multitude of distinct conformational states present even in biochemically clean preparations. Dealing with continuous conformational changes remains one of the biggest challenges for cryo-EM single particle analysis, but some promising attempts to deal with this problem are being undertaken [139–141]. A comprehensive analysis of the population of all states present in a given sample, together with their relative occupancies, may allow a much better understanding of the thermodynamic properties of the system. Complementary data on the kinetics of the spliceosome assembly are now being provided by ever more complex single molecule experiments [8,18,99,142–144].
Alternative splicing is a major source of diversity in the human proteome [145,146]. In human cells, pre-spliceosomes can assemble across exons, forming the so-called exon definition complex, as opposed to the intron definition complex, which assembles across introns in lower eukaryotes. For productive splicing, the cross-exon complexes have to be converted into cross-intron complexes, but the mechanism of this process remains elusive. Structural characterisation of early spliceosomes with trans-acting and tissue-specific alternative splicing factors would provide valuable insights into this process. This, together with the structural characterisation of small molecule inhibitors/modulators of the spliceosome [148–151] (reviewed in ref. ), could pave the way towards potential medical applications.
On the technological front, it is anticipated that cryo-electron tomography (cryo-ET) imaging will make it feasible to study splicing complexes in situ. This relates to the development of focused ion beam milling methods and the Volta phase plate [153,154], especially when combined with improved sub-tomogram averaging algorithms. Visualisation of the splicing process in its native cellular environment remains one of the main goals for future structural studies of the spliceosome.
I thank Daniel Peter, Andrew McCarthy, Moritz Pfleiderer, Clemens Plaschka, and Andy Newman for critical comments on the manuscript. I am grateful to Kiyoshi Nagai and Andy Newman for their long-term support.
The Author declares that there are no competing interests associated with this manuscript.