In recent years, a dynamic view of the structure and function of biological macromolecules has emerged, highlighting the essential role of dynamic conformational equilibria in understanding the molecular mechanisms of biological functions. The structure of a biomolecule, i.e. a protein or nucleic acid in solution, is often best described as a dynamic ensemble of conformations rather than a single structural state. Strikingly, the molecular interactions and functions of a biological macromolecule can then involve a shift between conformations that pre-exist in such an ensemble. In response to external cues, such population shifts of pre-existing conformations allow the signal to be relayed gradually to downstream biological events. An inherent feature of this principle is conformational dynamics, where intrinsically disordered regions often play important roles in modulating the conformational ensemble. Solution-state NMR spectroscopy is a powerful technique to study the structure and dynamics of such biomolecules in solution. NMR is increasingly combined with complementary techniques, including fluorescence spectroscopy and small-angle scattering. The combination of these techniques provides complementary information about conformation and dynamics in solution and thus affords a comprehensive description of biomolecular function and regulation. Here, we illustrate how an integrated approach combining complementary techniques can assess the structure and dynamics of proteins and protein complexes in solution.
Structural biology and conformational dynamics of biomolecules
Structural biology aims to provide a detailed high-resolution explanation of the molecular mechanisms that underlie biological functions. The regulation of biomolecular function often involves complex networks of biological molecules whose interactions can be modulated by modifications. Traditional structural biology methods generally focus on the structural details of well-defined compact domains, proteins, and tightly bound complexes. In recent years, advances in structural biology have opened new ground, on the one hand for determining atomic-resolution structures of large complexes using cryo-electron microscopy (cryo-EM) and on the other hand for assessing dynamic conformational states and mechanisms using integrative approaches that combine solution techniques.
Recent breakthroughs in cryo-EM push the limit of structural biology to enter the new era of supramolecular structures of very large machineries approaching atomic resolution. Structures of spliceosomes, reflecting different states of the spliceosomal assembly, have been revealed in the last couple of years, showing unprecedented structural details on these highly complex ribonucleoprotein complexes [1–3]. Yet, the higher-resolution structural information is usually limited to the rigid structural core of these mega complexes, leaving molecular details of essential transient and flexible regulatory components still to be uncovered. The development of free electron lasers enables the investigation of fast time-scale dynamics and time-resolved crystallography with great potential for future applications [4,5].
On the other hand, unraveling the functional importance of dynamic conformational ensembles is another key area in structural biology that has advanced greatly in recent years. It is now recognized that protein structures are better described by dynamically interconverting conformations in a shallow free-energy landscape [6–8]. Regulatory processes are not necessarily controlled by a two-state on/off switch, but can instead function like a rheostat with tunable population shifts of conformations, thereby allowing a gradual modulation of interactions for biological functions. Such examples have been seen in many biological pathways, such as signal transduction (MAP kinase cascades [9,10]), transcriptional gene regulation (DNA binding of transcription factors [11–13]), splicing regulation, and client interactions of protein chaperones, to name a few. An inherent feature is that biological macromolecules adopt an ensemble of conformations rather than a single structure. Population shifts, i.e. a redistribution of the populations of pre-existing conformations in this ensemble, are triggered by external cues, such as ligand binding or posttranslational modifications. Recent studies on three-state conformational regulation of riboswitches or transient Watson–Crick/Hoogsteen base pairing in the genetic code demonstrate unexpected roles of conformational dynamics in the regulation of essential biological functions. The predominance of dynamic features in biomolecular mechanisms emphasizes the importance of employing complementary solution-based structural biology techniques that study biomolecules in a more native-like state. Recent examples include the combination of NMR and EPR to study the structure of the non-coding RNA RsmZ bound to RsmE proteins and the structural analysis of a box C/D snoRNP complex.
Here, we discuss integrated approaches to study the structure and dynamics of multidomain RNA-binding proteins and complexes by solution-based structural biology methods. We illustrate how different solution techniques, such as NMR, small-angle scattering (SAS), and Förster resonance energy transfer (FRET), provide unique and complementary information. Efficient computational methods are then used to integrate the various experimental data and obtain a comprehensive description of the structure and dynamics in solution (Figure 1).
Figure 1. Integrative structural biology in solution.
Solution NMR spectroscopy
Solution NMR spectroscopy provides structural information at atomic to residue resolution for studying biomolecules in solution. A great variety of powerful NMR methods offer tools to examine biological macromolecules and how their structure and dynamics are affected by cues, such as changes in environment (pH, temperature, buffer), ligand binding (including weak interactions with dissociation constants, KD, in the micro- to millimolar range), and modification (posttranslational modification, mutagenesis, or sample degradation). Most importantly, NMR can assess and quantitatively characterize conformational dynamics at timescales from nanoseconds to hours. Solution NMR can also be readily employed to optimize the experimental conditions for other structural techniques. Commonly used protein NMR spectra are 1H-15N correlations for smaller systems (<30–50 kDa molecular mass) [21,22] or, for high-molecular-mass complexes, 1H-13C methyl correlations in highly deuterated proteins with selective protonation and 13C-labeling of methyl groups (Figure 2). These two-dimensional correlations are fingerprints of a protein and provide efficient readouts to monitor changes in conformation and dynamics. Amide and methyl signals are advantageous for monitoring the protein structure as they are well distributed in the protein (Figure 2) and are sensitive probes of changes in their chemical environment. The NMR frequencies of these groups provide residue-specific readouts of structural and dynamic parameters in experiments of high sensitivity. Most commonly, chemical shift perturbations (CSPs) are analyzed to identify the binding interface of molecular interactions by comparing chemical shifts before and after adding a non-isotopically labeled ligand.
CSP analyses are also useful to identify structural domain/protein boundaries (construct optimization) and to screen sites for attaching paramagnetic tags or fluorescent labels for subsequent experiments (NMR and FRET) by determining the impact of the tag or label on the structural integrity of the protein.
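As an illustration of the CSP analysis described above, the following minimal sketch combines 1H and 15N shift changes into a single per-residue value and flags residues above a cutoff. The nitrogen weighting factor (0.14), the cutoff rule (mean plus one standard deviation), the function names, and the shift values are assumptions chosen for illustration and are not taken from the studies discussed here.

```python
import math

def combined_csp(d_h_ppm, d_n_ppm, n_weight=0.14):
    """Combined 1H/15N chemical shift perturbation in ppm.

    n_weight scales the 15N shift change to the 1H scale; 0.14 is an
    assumed, commonly used value, not one prescribed by the text.
    """
    return math.sqrt(d_h_ppm ** 2 + (n_weight * d_n_ppm) ** 2)

def perturbed_residues(free, bound, n_weight=0.14, n_sigma=1.0):
    """Return per-residue CSPs and the residues above mean + n_sigma * SD.

    free and bound map residue number -> (1H shift, 15N shift) in ppm.
    """
    shared = sorted(set(free) & set(bound))
    csp = {
        res: combined_csp(bound[res][0] - free[res][0],
                          bound[res][1] - free[res][1],
                          n_weight)
        for res in shared
    }
    values = list(csp.values())
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    cutoff = mean + n_sigma * sd
    return csp, [res for res in shared if csp[res] > cutoff]

# Hypothetical amide shifts (1H ppm, 15N ppm) before and after ligand addition.
free_shifts = {10: (8.20, 120.0), 11: (7.95, 118.5), 12: (8.50, 122.3),
               13: (8.10, 115.0), 14: (7.70, 125.2)}
bound_shifts = {10: (8.21, 120.1), 11: (8.15, 119.9), 12: (8.50, 122.3),
                13: (8.11, 115.0), 14: (7.70, 125.3)}

csp_values, hits = perturbed_residues(free_shifts, bound_shifts)
```

In this toy data set only residue 11 shifts substantially, so it is the only residue flagged. The same comparison applied to a tagged versus untagged protein gives a quick check of whether a label perturbs the fold.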
Figure 2. NMR fingerprint spectra for amides and methyls.
Structure determination by conventional NMR methods relies on NOE-based distance restraints, which reflect the dense atomic interaction network within a compact protein fold or a tightly bound complex. Obtaining such information becomes challenging with increasing molecular mass, as signal overlap and line broadening reduce the sensitivity of the experiment. However, when studying multidomain proteins or multimeric protein complexes, high-resolution structures of individual domains obtained using standard NMR or crystallography approaches can be used as a starting point to define the multidomain or multisubunit arrangement. Semi-rigid-body structure calculation protocols that include appropriate experimental restraints can efficiently report on long-range structural arrangements in the full-length context or a holo complex [25,26].
Defining relative orientations of domains and subunits from residual dipolar couplings
RDC (residual dipolar coupling) data provide information about the orientation of bond vectors and can thereby be used to define the relative orientation of rigid domains or subunits in a protein or protein complex (Figure 1A). RDC-based orientational restraints are useful for refining protein structures determined by conventional NMR methods or for validating whether a crystal structure represents the solution conformation. However, the utility of RDCs becomes more prominent for larger complexes. Here, RDC data can provide important restraints for defining the relative orientations of rigid domains or subunits within the complex based on knowledge of the structures of the individual domains or subunits. Note that in the presence of conformational dynamics and for flexible regions, RDCs are dynamically averaged. In this case, they cannot be used directly as orientational restraints. However, in combination with computational simulations, they can help define an ensemble of conformations in solution.
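The orientation dependence of RDCs, and their averaging in the presence of dynamics, can be sketched with the standard expression D = Da[(3cos²θ − 1) + (3/2)R sin²θ cos2φ], where θ and φ describe the bond vector in the principal frame of the alignment tensor. The tensor magnitude Da, the rhombicity R, and the bond vectors below are hypothetical values, and the sketch assumes that all conformers share a single alignment tensor, which is a simplification for flexible multidomain systems.

```python
import math

def rdc(bond_vector, d_a, rhombicity):
    """RDC (Hz) for a bond vector given in the alignment-tensor principal frame.

    Uses D = Da * [(3 cos^2(theta) - 1) + 1.5 * R * sin^2(theta) * cos(2 phi)],
    written in Cartesian form: cos^2(theta) = z^2 and
    sin^2(theta) * cos(2 phi) = x^2 - y^2 for a unit vector (x, y, z).
    """
    x, y, z = bond_vector
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    return d_a * ((3.0 * z * z - 1.0) + 1.5 * rhombicity * (x * x - y * y))

def averaged_rdc(bond_vectors, populations, d_a, rhombicity):
    """Population-weighted RDC for a bond vector that samples several orientations."""
    total = sum(populations)
    return sum(p / total * rdc(v, d_a, rhombicity)
               for v, p in zip(bond_vectors, populations))

# Hypothetical alignment tensor: Da = 10 Hz, rhombicity R = 0.3.
d_along_z = rdc((0.0, 0.0, 1.0), 10.0, 0.3)
d_along_x = rdc((1.0, 0.0, 0.0), 10.0, 0.3)
d_along_y = rdc((0.0, 1.0, 0.0), 10.0, 0.3)

# A bond vector that spends half of its time along z and half along x.
d_dynamic = averaged_rdc([(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)], [0.5, 0.5], 10.0, 0.3)
```

The averaged value matches neither of the two underlying orientations, which is why dynamically averaged RDCs cannot be interpreted as a single orientational restraint.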
Long-range distance restraints from paramagnetic effects
The presence of a paramagnetic center in a protein enables the sensitive detection of long-range distance and/or orientation information [24,29–32] (Figure 1A). This is especially useful for studying multidomain proteins or multimeric complexes, where obtaining NOE-based distances is often challenging. Paramagnetic effects from unpaired electrons give rise to paramagnetic relaxation enhancements (PREs), pseudo-contact shifts (PCS), or RDCs from magnetic alignment [24,29–32]. PRE and PCS data provide long-range (15–60 Å) distance-dependent information between the paramagnetic center and a given nuclear spin and thus yield long-range distance restraints to study domain arrangements in multidomain proteins and protein complexes. Electron-nuclear spin interactions are rather strong, and thus paramagnetic effects are very sensitive for detecting lowly populated states and transient interactions in conformational ensembles of dynamic systems [25,29–31,34]. Note that in the presence of such dynamic and transient interactions, great caution has to be taken when deriving structural restraints from PRE and PCS data, as these will be dynamically averaged and therefore cannot easily be converted into distance and orientational restraints.
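The sensitivity of PREs to lowly populated states follows from the steep distance dependence of the effect, which scales with r⁻⁶. The short sketch below, assuming fast exchange between states and using hypothetical distances and populations, computes the apparent distance that would be derived from a population-averaged PRE.

```python
def effective_pre_distance(distances, populations):
    """Apparent distance derived from a population-averaged PRE.

    Assumes fast exchange, so that the observed PRE is proportional to the
    population-weighted average of r^-6 over all states.
    """
    total = sum(populations)
    mean_inv_r6 = sum(p / total * r ** -6 for r, p in zip(distances, populations))
    return mean_inv_r6 ** (-1.0 / 6.0)

# Hypothetical two-state system: 95% of molecules place the nuclear spin 30 A
# from the spin label, 5% bring it within 10 A.
r_apparent = effective_pre_distance([30.0, 10.0], [0.95, 0.05])
r_linear_mean = 0.95 * 30.0 + 0.05 * 10.0
```

Here the apparent distance is close to 16 Å, far shorter than the linear population average of 29 Å, because the 5% close state dominates the averaged PRE. This is the reason such data reveal minor states, and also the reason they should not be converted naively into a single distance restraint.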
As most proteins are not paramagnetic, a paramagnetic center usually has to be introduced [29,31,35] (Figure 1A). For paramagnetic tagging of proteins, one common strategy exploits the reactive side chain of cysteine residues. In this case, a single cysteine is engineered at a strategic site to provide structural information based on paramagnetic effects on surrounding nuclear spins, while not perturbing the protein structure or function. Typically, cysteine residues are introduced at solvent-accessible regions of a protein to enable conjugation of a paramagnetic tag without interfering with its structure. Commonly used tags are nitroxyl spin labels (which yield only PREs) or lanthanide-binding tags (LBTs), such as DOTA-M8, which give access to PREs, PCS, and RDCs. To minimize the flexibility of the tag, it is preferable to introduce the cysteine in a rigid region of the protein (avoiding flexible linkers) while preserving the structural integrity of the protein. Note that reactive native cysteine residues have to be removed to avoid unwanted ligation with the paramagnetic tag. The effects of the cysteine mutation and of the attachment of the spin label on the protein structure can readily be assessed by analyzing the chemical shifts of amide and methyl groups using NMR correlation spectra (Figure 2). Although many different spin labels are available for introducing a paramagnetic center, nitroxide radicals, such as MTSL and IPSL, are often used for conjugation with a cysteine side chain via disulfide and thioether bonds, respectively. Especially for the measurement of PCS data, a rigid attachment of the paramagnetic center is important. Here, doubly conjugated LBTs (CLaNPs) have been shown to provide useful restraints. The structural restraints can be used to refine NOE-based structures, but are most efficient for defining domain and subunit arrangements and for assessing the conformational space in solution.
Small-angle scattering
SAS techniques provide information about the shape of a molecule in solution (Figure 1B,C) and the existence of dynamic equilibria. Although NMR relaxation or diffusion experiments can also be used to estimate the size of a molecule in solution, their structural interpretation is difficult. SAS measures the scattering of X-rays (SAXS) or neutrons (SANS) by a sample in solution. The Fourier transform of the scattering curve yields the pair-wise distance distribution of the atoms in the molecule, which directly reflects the molecular shape in solution. In the presence of conformational dynamics, the pair-wise distance distribution function reflects the shape of the conformational ensemble as a population-weighted average. Although SAS data only provide lower-resolution information on the global shape (Figure 1B), SANS experiments can observe individual components, such as protein subunits and RNA, in a complex based on contrast matching. For this, samples with different combinations of deuterated and non-deuterated subunits are used, and individual scattering contributions are matched out depending on the amount of D2O in the solvent (42% for protein and 70% for RNA contrast matching) (Figure 1C). This allows us to localize the protein subunits and RNA components within a complex. Below, we provide an example of extending the resolution of SANS to the domain level by differential labeling of protein domains (deuterated or protonated) based on protein ligation using the Sortase A enzyme.
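The relation between interatomic distances and the scattering curve, and its population-weighted averaging over an ensemble, can be illustrated with the Debye formula, I(q) = Σᵢ Σⱼ fᵢ fⱼ sin(q rᵢⱼ)/(q rᵢⱼ). The sketch below is a deliberately crude model: each domain is reduced to a single point scatterer with unit form factor, and the coordinates and populations are hypothetical. Realistic SAXS or SANS calculations additionally require atomic form factors, solvent contrast, and a hydration layer.

```python
import math

def debye_intensity(coords, q):
    """Debye scattering intensity for identical point scatterers (unit form factor)."""
    total = 0.0
    for a in coords:
        for b in coords:
            x = q * math.dist(a, b)
            # sin(x)/x tends to 1 as x approaches 0 (self terms and q = 0).
            total += 1.0 if x == 0.0 else math.sin(x) / x
    return total

def ensemble_intensity(conformers, populations, q):
    """Population-weighted average intensity over an ensemble of conformers."""
    total = sum(populations)
    return sum(p / total * debye_intensity(c, q)
               for c, p in zip(conformers, populations))

# Two hypothetical two-bead models (coordinates in Angstrom): domains close
# together (compact) or far apart (extended).
compact = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (60.0, 0.0, 0.0)]

q = 0.05  # momentum transfer in 1/Angstrom
i_compact = debye_intensity(compact, q)
i_extended = debye_intensity(extended, q)
i_mixture = ensemble_intensity([compact, extended], [0.5, 0.5], q)
```

The intensity of the mixture lies between those of the two pure states, so a measured curve constrains the populations of compact and extended arrangements but cannot, on its own, identify the individual conformers.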
Single-molecule FRET for studying conformational dynamics
Single-molecule fluorescence techniques provide a powerful tool to monitor and analyze the molecular behavior of proteins and nucleic acids both in vitro and in vivo. In single-pair FRET (spFRET), energy transfer is observed between two fluorophores, referred to as donor and acceptor, that are attached to one or more biomolecules. Under appropriate conditions, the efficiency of this transfer (FRET efficiency) depends on the distance between the two fluorophores [38–40]. We will discuss an example of the application of spFRET combined with confocal microscopy, where donor lifetime information is obtained upon donor excitation by a laser source. Correlation of FRET efficiency and donor lifetime can directly identify the presence of conformational dynamics. FRET can assess the populations and dynamics of different states and thus report on dynamic equilibria of interconverting conformations (Figure 1D,E). For more details, we refer to recent reviews [38,40,41].
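The way in which the correlation of FRET efficiency and donor lifetime reveals dynamics can be sketched as follows. For a single static distance, the efficiency follows E = 1/(1 + (R/R0)⁶) and lies on the static line E = 1 − τDA/τD. If a molecule interconverts between two states while it is observed, the intensity-based efficiency is the species average, whereas the measured lifetime is weighted by the number of photons emitted from each state, so the data point moves off the static line. The efficiencies, the populations, and the donor-only lifetime of 4 ns are hypothetical, and dye linker motions are neglected.

```python
def fret_efficiency(distance, r0):
    """FRET efficiency for a donor-acceptor distance and Foerster radius r0."""
    return 1.0 / (1.0 + (distance / r0) ** 6)

def static_fret_line(tau_da, tau_d):
    """Efficiency expected for a single static state with donor lifetime tau_da."""
    return 1.0 - tau_da / tau_d

def dynamic_point(e1, e2, p1, tau_d):
    """Observed (lifetime, efficiency) for exchange between two states.

    The efficiency is the species-weighted average. The lifetime is the
    fluorescence-weighted average, in which each state contributes in
    proportion to the donor photons it emits (population times lifetime).
    """
    p2 = 1.0 - p1
    tau1 = tau_d * (1.0 - e1)
    tau2 = tau_d * (1.0 - e2)
    e_avg = p1 * e1 + p2 * e2
    tau_fluor = (p1 * tau1 ** 2 + p2 * tau2 ** 2) / (p1 * tau1 + p2 * tau2)
    return tau_fluor, e_avg

# Hypothetical equal mixture of a high-FRET (E = 0.8) and a low-FRET (E = 0.2)
# state, with a donor-only lifetime of 4 ns.
tau_obs, e_obs = dynamic_point(0.8, 0.2, 0.5, 4.0)
deviation = e_obs - static_fret_line(tau_obs, 4.0)
```

A positive deviation from the static line is the signature of interconversion on the timescale of the measurement; a pure state (p1 = 1) falls exactly on the line.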
Example: conformational dynamics of the multidomain RNA-binding protein U2AF in solution
In the following, we illustrate the utility of integrative solution structural biology for studying the structure and dynamics of multidomain proteins, using RNA recognition by the essential splicing factor U2AF (U2 auxiliary factor) as an example [42–44]. U2AF is a heterodimer of multidomain RNA-binding proteins and is essential for the recognition of the 3′ splice site in the early stage of splicing (Figure 3A). The large subunit U2AF65 (U2AF2) is composed of an N-terminal RS domain (BPS interaction), followed by two RRM domains (RRM1,2) that bind the polypyrimidine (Py) tract, and a C-terminal UHM domain that mediates binding to splicing factor 1 (SF1). The short ULM stretch located C-terminal to the RS domain interacts with the UHM domain of the small subunit U2AF35 (U2AF1) to form the heterodimeric complex. A recent study on the fission yeast U2AF small subunit has shown that the two zinc fingers flanking the UHM domain N- and C-terminally recognize the AG dinucleotide at the 3′ splice site (Figure 3A). Extensive efforts have been made to understand the recognition of the naturally variant Py tracts by U2AF65 RRM1,2 [14,46–51].
Figure 3. Conformational dynamics of U2AF65 RRM1,2 (Phases 1 and 2).
Figure 4. Conformational dynamics of U2AF65 RRM1,2 in U2AF heterodimer (Phase 3).
(1) Discovery of the functionally important conformational dynamics of U2AF65 RRM1,2 and determining NMR-based structures of the free and RNA-bound states (combining PRE and RDC) (Figure 3).
(2) Determining the relative conformational space of RRM1 and RRM2 using ensemble analysis (combining PRE/RDC and SAXS) (Figure 3).
(3) Quantitative analysis of population shifts of RRM1,2 upon RNA binding and the role of U2AF35 (combining NMR and spFRET) (Figure 4).
Py-tract recognition by U2AF65 is an essential step in the formation of the early complex E. The function of the E complex is to recognize the 3′ splice site near intron/exon junctions in pre-mRNAs to ultimately promote spliceosome assembly and splicing of the intron. The efficiency of E complex formation is controlled by U2AF. The structural mechanisms that underlie Py-tract RNA recognition by U2AF65 RRM1,2 highlight the importance of population shifts between dynamic arrangements of the tandem RNA-binding domains, which provide quantitative control of the assembly of the spliceosome onto the pre-mRNA. A gradual population shift between the two conformational states of RRM1,2 relates changes in overall binding affinity to a quantitative shift between closed and open domain arrangements. This adds a level of regulation and fidelity by preventing unwanted spliceosome assembly, while tolerating natural variations in Py-tract sequences that exhibit different lengths and binding affinities. The population shift from closed to open arrangements also allows for additional layers of regulation. This is illustrated by the role of the small subunit U2AF35, which is able to pre-shift the conformational equilibrium of U2AF65 RRM1,2 to enhance spliceosome assembly and thus splicing of weak introns in specific contexts.
Phase 1: conformational dynamics of U2AF65 RRM1,2 domains
Mackereth et al. discovered that, in solution, the tandem RRM domains of U2AF65 exist in a dynamic equilibrium of two conformational states, ‘closed’ and ‘open’, corresponding to the structures of free RRM1,2 and RNA-bound RRM1,2, respectively (Figure 3B). By employing PRE and RDC methods (Figure 1A), they showed that the dynamic equilibrium of the two conformations allows gradual population shifts that function like a rheostat and enable the recognition of diverse natural Py sequences with different binding strengths. The functional significance of these shifts in domain arrangement lies in the fact that they directly scale the efficiency of spliceosome assembly and thus splicing of the corresponding intron.
In the study, PRE measurements played a pivotal role in identifying the presence of lowly populated ‘open’ conformations in the ‘closed’ conformational state. For PRE measurements [14,26], spin labels were introduced individually at 10 different cysteine sites in RRM1,2. The cysteine sites were engineered strategically to distinguish the ‘open’ and ‘closed’ conformations and to monitor the conformational dynamics of RRM1,2. Distinct interdomain PRE patterns were observed for free and RNA-bound RRM1,2, clearly reflecting the large conformational rearrangement seen in the structures of the RRM1,2 domains upon binding to RNA (Figure 3B,D). More interestingly, further analysis of the PRE patterns showed residual PRE imprints corresponding to the RNA-bound ‘open’ conformation in the PRE pattern of the free ‘closed’ state, already in the absence of any RNA ligand. This indicates the presence of lowly populated RNA-bound-like conformations in free RRM1,2 (Figure 3B), thereby suggesting conformational selection of pre-existing bound conformations as a binding mechanism. RDC data [14,26] from two different alignment media provided structural restraints for defining the relative orientations of the domains in the free and RNA-bound states. PRE-based distance restraints and RDC-based orientational restraints were combined in a restrained molecular dynamics calculation to determine the structures of RRM1,2, starting from the previously known structures of the individual domains. Using PRE- and RDC-assisted rigid-body modeling, two converged lowest-energy structures were determined for the free and RNA-bound RRM1,2 [14,26] (Figure 3B,D).
In this study, NMR RDC and PRE data revealed the presence of a dynamic equilibrium of RRM1,2 domain arrangements in U2AF65 and enabled their structure determination, where the typical NOE-based distance measurements had proved challenging. PRE data enabled detecting the conformational equilibrium of the two states. Further introduction of PRE-based distance and RDC-based orientation restraints allowed us to determine the structures of free and RNA-bound RRM1,2.
Phase 2: combining NMR and SAXS with unbiased ensemble analysis
In phase 1, the RDC and PRE data were used to determine rigid structures of the open and closed conformations of RRM1,2. However, SAXS data indicated that the unbound RRM1,2 domains sample a large conformational space, in which the two domains can also diffuse relative to each other within a range limited by the 30-residue flexible linker connecting them. The NMR and SAXS data were therefore combined to define the dynamic ensemble of conformations of free RRM1,2 that exists in solution.
Huang et al. combined experimental and computational methods to assess the conformational space of the two RRM domains in solution. The unbiased approach used conformational sampling together with experimental data from NMR (the previously used RDC and PRE data) and SAXS measurements. First, a large pool of stochastically generated domain arrangements of RRM1 and RRM2 was created, limited only by the length of the RRM1,2 domain linker. Subsequently, an ensemble of conformers was selected from the initial pool to accurately reproduce the experimental PRE, RDC, and SAXS data (Figure 1F). The ensemble was selected by cross-validation against the experimental data to avoid overfitting. The final ensemble of conformers showed an anisotropic distribution of RRM1,2 domain arrangements (Figure 3C), where a large population of the conformers in the ensemble retains the two domains in proximity to each other, resembling the ‘closed’ conformation. Notably, a significant fraction of molecules exhibits detached domain arrangements, with the flexible linker enabling the two domains to sample a large conformational space by rotational diffusion (Figure 3C). The interdomain contacts of RRM1,2 appear to be mainly electrostatically driven, as the interdomain PRE weakens with higher salt concentration.
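The principle of selecting an ensemble from a pre-generated pool can be sketched in a few lines. The example below is not the algorithm used in the study: it is a minimal illustration that assumes equally weighted conformers, observables that average linearly over the ensemble (for example PRE rates or scattering intensities), and a pool small enough to be searched exhaustively. All names and numbers are hypothetical.

```python
from itertools import combinations

def chi2(predicted, observed, sigma):
    """Mean squared deviation between prediction and experiment, in units of sigma."""
    return sum(((p - o) / s) ** 2
               for p, o, s in zip(predicted, observed, sigma)) / len(observed)

def ensemble_average(pool, members):
    """Average each observable over the selected, equally weighted conformers."""
    n_obs = len(pool[members[0]])
    return [sum(pool[m][k] for m in members) / len(members) for k in range(n_obs)]

def select_ensemble(pool, observed, sigma, size):
    """Exhaustively pick the sub-ensemble of a given size that best fits the data."""
    best = min(combinations(range(len(pool)), size),
               key=lambda members: chi2(ensemble_average(pool, members),
                                        observed, sigma))
    return list(best), chi2(ensemble_average(pool, best), observed, sigma)

# Hypothetical pool: each row holds the observables back-calculated for one conformer.
pool = [
    [1.0, 0.0, 2.0],
    [0.5, 0.5, 1.0],
    [0.0, 1.0, 0.0],
    [3.0, 2.0, 0.0],
    [2.0, 2.0, 2.0],
]
observed = [2.0, 1.0, 1.0]  # synthetic "experiment": average of conformers 0 and 3
sigma = [0.1, 0.1, 0.1]

members, fit = select_ensemble(pool, observed, sigma, size=2)
```

For pools with thousands of conformers an exhaustive search is not feasible, and heuristic or stochastic selection schemes are used instead. In either case, leaving out part of the data during selection and checking how well the selected ensemble predicts it is a simple guard against overfitting.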
Contribution of different experimental types to the ensemble selection
The study also highlights the contribution of the different types of experimental data to the ensemble selection. Only the availability of all three types of data (NMR PRE, RDC, and SAXS) enabled accurate mapping of the conformational ensemble. Notably, excluding the SAXS data resulted in a more compact final ensemble, mainly driven by the distance restraints from the PRE data and, therefore, reflecting the converged ‘closed’ conformation of RRM1,2 in the previous study. In general, care has to be taken to include sufficient experimental data to define the ensemble and to avoid overfitting and incorrect solutions.
Advantage of the integrated approach
SAXS provides pair-wise distance distributions of the scattering object in solution, thus representing the (dynamically averaged) overall shape of a molecule or conformational ensemble in solution. In this study, SAXS data contributed to determining the spatial distribution of the RRM1,2 domains (Figure 3C). The PRE data are ‘blind’ to non-compact domain arrangements, and thus SAXS provided crucial information about the conformational ensemble of the free protein.
Furthermore, this study highlights how an unbiased computational ensemble analysis method can determine the ensemble from the available experimental data (Figure 1F).
Phase 3: combining NMR and spFRET
In the next step, the orthogonal approach of single-molecule spFRET was employed in combination with NMR techniques to reveal how the conformational equilibrium of the free and RNA-bound states of RRM1,2 can be shifted by the presence of the small subunit of the U2AF heterodimer, highlighting an unexpected role of U2AF35 in the recognition of the 3′ splice site. Here, two different fluorophores for spFRET were attached to strategically engineered cysteine sites, one in each domain of RRM1,2. Both the cysteine sites and the positions for dye conjugation were carefully optimized by NMR and by assessing RNA binding using ITC to ensure that the tag attachment does not impair the structure or molecular interactions. The single-molecule spFRET approach confirmed the presence of two distinct conformational states of RRM1,2, consistent with the previous studies by NMR and SAXS (Figures 1D and 4A). The correlation of FRET efficiency and donor fluorophore lifetime made it possible to measure and monitor changes in (1) domain distances, (2) population shifts, and (3) the presence of dynamics in each state in the presence and absence of RNA substrates (Figure 1D,E). Domain distances in the two conformational states observed by spFRET showed excellent agreement with distances back-calculated from the previous NMR/SAXS-derived structures of free and RNA-bound RRM1,2. In the absence of RNA, RRM1,2 exists in a dynamic state (Figure 1E), where the tandem domain arrangement indeed spends a larger fraction of time in the ‘closed’ conformational state, based on the analysis of FRET efficiency vs. donor lifetime. This is fully consistent with the previous ensemble analysis (based on NMR and SAXS data) of free RRM1,2, where a majority of RRM1,2 conformers were found with the two domains in spatial proximity in the absence of RNA (Figure 3C). A striking population shift from ‘closed’ to ‘open’ is observed upon binding to the strong Py-tract RNA (Figures 3 and 4A).
However, upon binding to a weak Py tract, RRM1,2 shows a dynamic state with an intermediate FRET efficiency, which indicates that the equilibrium is only partially shifted from the ‘closed’ to the ‘open’ state (Figures 1D and 4A), consistent with the NMR-based analysis of RRM1,2.
More excitingly, the spFRET-based analysis was extended to study the RRM1,2 conformation in the context of the U2AF heterodimer, including the UHM domain of the smaller subunit U2AF35 (Figure 4B). Surprisingly, the presence of U2AF35 shifts the FRET efficiency of RRM1,2 toward that of the ‘open-like’ state already in the absence of RNA (Figure 4B, left). This suggests a role of U2AF35 UHM in enhancing Py-tract binding by U2AF65 RRM1,2 by pre-shifting the conformational equilibrium toward the ‘open’ state. As expected, when bound to the strong Py tract, RRM1,2 in the U2AF heterodimer adopts a similar conformation to that seen for RRM1,2 bound to the strong Py tract in the absence of U2AF35 (Figure 4). However, with the weak Py tract, the FRET efficiency for RRM1,2 in the U2AF heterodimer indicates a complete shift to the ‘open’ state (Figures 1D and 4B, right), unlike the intermediate population shift seen for RRM1,2 alone (Figures 1D and 4A, right). This suggests an unexpected role of U2AF35 in stabilizing a U2AF65 ‘open’ conformation by inducing a population shift to assist weak Py-tract recognition. NMR PRE and SAXS experiments suggest a molecular mechanism for the population shift, which relies on an interaction of U2AF35 UHM with the RRM1 domain of U2AF65, as indicated in Figure 4B.
Advantage of the integrated approach
Single-molecule spFRET provides an orthogonal approach to the previous structural studies of RRM1,2 by NMR and SAXS (Figure 1). Quantitative analyses using the donor lifetime and FRET efficiency allow us to visualize the population shift of the two states and to identify the presence of dynamics (Figure 1D). Additionally, the single-molecule approach allows conformational dynamics to be studied more accurately, as each molecule is observed individually rather than as part of an ensemble average. However, unlike most other solution techniques discussed earlier, spFRET requires the covalent attachment of large fluorescent dyes and the engineering of an attachment site (cysteine mutation) within the protein, which can often alter the behavior and stability of the molecule. In combination with NMR, the effect of the cysteine mutation and of dye selection/attachment on the structural integrity can be readily tested using NMR-based CSP analysis (Figure 2). It should also be noted that, in this study, spFRET data were not used as structural restraints to further refine the ensemble.
Determining domain arrangements of T-cell intracellular antigen 1 by NMR and SAXS/SANS
TIA-1 (T-cell intracellular antigen 1) is a prototypic multidomain RNA-binding protein (RBP). It plays a role in spliceosome assembly by binding to an intron region of the pre-mRNA and recruiting the spliceosomal U1 snRNP complex. TIA-1 consists of three RRM domains followed by a C-terminal Q-rich region, where RRM2 and RRM3 bind pre-mRNA and the Q-rich region interacts with U1C of the U1 snRNP.
To study domain arrangements in multidomain RBPs such as TIA-1, Sonntag et al. developed a novel approach that combines SAXS and SANS with segmental isotope labeling. SANS allows us to localize individual components in a multidomain protein or complex. This is based on contrast matching of domains or subunits in buffers containing 42% or 70% D2O for matching protein or RNA, respectively (Figure 1C). The domain arrangements of TIA-1 RRM1,2,3 bound to RNA were studied by combining NMR and SAXS/SANS data (Figure 5). A structural model of the domain arrangement was obtained by computational rigid-body modeling based on the available structures of the individual domains. To maximize the domain resolution of the contrast-matching SANS experiments, RRM1 and RRM2,3 were segmentally labeled by (1) generating two constructs (RRM1 and RRM2,3) with optimized domain boundaries assessed by NMR, (2) preparing the constructs individually with and without perdeuteration, and (3) ligating them together using the Sortase A reaction. For the structural modeling of RRM1,2,3 bound to RNA, initial structural models were generated based on existing structures of free RRM1, RRM3, and RRM2 bound to RNA, by randomizing the flexible linkers connecting the domains. Representative models were selected by sampling the domain arrangements available from random sampling of the linker conformations. Structural models of the three-domain arrangement were then selected by first scoring against the experimental SAXS data and subsequently filtering against the contrast-matched SANS data. This approach results in a rather well-defined domain arrangement of RRM1,2,3 in the RNA complex (Figure 5). However, the relative domain orientations are poorly converged, as no experimental information on domain orientation was included in the protocol. Nevertheless, a low-resolution model of the domain arrangements can be obtained efficiently using this protocol.
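The two-stage selection described above, scoring against SAXS and then filtering against SANS, can be sketched as follows. This is a schematic illustration rather than the published protocol: the data structure, the χ² cutoff, the number of models kept after the SAXS stage, and all intensity values are hypothetical, and a single SANS curve stands in for what would in practice be several contrast-matched data sets.

```python
def scaled_chi2(calc, exp, sigma):
    """Chi-square per data point after fitting one linear scale factor to calc."""
    scale = (sum(c * e / s ** 2 for c, e, s in zip(calc, exp, sigma))
             / sum(c * c / s ** 2 for c, s in zip(calc, sigma)))
    return sum(((scale * c - e) / s) ** 2
               for c, e, s in zip(calc, exp, sigma)) / len(exp)

def score_then_filter(models, saxs_exp, saxs_sigma, sans_exp, sans_sigma,
                      n_keep, sans_cutoff):
    """Rank models by SAXS fit, keep the best n_keep, then filter by SANS fit."""
    ranked = sorted(models,
                    key=lambda m: scaled_chi2(m["saxs"], saxs_exp, saxs_sigma))
    shortlisted = ranked[:n_keep]
    return [m["name"] for m in shortlisted
            if scaled_chi2(m["sans"], sans_exp, sans_sigma) <= sans_cutoff]

# Hypothetical back-calculated curves for three candidate domain arrangements.
models = [
    {"name": "A", "saxs": [5.0, 3.0, 1.0], "sans": [4.0, 2.0, 1.0]},
    {"name": "B", "saxs": [10.0, 6.1, 2.0], "sans": [4.0, 3.5, 3.0]},
    {"name": "C", "saxs": [10.0, 9.0, 8.0], "sans": [4.0, 2.0, 1.0]},
]
saxs_exp, saxs_sigma = [10.0, 6.0, 2.0], [0.1, 0.1, 0.1]
sans_exp, sans_sigma = [4.0, 2.0, 1.0], [0.1, 0.1, 0.1]

accepted = score_then_filter(models, saxs_exp, saxs_sigma,
                             sans_exp, sans_sigma, n_keep=2, sans_cutoff=1.0)
```

Model B fits the overall shape from SAXS but is rejected by the SANS data, which report on the position of the labeled domain; model C matches the SANS curve but is already discarded at the SAXS stage. Only model A satisfies both data sets.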
Figure 5. Determining domain arrangements in multidomain proteins from SAXS and SANS data.
Advantages of the integrated approach
The combination of segmental deuteration with SAXS/SANS filtering methods provides an efficient protocol for studying domain arrangements in a multidomain protein or protein complex starting from available structures of individual domains or subunits (Figure 1). The study highlights the utility of segmental (domain) ligation and isotope labeling [53–56] for NMR and SANS experiments. Additional structural restraints can be obtained for a higher-resolution complex structure by including NMR PRE and RDC data.
Structural biology of macromolecules in solution is both important and challenging. While the presence of conformational dynamics should be detected and characterized in solution, simultaneously defining the structure and dynamics adds another level of complexity. Although NMR spectroscopy provides powerful tools to study macromolecular structures in solution at atomic resolution, the size and dynamics of a given system can make the use of conventional NMR methods challenging. Here, we have illustrated, with our own examples, how different solution-based structural methods can be combined to complement each other and to optimize their experimental conditions for accurately determining dynamic conformational ensembles. The distinctive contributions of the complementary techniques (NMR, SAS, and FRET) enable a comprehensive description of the structural ensemble in solution and often identify unexpected molecular mechanisms that are governed by dynamic interactions in solution to regulate biological functions.
Dynamics and population shifts of conformational states play important roles in protein function.
Solution techniques are crucial to the study of the structure and function of biological macromolecules.
Combining solution NMR, SAXS/SANS and FRET highlights essential roles of dynamic population shifts in RNA recognition by RNA-binding proteins.
CSPs, chemical shift perturbations
EPR, electron paramagnetic resonance
FRET, Förster resonance energy transfer
ITC, isothermal titration calorimetry
NOE, nuclear Overhauser effect
PREs, paramagnetic relaxation enhancements
RDC, residual dipolar coupling
RRM, RNA recognition motif
SANS, small-angle neutron scattering
SAXS, small-angle X-ray scattering
spFRET, single pair Förster resonance energy transfer
TIA-1, T-cell intracellular antigen 1
U2AF, U2 auxiliary factor
We thank Abraham Lopez, Florent Delhommel and Miriam Sonntag for comments. We are grateful to Juan Valcarcel, Julian König, Martin Blackledge, Don Lamb as well as current and previous members of the Sattler lab for fruitful discussion.
The Authors declare that there are no competing interests associated with the manuscript.