Small-angle X-ray scattering (SAXS) has become a streamlined method to characterize biological macromolecules, from small peptides to supramolecular complexes, in near-native solutions. Modern SAXS requires limited amounts of purified material, without the need for labelling, crystallization, or freezing. Dedicated beamlines at modern synchrotron sources yield high-quality data with exposure times of a few milliseconds or less and are highly automated, allowing for rapid structural screening under different solution and ambient conditions as well as for time-resolved studies of biological processes. Advanced data analysis methods allow one to meaningfully interpret the scattering data from monodisperse systems, transient complexes, and flexible and heterogeneous systems in terms of structural models. Especially powerful are hybrid approaches combining SAXS with high-resolution structural techniques, but also with biochemical, biophysical, and computational methods. Here, we review the recent developments in experimental SAXS practice and in analysis methods, with a specific focus on the joint use of SAXS with complementary methods.
Small-angle X-ray scattering (SAXS) is a powerful method to analyse the overall structure and conformational changes of biological macromolecules in solution. SAXS is commonly used to probe the folding state, interactions and flexibility of proteins, the structures of macromolecular complexes, and the structural responses to variation in external conditions [2–4]. The spatial resolution of SAXS is ∼15–30 Å, which is low compared with methods such as macromolecular crystallography (MX), nuclear magnetic resonance spectroscopy (NMR), or cryo-electron microscopy (cryo-EM); SAXS thus yields information on the size and shape of the particles. However, SAXS is directly applicable to near-native solutions over an extremely broad range of molecular weights (MWs), from a few kilodaltons up to gigadaltons.
SAXS experiments can be conducted in dilute solutions and usually require no special sample preparation. For structural studies aiming at shape analysis, the solutions need to be highly purified, monodisperse, and moderately dilute (1–5 mg/ml), but no freezing, crystallization, or labelling is required. SAXS is, therefore, extremely useful for high-throughput screening where different solution conditions are analysed (e.g. ligand binding or temperature or pH changes). SAXS yields information that is highly complementary to other structural techniques, and the method is, therefore, very useful in hybrid approaches where a combination of techniques is used to comprehensively characterize a macromolecular system.
Owing to the progress in experimental stations and in analysis methods, the popularity of biological solution SAXS is rapidly increasing as evidenced by the number of publications (Figure 1). In the present review, major recent developments in biological SAXS are presented with a special focus on the joint use of SAXS and other techniques.
Figure 1. Number of publications over the last 30 years mentioning SAXS on biological macromolecules, as recorded in the Web of Science database.
Progress in SAXS experimental methods
In a SAXS measurement, a monochromatic, collimated X-ray beam impinges on the sample and the X-rays are scattered by the dissolved macromolecules. These molecules do not adopt a periodic, static arrangement as in a crystal; they are randomly oriented in the solvent, and the scattered X-rays give rise to a diffuse signal close to the primary beam (instead of the sharp reflections observed for crystalline samples). The angular dependence of the SAXS intensity is related to the overall particle structure, which allows one to extract structural information. For dilute particle solutions under standard conditions, the scattering pattern is isotropic and can be azimuthally averaged (Figure 2).
Figure 2. Schematic overview of a typical SAXS measurement.
The recorded SAXS signal contains scattering not only from the macromolecule but also from the solvent, the sample container, and additional background from the SAXS instrument. The scattering from the matching buffer is therefore measured as well, and the difference between the appropriately normalized scattering curves (solution minus solvent) is the actual SAXS signal from the dissolved macromolecules, I(s), which includes the scattering from the particles and their hydration shell. The intensity is usually expressed as a function of the momentum transfer s = 4πsinΘ/λ, where λ is the X-ray wavelength and 2Θ the scattering angle (note that other letters, e.g. q, h, or Q, are also used in publications to denote the momentum transfer).
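As an illustration of the two operations described above, the following minimal sketch computes the momentum transfer from the scattering angle and forms the difference curve. The function names and the normalization factors are hypothetical; real pipelines also propagate the experimental errors, which is omitted here.

```python
import math


def momentum_transfer(two_theta_deg, wavelength):
    """Return s = 4*pi*sin(theta)/lambda for a scattering angle 2*theta in degrees.

    The result has the inverse unit of `wavelength` (e.g. 1/Angstrom).
    """
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


def subtract_buffer(sample, buffer, sample_norm=1.0, buffer_norm=1.0):
    """Return the difference curve I(s): normalized sample minus normalized buffer.

    `sample_norm` and `buffer_norm` are hypothetical normalization factors
    (for example transmitted beam intensity times exposure time).
    """
    if len(sample) != len(buffer):
        raise ValueError("sample and buffer must be sampled on the same s grid")
    return [a / sample_norm - b / buffer_norm for a, b in zip(sample, buffer)]
```

Both curves are assumed to be azimuthally averaged and sampled at the same s values before subtraction.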
For dilute solutions (typically, below 1% particle concentration) without interparticle interactions, the difference intensity (called the form factor) is related to the particle structure. At high concentrations, an additional interference contribution may be present in the SAXS signal (the so-called structure factor), which reflects correlations in particle positions and depends on the interparticle interactions. The structure factor contribution, which affects the scattering curves largely at very small angles, may be useful to study interactions [6–8], but for the analysis of the particle structure it should, somewhat contrary to its name, be removed. In structural studies, the solute concentration is usually chosen low enough that no structure factor contribution is present, but high enough to still give a strong scattering signal. A usual approach is to perform measurements at different solute concentrations and extrapolate to infinite dilution.
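The extrapolation to infinite dilution can be sketched as follows, assuming (as a simplification) that the concentration-normalized intensity I(s, c)/c depends linearly on c at each s. The intercept at c = 0 is then taken as the interaction-free curve. All names are hypothetical, and error weighting is omitted.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("at least two distinct concentrations are required")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def extrapolate_to_zero_concentration(concentrations, curves):
    """Extrapolate I(s, c)/c to c = 0 independently at every s point.

    curves[k][j] is the buffer-subtracted intensity at the j-th s value,
    measured at concentrations[k]. A linear dependence on c is assumed.
    """
    n_points = len(curves[0])
    extrapolated = []
    for j in range(n_points):
        normalized = [curve[j] / c for curve, c in zip(curves, concentrations)]
        intercept, _ = linear_fit(concentrations, normalized)
        extrapolated.append(intercept)
    return extrapolated
```

In practice the concentration dependence is most pronounced at the smallest angles, so the extrapolated curve is often merged with the highest-concentration data at larger s.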
Historically, SAXS was first developed at laboratory X-ray sources, and today laboratory SAXS cameras (produced, e.g. by the companies Anton Paar, Bruker, Panalytical, Rigaku, and Xenocs) continue to play a significant role. Most advanced studies are presently performed on synchrotron SAXS beamlines that are partially or fully dedicated to structural biology (e.g. P12 at EMBL/PETRA III, Germany, SAXS/WAXS at the Australian Synchrotron, Australia, or SWING at SOLEIL, France, and many others, see Table 1). At these beamlines, the high X-ray flux and the low parasitic scattering background allow one to collect SAXS data with sub-second time resolution. The decreasing pixel size, increasing frame rates, and short read-out times of modern single-photon counting X-ray detectors have led to a continuous improvement of the SAXS data quality [12,13]. At the same time, robotic sample changers further speed up the data collection and reduce the amount of precious biological sample needed. Automatic analysis pipelines directly process the collected data, such that the users obtain processed scattering curves, the overall parameters of the samples, and even low-resolution three-dimensional (3D) shape models essentially in real time [15–17].
In standard batch measurements, a set of scattering patterns (frames) is recorded with short exposure times, which also allows one to monitor possible radiation damage. The frames are compared for statistical similarity (e.g. using a correlation map), and only those frames above a similarity threshold enter the averaged SAXS profile for the given sample. Recent improvements in data collection aim at mitigating the effect of radiation damage induced by the intense X-ray beams [19–21]. As the radiation damage in SAXS, unlike in MX under cryo-conditions, cannot be predicted a priori, different schemes have been developed to reduce this effect, such as continuous sample flow, optimized geometry of the exposure cell (Schroer et al. submitted), co-flow of the sample surrounded by the buffer, addition of small scavenger molecules, or cryo-cooling. Alternatively, yet higher X-ray fluxes, as obtained by using multi-layer monochromators instead of double crystal monochromators, combined with fast data collection schemes, allow one to collect high-quality SAXS data before changes in the sample influence the signal and thus to outrun the radiation damage (Blanchet et al. submitted).
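The frame selection step can be illustrated with a deliberately simple stand-in for the correlation map test: each frame is compared with the first one using a reduced chi-square discrepancy, and only frames below a chosen threshold are averaged. The threshold value and all names are assumptions made for this sketch, not the criterion used by any particular pipeline.

```python
def reduced_chi2(frame, ref, sigma_frame, sigma_ref):
    """Mean squared difference between two frames, weighted by the combined variance."""
    terms = [
        (a - b) ** 2 / (sa ** 2 + sb ** 2)
        for a, b, sa, sb in zip(frame, ref, sigma_frame, sigma_ref)
    ]
    return sum(terms) / len(terms)


def average_similar_frames(frames, sigmas, max_chi2=1.5):
    """Average the frames that are statistically similar to the first frame.

    frames[k][j] and sigmas[k][j] are the intensity and its error at the j-th
    s value of frame k. The first frame is taken as the undamaged reference.
    Returns (averaged_curve, indices_of_kept_frames).
    """
    ref, ref_sigma = frames[0], sigmas[0]
    kept = [0]
    for k in range(1, len(frames)):
        if reduced_chi2(frames[k], ref, sigmas[k], ref_sigma) <= max_chi2:
            kept.append(k)
    n_points = len(ref)
    averaged = [sum(frames[k][j] for k in kept) / len(kept) for j in range(n_points)]
    return averaged, kept
```

Using the first frame as reference reflects the assumption that radiation damage accumulates with dose, so early frames are the least affected.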
Recently, SAXS combined with size-exclusion chromatography (SEC-SAXS) became popular in the studies of samples for which coexisting oligomeric states are difficult to purify offline [11,28,29]. Splitting the solution eluting from the SEC column into two parts, one exposed to the X-rays and the other used for optical spectroscopy, allows one to accurately determine the concentration and the MW along the SEC profile. Owing to the recent advances in data analysis, essential structural parameters can be readily extracted from the data sets containing thousands of frames recorded in a SEC-SAXS run [30–32]. In particular, the program CHROMIXS of the ATSAS package automatically detects the separated fractions and pure solvent ranges and provides the SAXS curves of the different oligomeric species within the solution. Another recent approach employing evolving factor analysis allows a model-independent separation of scattering profiles in the case of overlapping SEC peaks.
Owing to the continuous increase in flux and to the tunable X-ray energy provided by synchrotron sources, macromolecules under more complex conditions can be studied. Microfluidic devices reduce the amount of material down to 200 nl per SAXS measurement and allow one to screen the effects of different additives on the structure or to rapidly characterize crystallization conditions [35–37]. SAXS experiments on macromolecules under high pressure [38,39] or shear flow [40,41] yield insights into the structural responses to perturbations and provide information on how proteins react to extreme conditions. Time-resolved SAXS measurements with millisecond or even sub-millisecond resolution can now be performed, yielding high-quality data. This allows one to study the structural responses induced by rapid mixing to change the solution conditions [43,44] or triggered by laser light [45,46]. Additional information can be obtained by anomalous SAXS (ASAXS); in this approach, one varies the X-ray wavelength in the vicinity of the absorption edge of specific atoms in the sample. This changes their scattering power and allows one to determine the spatial distribution of these atoms. Using ASAXS, for example, the ionic clouds around DNA and RNA molecules could be determined [48,49].
The novel X-ray free electron lasers (XFEL), such as the European XFEL (Germany), the Linac Coherent Light Source (LCLS, U.S.A.), and the SPring-8 Angstrom Compact free electron Laser (SACLA, Japan), will not only enable serial crystallography on small crystallites, but are also expected to provide a breakthrough in scattering experiments. Owing to the extremely intense, femtosecond-short X-ray pulses and their high degree of coherence [54,55], structural details that are averaged out in conventional SAXS could be revealed. For instance, the 3D structure of nanoscale viruses has recently been determined by an angular cross-correlation approach on coherent SAXS data obtained at the LCLS, which demonstrates the potential for biological samples.
Developments in SAXS analysis methods
Some important overall parameters of the solute are directly obtained from the SAXS data using well-established procedures. These include the particle MW, radius of gyration, excluded volume, maximum size, folding state (compact or flexible), and the recently introduced volume-of-correlation. It is further possible to reconstruct low-resolution 3D models of proteins or complexes and also to characterize flexible and polydisperse systems. Many of the advanced methods for display, processing, analysis, and modelling can be found, e.g. in ATSAS, BioXTAS RAW, or SCÅTTER.
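One of the well-established procedures for the radius of gyration is the Guinier approximation, ln I(s) ≈ ln I(0) − (s·Rg)²/3, which holds at small angles (commonly s·Rg below about 1.3 for globular particles). The sketch below fits this line by unweighted least squares; the function name is hypothetical, and the choice of the fitting range, which matters in practice, is left to the caller.

```python
import math


def guinier_fit(s, intensity):
    """Estimate (Rg, I0) from a straight-line fit of ln I(s) versus s**2.

    The caller is assumed to pass only points from the Guinier region,
    with strictly positive intensities.
    """
    x = [si ** 2 for si in s]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0.0:
        raise ValueError("non-negative Guinier slope; Rg is undefined")
    intercept = mean_y - slope * mean_x
    rg = math.sqrt(-3.0 * slope)   # slope = -Rg**2 / 3
    i0 = math.exp(intercept)       # intercept = ln I(0)
    return rg, i0
```

The forward scattering I(0) obtained this way is proportional to the MW and concentration of the solute, which is one route to the MW estimate mentioned above.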
Ab initio methods are able to determine low-resolution shapes from SAXS data without a priori information on the particle. Popular reconstruction algorithms are based on refining the shape using finite volume elements (e.g. densely packed beads in DAMMIN/DAMMIF or DENFERT [61–63] or, for proteins, dummy residues in GASBOR). Starting from a random configuration, these methods employ global optimization algorithms (e.g. simulated annealing) to fit the experimental data while ensuring feasibility of the model (e.g. compactness, interconnectivity, or, if applicable, symmetry). Typically, a dozen reconstructions starting from random configurations are performed and the resulting models are superimposed and analysed [65–67]. Another ab initio algorithm (program DENSS) does not employ real-space parameterization, but instead utilizes an iterative structure factor retrieval algorithm followed by the averaging of the densities obtained in multiple runs. Recently, methods were developed to predict the potential ambiguity of the shape reconstruction from a given SAXS curve (program AMBIMETER) and to evaluate the resolution of the ab initio shape models by a posteriori analysis of the Fourier shell correlations from multiple reconstructions (program SASRES).
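Any such optimization repeatedly needs the scattering curve of a trial bead model. A simple way to illustrate this forward calculation is the Debye formula, I(s) = Σᵢ Σⱼ sin(s·rᵢⱼ)/(s·rᵢⱼ) for identical point-like beads. This is only a sketch under that assumption: the programs named above use their own, faster formulations, and the function name is hypothetical.

```python
import math


def debye_intensity(beads, s_values):
    """Scattering intensity of identical point-like beads via the Debye formula.

    beads is a list of (x, y, z) coordinates; every bead has unit scattering
    amplitude. The pair sum costs O(N**2), so this suits small models only.
    """
    n = len(beads)
    distances = [
        math.dist(beads[i], beads[j]) for i in range(n) for j in range(i + 1, n)
    ]
    intensities = []
    for s in s_values:
        total = float(n)  # the i == j terms each contribute 1
        for r in distances:
            x = s * r
            # sin(x)/x tends to 1 as x approaches 0
            total += 2.0 * (math.sin(x) / x if x > 1e-12 else 1.0)
        intensities.append(total)
    return intensities
```

At s = 0 the result equals N², the square of the number of beads, which is a convenient sanity check.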
Another major approach to SAXS data analysis employs hybrid modelling utilizing available atomic models of the macromolecule, its homologs, deletion mutants, domains, or subunits/components determined by other methods. In the most straightforward cases, atomic model(s) of the entire macromolecule are validated against the measured SAXS profiles [72–74]. Importantly, the programs computing the scattering curves from atomic structures always need to account for the contribution from the hydration shell, using either a solvation shell (as in CRYSOL or FoXS) or explicit water molecules from molecular dynamics simulations (AXES, HyPred, and WAXSiS). The static structure from MX may not always adequately represent that of the macromolecule in solution. In such cases, exploring the conformational space of the high-resolution models using normal-mode analysis to fit the SAXS data can account for possible variability of the structural domains [78,79].
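The validation step can be illustrated by the comparison itself: given a curve already computed from an atomic model, one fits a single scale factor and reports a reduced chi-square discrepancy. This sketch assumes that the computed curve is sampled at the experimental s values and already includes the solvent and hydration contributions; the function name is hypothetical.

```python
def fit_scale_and_chi2(i_exp, sigma, i_calc):
    """Least-squares scale factor and reduced chi-square between two curves.

    Minimizes sum(((i_exp - scale * i_calc) / sigma)**2) over `scale`,
    then divides the minimum by (N - 1) for one fitted parameter.
    """
    weights = [1.0 / (e * e) for e in sigma]
    num = sum(w * a * b for w, a, b in zip(weights, i_exp, i_calc))
    den = sum(w * b * b for w, b in zip(weights, i_calc))
    scale = num / den
    residual = sum(
        w * (a - scale * b) ** 2 for w, a, b in zip(weights, i_exp, i_calc)
    )
    chi2 = residual / (len(i_exp) - 1)
    return scale, chi2
```

A value near 1 indicates agreement within the stated errors, provided those errors are realistic; strongly over- or underestimated errors make the number hard to interpret.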
For larger complexes where high-resolution structures of subunits/domains but not of the complete macromolecule are available, SAXS can be used for rigid body modelling. Here, the subunits are moved and rotated to construct an interconnected assembly without steric clashes and, if necessary, missing fragments and linkers are added, while fitting the experimental data, e.g. using the ATSAS programs SASREF or CORAL [80,81]. Other rigid body modelling programs are also available, e.g. FoXSDock, pyDock-SAXS, or CCP-SAS.
The above approaches are applicable to monodisperse solutions of compact macromolecules. In the case of polydisperse systems such as oligomeric mixtures, transient complexes, or flexible macromolecules, the scattering curve is a linear combination of the individual signals from all species or conformers. In these cases, direct structural modelling is difficult, but if the type of polydispersity is known, the SAXS curve can still be meaningfully analysed. Thus, for oligomeric mixtures, if the scattering curves of the individual components are known, their volume fractions can be determined by fitting the SAXS curve, e.g. using the program OLIGOMER. If the number of components is not known a priori, but multiple data sets are recorded with varying fractions of the components, the number can be estimated using singular value decomposition. Recently, a chemometric decomposition method was proposed utilising a Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) procedure on differently weighted SAXS data sets. This method (implemented in the program COSMICS) provides estimates of the scattering curves of the components and their relative abundances utilizing constraints based on the physical nature of the evolving system. Such approaches can be especially powerful, e.g. for the analysis of time-resolved data where transient species are present.
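For the simplest case of a two-component mixture with known component curves, the linear combination can be solved in closed form. The sketch below assumes that both component curves are on the same scale as the mixture (e.g. normalized to unit concentration) and that the fractions sum to one; it is not the algorithm of OLIGOMER, and the function name is hypothetical.

```python
def two_component_fractions(i_mix, sigma, i_first, i_second):
    """Volume fractions (v_first, v_second) of a two-component mixture.

    Models i_mix = v * i_first + (1 - v) * i_second and solves the weighted
    least-squares problem for v, clamped to the physical range [0, 1].
    """
    num = 0.0
    den = 0.0
    for m, e, a, b in zip(i_mix, sigma, i_first, i_second):
        d = a - b  # how the mixture changes per unit of v
        num += d * (m - b) / (e * e)
        den += d * d / (e * e)
    if den == 0.0:
        raise ValueError("component curves are identical; fractions are undetermined")
    v_first = min(1.0, max(0.0, num / den))
    return v_first, 1.0 - v_first
```

With more than two components the same idea becomes a non-negative least-squares problem, which needs an iterative solver rather than this closed form.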
For highly flexible macromolecules like intrinsically disordered proteins, which cannot be crystallized, SAXS remains one of the most powerful structural methods. These systems adopt variable configurations making the number of components astronomically large. Here, approaches utilizing large pools of possible configurations are employed to study the flexibility and to categorize the conformational heterogeneity. Thus, the program EOM [87,88] starts from a random pool (which may also incorporate high-resolution structures of folded domains) and uses a genetic algorithm to select a subset of conformations whose mixture fits the experimental data.
Experimental SAXS data have become useful in improving protein fold recognition algorithms by helping one to select the best template structures. An option to compute and fit SAXS curves was included in the interactive visualization and analysis tool UCSF Chimera, where homology protein structures can be created by the program MODELLER. In another approach (program AllosMod-FoXS), all-atom ensemble modelling is used to generate SAXS profiles of glycoproteins with flexible glycan chains, which can be used to interpret experimental SAXS data.
Given the rapidly growing number of publications in biological SAXS (Figure 1), efforts are underway to standardize the presentation of SAXS data in scientific publications and to ensure access to the relevant information on the sample, models, and data treatment. Thus, publication guidelines have recently been developed by a commission of the International Union of Crystallography [92,93]. To make published SAXS data and models freely accessible, the Small Angle Scattering Biological Data Bank (SASBDB; https://www.sasbdb.org) is available, presently containing over 550 experimental data sets and more than 900 models. It is also possible to rapidly screen experimental SAXS data against over 150 000 scattering profiles pre-computed from atomic models in the Protein Data Bank via the web server DARA.
Recent hybrid applications of SAXS
Nowadays, advanced applications of SAXS involve the joint use of solution scattering data with biophysical, biochemical, and computational techniques and with high-resolution structural methods, of which the combination with MX remains the most widely employed. In a recent study, an MX/SAXS application shed light on the underlying mechanism of the directional translocation of large repeats-in-toxin (RTX) proteins through special channel-tunnel ducts, which is central for protein secretion in bacteria. SAXS data were collected from an RTX repeat block of the multi-domain CyaA as well as from the full-length protein. The low-resolution ab initio model of the repeat block could be superposed with a high-resolution MX structure (Figure 3a). For the full protein, an elongated, zig-zag-shaped ab initio model depicted five consecutive RTX repeat blocks. Based on this model, a ratchet mechanism was predicted which prevents the backsliding of the protein upon translocation through the duct.
Figure 3. Examples of hybrid applications of solution SAXS.
In another recent SAXS/MX study, rigid body modelling was used to establish the tandem structure of the conserved C-terminal domain (CTD) and the immunoglobulin-superfamily domain (IgSF) of gingipain RgpB. RgpB is a secreted cysteine protease involved in the virulence of Porphyromonas gingivalis, a major periodontal pathogen. This tandem structure is important for the covalent attachment of RgpB to the outer bacterial membrane. As revealed by SAXS/MX, the two domains are connected by a linker that gives rise to high flexibility.
In an increasing number of publications, SEC-SAXS is employed to improve the data quality. Using this technique, the structure of the passenger domain of IcsA, a virulence factor needed by Shigella flexneri, which causes severe dysentery, was recently published. According to the ab initio model, the protein exhibits a central kink. Computational structure predictions indicated an elongated, rod-like shape, which did not agree well with the experimental data. Using SREFLEX [33,79], the predicted structure was refined, yielding an L-shaped model (Figure 3b). In another SEC-SAXS application, the solution structure of the monomeric multi-component complex retromer, which recycles transmembrane cargo from endosomes, could be determined. The ab initio model was further refined by rigid body modelling using the crystal structures of the components.
An interesting case of the joint use of SAXS and MX for a flexible protein is given by the study of the apo form of the trimeric chaperone Skp. Here, the solution SAXS curves could not be related to a single high-resolution structure, but, using EOM, Skp was shown to exist as a dynamic ensemble of multiple conformational states.
NMR, similar to SAXS, can be used to study proteins in solution; it yields atomic resolution for moderately sized macromolecules and also provides information on their dynamics. The high complementarity of NMR and SAXS, including the use of high-resolution NMR models as building blocks for rigid body modelling, and the power of combining the two methods were recently reviewed.
Increasingly important are the interactions with single-particle cryo-EM, a direct imaging method providing 3D structural information from frozen non-crystalline specimens. The recent progress in direct-detection electron detectors and improvements in image processing [102–104] have led to a remarkable improvement in the resolution of cryo-EM models, down to 3.5 Å [102,105], making the technique extremely powerful and also yet more attractive for hybrid applications.
Numerous joint applications of cryo-EM and SAXS were reported in the past. An example is given by the study of the bacterial class I release factor RF1, where the open cryo-EM structures of RF1 and its closed crystal structure were validated against the experimental SAXS data. The curves computed from the cryo-EM models agreed much better with the SAXS data, and the open model agreed with the SAXS-based ab initio models, indicating that RF1 has an open structure in solution. In future, one may expect yet more interactions between cryo-EM and SAXS, whereby the latter technique, providing lower resolution but also being much faster, can be utilized, e.g. to study ligand- or environment-induced conformational changes based on the cryo-EM-generated models.
Owing to the continuous development of the experimental facilities, most notably high-brilliance synchrotrons, and also to significant recent progress in the analysis and modelling methods, SAXS has become a powerful technique for low-resolution structural characterization of biological macromolecules. Given the improved speed, the relatively low demand on sample amounts, and the automation of the experiment and its analysis, the technique is one of the most versatile methods in the toolkit of structural biologists. SAXS may be utilized as a standalone approach providing overall shapes, but also to construct more detailed hybrid models utilizing other structural, biophysical/biochemical, and bioinformatics data. The technique is ideally suited for high-throughput screening studies and for time-resolved analysis of biological processes.
Recent developments in experimental and analysis methods made SAXS a powerful method to study the overall structure of biological macromolecules in solution under nearly native conditions.
SAXS is well suited for high-throughput and time-resolved studies to analyse structural transitions and biological processes.
SAXS is especially useful in hybrid approaches in combination with structural (e.g. MX, NMR, cryo-EM), biochemical, biophysical, and computational techniques to comprehensively characterize macromolecular systems.
M.A.S. and D.I.S. wrote the article.
The authors thank the Röntgen-Ångström cluster project ‘TT-SAS’ (BMBF project number 05K16YEA) and the Horizon 2020 programme of the European Union, iNEXT grant # 653706.
The Authors declare that there are no competing interests associated with the manuscript.