We summarize different levels of RNA structure prediction, from classical 2D structure to extended secondary structure and motif-based research toward 3D structure prediction of RNA. We outline the importance of classical secondary structure at all of these levels of structure prediction.

Introduction

The secondary structure model of RNA is at the center of most computational work related to RNA. In this review, we sketch how recent research has taken advantage of the increasing number of experimentally solved RNA 3D structures to move beyond the secondary structure view, and why it nevertheless cannot abandon that view.

RNA secondary structure is defined by canonical AU and GC base pairs as well as GU wobble pairs, which form between nucleotides typically within one RNA strand to create antiparallel A-type helices. Pseudoknot-free secondary structures can be encoded as strings, e.g. in the Vienna dot-bracket notation, where nested pairs of left and right parentheses signify nested base pairs, while dots signify unpaired nucleotides (see Figure 1, box ‘Secondary Structure’).
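The dot-bracket encoding can be decoded with a simple stack. The following is a minimal sketch, not taken from any of the reviewed tools; the function name `parse_dot_bracket` and the 0-based indexing are assumptions made for illustration.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (i, j) base pairs (0-based) from a dot-bracket string."""
    stack = []   # positions of '(' that are still waiting for a partner
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {j}")
            # The most recent open bracket pairs with this one (nesting).
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unbalanced '(' at position {stack[-1]}")
    return sorted(pairs)


# A hairpin: three stacked base pairs closing a loop of four unpaired nucleotides.
print(parse_dot_bracket("(((....)))"))  # prints [(0, 9), (1, 8), (2, 7)]
```

Because every closing bracket pairs with the most recently opened one, this encoding can only express nested (pseudoknot-free) structures.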

Figure 1.
Approaches for 3D structure prediction.

RNA structure prediction programs can be classified into hierarchical folding approaches and all-in-one programs.

RNA secondary structure can be efficiently predicted from the primary sequence using dynamic programming approaches. The simplest such algorithm maximizes the number of base pairs for a given sequence [1], while state-of-the-art prediction tools [2] use an energy model with different energy contributions for different loop types. This model takes the stacking of neighboring base pairs into account and is often referred to as the nearest-neighbor model. The energy parameters of the nearest-neighbor model were derived from RNA melting experiments collected by Turner et al. [3]. These secondary structure prediction algorithms can be used not only to predict the structure with the lowest free energy, but also to generate the Boltzmann ensemble of suboptimal structures [2,4,5]. By including data from chemical probing experiments such as SHAPE, the reliability of the generated secondary structures can be further increased. For an overview of state-of-the-art techniques in secondary structure prediction and their integration with chemical probing data, see the recent review by Lorenz et al. [6]. Pseudoknots increase the computational complexity, but simple types can be predicted with dedicated software [7].
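The base-pair maximization idea can be sketched as a short dynamic program. This is an illustrative implementation only, not the code of any cited tool; the function name and the minimum hairpin loop size of three unpaired nucleotides (`min_loop`) are assumptions that the text above does not specify.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested AU, GC, and GU pairs that seq can form."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    # Intervals too short to hold a pair keep their initial value 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the interval
            # into the part left of k and the part enclosed by (k, j).
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


print(nussinov_max_pairs("GGGAAAACCC"))  # prints 3
```

Energy-based tools follow the same recursive decomposition, but replace the pair count with loop-dependent energy contributions.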

The great success of the secondary structure description of RNA is grounded in the hierarchical nature of RNA folding [8]. Intrahelical interactions have a significantly greater energy contribution than tertiary interactions. Thus, energy barriers and in most cases energy differences between different secondary structures are significantly higher than energy differences and barriers between different 3D structures that correspond to the same secondary structure.

Secondary structure is useful in many approaches beyond simple structure prediction: sequence design [9,10] aims at finding sequences that fold into one or more predefined (meta-)stable structures. Folding kinetics [11] and co-transcriptional folding [12,13] look at the dynamic nature of RNA secondary structure formation. Functional RNA secondary structures are expected to be evolutionarily conserved. Hence, they can be detected in genomic data via compensatory mutations and structure stability [14] or co-variance models (RNA families) [15,16].

While some biological mechanisms can be understood on the basis of secondary structure alone, others require more detailed structure knowledge. In this contribution, we will therefore focus on approaches that go beyond classical secondary structure.

On the other end of the spectrum lie methods that model RNA structures in full atomistic resolution. In particular, molecular dynamics (MD) simulations can be readily applied to RNA, with the AMBER force field being most popular in the RNA community. However, despite many RNA-related corrections to the AMBER force field, the current version still sometimes fails to identify the correct native state among all possible states [17,18]. Despite these challenges, reduced model [19] and explicit solvent MD simulations [20] of the full ribosome have been successfully performed, using experimental crystal structures as a starting point and, in the case of the reduced model, for the construction of a structure-based (Gō-like) potential. The reduced model simulation covered biologically relevant time scales, whereas the explicit solvent MD simulation covered over a microsecond, enough to separate rapid local fluctuations from large-scale collective movements. The latter simulation took several months on over 1000 computer cores.

In contrast with MD from a given starting conformation, true de novo simulations are possible only for very small molecules, due to the huge space of possible conformations and rugged energy landscapes, even when combined with enhanced sampling techniques [21]. Recent applications of classical force fields to RNA folding range from the folding of G-quadruplexes [22] and RNA–ion interactions [23] to the folding of tetra-loop hairpins [17].

While all-atom approaches are limited to small molecules or fragments, somewhat larger RNAs can be treated by methods working with a coarse-grained RNA representation, which we will review in sections ‘Hierarchical folding enables aggressive coarse graining’ and ‘Predicting secondary and tertiary structure together (all in one)’. In the latter case, prior knowledge of the secondary structure is not required. Rather, correct secondary structure should emerge as a by-product of tertiary structure prediction. Occasionally, known secondary structures are used as constraints to reduce the search space. However, we will also discuss tertiary structure methods that explicitly build on secondary structure knowledge and use it to enable a more aggressive coarse graining.

For a discussion of different ways of coarse graining in RNA 3D structure prediction and the distinction between theory-based and knowledge-based potentials, see also the recent review by Dawson et al. [24].

Extended secondary structure

While prediction of full tertiary RNA structures remains an extremely difficult task, several promising avenues have emerged to go beyond classical secondary structure without tackling actual 3D structure, e.g. by extending the notion of secondary structure to include non-canonical interactions. In addition to the Watson–Crick and GU wobble pairs, RNA nucleotides can form a wide variety of interactions, both pairwise and between more than two nucleotides.

A highly useful classification of these non-canonical interactions was introduced by Leontis and Westhof [25], by noting that almost all interactions happen on one of three ‘edges’ (the Watson–Crick, Hoogsteen, and sugar edge). Together with the relative orientation of the glycosidic bond (cis or trans), this results in 12 base-pairing types, each of which has at least two hydrogen bonds and was observed for at least some combinations of bases [26]. Additionally, weaker interactions involving a single hydrogen bond, base triplets or multiplets, and G-quadruplexes have been observed in RNA molecules.
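The count of 12 base-pairing types follows from combining the six unordered pairs of edges with the two orientations. A minimal sketch that enumerates them (the labels and tuple layout are chosen here for illustration):

```python
from itertools import combinations_with_replacement

EDGES = ("Watson-Crick", "Hoogsteen", "Sugar")
ORIENTATIONS = ("cis", "trans")

# Each family is one orientation combined with an unordered pair of
# interacting edges; an edge may also interact with the same edge type.
families = [
    (orientation, edge_a, edge_b)
    for edge_a, edge_b in combinations_with_replacement(EDGES, 2)
    for orientation in ORIENTATIONS
]

for orientation, edge_a, edge_b in families:
    print(f"{orientation} {edge_a}/{edge_b}")
print(len(families))  # prints 12
```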

An RNA structure including some or all of the above-mentioned noncanonical interactions, but not the full 3D information, will be called an extended secondary structure. Decomposition and motif search are the two main approaches for the prediction of extended secondary structures.

Extended secondary structures play an important role in many natural RNAs, including ribosomal RNAs [27], where non-canonical base pairs contribute to the movement of the ribosome [28].

Loop decomposition with non-canonical base pairs

Classical secondary structure prediction is based on the unique decomposition of the RNA structure into loops delimited only by canonical and G-U base pairs. The program MC-Fold [29] generalizes the classical decomposition by considering all types of non-canonical base pairs as loop delimiters. The resulting loops, termed NCMs (nucleotide cyclic motifs) in MC-Fold, are defined via a minimal cycle basis of the RNA structure graph [30].

The programs MC-Fold [29] and the faster dynamic programming implementation MC-Fold-dp [31] find optimal and suboptimal combinations of NCMs for a given sequence according to a statistical energy function. This energy function is composed of probability terms for observing individual NCMs and combinations of two NCMs.

RNAwolf [31] further generalizes the MC-Fold approach by allowing each nucleotide to form two interactions, thus supporting base-triplets which are observed quite frequently in nature. RNAwolf, however, uses a simplified Nussinov-like energy model, limiting its accuracy. Parameterization of a full-featured energy model remains the big challenge for extended secondary structure prediction, and it is unclear if enough data are available to estimate the large number of parameters required. Moreover, while the nearest-neighbor approximation has been shown to be quite accurate for canonical secondary structures, the distortion of the double helix through noncanonical pairs could well exert an influence beyond its direct neighbors.

Motif-based

An alternative to the loop decomposition above is to treat substructures containing several noncanonical pairs as a single unit, called a tertiary structure motif. Such an RNA motif is characterized by well-defined (base-pairing) interactions [32,33], a well-defined geometry [34–37], or both. The prime example of such a recurrent motif is the kink turn motif, a sharp turn between two adjacent helices introduced by many bases forming a dense network of noncanonical interactions. This and three other common motifs can be detected in single sequences or multiple sequence alignments using RMDetect [38].

A well-maintained collection of such motifs can be found in the Motif Atlas [39]. An accompanying search tool, JAR3D [40], can be used to predict the presence of any motif from the Atlas in interior or hairpin loops of a secondary structure.

Motif-based approaches circumvent the problem of decomposing complex interaction networks, such as the kink turn, and even allow for crossing interactions within these motifs. On the downside, they cannot predict novel motifs and are limited to the set of motifs that we observe in known tertiary structures. One application of motifs is homology modeling, where nonhomologous regions can be filled in using motifs [41].

Prediction of G-quadruplexes can also be seen as a motif-based extension of secondary structure. G-quadruplex prediction has recently been integrated in the folding algorithms of the ViennaRNA package [42]. This integration presents a significant advantage over pure sequence searches, as it properly treats the competition between formation of G-quads and normal secondary structure.

Extended secondary structures are of great interest because functionally important interactions with proteins and other factors typically take place in regions of irregular structure, rather than perfect helices. Moreover, extended secondary structures provide a much better starting place for modeling tertiary structures. The kink turn again serves as a perfect example: the sharp bend introduced by this motif will strongly affect the overall shape of the molecule, but would be very hard to predict by any approach that is not aware of the motif.

Hierarchical folding enables aggressive coarse graining

While secondary structure can be used in most 3D structure prediction programs to constrain the search space, for some programs the secondary structure is at the core of the abstraction, on which the coarse-grained structure representation is based.

Vfold3D [43,44] creates a coarse-grained 3D scaffold from the RNA secondary structure and sequence using a template-based approach. For each loop region, the best matching template is selected from a template library constructed from PDB structures, where the match quality is based on sequence similarity. Helices are modeled as ideal A-type helices. From this scaffold, a full atom model is constructed which is then relaxed using an AMBER all-atom force field.

More than one fragment can be tried for each loop in an exhaustive way (five fragments were tried in the RNA puzzles entries; see below). However, Vfold3D does not contain any energy function to score these coarse-grained scaffolds, nor does it contain a sampling protocol to sample combinations of such fragments. Since it is unlikely that AMBER energy minimization will overcome larger energy barriers, the quality of the prediction depends on the correct choice of fragments. Thus, Vfold3D works best when the structure of loops strongly depends on the sequence (see section ‘Motif-based’) or when its knowledge base contains close homologs of the target RNA, while it has limitations whenever similar loop sequences can adopt multiple loop conformations.

RNAComposer [45] uses a machine translation system that translates RNA sequence and secondary structure into 3D structure. Similar to Vfold3D, it selects the best-matching fragment for every loop and helix from a database of fragments. RNAComposer provides fallback mechanisms for cases where no fragment is found as well as two final refinement steps using CYANA (for refinement in torsion angle space) and the CHARMM force field (for refinement in Cartesian space). Like Vfold3D, it allows for the creation of random suboptimal structures, but does not provide sampling methods or a fast-to-evaluate energy function for a systematic exploration of the conformational space.

ERNWIN [46] and RAGTOP [47] use the user-supplied secondary structure to guide their aggressive coarse graining of the RNA into helices (stems) and connecting loops (see Figure 2). In contrast with RNAComposer and Vfold3D, these tools explore the conformational space on the level of loops and helices.

Figure 2.
Secondary structure based fragments.

The secondary structure of RNA defines fragments for fragment assembly or aggressive coarse graining.

RAGTOP [47] first employs machine learning to predict the topology of multiloops (junctions) [48]. Next, interior loop angles are sampled with a Monte Carlo/Simulated Annealing algorithm using a knowledge-based potential. Finally, an all-atom representation is recovered from the sampled conformations [49]. With root mean square deviation (RMSD) values from 2.38 to 14.56 Å (for structures of 25–158 nucleotides), the reported prediction accuracy of RAGTOP is slightly better than that of previous tools which use less aggressive coarse graining.
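The general scheme of Monte Carlo sampling with simulated annealing can be sketched as follows. This is a generic illustration and not RAGTOP's implementation: the toy energy function, the target angles, the geometric cooling schedule, and all parameter values are hypothetical stand-ins for a knowledge-based potential over loop angles.

```python
import math
import random


def simulated_annealing(energy, start, steps=5000, t_start=2.0, t_end=0.01,
                        max_step=0.3, seed=0):
    """Minimize energy(angles) with Metropolis moves and geometric cooling."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for step in range(steps):
        # Temperature decreases geometrically from t_start to t_end.
        fraction = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        # Propose a small random change to one angle.
        candidate = list(current)
        index = rng.randrange(len(candidate))
        candidate[index] += rng.uniform(-max_step, max_step)
        e_candidate = energy(candidate)
        delta = e_candidate - e_current
        # Metropolis criterion: always accept downhill moves, accept
        # uphill moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


def toy_energy(angles):
    """Hypothetical potential with its minimum at the target angles (radians)."""
    target = (1.0, -0.5)
    return sum(1.0 - math.cos(a - t) for a, t in zip(angles, target))


best_angles, best_energy = simulated_annealing(toy_energy, [0.0, 0.0])
print(best_angles, best_energy)
```

At high temperature almost all moves are accepted, so the search explores broadly; as the temperature drops, the sampler increasingly settles into low-energy regions.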

ERNWIN [46] uses a similar coarse graining based on the secondary structure. Helices are described by 10 parameters: the position of the helix's start and end (or equivalently the start and the direction) and four parameters for the position of the minor groove along the length of the helix. This model assumes a regular helix where the position of the minor groove with respect to the helix's axis changes by a fixed angle with each subsequent nucleotide. The helix axis and the vector pointing to the minor groove form a local co-ordinate system for residues. Using average atom position data in this local co-ordinate system gathered from a nonredundant list of PDB files, so-called virtual atom positions of all the helix atoms can be quickly calculated just from the 10 parameters defining the helix. Another major difference between RAGTOP and ERNWIN is the fact that the latter can sample different multiloop conformations. This means that the prediction quality of ERNWIN is independent of the correctness of the junction topology prediction, but comes at the cost of a higher number of rejections during Monte Carlo sampling. Finally, sampling and energy evaluation are significantly different between RAGTOP and ERNWIN. In ERNWIN, local loop and helix conformations are sampled directly from a fragment library, whereas RAGTOP samples the angle from continuous space and therefore needs local energy terms to evaluate the likelihood of a local angle. As for the contribution of global features to the energy function, both programs have a term for the radius of gyration. ERNWIN in addition contains a term for loop–loop interactions and a sophisticated term for interactions of single-stranded adenines with the minor groove of a helix (A-Minor motif [50]). ERNWIN also has direct support for motif search via JAR3D (see above).
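The radius of gyration used as a global feature by both programs is a simple measure of compactness. The sketch below computes it for a set of coarse-grained positions, assuming equal weights for all points; how RAGTOP and ERNWIN turn this quantity into an energy term is not shown here, and the function name is chosen for illustration.

```python
import math


def radius_of_gyration(coords):
    """Root mean square distance of 3D points from their centroid."""
    n = len(coords)
    if n == 0:
        raise ValueError("at least one point is required")
    # Centroid of all points (equal weights assumed).
    center = [sum(point[k] for point in coords) / n for k in range(3)]
    mean_square = sum(
        sum((point[k] - center[k]) ** 2 for k in range(3))
        for point in coords
    ) / n
    return math.sqrt(mean_square)


# Two points at distance 1 on either side of the origin.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))  # prints 1.0
```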

While the idea of using helices without internal degrees of freedom is common to RAGTOP and ERNWIN, it is the details that matter: the sampling strategy that explores the conformational space and the energy function that detects native-like conformations require a lot of fine-tuning and still have room for improvement.

MC-Sym [29] uses a different approach toward tertiary structure prediction. It is built on top of MC-Fold which can predict extended secondary structures. It uses a library of 3D fragments for each NCM predicted by MC-Fold to assemble 3D structures. In a Las Vegas algorithm, 3D structures are sampled for 12 h by aligning adjacent NCM 3D fragments to form structures. The final result is an ensemble of 3D structures.

Predicting secondary and tertiary structure together (all in one)

While the tools in the previous section built their model on top of a given secondary structure, the following programs add secondary structure constraints into their model via a force field.

The program NAST [51] represents each nucleotide by a single point, which means that it does not hold any information about the orientation of the base with respect to the backbone. The program requires an input secondary structure, which is used to create energy potentials on the lengths, angles, and dihedrals between residues. These potentials direct the sampling of the RNA toward the desired secondary structure. We argue that the requirement for a secondary structure is inherent to the coarse-grained model used, because formation of new base pairs would depend on the orientation of the base, which is not part of the model. In NAST, sampling via an MD approach starting from the extended chain is followed by filtering according to tertiary structure constraints (e.g. from SAXS or SHAPE experiments or from phylogenetic observations [52]) and clustering using a k-means algorithm based on a simplified pairwise GDT-TS [53] distance. For one RNA molecule, the effect of errors in the secondary structure input was investigated and the authors concluded that up to 35% wrong base pairs did not significantly worsen the RMSD of their prediction.

The most aggressive coarse graining possible that allows for de novo prediction of base pairs uses one rigid body per nucleotide (in contrast with one point), as implemented in oxRNA [54,55] (see below). In a similar fashion, FARNA uses rigid fragments of 1–3 nucleotides to assemble the final RNA structure. Next to energy terms for clashes and the radius of gyration, FARNA uses a base-pairing statistical potential that can be seen as a heatmap around the center of an ideal base. If two bases are placed in a relative orientation that is frequently found in base pairs in solved RNA structures, these bases receive a favorable energy contribution. Structures generated by FARNA can be refined using an all-atom force field, giving rise to the combined method FARFAR [56,57]. In theory, such an all-atom refinement would be useful for all coarse-grained approaches (but see the introduction for the limits of current force fields); however, its integration in one framework (namely the Rosetta framework) is a great technical advantage of FARNA/FARFAR.

In contrast with FARNA, oxRNA [54,55] samples structures from a continuous space. It can be used for MD simulations and for Monte Carlo calculations. The energy function of oxRNA is parameterized to favor RNA A-type helices by using potentials for sequence-dependent hydrogen bonding, stacking, and cross-stacking in helices in addition to the backbone potential. Since all nucleotides are equally sized, mismatches in helices do not disturb the overall helix geometry in oxRNA, and thus, the stability of such helices is overestimated despite the lack of the energy contribution from the hydrogen-bonding term. The quality of the secondary structure predicted by oxRNA is comparable with secondary structure prediction tools using the nearest-neighbor model for small RNAs, as shown by melting temperature calculations done with oxRNA. Since oxRNA does not include any noncanonical or long-range interactions (except excluded volume effects), it is no surprise that the use of a three-dimensional model does not bring any significant advantage over a two-dimensional model with respect to the prediction of secondary structures. The oxRNA paper studied mechanical properties like force-extension and overstretching properties, persistence length, and modeled hairpin unzipping. These applications, while leading to results that are only of the same order of magnitude as the experimental values, could not be done with most of the other RNA structure prediction models.

In contrast with the continuous energy model used in oxRNA, SimRNA [58,59] and iFoldRNA [60] use discrete, grid-based statistical potentials. While iFoldRNA uses discrete MD [61,62], SimRNA uses Monte Carlo simulations.

iFoldRNA [60] uses three beads per nucleotide: one for the sugar, one for the phosphate and only one for the base (see Figure 3). The position of the base relative to the sugar is used to determine the direction of hydrogen bonding, but tilting of the base's plane cannot be modeled with this coarse graining. Noncanonical interactions are implicitly modeled by the use of a general hydrophobic attraction between all kinds of bases. iFoldRNA's energy function only contains local terms (for bond length, angle and dihedral angle, base pairing, phosphate–phosphate repulsion, hydrophobic interactions, and base stacking). Furthermore, an additional energy term for the loop entropy is used to compensate for the bias in loop entropy introduced by the coarse graining. According to the data reported in the paper's supplement [60], iFoldRNA predicts pseudoknot-free secondary structures slightly better than MFold (average Q-value of 0.953 vs. 0.948) and can additionally predict pseudoknots. However, we note that the reported benchmark of secondary structure prediction only includes the Q-value (true positives) and does not take false positives (additional base pairs) into account.
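The difference between counting only true positives and also accounting for false positives can be made concrete with a small example. The sketch below assumes that the reported Q-value corresponds to the fraction of reference base pairs recovered (sensitivity) and contrasts it with the positive predictive value (PPV); the function name and the example pairs are hypothetical.

```python
def base_pair_metrics(reference, predicted):
    """Return (sensitivity, ppv) for two collections of (i, j) base pairs."""
    reference, predicted = set(reference), set(predicted)
    true_positives = len(reference & predicted)
    # Sensitivity: fraction of reference pairs that were predicted.
    sensitivity = true_positives / len(reference) if reference else 0.0
    # PPV: fraction of predicted pairs that are in the reference.
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv


reference = {(0, 19), (1, 18), (2, 17)}
# The prediction recovers every reference pair but adds one extra pair.
predicted = {(0, 19), (1, 18), (2, 17), (5, 14)}
sensitivity, ppv = base_pair_metrics(reference, predicted)
print(sensitivity, ppv)  # prints 1.0 0.75
```

A prediction with additional, incorrect base pairs can thus reach a perfect score on a measure that counts only true positives.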

Figure 3.
Different coarse-grained representations of nucleotides.

NAST uses a single point located at the C3′ atom. oxRNA uses a single rigid body with multiple interaction sites defined with respect to the RNA's center of mass (the positions in the figure are rough estimates of these interaction sites). iFoldRNA uses three beads per residue, located at the center of the phosphate, the sugar and the base, respectively. SimRNA uses five beads per nucleotide, three of which define the plane of the base.

SimRNA [58,59] models the base with three points and is thus able to capture the full three-dimensional orientation of the base's plane. The statistical energy terms for base–base, base–backbone, and backbone–backbone interactions can be visualized as heatmaps similar to the ones used in the original FARNA publication [63]. The authors report that their model performs better than other 3D structure prediction programs for short RNA sequences, but needs explicit secondary structure (and potentially long range) restraints to model longer RNA molecules correctly.

Among the last three discussed models, SimRNA is the only one that fully incorporates noncanonical base pairs in its energy model. Since secondary structure constraints are modeled as distance constraints, (extended) secondary structures could, in principle, be used as input, but the type of the base pair in the prediction might be different from the type in the input structure.

RNA puzzles

As of now, three rounds of RNA puzzles, a CASP-like blind experiment in RNA 3D structure prediction, have been completed [64–66]. In these competitions, participating experimental groups provide solved but unpublished tertiary structures. Computational groups then have a few weeks to perform tertiary structure prediction given the sequence and, in some cases, chemical probing information.

The submitted models are scored using different measures. The RMSD is a common measure for comparison of macromolecules which can be quickly calculated [67], but has been criticized [68]. The interaction network fidelity (INF) can be calculated for stacking and (non)canonical hydrogen-bonding interactions. While the RMSD is sensitive to errors in flexible loop regions, the INF is sensitive to errors in structured regions. Additionally, RMSD and INF are combined into the deformation index. From round II of RNA puzzles onwards, the mean of circular quantities (MCQ) [69], which is based on angular co-ordinates, was used as an additional measure. In contrast with the RMSD, errors in the relative orientation of two relatively large RNA domains do not have a higher impact on the MCQ score than those in the orientation of small stems or hairpins.
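The relationship between these measures can be illustrated in a few lines. The formulas are not spelled out in this review; the sketch below assumes the commonly used definitions, with INF as the geometric mean of PPV and sensitivity over sets of interactions and the deformation index as RMSD divided by INF. It also assumes that the two co-ordinate sets are already superimposed and list the same atoms in the same order. All function names and example values are illustrative.

```python
import math


def rmsd(model, reference):
    """RMSD between two pre-superimposed lists of 3D points (same order)."""
    if len(model) != len(reference) or not model:
        raise ValueError("co-ordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


def interaction_network_fidelity(model_interactions, reference_interactions):
    """Geometric mean of PPV and sensitivity over two interaction sets."""
    model, reference = set(model_interactions), set(reference_interactions)
    true_positives = len(model & reference)
    if true_positives == 0:
        return 0.0
    ppv = true_positives / len(model)
    sensitivity = true_positives / len(reference)
    return math.sqrt(ppv * sensitivity)


def deformation_index(rmsd_value, inf_value):
    """RMSD divided by INF; infinite when no interaction is reproduced."""
    return rmsd_value / inf_value if inf_value > 0 else math.inf


# Hypothetical example: every point is displaced by 1 along z,
# and the model contains one interaction that is absent from the reference.
r = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
inf = interaction_network_fidelity({"a", "b", "c", "d"}, {"a", "b", "c"})
print(r, inf, deformation_index(r, inf))
```

Dividing by INF penalizes models that achieve a low RMSD while missing or inventing interactions.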

In the first round of RNA puzzles, one riboswitch domain, a stem–loop structure with two interior loops, and a square formed by four helices and interior loops in the corners were modeled. Of these challenges, only the riboswitch domain contained true higher-order 3D folds. Interestingly, a fragment-based approach (Vfold) combined with relaxation in an all-atom AMBER force field yielded the result with the lowest RMSD for this puzzle. This probably means that fragments from homologous RNA molecules were selected based on the sequence similarity score. The second best result of this puzzle used a coarse-grained approach with secondary structure constraints, followed by full-atom refinement in a discrete MD framework (iFoldRNA). One lesson learned from this RNA puzzle is certainly that the huge size of the sampling space requires some sort of coarse-grained initial step which uses the secondary structure, followed by an all-atom refinement.

In round II [65] of the RNA puzzles contest, SHAPE data were provided to all participants by one group. Since this round's target structures were longer RNA molecules with more complex 3D folds (a ribozyme, a riboswitch, and a T-box–tRNA complex), these chemical probing data were crucial for 3D structure prediction. Despite the use of experimental secondary structures, no perfect predictions were obtained, which shows the open challenges in RNA 3D structure prediction.

In round III, it became apparent how homology modeling can achieve good results when a homolog with solved structure exists, and how targets without homology to solved structures still pose huge challenges to modelers. It is also notable that many groups who submitted several models could not correctly rank the best prediction as their top choice.

Conclusions

While RNA secondary structure prediction is well established and widely used, tertiary structure prediction has long seemed out of reach. Recently, however, two directions have emerged that promise RNA structure models that go beyond secondary structure. Inclusion of structure motifs and noncanonical base pairs could yield extended secondary structures that are more detailed and will perhaps even improve prediction accuracy. Coarse-grained models of tertiary structures may overcome the sampling problems that limit all-atom methods.

A successful RNA 3D structure prediction pipeline, as illustrated in Figure 4, will need several ingredients: It should start from a reliable secondary structure, preferably including tertiary motifs and supported by experimental evidence, such as probing data. Exploration of the conformation space will be best done using coarse-grained models that provide efficient sampling. Empirical scoring functions need to be able to identify coarse-grained conformation(s) close to the native state. Additional experimental restraints can be incorporated at this step, as reviewed by Magnus et al. [70]. The final step will reconstruct and refine all-atom models from the best coarse-grained conformations. While further improvements are needed in all of these stages, reliable RNA tertiary structure prediction is slowly getting within reach.

Figure 4.
A proposed RNA 3D structure prediction pipeline, as described in the section ‘Conclusions’.

In addition to the structure prediction (middle), additional experimental (left) and computational steps (right) can improve the prediction accuracy.

Summary
  • RNA secondary structure prediction is efficient and useful, but yields little information about overall 3D structure.

  • Secondary structures can be extended to include tertiary structure motifs and/or noncanonical base pairs. Such extended secondary structures provide more detail as well as a promising starting point for tertiary structure prediction.

  • Tertiary structure prediction remains difficult, but methods based on coarse-grained structure representations are continually improving their success rate.

Abbreviations

  • INF, interaction network fidelity

  • MCQ, mean of circular quantities

  • MD, molecular dynamics

  • NCMs, nucleotide cyclic motifs

  • RMSD, root mean square deviation

  • SAXS, small-angle X-ray scattering

  • SHAPE, selective 2′ hydroxyl acylation analyzed by primer extension

Funding

This work was funded, in part, by the Austrian FWF, project ‘SFB F43 RNA regulation of the transcriptome’.

Acknowledgments

We thank Roman Ochsenreiter for proof-reading the manuscript.

Competing Interests

The Authors declare that there are no competing interests associated with the manuscript.

References

References
1. Nussinov, R., Pieczenik, G., Griggs, J.R. and Kleitman, D.J. (1978) Algorithms for loop matchings. SIAM J. Appl. Math. 35, 68–82
2. Lorenz, R., Bernhart, S.H., zu Siederdissen, C.H., Tafer, H., Flamm, C., Stadler, P.F. et al. (2011) ViennaRNA package 2.0. Algorithms Mol. Biol. 6, 26
3. Turner, D.H. and Mathews, D.H. (2010) NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic Acids Res. 38 (Database), D280–D282
4. Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, L.S., Tacker, M. and Schuster, P. (1994) Fast folding and comparison of RNA secondary structures. Monatsh. Chem. 125, 167–188
5. Wuchty, S., Fontana, W., Hofacker, I.L. and Schuster, P. (1999) Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers 49, 145–165
6. Lorenz, R., Wolfinger, M.T., Tanzer, A. and Hofacker, I.L. (2016) Predicting RNA secondary structures from sequence and probing data. Methods 103, 86–98
7. Bellaousov, S. and Mathews, D.H. (2010) ProbKnot: fast prediction of RNA secondary structure including pseudoknots. RNA 16, 1870–1880
8. Mustoe, A.M., Brooks, C.L. and Al-Hashimi, H.M. (2014) Hierarchy of RNA functional dynamics. Annu. Rev. Biochem. 83, 441–466
9. Wolfe, B.R., Porubsky, N.J., Zadeh, J.N., Dirks, R.M. and Pierce, N.A. (2017) Constrained multistate sequence design for nucleic acid reaction pathway engineering. J. Am. Chem. Soc. 139, 3134–3144
10. Taneda, A. (2011) MODENA: a multi-objective RNA inverse folding. Adv. Appl. Bioinform. Chem. 4, 1–12
11. Kucharík, M., Hofacker, I.L., Stadler, P.F. and Qin, J. (2016) Pseudoknots in RNA folding landscapes. Bioinformatics 32, 187–194
12. Badelt, S., Hammer, S., Flamm, C. and Hofacker, I.L. (2015) Thermodynamic and kinetic folding of riboswitches. Methods Enzymol. 553, 193–213
13. Proctor, J.R. and Meyer, I.M. (2013) CoFold: an RNA secondary structure prediction method that takes co-transcriptional folding into account. Nucleic Acids Res. 41, e102
14. Gruber, A.R., Findeiß, S., Washietl, S., Hofacker, I.L. and Stadler, P.F. (2009) RNAZ 2.0. Improved noncoding RNA detection. In Pacific Symposium on Biocomputing 2010, pp. 69–79, World Scientific Pub. Co. Pte. Ltd, New Jersey
15. Eddy, S.R. and Durbin, R. (1994) RNA sequence analysis using covariance models. Nucleic Acids Res. 22, 2079–2088
16. Nawrocki, E.P. and Eddy, S.R. (2013) Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935
17. Kührová, P., Best, R.B., Bottaro, S., Bussi, G., Šponer, J., Otyepka, M. et al. (2016) Computer folding of RNA tetraloops: identification of key force field deficiencies. J. Chem. Theory Comput. 12, 4534–4548
18. Gil-Ley, A., Bottaro, S. and Bussi, G. (2016) Empirical corrections to the amber RNA force field with target metadynamics. J. Chem. Theory Comput. 12, 2790–2798
19. Whitford, P.C., Geggier, P., Altman, R.B., Blanchard, S.C., Onuchic, J.N. and Sanbonmatsu, K.Y. (2010) Accommodation of aminoacyl-tRNA into the ribosome involves reversible excursions along multiple pathways. RNA 16, 1196–1204
20. Whitford, P.C., Blanchard, S.C., Cate, J.H.D. and Sanbonmatsu, K.Y. (2013) Connecting the kinetics and energy landscape of tRNA translocation on the ribosome. PLoS Comput. Biol. 9, 1–10
21. Tribello, G.A., Bonomi, M., Branduardi, D., Camilloni, C. and Bussi, G. (2014) PLUMED 2: new feathers for an old bird. Comput. Phys. Commun. 185, 604–613
22. Šponer, J., Bussi, G., Stadlbauer, P., Kührová, P., Banáš, P., Islam, B. et al. (2017) Folding of guanine quadruplex molecules–funnel-like mechanism or kinetic partitioning? An overview from MD simulation studies. Biochim. Biophys. Acta, Gen. Subj. 1861, 1246–1263
23. Cunha, R.A. and Bussi, G. (2017) Unravelling Mg2+-RNA binding with atomistic molecular dynamics. RNA 23, 628–638
24. Dawson, W.K., Maciejczyk, M., Jankowska, E.J. and Bujnicki, J.M. (2016) Coarse-grained modeling of RNA 3D structure. Methods 103, 138–156
25. Leontis, N.B. and Westhof, E. (2001) Geometric nomenclature and classification of RNA base pairs. RNA 7, 499–512
27. Petrov, A.S., Bernier, C.R., Gulen, B., Waterbury, C.C., Hershkovits, E., Hsiao, C. et al. (2014) Secondary structures of rRNAs from all three domains of life. PLoS ONE 9, e88222
28. Mohan, S. and Noller, H.F. (2017) Recurring RNA structural motifs underlie the mechanics of L1 stalk movement. Nat. Commun. 8, 14285
29. Parisien, M. and Major, F. (2008) The MC-fold and MC-Sym pipeline infers RNA structure from sequence data. Nature 452, 51–55
30. Lemieux, S. (2006) Automated extraction and classification of RNA tertiary structure cyclic motifs. Nucleic Acids Res. 34, 2340–2346
31. zu Siederdissen, C.H., Bernhart, S.H., Stadler, P.F. and Hofacker, I.L. (2011) A folding algorithm for extended RNA secondary structures. Bioinformatics 27, i129–i136
32. Djelloul, M. and Denise, A. (2008) Automated motif extraction and classification in RNA tertiary structures. RNA 14, 2489–2497
33. Zhong, C. and Zhang, S. (2012) Clustering RNA structural motifs in ribosomal RNAs using secondary structural alignment. Nucleic Acids Res. 40, 1307–1317
34. Wadley, L.M. (2004) The identification of novel RNA structural motifs using COMPADRES: an automated approach to structural discovery. Nucleic Acids Res. 32, 6650–6659
35. Huang, H.C. (2005) The application of cluster analysis in the intercomparison of loop structures in RNA. RNA 11, 412–423
36. Wang, X., Huan, J., Snoeyink, J.S. and Wang, W. (2007) Mining RNA tertiary motifs with structure graphs. In 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007), pp. 31–40, Institute of Electrical and Electronics Engineers (IEEE)
37. Chojnowski, G., Waleń, T. and Bujnicki, J.M. (2014) RNA bricks – a database of RNA 3D motifs and their interactions. Nucleic Acids Res. 42, D123–D131
38. Cruz, J.A. and Westhof, E. (2011) Sequence-based identification of 3D structural modules in RNA with RMDetect. Nat. Methods 8, 513–519
39. Petrov, A.I., Zirbel, C.L. and Leontis, N.B. (2013) Automated classification of RNA 3D motifs and the RNA 3D motif atlas. RNA 19, 1327–1340
40. Zirbel, C.L., Roll, J., Sweeney, B.A., Petrov, A.I., Pirrung, M. and Leontis, N.B. (2015) Identifying novel sequence variants of RNA 3D motifs. Nucleic Acids Res. 43, 7504–7520
41. Tung, C.S., Joseph, S. and Sanbonmatsu, K.Y. (2002) All-atom homology model of the Escherichia coli 30S ribosomal subunit. Nat. Struct. Biol. 9, 750–755
42. Lorenz, R., Bernhart, S.H., Qin, J., zu Siederdissen, C.H., Tanzer, A., Amman, F. et al. (2013) 2D meets 4G: G-quadruplexes in RNA secondary structure prediction. IEEE Trans. Comp. Biol. Bioinf. 10, 832–844
43. Xu, X., Zhao, P. and Chen, S.J. (2014) Vfold: a web server for RNA structure and folding thermodynamics prediction. PLoS ONE 9, e107504
44. Cao, S. and Chen, S.J. (2011) Physics-based de novo prediction of RNA 3D structures. J. Phys. Chem. B 115, 4216–4226
45. Popenda, M., Szachniuk, M., Antczak, M., Purzycka, K.J., Lukasiak, P., Bartol, N. et al. (2012) Automated 3D structure composition for large RNAs. Nucleic Acids Res. 40, e112
46. Kerpedjiev, P., zu Siederdissen, C.H. and Hofacker, I.L. (2015) Predicting RNA 3D structure using a coarse-grain helix-centered model. RNA 21, 1110–1121
47. Kim, N., Laing, C., Elmetwaly, S., Jung, S., Curuksu, J. and Schlick, T. (2014) Graph-based sampling for approximating global helical topologies of RNA. Proc. Natl Acad. Sci. U.S.A. 111, 4079–4084
48. Laing, C. and Schlick, T. (2009) Analysis of four-way junctions in RNA structures. J. Mol. Biol. 390, 547–559
49. Zahran, M., Bayrak, C.S., Elmetwaly, S. and Schlick, T. (2015) RAG-3D: a search tool for RNA 3D substructures. Nucleic Acids Res. 43, 9474–9488
50. Nissen, P., Ippolito, J.A., Ban, N., Moore, P.B. and Steitz, T.A. (2001) RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. U.S.A. 98, 4899–4903
51. Jonikas, M.A., Radmer, R.J., Laederach, A., Das, R., Pearlman, S., Herschlag, D. et al. (2009) Coarse-grained modeling of large RNA molecules with knowledge-based potentials and structural filters. RNA 15, 189–199
52. Weinreb, C., Riesselman, A.J., Ingraham, J.B., Gross, T., Sander, C. and Marks, D.S. (2016) 3D RNA and functional interactions from evolutionary couplings. Cell 165, 963–975
53. Zemla, A., Venclovas, Č., Moult, J. and Fidelis, K. (1999) Processing and analysis of CASP3 protein structure predictions. Proteins Struct. Funct. Bioinform. 37 (S3), 22–29
54. Šulc, P., Romano, F., Ouldridge, T.E., Doye, J.P.K. and Louis, A.A. (2014) A nucleotide-level coarse-grained model of RNA. J. Chem. Phys. 140, 235102
55. Matek, C., Šulc, P., Randisi, F., Doye, J.P.K. and Louis, A.A. (2015) Coarse-grained modelling of supercoiled RNA. J. Chem. Phys. 143, 243122
56. Das, R., Karanicolas, J. and Baker, D. (2010) Atomic accuracy in predicting and designing noncanonical RNA structure. Nat. Methods 7, 291–294
57. Cheng, C.Y., Chou, F.C. and Das, R. (2015) Modeling complex RNA tertiary folds with Rosetta. Methods Enzymol. 553, 35–64
58. Boniecki, M.J., Lach, G., Dawson, W.K., Tomala, K., Lukasz, P., Soltysinski, T. et al. (2016) SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res. 44, e63
59. Magnus, M., Boniecki, M.J., Dawson, W. and Bujnicki, J.M. (2016) SimRNAweb: a web server for RNA 3D structure modeling with optional restraints. Nucleic Acids Res. 44 (W1), W315–W319
60. Sharma, S., Ding, F. and Dokholyan, N.V. (2008) iFoldRNA: three-dimensional RNA structure prediction and folding. Bioinformatics 24, 1951–1952
61. Ding, F., Sharma, S., Chalasani, P., Demidov, V.V., Broude, N.E. and Dokholyan, N.V. (2008) Ab initio RNA folding by discrete molecular dynamics: from structure prediction to folding mechanisms. RNA 14, 1164–1173
62. Dokholyan, N.V., Buldyrev, S.V., Stanley, H.E. and Shakhnovich, E.I. (1998) Discrete molecular dynamics studies of the folding of a protein-like model. Fold. Des. 3, 577–587
63. Das, R. and Baker, D. (2007) Automated de novo prediction of native-like RNA tertiary structures. Proc. Natl Acad. Sci. U.S.A. 104, 14664–14669
64. Cruz, J.A., Blanchet, M.F., Boniecki, M., Bujnicki, J.M., Chen, S.J., Cao, S. et al. (2012) RNA-puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction. RNA 18, 610–625
65. Miao, Z., Adamiak, R.W., Blanchet, M.F., Boniecki, M., Bujnicki, J.M., Chen, S.J. et al. (2015) RNA-puzzles round II: assessment of RNA structure prediction programs applied to three large RNA structures. RNA 21, 1066–1084
66. Miao, Z., Adamiak, R.W., Antczak, M., Batey, R.T., Becka, A.J., Biesiada, M. et al. (2017) RNA-puzzles round III: 3D RNA structure prediction of five riboswitches and one ribozyme. RNA 23, 655–672
67. Liu, P., Agrafiotis, D.K. and Theobald, D.L. (2009) Fast determination of the optimal rotational matrix for macromolecular superpositions. J. Comput. Chem. 31, 1561–1563
68. Cristobal, S., Zemla, A., Fischer, D., Rychlewski, L. and Elofsson, A. (2001) A study of quality measures for protein threading models. BMC Bioinf. 2, 5
69. Zok, T., Popenda, M. and Szachniuk, M. (2014) MCQ4Structures to compute similarity of molecule structures. Cent. Eur. J. Oper. Res. 22, 457–473
70. Magnus, M., Matelska, D., Łach, G., Chojnowski, G., Boniecki, M.J., Purta, E. et al. (2014) Computational modeling of RNA 3D structures, with the aid of experimental restraints. RNA Biol. 11, 522–536