Recent structural characterizations of the p51 and p66 monomers have established an important starting point for understanding the maturation pathway of the human immunodeficiency virus (HIV)-1 reverse transcriptase p66/p51 heterodimer. This process requires a metamorphic transition of the polymerase domain leading to formation of a p66/p66′ homodimer that exists as a structural heterodimer. To better understand the drivers for this metamorphic transition, we have performed NMR studies of 15N-labeled RT216, a construct that includes the fingers and most of the palm domains. These studies are consistent with the conclusion that the p66 monomer exists as a spring-loaded complex. Initial dissociation of the fingers/palm : connection complex allows the fingers/palm to adopt an alternate, more stable structure, reducing the rate of reassociation and facilitating subsequent maturation steps. One of the drivers for an initial extension of the fingers/palm domains is identified as a straightening of helix E relative to its conformation in the monomer by eliminating a bend of ∼50° near residue Phe160. NMR and circular dichroism data are also consistent with the conclusion that a hydrophobic surface of the palm domain that becomes exposed after the initial dissociation, as well as the intrinsic conformational preferences of the palm domain C-terminal segment, facilitates the formation of the β-sheet structure that is unique to the active polymerase subunit. Spectral comparisons based on 15N-labeled constructs are all consistent with previous structural conclusions based on studies of 13C-methyl-labeled constructs.
Reverse transcription of the single-stranded viral RNA genome into double-stranded DNA is an essential step in the human immunodeficiency virus (HIV) life cycle that is dependent on the enzymatic activities of the retroviral enzyme reverse transcriptase (RT). Mature HIV-1 RT is a p66/p51 heterodimer composed of a p66 subunit containing the active polymerase and RNase H sites, and a second p51 subunit that is formed from the same polymerase domain that has adopted an alternate fold in which the polymerase active site is buried and nonfunctional. This unusual structure provides a mechanism for responding to the strong selective pressure to minimize the genome size of RNA viruses; the HIV viral genome is only 9.2 kb, a value typical for RNA viruses [3,4]. Reliance on a single protein that adopts two alternate folding patterns provides a means of increasing the information density of the HIV genome. In the case of RT, this economy of coding comes at the cost of a complex maturation process, such that a pair of p66 precursors develop along different pathways, one forming the polymerase and RNase H active sites, while the second adopts an inactive fold with a buried polymerase active site and loses most of its RH domain. Shortly after the structure of the RT heterodimer was first determined [5,6], a structural analysis led Steitz and co-workers to conclude that the inactive structure of the polymerase domain that is present in the p51 subunit is more stable than the active structure observed in the p66 subunit. Based on this analysis, they proposed that homodimers formed from either p51 or p66 would be structurally asymmetric, formed by combining a pair of polymerase domains that have adopted the two alternate structures.
Recent NMR, crystallographic, and molecular dynamics (MD) studies have provided structural information about the p66 monomer precursor and elucidated the three major steps involved in the formation of the heterodimer [1,8–11]: (1) unimolecular rearrangement of the monomer subdomains to a structure approximating the mature p66 subunit; (2) initial homodimer formation by combining the structurally isomerized p66 with the predominant monomer; (3) very slow conformational changes that result in subunit-specific RH domain unfolding, exposing the buried proteolysis site within the RH domain. These studies have elucidated many additional details about the factors that underlie these complex conformational transitions. Furthermore, they provide basic insights into the mechanisms by which the metamorphic polymerase domain is able to adopt alternate structures.
As more structural information for complex biological molecules has become available, metamorphic proteins, i.e. proteins for which there is not a unique relationship between sequence and structure, have been increasingly identified [12–16]. It thus becomes of increasing interest to identify the factors that influence and are involved in the corresponding metamorphic structural transition. For HIV-1 RT, the metamorphic transition involves tension that exists between the p66 monomer structure that corresponds to a global energy minimum and the intrinsic conformational preferences of its component domains . For example, recent studies indicate that subsequent to dissociation of the fingers/palm from the connection domain, the domains undergo a conformational expansion that alters the relative orientation of the fingers and palm. However, understanding of the structural basis for this transition remains limited.
Despite the substantial progress that has been made in the development of RT-targeted drugs, RT remains an important target due to the ability of the enzyme to develop drug-resistant mutations [17,18], and to the toxicity of nucleoside RT inhibitors, which must be administered chronically since there is as yet no cure for this illness. The studies described here have two objectives: (1) to use a 15N-labeling approach to further evaluate earlier conclusions based largely on 13C-methyl-labeling studies [8,10,11] and to further investigate the basis for the metamorphic transition that is an essential step of RT maturation, and (2) to compare results with recent NMR studies of 15N-labeled p66 reported by Sharaf et al. that reached a different conclusion about the nature of the p66/p66 homodimer.
Materials and methods
Cloning, expression, and purification of RT216
A truncated construct, RT216, including the fingers and most of the palm domains, was constructed from the wild-type p51 expression vector by adding a stop codon at position 217 using the QuikChange XL site-directed mutagenesis kit (Agilent Technologies) and transformed into BL21(DE3) codon plus RIL. RT216 expression was induced by the addition of 0.5 mM IPTG at A600 ∼0.9, followed by growth of cells at 22°C for 18 h. The cell pellets were lysed by sonication in 50 mM Tris–HCl, 5% glycerol, and 1 mM EDTA (pH 8.0) buffer (Buffer A). The cell lysate was centrifuged at 30 000 g for 30 min, and the supernatant was loaded on a Q-Sepharose Fast Flow column and a HiTrap SP HP column connected in tandem. When the OD280 of the flow-through was observed to be less than 0.01, the HiTrap SP HP column was eluted with a 0–1 M NaCl gradient in Buffer A. The fractions containing RT216 were pooled based on SDS–PAGE analysis. The pooled RT216 samples were concentrated to a small volume (<10 ml) and further purified by gel filtration chromatography on a HiLoad 26/60 Superdex-200 column using a buffer of 50 mM Tris–HCl, 200 mM NaCl, and 1 mM EDTA (pH 8.0).
The uniformly U-[2H,13C,15N]-labeled RT216 samples were expressed in Escherichia coli BL21(DE3) codon plus RIL using M9 deuterated (99% D2O) medium containing U-[2H,13C] glucose (2 g/l) and 15NH4Cl (1 g/l) as the sole carbon and nitrogen sources. The NMR samples were concentrated to 0.6 mM and exchanged into 25 mM Tris–HCl-d11, 8% D2O, and 0.02% NaN3 (pH 6.8). The U-[15N]RH domain of HIV-1 RT and the U-[15N]thumb domain constructs were prepared as described previously, using M9 minimal media containing 15NH4Cl (1 g/l). As noted previously, the RH domain construct includes a four-residue Met–Asn–Glu–Leu sequence prior to Tyr427. Preparation of the p66ΔPL (RT residues 1–560 with palm domain residues 219–230 deleted) also followed the procedure previously described; however, the U-[2H,15N] labeling pattern used the media described above for RT216 with U-[2H] glycerol (2 g/l) replacing the doubly labeled glucose.
NMR data in the present study were obtained at 25°C on Agilent DD2 800 MHz and Varian INOVA 600 MHz NMR spectrometers, each equipped with a 5-mm Varian 1H[13C,15N] triple-resonance cryogenically cooled probe. Chemical shift assignments of RT216, the isolated thumb domain, and the isolated RH domain were made as described previously [8,21,22]. The sequential backbone and Cβ resonance assignments were established by the combined analysis of TROSY (transverse relaxation-optimized spectroscopy) HNCA, HN(CO)CA, HN(CA)CB, and HN(COCA)CB spectra using triply labeled U-[2H,13C,15N]RT216. Resonance assignments were challenging as a result of spectral congestion, the presence of 18 proline residues in the RT216 construct, and additional broadening of some resonances due to exchange processes. In addition to the triply labeled sample, we therefore made use of selectively labeled samples containing [15N]-alanine, [15N]-leucine, or [15N]-valine (Supplementary Figure S1). A total of 161 of the expected 197 (=215–18) non-proline amide resonances were assigned, corresponding to 81% coverage of the protein, and deposited in the BMRB (accession no. 25292). Of particular note were the shifts for Gly155, for which δ(1H,15N) = (3.78, 99.62), well beyond the typical range of values. Examination of RT crystal structures indicates that these extreme upfield shifts result from the close proximity of the Gly155 NH to the indole ring of Trp153.
The 1H-15N TROSY HSQC (Heteronuclear Single-Quantum Coherence spectroscopy) spectra of U-[2H,15N]p66ΔPL  were obtained using Agilent's gNfhsqc experiment in Biopack (Agilent, Santa Clara, CA). In the 1H dimension, 1024 complex points were acquired with a sweep width of 14 ppm using a relaxation delay of 1 s. In the indirect 15N dimension, 128 complex points were acquired with a spectral width of 29.6 ppm, and the 15N offset was set to 118.886 ppm. The residual water peak was suppressed using the 3919 WATERGATE sequence . The 1H-13C HMQC spectrum of [U-2H,13CH3-Ile]p51 was obtained as described recently . All NMR data were processed by NMRPipe  and analyzed with NMRViewJ .
The MD simulations treated the segment from 1 to 237 following the protocol described recently . Four systems were considered: (i) residues 1–236 of the p51 subunit from the X-ray crystal structure (pdb ID: 1DLO) with residues 219–230 introduced using initial coordinates for the p66 subunit (pdb ID: 1DLO), (ii) residues 1–256 of the p51ΔPL monomer from the X-ray crystal structure (pdb ID 4KSE(Δ)), (iii) residues 1–236 of the p51 subunit from the X-ray crystal structure (pdb ID: 1S9E), which includes residues 219–230; and (iv) residues 1–236 of the p51 subunit of the p51ΔPL monomer (pdb ID: 4KSE) with residues 219–230 introduced using initial coordinates for the p66 subunit (pdb ID: 1DLO). Simulations used the standard Amber FF10 force field for amino acid residues; 100–150 ns production MD runs of the solvated peptides were carried out to evaluate various dynamic properties. The angle θAF between helices A and F was determined as described previously , and the angle θE in these constructs was determined based on helical segments: 156–160 and 162–166.
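The interhelix angles θAF and θE can be illustrated with a short calculation. The sketch below is a minimal example, not the protocol used in the cited work: it assumes that each helix direction is estimated as the principal axis of the Cα coordinates of a segment, and the function names `helix_axis` and `interhelix_angle` and the straight-line test coordinates are hypothetical. For segments as short as five residues, a principal-axis estimate is only approximate because the Cα atoms wind around the true helix axis.

```python
import numpy as np

def helix_axis(ca_coords):
    """Unit vector along a segment, taken as the principal axis of its C-alpha coordinates."""
    coords = np.asarray(ca_coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    # The first right-singular vector is the direction of greatest variance.
    _, _, vt = np.linalg.svd(centered)
    axis = vt[0]
    # Orient the axis from the first to the last residue so its sign is well defined.
    if np.dot(axis, coords[-1] - coords[0]) < 0:
        axis = -axis
    return axis / np.linalg.norm(axis)

def interhelix_angle(seg_a, seg_b):
    """Angle in degrees between the axes of two segments."""
    a, b = helix_axis(seg_a), helix_axis(seg_b)
    cos_angle = np.clip(np.dot(a, b), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

# Illustrative input only: two straight runs of points standing in for the
# C-alpha traces of residues 156-160 and 162-166 (not coordinates from any PDB entry).
spacing = 1.5  # approximate rise per residue along an alpha-helix axis, in angstroms
bend = np.radians(50.0)
seg1 = [(0.0, 0.0, spacing * i) for i in range(5)]
seg2 = [(0.0, spacing * i * np.sin(bend), spacing * i * np.cos(bend)) for i in range(5)]
theta_e = interhelix_angle(seg1, seg2)
print(f"theta_E = {theta_e:.1f} degrees")
```

Applied frame by frame to an MD trajectory, the same function would give the time course of the bend; loading real coordinates would require a structure-parsing library, which is omitted here.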
Circular dichroism spectroscopy
The secondary structure of two peptides corresponding to the palm domain C-terminal segment [PFL — RT(226–241) –PFLWMGYELHPDKWTV] or to a more water-soluble analog containing three substitutions [PYK — RT(226–241)(F227Y,L228K,V241K) –PYKWMGYELHPDKWTK] was determined by circular dichroism (CD). The CD signal from 190 to 260 nm was measured on a Jasco J-810 CD spectrophotometer using a 1 mm quartz cell at 20°C containing 10–20 µM peptide concentrations in 10 mM sodium phosphate (pH 8.0), with fractional ethanol volume ranging from 0 (for the water-soluble analog) to 40% under the following settings: 1 nm bandwidth, 0.2 nm data pitch, 2 s response, and 200 nm/min scanning speed. CD spectra from three individual measurements were averaged and processed using the Spectral Manager software version 1.5 (Jasco Corporation, Japan) with a curve-smoothing function. The secondary structure content of the peptide was estimated from the DichroWeb on-line server using the CDSSTR method , and data visualization was performed with Excel software (Microsoft Corporation, U.S.A.). Predicted CD spectra for residues 226–241 corresponding to each subunit of RT were calculated from the PDB2CD on-line web server (http://pdb2cd.cryst.bbk.ac.uk) .
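The averaging and smoothing of replicate CD scans can be sketched as follows. This is an illustrative example only: the smoothing algorithm used by the Jasco software is not specified in the text, so a simple centered moving average is substituted here, and the function names and the synthetic scans are hypothetical. The wavelength grid follows the stated settings (190 to 260 nm at a 0.2 nm data pitch).

```python
import numpy as np

def average_scans(scans):
    """Average replicate CD scans recorded on the same wavelength grid."""
    return np.mean(np.asarray(scans, dtype=float), axis=0)

def moving_average(signal, window=5):
    """Centered moving-average smoothing; a stand-in for the vendor smoothing function."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pad = window // 2
    # Repeat the edge values so the output has the same length as the input.
    padded = np.pad(np.asarray(signal, dtype=float), pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

# 190-260 nm at 0.2 nm pitch gives 351 points.
wavelengths = np.linspace(190.0, 260.0, 351)

# Synthetic scans for illustration only (not measured data).
rng = np.random.default_rng(0)
true_signal = -10.0 * np.exp(-((wavelengths - 217.0) / 12.0) ** 2)
scans = [true_signal + rng.normal(0.0, 0.5, wavelengths.size) for _ in range(3)]

averaged = average_scans(scans)
smoothed = moving_average(averaged, window=9)
```

The smoothed curve would then be the input for secondary structure estimation; that step is performed by an external server and is not reproduced here.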
The RT dimer represents an extremely challenging target for direct structural analysis by NMR. Consequently, it has proved useful to analyze the results of NMR studies in the context of available structural information. A schematic illustration of the early maturation steps based on previous studies is shown in Figure 1 [8,10,11]. The p66 monomer, shown as both a ribbon diagram and schematic representation, consists of a globular structure corresponding to a complex of the connection and fingers/palm domains, while the thumb and RH domains are loosely connected by flexible-linking segments derived from unraveling of the palm, thumb, and connection domains . For clarity, both the domains and polymerase subdomains are simply referred to as domains. The fingers and palm domain sequences are discontinuous and so cannot be separated [5,30]. The structure of the fingers/palm : connection complex is similar to that of the p51 subunit of the RT heterodimer (e.g. ). Support for this structure was derived from crystallographic and NMR studies of constructs in which a segment of the C-terminal palm loop corresponding to residues 219–230 was deleted, as well as studies demonstrating that this deletion does not significantly alter the monomer structure . Dissociation of the connection from the fingers/palm allows further domain rearrangements, exposing surfaces involved in dimer formation. The rearranged monomer is then able to interact with the more stable (not rearranged) monomer, forming an initial p66/p66′ homodimer that exists as a structural heterodimer. Subsequently, a series of conformational changes occur primarily in the p66′ subunit that culminate with the subunit-specific destabilization and unfolding of the p66′ RH′ domain  (not shown). RH domain unfolding allows the protease access to the monomer-inaccessible cleavage site.
Early steps of RT structural maturation.
Recent studies have suggested that the p66 monomer is spring-loaded, so that after dissociation of the connection domain, the angle between the fingers and palm domains expands significantly (Figure 2A–C; ). Figure 2A illustrates the more bent conformation of the fingers/palm observed in the monomer construct with disordered palm loop residues 219–230 deleted (pdb: 4KSE; ), where the angle θAF between helix αA in the fingers and αF in the palm is ∼50°. A similar, acute angle characterizes the p51 subunit of RT. In contrast, the crystal structure of a construct corresponding to the isolated fingers and most of the palm, RT216, adopts a more expanded structure in which θAF ∼90° (Figure 2B), similar to the fingers/palm angle in the p66 subunit of RT. A further comparison reveals that in the monomer, helix αE exhibits a sharp bend near residue Phe160, but straightens out in the isolated RT216 construct (Figure 2C). Helix E is located mostly in the palm, but extends to the fingers–palm interface. The enhanced stability of the straighter helix geometry is presumably one of the factors that favor the extended structure of the isolated fingers/palm construct. It is thus of interest to determine whether this bend is also present in solution.
Structural comparisons of the palm/loop domains.
The fingers and palm domains form part of the intersubunit interface in RT, leading to the possibility that similar interactions in the isolated RT216 construct may lead to aggregation. The interface between the p51 fingers and p66 palm domains in the RT heterodimer is shown in Figure 2D. This interface involves two components: (1) the p66 (palm domain) region near Trp88 and (2) the fingers domain (p51) region involving the β7–β8 loop. Interestingly, a similar but less complete interface exists at the lattice contacts of the reported RT216 crystal structure (Figure 2E; pdb: 1HAR). In this case, palm domain residue Trp88 forms a similar set of interactions with fingers domain residues on an adjacent molecule; however, the β7–β8 loop is poorly defined and does not interact with the palm domain. Furthermore, the palm domain segment from residues 90 to 104 is disordered (Figure 2E). These results are consistent with the possibility of intermolecular interactions of the RT216 construct, but suggest that the interface may be less extensive than that observed in the RT heterodimer.
Solution behavior of the isolated fingers/palm domains
To evaluate the solution behavior of the RT216 construct discussed above, isotopically labeled RT216 was expressed and characterized by NMR. The 1H-15N HSQC spectrum of doubly labeled U-[13C,15N]-RT216 showed many well-dispersed resonances, but was also characterized by some regions with a high degree of congestion and uneven resonance intensities in the random coil region near δ1H = 8.0–8.5 ppm. One characteristic of RT216 that tends to be unfavorable for NMR analysis is the presence of multiple, solvent-exposed aromatic side chains. A solvent exposure analysis of the RT216 crystal structure (pdb: 1HAR) indicates that seven aromatic residues (Trp24, Phe61, Trp71, Trp88, Tyr181, Tyr183, and Tyr188) have a fractional solvent exposure ranging from 24 to 88%. Exposed aromatic side chains can mediate aggregation and can also broaden the resonances of nearby residues through internal motion. Significant improvement in spectral resolution and homogeneity was obtained using triple U-[2H,13C,15N] labeling (Figure 3A), although several congested areas remained. Missing assignments include several short segments bounded by proline residues, a central region of helix E (164–170), and several segments that are also disordered in the RT216 crystal structure (Supplementary Figure S2). A significant number of the problematic residues in the RT216 construct are readily observed in crystal structures of the full RT molecule, most probably due to additional stabilizing interactions at the domain boundaries.
Solution behavior of RT216.
|Secondary structure|p66¹|p51¹|RT216²|p51ΔPL monomer³|
¹ Secondary structure identification from ref. .
² Secondary structure from PyMol analysis of pdb entry 1HAR for RT216.
³ Secondary structure from PyMol analysis of pdb entry 4KSE for p51ΔPL.
As summarized above, the fingers and palm domains form an intersubunit interface (Figure 2D), a portion of which is also present as a lattice contact surface in the crystal structure of RT216 (Figure 2E). Based on these structures, it was anticipated that dimerization or formation of higher aggregates could be occurring for the RT216 construct. To evaluate the effects of intermolecular complex formation, concentration-dependent NMR spectra of the labeled RT216 were obtained over the range from 45 to 650 µM (Figure 3B). Secondary structure identification is given in Table 1. The two most concentration-sensitive regions of the protein are two β-sheets: β3−β4 on the fingers domain and β6−β9−β10 on the palm (Supplementary Figure S2). Both of these sheets contain solvent-exposed hydrophobic residues that presumably mediate the aggregation. Truncation of the palm domain in RT216 removes a region of the structure that would normally interact with the hydrophobic surface of the NNRTI pocket, exposing NNRTI-binding residues Leu100, Val106, Val108, Tyr181, Tyr183, and Tyr188 to the solvent. The hydrophobic region of the fingers domain includes a region that interacts with the RNA or DNA substrate. Few concentration-dependent shifts were observed for residues located in regions that form the fingers : palm interface in the RT heterodimer (Figure 2D). However, the fingers domain amide resonances for Lys20, Val21, Lys22, Tyr56, and Asn57, which would be positioned near palm domain residue Trp88 if aggregation were occurring, were unassigned, possibly as a result of monomer–dimer exchange broadening.
Since the major difference between the fingers/palm structure in the monomer and in the isolated fingers/palm construct involves a change in relative domain orientation, we attempted to obtain RDC data using multiple alignment media. None of these approaches was successful, perhaps as a consequence of the solvent-exposed aromatic residues noted above. We subsequently turned to a more detailed analysis of helix αE. This helix, which has been considered to extend from residue 154 to 174 , is unusual, containing Pro residues at positions 157 and 170. Residues 164–170 in the center of this helix are positioned near milder bends in the helix and correspond to regions of increased local instability that limited assignments. However, the major kink in the helix is not located at these positions, but near Phe160, and assignments were made for several residues immediately before and after Phe160 (Supplementary Figure S3).
A TALOS+ analysis of the shifts  provided the most useful approach for assessing the solution conformation of helix αE near the bend. As shown in Figure 4, TALOS analysis of the shift data for residues 158–162 yielded a set of tightly clustered phi and psi values consistent with the α-helical structure. Comparisons with the phi/psi values for the same residues in the isolated RT216 construct (pdb: 1HAR) or the p66 subunit of the RT heterodimer (pdb: 1DLO) are also consistent with α-helix geometry, while the corresponding values for the RT p51 subunit or the p51ΔPL monomer (pdb: 4KSE) show much larger variations. Thus, the TALOS analysis supports the conclusion that in solution, RT216 adopts a helix E geometry that lacks the sharp bend that is present in the monomer, and is thus consistent with the conformation observed in the extended conformation present in the p66 subunit of the RT heterodimer.
Conformational analysis of helix E.
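The comparison of backbone dihedral angles described above can be illustrated with a small calculation. The sketch below is an assumption-laden example, not the TALOS+ procedure itself: the α-helical window (φ from −90° to −30°, ψ from −75° to −15°) is a commonly used approximate region chosen here for illustration, the function names are hypothetical, and the φ/ψ pairs are invented values rather than results from this study.

```python
import math

def is_alpha_helical(phi, psi, phi_range=(-90.0, -30.0), psi_range=(-75.0, -15.0)):
    """True if (phi, psi) in degrees falls inside an approximate alpha-helical window."""
    return phi_range[0] <= phi <= phi_range[1] and psi_range[0] <= psi <= psi_range[1]

def helical_fraction(dihedrals):
    """Fraction of (phi, psi) pairs that fall in the helical window."""
    flags = [is_alpha_helical(phi, psi) for phi, psi in dihedrals]
    return sum(flags) / len(flags)

def circular_std_deg(angles_deg):
    """Circular standard deviation in degrees; small values indicate tight clustering."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = min(1.0, math.hypot(c, s))  # mean resultant length, guarded against rounding
    if r == 0.0:
        return float("inf")
    return math.degrees(math.sqrt(max(0.0, -2.0 * math.log(r))))

# Invented values for five consecutive residues (illustration only).
straight_helix = [(-62.0, -41.0), (-65.0, -40.0), (-60.0, -44.0), (-63.0, -42.0), (-64.0, -39.0)]
kinked_helix = [(-62.0, -41.0), (-95.0, 10.0), (-120.0, 130.0), (-70.0, -20.0), (-64.0, -39.0)]

for name, pairs in (("straight", straight_helix), ("kinked", kinked_helix)):
    phis = [p for p, _ in pairs]
    psis = [q for _, q in pairs]
    print(name, helical_fraction(pairs), circular_std_deg(phis), circular_std_deg(psis))
```

A tightly clustered, fully helical set of angles is what would be expected for a straight helix, whereas a bend shows up as one or more residues leaving the helical window and a larger spread.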
MD simulations of the fingers/palm
We previously reported MD simulations for the isolated fingers/palm domains indicating that over a time frame of ∼100 ns, the more expanded conformation observed in the p66 subunit or the isolated fingers/palm construct persists (θAF ∼ 90°), while the more bent conformation (θAF ∼ 50°) present in the monomer is not stable once the fingers/palm has been separated from the connection domain . It appears that the bent conformation of αE in the monomer is stabilized by the extensive interface between the fingers/palm and the connection domains. These as well as additional simulations were again used to evaluate the bend in helix E (θE), based on a comparison of the helix direction in segments 156–160 and 162–164. Initial bends of ∼40°–60° are reduced to ∼20° on the same time scale as the increase in θAF (Figure 5). Each simulation starts from a slightly different structure at time zero: 1DLO, 1S9E, 4KSE, and 4KSEΔ (see Materials and Methods). Straightening of helix E appears to be concerted with the change in θAF in simulations A and B, and to slightly precede the change in θAF in simulations C and D. These results suggest that the more stable helical conformation helps select the extended conformation and perhaps may precede the overall conformational expansion.
MD simulations of helix E.
Intrinsic conformational preferences of the palm C-terminal residues
In addition to changes in the fingers/palm orientation, the metamorphic transition involves formation of a short, four-stranded β-sheet that is not present in the monomer. This sheet is formed from residues 226 to 241 corresponding to the C-terminus of the palm domain (Supplementary Figure S4), and residues 314 to 318 at the beginning of the connection domain (using recently proposed domain boundaries ). We previously suggested that, as in the case of the expanded fingers/palm angle, this conformational transformation arises after connection domain dissociation removes constraints of the monomer, allowing the intrinsic conformational preferences of the peptide segments to be expressed. The conformational preferences of the palm domain C-terminal peptide, residues 226–241 (peptide PFL: PFLWMGYELHPDKWTV), were evaluated using circular dichroism (CD). Unsurprisingly, the extremely hydrophobic peptide PFL was predicted to have a high tendency toward aggregation based on analysis by the web server TANGO . We therefore also evaluated a more hydrophilic analog: RT (226–241) containing three residue substitutions: F227Y, L228K, and V241K (peptide PYK: PYKWMGYELHPDKWTK) that appeared likely to increase water solubility without significantly perturbing the expected residue interactions. Consistent with this expectation, the TANGO analysis indicated that the hydrophilic analog has no significant tendency to aggregate, and both isolated peptides are predicted to have β-sheet structure, although with low probability.
CD spectra of the hydrophilic analog were obtained in water, while due to solubility limitations, the hydrophobic peptide spectra were obtained in 10–40% ethanol at peptide concentrations from 10 to 20 µM. Analysis of the spectra using the web-based program DICHROWEB  showed that both peptides exist in solution as mixtures of β-turns and β-strands (consistent with the β-sheet structure) and disordered structures (Supplementary Figure S5 and Table S1). The fractional β-sheet ranged from 28 to 62% for PFL, with the highest values obtained in 40% ethanol, consistent with the interpretation that aggregation is reduced in this solvent. For the more hydrophilic analog, the β-sheet probabilities determined in water ranged from 41 to 56%. CD spectra of the PFL peptide obtained in 40% ethanol were compared with theoretical spectra for residues 226–241 in each subunit of RT generated using the web server PDB2CD  (Supplementary Figure S5). The spectra for the palm domain segment in the p66 subunit that contains the β-sheet structure are much more similar to the experimental CD spectra than the spectra generated for residues 226–241 in the p51 subunit (Supplementary Figure S5). These results support the intrinsic tendency of the peptide to adopt the β-sheet structure.
NMR spectra of the U-[2H,15N]p66ΔPL monomer
In addition to the more specific questions about the conformational changes observed for the fingers/palm, perhaps the most significant question about the metamorphic transition of the RT polymerase domain is whether it is primarily a unimolecular rearrangement that occurs prior to dimer formation, or whether it occurs subsequently. Kinetic studies have led to inconsistent conclusions [35,36], and more recently, NMR studies using 15N-labeling  also have led to conclusions that are inconsistent with those based on earlier methyl-labeling studies [8,10,11]. Since the structure shown in Figure 1 is based on both crystallographic and NMR studies of methyl-labeled constructs, it was of interest to determine to what extent these earlier conclusions are also consistent with NMR analysis of 15N-labeled constructs. Following the NMR approaches previously utilized by both groups, we compared the TROSY spectra of U-[2H,15N]p66ΔPL with spectra obtained for three constructs of the 15N-labeled individual domains: the thumb, RH, and RT216 corresponding to the fingers and most of the palm domains (Figure 6). We note that the structure shown in Figure 1 (see also Figure 1C of ref. ) would imply very close agreement of the monomer spectra with those of the RH and thumb domains, but much poorer agreement with the spectrum of RT216.
NMR spectra of the p66ΔPL monomer.
TROSY spectra of U-[2H,15N]p66ΔPL overlaid with amide spectra for U-[15N]RH, U-[15N]thumb, and U-[2H,13C,15N]RT216 are shown in Figure 6A–D. Since these comparisons involved TROSY spectra of the labeled RT216 and p66ΔPL, but 1H-15N HSQC spectra for the RH and thumb domains, and since the samples were studied under similar but not identical conditions, the overlays were optimized to facilitate resonance comparisons rather than using a constant shift reference. Assignments for some of the more well-dispersed resonances are indicated in the figures. As is apparent from Figure 6B,C, the overlays with the isolated RH and thumb domains show excellent agreement.
An expanded region of the spectra illustrating these comparisons is shown in Figure 7. Figure 7A, derived from Figure 3A of Sharaf et al., shows a spectral region of the putative p66/p66 homodimer overlaid with color-coded resonances for the isolated RH and thumb domains. These spectral results were used as the basis for a model of a symmetric p66/p66 homodimer. Figure 7B shows the same overlay derived from spectra in Figure 6 that correspond to the p66ΔPL monomer (green), RH (red), and thumb (blue) domains. A color-coded schematic of the monomer is shown in Figure 7C. Comparison of Figure 7A and B shows that although attributed to very different species, the spectral overlays are extremely similar, indicating that the corresponding structures are also very similar. In view of the spectral similarity of Figure 7A,B and the consistency of the spectral data in Figure 7B with the monomer structure in Figure 7C, the data in Figure 7A are also consistent with a structure similar to that of the monomer (Figure 7C).
Comparison of an expanded spectral region of the p66/p66 dimer and the p66ΔPL monomer.
In contrast with the above comparisons, the agreement with the RT216 spectrum is substantially poorer (Figure 6D), a result that is anticipated on the basis of the structural data in Figures 1 and 2. Specifically, the isolated RT216 construct has a different conformation than the corresponding segment of the p51ΔPL monomer (Figure 2A,B). More importantly, there are extensive interface contacts between the connection domain and the fingers/palm in the monomer that are not present in the extended p66 subunit of the heterodimer or the RT216 construct. This interface contains several aromatic residues that will produce significant shift perturbations for residues in the fingers/palm domains. The much poorer agreement between the chemical shift patterns of the p66ΔPL monomer and the RT216 construct therefore matches expectations based on the available structural information and Figures 1 and 2. Taken together, the NMR spectral comparisons presented above provide strong support for the p66 monomer structure shown in Figure 1, which was previously deduced from structural data for p51ΔPL combined with NMR studies of methyl-labeled p66ΔPL and the component domains. Specifically, in the p66 monomer structure, the RH and thumb domains are structurally isolated and internally mobile, consistent with the amide spectra comparisons summarized above. In contrast, the RT216 spectrum does not show good agreement, again consistent with the structure shown in Figure 1. In summary, the NMR studies of the 15N-labeled constructs presented here and in the previous study by Sharaf et al. are consistent with the monomer structure of Figure 1.
Spectral comparison of the U-[2H,15N]p66ΔPL monomer with spectra of the reported U-[2H,15N]p66 homodimer
In the previous section, we observed that the amide spectrum of the U-[2H,15N]p66ΔPL monomer agrees closely with spectra for the isolated thumb and RH domains, but much less closely with the spectrum for RT216. The spectral comparisons presented in the previous section are identical to the comparisons made by Sharaf et al. for the similarly labeled p66/p66 homodimer. Hence, we can conclude that the amide spectrum of the p66ΔPL monomer must be very similar to the spectrum attributed to the p66/p66 homodimer. Two spectral overlays shown in Figure 8 compare the amide resonances of the p66ΔPL monomer with the published spectra for the homodimer. The comparisons in Figure 8 optimized the resonance overlap (Figure 8A) or utilized the chemical shift axes from the two studies (Figure 8B). The spectra are very similar but not identical. In addition to differences in experimental conditions, one important basis for a difference is the absence of the palm loop segment from 219 to 230, corresponding to residues KKHQKEPPFLWM, in the spectrum of the p66ΔPL monomer. In addition to the loss of the corresponding amide resonances, this segment contains three aromatic side chains that, depending on the motional characteristics, may produce shift and broadening effects on nearby residues. Thus, at least one of the missing TrpNHε resonances in Figure 8A (bottom, lower left) can be attributed to the loop deletion. The spectral offsets in Figure 8B approximate those expected to arise from a comparison in which only one spectrum utilized a TROSY sequence, and intensity differences in the amide side chain regions are also suggestive of this difference. Overall, the comparison is consistent with very similar, if not identical, structures.
Comparison of monomer and dimer spectra.
Effects of exchange
The symmetric p66/p66 homodimer model presented by Sharaf and co-workers is based largely on the conclusion that only a single set of resonances is observed for both species. This analysis is limited by the presence of both monomers and dimers in the sample and by the absence of information on chemical exchange rates. In the event that monomer and dimer exchange is fast on the chemical shift difference time scale, the monomer becomes a rapidly exchanging intermediate that will equilibrate the shifts of the two subunits of the dimer:
Sharaf et al. concluded that for p51, the high Kd value of ∼0.3 mM is consistent with fast exchange, if dimer formation is assumed to be diffusion limited. This would presumably preclude conclusions about the symmetry of the p51/p51 homodimer. The estimated diffusion-controlled association rate constant of 10^6 M−1 s−1 is nearly six orders of magnitude faster than the association rate constant of 1.7 M−1 s−1 reported for RT heterodimer formation. Sharaf et al. suggested that for the p66 sample, the lower Kd of ∼4 µM would be sufficient to overwhelm contributions from the p66 monomer; however, this does not necessarily eliminate the effects described by eqn (1) above. Thus, subunit shift differences are still subject to equilibration if the monomer–dimer exchange rates are sufficiently fast.
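The arithmetic behind these rate comparisons can be sketched as follows. This is a minimal illustration that assumes a simple two-state binding equilibrium (Kd = k_off/k_on); the function names are hypothetical, and applying the measured heterodimer association constant to the p51 homodimer is an assumption made only for the sake of comparison.

```python
import math

def off_rate(kd_molar, k_on):
    """Dissociation rate constant (s^-1) implied by Kd = k_off / k_on."""
    return kd_molar * k_on

def orders_of_magnitude(a, b):
    """Base-10 logarithm of the ratio a / b."""
    return math.log10(a / b)

# Values quoted in the text
K_ON_DIFFUSION = 1e6   # M^-1 s^-1, assumed diffusion-limited association
K_ON_MEASURED = 1.7    # M^-1 s^-1, reported for RT heterodimer formation
KD_P51 = 0.3e-3        # M, p51 homodimer
KD_P66 = 4e-6          # M, p66 homodimer

# Off-rates under the diffusion-limited assumption
print("p51 k_off (diffusion-limited):", off_rate(KD_P51, K_ON_DIFFUSION), "s^-1")
print("p66 k_off (diffusion-limited):", off_rate(KD_P66, K_ON_DIFFUSION), "s^-1")

# Off-rate if the slow association constant is assumed to apply to p51 (illustrative only)
print("p51 k_off (slow association):", off_rate(KD_P51, K_ON_MEASURED), "s^-1")

# Gap between the two association rate constants, in powers of ten
print("log10 ratio of k_on values:", orders_of_magnitude(K_ON_DIFFUSION, K_ON_MEASURED))
```

Under the diffusion-limited assumption, the implied p51 off-rate is about 300 s−1, whereas the slower association constant would imply an off-rate far below 1 s−1. Which exchange regime applies to a given resonance also depends on its shift difference and on the concentration-dependent association term, neither of which is modeled here.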
In contrast with the analysis of Sharaf et al., NMR studies of [13CH3-Met]- and [13CH3-Ile]-labeled p51 and p66 homodimers showed multiple chemical shift differences for the two subunits, consistent with slow chemical exchange for both the p66 and p51 homodimers [8,10,11]. For the purpose of comparison, the 1H-13C HMQC spectrum of a sample containing [13CH3-Ile]p51 obtained under very high salt conditions that enhance the dimer/monomer ratio is shown in Figure 9. Even under high salt conditions, a significant monomer fraction remains, as indicated most directly by the observation of an Ile393 resonance attributed specifically to the monomer species. Based on comparison with spectra for the [13CH3-Ile]-labeled p66/p51 heterodimer, several of the more highly resolved resonances are readily attributed to either the compact or extended structures of the polymerase domain. The high salt conditions used to increase the dimer/monomer ratio in the present study are apparently not essential for achieving the slow exchange conditions, since separate methyl resonances in the p51 dimer have been observed at lower salt concentrations (200 mM KCl). We note also that none of the resonances are attributed to the extended (active) polymerase conformation of the monomer (p51E), which is considerably less stable and present at a very low concentration.
1H-13C HMQC spectrum of [U-2H,13CH3-Ile]p51.
As discussed previously, monomer–dimer exchange is slow for many resonances in p51 because a substantial structural rearrangement, the metamorphic transition, must occur prior to dimer formation. For the isolated polymerase domain, the structural isomerization–dimerization process is described by the relations [10,11]:
The first step above will be slow since it entails major structural changes that involve domain rearrangements as well as the formation of the new secondary structural elements discussed above. In general, each subunit gives rise to a separate set of resonances. For some resonances, e.g. Ile47, Ile202, and Ile274, the shift differences are small but easily resolved (Figure 9); in some cases, e.g. Ile257, the very small shift difference limits resolution of the two forms, while in others, e.g. Ile270 and Ile382, the shift difference is so large that only one of the two corresponding peaks is observed, while the resonance arising from the other subunit is broadened and/or shifted into a more congested region of the spectrum. Thus, even in the example of the weakly associated p51/p51′ homodimer, monomer–dimer exchange is slow, consistent with the earlier analysis that dimer formation requires a prior structural, metamorphic transition (eqn (2)).
Although conformational heterogeneity is a nearly universal characteristic of proteins, the significant structural variations characterizing metamorphic proteins have been less extensively documented. Nevertheless, the increasing number of examples becoming available leads to interesting new problems of protein folding and structural interconversion. The polymerase domain of HIV-1 RT adopts either of two alternate structures that fulfill two different functions. As a comparison, the interconversion of open and closed positions of the thumb domain in RT is achieved by bond rotations and occurs on a much faster time scale that is typical for conformational transitions. Atypically, the extended and compact structures of p66 must both be present simultaneously in order to form the p66/p66′ homodimer precursor and the RT heterodimer. As first proposed by Wang et al. and subsequently substantiated by Zheng et al. [8,11], the compact fold characterizing the inactive domain is considerably more stable than the extended fold of the active domain, and the homodimers exist as structural heterodimers. Much of the energy required to stabilize the extended, active fold of the polymerase domain is provided by the energy of dimerization. As a consequence, the net energy available for dimer stabilization is limited. Thus, despite a very large dimer interface area of ∼4500 Å2, the apparent dimer Kd values are in the micromolar range [38,39]. This contrasts with many examples of considerably smaller interfaces characterized by nanomolar dissociation constants. This effect is even more dramatic for the p51/p51′ homodimer: a 2500 Å2 interface corresponds to a Kd value of ∼230 µM or higher.
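To make the contrast between micromolar and nanomolar complexes concrete, dissociation constants can be converted to standard binding free energies using ΔG° = RT ln(Kd/1 M). The sketch below is illustrative only: the temperature of 298.15 K and the function name are assumptions, not values taken from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd, relative to a 1 M standard state.

    The default temperature is an assumed value for illustration.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)

# Kd values quoted in the text, plus a generic nanomolar complex for comparison
examples = {
    "p66/p66' homodimer, Kd ~4 uM": 4e-6,
    "p51/p51' homodimer, Kd ~230 uM": 230e-6,
    "generic nanomolar complex, Kd 1 nM": 1e-9,
}

for label, kd in examples.items():
    print(f"{label}: {binding_free_energy(kd):.2f} kcal/mol")
```

At this assumed temperature, each 1000-fold change in Kd corresponds to roughly 4 kcal/mol, which gives a sense of the scale of the stabilization gap between the RT homodimers and typical nanomolar complexes.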
As noted above, recent analysis of NMR data by Sharaf et al. supporting the existence of a long-lived, symmetric p66/p66 homodimer was based, in part, on the assumption of diffusion-controlled dimerization, with an association rate constant of ∼10^6 M−1 s−1. Although earlier kinetic measurements were interpreted as supporting rapid initial dimer formation followed by slower conformational adjustments, a much slower association rate constant, ka = 1.7 M−1 s−1, was subsequently determined. This study attributed the earlier kinetic results to an artefactual consequence of the acetonitrile added to stabilize the monomer. However, since both the p51 and p66 monomers adopt compact structures that resemble the p51 subunit of RT, all p51 and p66 dimerization reactions, whether they involve homo- or heterodimer formation, are rate limited by the requirement for an initial metamorphic structural transition. The slow exchange behavior expected on the basis of the slower rate constant has been observed in previous NMR studies of both p51 and p66 homodimers [8,10,11] and is consistent with the observation of multiple p51 Ile methyl resonances in Figure 9.
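The practical difference between the two association rate constants can be illustrated with the half-time of an irreversible second-order reaction, A + B → AB, with equal starting concentrations c0, for which t1/2 = 1/(k·c0). This is a simplified sketch: the 100 µM concentration is a hypothetical value, dissociation and the preceding isomerization step are neglected, and the function name is illustrative.

```python
def association_half_time(k_assoc, c0_molar):
    """Half-time (s) for A + B -> AB with [A]0 = [B]0 = c0, ignoring dissociation."""
    return 1.0 / (k_assoc * c0_molar)

C0 = 100e-6            # M, hypothetical subunit concentration
K_DIFFUSION = 1e6      # M^-1 s^-1, diffusion-controlled assumption
K_MEASURED = 1.7       # M^-1 s^-1, value reported for heterodimer formation

t_fast = association_half_time(K_DIFFUSION, C0)
t_slow = association_half_time(K_MEASURED, C0)

print(f"diffusion-controlled half-time: {t_fast:.3g} s")
print(f"measured-rate half-time: {t_slow:.3g} s ({t_slow / 3600:.2f} h)")
```

With these assumed inputs the half-time shifts from the millisecond range to more than an hour, which is the kind of separation that distinguishes fast from slow exchange in the NMR spectra.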
As concluded in previous studies [8–10], the metamorphic transition is triggered by dissociation of a spring-loaded fingers/palm : connection complex. Subsequent to dissociation, the fingers/palm and connection domains each adopt alternate conformations that are not constrained by the interdomain interactions present in the monomer. Previous analysis based on crystallographic and molecular modeling studies indicated that upon dissociation, the fingers/palm adopts a more extended conformation, similar to that observed in the p66 subunit of RT [10,41]; however, the basis for this transition has been unclear. In the present study, we have shown that one of the drivers for this structural change is likely the preference of helix αE to eliminate the sharp kink near Phe160 that is present in the monomer (PDB: 4KSE), thereby extending the standard helical geometry (Figure 2C). Although the solution behavior of the RT216 construct, containing the fingers and most of the palm domains, is not ideal, TALOS+ analysis supports the conclusion that in solution, as in the crystal, this pronounced bend is not present.
The most typical feature of metamorphic transitions is a change in secondary structure. For p66, the most significant secondary structure change involves formation of a short β-sheet from residues 226–241 at the C-terminus of the palm domain and residues 314–318 at the N-terminus of the connection domain. This β-sheet is not present in the monomer structure or in the p51 subunit of RT. As with the fingers/palm expansion, this transition also appears to be spring-loaded: dissociation of the fingers/palm : connection complex in the monomer removes the structural constraint on the palm domain, exposing the hydrophobic surface of the palm and allowing the intrinsic conformational preferences of palm residues 226–241 to be expressed. The interactive nature of the exposed hydrophobic surface of the β-sheet of the palm domain (β6−β9−β10) is demonstrated by concentration-dependent shifts that reveal the availability of this surface for interactions with hydrophobic residues. A CD analysis of the palm domain C-terminal peptide, as well as of a hydrophilic analog, supports the propensity of this sequence for β-sheet formation. The β-sheet formed from the palm C-terminus, corresponding to β12−β13−β14 in the p66 RT subunit, is amphiphilic, having both hydrophilic and hydrophobic surfaces, so that β-sheet formation and stabilization are further supported by the interaction of the hydrophobic surfaces of the two β-sheets (Supplementary Figure S4). Thus, the results are consistent with the conclusion that both factors (interaction with the exposed β-sheet and the intrinsic conformational preferences of the palm C-terminal segment) contribute to the metamorphic transition.
As shown above, spectral comparisons of 15N-labeled p66ΔPL monomer with 15N-labeled RH, thumb, and RT216 (fingers/palm) domains exhibit the patterns predicted on the basis of the monomer structure shown in Figure 1. Thus, the RH and thumb domain spectra are in very good agreement with the p66 monomer spectrum, while the RT216 spectrum is not. These same comparisons recently were reported for the 15N-labeled p66/p66 homodimer, but interpreted to support a stable, symmetric homodimer structure. Thus, it must be concluded that either a long-lived symmetric homodimer is formed from a pair of monomers, each of which closely resembles the isolated monomer, or that the spectra attributed to the homodimer were dominated by contributions from the smaller and more flexible monomer species. Although the existence of a short-lived symmetric homodimer formed from a pair of monomer-like structures is difficult to completely rule out, the stable, long-lived symmetric homodimer proposed by Sharaf and co-workers is inconsistent with many reported homodimer characteristics, including the activity of the p66/p66 homodimer, which requires the presence of at least one domain with an active fold and exposed catalytic site [38,42,43].
The second explanation, that the spectra attributed to the p66/p66 homodimer were instead dominated by the smaller and more flexible monomer species, is further supported by the spectral comparisons shown in Figures 6–8. The structure of the p66 monomer corresponds to a group of domains connected by flexible linking segments. The largest of these is the fingers/palm : connection complex with MW ∼40 kDa, while the RH and thumb domains are ∼15 and 7 kDa, respectively. In contrast, the p51/p51 homodimer forms a globular structure of ∼100 kDa, and the p66/p66 homodimer is even larger. The bias toward observation of the monomer vs. dimer forms is thus larger than that suggested by the molecular mass difference. Several groups have observed that RT preparations are heterogeneous [38,44]. Our previous studies of dimer formation have generally exhibited greater variability than is typically encountered for simpler proteins, and in some instances, evidence of monomer species persists well beyond that predicted by a simple kinetic dimerization model. Among potentially complicating factors, intermediate species formed during the metamorphic transition may become trapped as domain-swapped species (e.g. [9,45]), and the presence of unstructured connecting loops makes the protein susceptible to adventitious proteases.
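The size argument can be made semi-quantitative with a deliberately crude sketch. It assumes that the linewidth of a resonance scales linearly with the mass of the unit that tumbles independently (a rigid-sphere approximation), so that peak height scales inversely with that mass. The ∼132 kDa value for the p66/p66 homodimer is simply twice a nominal 66 kDa subunit mass and, like the function name, is an assumption rather than a number from the text.

```python
# Approximate masses (kDa) of the independently tumbling units of the p66 monomer,
# as quoted in the text
MONOMER_UNITS_KDA = {
    "thumb": 7.0,
    "RH": 15.0,
    "fingers/palm : connection": 40.0,
}

P66_MONOMER_KDA = 66.0                 # nominal subunit mass (assumption)
P66_DIMER_KDA = 2 * P66_MONOMER_KDA    # rigid homodimer, assumed ~132 kDa

def peak_height_advantage(unit_kda, dimer_kda):
    """Relative peak height of a mobile unit vs. a rigid dimer, assuming linewidth ~ mass."""
    return dimer_kda / unit_kda

# Naive expectation from whole-molecule masses alone
print("dimer/monomer mass ratio:", P66_DIMER_KDA / P66_MONOMER_KDA)

# Expectation when each monomer domain tumbles semi-independently
for name, mass in MONOMER_UNITS_KDA.items():
    print(f"{name}: ~{peak_height_advantage(mass, P66_DIMER_KDA):.1f}x sharper than the dimer")
```

Under these assumptions every monomer unit is favored by more than the twofold mass ratio of dimer to monomer, with the largest bias for the thumb and RH domains. Real linewidths also depend on shape, linker flexibility, temperature, and exchange, none of which are modeled here.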
The NMR studies summarized above provide further insights into the mechanism of the metamorphic transition. Comparison with recently reported studies of 15N-labeled p51 and p66 further supports our previous determinations of monomer structure and supports the conclusion that the long-lived p51 and p66 homodimers exist as structural heterodimers. This conclusion is consistent with the analysis originally put forth by Steitz and co-workers and inconsistent with the recent analysis of Sharaf et al.
For clarity, both the domains and polymerase subdomains are simply referred to as domains.
HIV: human immunodeficiency virus
HMQC: heteronuclear multiple-quantum coherence spectroscopy
HSQC: heteronuclear single-quantum coherence spectroscopy
NMR: nuclear magnetic resonance
NNRTI: non-nucleoside reverse transcriptase inhibitor
RT: HIV-1 reverse transcriptase
TROSY: transverse relaxation-optimized spectroscopy
R.E.L. supervised this project; experimental studies were designed by X.Z., E.F.D., G.A.M., K.K., and R.E.L., and molecular modeling studies were designed and performed by L.P. Experimental work was performed by X.Z. and K.K. All authors contributed to data analysis and to writing the paper.
This work was supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences, project number ZIA ES050147 to R.E.L. and ZIA ES043010 to L.P. E.F.D. is supported by NIH and NIEHS (National Institute of Environmental Health Sciences) under delivery order HHSN273200700046U.
The authors are grateful for a critical reading of this manuscript by Dr Peter M. Thompson and Dr Jason Williams, NIEHS. This research was performed as part of the Intramural Research Program of the NIH and National Institute of Environmental Health Sciences (NIEHS), Research Project Number Z01-ES050147 to R.E.L.
The Authors declare that there are no competing interests associated with the manuscript.
p66 includes residues 1−560 of HIV-1 reverse transcriptase. p51 corresponds to the polymerase domain and generally includes constructs ranging from those that terminate at residue Trp426, corresponding to the polymerase-RH domain boundary used in this study, up to constructs that terminate at Phe440, corresponding to the protease cleavage site within the RNase H domain.
Present address: Bayer Crop Science, Inc., Morrisville, NC, U.S.A.