In recent years, it has become evident that structural characterization would gain significantly in terms of biological relevance if framed within a cellular context, while still maintaining the atomic resolution. Therefore, major efforts have been devoted to developing Cellular Structural Biology approaches. In this respect, in-cell NMR can provide and has provided relevant contributions to the field, not only to investigate the structural and dynamical properties of macromolecules in solution but, even more relevant, to understand functional processes directly in living cells and the factors that modulate them, such as exogenous molecules, partner proteins, and oxidative stress. In this commentary, we review and discuss some of the main contributions to the understanding of protein structural and functional properties achieved by in-cell NMR.
Progress in the Life Sciences requires complete knowledge of the fundamental biological processes. Such basic understanding has an enormous impact on many aspects of life, as it is critical for the development of new drugs and therapeutic protocols. To this end, the structural and dynamic properties of all the biomolecules involved need to be characterized at atomic resolution. Classically, structural data on biological macromolecules are obtained in vitro, far from the true biological context. Indeed, since the advent of Structural Biology, a huge amount of structural information has been collected by classical approaches, mainly X-ray crystallography, followed by nuclear magnetic resonance (NMR) and, recently, single-particle cryo-electron microscopy (cryo-EM), while kinetic and thermodynamic information has been gathered by biophysical techniques. Nowadays, integrated structural biology approaches are becoming more common, in which different techniques are combined to investigate the biological context over a broad range of scales, from the whole cell down to the single atoms in a molecule. In such approaches, in vitro data are compared with in situ information obtained from cell biology techniques and ultimately correlated with in vivo data. A purely reductionist approach, such as that pursued when applying a single structural biology technique, is hampered by the many assumptions made about the physiological relevance of the structural and functional properties obtained in vitro, assumptions that are often not completely satisfied or are even partially wrong. On the other hand, cellular techniques provide biologically meaningful information, but more indirectly and at the cost of much lower resolution. Therefore, Cellular Structural Biology approaches are needed to bridge the gap between high-resolution characterization and preservation of a faithful biological context.
In this respect, NMR spectroscopy is the ideal atomic-resolution technique, as it can investigate macromolecules in complex biological matrices in solution. In recent years, NMR has indeed been applied to study biological macromolecules, such as proteins and nucleic acids, directly in living cells. The approach, termed in-cell NMR, provides a powerful combination of atomic resolution — NMR probes directly the chemical surroundings of each atom — and high physiological relevance when investigating intracellular macromolecules. Solution in-cell NMR, in particular, is ideally applied to investigate relatively small, soluble macromolecules that are freely diffusing within the intracellular environment. Importantly, a unique feature of solution in-cell NMR is its ability to probe dynamic processes occurring at multiple timescales — from nanoseconds to minutes/hours — in metabolically active cells at physiological temperature (e.g. 37°C). Therefore, beyond purely structural aspects, in-cell NMR is a powerful methodology to gain functional insights on biological macromolecules such as proteins. Among other applications, in-cell NMR can obtain functional information on protein folding and maturation, redox state regulation, post-translational modifications, and interactions with the cellular environment and/or with specific partners.
Since the first applications in bacteria , there have been many methodological advancements, which have extended the range of cellular environments to Xenopus laevis oocytes [2,3], yeast , insect , and human cells [6,7]. Furthermore, in-cell NMR has also been applied to gain conformational insights on nucleic acids [8–11]. As these advancements have been reviewed in detail elsewhere [12–16], here we will focus on some of the unique biological insights that have been obtained by in-cell NMR and provide a critical analysis of the current strengths and limitations of the methodology.
Intracellular protein structure and beyond
NMR is an established technique for structure determination of proteins and nucleic acids, even though it only accounts for ∼10% of all deposited structures. When applied to living cells, however, NMR stands out as the only technique capable of obtaining atomic-resolution structures within an intact cellular environment. In-cell de novo protein structure calculation by classical NMR methods [based on nuclear Overhauser effect (NOE)-derived spatial restraints] has been demonstrated by Sakakibara et al. on a small metal-binding domain in bacteria. This achievement was an important milestone, which contributed to the recognition of in-cell NMR in the field of structural biology, but it required a prohibitively high intracellular protein concentration, long experimental times, and multiple sample preparations. The same approach was later improved by the same group, with a 10-fold reduction in the required protein concentration and a partially automated assignment strategy. Despite these improvements, the classical NOE-based strategies still require complex isotopic labeling schemes and remain confined to bacteria. In parallel, a more flexible method for in-cell structure calculation has been demonstrated, which makes use of paramagnetic NMR data [19–21]. Here, structural restraints were derived from pseudo-contact shifts and residual dipolar couplings measured on a protein that had been chemically modified with a tag coordinating a paramagnetic metal ion and then inserted into the cells. A statistical information-driven modeling software (GPS-Rosetta) was used to calculate the structural model. Albeit more coarse-grained than an NOE-based strategy, this method is much less demanding and is applicable to eukaryotic and mammalian cells. Beyond pure structure determination, this approach can, in principle, provide valuable data to characterize intracellular protein complexes.
Overall, the protein structures determined in cells did not differ much from those obtained in vitro, indicating that the structure of rigid, globular domains is conserved. However, conformational differences were observed in less rigid regions such as loops, suggesting that flexible regions could be affected by the intracellular environment.
Solid-state NMR (SS-NMR) has also been successfully applied to obtain protein structural information in biological contexts. In this respect, SS-NMR has a great potential, as it can be applied to any protein that behaves as a solid entity, such as membrane proteins, which cannot be detected by solution NMR . While proof of principle of SS-NMR applied to intact cells has been shown [24–26], a much more promising approach can be pursued, where the molecules of interest are isolated from the cells while preserving their native context (e.g. the plasma membrane). Such in situ NMR strategy ensures higher sample stability and can be enhanced by dynamic nuclear polarization (DNP) for increased sensitivity [27,28]. This approach can deliver physiologically relevant structural insights on challenging membrane-bound systems. Kaplan et al.  have applied in situ SS-NMR to investigate the activation of the extracellular domain of the human full-length epidermal growth factor receptor (EGFR) by the epidermal growth factor (EGF) in native membranes. The authors isolated native membrane vesicles enriched in EGFR from A431 cells, which express EGFR at high levels. Importantly, the EGFR-enriched vesicles retained the native membrane composition and morphology, and the correct orientation and activity of the receptor, as assessed by cryo-EM and super-resolution light microscopy. DNP-enhanced SS-NMR analysis revealed that the extracellular domain of EGFR (ECD) is highly dynamic, a behavior that could not be observed from the X-ray structure previously available. These dynamics were greatly reduced upon binding of EGF, suggesting that the reduction in conformational entropy contributes to the free energy of EGFR dimerization and consequently drives its activation (Figure 1a).
Figure 1. In situ and in-cell NMR provides protein structural and dynamic insights.
Effects of the cellular environment
How the intracellular environment affects protein behavior has been a long-standing question, which is now being investigated by in-cell NMR. Since the first applications in bacteria, it was realized that many soluble, globular proteins or domains could not be easily detected by in-cell NMR, due to extensive signal broadening in the NMR spectra [30–32]. In parallel, it was observed that the intracellular folding of model proteins could not be easily predicted or reproduced in vitro by only considering the well-known effect of macromolecular crowding [33,34]. Such discrepancies are now known to be caused by weak interactions, occurring between the protein and other components of the intracellular milieu, that are disrupted upon cell lysis. A biological function for elusive interactions had been hypothesized much earlier, and defined as the quinary structure of proteins . Owing to the capabilities offered by in-cell NMR, the effect of intracellular pH, ionic strength, and quinary interactions on protein folding and structural properties at the residue level have been extensively investigated [36–41]. From these works, it has emerged that there is a sort of ‘interaction code’ written on the surface of every soluble protein of the cell, which is at the basis of the quinary structure and determines the complex supramolecular organization of the intracellular environment . To obtain insights into protein dynamics in such a highly crowded environment, fluorine in-cell NMR has proved to be a powerful approach, owing to the favorable relaxation properties of 19F compared with 1H. Indeed, proteins labeled with fluorinated amino acids can be observed by in-cell 19F-NMR, resulting in sensitive and virtually background-free spectra, even in the case of highly interacting proteins [40,43–46].
To date, the exact nature of the intracellular components interacting with each protein is not known. Majumder et al.  have hypothesized that RNA is among the principal components of intracellular quinary interactions. The authors investigated small globular soluble proteins (thioredoxin, FKBP, and ADK), which all exhibited increased transverse relaxation — and therefore signal broadening — in the in-cell NMR spectra due to the quinary interactions, both in bacteria and in human cells. By using relaxation-optimized NMR experiments combined with partial protein deuteration, the transverse relaxation rates could be estimated. From those, an apparent molecular mass of the unknown interacting molecules of ∼1.1 MDa was calculated, almost two orders of magnitude larger than the actual mass of the investigated proteins (Figure 1b). Accounting for the intracellular viscosity reduced the expected molecular masses to ∼300–400 kDa, compatible with the average mass of mRNAs (100–500 kDa). To validate this hypothesis, the authors analyzed thioredoxin in vitro in samples containing total bacterial RNA extract, which were then treated with RNase A, and found a 10-fold reduction in the apparent molecular mass of the complex. Consistent with these results, the same group has shown in yeast that a change in the total cellular RNA content modulates the intracellular localization and activity of ubiquitin and β-galactosidase . Owing to the formation of protein–RNA complexes, these proteins co-localize with RNA in yeast in response to a change in metabolic state, resulting in severe signal broadening in the in-cell NMR spectra .
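The viscosity correction described above can be illustrated with a short sketch. It assumes a rigid sphere obeying the Stokes-Einstein-Debye relation, so that the rotational correlation time scales with the product of viscosity and molecular mass; the source does not state that this exact model was used. The function names are hypothetical, and the relative viscosities printed are simply back-calculated from the masses quoted in the text (∼1.1 MDa and ∼300–400 kDa), not measured values.

```python
# Sketch only: rescale an apparent molecular mass for medium viscosity.
# Assumption (not from the source): rigid-sphere Stokes-Einstein-Debye tumbling,
# where the rotational correlation time is proportional to viscosity * mass.


def viscosity_corrected_mass(apparent_mass_kda: float, relative_viscosity: float) -> float:
    """Mass giving the same correlation time in a medium that is
    `relative_viscosity` times more viscous than the reference (water)."""
    if relative_viscosity <= 0:
        raise ValueError("relative_viscosity must be positive")
    return apparent_mass_kda / relative_viscosity


def implied_relative_viscosity(apparent_mass_kda: float, corrected_mass_kda: float) -> float:
    """Relative viscosity that would map the apparent mass onto the corrected one."""
    if corrected_mass_kda <= 0:
        raise ValueError("corrected_mass_kda must be positive")
    return apparent_mass_kda / corrected_mass_kda


if __name__ == "__main__":
    apparent = 1100.0  # kDa, the ~1.1 MDa apparent mass quoted in the text
    for corrected in (300.0, 400.0):  # kDa, the corrected range quoted in the text
        eta_rel = implied_relative_viscosity(apparent, corrected)
        print(f"{corrected:.0f} kDa corresponds to a relative viscosity of {eta_rel:.2f}")
```

Under this simple model, the quoted reduction corresponds to an intracellular viscosity roughly three to four times that of the reference buffer.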
Functional insights on soluble proteins
In-cell NMR can provide biologically relevant insights on many functional aspects of intracellular proteins, when applied to protein folding and maturation, cofactor binding, post-translational modifications, and protein–protein/protein–ligand interactions. Understanding these functional properties in the native cellular setting is especially important for disease-related proteins, such as human proteins implicated in degenerative diseases, and for proteins involved in bacterial survival that are potential targets for novel antibiotics.
In our laboratory, we have extensively investigated the folding and maturation of human copper, zinc superoxide dismutase 1 (SOD1) in the cytoplasm of human cells. SOD1 is an abundant antioxidant enzyme that has been linked to the onset of some variants of amyotrophic lateral sclerosis (ALS), both sporadic and familial . Specifically, the SOD1 pathogenicity is related to the intrinsically low stability of the immature apo protein, which is prone to misfolding and leads to the formation of toxic aggregates in motor neurons of affected individuals . To reach the mature state, SOD1 needs to dimerize, bind zinc and copper, and form an intramolecular disulfide bond. Copper binding and disulfide bond formation are catalyzed by the specific partner copper chaperone for SOD1 (CCS) . Numerous mutations in the SOD1 gene have been linked to the onset of familial ALS that negatively affect different steps of the SOD1 maturation process. We have characterized the various intermediates of SOD1 maturation in human cells by in-cell NMR , using a mammalian protein expression approach  in combination with in vitro NMR analysis. On the wild-type (WT) protein, we showed that zinc binding occurs spontaneously and selectively in the cell (unlike in vitro where non-native zinc binding to the copper site occurs ) and observed a novel copper-independent redox activity of CCS. A series of ALS-linked SOD1 mutants was then investigated, a subset of which showed a strikingly different behavior from that of the WT, as they failed to bind zinc and accumulated as an unstructured metal-free species in the cytoplasm . Complementary in vitro analysis by NMR and size-exclusion chromatography revealed that this species is irreversibly formed and is probably the precursor monomer of toxic oligomers. Notably, increased levels of CCS rescued the correct folding and maturation of these SOD1 mutants. 
These findings highlight the importance of CCS in controlling the fate of SOD1 molecules in ALS patients and prompt us to further investigate whether its protective role against SOD1 misfolding could be exploited in novel ALS therapies.
Protein redox regulation is another aspect for which in-cell NMR can provide unique insights at the molecular level. The function of many cysteine-containing proteins is modulated by the formation of intramolecular disulfide bonds, which affect protein conformation. NMR allows direct identification of the conformations corresponding to the various redox states and assessment of the changes in their distribution as a function of cellular redox partners. Mia40 is a redox chaperone of the mitochondrial intermembrane space. It catalyzes the oxidative folding of small mitochondrial proteins constituted by a coiled-coil helix, coiled-coil helix (CHCH) domain, stabilized by two structural disulfide bonds. Like its substrates, Mia40 is synthesized in the cytosol and has to translocate to the mitochondria in the unfolded, reduced state. By in-cell NMR, we have shown that Mia40 overexpressed in human cells is mainly present in the cytosol in the oxidized and folded state, which is unable to cross the mitochondrial membrane. When the levels of cytosolic redox-regulating enzymes (glutaredoxin 1 and thioredoxin 1) were increased, Mia40 remained in the reduced unfolded state. Following this finding, we further investigated the relationship between the intracellular environment and the protein redox state. Three human proteins (SOD1, Mia40, and its substrate Cox17), each harboring one or more disulfide bonds with known redox potentials, were investigated by NMR in cellular environments with different redox properties. Interestingly, the observed distributions of protein redox states deviated from those expected at equilibrium with the intracellular redox pool (defined by the glutathione redox potential). This suggests that the intracellular redox pathways are controlled by kinetic barriers, like most other cellular processes, and that specific intracellular partners are needed for redox regulation of each protein (i.e. CCS for SOD1; glutaredoxin/thioredoxin for Mia40 and Cox17).
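The comparison with the distribution "expected at equilibrium" can be made concrete with the Nernst equation for a two-electron disulfide/dithiol couple. The following is a minimal sketch, not the analysis used in the cited work: the function name is hypothetical, and the potentials in the usage example are placeholder values chosen for illustration rather than measured potentials of SOD1, Mia40, Cox17, or the glutathione pool.

```python
import math

# Sketch only: equilibrium fraction of a protein disulfide in the oxidized state,
# given the ambient redox potential (e.g. the glutathione couple) and the
# protein's own standard redox potential. All numbers below are illustrative.

GAS_CONSTANT = 8.314462618      # J mol^-1 K^-1
FARADAY_CONSTANT = 96485.33212  # C mol^-1


def fraction_oxidized(e_environment_mv: float,
                      e_protein_mv: float,
                      temperature_k: float = 310.15,
                      n_electrons: int = 2) -> float:
    """Nernst equilibrium: [ox]/[red] = exp(n F (E_env - E_protein) / (R T))."""
    delta_v = (e_environment_mv - e_protein_mv) * 1e-3  # mV -> V
    exponent = n_electrons * FARADAY_CONSTANT * delta_v / (GAS_CONSTANT * temperature_k)
    return 1.0 / (1.0 + math.exp(-exponent))


if __name__ == "__main__":
    # Hypothetical potentials (mV); not taken from the source.
    ambient = -300.0
    for protein_potential in (-350.0, -300.0, -250.0):
        f_ox = fraction_oxidized(ambient, protein_potential)
        print(f"E(protein) = {protein_potential:.0f} mV -> fraction oxidized = {f_ox:.3f}")
```

An observed in-cell population that departs markedly from this equilibrium value is the kind of deviation that points to kinetic control by specific partners.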
Besides disulfide bond formation, other post-translational modifications (PTMs) have fundamental functional roles inside the cell. Functional PTMs, such as phosphorylation and glycosylation, are performed in a controlled manner through specific pathways. Several groups have independently shown that in-cell NMR is the ideal technique to investigate protein phosphorylation events, either directly [60,61] or by monitoring their consequences on the interaction with other proteins [62,63]. Other covalent modifications, such as glycation and cysteine/methionine oxidation, can occur within the cell in response to oxidative stress. Often, such spontaneous modifications adversely affect protein stability and function, and they may also be functional in triggering cell response mechanisms. Recently, Binolfi et al.  have followed the fate of oxidation-damaged alpha-synuclein (α-Syn) in neuronal cells. The aggregation of α-Syn, which leads to the formation of amyloid-rich Lewy bodies, is a central hallmark of Parkinson's disease . Cellular oxidative stress is among the factors contributing to the disease onset, and oxidative modifications are known to promote α-Syn aggregation in vitro and in vivo . The authors applied the protein electroporation method, developed in the same laboratory , to deliver methionine-oxidized α-Syn into mammalian A2780 (human ovarian carcinoma) and RCSN-3 (derived from rat substantia nigra) cells. NMR analysis in intact cells revealed that only the two N-terminal modified methionines Met1 and Met5 were reduced by the cellular methionine sulfoxide reductase repair system, while the C-terminal Met116 and Met127 remained in the methionine sulfoxide state. The kinetics of each methionine sulfoxide reduction were further monitored by time-resolved NMR on the cell lysates and revealed a stepwise repair mechanism where Met5 was reduced first, followed by Met1, while confirming that the C-terminal methionines were not reduced. 
Importantly, further time-resolved kinase reaction experiments revealed that the C-terminal methionine sulfoxides impaired the phosphorylation of Tyr125 in a specific manner, while Tyr133, Tyr136, and Ser129 were still phosphorylated. Therefore, the inefficiency of cellular reductases to repair the C-terminus of α-Syn, probably due to a higher propensity for local structure, translates into an incomplete phosphorylation pattern (as pTyr125 primes the phosphorylation of Ser129) and may affect the proteostasis of α-Syn and its interactions with specific partners mediated by the C-terminus.
Elucidating the initial aggregation steps of intracellular α-Syn, starting from the soluble monomer, is critical to understanding the mechanisms of amyloid formation in Parkinson's and other degenerative diseases. Most studies assume that, under physiological conditions, the α-Syn monomer is an intrinsically disordered protein, a state that has been extensively characterized in vitro. This assumption has been challenged by a report claiming that intracellular α-Syn is present mostly in an α-helical tetrameric state, raising much controversy. In an effort to elucidate whether different cellular environments could preferentially stabilize different conformations, Selenko and coworkers have thoroughly investigated the properties of intracellular α-Syn, first in bacteria and, more recently, in the cytoplasm of several mammalian cell lines. In the latter work, Theillet et al. analyzed the intracellular conformation of electroporated α-Syn by a combination of in-cell NMR and in-cell electron paramagnetic resonance, complemented with microscopy and western blot analysis. The main conclusion of these studies is that the majority of α-Syn molecules are found as unfolded monomers in all cellular environments. Intracellular α-Syn is N-terminally acetylated and adopts a highly dynamic, disordered monomeric conformation, similar to, but somewhat more compact than, that observed in vitro. Therefore, while the presence of a minority of metastable conformations of α-Syn cannot be excluded from these results, the notion that most α-Syn would be involved in tetrameric complexes within the cell seems unlikely in light of the in-cell NMR data. While these conclusions may seem slightly disappointing, they fully demonstrate the power of a direct atomic-resolution method such as in-cell NMR to validate existing information on protein structure (or lack thereof) and dynamics in the cellular context.
Strengths and limitations of in-cell NMR
The applications described above provide an overview of the current capabilities of in-cell NMR to investigate structural and functional aspects of macromolecules in living cells. In each of them, the strength of the approach stems from the high biological relevance of the data obtained in the native environment. Clearly, this comes at a cost when compared with pure in vitro analysis. For example, de novo in-cell protein structure calculation still needs a large set of data, for which different isotopic labeling schemes are often necessary, while the intrinsically low sensitivity of the required NMR experiments imposes a lower limit on protein concentration that is not always reached. Similarly, protein investigation in cultured mammalian cells is still limited by the existing approaches for protein insertion, as their efficiency is highly protein-dependent, and by the need for expensive media for isotopic labeling when performing protein expression directly in mammalian cells. In addition, cell samples have short lifetimes in the NMR instrument (a few hours at most) when compared with the usual samples used for in vitro studies (days/weeks). This imposes a limit on the kind of NMR experiments that can be performed on the cells while they are still viable and metabolically active. To overcome these limitations, higher magnetic fields and advancements in the electronics will allow a steady increase in the sensitivity of solution NMR, while DNP enhancement is being increasingly applied to boost the sensitivity of SS-NMR. On the sample side, efforts are ongoing to achieve high cell viability over longer periods of time by employing in-flow systems as cellular bioreactors that fit high-resolution NMR spectrometers [71–73].
Future technical advancements will certainly widen the applicability of in-cell NMR and may one day even enable a fully in-cell/in situ characterization. However, given the intrinsic challenge in such ‘in-cell only’ strategy, a more pragmatic approach to in-cell NMR can be envisioned. Indeed, the state-of-the-art applications reviewed above demonstrate that successful outcome stems from the integration of in-cell NMR with extensive in vitro characterization by NMR, combined with other structural and cell biology methods. Owing to the exquisite sensitivity of NMR to changes in the chemical environment of each atom, the in-cell NMR spectra can be directly compared with those obtained in vitro from samples in different known states, so that the actual intracellular state can be inferred. So, for example, intracellular structural characterization can benefit from an existing in vitro structure, upon which any structural difference is calculated from the in-cell versus in vitro comparison. Similarly, protein folding, metallation, or redox states can be identified by comparison with the various possible states in vitro (Figure 2).
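The idea of inferring the intracellular state by comparison with in vitro reference spectra can be sketched as a nearest-reference search over peak positions. This is an illustrative simplification, not a published protocol: the function names, the residue labels, the peak positions, and the state names are all hypothetical, and the 0.14 scaling of 15N shifts relative to 1H is a commonly used convention assumed here.

```python
import math

# Sketch only: pick the in vitro reference state whose 1H-15N peak positions
# best match those observed in cells. All peak lists below are made-up examples.


def combined_shift_distance(peak_a, peak_b, n_weight=0.14):
    """Weighted distance (ppm) between two (1H, 15N) peak positions.
    The 15N weight of 0.14 is an assumed, commonly used scaling factor."""
    d_h = peak_a[0] - peak_b[0]
    d_n = peak_a[1] - peak_b[1]
    return math.sqrt(d_h ** 2 + (n_weight * d_n) ** 2)


def closest_state(in_cell_peaks, reference_states):
    """Return (best_state, scores), where scores maps each reference state to the
    root-mean-square combined shift distance over the residues it shares
    with the in-cell peak list."""
    scores = {}
    for state, ref_peaks in reference_states.items():
        shared = in_cell_peaks.keys() & ref_peaks.keys()
        if not shared:
            continue  # nothing to compare for this state
        squared = [combined_shift_distance(in_cell_peaks[r], ref_peaks[r]) ** 2
                   for r in shared]
        scores[state] = math.sqrt(sum(squared) / len(squared))
    if not scores:
        raise ValueError("no residues shared with any reference state")
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical reference peak lists: residue -> (1H ppm, 15N ppm)
    references = {
        "apo": {"G10": (8.10, 110.0), "A20": (7.90, 122.0)},
        "zinc-bound": {"G10": (8.45, 108.5), "A20": (8.20, 124.0)},
    }
    observed = {"G10": (8.44, 108.6), "A20": (8.21, 123.9)}
    state, scores = closest_state(observed, references)
    print(state, scores)
```

In practice such a comparison would be restricted to well-resolved, assigned peaks, and a poor match to every reference would itself be informative, since it suggests a state not represented among the in vitro samples.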
Figure 2. Structural and functional insights obtained by combining in vitro/ex vivo data with in-cell NMR.
Initially, the notion that what was once only feasible in vitro could now be performed directly on living cells excited the scientific community. Then, after a slow — although steady — progress of the technique, and a few overenthusiastic claims, some diffidence built up . One common criticism against in-cell NMR is that, despite the increased efforts versus in vitro NMR, the insights obtained are not physiologically meaningful, due to the need of artificially increasing the concentration of the macromolecule of interest. While it is true that increased levels of molecule are often necessary, we can argue that in most cases, the assumption that the cellular environment is not perturbed holds true [as an example, the highest concentration of SOD1 overexpressed in HEK293T cells in our laboratory, estimated ∼300 µM , only accounts for ∼0.01% (w/w) of the total soluble protein; unpublished data] and, even in extreme cases, a less-than-perfect cellular model is still a leap forward from a simple aqueous buffer. Furthermore, such caveats must always be taken into account when designing experiments and interpreting the resulting data, which is true for any other technique (as is the case when GFP-fused constructs are used to study intracellular protein diffusion and dynamics: they require careful control experiments). Finally, the necessity of protein overexpression can be actually turned into an advantage, as it doubles as a knock-in strategy to understand how a specific pathway is altered by the increase of one component. Another criticism is that in-cell NMR does not bring a real advancement, as some combination of previously existing techniques, with appropriate control experiments, would have led to the same findings. Theoretically speaking, for in-cell NMR and many other techniques, this is probably the case. 
In-cell NMR may not become a panacea or a Holy Grail for Structural Biology anytime soon, but it is definitely a more direct way to obtain biologically relevant atomic-level information, and is also a useful tool to validate the biological significance of data obtained in vitro (as, for example, the significance of zinc binding to the copper site of SOD1 can be easily excluded by in-cell observation; Figure 2). In this respect, we believe that in-cell NMR will be most powerful when integrated not only with in vitro data, but also with other emerging high-resolution cellular techniques, such as optical and X-ray fluorescence microscopy (see ref.  for a proof of principle), or cryo-electron tomography, pursuing what we may call an Integrated Cellular Structural Biology approach.
In-cell NMR provides biologically relevant insights on proteins in living cells at atomic resolution.
Structural and dynamic properties of membrane proteins in the native environment can be investigated by in situ solid-state NMR.
Quinary interactions between soluble proteins and the cellular environment are being increasingly studied.
In mammalian cells, functional processes are monitored such as protein folding/misfolding, cofactor binding, and post-translational modifications and interactions.
Integration of in-cell NMR with in vitro NMR and other high-resolution techniques is the key to Integrated Cellular Structural Biology.
Abbreviations: ALS, amyotrophic lateral sclerosis; CCS, copper chaperone for SOD1; DNP, dynamic nuclear polarization; EGF, epidermal growth factor; EGFR, epidermal growth factor receptor; NMR, nuclear magnetic resonance; NOE, nuclear Overhauser effect; SOD1, superoxide dismutase 1.
The Authors declare that there are no competing interests associated with the manuscript.