Coronaviruses are the causative agents of many globally concerning respiratory disease outbreaks, such as severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS) and coronavirus disease-2019 (COVID-19). It is therefore important that we improve our understanding of how the molecular components of the virus facilitate the viral life cycle, as these details will allow for the design of effective interventions. Krichel and coauthors, in their article in the Biochemical Journal, provide molecular details of how the viral polyprotein (nsp7–10), produced from the positive single-stranded RNA genome, is cleaved to form proteins that are part of the replication/transcription complex. The authors highlight the impact the polyprotein conformation has on the cleavage efficiency of the main protease (Mpro) and hence on the order of release of non-structural proteins 7–10 (nsp7–10) of SARS-CoV. Cleavage order is important in controlling viral processes and seems to have relevance in terms of the protein–protein complexes formed. The authors made use of mass spectrometry to advance our understanding of the mechanism by which coronaviruses control nsp7, 8, 9 and 10 production in the virus life cycle.
Coronaviruses have gained much attention as the causative agents of outbreaks of human respiratory syndromes such as severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS) and, most recently, coronavirus disease-2019 (COVID-19). These respiratory diseases arose from zoonotic transfer from animals to humans, which produced strains of the virus that had not previously circulated in the human population. The resulting respiratory diseases have produced high fatality rates, as is the case for MERS, with a case fatality rate reported by the World Health Organization of ≈35% in January 2020. In addition, COVID-19 has shown rapid global spread, with more than 1,200,000 cases and counting (7 April 2020). The need for specific treatment options targeting these coronaviruses is high, and providing these options requires a detailed understanding of the structure of the molecular components that are essential for the viral life cycle. Analysis of the SARS-CoV-2 (COVID-19) sequence indicated that it shows 77.2% amino acid identity with SARS-CoV (SARS), with the non-structural proteins (nsp's) 7–10 showing between 97.1% and 98.8% identity. Therefore, molecular details pertaining to SARS-CoV proteins and their regulation should apply broadly to the family of viruses.
Polyprotein nsp7–10 processing
SARS is caused by the coronavirus known as SARS-CoV, which belongs to the order Nidovirales, the family Coronaviridae, the subfamily Coronavirinae and the genus Betacoronavirus. These viruses are enveloped and have some of the largest single-stranded RNA genomes. The SARS-CoV positive-stranded RNA genome encodes two open reading frames (ORF), ORF1a and ORF1b. Translation of these ORFs produces two polyproteins (pp), pp1a and pp1ab, the latter by ribosomal frameshifting. ORF1b has a higher degree of sequence conservation than ORF1a, most likely due to the functional significance of the proteins produced by this region. The polyproteins are autocatalytically processed to produce 15 to 16 nsp's essential for viral RNA synthesis, which include two proteases (nsp3, PLpro; nsp5, Mpro) and the viral RNA-dependent RNA polymerase (nsp12, RdRp). The processing of the polyprotein region is a point of posttranslational control that is essential for virus replication. SARS-CoV uses both proteases to process the polyproteins: the papain-like protease (PLpro) and the main chymotrypsin-like protease (Mpro). Non-structural proteins 7, 8, 9 and 10 are released by Mpro, which recognizes specific cleavage sites. Sequence conservation is high in these nsp's within the Coronavirinae, which is expected, since these proteins play a critical role in the viral life cycle. They have been proposed as interacting partners for nsp12, nsp14 (exoribonuclease, ExoN) and nsp16 (ribose 2′-O-MTase) as well as RNA, but their exact functions are not well characterized. Krichel and coauthors in their article investigate the processing of the nsp7–10 polyprotein. Their results show for the first time the critical role that protein conformational structure plays in the order in which the nsp's are released from the polyprotein.
Their data show that, when peptides representing the joining segments between adjacent nsp's (nsp10 with 9, nsp9 with 8 and nsp8 with 7) are used, the order in which the nsp's would be released from the polyprotein is nsp10, then nsp7, then nsp9 from nsp8, based on the rates of the enzymatic peptide cleavage reaction. When a full-length polyprotein (nsp7–10) in its native conformation is used, Mpro cleaves the polyprotein in a different order: nsp10 is cleaved first, then nsp9, and then nsp8 from nsp7 (Figure 1). Sequence analysis and crystal structures [8,9] of the coronavirus protease Mpro have indicated that the substrate binding pocket of this enzyme has many key amino acid residues that make many contacts with its substrate. It is likely that so many key interactions are needed to distinguish the conformation of the polyprotein.
Figure 1. Polyprotein nsp7–10 cleavage order and protein product structures.
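The two cleavage orders reported for the peptide substrates and for the full-length polyprotein can be compared by listing the fragments present after each cleavage event. The following is a minimal illustrative sketch only, not the authors' analysis: the function name `cleavage_intermediates` and the representation of each junction as a pair of domain names are assumptions made for this example, and the two orders are taken directly from the description in the text.

```python
from typing import List, Tuple

# Domain order along the nsp7-10 polyprotein, N- to C-terminus.
DOMAINS = ["nsp7", "nsp8", "nsp9", "nsp10"]

# A junction is named by the two adjacent domains it joins (hypothetical notation).
Junction = Tuple[str, str]

# Order inferred from peptide cleavage rates: nsp10, then nsp7, then nsp9 from nsp8.
PEPTIDE_ORDER: List[Junction] = [("nsp9", "nsp10"), ("nsp7", "nsp8"), ("nsp8", "nsp9")]

# Order observed for the folded full-length polyprotein:
# nsp10, then nsp9, then nsp8 from nsp7.
FULL_LENGTH_ORDER: List[Junction] = [("nsp9", "nsp10"), ("nsp8", "nsp9"), ("nsp7", "nsp8")]


def cleavage_intermediates(order: List[Junction]) -> List[List[str]]:
    """Return the fragments present after each successive cleavage event."""
    fragments = [list(DOMAINS)]
    history = []
    for left, right in order:
        next_fragments = []
        for frag in fragments:
            if left in frag and right in frag:
                # Split the fragment at the junction between `left` and `right`.
                cut = frag.index(right)
                next_fragments.append(frag[:cut])
                next_fragments.append(frag[cut:])
            else:
                next_fragments.append(frag)
        fragments = next_fragments
        history.append(["-".join(f) for f in fragments])
    return history


if __name__ == "__main__":
    for label, order in [("peptide-based", PEPTIDE_ORDER),
                         ("full-length", FULL_LENGTH_ORDER)]:
        print(label)
        for step, frags in enumerate(cleavage_intermediates(order), start=1):
            print(f"  after cleavage {step}: {', '.join(frags)}")
```

The two orders differ only in the intermediate present after the second cleavage: the peptide-based order would leave an nsp8-nsp9 fragment, whereas the full-length order leaves an nsp7-nsp8 fragment that remains uncleaved until the final step.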
The order in which nsp's are released links to different functional quaternary structures formed
The polyprotein chain has been shown to contain sufficient structure to catalyze RNA synthesis in the uncleaved form. It is therefore not surprising that the polyprotein adopts specific tertiary structural characteristics that modulate the enzymatic cleavage by Mpro. The conformational structure formed by the polyprotein nsp7–10 causes the nsp's to be cleaved in what seems to be a logical order. One would expect these nsp's to be cleaved in an order that reflects their associations with one another and with other key proteins, in line with the viral needs during the production of new viruses. Nsp10 and nsp9 have no reported structures involving either nsp7 or nsp8 (Figure 1), which was also confirmed by Krichel and coauthors. It is therefore not surprising that nsp10 is released from the polyprotein first, as it requires no other partners to carry out its function. A spherical dodecameric structure has been reported that is composed of trimeric units of nsp10 alone, and a monomeric structure of nsp10 has also been reported. Both structures show the presence of zinc finger motifs. Proteins containing zinc finger motifs are often associated with DNA or RNA binding functions, indicating a potential role for nsp10 in RNA synthesis; because nsp10 is cleaved first, this role is presumably required before those of the other nsp's. Similarly, nsp9 has been shown to form dimeric structures that bind RNA [13,14] (Figure 1). Krichel and coauthors also confirmed the presence of nsp9 monomers and dimers in solution but no complexes of nsp9 with the other nsp's. Therefore, it seems logical that nsp9 will be cleaved next rather than nsp7. Studies have shown that nsp8 and nsp7 interact and form different structures (Figure 1). Zhai and coauthors reported the formation of a hexadecamer that possesses two structural conformations of nsp8, termed the ‘golf club’ and the ‘golf club with a bent shaft’.
The hexadecamer forms a doughnut-shaped structure, while other reports indicated the formation of a heterotrimer consisting of two copies of nsp7 and one of nsp8. The nsp7/nsp8 heterotrimer structure has been shown to produce RNA, pointing to a role as an RNA polymerase. It therefore makes sense that nsp7–8 would be cleaved last, producing the proteins that can then form various quaternary structures. Krichel and coauthors discovered a novel arrangement composed of two nsp7 and two nsp8 monomers in a heterotetramer. They also found that the complex could dissociate, forming complexes with one nsp7 and two nsp8 monomers as well as two nsp7 monomers and one nsp8 monomer. Based on the variety of structures reported for nsp7 interacting with nsp8, it would therefore appear that these proteins form multiple functional quaternary structures and possibly adopt many stable intermediate structures.
Coronaviruses continue to pose a very real danger globally. A detailed understanding of the proteins encoded by these viral genomes and of the posttranslational control mechanisms will be critical in the development of interventions. Although the literature provides detailed structures for nsp7, 8, 9 and 10, it is evident that many possible forms exist. These different structural forms could be part of a detailed posttranslational control mechanism, and it would be important to determine the function of these different structures, be that enzymatic or as key intermediates driving virus formation. In their paper, Krichel and coauthors also highlight the importance of understanding polyprotein structure and its role in the viral life cycle. The virus seems to employ multiple levels of control, and each protein could play multiple roles. Intermediates may also be off-pathway structures used to modulate rates of reactions. Knowledge of which structures exist would provide the information necessary to understand the molecular details of the role these proteins play in the viral life cycle, and would thus allow manipulation of the viral proteins for control and treatment options.
The author declares that there are no competing interests associated with this manuscript.