Identifying SARS-CoV-2 antiviral compounds by screening for small molecule inhibitors of Nsp3 papain-like protease

The COVID-19 pandemic has emerged as the biggest life-threatening disease of this century. Whilst vaccination should provide a long-term solution, this is pitted against the constant threat of mutations in the virus rendering the current vaccines less effective. Consequently, small molecule antiviral agents would be extremely useful to complement the vaccination program. The causative agent of COVID-19 is a novel coronavirus, SARS-CoV-2, which encodes at least nine enzymatic activities that all have drug targeting potential. The papain-like protease (PLpro) contained in the nsp3 protein generates viral non-structural proteins from a polyprotein precursor, and cleaves ubiquitin and ISG protein conjugates. Here we describe the expression and purification of PLpro. We developed a protease assay that was used to screen a custom compound library from which we identified dihydrotanshinone I and Ro 08-2750 as compounds that inhibit PLpro in protease and isopeptidase assays and also inhibit viral replication in cell culture-based assays.


Introduction
By early January 2021, COVID-19 infections and deaths across 191 countries/regions were reported to be 87 million and 1.9 million, respectively [1]. These numbers were reached just over one year after the first COVID-19 outbreak was reported in December 2019. COVID-19 is a highly contagious disease with a human-to-human mode of transmission [2]. Its spread was accelerated by travelers: by the end of January 2020, the World Health Organization (WHO) had declared the disease a 'public health emergency of international concern' and the first European COVID-19 case had been reported in France [3].
There is currently no cure for COVID-19. In June 2020, dexamethasone, an anti-inflammatory drug, was approved by the UK government for COVID-19 patients in National Health Service (NHS) hospitals [6]. In August 2020, remdesivir was approved by the Food and Drug Administration (FDA) in the United States [7]; this is currently the only small molecule antiviral drug approved for use against COVID-19. In addition, casirivimab and imdevimab, monoclonal antibody drugs, were approved under Emergency Use Authorization (EUA) by the FDA in November 2020 [8]. The other focus of treatment is prevention via vaccination. Multiple mRNA-based and virus-based vaccines have been rolled out across the world with similar overall safety and effectiveness [9][10][11]. However, the emergence of a new variant of SARS-CoV-2 in the UK in September 2020 has raised great concern since it is 70% more transmissible [12]. Additional variants arising across the world may render the existing vaccines less effective. Given the uncertainty about the progression of the virus and the time frame needed to vaccinate the global community, it is crucial to search for drugs to treat COVID-19 patients.
The genome of SARS-CoV-2 is arranged as shown in Figure 1A [4]. Once the virus has entered a cell, two open reading frames, ORF 1a and ORF 1ab, are translated. ORF 1ab is generated from an internal ribosomal frameshift. This produces polypeptides that are processed by two viral proteases to produce 16 non-structural proteins (nsp), nsp1 to nsp16. The main protease is the 3C-like protease (3CLpro), corresponding to nsp5. 3CLpro cleaves between each of the nsp4-16 proteins. The second protease, papain-like protease (PLpro), is contained in a small domain of the nsp3 protein; it cleaves after LXGG motifs between each of the remaining nsp1-3 [13]. Interestingly, PLpro is also a deubiquitylase (DUB) and a deISGylase, cleaving after the diglycine residues of ubiquitin (Ub) and the UBL protein ISG15 (interferon-stimulated gene 15), respectively, and thus has roles outside polyprotein cleavage [14,15]. In this paper, we describe our search for PLpro inhibitors.
Figure 1. Design, expression and purification of PLpro enzyme.
(A) Schematic of the SARS-CoV-2 genome showing the 29.9 kb single-strand (+) RNA. ORF 1ab is 21.5 kb and codes for a polyprotein which, after processing, produces 16 proteins named nsp1-16. The nsp3 protein contains the catalytic core PLpro enzyme used in this study, which comprises the Ubl2 and PLpro domains (highlighted in red). The '948 bp' highlighted in red indicates the sequence that was the basis of the bacterial and insect cell expression constructs used for protein purification. [...] are the gel filtration fractions containing PLpro monomer from [...] that were pooled for further assay.
We expressed PLpro in both bacterial and insect cell systems to determine which versions of the purified protein retained the most enzymatic activity. For bacterial expression, PLpro was tagged with either His-Sumo or His-TEV at its N-terminus. After protein pulldown from bacterial lysate, the His-Sumo and His-TEV affinity tags were removed by Ulp1 and TEV proteases, respectively (Figure 1B and Supplementary Figure S1A). For expression in insect cells, the protein was tagged with Flag-His at the N-terminus and the tag remained on the final protein (Supplementary Figure S1B). The activities of all three proteins were compared as described below.

PLpro protease activity
We used a quenched Förster (fluorescence) resonance energy transfer (FRET) technique to monitor the protease activity [17]. 2-aminobenzoyl (Abz) and a nitro-L-tyrosine (Y-(3-NO2)R) were added to opposite ends of a small synthetic peptide that contained the cleavage sequence recognized by PLpro. This generates a fluorescence-quenching pair (Figure 2A) in which emission from Abz is absorbed by the neighboring Y-(3-NO2)R; cleavage is detected as an increase in apparent emission from Abz at 420 nm. Two cleavage sequences were initially selected, corresponding to the ten amino acids spanning the nsp1/2 and nsp2/3 junctions and designated Pro1 and Pro2, respectively. To test for cleavage of the substrates, bacterial His-TEV-PLpro was added to the substrate in the assay buffer. Over the 20 min of incubation, an increasing fluorescence signal was observed for substrate Pro2 but not for Pro1 (Supplementary Figure S2A). Consistent with this, it has recently been shown that full-length nsp3 protein, but not the isolated PLpro domain, can cleave between nsp1/2 [18]. Although cleavage of Pro2 occurred, the rate was relatively slow, making it less suitable for screening. We therefore designed a third substrate based on Pro2, called Pro3, which lengthened the recognition sequence peptide from 10 to 12 amino acids [19] (Figure 2A). This modification resulted in a more rapid reaction (Supplementary Figure S2B), and Pro3 was therefore used as the substrate in subsequent experiments.
The bacterial His-TEV-PLpro (tag removed) was more active than insect cell Flag-His-PLpro at the same enzyme concentration (Supplementary Figure S2D). The two bacterial expression proteins, His-TEV and His-Sumo-PLpro (tags removed) showed similar activity (Supplementary Figure S2C). Since the His-Sumo PLpro is an intact protein with no extra amino acids (after tag removal) and also produced a higher yield after purification, this version of PLpro was used for all of the remaining experiments.
By assaying PLpro cleavage activity across a wide range of substrate concentrations, we found that reaction velocities did not approach saturation at the highest concentration tested; the apparent K_M was therefore estimated to be 1854 µM based on an incomplete dataset (Figure 2B). Cleavage increased over an hour at a variety of enzyme concentrations (Figure 2C). We chose 1.75 µM of enzyme and 20 µM of the substrate to use in the high-throughput screen.
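Because the apparent K_M quoted above is far larger than the substrate concentration chosen for the screen, the reaction runs in the near-linear part of the Michaelis-Menten curve. The following is a minimal sketch of that arithmetic, assuming simple Michaelis-Menten kinetics and that substrate and K_M are given in the same units; the function name is illustrative and not taken from the study.

```python
def fractional_velocity(substrate: float, km: float) -> float:
    """Return v / Vmax for simple Michaelis-Menten kinetics.

    Both arguments must be expressed in the same concentration units.
    """
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and km must be > 0")
    return substrate / (km + substrate)


# Values quoted in the text: apparent K_M of 1854 and 20 of substrate.
screen_fraction = fractional_velocity(20.0, 1854.0)
print(f"v/Vmax at the screening substrate concentration: {screen_fraction:.4f}")
```

At such a small fraction of Vmax the rate is roughly proportional to substrate concentration, which fits the observation that the velocities did not approach saturation.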

The high-throughput screening for over 5000 compounds
The screen to identify small molecule inhibitors was performed using an existing custom library of over 5000 compounds aliquoted at two different concentrations, 1.25 µM or 3.75 µM, in a total of 48 384-well plates (see [20], this issue, for contents and description of the library). The plates were organized with columns 3-22 containing the compounds; the remaining columns were used for reaction controls, including substrate only and substrate with enzyme but no compound (Figure 3Aii). The PLpro enzyme was pre-incubated with the compounds for 10 min, and then the reaction was initiated by the addition of substrate. The activity was recorded from 0 to 20 min with 3 min intervals between readings; this resulted in seven timepoints, which were then used to calculate the slope (Figure 3Ai). We incubated the inhibitors with PLpro before the enzymatic reaction so that slow-binding inhibitors would not be missed [21].
We selected candidates that exhibited apparent inhibition greater than 25%: a total of 29 from the plates with 3.75 µM compound concentration and a further 8 from the plates with 1.25 µM concentration (Figure 3B); four of the compounds were hits at both concentrations. Amongst the hits, 22 out of 29 compounds from the high concentration plates and 1 compound from the low concentration plates were auto-fluorescent at 420 nm and were excluded from our list. Three additional candidates were eliminated because the degree of inhibition was weak and the same compounds did not inhibit at higher concentration. This narrowed the final list down to seven compounds (Table 1). The activity of each compound, normalized to the control without compound, is given in Supplementary Table S1.
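The slope-based readout and the 25% inhibition cut-off described above can be sketched as follows. This is an illustration only: the screen data were processed in MATLAB (see [20]), the readings below are hypothetical, and the function names are not taken from the study. It assumes an ordinary least-squares slope over the seven timepoints and inhibition expressed relative to the enzyme-plus-substrate control.

```python
def least_squares_slope(times, signals):
    """Slope of the ordinary least-squares line through (time, signal) points."""
    n = len(times)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_y = sum(signals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, signals))
    return sxy / sxx


def percent_inhibition(compound_slope, control_slope):
    """Inhibition relative to the no-compound control, in percent."""
    if control_slope == 0:
        raise ValueError("control slope must be non-zero")
    return 100.0 * (1.0 - compound_slope / control_slope)


# Hypothetical readings (arbitrary fluorescence units), seven points 3 min apart.
times_min = [3 * i for i in range(7)]
control_well = [100 + 10 * t for t in times_min]
compound_well = [100 + 6 * t for t in times_min]

inhibition = percent_inhibition(
    least_squares_slope(times_min, compound_well),
    least_squares_slope(times_min, control_well),
)
is_candidate = inhibition > 25.0  # cut-off quoted in the text
print(f"inhibition = {inhibition:.1f}%, candidate = {is_candidate}")
```

A compound that is itself fluorescent at 420 nm distorts the raw signal, which is why auto-fluorescent hits were filtered separately rather than by this threshold alone.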

Gel-based PLpro assay to test candidates
We wanted to exclude artifacts that might result from the fluorescence-based assay. For this reason, we designed a gel-based PLpro protease assay (Figure 4A,B). We constructed a new version of the PLpro substrate in which the 10 amino acids at the nsp2/3 junction, containing the cleavage site, were attached to GST at the N-terminus and MBP at the C-terminus, resulting in a 67 kDa polypeptide (Supplementary Figure S3A). If PLpro-mediated cleavage occurs, products of 25 kDa and 42 kDa are generated, which can easily be detected and visualized by SDS-PAGE (see Figure 4Ci, lane 3). One of the hits from the screen, dihydrotanshinone I, was previously shown to inhibit the PLpro from SARS-CoV-1 [21]. Two other tanshinone derivatives, tanshinone IIA and cryptotanshinone, were shown to be better inhibitors of this enzyme than dihydrotanshinone I [21]. In separate studies, a non-covalent inhibitor, GRL-0617, has been reported by different groups to be effective against SARS-CoV-1/-2 PLpro [22][23][24]. We therefore decided to include these compounds in our validation experiments. As shown in Figure 4C, dihydrotanshinone I, Ro 08-2750 (Figure 4Ciii) and beta-lapachone strongly inhibited the enzyme activity at both compound concentrations tested, as indicated by the reduced levels of the 25 and 42 kDa products (red dot in Figure 4). Two of the final seven compounds, ursodiol and pyrocatechuic acid, did not inhibit PLpro cleavage of the substrate in this assay and therefore appear to be false positives. We tested some of the autofluorescent hits, but none inhibited cleavage of the peptide, further confirming them as false positives (Figure 4Cii,iii). Tanshinone IIA and cryptotanshinone also inhibited cleavage, but only at the higher concentration. Thus, dihydrotanshinone I appears to be the strongest inhibitor amongst the tanshinone derivatives. Similarly, GRL-0617 inhibited cleavage only at the highest concentration. 3CLpro, contained in the nsp5 protein, is the other main protease encoded by SARS-CoV-2 and, like PLpro, has an active site cysteine. However, none of the five screen hits inhibited 3CLpro activity in an analogous gel-based assay (Milligan et al. this series; Supplementary Figure S3B).
Figure 3 (caption fragment). [...] high concentration, respectively, that reduce the PLpro activity by more than 25%. Amongst those hits, four compounds were overlapping. In the low concentration (1.25 µM), one compound is auto-fluorescent and three have a Z score of more than −3.5.
We determined the IC50 values for all 5 inhibitors from the screen that were validated using the gel-based assay. All of them are below 1 µM: 0.26 µM for PDK1/Akt/Flt dual pathway inhibitor, 0.53 µM for Ro 08-2750, 0.39 µM for Cdk4 inhibitor III, 0.61 µM for beta-lapachone, and 0.59 µM for dihydrotanshinone I (Figure 5A-E). The IC50 values for the 3 published compounds were slightly higher under our experimental conditions: 1.79 µM for GRL-0617, 1.57 µM for tanshinone IIA, and 1.34 µM for cryptotanshinone (Figure 5F-H), consistent with the gel-based assay results. During this time, we realized that PDK1/Akt/Flt dual pathway inhibitor was also identified in several other screens being performed in the laboratory. We therefore considered it likely to be nonspecific and eliminated it from further consideration.
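For readers less familiar with the quantity, an IC50 is the inhibitor concentration at which activity falls to half of the uninhibited value. The sketch below illustrates this with a standard Hill-type dose-response curve. The model and the default Hill coefficient of 1 are assumptions made for illustration; the text does not state which model was fitted, and the function name is hypothetical.

```python
def residual_activity(concentration: float, ic50: float, hill: float = 1.0) -> float:
    """Fraction of uninhibited activity remaining at a given inhibitor concentration.

    Assumes a Hill-type dose-response curve with full inhibition at saturation.
    concentration and ic50 must share the same units.
    """
    if concentration < 0 or ic50 <= 0 or hill <= 0:
        raise ValueError("concentration must be >= 0; ic50 and hill must be > 0")
    return 1.0 / (1.0 + (concentration / ic50) ** hill)


# Using the IC50 quoted for Ro 08-2750 (0.53, same units as the concentration).
ic50_ro = 0.53
for multiple in (0.1, 1.0, 10.0):
    remaining = residual_activity(multiple * ic50_ro, ic50_ro)
    print(f"{multiple:>4} x IC50 -> {remaining:.3f} of control activity")
```

Under this model, a tenfold excess over the IC50 still leaves about 9% of the activity, which is one reason complete loss of the cleavage products on a gel requires concentrations well above the IC50.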

Inhibition of PLpro isopeptidase activity
We performed orthogonal assays using two different substrates of PLpro to test whether the small molecule inhibitors identified in the screen inhibit its isopeptidase activity. When K48-linked triubiquitin (Ub3) is incubated with PLpro, it is efficiently cleaved to diubiquitin (Ub2) and monoubiquitin (Ub1) as the final products [18]. Similarly, pro-ISG15 is cleaved to ISG15. When PLpro was pre-incubated with 10 µM of beta-lapachone or dihydrotanshinone I, we observed potent inhibition of both K48-linked Ub3 and pro-ISG15 cleavage, whereas moderate inhibition was observed for both Ro 08-2750 and the previously identified inhibitor of PLpro, GRL-0617, at this concentration (Figure 6). A potential 3CLpro inhibitor, shikonin, did not inhibit isopeptidase activity with either substrate. These experiments demonstrate the inhibitory effect of these small molecules across a spectrum of PLpro substrates.

Cell culture-based antiviral proliferation assay
We next tested the ability of these compounds to inhibit viral growth in a cell culture-based assay in which VERO E6 cells are infected with SARS-CoV-2. Two of the compounds, beta-lapachone and Cdk4 inhibitor III, were cytotoxic in the low micromolar range and were not pursued (Supplementary Figure S4A,B). The two tanshinone derivatives not identified in our screen (cryptotanshinone and tanshinone IIA) did not inhibit viral growth below 200 µM (Supplementary Figure S4C,D). Ro 08-2750 and GRL-0617 were better at inhibiting viral growth (EC50 20 and 32.6 µM, respectively) but also exhibited some cell toxicity at higher concentrations (Figure S5). Dihydrotanshinone I proved to be the best inhibitor, since it effectively inhibited SARS-CoV-2 proliferation at an EC50 of 8 µM (Figure 7Ai) and did not exhibit much cytotoxicity, even at high concentrations.
Since remdesivir is the only approved antiviral compound for COVID-19, we wanted to test whether any of the compounds we found show synergy with remdesivir (Supplementary Figure S5). However, addition of 0.5 µM remdesivir, a concentration just below that required to inhibit viral growth, did not reduce the EC50 of any of our compounds (Figure S7A-Cii). The inhibitory properties of remdesivir alone are described in the nsp14 methyltransferase paper in this issue [25].

Discussion
Dihydrotanshinone I, which emerged as the best overall hit from our screen, is a natural compound isolated from the lipophilic fraction of Salvia miltiorrhiza, which has a long history in traditional Chinese medicine [26]. Several tanshinone derivatives were previously reported to be inhibitors of SARS-CoV-1 PLpro and, to a lesser extent, of 3CLpro [21,26]. We have found that dihydrotanshinone I is the best inhibitor of SARS-CoV-2 PLpro and does not inhibit 3CLpro. We, like other groups [24,27,28], used a truncated nsp3 protein including just the Ubl2 and PLpro domains, although there is evidence that cleavage by the full-length nsp3 may have slightly different specificities [18]. Nevertheless, the fact that dihydrotanshinone I stops viral replication suggests that it is a good nsp3 PLpro inhibitor in cells. Recently, an in silico molecular docking study suggested that tanshinone I, a derivative of dihydrotanshinone I, directly forms a hydrogen bond with the side chain of the catalytic C111 residue in PLpro [29].
PLpro also recognizes and removes K48-linked polyubiquitin chains (Ub) and ISG15 from host cell target proteins. It is known that ubiquitin and ISG15 are covalently attached to target proteins during the cellular response to viral infection. The deubiquitinating (DUB) and deISGylating activities of PLpro, which cleave after LXGG motifs, are thus implicated in viral invasion by shutting down the virus-induced host innate immune response [30]. PLpro of SARS-CoV-2 further shows preferential activity in cleaving ISG15 over Ub in comparison with SARS-CoV-1 [18,22,24]. Interestingly, although the high transmissibility (and potentially greater lethality) of the U.K. variant of SARS-CoV-2, B.1.1.7, was believed to be attributable to multiple mutations in the spike protein, the significance of a particular point mutation, A1708D, in PLpro for viral invasion remains unexplored [31].
The His-Sumo bacterial version was codon optimized and cloned into K27-Sumo (Addgene ID 169193) using the NEBuilder HiFi DNA Assembly Cloning Kit (NEB). The vector template was amplified using primers oEcoli-C_48 and 49; the PLpro sequence was amplified from the His-TEV bacterial strain using primers oEcoli-C_51 and 52.

Expression and purification
Bacterial His-TEV and His-Sumo constructs were introduced into T7 Express lysY/Iq E. coli cells (NEB) for expression. Cells were grown at 37°C to log phase to achieve an OD of 0.8. Cells were then induced by the addition of 0.5 mM IPTG and switched to 18°C to incubate overnight. Cells were harvested and lysed in buffer A (50 mM Tris-HCl, pH 7.5, 10% glycerol, 1 mM DTT, 0.02% NP-40, 500 mM NaCl and 30 mM imidazole), with the addition of 100 mg/ml lysozyme, and sonicated 24 × 5 s. Lysates were centrifuged and the supernatant was collected. The supernatant was incubated with Ni-NTA agarose beads (Thermo) for 2 h at 4°C. Beads were washed with wash [...] the His-TEV- and His-Sumo-tag, respectively. After dialysis, the lysate was incubated with Ni-NTA agarose beads once again to remove the proteases. The flow through was collected and loaded onto a MonoQ 5/50 GL column (GE Healthcare) with buffer B, with a gradient from 0.1 M to 1 M NaCl. Flow through was collected and concentrated using an Amicon Ultra 10 kDa unit (Merck). It was then loaded onto a Superdex S200 Increase 10/300 GL column (GE Healthcare) with buffer C (25 mM HEPES-KOH, pH 7.6, 10% glycerol, 0.02% NP-40, 150 mM NaCl and 2 mM DTT). Peak fractions were collected and pooled.
Baculovirus 3xFlag-His6-PLpro (3FH-PLpro) was expressed in baculovirus-infected insect cells. The coding sequence was codon-optimised for S. frugiperda and synthesized (GeneArt, Thermo Fisher Scientific). PLpro DNA was subcloned into the biGBac vector pLIB [32] to include an N-terminal 3xFlag-His6 tag (sequence: MDYKDHDGDYKDHDIDYKDDDDKGSHHHHHHSAVLQ) (Addgene ID 169194). Baculoviruses were generated using the EMBacY baculoviral genome [33] in Sf9 cells (Thermo Fisher Scientific). For protein expression, Sf9 cells were infected with baculovirus and collected 48 h after infection, flash-frozen and stored at −70°C.

PLpro gel-based assay substrate
TLKGG//APTKV (nsp2/3 junction) was used as the substrate for the gel-based assay. GST-tag and MBP-tag were attached to its N- and C-terminus, respectively (Addgene ID 169195).

Enzymatic assay
Kinetic constants were measured using a fluorescence-based assay. 0.5 µM enzyme was used in the titration of substrate Pro3 from 800 to 1.43 µM in 0.75-fold dilution steps. V_max and K_M were estimated by non-linear Michaelis-Menten fitting of the slope values obtained by linear regression.
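The substrate titration can be reproduced as a simple geometric series. The sketch below assumes a serial dilution in which each concentration is 0.75 times the previous one; the number of points (23) is inferred from the stated end points and is not given in the text, and the function name is illustrative.

```python
def dilution_series(start: float, factor: float, points: int) -> list:
    """Concentrations of a serial dilution, highest first."""
    if start <= 0 or not 0 < factor < 1 or points < 1:
        raise ValueError("need start > 0, 0 < factor < 1 and points >= 1")
    return [start * factor ** step for step in range(points)]


# 800 down to about 1.43 (same units) in 0.75-fold steps.
pro3_series = dilution_series(800.0, 0.75, 23)
print(f"{len(pro3_series)} concentrations, "
      f"from {pro3_series[0]:.0f} to {pro3_series[-1]:.2f}")
```

Twenty-two successive 0.75-fold steps from 800 land on roughly 1.43, which matches the range quoted above.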

High-throughput inhibitor screening
A total of over 5000 compounds (Sigma, Selleck, Enzo, Tocris, Calbiochem, and Symansis) were obtained from the High-Throughput Screening (HTS) facility at the Francis Crick Institute. The compounds were aliquoted into 48 384-well plates at two concentrations, giving final concentrations of 1.25 µM and 3.75 µM in a 20 µl reaction. An amount of 5 µl of the enzyme was pre-incubated with compounds for 10 min before the addition of 15 µl of substrate. Fluorescence was monitored from 2 min to 20 min at 3 min intervals, giving a total of seven readings.

Data analysis
MATLAB was used to process the data (more details in [20]).

Gel-based assay
An amount of 0.5 µM of the enzyme was pre-incubated with the selected inhibitor compounds for 10 min. An amount of 0.5 µM of the gel-based assay substrate was then added and incubated at RT for 5 h. The buffer used for the assay was buffer G. The reactions were run on 4-15% TGX gels (Bio-Rad, Hercules, CA) and stained with InstantBlue stain (Expedeon).

Assays for deubiquitylation and deISGylation activity
Expression and purification of PLpro and the substrates K48-linked Ub3 and pro-ISG15 were as described previously [18]. PLpro was diluted in buffer G (50 mM HEPES-KOH pH 7.6, 2 mM DTT, 10% (v/v) glycerol, 0.02% (v/v) Tween-20) to 100 nM and pre-incubated with the indicated concentrations of inhibitor at room temperature for 10 min. The substrates Ub3 and pro-ISG15 were diluted in buffer G to a concentration of 2 µM. An equal volume of substrate was then mixed with the inhibited PLpro to give final concentrations of 50 nM PLpro, 10 µM inhibitor and 1 µM substrate. The reaction was incubated for 1 h at 25°C and stopped by the addition of LDS sample buffer. The samples were separated on 4-12% NuPAGE Bis-Tris Gels (ThermoFisher Scientific) and stained using the Pierce Silver Stain Kit (ThermoFisher Scientific).
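The final concentrations in this protocol follow from mixing equal volumes, which halves each component. A minimal sketch of that dilution arithmetic is given below; the function name and the unit volumes are illustrative, and the units of each result are those of the corresponding stock.

```python
def final_concentration(stock: float, stock_volume: float, total_volume: float) -> float:
    """Concentration after diluting stock_volume of stock into total_volume."""
    if total_volume <= 0 or not 0 <= stock_volume <= total_volume:
        raise ValueError("need 0 <= stock_volume <= total_volume and total_volume > 0")
    return stock * stock_volume / total_volume


# One volume of enzyme mix plus one volume of substrate.
plpro_final = final_concentration(100.0, 1.0, 2.0)    # enzyme stock of 100 nM
substrate_final = final_concentration(2.0, 1.0, 2.0)  # substrate stock of 2
print(plpro_final, substrate_final)
```

By the same reasoning, the inhibitor must be present at twice its stated final concentration during the pre-incubation step.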

Cell-based assay
SARS-CoV-2 production, infection and recombinant mAb production were performed as described in [20].

Data Availability
All data in this paper can be found on FigShare.