Systems biology is regarded as a technology that could bring breakthroughs in the study of TCM (traditional Chinese medicine). Proteomics, as one of the major components of systems biology, has been used in the mechanistic study of TCM and has provided some interesting results. In the present paper, we review the current application of proteomics in the mechanistic study of TCM. Proteomics technologies and strategies that might be used in the future to improve the study of TCM are also discussed.
With a history of more than 2000 years, TCM (traditional Chinese medicine) has benefitted about one-fifth of the world's population in treating a plethora of diseases. It is still widely used in Asian countries and is also gaining attention in the West as a source of new drugs, dietary supplements and functional foods. However, many factors impede worldwide acceptance of TCM as a real therapeutic option, and the lack of scientific clarification of its mechanisms is one of them. For researchers conducting mechanistic studies of TCM, the complexity, variability and underlying philosophical systems of TCM present challenges. First, the sheer number of materials used in TCM makes study laborious. Some 11 146 plant species representing 2309 genera and 383 families, as well as 1581 species of animals and 80 minerals and substances ranging from precious stones to mineralized fossils, are used in preparing TCM. Secondly, a fundamental belief of TCM is that its efficacy is based on interactions between its multiple components. The possibility that multi-component mixtures might have advantages over a single-component drug has a scientific foundation, but it is very difficult to verify. Thirdly, the composition of TCM is variable. TCM in China includes CMMs (Chinese medicinal materials), CMM extracts and PCMs (proprietary Chinese medicines)/composite formulae. In the clinic, TCM doctors prescribe selected CMMs in the form of herbal decoctions, and the composition of the decoction is adjusted for each patient to give a personalized treatment. These characteristics of TCM make it very difficult to study its mechanisms.
The development of systems biology brings a new dawn to the study of TCM. By measuring the profiles of genes, proteins and metabolites, systems biology provides a more holistic approach to biological study. This property coincides with the holistic thinking of TCM, and systems biology is therefore seen by some as a perfect match for the study of TCM. “If there is any technology that could lead to a breakthrough in TCM, it will be systems biology,” says Robert Verpoorte, head of the Pharmacognosy Department at the University of Leiden in The Netherlands (quoted in ). Proteomics, as one of the major components of systems biology, has been used in the study of TCM for several years. In the present review, we summarize the current application of proteomics in TCM mechanistic studies. Furthermore, techniques and strategies that might be applied in future proteomics studies of TCM mechanisms are also discussed.
Presently used technologies and strategies in the mechanistic study of TCM
Published papers related to the topic of the present review were searched for in PubMed using the key words ‘proteomics’ and ‘traditional Chinese medicine’. A total of 69 papers, including 13 reviews, were found (up to 9 June 2011). Among the 56 original research papers, some focused on syndromes of TCM, such as kidney-yin deficiency syndrome and kidney-yang deficiency syndrome, which are used by TCM doctors in the diagnosis of diseases. Only 28 original papers [4–31] were related directly to the mechanistic study of TCM, including both purified compounds and mixtures extracted from TCM. Of note, seven of the 28 papers [4,7,15,20,21,23,24] were from our laboratory in the Shanghai Research Center for Modernization of Traditional Chinese Medicine, Shanghai Institute of Materia Medica, Chinese Academy of Sciences. Since 2005, we have used proteomics in the mechanistic study of TCMs including Ganoderma lucidum, Salvia miltiorrhiza, Panax notoginseng and others.
The first paper describing a proteomics study of TCM mechanisms was published in 2004 by Guo et al. Using 2-DE (two-dimensional protein electrophoresis), Guo et al. examined protein expression in the bone marrow of mice with a blood deficiency treated with Siwu Tang, a type of blood-enriching TCM. Proteins differentially expressed between Siwu Tang-treated mice and control mice were then identified by MS analysis. They found that irradiation caused blood deficiency in mice and changed the protein expression profiles of mouse bone marrow. Siwu Tang treatment could reverse the changes in ten up-regulated and four down-regulated proteins in mice that underwent irradiation. The effects of Siwu Tang might be related to proteins including lymphocyte-specific protein 1, proteasome 26S ATPase subunit 4, haemopoietic cell protein-tyrosine phosphatase, glyceraldehyde-3-phosphate dehydrogenase, growth factor receptor-binding protein 14 and lgals 12 (lectin, galactose-binding, soluble 12).
As in the study of Guo et al., 25 of the 28 papers related to the proteomics study of TCM used 2-DE coupled with MS or MS/MS (tandem MS). 2-DE-based proteomics is a classic method in proteomics. 2-DE separation of proteins was first reported in 1975. With progress in the MS identification of proteins, 2-DE combined with protein identification emerged as a true proteomics experiment in 1995. 2-DE resolves complex protein mixtures first by charge using isoelectric focusing and then by mass using SDS/PAGE. Theoretically, approximately 2000 proteins could be detected and quantified in 2-DE using fluorescence staining, with each spot representing amounts as low as 1 ng of protein. To date, quantitative proteomics based on 2-DE is still one of the most widely used proteomics approaches. Furthermore, compared with LC (liquid chromatography)–MS-based proteomics, 2-DE does not rely on sophisticated equipment and is relatively easy to carry out. It is therefore understandable that, in the early stages of the proteomics study of TCM, 2-DE-based proteomics was a popular choice. Only three of the 28 papers used technologies other than 2-DE. Hu et al. and Cho et al. used protein array and SELDI–TOF (surface-enhanced laser-desorption ionization time-of-flight) MS technology. Yoo et al. used 1-DE (one-dimensional electrophoresis) and LC–electrospray ionization quadrupole time-of-flight analysis in their study.
No matter which proteomics technologies were adopted, the strategies of comparative proteomics used in the 28 papers were similar. Generally, cells or animals were grouped into control and TCM-treated groups. After TCM (purified compounds or mixtures) treatment, protein samples were extracted from cells or tissues of animals. The protein expression profiles of control and TCM-treated groups were examined and compared. Proteins differentially expressed between the TCM-treated and control groups were then identified using MS or MS/MS. The proteins were considered to be possible target-related proteins of TCM. This strategy is simple, but effective. Results of this kind of comparative proteomics study did provide useful information about the mechanism of TCM. However, most of the proteins found in this kind of study might be indirect protein targets of TCM.
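The comparison step of this strategy can be sketched in a few lines of Python. This is a minimal illustration only, not the analysis used in any of the reviewed papers: the spot names, the intensity values and the 1.5-fold cut-off are all hypothetical, and a real study would add normalization and a statistical test across replicates.

```python
from statistics import mean

def differential_spots(control, treated, fold_cutoff=1.5):
    """Return spots whose mean intensity changes by at least fold_cutoff.

    control, treated: dict mapping spot ID -> list of replicate intensities.
    The 1.5-fold cut-off is an illustrative assumption, not a value from the review.
    """
    hits = {}
    for spot in control.keys() & treated.keys():
        c = mean(control[spot])
        t = mean(treated[spot])
        if c <= 0 or t <= 0:
            continue  # spots missing in one group need separate handling
        ratio = t / c
        if ratio >= fold_cutoff:
            hits[spot] = ("up", round(ratio, 2))
        elif ratio <= 1 / fold_cutoff:
            hits[spot] = ("down", round(ratio, 2))
    return hits

# Hypothetical spot intensities from three replicate gels per group.
control = {"spot_01": [100, 110, 90], "spot_02": [200, 210, 190], "spot_03": [50, 55, 45]}
treated = {"spot_01": [210, 190, 200], "spot_02": [205, 195, 200], "spot_03": [20, 25, 15]}

hits = differential_spots(control, treated)
for spot in sorted(hits):
    print(spot, hits[spot])
```

Only the spots returned here would be excised and sent for MS or MS/MS identification; unchanged spots such as the hypothetical spot_02 are ignored.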
To search for direct targets of TCM and clarify signal networks activated by TCM, we have tried to use bioinformatics technologies together with proteomics technologies in our studies. For example, one of our studies was focused on the protein targets and signal network of salvianolic acid B in cardiac myocytes . Salvianolic acid B is the main active component of S. miltiorrhiza, a TCM commonly used for the treatment of cardiovascular disorders such as angina pectoris, myocardial infarction and stroke. The possible direct protein targets of salvianolic acid B were predicted using INVDOCK, a ligand–protein inverse docking algorithm, based on protein structure databases. Proteins that could bind directly to salvianolic acid B were considered as its possible direct protein targets. Then, 2-DE was used to check the protein expression profiles of cells that underwent salvianolic acid B treatment. Proteins whose expression levels could be affected by salvianolic acid B treatment were accepted as its possible indirect protein targets. Finally, the possible signal network between the possible direct target protein and the possible indirect protein targets of salvianolic acid B was established on the basis of protein–protein interaction databases and then verified .
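The last step, linking a predicted direct target to the indirect targets through known protein–protein interactions, can be illustrated with a breadth-first search over an interaction graph. The sketch below is an assumption about how such a network could be assembled, not the procedure used in the cited study; the protein identifiers (D1 for a direct target, I1 to I3 for indirect targets, X1 to X4 for intermediates) and the interaction list are hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected interaction graph from (protein_a, protein_b) pairs."""
    ppi = {}
    for a, b in edges:
        ppi.setdefault(a, set()).add(b)
        ppi.setdefault(b, set()).add(a)
    return ppi

def shortest_path(ppi, source, target):
    """Breadth-first search for the shortest interaction path between two proteins."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in sorted(ppi.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no known interaction chain links the two proteins

# Hypothetical interactions taken from a protein-protein interaction database.
edges = [("D1", "X1"), ("X1", "I1"), ("D1", "X2"),
         ("X2", "X3"), ("X3", "I2"), ("I3", "X4")]
ppi = build_graph(edges)

# Connect the predicted direct target to each 2-DE-derived indirect target.
paths = {t: shortest_path(ppi, "D1", t) for t in ["I1", "I2", "I3"]}
for target, path in paths.items():
    print(target, path)
```

Indirect targets with no path to the direct target (I3 in this toy example) would need another explanation, such as an unpredicted direct target or an interaction missing from the database.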
Other proteomics technologies that could be used in the mechanistic study of TCM
Driven by technical development and by its application in different areas, proteomics continues to evolve rapidly. New proteomics technologies might be used to improve the proteomics study of TCM. Generally, the types of data that proteomics analysis can provide include (i) protein identification, (ii) protein quantification, and (iii) identification of protein isoforms and PTMs (post-translational modifications). The proteomics study of TCM could be improved in all three areas.
To increase proteome coverage (the number of proteins identified), different kinds of protein or peptide separation technologies have to be used. 2-DE, which has been popularly used in the study of TCM, is one of the representative technologies for the separation of intact proteins. Although 2-DE can provide information on protein molecular mass and pI for protein identification, it is labour-intensive and often accompanied by poor recovery of large or hydrophobic proteins, as well as loss of proteins during gel separation. Other technologies such as LC should be brought into the mainstream of the proteomics study of TCM. Of note, one of the most commonly used proteomics methods, called ‘shotgun’ proteomics, starts with enzymatic digestion of proteins into a mixture of peptides, followed by online one-dimensional or two-dimensional LC–MS or LC–MS/MS. Reviews of the LC–MS technologies and LC–MS instruments used in proteomics are available [37,38]. Furthermore, in the case of complex protein mixtures, prefractionation techniques such as ion-exchange chromatography, hydrophobic interaction chromatography and affinity chromatography are necessary before protein characterization. Investigators should bear in mind that different sets of proteins will be observed when methods with different solubilization and/or separation parameters are used. According to the observation of Van Eyk, in some cases there was <30% overlap in the proteins observed between different protein separation methods in proteomics analysis. Therefore the use of diverse proteomics technologies in the study of TCM should be encouraged to increase proteome coverage.
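The effect of combining separation methods on coverage can be made concrete with simple set arithmetic. The following sketch uses hypothetical protein identifiers, and it assumes that overlap is expressed as shared identifications divided by all proteins seen by either method; the text does not state how the <30% figure was calculated.

```python
def overlap_fraction(method_a, method_b):
    """Shared identifications as a fraction of all proteins seen by either method.

    This (Jaccard) definition of overlap is an assumption made for illustration.
    """
    a, b = set(method_a), set(method_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical identifications from a gel-based and an LC-based workflow.
gel_based = {"P1", "P2", "P3", "P4", "P5", "P6"}
lc_based = {"P5", "P6", "P7", "P8", "P9", "P10"}

overlap = overlap_fraction(gel_based, lc_based)
combined_coverage = len(gel_based | lc_based)

print(f"overlap: {overlap:.0%}")
print(f"proteins identified by either method: {combined_coverage}")
```

In this toy case each method alone identifies six proteins, whereas the two together identify ten, which is the argument for using diverse technologies.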
For gel-free protein separation methods, gel-free quantification methodologies have to be used. Quantitative methodologies based on gel-free techniques can be classified into two types: tag-labelling and label-free. An overview of these techniques is available in a recent review. In tag-labelling methods, labels can be introduced into samples by metabolic incorporation in vivo or chemically or enzymatically in vitro, and quantitative analysis is then based on comparison of the intensities of ion signals. Generally, in vivo labelling is based on the labelling of proteins inside living cells, which can be achieved by feeding cells with isotopically labelled ‘light’ or ‘heavy’ nutrients. The overall process can also be scaled up to allow the creation of labelled plants, animals or yeast [41,42]. In vitro, isotopic tags can be introduced by a chemical or enzymatic step that couples the label to the protein or peptide. Several labels including 18O incorporation, ICAT (isotope-coded affinity tag), isotope-coded protein labels, AQUA (absolute quantification), iTRAQ (isobaric tag for relative and absolute quantification) and Tandem Mass Tags® are available. Among these labels, iTRAQ, which uses amine-specific isobaric reagents, is a good example of a chemical tagging strategy. iTRAQ can be used to analyse four (4-plex iTRAQ kit) or eight (8-plex iTRAQ kit) different samples. Each tag in the iTRAQ kit contains a peptide-reactive group, a balance group (which compensates for the increase in mass due to the reporter so that all molecules have the same mass) and a reporter group. Separately labelled peptides from each of the experimental samples are combined before separation using LC and MS/MS analysis. The same peptide from each sample appears as a single peak in MS, whereas the tagged peptides are fragmented in MS/MS and the iTRAQ tags release reporter ions at 114.1, 115.1, 116.1 and 117.1 m/z (for the 4-plex). Therefore the reporter ion peaks can be used to calculate the relative abundance of the peptides and consequently of the proteins. Our laboratory is now trying to use iTRAQ kits in our study of TCM and hopes to publish related papers soon. In contrast with tag-labelling methods, label-free quantification aims to provide quantitative information without introducing any form of labelling. The principle of label-free quantification is to find relevant indicators of (relative) protein abundance directly in the mass spectrometer output, and promising methods such as spectral counting, protein abundance index, TIC (total ion chromatogram), replicate and average intensity methods have been developed.
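How reporter ion peaks translate into relative abundance can be sketched as follows. This is a simplified illustration under stated assumptions: the peak lists, the ±0.05 m/z matching tolerance and the choice of the 114.1 channel as reference are hypothetical, and the isotope-impurity correction and normalization applied in real iTRAQ workflows are omitted.

```python
# 4-plex reporter ion m/z values as given in the text.
REPORTERS = (114.1, 115.1, 116.1, 117.1)

def reporter_intensities(peaks, tol=0.05):
    """Sum peak intensity within +/- tol of each reporter m/z in one MS/MS spectrum.

    peaks: list of (m/z, intensity) pairs. The tolerance is an illustrative assumption.
    """
    return [sum(i for mz, i in peaks if abs(mz - r) <= tol) for r in REPORTERS]

def protein_ratios(spectra, reference=0):
    """Sum reporter intensities over a protein's peptide spectra, then
    express each channel relative to the reference channel."""
    totals = [0.0] * len(REPORTERS)
    for peaks in spectra:
        for k, value in enumerate(reporter_intensities(peaks)):
            totals[k] += value
    ref = totals[reference]
    if ref == 0:
        return None  # reference channel not observed; no ratio can be formed
    return [round(t / ref, 2) for t in totals]

# Hypothetical MS/MS spectra of two peptides assigned to the same protein.
spectra = [
    [(114.1, 1000.0), (115.1, 2000.0), (116.1, 500.0), (117.1, 1000.0), (245.2, 3000.0)],
    [(114.11, 3000.0), (115.09, 6000.0), (116.1, 1500.0), (117.12, 3000.0), (512.3, 800.0)],
]

ratios = protein_ratios(spectra)
print(dict(zip(REPORTERS, ratios)))
```

If the 114.1 channel were the untreated control, the hypothetical protein here would be doubled in the 115.1 sample and halved in the 116.1 sample.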
The identification of protein isoforms and PTMs depends on MS technologies. The development of MS technologies in proteomics has been reviewed recently. In the present paper, we want to emphasize ‘top-down’ analysis, a relatively new field in the MS analysis of proteins. In top-down MS analysis, proteins are analysed in the gas phase as intact molecules and their multiply charged ions are used as precursors in tandem mass analysis. The opposite approach, called the ‘bottom-up’ method, has been popularly used in proteomics. In the bottom-up method, proteins are digested with proteases (usually trypsin) and the resulting peptides are analysed and fragmented. Compared with the bottom-up method, the top-down method has the advantage of giving a more complete picture of the protein structure. For example, 50–70% sequence coverage is usually a good result for the bottom-up method, meaning that 30–50% of the protein remains unknown to the researcher. Therefore, when full knowledge of a polymorphism or of the complete PTMs of a protein is important to an investigator, the top-down method becomes a better choice. The idea of top-down analysis dates from 1990, but its development happened only in the last decade. Although the high cost of instruments has impeded its popularity, the top-down method seems to be consolidating its role in MS, mostly in combination with the bottom-up method. In studies using combined top-down and bottom-up approaches [46,47], bottom-up was used for the general identification of proteins, whereas top-down was used to deal with more particular and stringent issues, such as the complete PTM mapping of a single protein. It is believed that the combination of these two techniques could provide answers for protein identification, quantification and characterization down to a single nucleotide polymorphism.
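The sequence coverage figure quoted for the bottom-up method is simply the fraction of residues that fall inside at least one identified peptide. A minimal sketch of the calculation is given below; the 20-residue protein sequence and the two identified tryptic peptides are hypothetical.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one identified peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein) if protein else 0.0

# Hypothetical protein and the tryptic peptides identified from it.
protein = "MKTAYIAKQRQISFVKSHFS"
identified = ["TAYIAK", "QISFVK"]

coverage = sequence_coverage(protein, identified)
print(f"sequence coverage: {coverage:.0%}")
print(f"unobserved: {1 - coverage:.0%}")
```

Any polymorphism or PTM located in the unobserved residues would go undetected in a bottom-up experiment, which is the gap that analysis of the intact protein is meant to close.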
Strategies that need to be adjusted in the mechanistic study of TCM
Adaptation to more complex samples and experimental conditions is important when applying a new technology such as proteomics. Although it has proved to be a very useful tool in identifying the target molecules of bioactive small molecules, proteomics still needs to be fine-tuned to be successful in TCM studies. Higher-throughput proteomics methods might be necessary to deal with the huge number of TCMs. Furthermore, public databases containing proteomics results on the possible target-related proteins of TCMs need to be established and shared. Most importantly, the possible synergistic effects between different constituents in a TCM formula have to be clarified. Suitable bioinformatics tools for proteomics data therefore have to be developed to analyse the interaction between the possible target-related proteins of different constituents in TCM. First, the interaction between different compounds in a single CMM, such as one plant, has to be analysed. Then, the interaction between different CMMs in a TCM formula has to be analysed. Only after we have clarified the mechanism of a multiple-component TCM using proteomics methods can we declare the success of proteomics in the mechanistic study of TCM.
In conclusion, the application of proteomics in TCM studies is still at an early stage. More technologies and modified proteomics strategies should be used in TCM studies. Nevertheless, we believe that, with further development, proteomics could pave a new way to clarify the mechanisms of TCM and benefit both the modernization and the globalization of TCM.
Joint Sino–U.K. Protein Symposium: a Meeting to Celebrate the Centenary of the Biochemical Society: A Biochemical Society Focused Meeting held at Shanghai University, Shanghai, China, 5–7 May 2011. Organized by Tom Blundell (Cambridge, U.K.), Zengyi Chang (Peking University, China), Ian Dransfield (Edinburgh, U.K.), Neil Isaacs (Glasgow, U.K.), Glenn King (University of Queensland, Australia), Sheena Radford (Leeds, U.K.), Zihe Rao (Nankai University, China), Yi-Gong Shi (Tsinghua University, China), Chihchen (Zhizhen) Wang (Institute of Biophysics, Chinese Academy of Sciences, China), Jiarui Wu (Shanghai Institute of Biological Sciences, China) and Xian-En Zhang (Ministry of Science and Technology, China). Edited by Zengyi Chang and Neil Isaacs.
Our work is supported by the National Science and Technology Major Project ‘Key New Drug Creation and Manufacturing Program’, China [grant numbers 2009ZX09102-122, 2009ZX09304-002, 2009ZX09308-005, 2009ZX09311-001 and 2009ZX09502-020].