A biochemical comparison of the lung, colonic, brain, renal, and ovarian cancer cell lines using 1H-NMR spectroscopy

Abstract Cancer cell lines are often used for cancer research. However, heterogeneity of cell lines induced by continuous genetic instability can hinder the reproducibility of cancer research. Molecular profiling approaches including transcriptomics, chromatin modification profiling, and proteomics are used to evaluate the phenotypic characteristics of cell lines. However, these do not reflect the metabolic function at the molecular level. Metabolic phenotyping is a powerful tool to profile the biochemical composition of cell lines. In the present study, 1H-NMR spectroscopy-based metabolic phenotyping was used to detect metabolic differences among five cancer cell lines, namely, lung (A549), colonic (Caco2), brain (H4), renal (RCC), and ovarian (SKOV3) cancer cells. The concentrations of choline, creatine, lactate, alanine, fumarate, and succinate varied remarkably among different cell types. Significantly higher intracellular concentrations of glutathione, myo-inositol, and phosphocholine were found in the SKOV3 cell line relative to other cell lines. The concentration of glutamate was higher in both SKOV3 and RCC cells compared with other cell lines. In the cell culture media, isopropanol was highest in RCC media, followed by A549 and SKOV3 media, while acetone was highest in A549, followed by RCC and SKOV3. These results demonstrate that the 1H-NMR-based metabolic phenotyping approach allows us to characterize specific metabolic signatures of cancer cell lines and provides phenotypic information on cellular metabolism.


Introduction
Cancer is the second leading cause of global mortality, with approximately 9.6 million deaths in 2018. Lung and colorectal cancers are among the most common causes of cancer mortality, accounting for 18% and 8.9% of total cancer deaths, respectively. In 2018, lung, colorectal, kidney, brain, and ovarian cancers contributed 12.3%, 10.6%, 2.4%, 1.7%, and 1.7% of the total number of cancer cases, respectively [25]. Although brain cancers accounted for only 1.7% of total cancer cases in 2018, they are the second most common cancers in children, contributing 26% of childhood cancers [1,2].
Cancer cell lines have been widely used as the workhorse of cancer research to explore cellular and molecular mechanisms, responses to therapies, and drug discovery and development. However, it has recently been reported that single cell-derived clones show continuous instability, leading to heterogeneity of the cell lines, altered drug responses, and dysregulation of xenobiotic metabolism [3]. This suggests that cell line-based research should document the extent, origins, and consequences of genetic variation in the cell lines to improve the reproducibility of the cancer
spectroscopy (COSY), 1H-1H total correlation spectroscopy (TOCSY), and heteronuclear single quantum coherence spectroscopy (HSQC) were acquired on the selected cell and media samples to aid in metabolite identification. 1H-NMR spectra obtained from cell extracts and media samples were phased, referenced to TSP at δ 1H 0.00, and baseline-corrected in TopSpin 4.0.3 (Bruker Corporation, Rheinstetten, Germany). MATLAB R2018a (MathWorks, Cambridge, U.K.) was used to import and process the NMR spectral data. Water peak regions of the cell extract (δ 1H 4.74-4.85) and cell media (δ 1H 4.7-5) spectra were deleted to minimize the effect of the disordered baseline. Regions containing only noise in the cell extract (δ 1H 0-0.5, 9.5-10) and cell media (δ 1H 0-0.3) spectra were removed. Two cellular extract samples from the 5 million H4 cell group and one media sample from the 5 million RCC cell group were excluded due to extremely low signal intensities. The remaining spectral data from 1, 5, and 10 million cells were normalized separately using a probabilistic quotient normalization method [7].
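The region removal and probabilistic quotient normalization described above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code: the function names `drop_regions` and `pqn_normalize` are hypothetical, the initial total-area scaling and the use of the median spectrum as the reference are common choices assumed here, and intensities are assumed to be positive.

```python
from statistics import median

def drop_regions(ppm, spectrum, regions):
    """Remove points whose chemical shift lies inside any (low, high) region."""
    kept = [(p, y) for p, y in zip(ppm, spectrum)
            if not any(low <= p <= high for low, high in regions)]
    return [p for p, _ in kept], [y for _, y in kept]

def pqn_normalize(spectra):
    """Probabilistic quotient normalization of a list of equal-length spectra."""
    # Step 1 (assumed): scale each spectrum to unit total area.
    scaled = [[y / sum(row) for y in row] for row in spectra]
    # Step 2 (assumed): the reference is the point-wise median spectrum.
    n_points = len(scaled[0])
    reference = [median(row[j] for row in scaled) for j in range(n_points)]
    normalized = []
    for row in scaled:
        # Step 3: quotient of each point against the reference.
        quotients = [row[j] / reference[j]
                     for j in range(n_points) if reference[j] > 0]
        # Step 4: the median quotient estimates the dilution factor.
        factor = median(quotients)
        normalized.append([y / factor for y in row])
    return normalized
```

For example, `drop_regions(ppm, spectrum, [(4.74, 4.85), (0.0, 0.5), (9.5, 10.0)])` would remove the water and noise regions quoted for the cell extracts, and two spectra that differ only by a constant dilution factor become identical after `pqn_normalize`.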
Principal component analysis (PCA) and orthogonal projection to latent structures-discriminant analysis (OPLS-DA) were carried out on the unit variance-scaled datasets in SIMCA-15 (Umetrics, Sartorius Stedim Biotech) and MATLAB (The MathWorks, Inc.) software. PCA, an unsupervised method, reduces the data to several principal components, which allows the variation in the data to be visualized; in other words, it describes intrinsic similarities or differences in the data [8]. In contrast, OPLS-DA is a supervised method, which requires sample class information (e.g. control vs. intervention) and shows the metabolic differences between the classes. In an OPLS-DA model, R²X and R²Y represent the variation explained by the model in the X and Y matrices, respectively. Q²Y represents the predictability of the model, and a good model usually has a Q²Y > 0.5 [8]. A permutation test of each OPLS-DA model was also carried out to generate a P value. Models with P < 0.05 were considered valid OPLS-DA models.
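The logic of the permutation test can be sketched as follows. This is an illustration only, not the SIMCA procedure: in the study the permuted quantity comes from the OPLS-DA model, whereas the statistic used here (`class_separation`, the absolute difference between two class means of a single variable) is a simple stand-in chosen so that the example runs with the Python standard library. All function names and parameters are hypothetical.

```python
import random

def class_separation(values, labels):
    """Stand-in statistic: absolute difference between the two class means."""
    a = [v for v, lab in zip(values, labels) if lab == 0]
    b = [v for v, lab in zip(values, labels) if lab == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))

def permutation_p_value(values, labels, statistic=class_separation,
                        n_perm=1000, seed=0):
    """Estimate a P value by shuffling the class labels n_perm times."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations that separate the classes at least as well
        # as the true labelling does.
        if statistic(values, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so P is never zero.
    return (hits + 1) / (n_perm + 1)
```

A model would be treated as valid under the criterion above when the returned value is below 0.05; when the two classes are indistinguishable, the value approaches 1.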

Results
Characterizing the biochemical composition of A549, Caco2, H4, RCC, and SKOV3 cell lines
The median 1H-NMR spectra of the cell extracts obtained from 1, 5, and 10 million cells per cell type are shown in Supplementary Figure S1. Visual inspection of the spectra shows that peak intensities increase proportionally as the number of cells increases from 1 to 10 million. The metabolite assignments from cell extracts and media samples are listed in Supplementary Table S1. With 64 scans, five million cells produced better-quality spectra than one million cells. A total of 34 metabolites were identified from these cellular extracts and confirmed using 2D 1H-1H COSY and 1H-1H TOCSY NMR spectra (Supplementary Figure S2); these metabolites included acetate, alanine, asparagine, aspartate, choline, creatine, formate, fumarate, glutamate, glutamine, glutathione, glycerol phosphocholine, glycine, histidine, hypoxanthine, isoleucine, lactate, leucine, lysine, methanol, methionine, myo-inositol, phenylalanine, phosphocholine, serine, succinate, taurine, threonine, trehalose, tryptophan, tyrosine, uracil, uridine, and valine.
The metabolic profiles obtained from 5 and 10 million cells were analyzed using PCA with three principal components (PC). The PCA scores plots of PC1 versus PC2 derived from both 5 and 10 million cells (Figure 1) show a grouping pattern based on the cell types, except for H4. This grouping pattern is clearer in the scores plot of PC1 versus PC2 for 10 million cells, while for 5 million cells it is clearer in the scores plot of PC2 versus PC3 (Supplementary Figure S3B).
Pair-wise comparisons between different cell types were carried out using OPLS-DA with one predictive component and one orthogonal component. The R²X, R²Y, Q²Y, and permutation P values of these OPLS-DA models are summarized in Table 1. The loading plots from the significant OPLS-DA models and the metabolite changes observed in the pair-wise comparisons are shown in Figure 2 and Table 2. Peak integrals of 15 metabolites from 10 million cells of A549, Caco2, H4, RCC, and SKOV3 cell extracts are presented in Figure 3.
Statistically significant models were observed in the vast majority of pair-wise comparisons, except for A549 versus Caco2 (10 million) and A549 versus H4 (5 million). The model of A549 versus Caco2 from 5 million cell extracts was statistically significant, driven by higher concentrations of phosphocholine and lower concentrations of glycine in Caco2 cells. However, the model of A549 versus RCC was significant for 10 million rather than 5 million cell extracts, corresponding to higher concentrations of formate, phosphocholine, and choline in RCC. The biochemical composition of A549 cells was also significantly different from that of SKOV3 cells. In the 5 million cell extracts model, the concentrations of phosphocholine, myo-inositol, glutathione, and glutamate were higher in SKOV3 compared with A549 cells. Additional metabolic differences, including higher concentrations of formate, uridine, lactate, and creatine, and lower concentrations of choline, acetate, and isoleucine, were observed in SKOV3 with 10 million cells compared with A549. The model of Caco2 versus H4 from 10 million cell extracts was statistically significant, with higher concentrations of threonine, valine, and glycine, and lower concentrations of succinate in H4. The concentrations of formate, lactate, glutamate, glycine, myo-inositol, taurine, and glycerol phosphocholine were higher in both 5 and 10 million cell extracts of Caco2 compared with RCC. However, lower levels of uridine and phosphocholine and a higher level of choline were only observed in the 5 and 10 million cell models, respectively. Higher concentrations of glutathione, myo-inositol, creatine, and lactate, and lower concentrations of choline distinguished SKOV3 cells from Caco2 cells (both 5 and 10 million cells).
While 5 million RCC cells showed higher concentrations of lactate, taurine, and glutamate in contrast with H4 cells, additional metabolites such as glycerol phosphocholine, phosphocholine, choline, succinate, threonine, serine, valine, and isoleucine were found to differ in concentration in 10 million cells. Higher concentrations of uridine, glutathione, phosphocholine, lactate, myo-inositol, creatine, and succinate were observed in SKOV3 cells compared with H4 and RCC, whereas choline was found to be higher in RCC in comparison with SKOV3 (Figure 2).

Metabolic characterization of A549, Caco2, H4, RCC, and SKOV3 cell culture media
High-intensity peaks present in the media samples were assigned based on 2D NMR spectra. These include acetate, acetone, alanine, citrate, formate, glucose, glutamate, glutamine, glycine, histidine, isoleucine, isopropanol, lactate, leucine, lysine, phenylalanine, pyroglutamate, pyruvate, succinate, threonine, tryptophan, tyrosine, and valine (Supplementary Figure S4). 1H-NMR spectral data from the cell culture media of these cancer cell lines were analyzed using PCA. As expected, the grouping pattern observed in the scores plots of media samples is based on the cell types (Supplementary Figure S5A), unlike the cell extracts, where PC1 is dominated by the number of cells (Supplementary Figure S5B). Similar grouping patterns were observed in the PCA scores plots of all media samples and of the 5 or 10 million cell culture media samples alone. There is a clear separation along PC1 between H4 and the other cell types, while a separation between Caco2 and the rest was observed along PC2 (Supplementary Figure S5A,C and D). Given that both H4 and Caco2 were cultured using DMEM and the other three cell lines were cultured using RPMI, the major variation revealed by PCA was likely due to the metabolic behavior of the H4 cells rather than the compositional differences between the two media. Additional PCA analyses were carried out for each type of media to compare the metabolic contribution of the cells to the biochemical composition of the media. The PCA scores plots based on the 5 million cell culture samples (Figure 4) show clear clustering based on the cell types. A similar pattern was also observed with 10 million cell culture media (Supplementary Figure S6). OPLS-DA models of media spectral data were calculated between different types of cells cultured in the same media, and significant models were obtained from all comparisons (Table 3). OPLS-DA loadings plots and the metabolite changes observed in all pair-wise comparisons are shown in Figure 5 and Table 4.
The peak integrals of 12 metabolites, identified from spectra of media samples from cultures of 10 million A549, Caco2, H4, RCC, and SKOV3 cells, are presented in Figure 6.
Lower concentrations of amino acids (e.g. phenylalanine, tyrosine, glycine, glutamine, alanine, valine, isoleucine, and leucine), glucose, and tricarboxylic acid (TCA) cycle intermediates (e.g. succinate and pyruvate) were observed in A549 compared with RCC or SKOV3, together with higher concentrations of lactate. Additionally, a higher concentration of alanine and a lower concentration of acetone were found in the 10 million cell culture media (RPMI) of SKOV3, in contrast with A549. Metabolites in the SKOV3 media that distinguish it from RCC include higher concentrations of glucose and glutamine, and lower concentrations of lactate, isopropanol, pyroglutamate, succinate, acetone, acetate, and alanine.
A lower concentration of glucose was observed only in the media of 5 million H4 cells cultured in DMEM compared with Caco2, whereas increased concentrations of several metabolites were found in both the 5 and 10 million models; these included formate, phenylalanine, tyrosine, threonine, lactate, isopropanol, alanine, glutamine, glycine, pyruvate, acetone, acetate, valine, isoleucine, and leucine (Figure 5).

Discussion
The present study used 1H-NMR spectroscopy-based metabolic phenotyping and demonstrated significant metabolic differences among five cell lines, namely, lung (A549), colonic (Caco2), brain (H4), renal (RCC), and ovarian (SKOV3) cancer cells. The intra-group variation was higher in the H4 cell line than in the others, particularly when cultured as 10 million cells. It is likely that H4 cell growth is more sensitive to environmental conditions, and that the cell growth rate and metabolic behavior may be affected by subtle changes in culture conditions. One of the most profound findings with regard to metabolic changes was the significantly higher intracellular concentrations of glutathione, myo-inositol, and phosphocholine in SKOV3 compared with other cell lines. Glutathione (GSH) is the most abundant non-protein thiol, which functions as an antioxidant and a redox regulator. Stem cells have been found to require high levels of GSH to maintain stem cell function and migration capabilities in vitro [9]. Similarly, GSH plays an important role in cancer progression and resistance to therapy. Indeed, it is reported to be associated with chemoresistance to platinum salts, which are one of the main treatments for ovarian cancer [10]. Myo-inositol and phosphocholine (ChoP) have been reported in SKOV3 cells, and their cellular concentrations were reduced after treatment with Ptac2S, a novel anticancer agent [11]. Glutamate was found to be higher in SKOV3 and RCC cells compared with other cell lines. Glutamate is an amino acid that plays a key role in energy and carbon metabolism and in the synthesis of amino acids and nucleotides in all cells. In cancer cells, glucose-based glycolysis and glutamate-based glutaminolysis are the two major routes of ATP production. High levels of glutamate can rescue glioma cells from death [12].
Another key finding was the high abundance of amino acids in the H4 cells; metabolites present at higher levels in H4 than in other cell lines include the amino acids serine, methionine, threonine, valine, and glycine, as well as acetate. The function of mitochondria is largely dependent on the pathway from serine to formate, which is then released into the cytoplasm to contribute to nucleotide synthesis [13]. As the level of serine was higher in the H4 cell line compared with other cell lines, the mitochondrial function of H4 cells may be disturbed. Amino acids and acetate in cancers can be used as nutritional support for protein synthesis and lipid metabolism, respectively. Threonine was reported to be responsible for the Akt and ERK signalling pathways in breast cancer [14].
In our study, the concentrations of choline, creatine, lactate, alanine, fumarate, and succinate varied significantly among different cell types. The ovarian cancer cell line exhibited the highest levels of alanine, lactate, TCA cycle intermediates (e.g. succinate, fumarate), and methanol, and the lowest levels of choline, whereas the lung cancer cell line showed the lowest levels of lactate and alanine. Lactate has been reported to be elevated in cancerous cells due to lactic acidosis and glucose deprivation [15], while glutamine and alanine metabolism may be altered in breast cancer [16]. Owing to the 'Warburg effect', cancer cells (especially in cancerous tumors) are more likely to shift from oxidative phosphorylation to glycolysis, which increases the level of lactate and decreases the level of TCA cycle intermediates [17]. Alanine can be a fuel source for the TCA cycle; increased alanine levels could indicate decreased activity of the TCA cycle, which could be attributed to the stronger Warburg effect observed in SKOV3 cells [18]. Choline may affect the progression of cancer through one-carbon metabolism. Therefore, the higher malignancy of SKOV3 cells could be due to accelerated one-carbon metabolism and thus decreased levels of choline [19]. The media concentration of isopropanol was found to be the highest in RCC, followed by A549 and SKOV3, while acetone was the highest in A549, followed by RCC and SKOV3, together with a higher level in H4 compared with Caco2. Interestingly, isopropanol and acetone have been deemed potential biomarkers in a series of diseases including cancer [20,21]. There is a reversible reaction between acetone and isopropanol under the action of alcohol dehydrogenase [22]. It was reported that under starvation and a ketogenic diet, ketone bodies such as acetone are produced [23]. Additionally, higher concentrations of breath acetone were also reported in lung cancer patients, which is in line with our previous findings of the highest acetone production in lung cancer cells [21].
Table notes: For each model (e.g. A vs. B), "+" indicates a higher correlation in B cells, whereas "-" indicates a higher correlation in A cells. r represents the correlation coefficient values; P represents significance level based on a two-tailed heteroscedastic t-test; q is corrected P values using Benjamini-Hochberg correction. Abbreviations: bs, broad singlet; d, doublets; dd, double of doublets; m, multiplets; n, cell numbers; s, singlet; t, triplets; q, quartets (∼10⁶).
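The Benjamini-Hochberg correction named in the table notes (q values derived from P values) can be written in a few lines. The sketch below is a generic implementation of that standard procedure, not the authors' code; the function name `benjamini_hochberg` is hypothetical and the input is assumed to be a plain list of P values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that the adjusted values
    # never decrease as the raw P values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

Each adjusted value is the raw P value multiplied by the number of tests and divided by its rank, capped at 1 and forced to be monotonic across ranks.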
The concentration of alanine in the culture media was the highest in RCC, followed by SKOV3 and A549. A previous study showed that renal cell carcinoma results in an increased level of alanine in cells, which might be due to downregulated expression of the ALDH6A1 gene [24]. ALDH6A1 encodes methylmalonate semialdehyde dehydrogenase, and a deficiency of this enzyme may therefore explain the increased level of alanine.

Conclusion
Our study showed that 1H-NMR-based metabolic phenotyping can detect the cellular metabolic profiles of five different cancer cell types, namely lung, colonic, brain, renal, and ovarian cancer cells. Their metabolic profiles can likewise be measured in culture media. We conclude that 1H-NMR-based metabolic phenotyping can be used to characterize the cellular metabolism of different cancer cells and to improve our understanding of the metabolism of certain cancers.