Integrative analysis reveals methylenetetrahydrofolate dehydrogenase 1-like as an independent shared diagnostic and prognostic biomarker in five different human cancers

Abstract Background: Defects in methylenetetrahydrofolate dehydrogenase 1-like (MTHFD1L) expression have earlier been examined in only a few human cancers. Objectives: Multi-omics profiling of MTHFD1L as a shared biomarker in distinct subtypes of human cancers. Methods: In the current study, for the multi-omics analysis of MTHFD1L in 24 major subtypes of human cancers, a comprehensive in silico approach was adopted to mine different open access online databases including UALCAN, Kaplan–Meier (KM) plotter, LOGpc, GEPIA, Human Protein Atlas (HPA), Gene Expression across Normal and Tumor tissue (GENT2), MEXPRESS, cBioportal, STRING, DAVID, TIMER, and Comparative Toxicogenomics Database (CTD). Results: We noticed that the expression of MTHFD1L was significantly higher in all the analyzed 24 subtypes of human cancers as compared with the normal controls. Moreover, MTHDF1L overexpression was also found to be significantly associated with the reduced overall survival (OS) duration of Bladder urothelial cancer (BLCA), Head and neck cancer (HNSC), Kidney renal papillary cell carcinoma (KIRP), Lung adenocarcinoma (LUAD), and Uterine corpus endometrial carcinoma (UCEC). This implies that MTHFD1L plays a significant role in the development and progression of these cancers. We further noticed that MTHFD1L was also overexpressed in BLCA, HNSC, KIRP, LUAD, and UCEC patients of different clinicopathological features. Pathways enrichment analysis revealed the involvement of MTHFD1L-associated genes in five diverse pathways. We also explored few interesting correlations between MTHFD1L expression and its promoter methylation, genetic alterations, CNVs, and between CD8+ T immune cells level. Conclusion: In conclusion, our results elucidated that MTHFD1L can serve as a shared diagnostic and prognostic biomarker in BLCA, HNSC, KIRP, LUAD, and UCEC patients of different clinicopathological features.


Introduction
Cancer is the second most common cause of mortality behind cardiovascular diseases worldwide [1]. Despite improvements in cancer detection and treatment methods, cancer incidence continues to rise rapidly, resulting in huge economic losses and premature deaths throughout the world [2]. In accordance with the World Health Organization (WHO) Cancer Stats 2020, approximately 18 million new cancer cases and 9.6 million deaths were estimated around the globe [3].
Many well-known cancer-causing risk factors include tobacco smoking, alcohol consumption, hepatitis B or C infection, diabetes, and obesity [2,4,5]. When someone tests positive for cancer, its prognosis mainly depends on the stage of the tumor at the time of detection. The early diagnosis of cancer is useful to obtain the more appropriate treatment options [6]. However, prior to our knowledge, there is a dearth of effective cancer-associated biomarkers that could commonly use to detect multiple cancer subtypes together as a shared biomarker and aid in the targeted therapy. Metabolic disorders are one of the key cancer hallmarks because of their involvement in tumor growth and invasion [7]. Mainly, cancerous cells depend upon the folate cycle to grow further. The folate cycle is responsible for fulfilling numerous cancer-associated nutritional requirements. A cytoplasmic enzyme, known as Methylenetetrahydrofolate dehydrogenase 1-like (MTHFD1L) participates in the formation of tetrahydrofolate (THF) within mitochondria [8], which is further involved in the folic acid cycle to synthesize formate [9]. Folic acid deficiency can lead to a variety of diseases including immune system dysfunction and cancer [10]. In 1949, Farber and Farber were the first to observe the tumor-associated role of folic acid in leukemic cells [11]. Their keen observation of folate's cancer-associated role has laid the foundation of chemotherapeutic agents development as a single or in combination with treatment options for cancer patients [11]. A recent report revealed that MTHFD1L plays a vital role in bladder cancer by increasing cell proliferation, invasion and metastasis [12], however, their results were not validated further. Another study has also reported the role of MTHFD1L in colorectal cancer (CRC) development and progression. In view of the results of this study, MTHFD1L expression blocking reduces the CRC cells growth and development, thus providing MTHFD1L as a new avenue in CRC treatment to be targeted with different inhibitors [9]. Furthermore, MTHFD1L was also found to be associated with esophageal squamous cell carcinoma (ESCA) [13] and tongue squamous cell carcinoma [14]. Despite the significance of MTHFD1L in bladder cancer, CRC, ESCA and squamous cell carcinoma, knowledge regarding its role in other subtypes of human cancer is still unknown.
In the present work, to uncover the MTHFD1L expression variations across different cancers, we analyzed a large tumor sample size paired with controls from different reliable online databases through a comprehensive bioinformatics approach. The findings of the present study have helped us to evaluate the crucial role of MTHFD1L as a shared biomarker in predicting the prognosis, performing the diagnosis, and initiating the development of Bladder urothelial cancer (BLCA), Head and neck cancer (HNSC), Kidney renal papillary cell carcinoma (KIRP), Lung adenocarcinoma (LUAD), and Uterine corpus endometrial carcinoma (UCEC).

UALCAN
UALCAN (http://ualcan.path.uab.edu) is an effective online tool for multi-omics data analysis of any gene(s) of interest across 31 cancer subtypes. This tool obtained multi-omics data from the TCGA projects [15] and also provide correlations between gene expression and different relevant clinicopathological features. In our study, this tool was selected to determine the differential expression of MTHFD1L in various cancer subtypes relative to normal controls. For this purpose, we initially entered MTHFDL1 gene name in the search box of UALCAN, then chose the pan-cancer analysis option to obtain the graph illustrating MTHFDL1 expression across multiple cancers in the form of box plots. In the obtained graph, the transcription expression level of MTHFD1L was quantified as transcript per million (TPM) reads, and a cutoff of P-value was selected as <0.05 in the Student's t test.

Kaplan-Meier plotter, GEPIA, UALCAN, and LOGpc tools
The prognostic values (OS duration) of MTHFD1L across different cancers was analyzed via Kaplan-Meier (KM) plotter (https://kmplot.com/analysis/) and GEPIA (http://gepia.cancer-pku.cn/) tools [16,17]. Moreover, UALCAN (http://ualcan.path.uab.edu) [15] was used to evaluate MTHFD1L prognostic values in cancer patients of different stages, races and genders, while LOGpc (https://bio.tools) [18] has helped to explore MTHFD1L prognostic values in cancer patients of different age groups. To do so, we entered the name of the MTHFDL1 gene into the search boxes of each tool, then adjusted the survival analysis type to OS and clinical parameters to cancer stage, race, gender in case of UALCAN, while age in the case of LOGpc. We then plotted the KM curves using default settings. During the analysis, cancer specimens were categorized into two groups by median expression level (high v/s low expression level) and a P-value was selected as <0.05.

Gene Expression across Normal and Tumor tissue database
For further validating the MTHFD1L transcription expression in tumor tissues, we analyzed NCBI GEO datasets of independent cohorts of distinct cancer patients using Gene Expression across Normal and Tumor tissue (GENT2) database with default settings (http://gent2.appex.kr/) [19]. In the analysis, a cutoff of P-value was selected as <0.05 in the Student's t test.

Data mining through Human Protein Atlas database
The Human Protein Atlas (HPA) database (http://www.proteinatlas.org) provides immunohistochemistry (IHC)-based proteomics expression data for any gene(s) of interest across more than 20 major subtypes of human cancers [20]. Via this database, researchers can easily identify the differentially expressed protein in the tissues of a specific cancer relative to controls. In this work, the IHC images of MTHFD1L proteomics expression in distinct human cancer tissues relative to normal tissues were visualized via HPA. The observed protein expression level was graded as not detected, low, medium and high, based on the intensity of staining and fraction of the stained cells. A P-value of <0.05 was chosen as significant.

MEXPRESS
MEXPRESS (https://mexpress.be/) web tool is designed for TCGA gene expression, promoter methylation, and a Pearson correlation analysis between these parameters for any gene(s) of interest in distinct cancer subtypes [21]. In our study, we used this tool to find out the correlation among expression and promoter methylation level of MTHFD1L in distinct cancer subtypes with default settings. A cutoff of P-value was selected as <0.05 in the Pearson correlation test.

The cBioportal database
The cBio000Portal (http://cbioportal.org) open-access database is dedicated to analyzing multi-omics cancer data from TCGA projects, which collectively includes more than 715 datasets and 86733 cancer samples [22]. In this study, different TCGA PanCancer Atlas datasets were chosen to examine MTHFD1L genomic alterations (mutations and copy number variations) across distinct human cancer subtypes. The search terms included mutations and putative copy number for MTHFD1L. The OncoPrint tab on the cBioPortal results interface has provided the overview of genetic mutations and CNVs above every sample bar.

Predicted protein-protein interaction network and pathways of MTHFD1L
In the current study, STRING tool (https://string-db.org/cgi/input?sessionId=befKymonvu2t&input page show search=on) [23], which is dedicated to predicting protein-protein interactions (PPIs) of any gene(s) of interest, was used to obtain the PPI network of MTHFD1L-associated genes with default settings. Later, for the pathway enrichment of MTHFD1L-enriched genes, DAVID (http://david.ncifcrf.gov/summary.jsp) [24] was utilized with the default settings. A P-value of <0.05 was considered as significant in the analysis.

Infiltrating level of CD8+ T in relationship with MTHFD1L across different human cancers
TIMER database (https://cistrome.shinyapps.io/timer/) systematically used microarray-based expression values for computing a Pearson correlation between immune cell infiltrates and the expression level of any gene(s) of interest across diverse cancer subtypes from TCGA data (10897 samples across 32 cancer types) [25]. In our study, CD8+ T immune cell infiltrates of MTHFD1L in different cancers were calculated using TIMER, and a P-value of <0.05 was considered as significant in the Pearson correlation test.

Chemotherapeutic drugs associated with MTHFD1L
MTHFD1L-associated chemotherapeutic drugs were obtained from the Comparative Toxicogenomics Database (CTD) [26] database for constructing a gene-drug interaction network via Cytoscape. This database can also be used to analyze gene-chemistry, gene-disease, and chemical-disease interaction networks. In our study, a constructed MTHFD1L gene-drug interaction network includes information on chemotherapeutic drugs that can reduce or enhance the expression level of MTHFD1L.

MTHFD1L expression in human cancers and normal tissues
To explore the MTHFD1L mRNA expression levels in 24 major subtypes of human cancer tissues paired with normal control, TCGA expression data were analyzed using pan-cancer analysis via UALCAN platform. Results of the analysis showed a significant (P<0.05) up-regulation of MTHFD1L in all 24 major cancer tissues relative to controls ( Figure  1).

The prognostic values of MTHFD1L in human cancers calculated via KM plotter and GEPIA
Next, to check whether MTHFD1L higher expression has any effect on the overall survival (OS) duration of the 24 types of cancer patients or not, we used KM plotter and GEPIA tools for OS analysis. In view of both resources collective results, a higher MTHFD1L expression was found to be associated with the reduced OS duration of the BLCA, ESCA, HNSC, KIRP, LUAD, and UCEC patients only and not with any other subtype ( Figure 2). Altogether, our data suggested that MTHFD1L might have a significant contribution to the development and progression of BLCA, HNSC, KIRP, LUAD, and UCEC, thus the next part of our study will mainly focus on the unique role of MTHFD1L in these five types of human cancers.

Clinicopathological feature-specific mRNA expression of MTHFD1L
To detect the clinicopathological feature-specific expression of MTHFD1L in BLCA, HNSC, KIRP, LUAD, and UCEC, we used the UALCAN platform. In view of the results of our analysis, a significant (P<0.05) overexpression of MTHFD1L was also observed in BLCA, HNSC, KIRP, LUAD, and UCEC patients of different clinicopathological features including different cancer stages, race, gender, and age (Figures 3-7). A clinical parameter-wise classification of the BLCA, HNSC, KIRP, LUAD, and UCEC cohorts can be seen in Tables 1-3.

MTHFD1L has good prognostic significance in BLCA, HNSC, KIRP, LUAD, and UCEC patients regardless of different clinicopathological features
Via OS analysis using KM plotter and LOGpc databases, as shown in Figure 8, MTHFD1L up-regulation was also found to be associated with reduced OS of the BLCA, HNSC, KIRP, LUAD, and UCEC patients of different clinicopathological features including different cancer stages, races, genders and ages, however, most of the correlations were insignificant due to the small sample size used in the analysis. All in all, MTHFD1L overexpression is prognostically relevant to BLCA, HNSC, KIRP, LUAD, and UCEC patients regardless of different clinicopathological features, and can be a promising marker gene for predicting their OS, however, additional experiments are required to be done before clinical application.

Figure 12. Evaluation of correlation between expression and promoter methylation level of MTFD1L in LUAD and UCEC (A) In LUAD and (B) in UCEC.
A negative sign indicates the negative correlation between MTHFD1L expression and its promoter methylation using a specific probe at a specific CpG island. A P-value (<0.05) was considered as statistically significant.

Validating MTHFD1L overexpression using new cohorts
We re-analyzed the MTDHFD1L expression using new cohorts of BLCA, HNSC, KIRP, LUAD, and UCEC via GENT2 platform. Results of the analysis were in agreement with the previous results and also further verified the significant (P>0.05) overexpression of MTHFD1L in BLCA, HNSC, KIRP, LUAD, and UCEC patients relative to controls ( Figure  9). Information on the BLCA, HNSC, KIRP, LUAD, and UCEC datasets utilized in this analysis is given in Table 4.

Protein expression level validation of MTHFD1L in BLCA, HNSC, KIRP, LUAD, and UCEC
Next, for protein level expression validation of MTHFD1L, we analyzed IHC results of MTHFD1L through the HPA database. The obtained immunohistochemical images have shown the significant (P>0.05) higher expression of MTHFD1L protein in BLCA, HNSC, KIRP, LUAD, and UCEC specimens relative to the normal bladder, head and neck, kidney, lung, and endometrial tissues, which have shown either low (in case of bladder, head and neck, lung, and endometrial tissues) or medium expression (in case of normal kidney tissues) of MTHFD1L ( Figure 10). Collectively, our results confirmed that MTHFD1L also overexpressed at the protein level in BLCA, HNSC, KIRP, LUAD, and UCEC as compared with the normal controls.

Exploring promoter methylation of MTHFD1L
It is earlier known that the dysregulation of different tumor suppressor genes due to aberrant methylation of promoter regions leads to cancer development [27]. To find whether promoter methylation has any impact on MTHFD1L expression or not in BLCA, HNSC, KIRP, LUAD, and UCEC, we have analyzed the Pearson correlation between its promoter methylation and mRNA expression level in BLCA, HNSC, KIRP, LUAD, and UCEC using MEXPRESS. In view of our results, the mRNA expression level of MTHFDL1 was significantly (P>0.05) negatively correlated with its

Figure 15. Evaluation of correlations between MTHFD1L expression and CD8+ T immune cells infiltration levels in BLCA, HNSC, KIRP, LUAD, and UCEC via TIMER database
A P-value (<0.05) was considered as statistically significant.

Predicted PPI network and pathways of MTHFD1L
STRING was used in our study to identify the MTHFD1L-enriched genes. The resultant PPI network has highlighted that MTHFD1L was linked with 25 other genes ( Figure 13A). All these genes were further subjected to pathway enrichment using the DAVID tool. In view of the analysis results, it was observed that few MTHFD1L-associated genes were significantly (P>0.05) involved in five diverse pathways including Pyrimidine metabolism, Purine metabolism, Metabolic Pathways, Biosynthesis of antibiotics, and One carbon pool by folate ( Figure 14; Table 5).

MTHFD1L overexpression and infiltrating levels of CD8+ T cells in BLCA, HNSC, KIRP, LUAD, and UCEC patients
CD8+ T immune cells play a key role in treating cancer patients through immunotherapy [28]. To further analyze whether the change in MTHFD1L expression exerts any effect on CD8+ T immune infiltration or not, we investigated the correlation of MTHFD1L expression with infiltration levels of CD8+ T immune cells in BLCA, HNSC, KIRP, LUAD, and UCEC using the TIMER database. As a result, we observed that MTHFD1L expression has shown a significant (P>0.05) negative correlation with CD8+ T immune cells level in BLCA, HNSC, KIRP, LUAD, and UCEC ( Figure 15). Collectively, these findings suggested MTHFDL1 as a possible regulator of CD8+ T immune cells level in BLCA, HNSC, KIRP, LUAD, and UCEC.

Chemotherapeutic drugs associated with MTHFD1L
Different experimentally validated chemotherapeutic drugs associated with MTHFD1L expression were obtained from the CTD database to develop a gene-drug interaction network. Through the constructed network, it was observed that MTHFD1L expression can be regulated by a variety of drugs, such as, cisplatin and vorinostat can up-regulated the expression level of MTHFD1L while tretinon and methylmercuric chloride can suppress MTHFD1L expression level ( Figure 16).

Discussion
Cancer is one of the most killer malignancies in the world that kill millions each year [29]. A lot of literature has confirmed several molecular biomarkers for detection, prediction of prognosis, and treatment of cancer patients [30]. However, still, these biomarkers are not efficient and have various limitations. Thus, there is an urgent need to identify shared biomarkers of cancer that could be used for detection, prediction of prognosis, and treatment of cancer patients without any serious complications.
Earlier studies have reported that MTHFD1L plays a key role in tumor development by dysregulating the folic acid cycle [31,32]. The higher expression of MTHFD1L has been noted in multiple cancers including CRC [9], ESCA [13], and tongue squamous cell carcinoma [14]. In the current study, we explored MTHFD1L expression as a shared potential cancer biomarker in 24 major human cancer subtypes.
We detected the remarkably higher MTHFD1L overexpression in 24 major subtypes of human cancers matched with controls. Also, MTHFD1L overexpression was found to remarkably influence (decrease) the OS duration of BLCA, HNSC, KIRP, LUAD, and UCEC patients only. Collectively, these findings highlighted that MTHFD1L overexpression is involved in the initiation and progression of BLCA, HNSC, KIRP, LUAD, and UCEC; therefore, current research is mainly focusing on these five cancer subtypes. Next, we also re-examined MTHFD1L expression individually in BLCA, HNSC, KIRP, LUAD, and UCEC patients stratified by different clinicopathological parameters. As a result, a significant up-regulation in MTHFD1L expression was also detected in BLCA, HNSC, KIRP, LUAD, and UCEC patients stratified by different cancer stages, races, genders, and ages. Therefore, it is speculated that MTHFD1L might up-regulate in BLCA, HNSC, KIRP, LUAD, and UCEC patients regardless of heterogeneity-barrier.
In terms of promoter methylation impact of MTHFD1L expression, our research has explored the significant negative correlation among expression and promoter methylation levels of the MTHFD1L in BLCA, HNSC, KIRP, LUAD, and UCEC patients. Therefore, ultimately, we speculated that promoter hypomethylation of MTHFD1L may be one of the key factors of its overexpression in BLCA, HNSC, KIRP, LUAD, and UCEC.
Genomic alterations (mutations and CNVs) are known as the key regulator of gene expression dysregulation in cancers [33,34]. Here in this research, we noticed very small percentages of the MTHFD1L genomic abnormalities and CNVs (4, 1.6, 0.3, and 7%) in the analyzed BLCA, HNSC, LUAD, and UCEC samples, respectively. While no MTHFD1L-associated genetic alterations and CNV were found in KIRP. All in all, these findings highlighted that genomic abnormalities have the least involvement in the dysregulation of MTHFD1L, however, further detailed work is required to confirm these results.
According to the latest research, the dysregulation of a few important genes may abnormally regulate the tumor microenvironment by altering the immune cells infiltration [47,48]. Keeping this in view, we carried out the correlation analysis among MTHFD1L expression and CD8+ T immune cells infiltration in BLCA, HNSC, KIRP, LUAD, and UCEC. Results highlighted that there is a significant positive correlation between MTHFD1L expression and CD8+ T immune cells level in BLCA and KIRP while a significant negative correlation in HNSC, LUAD, and UCEC. Collectively, this scenario indicates that CD8+ T immune cells may also exert a significant effect on BLCA, HNSC, KIRP, LUAD, and UCEC tumorigenesis via MTHFD1L, however, this role of MTHFD1L ought to be investigated further.
The PPI network of MTHFD1L revealed that it directly interacts with the 25 different genes and a few of them were involved in five diverse pathways including Pyrimidine metabolism, Purine metabolism, Metabolic Pathways, Biosynthesis of antibiotics, and One carbon pool by folate (Table 5). Additionally, we also identified a few potential drugs that could be useful in the treatment of BLCA, HNSC, KIRP, LUAD, and UCEC by regulating the MTHFD1L expression ( Figure 16).

Conclusion
Overexpression of MTHFD1L is associated with tumorigenesis and poor survival in BLCA, HNSC, KIRP, LUAD, and UCEC patients of different clinical parameters. Modulating MTHFD1L expression using different chemotherapeutic drugs may be a promising future treatment option for these cancer patients. However, additional work needed to be done on a large scale prior to clinical implications.

Data Availability
The datasets analyzed in the current study are available at: http://ualcan.path.uab.edu/cgi-bin/ualcan-res.pl.