Prognostic and immunological role of Fam20C in pan-cancer

Abstract Background: The family with sequence similarity 20-member C (Fam20C) kinase plays important roles in physiopathological processes and is responsible for the majority of the secreted phosphoproteome, including substrates associated with tumor cell migration. However, it remains unclear whether Fam20C plays a role in cancers. Here, we aimed to analyze the expression and prognostic value of Fam20C in pan-cancer and to gain insights into the association between Fam20C and immune infiltration. Methods: We analyzed Fam20C expression patterns and the associations between Fam20C expression levels and prognosis in pan-cancer via the ONCOMINE, TIMER (Tumor Immune Estimation Resource), PrognoScan, GEPIA (Gene Expression Profiling Interactive Analysis), and Kaplan–Meier Plotter databases. The GEPIA and TIMER databases were then used to investigate the relations between Fam20C expression and immune infiltration across different cancer types, especially BLCA (bladder urothelial carcinoma), LGG (brain lower grade glioma), and STAD (stomach adenocarcinoma). Results: Fam20C was differentially expressed between tumors and adjacent normal tissues across many cancers. In general, Fam20C showed a detrimental role in pan-cancer: high expression was associated with poor survival of BLCA, LGG, and STAD patients. Specifically, based on the TCGA (The Cancer Genome Atlas) database, a high expression level of Fam20C was associated with worse prognosis in stages T2–T4 and stages N0–N2 in the cohort of STAD patients. Moreover, Fam20C expression was positively associated with infiltration of CD4+ T cells, macrophages, neutrophils, dendritic cells, and other diverse immune cells in BLCA, LGG, and STAD. Conclusion: Fam20C may serve as a promising prognostic biomarker in pan-cancer and is positively associated with immune infiltrates.


Introduction
Protein kinases are a common means of regulating signal transduction in organisms; they play a crucial role in cell signaling by transferring a phosphate group from adenosine triphosphate (ATP) to target proteins [1,2]. Unsurprisingly, protein phosphorylation is an important mechanism involved in multiple physiological processes within the cell [3,4]. What was perhaps unexpected is that protein phosphorylation also occurs extracellularly, despite the low concentration of ATP in the extracellular environment; phosphoproteomic studies have shown that more than two-thirds of human serum, plasma, and cerebrospinal fluid contain phosphoproteins [5][6][7]. Emerging evidence has revealed that, under physiological conditions, extracellular phosphorylation contributes to blood coagulation, immune cell activation, and the formation of neuronal networks [8][9][10]. On the other hand, there is compelling evidence that exokinase activity is increased in some diseases, such as cancers [11][12][13].

GEPIA
The GEPIA web server (http://gepia.cancer-pku.cn/index.html) provides RNA sequencing expression data for 9736 tumors and 8587 normal samples from the TCGA and GTEx (Genotype-Tissue Expression) projects as an interactive and customizable resource for research. Its functions include differential expression analysis, survival analysis, correlation analysis, profiling plotting, similar gene detection, and dimensionality reduction analysis [36]. The server covers 33 malignant tumors. In the present study, the GEPIA database was used to verify the results obtained from the Oncomine database, and the 'Survival Plots' module was then applied to analyze the prognostic value of Fam20C. Further, through the 'Correlation Analysis' module, we explored the relationship between the expression of the Fam20C gene and the immune gene markers.

TIMER database analysis
For tumor immune research, TIMER (https://cistrome.shinyapps.io/timer/) provides a user-friendly web interface to explore and visualize tumor immunologic and genomics data [37]. It currently includes 10897 samples of 32 cancers from TCGA, together with the abundance of TIICs (tumor-infiltrating immune cells) estimated from gene expression profiles by a deconvolution method [38]. In this research, we used the 'Gene' module to estimate the correlation between Fam20C expression and immune infiltration level (the abundance of six TIIC subgroups: B cells, CD4 + T cells, CD8 + T cells, macrophages, neutrophils, and dendritic cells), as well as tumor purity, among 39 cancer types. The 'Correlation' module was then applied to analyze the association between Fam20C and other prognosis-related immune cell markers, in order to further characterize the potential infiltrating immune cell subtypes. These gene markers cover B cells, CD8 + T cells, dendritic cells, exhausted T cells, macrophages, M1 macrophages, M2 macrophages, monocytes, TAMs (tumor-associated macrophages), neutrophils, natural killer (NK) cells, follicular helper T cells (Tfh), regulatory T cells (Tregs), T-helper 1 (Th1) cells, T-helper 2 (Th2) cells, T-helper 9 (Th9) cells, T-helper 17 (Th17) cells, and T-helper 22 (Th22) cells. The immune gene markers were selected by searching the CellMarker website (http://biocc.hrbmu.edu.cn/CellMarker/). Gene expression levels were expressed as log2 TPM (transcripts per million). Fam20C was plotted on the x-axis, and marker genes were plotted on the y-axis. The resulting expression scatterplots visualize the correlation between Fam20C and each immune gene marker.
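The log2 TPM scale mentioned above can be illustrated with a minimal sketch. It assumes raw read counts and gene lengths in kilobases are available; the pseudocount of 1, the function names, and the toy numbers are illustrative assumptions and are not taken from TIMER or GEPIA.

```python
import math


def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per kilobase: normalize each gene by its length.
    rpk = [count / length for count, length in zip(counts, lengths_kb)]
    # Scale so that the values of one sample sum to one million.
    scale = sum(rpk) / 1e6
    return [value / scale for value in rpk]


def log2_tpm(values, pseudocount=1.0):
    """Log-transform TPM values; the pseudocount (an assumption) avoids log2(0)."""
    return [math.log2(value + pseudocount) for value in values]


# Hypothetical three-gene sample, for illustration only.
counts = [100, 200, 700]
lengths_kb = [1.0, 2.0, 7.0]
tpm_values = tpm(counts, lengths_kb)
log_values = log2_tpm(tpm_values)
```

Because TPM values within a sample sum to one million, they are comparable across genes of different lengths, which is what makes a scatterplot of two genes on this scale meaningful.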

Statistical analysis
Datasets for the differential expression between cancer and adjacent tissues were generated in Oncomine with P-values, fold changes, and gene ranks. Survival curves were drawn using PrognoScan, Kaplan-Meier plotter, and GEPIA. The hazard ratio and Cox P-values or log-rank P-values were used to compare OS, RFS, EFS, and DMFS among patients in different groups. Correlations of gene expression were analyzed in GEPIA and TIMER, with Spearman's correlation used as the correlation coefficient. Throughout the text, a P-value <0.05 was considered statistically significant.
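As an illustration of the correlation measure named above, the following is a minimal sketch of Spearman's correlation, computed as the Pearson correlation of average ranks. The function names are hypothetical, and the web servers' own implementations (including their P-value calculations) are not reproduced here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotonic relationship between two genes yields a coefficient of 1 or -1, regardless of whether the relationship is linear.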

Fam20C mRNA expression levels across different cancers
The expression levels of Fam20C mRNA in tumor and normal tissue across different cancers were analyzed in the Oncomine and TIMER databases. In the Oncomine database, Fam20C expression was higher than in normal tissues in brain and CNS (central nervous system), breast, cervical, esophageal, head and neck, lymphoma, and pancreatic tumors (Figure 1A). In contrast, decreased expression of Fam20C was found in bladder and kidney cancers. Notably, elevated Fam20C expression was found in only one BC dataset, whereas decreased expression was observed in two BC datasets. Similarly, in colorectal cancer, one dataset showed higher expression and one showed lower expression than normal tissue. Detailed results of Fam20C expression across different cancer types are summarized in Supplementary Table S1.
Likewise, the same analysis was performed in the Kaplan-Meier plotter database (data sourced from TCGA), with OS and RFS used as indicators of prognostic value. High Fam20C mRNA abundance was associated with poor prognosis in seven cancer types and with good prognosis in seven cancer types.
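The survival comparisons described here rest on Kaplan-Meier curves for patients grouped by expression level. The sketch below shows the product-limit estimator and a median split as one possible grouping; the median cutoff and all names are assumptions for illustration, and the cutoff actually used by the Kaplan-Meier plotter may differ.

```python
import statistics


def split_by_median(expression):
    """Return (high, low) lists of patient indices, split at the median."""
    cutoff = statistics.median(expression)
    high = [i for i, value in enumerate(expression) if value > cutoff]
    low = [i for i, value in enumerate(expression) if value <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival probability)] at event times.

    events[i] is 1 if the event (e.g. death) was observed at times[i]
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Calling kaplan_meier separately on the high and low groups gives the two curves that a log-rank test would then compare.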

Association of lymphatic metastasis in STAD patients with high Fam20C expression
We next sought to explore the relevance and potential mechanisms underlying Fam20C expression in cancers. We therefore analyzed the relationship between Fam20C expression and several characteristics of gastric cancer patients, including clinical features and pathological stages, using the Kaplan-Meier plotter database. High Fam20C expression was strongly associated with worse OS, FP (first progression), and PPS (post-progression survival) in both female and male patients. Interestingly, a similar association was observed in patients grouped by treatment (surgery alone) and by HER2 status (HER2 negative) (P<0.05) (Figure 5). For clinicopathological factors, elevated Fam20C expression was associated with OS, FP, and PPS in stage 3, with PPS in stage 2, and with OS and PPS in stage 4 gastric cancer patients. Notably, Fam20C played a detrimental role in OS, FP, and PPS with respect to local lymph node involvement across N0-N3. In addition, Fam20C seemed to affect only gastric cancer patients without distant metastases. The depth of tumor invasion (T category) and the number of positive lymph nodes (N category) have been shown to be the two most important prognostic factors [39,40]. These results indicate that up-regulated Fam20C is markedly associated with lymph node metastasis and predicts worse prognosis.

Fam20C influenced the extent of immune infiltration in BLCA, LGG, and STAD
Numerous papers and reviews suggest that multiple types of immune cells are associated with prognosis in various cancer types, and of particular importance are the tumor-infiltrating lymphocytes (TILs) [41][42][43][44][45]. A deeper understanding of the immune activity of TILs in cancer would provide more accurate prognostic information. Hence, Spearman's correlation coefficient was applied to analyze the correlation between Fam20C and immune infiltration level across 39 cancer types in TIMER. This analysis revealed that Fam20C was correlated with decreased tumor purity in 19 cancer types and with increased tumor purity in two cancers. Furthermore, an association was also observed for 9, 12, 24, 23, 21, and 24 cancer types across the six immune cell subgroups. Considering that Fam20C expression correlated with levels of immune infiltration in many types of cancers, we next combined the analysis of immune infiltration with that of prognosis. Tumor purity can be interpreted as the proportion of tumor cells in tumor tissue, and immune-related genes are generally negatively correlated with tumor purity [46]. Most TCGA data are covered by both the TIMER and GEPIA databases. Consequently, we selected cancer types in which elevated Fam20C was negatively related to tumor purity in TIMER and was largely related to bad prognosis in GEPIA. As noted above, BLCA and LGG were selected. STAD was also included in this analysis, as it was the only cancer type that had poor OS and RFS with high expression of Fam20C in Kaplan-Meier plotter and also had a high level of infiltration in GEPIA. We observed a positive correlation between Fam20C expression and infiltrating levels of CD8 + T cells (R, 0.269; P, 1.83e-07), CD4 + T cells (R, 0.28; P, 5.51e-08), macrophages (R, 0.468; P, 3.20e-21), neutrophils (R, 0.321; P, 3.64e-10), and DCs (R, 0.395; P, 4.74e-15) in BLCA [...] neutrophils in DLBC; at the same time, GEPIA showed that Fam20C played a protective prognostic role in DLBC.
Collectively, these findings suggest that Fam20C could affect the intratumoral densities of immune cells.

Correlation between Fam20C expression and immune markers
Beyond the correlation between Fam20C and the above six immune infiltrating cell types, we next asked whether Fam20C was associated with additional immune infiltrating cells by investigating related immune cell markers in BLCA, LGG, and STAD in TIMER and GEPIA. Immune cells were identified by cell markers covering B cells, T cells (general), CD8 + T cells, different functional T cells, M1 and M2 macrophages, TAMs, monocytes, NK cells, neutrophils, and dendritic cells in BLCA, LGG, and STAD, with LIHC used as the control. After adjustment for tumor purity, the expression of Fam20C was strongly associated with 60 of 72 immune cell markers in BLCA, 59 in LGG, and 53 in STAD. However, there was a significant correlation with only five gene markers in LIHC (Table 1).
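One common way to adjust a correlation for a third variable such as tumor purity is a partial Spearman correlation, sketched below. This is an illustrative assumption about how such an adjustment can be computed; the names are hypothetical and the exact procedure used by TIMER is not reproduced here.

```python
import math


def _ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def partial_spearman(x, y, z):
    """Correlation of x and y after removing the rank-linear effect of z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data: z stands in for tumor purity.
purity = [1, 2, 3, 4, 5]
gene_a = [2, 1, 4, 3, 5]
marker = [1, 3, 2, 5, 4]
raw = spearman(gene_a, marker)
adjusted = partial_spearman(gene_a, marker, purity)
```

In this toy example both variables track purity closely, so the adjusted coefficient differs markedly from the raw one, which is why marker correlations are better compared after the adjustment.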
In addition to the aforementioned overall changes, as shown in Figure 6, CD4 + T cells, macrophages, and dendritic cells were most closely related to Fam20C expression in BLCA, LGG, and STAD, whereas in LIHC these three cell types were less significant. Most marker genes of TAMs, monocytes, and M2 macrophages had a robust association with Fam20C. Specifically, CD80, CCL2, IL10, and Tim-3 of TAMs, CD86 and CD115 of monocytes, and CD163, VSIG4, and MS4A4A of M2 macrophages showed a strong association with Fam20C expression in BLCA, LGG, and STAD, although CD80 showed no significant correlation in STAD (P<0.0001; Figure 7A-P). To verify this finding, we performed the same analysis in GEPIA (Table 2). Consistent with TIMER, the results suggested that Fam20C may regulate macrophage polarization in BLCA, LGG, and STAD.
A significant correlation between key gene markers of dendritic cells (CD1C, CD141, HLA-DPB1, HLA-DQB1, HLA-DRA, HLA-DPA1, BDCA-4, CD11C) and expression of Fam20C was observed in BLCA, LGG, and STAD compared with LIHC (Table 1). These results further support a crucial role of Fam20C in DC infiltration. With respect to Treg cells, Fam20C was positively correlated with CD25, CCR8, FOXP3, and CD127 in BLCA, LGG, and STAD, although FOXP3 showed no significant correlation in LGG. Macrophages secrete a large number of chemokines, such as the CC chemokines CCL22 and CCL20, which recruit Tregs to the tumor site; similarly, DCs induce Treg generation and thereby promote the metastasis of cancer cells [47,48]. Whether Fam20C affects DCs or macrophages and tumor metastasis needs to be addressed in further studies.
In addition, a strong correlation existed between Fam20C and markers of B cells, Tfh cells, Th9 cells, Th17 cells, and exhausted T cells. The relationships of Fam20C with CD8 + T cells, Th1 cells, Th2 cells, Th22 cells, neutrophils, and NK cells partly differed in BLCA, LGG, and STAD compared with LIHC. These observations, together with data from GEPIA, illustrate that Fam20C expression in BLCA, LGG, and STAD is associated with different degrees of immune cell infiltration in different ways, further supporting the view that Fam20C may be a factor influencing patient survival and prognosis.

Discussion
Fam20C has been identified as the Golgi casein kinase and is expressed in a variety of tissues, including mineralized and non-mineralized tissues and body fluids [15,16,49]. Proteins carrying a Ser-X-Glu/pSer motif are phosphorylated by Fam20C, and this motif is found in some 75% of human plasma and cerebrospinal fluid phosphoproteins. Studies of Fam20C substrates have shown that Fam20C not only regulates some biological processes but is also involved in tumor growth and metastasis [16]. Nevertheless, Fam20C has not been extensively studied in the cancer field. It is now acknowledged that there is a relationship between Fam20C expression and tumor cell progression (mainly in LUAD and BC) [23,24]. It is therefore reasonable to speculate that the expression of Fam20C may affect patient survival through the progression of tumor cells. However, data on Fam20C expression in cancer, and a consensus on other vital aspects such as tumor cell metastasis, are lacking. The role of Fam20C in cancer was observed in earlier studies but has not previously been dissected. Combined with previous research, our results indicate that Fam20C may play diverse roles in various cancers. Reportedly, in Fam20C KO cells, the adhesion, migration, and invasion phenotype of BC cells could be rescued [16]. This might suggest that Fam20C is beneficial to the invasion and development of BC. However, in contrast to that previous study, we observed a relation between higher expression of Fam20C and a better prognosis in BRCA in the PrognoScan database (data sourced from GEO) (Figure 2). More recently, a study combining bioinformatics and human LUAD cells indicated that hypoxia was a poor prognostic factor for LUAD and identified Fam20C as a key gene associated with hypoxia in the progression of LUAD [24]. Consistently, LUAD showed poorer prognosis in our research (Figure 2).
In addition, we found that the expression of Fam20C was negatively correlated with tumor purity in LUAD and positively correlated with immune cell infiltration, which further supported the relationship between Fam20C expression and poor prognosis (Supplementary Figure S3). A deeper understanding of the differences between previous studies using cancer cells and our study using cohorts of cancer patients may help develop a global view of how Fam20C expression relates to cancer development mechanisms. Here, we present an integrated study of Fam20C expression levels in pan-cancer, the association of Fam20C variations with prognosis among different cancers, and the potential mechanisms underlying different clinicopathological features. Elevated Fam20C expression is associated with worse prognosis in BLCA, LGG, and STAD. Further, enhanced expression of Fam20C may affect lymph node metastasis in gastric cancer patients, indicating that Fam20C could be used as a predictor of tumor metastasis. Additionally, immune infiltration levels in BLCA, LGG, and STAD were positively correlated with Fam20C expression. The present study thus offers new insights into the clinical, prognostic, and immunological understanding of Fam20C in different types of cancer. To analyze Fam20C expression levels among different cancers, we examined differential Fam20C expression between pan-cancer samples and their matched paracancerous normal tissues in datasets from Oncomine and in TCGA data for 32 cancer types from TIMER. The Oncomine data showed that Fam20C had a higher expression level in brain and CNS, breast, cervical, colorectal, esophageal, head and neck, lymphoma, and pancreatic cancers. Further, a lower expression level of Fam20C was found in bladder, breast, colorectal, and kidney cancer (Figure 1A).
However, the TCGA data in the TIMER database suggested that Fam20C expression was relatively higher in HNSC, LIHC, LUAD, PRAD, and THCA than in normal tissues, while Fam20C expression was lower in BLCA, BRCA, COAD, KICH, KIRP, LIHC, and SKCM (Figure 1B). These partly different results may be due to differences in data sources, data collection approaches, and the numbers of cancers in the study cohorts. Nonetheless, in three separate databases, we consistently found that Fam20C expression was associated with poor prognosis in BLCA, CESC, and brain cancers. Specifically, GEO datasets analyzed using PrognoScan showed that elevated Fam20C expression was associated with worse prognosis in bladder, brain, colorectal, and lung cancers (Figure 2A,B,F-K). Further, analysis of TCGA data in GEPIA showed that higher mRNA levels of Fam20C were associated with shorter OS and DFS in most tumor types, including BLCA, CESC, brain cancer (GBM and LGG), HNSC, and UVM (Figure 4C-E,H-L,N). Kaplan-Meier Plotter analysis showed that elevated Fam20C expression was associated with poorer RFS in CESC and with poorer OS and RFS in STAD (Figure 3E,W,X). Clinically, in gastric cancer patients, high expression levels of Fam20C correlated with worse OS, FP, and PPS in stages 3-4, T2-T4, N0-N3, and M0 (Figure 5). These observations, together with the clinicopathological features, illustrate that Fam20C is a newly identified multicancer-relevant gene with potential prognostic value for risk prediction in bladder, brain, cervical, and gastric cancer, and support the idea that Fam20C might influence lymph node metastasis in patients with gastric cancer.
Notably, another crucial finding of the present study is that Fam20C expression is correlated with various immune infiltration levels in cancer, especially in BLCA, LGG, and STAD. We found a strong and significant correlation between Fam20C and infiltration of CD4 + T cells, macrophages, neutrophils, and DCs in BLCA, LGG, and STAD (Figure 6A,B,D), suggesting that Fam20C may influence both the extent of immune infiltration and the degree of activation of diverse immune cells. Moreover, we used TIMER to estimate the degree of correlation between infiltrating immune cell markers and Fam20C expression in an attempt to identify the contributions of individual biomarkers. Recently, with the development of immune checkpoint inhibitors, biomarkers of immune cells have not only served as prognostic markers but have also received widespread attention as a new type of treatment target [50]. More directly, the association between Fam20C and immune cell markers suggested that Fam20C might regulate tumor immunology in BLCA, LGG, and STAD. Among these, gene markers of M1 macrophages, for example INOS and COX2, showed no significant correlation with Fam20C expression, while M2 macrophage gene markers such as CD163, VSIG4, and MS4A4A exhibited high correlations (Table 1). These findings suggest a potential regulatory role of Fam20C in the polarization of TAMs. TAMs can promote tumor growth by suppressing immune clearance, promoting tumor cell proliferation, and stimulating angiogenesis [51]. We also found that Fam20C might have the potential to activate Treg cells and induce T-cell exhaustion. Most of these markers were positively correlated with Fam20C, including CD25, CCR8, FOXP3, CD127, PD-1, Tim-3, CTLA4, LAG3, and GZMB (Table 1). As noted in previous studies, Tregs are often associated with a poor clinical outcome, owing to their ability to promote cancer progression by limiting antitumor immunity and promoting angiogenesis [52].
Exhausted T cells are defined as a group of T cells with poor effector function and sustained expression of inhibitory receptors [53]. During tumor immunity, exhaustion of CD4 + T cells and CD8 + T cells promotes tumorigenesis and tumor development, and PD-1 is the major inhibitory receptor in this process [54]. PD-1 also showed a strong association with Fam20C expression in BLCA, LGG, and STAD. In addition, we observed a high correlation between Fam20C and the markers of T helper cells (Th1, Th2, Th9, Tfh, and Th17) in BLCA, LGG, and STAD. These findings imply an alternative mechanism by which Fam20C may regulate the activation of T cells. Here we show major correlations of CD4 + T cells, neutrophils, DCs, M2 macrophages, TAMs, Tregs, exhausted T cells, and T helper cells with Fam20C, supporting the important role of Fam20C in the immune contexture in BLCA, LGG, and STAD.
It is still unclear what role Fam20C expression plays in tumorigenesis or in pan-cancer. More recently, some studies have proposed possible mechanisms by which Fam20C expression correlates with poor prognosis. With regard to the tumor environment, it is well established that hypoxia is a common feature of cancers [55,56]. DNA methylation plays important regulatory roles in cancer progression. An analysis of DNA methylation profiles of 533 LUAD patients identified Fam20C as one of the hypoxia-related key genes; specifically, hypoxia in LUAD cells inhibited DNA methylation of the Fam20C gene, promoted Fam20C gene expression, and further led to deterioration of LUAD [24]. Further evidence supporting the role of Fam20C in tumor migration is that the Fam20C inhibitor FL-1607, designed by structure-based molecular modeling, inhibited tumor growth, induced cell apoptosis, and inhibited cell migration [23]. Together with our finding that Fam20C affected the prognosis of gastric cancer patients with lymph node metastasis, this provides further evidence that Fam20C expression might influence cancer cell migration. Recent studies have found that EMT (epithelial-mesenchymal transition) is closely related to the occurrence of multiple cancers and to the proliferation, migration, and invasion of cancer cells [57]. CDH2 (cadherin-2), one of the markers of EMT, was found in the Fam20C phosphoproteome [16]. As predicted, E-cadherin (CDH1), which is converted into CDH2 during EMT, was negatively correlated with Fam20C expression, whereas other markers of EMT, including CDH2, SNAIL, TWIST, ZEB1, and ZEB2, were positively correlated with the expression of Fam20C (Supplementary Table S2). This likely indicates that Fam20C participates in the EMT process. Therefore, enhanced adhesion and migration of cancer cells, which may be accompanied by EMT, could be an underlying regulatory mechanism linking Fam20C to bad prognosis.
Our study showed that increased expression of Fam20C is linked to poor prognosis in multiple cancer types and that infiltrating immune cells are associated with Fam20C expression in the tumor microenvironment (TME). These findings may allow better prognostic prediction and provide an immuno-oncological perspective on Fam20C as a prognostic marker. Nevertheless, even though we collected information from multiple databases, this research still has limitations. First, a large amount of sequencing and microarray data was gathered and analyzed for bulk tumor tissues, which ignores the heterogeneity of cells within the tumor tissue and therefore introduces a certain systematic bias. Single-cell sequencing can provide the high resolution needed to address this problem. Second, because of contradictory findings for individual cancers in different databases, we cannot determine the prognostic value of Fam20C in these cancers. Third, our research performed only a bioinformatics analysis of Fam20C and patient survival in multiple databases, without in vivo or in vitro experiments for verification. We will next carry out in vivo and in vitro experiments to elucidate the mechanisms of Fam20C in different cancer types at the cellular and molecular levels. Fourth, although Fam20C expression was found to be associated with immune cell infiltration and patient survival, it has not been demonstrated that Fam20C affects patient survival through immune infiltration. Whether the expression or function of Fam20C and its products affects cancer cell growth and migration in the clinical setting is an important topic for future studies.
In summary, elevated Fam20C expression is associated with prognosis in pan-cancer and with an increased degree of immune infiltration. In BLCA, LGG, and STAD, Fam20C expression potentially contributes to the polarization of TAMs, the activation of Treg cells and T helper cells, and the induction of T-cell exhaustion. Therefore, Fam20C might be a prognostic biomarker in pan-cancer, and its expression is associated with immune infiltration in BLCA, LGG, and STAD.