Nuclear, casein kinase and cyclin-dependent kinase substrate (NUCKS), a protein similar to the HMG (high-mobility group) protein family, is one of the most modified proteins in the mammalian proteome. Although very little is known about the biological roles of NUCKS, emerging clinical evidence suggests that this protein can be a biomarker and therapeutic target in various human ailments, including several types of cancer. An inverse correlation between NUCKS protein levels and body mass index in humans has also been observed. Depletion of NUCKS in mice has been reported to lead to obesity and impaired glucose homoeostasis. Genome-wide genomic and proteomic approaches have revealed that NUCKS is a chromatin regulator that affects transcription. The time is now ripe for further understanding of the role of this novel biomarker of cancer and the metabolic syndrome, and how its sundry modifications can affect its function. Such studies could reveal how NUCKS could be a link between physiological cues and human ailments.
Different cells of an organism share the same genome but display distinct phenotypes and carry out diverse functions. Although complex intracellular cross-talk within a single signalling module [1–4] is important in defining the functional identity of various cells, differential gene expression underlies distinct cellular functions. The chromatin state, the packaging of DNA with histone and non-histone proteins, has significant effects on gene expression and contributes to the establishment and maintenance of cell fates. Indeed, dynamic changes in chromatin states lead to changes in gene expression patterns that define cellular physiological outcomes. Multiple mechanisms are involved in regulating chromatin accessibility, including DNA modifications, histone modifications, ATP-dependent chromatin remodelling and non-coding RNA (ncRNA)-mediated pathways. Besides histones, non-histone proteins also regulate chromatin structure, and one such family is the high-mobility group (HMG) proteins.
The HMG proteins are among the most abundant non-histone proteins that control chromatin remodelling and regulate gene transcription. They are ubiquitous nuclear proteins that bind to DNA and other protein complexes in a dynamic and reversible fashion. This binding regulates DNA structure in the context of chromatin. The HMG protein family is divided into three classes based on the structure of their DNA-binding domains and their substrate-binding specificity: the HMG-AT-hook family (HMGA), the HMG-box family (HMGB) and the HMG-nucleosome-binding family (HMGN). Besides these classic members, other proteins such as the nuclear, casein kinase and cyclin-dependent kinase substrate (NUCKS) are similar to the proteins of the HMG family.
There is very limited information in the literature about the physiological functions of NUCKS. Several groups have reported that it can bind DNA and regulate chromatin context [12,13]. Although the biological functions of this protein have not been fully elucidated, recent results suggest that it might be a biomarker for cancer and the metabolic syndromes [13–22]. In the present review, we first discuss the detection, cloning and initial characterization of NUCKS, and then review the emerging evidence that NUCKS could be used as a diagnostic biomarker in cancer and metabolic diseases, highlighting its therapeutic potential in diabetes and obesity.
DETECTION, CLONING AND CHARACTERIZATION OF NUCKS
HMG proteins have unique characteristics in that they are soluble in 5% perchloric acid and extractable from chromatin in 0.35 M NaCl. In addition, they have a high content of acidic and basic amino acids and a relatively high content of proline residues (≥7%). Members of this family are known to be highly modified by post-translational modifications (PTMs) such as acetylation, methylation and phosphorylation. During an investigation of the phosphorylation of perchloric acid-soluble proteins extracted from HeLa cells, Østvold et al. observed one phosphorylated protein that had a larger molecular mass than the known HMG proteins. This protein, which was named NUCKS, was highly phosphorylated and noted to be rich in acidic and basic amino acids, with a high content of proline.
NUCKS was cloned from HeLa cells in 2001 and its amino acid sequence is highly conserved between humans and rodents. Given the high similarity between NUCKS and the HMG proteins, NUCKS was believed to be a member of the HMG family. However, attempts to assign it to a protein family by database query using both sequence alignment and amino acid composition have failed. Northern blot analysis showed that there are three NUCKS transcripts in human and rat tissues, albeit in varying amounts. CD analysis and secondary structure predictions based on the amino acid sequence revealed a low level of α-helical content and substantial β-turns in the structure. The apparent molecular mass of NUCKS is around 50 kDa by protein gel electrophoresis. However, according to mass spectrometry of purified NUCKS and prediction based on the length of the cDNA, the molecular mass of NUCKS is around 27 kDa. The discrepancy between the apparent and predicted masses of NUCKS might be due to the high content of basic and acidic residues (>50%) and the low content of hydrophobic residues (<20%) in the NUCKS protein. In the Human Genome Project database, two regions containing NUCKS cDNA were found on human chromosome 1q32.1 by using the sequence of the full-length NUCKS cDNA and the BLAST program of the National Center for Biotechnology Information (NCBI). Computer analysis revealed that the gene consists of seven exons and six introns, and lacks a TATA box but contains two Inr elements, two GC boxes and one consensus-binding site for the transcription factor E2F-1. NUCKS homologues are present in fish, amphibians and birds but not in insects and yeasts, indicating that it is a vertebrate-specific gene [16,24].
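The composition criteria quoted above (charged-residue, hydrophobic-residue and proline content) can be checked for any sequence with a few lines of code. The following is a minimal sketch, not the method used in the cited studies: the function name `residue_fractions` and the residue groupings are assumptions made for illustration (conventions differ, for example on whether histidine counts as basic), and the input shown is only the short NUCKS-derived basic peptide mentioned in this review, not the full-length protein.

```python
from collections import Counter

# Residue groupings are illustrative assumptions; other conventions exist.
BASIC = frozenset("KRH")
ACIDIC = frozenset("DE")
HYDROPHOBIC = frozenset("AVLIMFW")


def residue_fractions(sequence):
    """Return the fraction of basic, acidic, charged, hydrophobic and proline residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)

    def fraction(group):
        # Sum the counts of every residue in the group, then normalise by length.
        return sum(counts[residue] for residue in group) / total

    return {
        "basic": fraction(BASIC),
        "acidic": fraction(ACIDIC),
        "charged": fraction(BASIC | ACIDIC),
        "hydrophobic": fraction(HYDROPHOBIC),
        "proline": counts["P"] / total,
    }


if __name__ == "__main__":
    # Short basic peptide from NUCKS quoted in this review (not the whole protein).
    for name, value in residue_fractions("TPSPVKGKGKV").items():
        print(f"{name}: {value:.1%}")
```

Applied to a full-length protein sequence, the same function would allow a direct comparison with the thresholds given in the text.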
Østvold and colleagues further showed that NUCKS is an in vitro substrate for second messenger-activated kinases such as cAMP-dependent protein kinase, cGMP-dependent protein kinase, calcium/phospholipid-dependent protein kinase and calcium/calmodulin-dependent protein kinase II. NUCKS is also phosphorylated by CK-2, which is implicated in cell growth and proliferation, by the cyclin-dependent kinases Cdk1, -2, -4 and -6, all of which play important roles in regulation of the cell cycle, and by DNA-activated protein kinase, which is involved in DNA repair [26,27]. DNA-binding experiments have shown that NUCKS binds to both ssDNA and dsDNA in vitro and that this binding can be enhanced by phosphorylation of NUCKS. The physiological stimuli that lead to phosphorylation of NUCKS are not known, and future studies to identify the signalling cascades that modulate the covalent modifications of NUCKS, and hence its physiology, will throw light on the roles of this interesting protein. The primary sequence of NUCKS contains a highly basic sequence, TPSPVKGKGKV, followed by a GRP motif. The GRP motif constitutes the core motif of an AT-hook domain, which exists in many DNA-binding proteins. These proteins bind preferentially to AT-rich sequences in DNA. The DNA-binding properties of NUCKS were investigated using a NUCKS-derived synthetic peptide containing an extended GRP motif. This peptide was found to bind to random sequences of DNA, but preferentially to poly(dA/dT), which may be due to the lack of conservation between the GRP-flanking regions in NUCKS and the extended GRP domain in the AT-hooks of types I, II and III. However, the TPSPVKGKGKV sequence is important for NUCKS to bind to DNA because another peptide containing the same amino acids as the NUCKS-derived peptide, but in a random order, did not bind to DNA. Genome-wide ChIP (chromatin immunoprecipitation) sequencing using a NUCKS antibody also revealed that NUCKS can bind to DNA in non-AT-rich regions. Thompson et al. used a guilt-by-association approach to analyse gene expression data sets from breast and ovarian cancers, and found one group of genes, containing H2A histone family, member O (H2AFO), prostate-derived Ets factor (PDEF) and NUCKS, to be co-expressed and involved in transcriptional regulation. Taken together, these pieces of evidence suggest that NUCKS might be a transcriptional regulator of genes despite its differences in structure from the classic HMG proteins.
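The logic of the scrambled-peptide control described above (same amino acids, different order, no binding) can be illustrated with a simple motif scan. This is a hedged sketch only: the helper `find_motif`, the toy sequence `toy_protein` and the scrambled order `scrambled` are all hypothetical and invented for illustration; they are neither the real NUCKS sequence nor the control peptide used in the cited experiments, and placing GRP immediately after the basic stretch is an assumption.

```python
import re


def find_motif(sequence, motif):
    """Return the 0-based start positions of every (possibly overlapping) motif match."""
    # A zero-width lookahead lets finditer report overlapping occurrences too.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() for match in pattern.finditer(sequence)]


BASIC_STRETCH = "TPSPVKGKGKV"  # basic sequence quoted in the text
AT_HOOK_CORE = "GRP"           # core motif of an AT-hook

# Hypothetical toy sequence: the basic stretch directly followed by GRP,
# padded with arbitrary residues. NOT the real NUCKS sequence.
toy_protein = "MSEEDTPSPVKGKGKVGRPTASKDD"

# Hypothetical scrambled control with the same residues in a different order.
scrambled = "KGVTPKSGPVK"

if __name__ == "__main__":
    print("basic stretch at:", find_motif(toy_protein, BASIC_STRETCH))
    print("GRP core at:", find_motif(toy_protein, AT_HOOK_CORE))
    print("same composition:", sorted(scrambled) == sorted(BASIC_STRETCH))
    print("stretch present in control:", bool(find_motif(scrambled, BASIC_STRETCH)))
```

The point of the comparison is that composition alone does not distinguish the two peptides, whereas residue order does, which is consistent with the sequence-dependent binding reported in the text.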
PTMs are important features that regulate signalling and transcription [31,32]. DNA-binding proteins such as histones and HMG proteins are also modified by signalling cascades, and it is known that these modifications allow the modulation of protein function and link extracellular events to the extent and duration of signalling. PTMs affect protein stability, activity, conformation, localization and interaction with DNA or other proteins. They are involved not only in normal physiological conditions but also in pathological conditions related to various diseases. NUCKS has the highest ratio of modified to unmodified residues of any protein described so far. Among its 243 residues, there are 57 PTMs on different residues, including phosphorylation, ubiquitination, acetylation, formylation and methylation. Proteomic analyses from several groups revealed that NUCKS is phosphorylated at 34 sites [33–39]. NUCKS was constitutively phosphorylated at several sites, whereas dynamic changes in its phosphorylation were observed between non-synchronized G1 and mitotically arrested cells. Using stable isotope labelling by amino acids in cell culture (SILAC), Wisniewski et al. identified six acetylated sites in NUCKS; some of these residues are located within putative DNA-binding domains and their acetylation is cell cycle dependent. Acetylation of proteins is a crucial event in the functional activation of DNA-binding proteins such as histones, and usually correlates with the transcriptional openness of that region of chromatin. There is vast potential in exploring how the acetylated residues within the DNA-binding domain of NUCKS regulate its biological functions in health and disease.
NUCKS is expressed ubiquitously in most tissues in mammals, with the most abundant expression observed in adult thyroid gland, prostate and uterus, and in fetal liver, thymus and lung. During the embryonic development of rats, NUCKS exhibits dynamic spatiotemporal expression patterns. The expression level of NUCKS increases during the initial stages of embryonic development and gradually decreases until birth, in tissues including the brain, liver, limbs, tail and gonads. At E13.5 (mouse embryonic day 13.5), NUCKS appears to be a marker for migrating cells of the neural crest. At later stages of embryonic development, expression of NUCKS decreases compared with the early stages and is localized to specific regions of the embryo, with the nervous tissue and muscle fibres having the highest levels of NUCKS mRNA and protein. Immunofluorescence analysis shows that NUCKS is distributed throughout the cytoplasm in mitotic cells and targeted to the re-forming nuclei in late telophase of the cell cycle. Fractionation of various cancer cell lines revealed that NUCKS localizes to both the cytoplasm and the nucleus (Qiu, B.Y. et al., unpublished data). Treatment of cells with specific small molecule inhibitors of NUCKS would probably achieve pathway-specific inhibition of NUCKS-mediated signal transduction [41,42]. The primary sequence of NUCKS contains two regions with highly charged basic residues that are potential nuclear localization signals (NLSs). One of these regions (NLS1) contains the classic bipartite NLS sequence and is highly conserved. The other region (NLS2) does not contain a classic bipartite NLS sequence and is less well conserved. Both putative NLS regions can translocate green fluorescent protein (GFP) into the nucleus, whereas a splice variant of NUCKS without the NLS1 region does not translocate a NUCKS/GFP fusion protein into the nucleus, showing that NLS1 is important for the nuclear translocation of NUCKS. Further mutagenesis analysis in full-length NUCKS demonstrated that the minimum sequence for efficient translocation is PPTKKIR, which lies within the NLS1 region.
BIOLOGICAL FUNCTION OF NUCKS
Given the evidence that NUCKS is expressed ubiquitously and is highly modified post-translationally, it is believed to have important biological functions in vertebrates. Several groups, using different analyses, have reported that NUCKS is highly expressed in human cancers of the breast, ovary, lung, bone marrow and brain, indicating that NUCKS might be a biomarker for cancer [20,30,44]. Surprisingly, whole-body NUCKS knockout mice are viable and fertile, with no overt phenotype in the first few days after birth. However, a few weeks after birth, NUCKS knockout animals show increased weight gain and insulin resistance, and these phenotypes are exacerbated by a high-fat diet (HFD). Further characterization of the tissue-specific functions of NUCKS that may give rise to obesity revealed that levels of NUCKS in adipose tissue correlate with obesity and diabetes in humans as well. These murine studies and the human correlation of NUCKS with obesity point to the clinical relevance of NUCKS, and to the need for further investigation in this direction and in other human ailments.
NUCKS as a potential biomarker for cancer
In 2005, Naylor et al. first reported the correlation between NUCKS and breast cancer and the potential of this protein as a cancer biomarker. They analysed primary breast tumours and breast cancer cell lines by high-resolution array comparative genomic hybridization (CGH) with 4134 bacterial artificial chromosomes that cover the genome at 0.9-megabase resolution. Expression profiling of these tumours demonstrated that NUCKS and two expressed sequence tags are over-expressed in tumours with amplifications relative to those without. Another group used proteomic approaches and histochemical analyses to demonstrate that NUCKS is over-expressed in human ductal invasive breast cancer. NUCKS was highly expressed in grades I and II breast carcinomas compared with normal tissues. Furthermore, NUCKS was shown to be moderately expressed in benign epithelial proliferations, such as adenosis and sclerosing adenosis, and highly expressed in intraductal lesions but not in fibroadenoma tissues. Increased expression of NUCKS in pre-cancerous lesions suggests that the upregulation of NUCKS is an early event, which further emphasizes the diagnostic value of NUCKS over-expression as a cancer biomarker. Obesity has been correlated with an increased incidence of postmenopausal breast cancer, and growing evidence also suggests that obesity is linked with poor prognosis in women diagnosed with early stage breast cancer. Soliman et al. analysed metabolic parameters including body mass index (BMI), homoeostasis model assessment of insulin resistance (HOMA-IR), lipid profiles and cytokine levels in breast cancer patients, and measured the levels of NUCKS in the patients' breast tissues. They demonstrated a positive correlation between obesity markers and levels of the interleukins IL-6 and IL-12, lipocalin 2 (LCN2) and NUCKS mRNA in the breast cancer group, suggesting that NUCKS might contribute directly to oncogenesis. It is important to test these ideas using mouse models.
Besides breast cancer, clinical evidence suggests that NUCKS has potential functions in other cancers. NUCKS levels are higher in highly invasive than in low-invasive tumour cell strains derived from early passage mouse lung adenocarcinoma, and this finding was further validated in invasive human lung adenocarcinoma tissues. Over-expression of NUCKS was also detected in gastric adenocarcinoma by immunohistochemistry, and it may act, together with Ki-67, as an independent prognostic factor for poor disease-free survival and overall survival in advanced gastric adenocarcinoma (American Joint Committee on Cancer stages III–IV). Furthermore, Kikuchi et al. identified NUCKS as a candidate gene involved in the distant metastasis of colorectal cancer (CRC) through gene expression and copy number analysis of primary tumours from 392 patients who underwent curative surgery for CRC. In contrast, decreased levels of NUCKS may be involved in the induction of vincristine resistance in ETV6-RUNX1-positive leukaemia. These reports highlight the potential roles of NUCKS in carcinogenesis. However, more clinical analyses of various cancer types in larger populations are required to reach a conclusion.
Although several lines of evidence show that expression of NUCKS is associated with cancer incidence and recurrence in different cancers, the mechanism by which NUCKS contributes to oncogenesis has not been addressed. The NUCKS gene is located on chromosome 1q32.1, a region reported by array-based CGH to be altered at high frequency in breast cancer and as an early event in cancer progression [47,48]. NUCKS belongs to the invasive gene signature, which consists of 186 genes over-expressed in breast cancer stem cells. The co-expression gene set that includes NUCKS, H2AFO and PDEF, and that was reported to regulate gene transcription, was found to be over-represented in several cancerous tissues, indicating that it may be involved in oncogenesis. Investigation of the upstream kinases that regulate NUCKS hinted that this protein might also be involved in DNA damage signalling. The putative promoter of the NUCKS gene contains one consensus E2F-1 site, as well as two potential sites for Sp1 [24,30]. A link between NUCKS modifications and cancer progression is suggested by the observation that changes in its modification state, especially lower phosphorylation and additional sites of lysine acetylation, formylation and monomethylation, were found in breast cancer tissues. Recent genome-wide ChIP sequencing showed that NUCKS can bind more than 1000 genes involved in inflammation and metabolism. It is therefore possible that NUCKS regulates inflammatory genes involved in cancer progression. Further mechanistic study of the function of NUCKS will be needed to establish its role as a therapeutic target in cancer.
NUCKS as a potential biomarker for metabolic diseases
Like the HMGA proteins, which have functions in both cancer and metabolic diseases, NUCKS also has important functions in the metabolic syndromes [51,52]. In a proteomics screen, NUCKS was found to be decreased in white adipose tissue (WAT) from mice fed on an HFD compared with WAT from mice fed on a low-fat diet. Consistent with this, clinical samples showed that obese people have lower levels of NUCKS in adipose tissue and that these levels are inversely correlated with BMI and HOMA-IR. Genetic analysis showed that NUCKS knockout mice gained more body weight and developed insulin resistance. Genome-wide ChIP sequencing showed that NUCKS can bind to metabolic genes and regulate gene expression, especially in insulin signalling. Gene ontology analysis of ChIP-seq data shows that NUCKS binds to many genes related to cytokine secretion, which indicates that NUCKS might have a function in inflammation; chronic inflammation is reported to have a strong relationship with both obesity and diabetes. Mechanistic studies show that NUCKS regulates the structure of chromatin and its accessibility to RNA polymerase II (Pol II), which further contributes to gene transcription (Figures 1A and B). However, microarray analysis additionally shows that loss of NUCKS increases the expression of a certain set of genes (Qiu, B.Y. et al., unpublished data). In mammalian cells, several transcription factors or co-factors, such as MeCP2, have been shown to have dual effects in regulating transcription. It is possible that NUCKS similarly acts as both activator and repressor via direct or indirect mechanisms.
Figure 1. Diagram of action of NUCKS
The role of HMGA proteins in insulin signalling has previously been investigated. Defects in the HMGA1 gene cause impaired insulin signalling and insulin secretion in both humans and mice. Furthermore, functional variants of HMGA1 are reported to be early predictive markers for the metabolic syndrome [52,57]. Given the similarity in protein characteristics between NUCKS and the HMGA proteins, further study of NUCKS polymorphisms may help to elucidate the function of NUCKS in metabolic diseases. Moreover, as NUCKS regulates genes directly involved in insulin signalling, the tissue specificity of NUCKS in regulating glucose homoeostasis will also be worthy of study because the insulin receptor has distinct functions in various tissues.
CONCLUSION AND PERSPECTIVES
Since the original description of NUCKS 20 years ago, there have been several reports indicating that it may have a potential role in carcinogenesis, and that its presence in tumour samples is a marker of poor prognosis for many human carcinomas. However, the mechanisms of NUCKS in carcinogenesis still need to be elucidated through more in-depth studies both in vitro and in vivo. Moreover, as expression of NUCKS correlates with cancer incidence and recurrence, the analysis of its protein or mRNA expression levels could help in predicting patient prognosis. Thus, development of standardized and quantitative analysis of levels of NUCKS in patient samples may have potential clinical benefits. Investigation of the tissue specificity of NUCKS will be another avenue for understanding the functions of NUCKS. As NUCKS helps to maintain normal glucose homoeostasis and body weight, it may have potential roles in metabolic diseases. Regulation of levels of NUCKS in different tissues might result in a wide range of applications. Overall, these levels have been shown to be associated with various types of cancer and metabolic diseases. These clinical observations underscore the potential importance of NUCKS in human diseases. Metabolic stress, differential localization and PTMs of NUCKS may lead to its diverse functions, which could result in selective regulation of gene transcription (Figure 2). Detailed mechanistic studies that identify both upstream signalling culminating in NUCKS's modifications and function, and elucidate its downstream roles will contribute to the understanding of the role of NUCKS and whether there is a molecular link between cancer and the metabolic syndromes.
Figure 2. NUCKS is a potential biomarker in both cancer and metabolic disease
BMI, body mass index
CGH, comparative genomic hybridization
GFP, green fluorescent protein
H2AFO, H2A histone family, member O
HOMA-IR, homoeostasis model assessment of insulin resistance
NLS, nuclear localization signal
NUCKS, nuclear, casein kinase and cyclin-dependent kinase substrate
PDEF, prostate-derived Ets factor
Pol II, RNA polymerase II
WAT, white adipose tissue
Research in the laboratories of V.T. and W.H. was supported by the Agency for Science, Technology and Research (A*STAR) Biomedical Research Council.