RNA-binding proteins play a central role in cellular metabolism by orchestrating the complex interactions of coding, structural and regulatory RNA species. The SAFB (scaffold attachment factor B) proteins (SAFB1, SAFB2 and SAFB-like transcriptional modulator, SLTM), which are highly conserved evolutionarily, were first identified on the basis of their ability to bind scaffold attachment region DNA elements, but attention has subsequently shifted to their RNA-binding and protein–protein interactions. Initial studies identified the involvement of these proteins in the cellular stress response and other aspects of gene regulation. More recently, SAFB proteins have been shown to be multifunctional, playing crucial roles in DNA repair, in the processing of mRNA and regulatory RNA, and in interactions with chromatin-modifying complexes. With the advent of new techniques for identifying RNA-binding sites, enumeration of individual RNA targets has now begun. This review aims to summarise what is currently known about the functions of SAFB proteins.
In this review, we discuss three scaffold attachment factor B (SAFB) proteins [SAFB1, SAFB2 and SAFB-like transcriptional modulator, SLTM] that bind both DNA and RNA. Proteins of this type, which have dual nucleic acid-binding capability, may constitute as much as 2% of the proteome and have been termed ‘DRBP’ (DNA–RNA-binding proteins). Predictably, a high proportion of these DRBPs are involved in the regulation of transcription and processing of newly transcribed RNA, but a surprisingly high proportion is also involved in DNA repair, apoptosis and the response to cellular stresses, such as heat shock. This wide range of functions also applies to the SAFB family (Figure 1A). So far, SAFB1 has been the most intensively studied of the three proteins. Because some of the antibodies that have been used may cross-react, it is not always clear which of the three proteins was being investigated. In what follows, ‘SAFB’ will be used when a characteristic may refer to one or more of the proteins. SAFB is widely expressed in vertebrates, including fish, birds, reptiles and mammals. It is also found in Drosophila. Such conservation over at least 500 million years of evolution is a testament to the importance that these proteins must have in cellular processes.
Figure 1. Function and structure of the SAFB proteins.
SAFB proteins and the ‘nuclear matrix’
The extraction of nuclei with detergents and/or high salt concentrations followed by DNase leaves an insoluble array of filaments described as ‘nuclear matrix’ or ‘nuclear scaffold’. Two groups identified AT-rich DNA sequences that bind matrix extracts, which they named scaffold attachment region (SAR) or matrix attachment region (MAR) [5,6]. In 1996, while investigating S/MAR DNA elements, Renz and Fackelmayer reported the purification of a protein (and the cloning of the corresponding cDNA) that binds S/MAR elements. They had previously described another protein that binds S/MAR elements, which they had named SAF-A (scaffold attachment factor A, also known as hnRNPU), so the new protein was named SAF-B. Independently, Oesterreich et al. identified a gene coding for a protein that binds the heat-shock protein 27 promoter, which they named HET, whereas Weighardt et al. identified a protein that interacts with hnRNPA1 and named it hnRNPA1-associated protein, or HAP. SAF-B, HET and HAP are now known to be identical and are currently named SAFB1. The related SAFB2 gene was later mapped adjacent to SAFB1 on human chromosome 19p13.3-p13.2, with the two genes arranged in a bidirectional (head-to-head) configuration and a short intervening sequence acting as a bidirectional promoter. The third gene in the family, SLTM, is more distantly related and is located on chromosome 15q22.1 [12,13].
The concept of a nuclear matrix or scaffold has been the subject of some controversy [14,15], and it has been suggested that the apparent structure might be an artefact produced during isolation. It was perhaps unfortunate that the terms matrix and scaffold were used, since they suggest a rigid skeleton supporting chromatin, a concept that is difficult to reconcile with the dynamic nature of chromatin. A less emotive name to describe this group of nuclear proteins might be preferable, particularly because findings reported by the Rosenfeld group  and others show that it is time for a reappraisal of this field (see below).
All three proteins have a well-conserved DNA-binding SAP (SAF-A/B, Acinus and PIAS) domain, an RNA-binding domain (RBD) and an arginine/glycine-rich RGG/RG motif (Figure 1B). Other than these domains, the sequence that is most highly conserved between the three proteins is an arginine/glutamic acid (RE)-rich domain at the carboxy-terminal end of the molecule, which partially overlaps with a putative coiled-coil domain and is likely to be involved in protein–protein interactions.
The three SAFB proteins have a single RBD (also known as an RRM, or RNA recognition motif). The RBD, which usually consists of 80–90 amino acids, is one of the most common protein domains, being found in ∼0.5–1% of human genes . The RBD forms a four-stranded antiparallel β-sheet with two helices packed against it which can recognise four nucleotides, with up to eight nucleotides being recognised if additional elements are utilised. Single RBDs generally have limited capacity to interact with RNA in a sequence-specific manner, so for increased specificity and affinity for longer sequences two or more RBDs may be required, either gained by homo- or hetero-dimerisation of proteins, or by co-operation with other RNA-binding domains (reviewed in Lunde et al.  and Cléry et al. ). It is now becoming clear that RBDs are also important sites of protein–protein interactions — indeed, some RBDs are only capable of binding proteins and do not bind RNA .
An RGG/RG motif (aa 868–875 in SAFB1) is located near the carboxy-terminus of SAFB1, SAFB2 and SLTM. These motifs are targets for methylation and are involved in many important cellular functions, such as nucleic acid binding and protein–protein interactions. Most notably, the RGG/RG motif of SAFA is necessary and sufficient for RNA binding [22–24], raising the possibility that the RGG/RG motif of SAFB is also involved in recognising and binding RNA.
Several regions of amino acid compositional bias, as well as well-characterised motifs, are located in the C-terminal region of the SAFB proteins, a region which appears to be crucial for protein–protein interactions and post-translational modifications that are associated with cellular localisation as well as functions related to the stress response and DNA repair [25–27]. Apart from the SAP and RBD domains, the most striking amino acid conservation between the three proteins occurs in a sequence of 73 amino acids (aa 656–728 in SAFB1) located in an RE-rich region, where there is 72% identity between SAFB1 and SLTM, and 90.5% identity between SAFB1 and SAFB2. This region, designated SAFB domain in Figure 1B, overlaps, to a large extent, with a dimeric coiled-coil domain (aa 641–698 in SAFB1). Coiled-coil domains are frequently involved in the formation of homo- or hetero-meric protein complexes .
Other domains and motifs
Protein synthesis occurs in the cytoplasm, whereas the SAFB proteins are located predominantly in the nucleus, so a mechanism must exist for transporting the newly synthesised proteins through the nuclear pore into the nuclear compartment. Putative nuclear localisation signals (NLS; www.uniprot.org) occur in SAFB1 and SAFB2 (aa 599–616 in SAFB1), but have not been identified in SLTM. Many RNA-binding proteins shuttle between the nucleus and the cytoplasm, and sequences that are involved in nuclear localisation, but do not bear any obvious similarity to classic NLS sequences, have been found in many of them, including hnRNPA1 and hnRNPK. Unlike classic NLS, many of these sequences seem to function in both import into and export from the nucleus. In the TIA1 and TIAR proteins, for example, one of the RBDs, together with a glutamine-rich C-terminal auxiliary domain, acts as a nuclear localisation signal.
Reversible post-translational modification of proteins, often regulated by enzymes that add (‘writers’) or remove (‘erasers’) moieties to individual amino acids, allows rapid changes in interaction with other proteins (‘readers’) that recognise the modifications . Such modifications include phosphorylation, acetylation, sumoylation, ubiquitylation, methylation and poly(ADP-ribosyl)ation (‘PARylation’), all of which have been detected in SAFB proteins. These processes provide mechanisms for rapid responses to changes in the cellular environment and are central to signalling pathways involving protein–protein interactions, cellular localisation, degradation and allosteric changes regulating enzyme activity.
Phosphorylation of SAFB was first noted by Nayler et al., and more recent studies have shown that all three SAFB proteins are subject to extensive phosphorylation of serine and threonine residues (the full range of modifications and sites is available at Phosphosite.org).
SAFB2 has also been shown to be a substrate for ubiquitylation with lysine-6-linked chains by the BRCA1/BARD1 complex. Unlike the more common lysine-48-linked ubiquitylation, which generally targets proteins for degradation, lysine-6-linked chains appear to result in increased levels of SAFB. Also, predicted sumoylation sites are present in all three SAFB proteins, and sumoylation has been demonstrated for both SAFB1 (Lys 231 and Lys 294) and SAFB2 [36,37]. Golebiowski et al. reported sumoylation of SAFB1 linked to the stress response, whereas Garee et al. provided evidence for sumoylation being involved in the repression of transcription by SAFB1. Most recently, sumoylated SAFB was shown to enhance transcription of ribosomal protein genes.
Methylation of Lys and Arg residues of histones has long been associated with the regulation of histone function, but with the availability of new methodologies, it is becoming apparent that methylation of non-histone proteins is of major importance in multiple signal transduction pathways . Thus, the mono- and di-methylation, which has been shown to occur on the RGG/RG motif of SAFB1 , is likely to be particularly significant.
Poly(ADP-ribose) polymerases (‘PARPs’) use NAD+ as a donor to add mono-ADP-ribose to Arg, Lys, Asp and Glu residues of target proteins. Further additions result in the attachment of polymers, a process known as poly(ADP-ribosyl)ation, or PARylation. Massive increases in PARylation of histones and other proteins are among the earliest responses to DNA damage, and SAFB1 was shown to be one of the proteins subject to PARylation in response to genotoxic stress.
Expression and localisation
The SAFB proteins are widely expressed, with particularly high expression of SAFB1 and SAFB2 in the human central nervous and immune systems [11,42]. Using specific antibodies , we also investigated the expression and localisation of SAFB proteins in neuronal tissues (Figure 2). Following the immunocytochemical staining of sagittal mouse brain sections, we found that SAFB1 expression is particularly prominent in the hippocampus and cerebellum (Figure 2A). Interestingly, SAFB1 expression is high throughout the hippocampus (CA1, CA2, CA3 and dentate gyrus), and staining was exclusively nuclear (Figure 2A). In Drosophila, a single protein with homology to the SAF proteins (cg6995) is also highly expressed in the nervous system , where expression is highest during the first 12 h of embryogenesis. There also appears to be a degree of tissue specificity in relative expression of the proteins, so, for example, SAFB2 is highly expressed in Sertoli cells where SAFB1 is hardly detectable .
Figure 2. Immunocytochemical analysis of SAFB protein expression.
The intracellular distribution of SAFB proteins can change with conditions and also appears to be cell-type specific. Typically, SAFB1 and SAFB2 are limited to a punctate distribution within nuclei, with a partial overlap between the two proteins. Although some co-localisation can be detected at the punctate foci, SAFB2 was found to be more diffusely distributed in the nuclei of HeLa cells, while both SAFB1 and SAFB2 were found to be more diffusely distributed in nuclei of HEK293 cells. The distribution of SAFB1 and SAFB2 also overlaps with that of Sam68 , a member of the STAR family of proteins which link cell signalling to RNA metabolism. Although SAFB is excluded from nucleoli, it is often found located in the perinucleolar region [10,11,26,44]. The nuclear localisation of SLTM is similar, and, to a large extent, it co-localises with SAFB1, though not completely . SAFB2 expression has been reported in the cytoplasm of HeLa and MCF-7 cells . In non-neuronal (unstressed) HeLa cells, SAFB is associated with perichromatin fibrils (fine structures visible by electron microscopy and adjacent to transcriptionally active chromatin which are thought to be sites of active pre-mRNA processing) and, to a lesser extent, with condensed chromatin . Although SAFB was isolated on the basis of its ability to bind S/MAR DNA sequences, association of SAFB with chromatin appears to be dependent primarily on direct or indirect binding to RNA. Thus, in HeLa cells, SAFB is found in the insoluble nuclear extracts, but is released by RNase. Also, inhibition of transcription causes redistribution of SAFB to the nuclear periphery, again suggesting that association with chromatin depends on the presence of RNA . In Drosophila, SAFB is localised to specific bands of polytene chromosomes in salivary glands which sometimes — but not always — overlap with bands of active RNA polymerase II . 
Digestion with RNase abolishes much of the chromatin binding in Drosophila, again indicating that SAFB is primarily linked to RNA rather than DNA. This conclusion was supported by the finding that removal of the SAP domain did not affect nuclear localisation of SAFB .
We also investigated the localisation of SAFB proteins in cortical and hippocampal primary neuronal cultures (not previously published; the methods used are described in Howarth et al.). While SAFB1 expression was found to be primarily nuclear, SAFB2 expression was found in the nuclei and dendrites of cortical and hippocampal neurones (Figure 2B,C). Intriguingly, SLTM was found primarily in the nuclei of cortical and hippocampal neurones, but there was also some somatodendritic staining. The results also show that SAFB2 and SLTM co-localise in the same dendritic puncta, suggesting that they may have roles in mRNA processing and/or transport.
Since SAFB was isolated on the basis of its ability to bind S/MAR DNA elements [7,8], there has been little subsequent information published on its DNA-binding properties, or on the function of its SAP domain. However, in view of the similarities between SAFB and SAFA proteins, the results of studies on the SAP domain of SAFA could well be informative. Thus, Kipp et al.  found that the isolated SAP domain of SAFA (hnRNPU) bound weakly to S/MAR DNA, with high specificity binding requiring protein–protein interactions between multiple SAP domains, a phenomenon they termed ‘mass-binding principle’. More recent studies on the SAFA-SAP domain indicate an important role in chromatin regulation (see below), though it remains unclear whether or not binding to S/MAR DNA is involved.
Binding to mRNA and pre-mRNA
Two recent mass spectrometric analyses of proteins bound to mRNA [46,47] identified SAFB1, SAFB2 and SLTM among ∼800 RNA-binding proteins associated with mRNA. Since SAFB1 is localised to perichromatin fibrils and is known to bind mRNA, it seemed likely that at least some of the SAFB is bound to newly transcribed pre-mRNA and involved in mRNA processing. Because target RNA sequences had not previously been identified for SAFB proteins, we used individual nucleotide resolution cross-linking and immunoprecipitation (iCLIP) with deep sequencing [48,49] to identify binding sites and determine which RNA species are bound by SAFB1.
Using the SH-SY5Y neuroblastoma cell line, the distribution of iCLIP tags showed that SAFB1 binding is enriched primarily within open reading frames, the highest density of cross-linked sites in exons being adjacent to intron/exon boundaries . Enrichment was also apparent in 3′ and 5′ untranslated regions (UTRs), as well as in non-coding RNAs. The motif most significantly enriched at tagged sites was the purine-rich pentamer GAAGA, with the trimers GAA, AAG and AGA being the core motifs most likely to be recognised by SAFB1. The iCLIP technique was also used to investigate the interaction of SAFB1 with RNA in MCF-7 breast cancer cells, producing broadly similar results with enrichment in open reading frames as well as in 3′ and 5′ UTRs, but with particularly high enrichment in non-coding RNA . Although it might be presumed that binding is mediated by the RBD domain, there is also the possibility that — as is the case with SAFA  — the RGG/RG domain is involved. Gene ontology analysis of tagged mRNA species predicted that SAFB1 is binding RNA expressed from the genes involved in chromosome organisation, RNA processing, the cellular response to stress as well as neurone projection and neurogenesis.
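The idea of a motif being "enriched at tagged sites" can be illustrated with a minimal k-mer counting sketch. This is not the analysis pipeline used in the iCLIP studies cited above; it simply compares k-mer frequencies in windows around cross-link sites with those in background windows. The function names, the pseudocount and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_enrichment(bound, background, k, pseudocount=1.0):
    """Return {kmer: frequency in bound windows / frequency in background}.

    A pseudocount is added to every k-mer so that motifs absent from the
    background do not cause division by zero.
    """
    fg = count_kmers(bound, k)
    bg = count_kmers(background, k)
    n_kmers = 4 ** k
    fg_total = sum(fg.values()) + pseudocount * n_kmers
    bg_total = sum(bg.values()) + pseudocount * n_kmers
    scores = {}
    for kmer in ("".join(p) for p in product("ACGT", repeat=k)):
        f = (fg[kmer] + pseudocount) / fg_total
        b = (bg[kmer] + pseudocount) / bg_total
        scores[kmer] = f / b
    return scores


# Hypothetical toy windows (cDNA alphabet), NOT real iCLIP data.
bound_windows = ["TTGAAGATT", "CGAAGAAGC", "AAGAAGACC"]
background_windows = ["TTCCGGTTA", "CGTACGTAC", "ACCTGGTCA"]

scores = kmer_enrichment(bound_windows, background_windows, k=5)
top_kmer = max(scores, key=scores.get)
print(top_kmer, round(scores[top_kmer], 2))
```

In practice, the background would be drawn from matched transcript regions and significance assessed by permutation or a similar test; the ratio above is only the simplest possible enrichment score.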
Binding to non-coding RNA
Long non-coding RNAs (lncRNAs) are enriched within the chromatin-associated fraction, and many have been implicated in the recruitment of chromatin-modifying complexes, such as polycomb repressive complex 2 (PRC2), as well as 3D nuclear organisation . Using the iCLIP assay, we have found that SAFB1 binds to several lncRNA transcripts (MALAT1, NEAT1, TUG1 and XIST), which have been linked to regulation of gene expression . Both MALAT1 and NEAT1 have been found to associate with genes actively transcribing pre-mRNAs, predominantly around transcription termination sites [52,53], while TUG1, together with MALAT1, is involved in determining PRC2 function . The silencing of an entire X chromosome during female development is orchestrated by XIST, which coats the inactive chromosome. Interestingly, SAFB1, SAFB2 and SLTM have all been shown to bind Xist . Although a possible role for the SAFB proteins in X chromosome inactivation has not been investigated, SAFA is known to be essential for chromosomal localisation of XIST, the SAP domain as well as the RGG/RG domain being involved . With the increasing realisation that non-coding RNAs (such as long interspersed nuclear elements, LINEs) are likely to be important components of chromatin structure , recent studies with the SAP domain of SAFA are of particular interest and may be relevant to understanding SAFB function. For example, the dominant negative C280 mutant of SAFA, which lacks the N-terminal sequence including the SAP domain, was shown to cause the release of LINE RNA and consequent chromatin condensation. These results suggest that the SAP domain might play an important role in maintaining chromatin structure. SatIII repeat RNA is another non-coding RNA, which seems to play an important role in SAFB function during the creation of ‘nuclear stress bodies’ (see below) associated with chromosome 9 in response to cellular stress .
Finally, SAFB1 was found to bind many microRNA transcripts, including the miR-17-92 cluster. Knockdown of SAFB1 decreased expression of mature miR-19A but increased expression of the primary miR-17-92 transcript, indicating that SAFB1 is required for processing of the primary transcript to mature miR-19A.
RNA-binding proteins are generally multifunctional molecules that interact with many other proteins in addition to RNA, and they are frequently found in large macromolecular complexes with which they may only associate transiently . There are numerous reports of SAFB proteins participating in protein–protein interactions (e.g. thebiogrid.org), but it is important to consider the methodology used for identifying these interactions since experimental conditions will inevitably affect the results obtained. Two-hybrid assays seem particularly prone to false-positive results, but false-negative results can arise with other assays if interactions are weak. Co-immunoprecipitation assays may provide a more physiologically relevant picture of interactions, but results can be influenced by the choice of extraction conditions. High salt concentrations will allow detection of stable interactions, but may miss weaker interactions which could be important. Conversely, weaker interactions may be maintained in the presence of low salt, but there may be a risk of artefactual interactions. Also, co-immunoprecipitation does not distinguish between direct protein interactions and interactions mediated by ‘bridging’ molecules, such as other proteins or nucleic acids.
Formation of homo- and hetero-dimers of SAFB1 and SAFB2, as well as interactions with several hnRNP proteins, has been observed (see Table 1 for a summary of some of the proteins shown to interact with SAFB proteins). The hnRNP proteins comprise a family of abundant RNA-binding proteins that bind nascent RNA transcribed by RNA polymerase II. They play important roles in regulating transcription, processing and transport of mRNA , and SAFB proteins share many characteristics with this group of proteins. Many hnRNP proteins are immunoprecipitated as a complex with SAFB, but co-immunoprecipitation does not occur in the presence of RNase, indicating that interactions are likely to depend on binding to associated RNA , possibly in combination with weak protein–protein interactions. The SR (serine–arginine-rich) proteins, together with hnRNP proteins, play a central role in splicing and maturation of pre-mRNA . Evidence for functional interactions with some SR proteins is strengthened by the finding that under certain circumstances, some of the proteins are co-localised with SAFB. SR proteins are concentrated in nuclear ‘speckles’, which are thought to be involved in the assembly, modification and/or storage of the pre-mRNA splicing machinery. The lncRNA MALAT1 influences the distribution of SR proteins to nuclear speckles, as well as their phosphorylation . On the basis of experiments using the C-terminal half of SAFB, it was thought initially that SAFB co-localised with these nuclear speckles , but when full-length SAFB was used it became apparent that this was not the case . In response to heat stress, several SR proteins (SRSF1, SRSF3, SRSF7 and SRSF9) translocate to nuclear stress bodies (see below) where they co-localise with SAFB, but other SR proteins, including SRSF2 (SC35), which is used as a marker of nuclear speckles, do not . 
STAR proteins belong to the large KH (hnRNP K homology) domain family of RNA-binding proteins, and STAR proteins which interact with SAFB include Sam68, SLM-1 and SLM-2/T-STAR [25,26,63]. Under basal conditions, Sam68 co-localises with SAFB, and with heat stress, SAFB and Sam68 translocate to nuclear stress bodies. The interaction with Sam68 is mediated by the glutamate/arginine (ER)-rich domain of SAFB1, which includes the coiled-coil domain.
Table 1. Some of the proteins shown to interact with SAFB proteins.

| Function | Class | Interacting proteins |
|---|---|---|
| RNA processing | SAFB proteins | SAFB1/SAFB2 (homo- and heterodimerisation) [11,25] |
| | hnRNP proteins | hnRNPA1, hnRNPC, hnRNPD, hnRNPG [25,84,121], hnRNPI, hnRNPK [10,122], hnRNPU |
| | SR proteins | SRSF1, SRSF7, SRSF9 [26,33], SREK1, SRRM1 |
| | SR protein kinase | SRPK1 |
| Chromatin | | CHD1, NCOR [37,71], HDAC3 [37,71], BRG1, Matrin3 |
| Transcription | | RNAPolII, TAFII68, steroid receptors [67–69], P53 |
| Miscellaneous | | PIAS1, ZO-2, Zbed4 |
Given the considerable evidence for interaction with proteins involved in splicing, it is not surprising that SAFB has been identified as a component of spliceosomes, macromolecular complexes of snRNA (small nuclear RNA molecules U1, U2, U4, U5 and U6) and numerous proteins (together forming small nuclear ribonucleoproteins, snRNPs), which assemble on newly formed pre-mRNA to catalyse the removal of introns. SAFB does not appear in a more recent analysis of spliceosomes, however, suggesting that association may be transient and dependent on many weak interactions. Some differences in the association of SAFB proteins with macromolecular complexes are suggested by the finding that SAFB1 appears to be monomeric when extracted under high salt conditions and centrifuged on a sucrose gradient after micrococcal nuclease treatment, whereas SAFB2 was found in higher molecular weight complexes of ∼670 kDa.
In addition to interactions with other RBPs and SR proteins, SAFB also binds to many proteins involved in transcription (including RNA polymerase and transcription factors, such as steroid receptors [67–69] and p53 ), as well as many proteins and protein complexes that play important roles in determining chromatin structure and function, such as NCOR and HDAC3 [37,71], CHD1 , BRG1  and matrin3 .
In a study using MCF-7 breast cancer cells, knockdown of SAFB1 and SAFB2 resulted in the induction of 457 genes and repression of 259 genes . Although it is difficult to be sure whether regulation of individual genes is direct or indirect, this suggests that around one-third of genes may be regulated positively. SAFB has, nevertheless, gained a reputation as a negative regulator, and there is an extensive literature describing inhibitory effects of SAFB on gene expression. One of the first reports described binding of SAFB to the promoter of the heat-shock protein 27 gene and consequent inhibition of expression . Subsequently, investigations carried out by the Oesterreich group have related to repressive effects of SAFB on oestrogen receptor signalling (reviewed in Oesterreich  and Garee and Oesterreich ). As with many other RBPs [11,77–80], overexpression of SAFB proteins produces generalised inhibition of RNA synthesis and eventual apoptosis [11,13]. This characteristic complicates interpretation of effects on gene expression associated with artificially raised levels of SAFB, though in some cases there is independent evidence that SAFB proteins are associated with gene silencing. For example, significant amounts of all the three SAFB proteins are found in regions of chromatin marked by the modified histone H3K9me3, which is generally considered to identify regions of gene repression .
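The arithmetic behind the "around one-third" estimate can be made explicit. The short sketch below assumes, as the paragraph implies, that genes repressed after knockdown are those normally regulated positively by SAFB1/SAFB2, and that genes induced after knockdown are those normally repressed. The variable names are illustrative only.

```python
# Gene counts reported after SAFB1/SAFB2 knockdown in MCF-7 cells.
induced_on_knockdown = 457    # expression rises when SAFB is lost: normally repressed
repressed_on_knockdown = 259  # expression falls when SAFB is lost: normally activated

total_changed = induced_on_knockdown + repressed_on_knockdown

# Fraction of responsive genes that appear to be positively regulated by SAFB.
fraction_positive = repressed_on_knockdown / total_changed
print(f"{fraction_positive:.1%} of responsive genes appear positively regulated")
```

This treats every expression change as a direct effect, which, as noted above, cannot be assumed for individual genes.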
More recent studies have provided some intriguing insights into mechanisms by which SAFB proteins might repress gene expression, which in several cases indicate processes involving interaction with lncRNA and with chromatin-modifying complexes such as Polycomb. Thus, SAFB1 was shown to repress the ability of the androgen receptor to regulate transcription, with knockdown of SAFB1 in cultured prostate cells resulting in increased transcription of prostate-specific antigen (PSA) and other genes. SAFB1 was shown to associate with components of the Polycomb PRC2 complex (EZH2, SUZ12 and EED) and to co-localise with the complex on the PSA promoter. Xanthine oxidoreductase (XOR) expression is also repressed by SAFB1, apparently as a consequence of SAFB1 interacting with the chromatin organiser BRG1 and components of the DNA–protein kinase complex at the XOR gene promoter.
It is not the case that all SAFB1 regulation of transcription is repressive, however. For example, SAFB1 functions as a positive regulator of myogenic differentiation, with knockdown of SAFB1 inhibiting the expression of skeletal muscle genes . In this case, interaction with the polycomb PRC2 complex was suggested by the persistence of the Ezh2 component of PRC2 and the repressive histone marker H3K27me3 after SAFB1 knockdown. Thus, SAFB1 appears to be necessary for the transition of chromatin from repression to the active state required for myogenesis.
More direct evidence for a function of SAP domains (at least that present in SAFA) comes from studies by the Rosenfeld group on the POU-homeodomain transcription factor Pit1, which initiates differentiation of hormone-secreting cells in the pituitary. The association of Pit1 with other proteins, including SatB1 (a global chromatin organiser protein which also binds S/MAR DNA regions) and SAFA, was found to be necessary for its association with a ‘matrin-3-rich network’ (essentially scaffold matrix by another name) in the nucleus and activation of target genes such as growth hormone . A naturally occurring mutant of Pit1 (R271W), which causes combined pituitary hormone deficiency, cannot bind SatB1 and does not translocate to the matrin-3-rich network. When the SAP domain of SAFA was grafted on to the mutant Pit1, however, the hybrid protein was found in the matrin-3-rich network and activation of growth hormone transcription was restored. Thus, appropriate sub-nuclear localisation of Pit1 mediated by a SAP domain seems to be essential for Pit1 function. Both SAFB and matrin-3 have long been recognised as core scaffold matrix proteins, and co-localisation of the two proteins has been demonstrated . At first sight, it may seem surprising that SAFB can exert both repressive and positive effects by what appear to be similar mechanisms, but another study by the Rosenfeld group might indicate some of the processes which could be involved . Methylation of PRC2 was found to determine which lncRNA, TUG1 or MALAT1, the complex binds to. Methylated PRC2 was found to bind TUG1, maintaining growth control loci in the repressive polycomb bodies. In contrast, in response to serum stimulation, unmethylated PRC2 binds MALAT1 and localises to interchromatin granules where gene expression is activated. Another important instance of positive regulation of gene expression has been identified by Liu et al. . 
SAFB was shown to recruit SUMO-1 to gene promoters, where the sumoylated SAFB enhanced RNAPolII activity at ribosomal protein genes as well as subsequent processing of mRNA.
Chromatin immunoprecipitation (ChIP) has been used to search for potential SAFB-binding sites in promoter regions of genes that are regulated by SAFB. Thus, Omura et al.  found that SAFB1 induced expression of SREBP-1c and used ChIP to demonstrate association of SAFB1 with the promoter region of the gene. In a separate experiment, they could not demonstrate direct binding to the DNA sequence, suggesting that the association might have been via an intermediary molecule(s). Hammerich-Hille et al.  performed a more detailed ChIP analysis with promoter sequences (ChIP-on-chip microarrays) using MCF-7 breast cancer cells and detected association with 541 promoters, but there was no significant overlap between the promoters identified and genes whose expression was modulated by SAFB1 knockdown, again suggesting that much of the gene regulation was indirect.
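Whether two gene lists overlap "significantly" is commonly assessed with a hypergeometric test. The sketch below is one way such a comparison could be made; it is not necessarily the method used in the study cited. Only the figure of 541 bound promoters comes from the text. The promoter universe, the number of regulated genes and the observed overlap are hypothetical placeholders.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # Exact integer arithmetic; converted to float only at the final division.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


promoters_on_array = 20000   # hypothetical size of the promoter universe
bound_promoters = 541        # promoters with SAFB1 association (from the text)
regulated_genes = 700        # hypothetical number of knockdown-responsive genes
observed_overlap = 20        # hypothetical number of genes in both lists

expected_overlap = bound_promoters * regulated_genes / promoters_on_array
p_value = hypergeom_sf(observed_overlap, promoters_on_array,
                       bound_promoters, regulated_genes)
print(f"expected overlap ~{expected_overlap:.1f}, P(X >= observed) = {p_value:.3f}")
```

With placeholder values such as these, the observed overlap is close to what chance alone would produce, which is the situation described above as "no significant overlap".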
Park et al.  used RNA interference in Drosophila to test the role of 200 RNA-binding proteins in alternative splicing and identified 47 splicing regulators. They found that SAFB had a striking effect on alternative splicing of exon 4 in Dscam, the most developmentally regulated exon. Splicing of the Tra2β minigene is inhibited by SAFB1  and SAFB2 . This effect occurs even after deletion of the RBD, suggesting that it may be mediated either by protein/protein interactions or by binding to RNA via the RGG/RG motif.
Alternative splicing is particularly prevalent in the brain, and a markedly higher proportion of these events (which are enriched in genes involved in synaptic transmission, axon guidance and neural development) is conserved in the brain compared with other organs. Microarray analysis of neuronal SH-SY5Y cells after SAFB1 knockdown identified significant down-regulation of 79 exons and up-regulation of 87 exons in a variety of genes. Interestingly, genes in which alternative splicing is regulated by SAFB1 include NCAM1, ASTN2 and PDE4B, which are known to play important roles in regulating synaptic function and have also been implicated in human psychiatric disease. Further experiments will be required to determine whether splicing of these genes is altered via direct binding of SAFB1, or whether the regulation is exerted via an indirect mechanism. Evidence for a direct effect was provided by experiments with an NCAM1 minigene, which showed that knockdown of SAFB1 reduced expression of the 9–10 isoform, but that mutation of AAG/AGA/GAA trimers in exon 9 abolished this effect, thus indicating that in this gene SAFB1 modulates splicing as a result of direct binding to the pre-mRNA. As with other splice regulators, however, it is probable that SAFB1 co-operates with other proteins to exert its effects on splicing. One possible mechanism could involve protein–protein interactions between SAFB and proteins of the SR family, which are key regulators of splicing. SREK1 (also known as SRrp86), for example, has been shown to be antagonised by SAFB. A more generalised effect on splicing could be mediated by an inhibitory interaction between SAFB and the SR protein kinase SRPK1, which phosphorylates specific SR proteins [89,90].
Many findings point to a significant functional interplay between SAFB and SRSF1. Both SAFB1 and SRSF1 bind purine-rich sequences. The most significantly enriched pentamers recognised by SAFB1 have been reported to be GAAGA or GAAAA, whereas the most enriched motifs for SRSF1 were identified as GAAGA or GGAGA. These similarities raise the possibility of functional interactions between SAFB and SRSF1 at RNA recognition sites, perhaps increasing specificity via dimer formation. Also, among other crucial functions, SRSF1 recruits the 70K protein component of the U1 snRNP, thereby initiating spliceosome assembly at the 5′ splice site, and it may well be relevant that all three SAFB proteins bind U1 snRNA. Interestingly, SAFA (hnRNPU) has been shown to exert an indirect but widespread effect on alternative splicing by regulating maturation of U2 snRNP macromolecules, rather than by direct interaction with target pre-mRNA species. Direct interaction between SAFB and SRSF1 (ASF/SF2) has been reported on the basis of a yeast two-hybrid assay, though a more recent study failed to detect co-immunoprecipitation. It may be that there is a weak interaction between the proteins that is stabilised by multiple interactions with other proteins and/or RNA. Overall, it seems probable that SAFB regulates splicing by multiple mechanisms, including RNA binding and protein–protein interactions with other splicing regulators.
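As a rough illustration of how pentamer motifs such as those reported above might be located in a transcript, the following Python sketch counts overlapping k-mers and lists the positions of a given motif. The sequence `toy_seq` and the function names `count_kmers` and `motif_positions` are hypothetical, invented purely for illustration. This is not the enrichment analysis used in the studies cited, which would typically compare motif frequencies at binding sites against a background set.

```python
from collections import Counter

def count_kmers(seq, k=5):
    """Count every overlapping k-mer in a nucleotide sequence."""
    # Normalise case and treat RNA (U) and DNA (T) spellings alike.
    seq = seq.upper().replace("U", "T")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def motif_positions(seq, motif):
    """Return 0-based start positions of a motif, allowing overlaps."""
    seq = seq.upper().replace("U", "T")
    motif = motif.upper().replace("U", "T")
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]

# Hypothetical toy sequence, not taken from any real transcript.
toy_seq = "CCGAAGAAGAUU"
counts = count_kmers(toy_seq)
for motif in ("GAAGA", "GAAAA", "GGAGA"):
    # Counter returns 0 for pentamers that never occur.
    print(motif, counts[motif], motif_positions(toy_seq, motif))
```

Because occurrences are allowed to overlap, a purine-rich stretch such as GAAGAAGA contributes two GAAGA hits, which is one reason why raw counts alone should not be read as evidence of binding.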
Cellular stress, such as that caused by heat shock, triggers a well-characterised response that involves blocking of a range of metabolic activities including transcription and processing of RNA. This response is orchestrated by a group of heat-shock factors (HSF1–HSF4) which activate transcription of a small group of genes encoding ‘heat-shock proteins’. Formation of nuclear stress bodies (nSBs) in primate cells is intimately involved in this process. Formation of nSBs was first observed using antibodies to hnRNP proteins in HeLa cells exposed to heat shock. Later, Sarge et al. showed that HSF1 concentrates in these nSBs. Interestingly, nSBs seem to be limited to primate cells, where they can be induced by heat-shock, chemical and hypertonic stress (for reviews, see Jolly and Lakhotia; Biamonti and Vourc'h).
In 1999, the Biamonti group showed that SAFB also relocates to nSBs. In HeLa cells under basal conditions, SAFB was distributed among numerous small granules throughout the nucleus with the exclusion of nucleoli. Incubation for 45 min or more at 42°C, however, resulted in recruitment to a few large granules. If heat shock for 1 h was followed by incubation at 37°C for 3 h, virtually all SAFB1 was recruited to these sites. That these bodies were indeed nSBs was confirmed by showing that HSF1 was recruited to the same sites, though with different kinetics. After 15 min at 42°C, SAFB locates close to the nuclear envelope while HSF1 begins to form granules. Subsequently, by 45–60 min, HSF1 and SAFB are co-localised. Later, after recovery at 37°C for 3 h, HSF1 returns to a diffuse nuclear distribution while SAFB remains in the granules. Formation of these nSBs requires ongoing transcription , and their sensitivity to RNase, but not DNase, indicates that RNA is a crucial structural feature . Protein–protein interactions, rather than the RNA-binding domain, seem to be involved in recruitment of SAFB1 to nSBs, an interaction where the Arg/Glu-rich domain (including the putative coiled-coil) of SAFB appears to be necessary and sufficient . A post-translational modification linked to changes in SAFB function during the response to heat shock has been identified. Golebiowski et al.  analysed changes in global patterns of protein sumoylation and detected de-sumoylation of SAFB in response to heat stress. Other proteins recruited to nSBs include Sam68 and three SR proteins [SRSF1 (SF2/ASF), SRSF7 (9G8) and SRSF9 (SRp30c)], but not SC35 [26,97]. This selective recruitment has led to the hypothesis that nSBs might function as ‘molecular traps’, which alter the balance of available splicing factors resulting in changes in alternative splicing that are one of the most characteristic cellular responses to stress [98,99].
The realisation that the number of nSBs formed in a cell correlates with ploidy suggested that these bodies form on specific chromosomal targets. This prediction was confirmed when, after heat shock, HSF1 was shown to be recruited to a heterochromatin region on chromosome 9 which is composed largely of long tandem arrays of SatIII repeats . HSF1 drives transcription of SatIII RNAs, which remain associated with the site of transcription and are essential for recruitment of proteins such as SRSF1 and SRSF9 as the nSBs are created. In Drosophila, heat shock causes recruitment of SafB to ‘heat-shock puffs’ 87A–C (sites of Hsp70 transcription) . Again, this recruitment is not dependent on the DNA-binding SAP domain and RNase treatment reduced recruitment.
The DNA damage response
Surprisingly, double-strand breaks in DNA can result simply from normal physiological behaviour, such as exploration of a novel environment . Most recently, Madabhushi et al.  have shown that neuronal activity results in double-strand DNA breaks in promoters that enhance expression of early response genes. Many RBPs have been shown to play a prominent role in DNA repair , and SAFB1 is one such protein . Altmeyer et al. showed that SAFB1 is recruited rapidly and transiently to double-strand breaks in DNA, where it is required for efficient phosphorylation of the histone H2AX (which is in turn required for assembly of the DNA-repair complex). Recruitment of SAFB1 was found to be dependent on PARylation, and a mutant form of SAFB with the C-terminal region deleted (aa 785–917), which includes the RGG/RG motif, was not recruited .
It has frequently been observed that overexpression of RNA-binding proteins induces apoptosis. Thus, for example, overexpression of Sam68 [11,77], TIA-1 and the RBM proteins [79,80] induces apoptosis, and the SAFB proteins do not appear to be an exception. Townson et al. showed that overexpression of SAFB1 inhibited cell growth and markedly decreased colony formation, whereas we found that overexpression of SLTM and SAFB1 (Uney et al., unpublished observation) causes apoptosis. The molecular basis for the induction of apoptosis by high levels of these proteins has not been established. One consequence of this effect is the need for care in interpreting apparent effects of overexpressed SAFB proteins on gene expression when there is a generalised inhibition of transcription (see above section on ‘Transcription’ under the heading ‘SAFB functions’).
In addition to its ability to induce apoptosis, it seems likely that SAFB1 is actively involved in programmed cell death induced by other agents . After induction of apoptosis by agents such as Staurosporine, SAFB1 moves into nucleoli within 15 min. By 2 h, SAFB1 has formed a perinucleolar ring structure. This process does not require either the SAP or the RBD domain, a fragment containing the coiled-coil domain (aa 580–788) being sufficient. It may be relevant that the perinucleolar domain has been implicated in gene silencing . Later, after 4 h, nearly all SAFB1 has been proteolytically cleaved.
SAFB function in vivo and possible roles in human disease
The Oesterreich group has carried out the most extensive investigation of SAFB functions in vivo (reviewed by Garee and Oesterreich  and Hong et al. ). Ivanova et al.  generated SAFB1−/− mutant mice and found that loss of SAFB1 was associated with pre- and post-natal lethality. Of those mice which did survive, there was marked growth retardation and low IGF1. SAFB1−/− males were sterile, hypogonadal and had low testosterone, whereas SAFB1−/− female mice were subfertile with markedly decreased oestradiol and testosterone. Interestingly, fibroblasts derived from SAFB−/− embryos were found to be considerably more liable to lose senescence and acquire immortality than SAFB+/+ cells . SAFB−/− knockout mice also show defects in the development of the haematopoietic system, with increased white blood cell counts and increased signs of infections . More recently, SAFB2 null mice have been created, but in comparison with SAFB1−/− animals considerably fewer defects were apparent. Particularly high expression of both SAFB1 and SAFB2 was noted in the male reproductive tract, with increased weight of testes in SAFB2−/− animals, possibly resulting from altered androgen receptor function in the absence of SAFB2 .
As yet, apart from the defects in the haematopoietic system in SAFB1−/− mice noted by Ivanova et al., the high expression of SAFB in lymphoid tissues has not been investigated directly. In a study using canonical correlation analysis, Tang and Ferreira analysed white blood cell traits and identified an association between lymphocyte counts and a region on chromosome 19 that included the SAFB1 and SAFB2 genes. Differences in SAFB protein expression were also found when 2D electrophoretic analysis was used to compare cell lines derived from lymphomas with a less aggressive clinical course (follicular lymphomas) with the more rapidly progressing mantle cell lymphomas.
Following up on their proposal that SAFB1 and SAFB2 are tumour suppressor proteins that act as co-repressors of oestrogen signalling, the Oesterreich group has carried out many investigations into the possible role of SAFB in breast cancer (reviewed by Garee et al.  and Hong et al. ). SAFB1 mutations were identified in microdissected breast tumours but not in the normal adjacent tissue, and a high loss of heterozygosity (78%) was detected at the SAFB chromosomal locus in invasive breast cancer . Also, low SAFB expression was associated with worse overall survival of breast cancer patients, but did not affect tamoxifen response . Two animal models, MMTV-Wnt-1 oncogenic mice and mice treated with DMBA, were then used to investigate the impact of SAFB1+/− heterozygosity on the development of mammary tumours . No evidence was found for any increased incidence or growth of tumours in the SAFB1+/− mice. Intriguingly, though, dos Santos et al.  identified SLTM as a gene that is probably required for survival of mouse mammary stem cells.
Linkage studies on Swedish families with a hereditary susceptibility to breast cancer implicated a locus at 19p, but no mutations were detected in coding sequences of either SAFB1 or SAFB2 in these families. A recent analysis in which genomes of tumours from 507 patients with breast cancer were comprehensively analysed also casts doubt on the proposed role of SAFB genes in breast cancer. When copy number arrays, DNA methylation, exome sequencing, mRNA arrays and microRNA sequencing studies were analysed, SAFB genes did not appear among those genes identified as being subject to significant genetic changes. Nevertheless, it is highly likely that any loss-of-function mutations affecting any of the SAFB genes will be deleterious and associated with disease(s) yet to be identified. In a recent study, protein-coding regions of genomes from 60 706 individuals were sequenced to identify genes for which there is strong selection against loss-of-function mutations. All three SAFB genes are included among the 3230 genes found in this category.
Processing of RNA is crucial for the considerable neuronal cell division that occurs during embryogenesis and brain development, as well as the maintenance and function of post-mitotic neurones in the mature brain. These processes are orchestrated by an array of RNA-binding proteins that are required for neurogenesis, neurite outgrowth, synapse formation and plasticity. Some aspects of RNA processing, for example, mRNA transport along axons and dendrites, are unique to neurones. Other processes, notably alternative splicing, are particularly active in neurones. Not surprisingly, RNA-binding proteins have been implicated in various neurodevelopmental and neurodegenerative disorders [118,119]. The high levels of SAFB expressed in brain, particularly in the hippocampus and cerebellum, prompted us to perform a preliminary investigation of neural cell function that might be regulated by SAFB. An adenovirus expressing SAFB1 was, therefore, used to transduce primary hippocampal neurones, with the result that dendritic spine size was increased.
Much evidence has accumulated demonstrating an important role of SAFB proteins in fundamental cellular functions, and the case for investigating their role specifically in neuronal cells and neurological/psychiatric disease is now compelling. As appreciation of the wide range of RNA species with crucial cellular functions has increased, so has understanding of the important role RBPs play in the maturation and regulation of these RNAs. The SAFB proteins belong to an important multifunctional sub-group of ∼400 proteins which, in addition to binding RNA, are also capable of binding DNA as well as other proteins. A surprisingly large number of the ‘DRBPs’ in this group are involved in DNA repair and the cellular response to stress, in addition to regulation of transcription and mRNA processing, as noted by Hudson and Ortlund. Around half of the DRBPs are transcription factors with a single domain capable of binding both DNA and RNA, thus creating possibilities for (a) competition (for example with the RNA acting as a decoy) and (b) regulating transcription by binding DNA and also controlling processing of transcribed RNA. Because proteins such as SAFB have separate domains capable of binding DNA and RNA, more complex regulatory functions become available to them. For example, being tethered to DNA they could bind lncRNA molecules capable of providing a scaffold for the recruitment of other proteins and protein complexes. A single lncRNA molecule can provide multiple protein-binding sites, and many lncRNAs have been shown to recruit transcriptional regulators and chromatin-modifying complexes such as PRC2. On the basis of information currently available, as well as indirect evidence provided by knowledge of SAFA functions, Figure 3 shows a hypothetical outline for a novel mechanism by which SAFB proteins might interact with other components of chromatin.
It is probable that further study of such mechanisms could lead to resolution of the long-standing disagreement over the significance of ‘nuclear matrix’ proteins and their relationship to S/MAR binding by SAFB proteins, perhaps providing a synthesis with current understanding of 3D chromatin structure and a rational context for integrating the SAFB proteins into an overall picture of cellular function. Studies demonstrating effects of SAFB1 depletion on both alternative splicing and pol II-mediated transcription support the hypothesis that SAFB1 can regulate gene expression by coordinating transcription and RNA processing, a role that may be particularly important in neurones where SAFB1 is highly expressed.
Figure 3. Hypothetical model depicting a mechanism whereby SAFB proteins could recruit regulatory complexes to chromatin by simultaneously interacting with DNA and lncRNA.
It is already clear that SAFB plays important roles in cellular differentiation as well as the response to stress and DNA damage, and it seems safe to predict that many other important functions will come to light in the future. There is now a pressing need to clarify the roles played by the various domains of SAFB proteins. Can the SAP DNA-binding domain of SAFB (like the SAP domain of SAFA) direct transcription factors to specific chromatin sites, and does it play a role in recruiting lncRNA and chromatin-modifying complexes such as PRC2? Are these possibilities compatible with studies showing that localisation of SAFB to chromatin is dependent on binding to RNA? Does binding to RNA depend on the RBD and/or the RGG/RG domain? Further study of post-translational modifications, such as sumoylation and methylation, is also likely to illuminate SAFB functions.
Abbreviations
BRCA1/BARD1, breast cancer type 1 susceptibility protein/BRCA1-associated RING domain protein 1
CCA, canonical correlation analysis
Hsp27 ERE TATA SMAR
hnRNP, heterogeneous ribonucleoprotein particle
IGF1, insulin-like growth factor 1
KH, hnRNP K homology
LINEs, long interspersed nuclear elements
lncRNAs, long non-coding RNAs
MAR, matrix attachment region
NLS, nuclear localisation signal
nSBs, nuclear stress bodies
PRC2, polycomb repressive complex 2
RRM, RNA recognition motif
SAFB, scaffold attachment factor B
SAP, SAF-A/B, Acinus and PIAS
SAR, scaffold attachment region
SLTM, SAFB-like transcriptional modulator
snRNA, small nuclear RNA
snRNPs, small nuclear ribonucleoproteins
SUMO1, small ubiquitin-related modifier-1
TIA-1, T-cell-restricted intracellular antigen-1
We thank the Biotechnology and Biological Sciences Research Council for supporting this work [BB/J016489/1 and BB/F022298/1].
We thank Gail Bartlett for advice on the location of coiled-coil domains.
The Authors declare that there are no competing interests associated with the manuscript.