Structural maintenance of chromosomes flexible hinge domain-containing 1 (SMCHD1) is an epigenetic regulator that mediates gene expression silencing at targeted sites across the genome. Our current understanding of SMCHD1's molecular mechanism, and of how substitutions within SMCHD1 lead to the diseases facioscapulohumeral muscular dystrophy (FSHD) and Bosma arhinia microphthalmia syndrome (BAMS), is only emerging. Recent structural studies of its two component domains, the N-terminal ATPase and C-terminal SMC hinge, suggest that dimerization of each domain plays a central role in SMCHD1 function. Here, using biophysical techniques, we demonstrate that the SMCHD1 ATPase undergoes dimerization in a process that is dependent on both the N-terminal UBL (Ubiquitin-like) domain and ATP binding. We show that neither the dimerization event, nor the presence of a C-terminal extension past the transducer domain, affects SMCHD1's in vitro catalytic activity, as the rate of ATP turnover remains comparable to that of the monomeric protein. We further examined the functional importance of the N-terminal UBL domain in cells, revealing that its targeted deletion disrupts the localization of full-length SMCHD1 to chromatin. These findings implicate UBL-mediated SMCHD1 dimerization as a crucial step for chromatin interaction, and thereby for promoting SMCHD1-mediated gene silencing.
In 2008, SMCHD1 was identified as an epigenetic regulator essential for the maintenance of X-chromosome inactivation. Subsequently, SMCHD1 has been attributed roles in the transcriptional silencing of clustered autosomal genes important in normal development and disease [2–8]. Yet exactly how SMCHD1 interacts with chromatin to facilitate gene expression regulation remains unknown and is of outstanding interest.
Full-length human SMCHD1 is a large, 2005-amino acid multidomain protein that exerts ATPase activity through its N-terminal GHKL (Gyrase B, Hsp90, histidine kinase and MutL) domain [9–11], and interacts with nucleic acids via its C-terminal SMC (structural maintenance of chromosomes) hinge domain [12–14]. These two functional domains are separated by a linker region of approximately 1200 amino acids that remains largely uncharacterized, yet constitutes more than half of the full-length protein. In canonical SMC proteins, the C-terminal hinge domain acts as a primary dimerization site that mediates heterodimeric SMC complex formation, establishing functional complexes such as cohesin and condensin [15–17]. Further interactions with non-SMC subunits at their N-terminus, where an ABC-type ATPase domain resides, lead to the formation of a closed ring structure that has been proposed to topologically entrap or encircle DNA [18,19]. Recently, we described the X-ray crystal structure of SMCHD1's hinge domain, which revealed that the domain assembles into an unusual donut shape via homodimerization. Coupled with site-directed mutagenesis studies, the hinge domain crystal structure enabled us to identify two positively charged surface patches, rather than the central pore, as mediators of nucleic acid interactions, in addition to the domain's role in dimerizing SMCHD1.
SMCHD1 is considered a member of the GHKL superfamily of ATPases owing to its N-terminal catalytic domain. The GHKL superfamily encompasses members such as the highly studied molecular chaperone heat shock protein 90 (Hsp90), the microrchidia (MORC) family of proteins, DNA topoisomerases and DNA mismatch repair proteins of the MutL family, among others . While GHKL family members take part in diverse functions, they possess a common structure as they each comprise an α/β sandwich where three α-helices form a layer parallel to a four β-stranded antiparallel sheet, also known as a Bergerat ATP binding fold which is essential for ATPase function . This structure contains various conserved residues, including a glutamic acid (Glu) in Motif I that activates the water molecule for ATP hydrolysis, and an asparagine (Asn) that is responsible for coordinating a magnesium ion (Mg2+) to the active site . An interesting feature common to several GHKL members is the ability to undergo ATP-induced homodimerization. Their catalytic cycle is often described as a molecular clamp mechanism, whereby interaction with ATP triggers dimerization at the GHKL N-terminus and capture of substrate (protein or nucleic acids), followed by ATP hydrolysis which leads to opening of the dimer and substrate release [20,21].
SMCHD1's GHKL ATPase has been of particular interest due to the identification of disease-related variants that are frequently located within this region of the protein. Variations in the human SMCHD1 gene have been associated with two debilitating conditions: facioscapulohumeral muscular dystrophy (FSHD)  and Bosma arhinia microphthalmia syndrome (BAMS) [22,23]. While FSHD-associated variants in SMCHD1 span the entire gene, all reported BAMS-associated variants are located exclusively within the ATPase region of SMCHD1, indicating that an altered catalytic activity may underlie the disease pathogenesis. Indeed, we previously showed that several BAMS-associated variants exhibit an enhanced catalytic activity in the context of the SMCHD1 ATPase protein, while FSHD-related variants almost exclusively display a reduced ATP hydrolysis activity . However, the overall role of ATP hydrolysis in SMCHD1's function as an epigenetic regulator, and therefore the mechanisms by which pathogenic variants influence SMCHD1 function, remain poorly understood.
The first three-dimensional structure of a human SMCHD1 ATPase (residues 25–580) was recently reported, revealing the presence of an α/β sandwich arrangement that constitutes its active site—a conserved feature across members of the GHKL superfamily . Furthermore, the crystal structure revealed a novel ubiquitin-like (UBL) domain located at the N-terminus that was proposed to mediate dimerization by undergoing a domain-swapping event via an N-terminal β-strand. Important to note is that the presented structure was obtained from a catalytically inactive point variant, E147A, which is unable to hydrolyze ATP. Upon examining the SMCHD1 ATPase protein under native PAGE conditions, Pedersen et al.  showed that a proportion of the E147A variant migrated as a dimer, yet surprisingly the wild-type counterpart or other disease-associated variants remained largely monomeric. This finding prompted us to explore the conformation of the wild-type human SMCHD1 ATPase and determine whether N-terminal dimerization is a native property of the protein, rather than a point variant-specific behavior.
Using both sedimentation velocity analytical ultracentrifugation (AUC) and small-angle X-ray scattering (SAXS) analyses, we revealed that the wild-type SMCHD1 ATPase is able to undergo dimerization in a manner dependent on both its UBL domain and the ligand, ATP. While we observed that addition of the non-hydrolysable ATP analog, AMPPNP, promoted self-association of the UBL-containing SMCHD1 construct, approximately half the protein population remained monomeric across varying protein concentrations, consistent with the existence of a monomer-dimer equilibrium in solution. Immunofluorescence microscopy demonstrated that deletion of the UBL domain from full-length SMCHD1 abrogates its chromatin localization in cells, suggesting a requirement for SMCHD1 dimerization via the ATPase domain in recruitment to target genes. Interestingly, SMCHD1's in vitro ATPase activity was unaffected by the presence or absence of the UBL domain, or by extending the C-terminal sequence beyond the transducer domain, indicating that dimerization does not impact the rate of ATP hydrolysis. Because ATPase activity has been implicated in dimer dissociation of other GHKL ATPases, and it was unknown how SMCHD1's ATPase activity compares with that of other family members, we compared SMCHD1's catalytic activity to that of other GHKL family exemplars, revealing comparable ATP turnover rates. These studies provide insights into the connection between dimerization and the ATPase activity of SMCHD1, and highlight the importance of N-terminal dimerization in the recruitment of the full-length protein to chromatin, where it can promote gene silencing.
DNA sequences encoding human SMCHD1 residues 25–580, 111–580, 111–702 or 25–702, as well as the N-terminal region of human MORC2 (residues 1–603), were cloned into pFastBac Htb (Invitrogen) for expression using the Bac-to-Bac system (Invitrogen) following generation of bacmids in E. coli DH10MultiBac (ATG Biosynthetics), using established procedures. These constructs encode an N-terminal, TEV protease-cleavable His6 tag for affinity purification. Proteins were expressed and purified from Sf21 insect cells as described previously. Transfections were performed as described: lipid complexes of 1 μg of bacmid DNA were prepared by mixing with CellFectin II, before application in a total volume of 1 ml Insect-XPRESS protein-free medium with l-glutamine (Lonza) to 0.9 × 10^6 Sf21 cells adhered to the well of a 6-well plate. After 5 h static incubation at 27°C, the media was replaced with 2 ml Insect-XPRESS and the supernatant containing P1 virus was harvested after 4 days static incubation at 27°C. Typically, Sf21 cells were maintained in Insect-XPRESS media, in suspension at 27°C, under shaking conditions at 130 rpm. P2 virus was generated by addition of 1 ml P1 virus to 100 ml Sf21 cells at a density of 1.5 × 10^6 cells/ml, followed by incubation at 27°C under shaking conditions at 130 rpm for 4 days. For protein expression, 0.5 l cell cultures were grown in 2.8 l Fernbach flasks, shaking at 90 rpm, 27°C to a density of 3.0–3.5 × 10^6 cells/ml before infection with an empirically defined ratio of P2 virus for an optimal protein yield. Cell pellets were harvested 48 h following infection, by centrifugation at 500 g for 5 min at room temperature, and pellets were snap-frozen in liquid nitrogen for storage at −80°C.
Full-length human Hsp90α  was sub-cloned into pPROEX HTb, whereas the N-terminal domain of human MLH1 (residues 1–340) was cloned into pET28-MHL vector (Addgene plasmid #26096). For both constructs, proteins were expressed in E. coli BL21-Codon Plus (DE3)-RIL cells. Briefly, transformed cells were cultured in Super broth at 37°C under shaking conditions at 200 rpm, until an OD600 ∼ 0.6–0.8 was reached. The temperature was then reduced to 18°C and protein expression was induced with the addition of 0.5 mM IPTG overnight.
Recombinant protein purification
Sf21 insect cell or bacterial cell pellets were resuspended in lysis buffer (0.5 M NaCl, 20 mM Tris–HCl pH 8.0, 20% (v/v) glycerol, 5 mM imidazole pH 8.0, 0.5 mM TCEP), supplemented with 1 mM PMSF and 1X cOmplete EDTA-free protease inhibitor (Roche). Cells were disrupted by sonication while maintaining the lysate at 4°C, and insoluble material was removed by centrifugation at 45 000 g for 30 min at 4°C. Supernatants were subjected to Ni-chromatography (cOmplete His-Tag purification resin, Roche) and following extensive washing, eluted in lysis buffer containing 250 mM imidazole pH 8.0. Following cleavage of the His-tag with TEV protease, human SMCHD1 and Hsp90 proteins were buffer-exchanged into Buffer A (50 mM NaCl, 25 mM HEPES pH 7.5, 0.5 mM TCEP, 10% (v/v) glycerol) and loaded onto a MonoQ 5/50 GL column (GE Healthcare) pre-equilibrated with Buffer A, and eluted via a 0–100% gradient of Buffer B (500 mM NaCl, 25 mM HEPES pH 7.5, 0.5 mM TCEP, 10% (v/v) glycerol) over 40 column volumes for protein elution. SMCHD1-containing fractions of interest eluted at ∼250 mM NaCl and Hsp90-containing fractions eluted at ∼600 mM NaCl. These were pooled and concentrated, and further purified by Superdex-200 10/300 GL size exclusion chromatography (Cytiva) with elution in 100 mM NaCl, 20 mM HEPES pH 7.5, 0.5 mM TCEP. For the human MORC2 protein, ion exchange chromatography was omitted and size exclusion chromatography (SEC) was performed following Ni-chromatography; for human MLH1, a subtractive Ni-chromatography step was included prior to SEC to remove the His-tagged TEV protease. Briefly, the cleaved protein was diluted in lysis buffer and incubated with Ni-NTA resin (cOmplete His-Tag, Roche) for 1 h at 4°C, on rollers. The resin was washed with increasing imidazole concentrations of up to 35 mM imidazole, where the protein of interest remained in the unbound fractions and the TEV protease bound to the resin. 
Unbound fractions were pooled, concentrated and further purified by SEC. Protein purity was evaluated by reducing SDS–PAGE with Stain-Free visualization (Biorad) and fractions of interest were pooled, snap-frozen in liquid nitrogen and stored at −80°C until required.
Fluorescence polarization ATPase assays were performed as outlined in Chen et al. Ten-microliter reactions were set up in triplicate in 384-well low flange, black, flat-bottom plates (Corning) containing 7 μl reaction buffer (50 mM HEPES pH 7.5, 4 mM MgCl2, 2 mM EGTA), 1 μl recombinant protein at concentrations ranging from 0.1 to 0.6 μM or SEC buffer control, 1 μl nuclease-free water and 1.25–10 μM ATP substrate. Reactions were incubated at 20°C for 1 h in the dark. Reactions were stopped by the addition of 10 μl detection mix (1× Detection buffer, 4 nM ADP Alexa Fluor 633 Tracer, 128 μg/ml ADP2 antibody) and incubated for another hour in the dark. Fluorescence polarization readings (mP) were measured using an Envision plate reader (PerkinElmer Life Sciences) fitted with excitation filter 620/40 nm, emission filters 688/45 nm (s and p channels) and D658/fp688 dual mirror. Readings from a free tracer (no antibody) control were set as 20 mP as the normalization baseline of the assay for all reactions. The amount of ADP produced by each reaction was estimated from a 12-point standard curve, as outlined in the manufacturer's protocol. Data were plotted and analyzed in GraphPad Prism. A second type of ATPase assay was performed using the ADP-Glo Kinase Assay kit (Promega). Each reaction was performed in a total volume of 5 μl, consisting of a final 3.125, 6.25, 12.5 or 25 μM SMCHD1 protein (residues 25–702 or 111–702), either 20, 50 or 100 μM ATP, and reaction buffer (50 mM HEPES pH 7.5, 4 mM MgCl2, 2 mM EGTA). Reactions were incubated at 20°C for 1 h, followed by the addition of 5 μl ADP-Glo reagent to terminate the reaction by depleting ATP, and a further incubation for 40 min at 20°C. Ten microliters of Kinase Detection reagent were then added to each reaction, followed by incubation for a further 30 min at 20°C. Luminescence was measured using the FLUOstar Omega microplate reader (BMG Labtech). Data were plotted and analyzed in GraphPad Prism.
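The conversion of a polarization reading into an ADP amount via the standard curve can be sketched as follows. This is a minimal illustration only, assuming simple linear interpolation between neighbouring standards; the manufacturer's protocol may prescribe a different curve fit, and the function names and any values passed to them are hypothetical rather than taken from this study.

```python
from bisect import bisect_left


def adp_from_mp(mp, standards):
    """Estimate ADP (uM) from a polarization reading (mP).

    standards: list of (adp_uM, mP) pairs from the standard curve.
    Assumption: linear interpolation between the two bracketing standards.
    """
    # Sort by mP so that bisect can locate the bracketing standards.
    pts = sorted((m, a) for a, m in standards)
    mps = [m for m, _ in pts]
    if not mps[0] <= mp <= mps[-1]:
        raise ValueError("reading lies outside the standard curve")
    i = bisect_left(mps, mp)
    if mps[i] == mp:
        return pts[i][1]
    (m0, a0), (m1, a1) = pts[i - 1], pts[i]
    return a0 + (a1 - a0) * (mp - m0) / (m1 - m0)


def turnover_per_hour(adp_uM, protein_uM, hours=1.0):
    """ADP molecules produced per protein molecule per hour."""
    return adp_uM / protein_uM / hours
```

Dividing the interpolated ADP concentration by the protein concentration and the 1 h incubation time gives a turnover figure that can be compared across constructs and protein concentrations.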
Sedimentation velocity experiments were performed with an XL-I analytical ultracentrifuge (Beckman Coulter) using double sector quartz cells and epon center-pieces in an An-50 Ti 8-hole rotor. Data were obtained at 50 000 rpm using 350 μl protein at 1.0 mg/ml concentration for the initial experiment, and varying concentrations (0.25, 0.5, 1.0, 1.5 and 2.0 mg/ml) for the following experiment. The protein samples were diluted in 100 mM NaCl, 20 mM HEPES pH 7.5, 0.5 mM TCEP, which was used as the reference buffer. A total of 100 scans were collected at 20°C using radial absorbance scans at 290 nm and a step size of 0.003 cm. Data were analyzed using SEDFIT and UltraScan. Sedimentation data were fitted to a continuous size distribution [c(s)] and the fitted data were plotted using GUSSI; analyses are reported in Table 1. The buffer density, buffer viscosity and an estimate of the partial specific volume of the protein based on the amino acid sequence were determined using SEDNTERP. Data were also subjected to two-dimensional spectrum analyses and van Holde-Weischet analyses in UltraScan 4.0 [29,31].
| [SMCHD1] (mg/ml) | Weight average sedimentation coefficient | Monomer (%) | Dimer (%) |
Small-angle X-ray scattering (SAXS) data collection and analysis
SAXS data were collected at the Australian Synchrotron SAXS/WAXS beamline using the co-flow, in-line SEC setup, as previously described [9,24]. Data collection and analysis statistics are shown in Supplementary Table S1. Fifty microliters of 5 mg/ml recombinant protein were resolved by injection onto an in-line Superdex-200 Increase 5/150 GL column (Cytiva) in 200 mM NaCl, 20 mM HEPES pH 7.5, 5% (v/v) glycerol, 0.5 mM TCEP (Pierce) and eluted into the path of the beam via a quartz capillary. For samples with added ligand, protein was incubated with 1 mM AMPPNP/Mg2+ on ice for 30 min prior to the commencement of the experiment, and the buffer used for these samples was as described above for the apo samples but with the addition of 1 mM ATP/Mg2+. One-second exposures of scattering data were collected with a PILATUS3 X 2M detector and radially averaged, and scattering data from the apex of the SEC peak were background subtracted, using data collected for buffer-only shots earlier in the data collection, with Scatterbrain software (Stephen Mudie, Australian Synchrotron). Data analyses were performed using the ATSAS suite as described previously. Guinier analyses were performed using PRIMUS to examine scatter at very low q (qRg ≤ 1.3) to estimate the radius of gyration, Rg, and zero angle intensity, I(0), with linearity indicating the absence of both high molecular mass aggregates and interparticle interference. The real space interatomic distance distribution function, P(r), and the maximum dimension of the particle, DMAX, were computed by indirect Fourier transform using GNOM, which also allowed estimation of Rg and I(0). The atomic co-ordinates from a SMCHD1 (residues 25–702) monomer or dimer model (Figure 1c) were used to obtain theoretical scattering curves for comparison with experimental data, using the program CRYSOL.
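The Guinier step described above rests on the approximation ln I(q) ≈ ln I(0) − (Rg² q²)/3 at low q, so Rg and I(0) follow from a straight-line fit of ln I(q) against q². The sketch below illustrates that fit and the qRg ≤ 1.3 cut-off in plain Python. It is not the PRIMUS implementation; the function names, the unweighted least-squares fit and the iteration scheme are assumptions made for illustration.

```python
import math


def guinier_fit(q, intensity):
    """Unweighted least-squares fit of ln I(q) = ln I(0) - (Rg^2 / 3) q^2.

    Returns (Rg, I0), with Rg in the reciprocal units of q.
    """
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: not a Guinier region")
    rg = math.sqrt(-3.0 * slope)
    i0 = math.exp(my - slope * mx)
    return rg, i0


def guinier_region(q, intensity, qrg_max=1.3, max_iter=20):
    """Refit repeatedly, keeping only points with q * Rg <= qrg_max."""
    rg, i0 = guinier_fit(q, intensity)
    for _ in range(max_iter):
        keep = [(qi, ii) for qi, ii in zip(q, intensity) if qi * rg <= qrg_max]
        if len(keep) < 3:
            raise ValueError("too few points in the Guinier region")
        new_rg, i0 = guinier_fit(*zip(*keep))
        if abs(new_rg - rg) < 1e-6:
            return new_rg, i0
        rg = new_rg
    return rg, i0
```

A systematic curvature of ln I(q) against q² within the selected range, rather than a straight line, is the signature of aggregation or interparticle interference that the linearity check is meant to exclude.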
Wild-type SMCHD1 ATPase dimerizes in solution when in the presence of the UBL domain and AMPPNP/Mg2+.
CRISPR-Cas9-mediated SMCHD1-knockout in HEK293 cells
We designed guide RNAs (gRNAs) targeting the first exon of SMCHD1 (forward 5′ CAAACAAGTACACCGTCCTG; reverse 5′ GGGGAGCGCTCGGACTACGC). These were cloned into the lentiGuide-puro vector (Addgene #52963), whereas Cas9 was delivered in a blasticidin-resistant lentivirus generated from a pLentiGuide-BlastR vector. To produce lentivirus, HEK293T cells were cultured in DMEM supplemented with 10% (v/v) fetal bovine serum (FBS), penicillin (100 U ml−1) and streptomycin (100 μg ml−1) at 37°C in a humidified atmosphere with 5% (v/v) CO2. Cells plated on a 10 cm plate at 80% confluency were transfected with a lentiviral packaging cocktail (14 μg of psPAX vector, 5.6 μg of pVSV-G vector, 8.4 μg of LentiGuide-BlastR vector or LentiGuide-puro vector, 144 μl of 0.5 M CaCl2, 1.2 ml HBS and 1 ml of DNase-free water), which was vortexed for 10 s and incubated for 10 min at room temperature prior to adding onto cells. The cells were incubated overnight and the media was changed the next day. One day later, the supernatant was harvested and filtered. HEK293 cells were first transduced with the Cas9 lentivirus and selected with blasticidin, followed by gRNA lentiviral transduction and selection with puromycin. For each transduction, lentiviral supernatant was diluted 1 : 10 in media containing 4 μg/ml polybrene (Sigma–Aldrich) and added to HEK293 cells at 50% confluence. After 24 h, the media was changed and 5 μg/ml puromycin (Sigma–Aldrich) or 5 μg/ml blasticidin S (InvivoGen) was added for selection of transduced cells. Clonal cell lines were generated via single cell isolation, and verified by Illumina Next Generation sequencing (FWD: 5′ GTGACCTATGAACTCAGGAGTCtcgcgtacctgacacacaca; REV: 5′ CTGAGACTTGCACATCGCAGCcgctgtcttttctccttttc), as described previously.
Deletion of the UBL domain (Δ1–110) in mouse full-length SMCHD1 was accomplished using PCR-mediated cloning before introduction into a pcDNA3 vector. For transfection of SMCHD1-knockout HEK293 cells with constructs containing full-length SMCHD1 or the UBL-deletion variant, ∼2 × 10^4 clonal SMCHD1-KO HEK293 cells were seeded in a 12-well plate on a 13 mm coverslip (Marienfeld Superior). Twenty-four hours after plating, cells at ∼80% confluency were transfected with 1.2 μg of the corresponding construct using calcium phosphate-mediated transfection. Immunofluorescence was performed 24 h post-transfection.
Immunofluorescence microscopy studies were performed as described previously . Briefly, cells were washed in PBS and fixed in 3% (w/v) paraformaldehyde made in PBS for 10 min at room temperature. Cells were washed three times with PBS for 5 min each, then permeabilized on ice with 0.5% (v/v) Triton X-100 in PBS, followed by three washes in PBS for 5 min each. Cells were blocked in 1% (w/v) bovine serum albumin (BSA) (Life Technologies) for 15 min, followed by an overnight incubation in a dark and humid chamber at 4°C with a primary anti-SMCHD1 antibody (in-house clone 2B8; available from Millipore under catalog number MABS2292) diluted 1 : 100 in 1% (w/v) BSA. Cells were washed three times in PBS for 5 min each and incubated for 40 min at room temperature in a dark and humid chamber with a secondary anti-rat-568 antibody for SMCHD1 (Life Technologies, A-11077) diluted 1 : 500 in 1% (w/v) BSA. Cells were washed three times in PBS for 5 min each and stained with DAPI for 10 min at room temperature, followed by another two PBS washes. Coverslips were mounted in Vectashield H1000 mounting medium (Vector Laboratories). Cells were visualized using the Zeiss LSM 880 NLO microscope at 63× magnification and z-stacks were acquired. Images were analyzed using the open source ImageJ distribution package, FIJI.
Samples were resolved by standard reducing SDS–PAGE analysis on a 4–12% Bis-Tris gel (Thermo Fisher Scientific) in MES buffer and transferred to a PVDF membrane (Osmonics, GE Healthcare) by wet transfer at 100 V for 1 h in transfer buffer (25 mM Tris, 192 mM glycine, 20% v/v methanol). Membranes were blocked with 5% (v/v) skim milk powder in 0.1% (v/v) Tween-20/PBS for 1 h at room temperature. Primary antibody was added to the membrane in 5 ml blocking buffer and incubated overnight at 4°C in a capped tube, on rollers. Membranes were washed for 30 min at room temperature with 0.1% (v/v) Tween-20/PBS, followed by incubation for 1 h at room temperature with HRP-conjugated secondary antibody diluted in 5 ml blocking buffer. The 30-min washing step was repeated, and antibody binding was visualized using the Luminata ECL system (Millipore) following the manufacturer's instructions on a ChemiDoc Touch Imaging System (Bio-Rad).
Dimerization of the SMCHD1 ATPase is dependent on the ubiquitin-like (UBL) domain and ATP binding
The recently published structure of the SMCHD1 ATPase (residues 25–580) harbored the catalytically inactive point mutant, E147A, for which preferential dimerization was observed over the wild-type counterpart. This led us to examine whether the wild-type SMCHD1 ATPase similarly adopts a dimeric conformation. To do so, we investigated an extended SMCHD1 ATPase construct that encompasses residues 25–702, extending past the transducer domain at the C-terminus while incorporating the newly identified UBL domain (residues 25–110). Using sedimentation velocity AUC experiments, we compared this to a SMCHD1 construct that lacks the UBL domain (ΔUBL; residues 111–702; Supplementary Figure S1a,b), which we previously demonstrated to occur as a monomeric species in solution [9,24]. When the sedimentation data were fitted with a continuous sedimentation coefficient [c(s)] distribution model, we observed only a single species for the ΔUBL SMCHD1 ATPase (residues 111–702), both in the absence and presence of a non-hydrolysable analog of ATP, AMPPNP, and a cofactor for the hydrolysis reaction, Mg2+, with sedimentation coefficients of 4.09S and 4.10S, respectively (Figure 1a). The estimated molecular mass values for the single species were 65.1 kDa and 65.3 kDa for the apo and AMPPNP conditions, respectively, consistent with a calculated monomeric mass of 68.0 kDa for the ΔUBL SMCHD1 ATPase construct (residues 111–702). For the UBL-containing SMCHD1 (residues 25–702), we detected a monomeric species in the absence of ligand with a sedimentation coefficient of 4.45S and an estimated molecular mass of 73.8 kDa. However, in the presence of AMPPNP/Mg2+, we detected both a monomeric population and a higher molecular mass species with a sedimentation coefficient of 7.01S and a molecular mass of 126.0 kDa, where the calculated monomeric mass for this construct is 78.0 kDa (Figure 1a). These data are consistent with the wild-type SMCHD1 ATPase undergoing a dimerization event that is dependent on both the presence of the N-terminal UBL domain and an ATP-mimetic ligand.
To gain further insight into the self-association properties of the UBL-containing SMCHD1 ATPase (residues 25–702), we repeated the sedimentation velocity AUC studies in the presence of AMPPNP/Mg2+, this time at varying protein concentrations of 0.25, 0.5, 1.0, 1.5 and 2.0 mg/ml (Figure 1b). Under these experimental conditions, we observed a shift towards an increasing dimer population with a corresponding rise in protein concentration, which indicates the presence of a concentration-dependent self-association between monomer and dimer. This is demonstrated both by the proportion of each oligomeric species and by an increase in the weight average sedimentation coefficient, calculated by integrating the area under the monomeric and dimeric species within the c(s) distribution (Table 1). The sedimentation velocity data were also analyzed using the van Holde–Weischet method, which is a model-independent analysis that directly assesses the shape of the sedimenting boundary to provide an estimate of the proportion of each species in solution. These analyses support the idea that increased protein concentration favors the prevalence of the dimer over the monomeric form (Supplementary Figure S1c). We also assessed whether the dimeric SMCHD1 population remained associated in solution, as opposed to reverting to a monomer-dimer mixture. Here, we pooled the fractions from size exclusion chromatography containing SMCHD1 ATPase dimer (Supplementary Figure S2a) and subjected the protein to a second round of size exclusion chromatography. Interestingly, the reloaded protein eluted as a single peak that corresponded to the dimeric protein, indicating that the SMCHD1 ATPase homodimer is preserved in the presence of AMPPNP/Mg2+ (Supplementary Figure S2b).
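The species proportions and the weight average sedimentation coefficient both derive from integrals over the c(s) distribution: the area under each peak gives its share of the signal, and the signal-weighted mean of s over both peaks gives the weight average. A minimal sketch of that calculation is given below, assuming trapezoidal integration and user-chosen integration limits for each peak. It is not the SEDFIT or GUSSI implementation, and the function names and range arguments are hypothetical.

```python
def trapezoid(x, y):
    """Trapezoidal integral of y over x."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def peak_integrals(s, c, lo, hi):
    """Return (area, first moment) of c(s) over lo <= s <= hi."""
    pts = [(si, ci) for si, ci in zip(s, c) if lo <= si <= hi]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    area = trapezoid(xs, ys)
    moment = trapezoid(xs, [x * y for x, y in zip(xs, ys)])
    return area, moment


def summarize_cs(s, c, monomer_range, dimer_range):
    """Percentage of each species and the weight average s over both peaks."""
    a_m, m_m = peak_integrals(s, c, *monomer_range)
    a_d, m_d = peak_integrals(s, c, *dimer_range)
    total = a_m + a_d
    return {
        "monomer_pct": 100.0 * a_m / total,
        "dimer_pct": 100.0 * a_d / total,
        "s_weight_avg": (m_m + m_d) / total,
    }
```

In a monomer-dimer equilibrium, raising the protein concentration shifts area from the monomer peak to the dimer peak, so the weight average moves from the monomer value towards the dimer value.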
We further examined the conformation of wild-type SMCHD1 UBL-containing ATPase protein (residues 25–702) using SAXS. SMCHD1 ATPase was eluted from a Superdex-200 size exclusion column into the capillary in the path of the SAXS beamline at the Australian Synchrotron and images were collected every 1 s. The capillary was equipped with a ‘co-flow’ setup in which buffer was delivered in a laminar manner most proximal to the capillary to prevent capillary fouling/deposition if the protein were to undergo radiation damage [32,38]. The Guinier analysis for the protein under either apo or ligand conditions produced linear plots (Figure 1c–d; insets), which is consistent with a single monodisperse species in solution and an absence of interparticle interference, enabling more detailed analyses of the samples. We previously established the radius of gyration (Rg) of the wild-type, ΔUBL SMCHD1 ATPase (residues 111–702) as ∼32 Å, with a maximum particle dimension (DMAX) of 105 Å , and further confirmed these parameters in this study (Supplementary Figure S3a,b; Supplementary Table S1). These values correspond to a monomeric configuration, additionally validated here by our sedimentation velocity AUC studies (Figure 1a). To establish the conformational states of the unliganded and ligand-bound SMCHD1 UBL-containing ATPase (residues 25–702), we computed the theoretical scattering curves for either a monomeric or dimeric UBL-containing SMCHD1 ATPase (residues 25–702) and compared these with experimentally obtained data (Figure 1c,d). These models are in agreement with the experimental data, as evidenced by χ2 values of 0.401 for the fit of the monomeric SMCHD1 ATPase model to the monomeric scatter pattern, and 1.191 for the AMPPNP-bound SMCHD1 (Supplementary Figure S3c). 
The former is an excellent fit to the data; the reference value for this fit is 0.25, because the data reduction software, Scatterbrain, records 2 standard errors rather than the more conventional 1 standard error. Based on the elution volume from size exclusion chromatography, we concluded that the AMPPNP-bound UBL-containing SMCHD1 ATPase was the dimer form. By comparison, the dimer model fitted the AMPPNP-bound UBL-containing SMCHD1 scatter with a χ2 value of 0.184 (Figure 1d). Interestingly, the dimer model also fit the monomeric (sans AMPPNP) UBL-containing SMCHD1 ATPase very well, with χ2 = 0.180 (Supplementary Figure S3d). For the experimental data of the UBL-containing SMCHD1 ATPase (-AMPPNP) we observed a Rg of 36 Å and a DMAX of 115 Å, whereas in the presence of AMPPNP, the indicated Rg was 39 Å and the DMAX value was 125 Å (Figure 1e; Supplementary Table S1). The similarity between the parameters calculated for the monomer and dimer forms was surprising, but illustrates that the general topology and envelope of the monomeric and dimeric forms are remarkably similar. Importantly, molecular mass estimates from our scattering data using SAXSMoW  confirmed that, despite the similarity of their fits to the monomeric SMCHD1 ATPase scattering data, a molecular mass estimate of 95.7 kDa (cf. calculated molecular mass of 78 kDa) was determined from the monomer (−AMPPNP) scattering data, whereas the estimated molecular mass from the scattering of UBL-containing SMCHD1 (+AMPPNP) was 139.7 kDa, consistent with a dimeric species (Supplementary Table S1). Taken together, these results are consistent with a dimeric arrangement of the SMCHD1 ATPase in the presence of AMPPNP/Mg2+.
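The reference value of 0.25 quoted above follows from how the goodness-of-fit statistic scales with the recorded errors. As a sketch, assume the usual form of the discrepancy between experimental and calculated scattering, with N data points, a scale factor c and per-point standard errors σ(q_i); the exact normalization used by CRYSOL is an assumption here and does not affect the argument.

```latex
\chi^2 = \frac{1}{N}\sum_{i=1}^{N}
  \left[\frac{I_{\mathrm{exp}}(q_i) - c\,I_{\mathrm{calc}}(q_i)}{\sigma(q_i)}\right]^2
\quad\Longrightarrow\quad
\chi^2_{\mathrm{recorded}} = \frac{1}{N}\sum_{i=1}^{N}
  \left[\frac{I_{\mathrm{exp}}(q_i) - c\,I_{\mathrm{calc}}(q_i)}{2\,\sigma(q_i)}\right]^2
  = \frac{\chi^2}{4}
```

When the recorded errors are twice the standard error, every term in the sum is divided by four, so an ideal fit with χ² ≈ 1 on the conventional scale appears as χ² ≈ 0.25, and multiplying a reported value by four returns it to the conventional scale.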
SMCHD1's UBL domain is required for the cellular localization of the full-length protein
To assess the functional role of the UBL domain in the context of the full-length protein, we designed a mouse SMCHD1 construct that excludes the N-terminal region where the UBL domain resides (Δ1–110). We transiently transfected constructs encoding either wild-type full-length mouse SMCHD1, the UBL-deleted SMCHD1 (Δ1–110) or the E147A SMCHD1 point variant, an established catalytically inactive mutant, into CRISPR-Cas9-edited, SMCHD1-knockout 293 cells (SMCHD1-KO 293). This is an ideal cellular system in which to study SMCHD1 function because mouse SMCHD1 complements the endogenous human SMCHD1 function and the exogenous mouse SMCHD1 constructs are not susceptible to editing by the human-targeting gRNA sequences. We validated the SMCHD1-KO efficiency by both immunoblot (Figure 2a) and immunofluorescence (Figure 2b). Wild-type 293 cells display the native nuclear localization pattern of SMCHD1, which is denoted by two bright nuclear foci per cell that correspond to SMCHD1's localization to the two inactive X-chromosomes present in tetraploid 293 cells (Figure 2b). Transfection of wild-type full-length SMCHD1 into the SMCHD1-KO 293 cells resulted in the formation of multiple nuclear foci, a localization pattern that differs from that of the wild-type control cells, where two bright nuclear foci are observed in each cell; this difference most likely results from overexpression of the construct. Nonetheless, deletion of the UBL domain from the full-length protein (in a construct encoding residues 111–2007) compromised nuclear foci formation, resulting in a diffuse SMCHD1 staining pattern that does not resemble that of wild-type SMCHD1-transfected cells (Figure 2b). The E147A catalytically inactive SMCHD1 point variant exhibited a similar nuclear localization pattern and was equally unable to form nuclear foci. These findings suggest that both ATPase domain dimerization, mediated by SMCHD1's UBL domain, and ATPase activity are required for proper chromatin localization and focus formation by the full-length protein.
Figure 2. SMCHD1's N-terminal UBL domain is required for its localization to chromatin.
Neither the UBL domain nor a C-terminal extension alters the catalytic activity of the SMCHD1 ATPase
Previously, we characterized a SMCHD1 construct that encompasses residues 111–702, extending past the transducer domain. This C-terminal extension (residues 580–702) contains five reported FSHD2-associated missense mutations in SMCHD1: W596G, V615D, P622L, V641L and P690S. We formerly investigated one of these, P690S, and showed that it exhibits decreased catalytic activity in vitro. We attempted to study an additional SMCHD1 variant, V615D, but were unable to produce sufficient recombinant protein owing to a low expression yield, suggesting that the Val to Asp substitution is likely destabilizing (data not shown). Overall, these data indicated that the C-terminal extension downstream of the transducer domain of SMCHD1 likely holds an important functional role. We therefore set out to examine the extended 25–702 amino acid wild-type SMCHD1 ATPase protein, which incorporates both the newly identified UBL domain and the C-terminal extension (Figure 3a), to determine its catalytic activity in vitro in comparison with SMCHD1 constructs encompassing residues 25–580, 111–580 and 111–702. While at lower protein concentrations the SMCHD1 constructs encompassing residues 25–580 and 111–580 exerted a higher ATP turnover than the 111–702 SMCHD1 counterpart, similar trends were observed at higher protein concentrations across all three constructs (Figure 3b). The presence of the UBL domain within the extended SMCHD1 construct (residues 25–702) similarly produced no increase in catalytic activity compared with the ΔUBL SMCHD1 ATPase (residues 111–702) counterpart (Figure 3c,d). Likewise, when assayed at concentrations at which we observed the construct to undergo dimerization, the UBL-containing SMCHD1 ATPase (25–702) did not exhibit increased catalytic activity, but rather a slight decline (Figure 3e,f), raising the prospect that ATP turnover might be conformationally or sterically impacted in the dimeric form. These findings suggest that the dimerization event does not enhance ATP turnover, implying the absence of synergistic behavior upon dimerization of the SMCHD1 ATPase. Our data are consistent with a previous report, where the presence of the UBL domain on the SMCHD1 ATPase domain (residues 25–580) did not impact the ATP hydrolysis activity compared with a construct lacking the UBL (residues 111–580). Taken together, these data demonstrate that the in vitro catalytic activity of SMCHD1 remains unaltered in the presence of either the N-terminal UBL domain or the C-terminal extension downstream of the transducer domain.
Figure 3. Neither SMCHD1's UBL domain nor its C-terminal extension alters its catalytic activity in vitro.
SMCHD1's ATPase activity is comparable to that of related GHKL-type proteins
Initial structural analyses of the ΔUBL SMCHD1 ATPase (residues 111–702) using SAXS revealed a gross conformational similarity to full-length Hsp90, and biochemical studies demonstrated a susceptibility to a well-established Hsp90 inhibitor, radicicol [9,11]. Thus, while it is evident that SMCHD1 exhibits topological similarities to other GHKL family ATPases, including the capacity to undergo ATP-induced N-terminal dimerization, it has remained unclear, in the absence of direct comparisons, whether there are intrinsic differences between the catalytic activities of SMCHD1 and other GHKL family members. We addressed this knowledge gap by directly comparing the catalytic activity of the UBL-containing SMCHD1 ATPase (residues 25–702) with those of H. sapiens full-length Hsp90α, the MutL homolog 1 (MLH1) ATPase domain (residues 1–340) and the extended MORC2 ATPase (residues 1–603) in vitro using our established endpoint fluorescence polarization assay (Figure 4a–d). We found that full-length Hsp90 exhibits a comparable catalytic rate to SMCHD1, with an estimated turnover rate (kcat) of ∼0.015 μM ADP/min/μM protein, akin to the 0.018 μM ADP/min/μM protein obtained for SMCHD1 (Figure 4a,b). The monomeric ATPase domain of MLH1 displayed a slightly higher turnover rate of 0.025 μM ADP/min/μM protein (Figure 4c), whereas the MORC2 dimer exhibited a kcat of 0.047 μM ADP/min/μM protein in this assay (Figure 4d). Overall, these results suggest that the catalytic rate of SMCHD1's ATPase is comparable to that of other GHKL-type proteins, and most similar to that of full-length Hsp90.
Figure 4. The catalytic rate of SMCHD1's ATPase is comparable to that of other GHKL-type proteins.
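Because μM ADP/min/μM protein reduces to min⁻¹, the rates quoted in this section can be re-expressed as a mean time per ATP turnover and as a ratio relative to SMCHD1. The following sketch uses only the values reported above; the dictionary and function names are illustrative assumptions, and the Hsp90 rate is an approximate value in the text.

```python
# Turnover rates quoted in the text, in uM ADP/min/uM protein (i.e. min^-1).
# The Hsp90 value is reported as approximate (~0.015).
KCAT_PER_MIN = {
    "SMCHD1 (25-702)": 0.018,
    "Hsp90 (full-length)": 0.015,
    "MLH1 (1-340)": 0.025,
    "MORC2 (1-603)": 0.047,
}
REFERENCE = "SMCHD1 (25-702)"


def minutes_per_turnover(kcat_per_min):
    """Mean time for one enzyme molecule to hydrolyze one ATP, in minutes."""
    return 1.0 / kcat_per_min


def relative_rate(kcat_per_min, reference_kcat_per_min):
    """Turnover rate expressed as a multiple of the reference rate."""
    return kcat_per_min / reference_kcat_per_min


# Map each protein to (minutes per turnover, rate relative to SMCHD1).
summary = {
    name: (minutes_per_turnover(k), relative_rate(k, KCAT_PER_MIN[REFERENCE]))
    for name, k in KCAT_PER_MIN.items()
}

for name, (minutes, relative) in summary.items():
    print(f"{name}: {minutes:.1f} min per ATP, {relative:.2f}x SMCHD1")
```

Expressed this way, every protein in the panel takes tens of minutes per ATP under these assay conditions, and the spread across the four proteins is roughly threefold.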
The growing evidence for the role of SMCHD1 variants in human disease has led to increased interest and, consequently, many recent advances in our understanding of SMCHD1's molecular structure and function. Both component domains of SMCHD1, the N-terminal ATPase and the C-terminal hinge, can independently dimerize. While the hinge domain is constitutively dimeric [12,13], the ATPase can interchange between monomeric and dimeric forms, although the underlying basis was incompletely understood. In this study, we explored the dimerization properties of the wild-type SMCHD1 ATPase and demonstrated that, similarly to the recently reported E147A SMCHD1 mutant, the wild-type protein is able to dimerize via a mechanism reliant on the N-terminal UBL domain and ATP binding. GHKL-type proteins have been commonly described to function via a molecular clamp mechanism where ATP-dependent N-terminal dimerization dictates the opening and closing of the dimer and is directly coupled to the catalytic cycle [20,21]. SMCHD1 may likewise behave as a molecular clamp around chromatin, transitioning between the open and closed states to engage and disengage from target sites. The propensity of the SMCHD1 ATPase to undergo dimerization in the presence of ATP raises the possibility of an intrinsic behavior that may regulate SMCHD1's residency time and, consequently, its gene silencing function at chromatin targets. ATP binding is known to trigger intermolecular interactions that further result in N-terminal dimerization among the GHKL ATPase family, a behavior that likely also occurs in SMCHD1, leading to UBL domain swapping and the dimerization event that we observe. While the presence of a UBL domain has not yet been reported in other members of the GHKL superfamily, the N-terminal β-strap that precedes the UBL domain in SMCHD1 and co-ordinates dimerization via a domain-swapping event is also found among other GHKL members, for example in the molecular chaperone Hsp90 [40,41]. Despite the primary dimerization interface of Hsp90 being situated within the C-terminal domain of the protein, N-terminal dimerization via the β-strap was suggested to promote efficient ATP hydrolysis.
The E147A SMCHD1 mutant, from which the first three-dimensional structure of a SMCHD1 ATPase (residues 25–580) was solved, was reported to elute as a mixture of monomer and dimer from SEC in the presence of ATP, from which only the dimer peak was selected for crystallization. Here, we observed a similar phenomenon for the extended wild-type SMCHD1 ATPase (residues 25–702) in the presence of non-hydrolysable AMPPNP, whereby monomer and dimer peaks occurred in both SEC and sedimentation velocity AUC analyses in solution. Our data support the existence of a proportion of the UBL-containing SMCHD1 ATPase in a conformation that is equipped to undergo a monomer to dimer transition in the presence of ATP (or analogs), which could plausibly be reversed upon ATP hydrolysis. It remains of outstanding interest whether, in the context of full-length SMCHD1, which harbors a C-terminal SMC hinge domain that confers constitutive dimerization, the ATPase domain would be poised for stoichiometric dimerization owing to the inherently high local concentration of protomers.
GHKL-type proteins are a family of weak ATPases, where the energy released upon ATP hydrolysis is thought to drive complex conformational rearrangements within the protein itself rather than being used as a motor function [43,44]. Here, we were able to provide a direct comparison of the catalytic activity exhibited by SMCHD1's ATPase and representative human GHKL family members – Hsp90, MLH1 and MORC2 – revealing that they all display slow turnover rates, with kcat values ranging between 0.015 and 0.047 μM ADP/min/μM protein. Previously reported turnover rates were in the range of 0.1–1.2 ADP/min/μM protein for full-length Hsp90 [45,46] and 0.023 ADP/min/μM protein for MLH1 (residues 1–343), while for MORC2 (residues 1–603) the fitted kcat was reported as 0.10 ADP/min/μM protein. While some of the values we obtained are ∼2-fold lower than previously described turnover rates for the respective proteins, it is important to note that we employed an endpoint assay in which we monitored ADP production while it remained linear with respect to ATP usage, as opposed to, for example, an NADH-coupled system, which quantitates ATP hydrolysis in a continuous mode. We also performed the ATPase assays at 20°C rather than the 37°C often used for enzymatic assays. Together, these two aspects may contribute to an underestimation of turnover rates in our experimental setup; overall, however, our values show excellent agreement with published values. Most importantly, these data validate SMCHD1 ATPase activity as comparable to that of others in the GHKL family. It is interesting that ATPase assays of SMCHD1 at concentrations that favor dimerization (in the presence of ATP) led to slightly suppressed, rather than the anticipated elevated, ATPase activity. These data suggest that dimerization might suppress processivity and/or thwart ATP or ADP dissociation from SMCHD1 dimers, although the precise mechanism remains of ongoing interest.
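The comparison with published rates can be made explicit as a fold difference (published value divided by the value measured here). This is only a sketch that restates the numbers quoted in the paragraph above: the names are illustrative, the units are taken as quoted, and the published MLH1 construct (residues 1–343) differs slightly from the one assayed here (residues 1–340).

```python
# Rates measured in this study (uM ADP/min/uM protein), as quoted in the text.
MEASURED = {"Hsp90": 0.015, "MLH1": 0.025, "MORC2": 0.047}

# Previously reported rates as (low, high); single values are repeated.
PUBLISHED = {
    "Hsp90": (0.1, 1.2),
    "MLH1": (0.023, 0.023),
    "MORC2": (0.10, 0.10),
}


def fold_difference(protein):
    """Return (low, high) published rate divided by the rate measured here.

    Values above 1 mean the published rate is higher than the measured one.
    """
    low, high = PUBLISHED[protein]
    measured = MEASURED[protein]
    return low / measured, high / measured


fold = {protein: fold_difference(protein) for protein in MEASURED}

for protein, (low, high) in fold.items():
    if low == high:
        print(f"{protein}: published/measured = {low:.2f}")
    else:
        print(f"{protein}: published/measured = {low:.2f} to {high:.2f}")
```

The ratios do not account for differences in assay format or temperature, which the text identifies as likely contributors to the lower rates measured here.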
The identification of an N-terminal UBL domain in SMCHD1 raised great interest in its functional role. Most commonly, UBL domains have been associated with the recruitment of proteins to the 26S proteasome, stimulating proteasomal degradation in a similar fashion to ubiquitin [48,49]. In the case of SMCHD1, we have shown that its UBL domain is necessary for N-terminal dimerization and localization of the full-length protein to target sites on chromatin. This mechanism differs from that of canonical SMC proteins, where ring closure at the N-terminus of SMC heterodimers is dependent on interactions with non-SMC subunits rather than on an ATP-dependent domain-swapping event [18,50]. Whether the UBL domain has a direct role in SMCHD1's recruitment to chromatin, or whether the ability to dimerize at the N-terminus is sufficient for SMCHD1's localization to target sites, remains to be investigated. Nonetheless, the aberrant nuclear localization we observed for the E147A SMCHD1 variant, which forms a stable dimer that is unable to hydrolyze ATP, indicates that SMCHD1 relies on UBL-mediated monomer-dimer conformational cycling at the N-terminus for faithful chromatin association.
Our studies validate the idea that SMCHD1's ATPase activity is connected to its dimerization and localization to chromatin. The precise nature and dynamics of ATP hydrolysis serving as an ‘off’ switch in SMCHD1 conformational cycling await further structural analyses for resolution. In particular, it remains of enormous interest to understand how the two protomers assemble into dimers, the dispositions of the component domains, and the organization of the uncharacterized intermediate domain that connects the ATPase and SMC hinge domains in the context of the full-length, >2000 amino acid protein. Despite SMCHD1's essential role in epigenetic regulation, the atomic structure of the full-length protein and the molecular mechanisms underlying its function in both healthy and diseased states remain to be elucidated. We anticipate that a detailed understanding of full-length SMCHD1 structure will enable interpretation of the functional effects of the various substitutions in SMCHD1 found in patients with FSHD and BAMS, and provide a foundation for the development of therapeutic interventions.
All data and reagents are available from the authors upon request.
The authors declare that there are no competing interests associated with the manuscript.
This work was supported by grants from the National Health and Medical Research Council of Australia (1098290, 1172929, 1194345, 9000653) and FSHD Global (grant 39); Australian Research Training Program scholarships to A.D.G. and M.I.; an FSHD Society Fellowship (5059384059) to A.D.G.; and the Bellberry-Viertel Senior Medical Research Fellowship (to M.E.B.). Additional support was provided by the Victorian State Government Operational Infrastructure Support.
Open Access Statement
Open access for this article was enabled by the participation of the University of Melbourne in an all-inclusive Read & Publish pilot with Portland Press and the Biochemical Society under a transformative agreement with CAUL.
CRediT Author Contribution
Alexandra D. Gurzau: Conceptualization, formal analysis, investigation, methodology, writing — original draft. Christopher Horne: Investigation, writing — review and editing. Yee-Foong Mok: Investigation, writing — review and editing. Megan Iminitoff: Investigation, writing — review and editing. Tracy A. Willson: Investigation, writing — review and editing. Samuel N. Young: Investigation, writing — review and editing. Marnie E. Blewitt: Conceptualization, supervision, funding acquisition, writing — review and editing. James M. Murphy: Conceptualization, supervision, funding acquisition, writing — review and editing.
We thank the staff of the Australian Synchrotron SAXS/WAXS beamline for assistance with data collection. We would also like to thank Dr Yorgo Modis (University of Cambridge) and Dr David A Agard (University of California San Francisco) for the MORC2 and Hsp90 protein expression vectors.
BAMS, Bosma arhinia microphthalmia syndrome
FSHD, facioscapulohumeral muscular dystrophy
SAXS, small-angle X-ray scattering
SEC, size exclusion chromatography
SMCHD1, structural maintenance of chromosomes flexible hinge domain-containing 1