Kinetic and structural analysis of human ALDH9A1

Aldehyde dehydrogenases (ALDHs) constitute a superfamily of NAD(P)+-dependent enzymes that detoxify aldehydes produced in various metabolic pathways by oxidizing them to the corresponding carboxylic acids. Among the 19 human ALDHs, the cytosolic ALDH9A1 has never been fully characterized enzymatically, and its structure is still unknown. Here, we report complete molecular and kinetic properties of human ALDH9A1 as well as three crystal forms at 2.3, 2.9, and 2.5 Å resolution. We show that ALDH9A1 exhibits wide substrate specificity for aminoaldehydes and for aliphatic and aromatic aldehydes, with a clear preference for γ-trimethylaminobutyraldehyde (TMABAL). The structure of ALDH9A1 reveals that the enzyme assembles as a tetramer. Each ALDH monomer displays a typical ALDH fold composed of an oligomerization domain, a coenzyme domain, a catalytic domain, and an inter-domain linker highly conserved in amino-acid sequence and folding. Nonetheless, structural comparison reveals a position and a fold of the inter-domain linker of ALDH9A1 never observed in any other ALDH so far. This unique conformation is not compatible with the presence of a bound substrate, and a large conformational rearrangement of the linker, up to 30 Å, has to occur to allow access to the substrate channel. Moreover, the αβE region of the coenzyme domain, consisting of an α-helix and a β-strand at the dimer interface, is disordered, likely due to the loss of interactions with the inter-domain linker, which leads to an incomplete β-nicotinamide adenine dinucleotide (NAD+) binding pocket.

The ALDH9A1 gene is highly expressed in the liver, skeletal muscle, kidney, pancreas, and heart [5]. The enzyme is also present in the brain and the spinal cord [6]. It displays activity with the dopamine metabolite 3,4-dihydroxyphenylacetaldehyde (DOPAL) [7] and much higher activity with betaine aldehyde (BAL) [6,8], leading to the production of glycine betaine (the enzyme is therefore often annotated as BADH, E.C. 1.2.1.8). Glycine betaine is a quaternary ammonium compound that acts as a zwitterion at physiological pH and maintains protein and membrane conformations under various stress conditions [9]. A mammalian ALDH displaying γ-trimethylaminobutyraldehyde (TMABAL) dehydrogenase (TMABALDH, E.C. 1.2.1.47) activity was first purified from bovine liver [10]. Later, other mammalian isoforms from rat and human were shown to be TMABALDHs, and it became clear that TMABALDH and ALDH9A1 are the same enzyme [11]. The enzyme is involved in the carnitine synthesis pathway, which comprises three other enzymes: trimethyllysine dioxygenase, 3-hydroxy-N-trimethyllysine aldolase, and γ-butyrobetaine (TMABA) dioxygenase (Figure 1). Carnitine (3-hydroxy-4-N-trimethylaminobutyrate) is a water-soluble quaternary amine that transports CoA-activated fatty acids into the mitochondrial matrix for β-oxidation, as well as products of peroxisomal β-oxidation, for their full oxidation to CO2 and H2O in the Krebs cycle [12]. Both carnitine synthesis and uptake are regulated by the peroxisome proliferator-activated receptor α (PPARα), a transcription factor involved in lipid metabolism and energy homeostasis. PPARα is abundantly expressed in tissues showing high rates of β-oxidation, such as liver and kidney [13], and has been shown to regulate the expression of the ALDH9A1 gene [14].
ALDH9A1 may play a role in systemic vasculitis and vasculitis-associated diseases such as Kawasaki disease, also known as mucocutaneous lymph node syndrome [15]. Serum from patients with Kawasaki disease contains significantly elevated levels of anti-ALDH9A1 antibodies. Decreased expression of ALDH9A1 may also contribute to non-alcoholic steatohepatitis without inflammation, as reported in a study on a mouse model [16]. Allelic variants of the ALDH9A1 gene have also been observed [5]. Polymorphism in several genes related to the GABA signaling pathway, including ALDH9A1, has been associated with neuroleptic-induced tardive dyskinesia, an involuntary movement disorder that develops in patients undergoing long-term treatment with antipsychotic medications [17].
Herein, we report a complete enzymatic characterization and the structure of human ALDH9A1 in three different crystal forms. Kinetic parameters and substrate specificity were determined using various aminoaldehydes, including 3-aminopropanal (APAL) and 4-guanidinobutyraldehyde (GBAL), which have never before been analyzed with this enzyme. ALDH9A1 is one of the few remaining members of the human ALDH superfamily whose crystal structure is still unknown. Although the enzyme was co-crystallized with the β-nicotinamide adenine dinucleotide (NAD+) coenzyme, all structures, which are very similar, are devoid of NAD+ and display the same disordered region that normally forms the coenzyme binding site. Structural comparison with the cod liver ALDH9A2 (PDB ID: 1BPW) and other ALDHs revealed that the inter-domain linker of human ALDH9A1, involved in coenzyme binding, adopts a position and a fold that have never been observed in any X-ray structure of an ALDH.

Substrate specificity and kinetic parameters
The human ALDH9A1 (GenBank ID: AF172093) is a protein of 494 amino acids (Uniprot ID: P49189). A new polymorphism site was discovered at position 330 of the cDNA used in this work. It appears in the codon for Ile110 (ATT is changed to ATC). However, this variation does not alter the amino acid sequence. Kinetic measurements were performed at the known optimal pH of 7.5, as described in previous studies [3,8,11]. The enzyme is very sensitive and becomes nearly inactive after two re-freezing cycles. The effect of the buffer composition on protein stability was further investigated using nano-differential scanning fluorimetry (nanoDSF) (Figure 2A), and the highest melting temperatures were observed in buffers at pH 7.0 and 7.5.
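The silent nature of this variation can be checked with a few lines of Python. The sketch below assumes that position 330 is counted from the first nucleotide of the start codon (an assumption, since the text does not state the numbering origin) and uses only the two codons named above; the helper name is illustrative.

```python
# Standard genetic code entries for the two codons named in the text.
CODON_TO_AA = {"ATT": "Ile", "ATC": "Ile"}


def locate_in_codon(nt_position):
    """Map a 1-based ORF nucleotide position to (codon number, position in codon)."""
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


# Assumption: position 330 is numbered from the A of the ATG start codon.
codon_number, position_in_codon = locate_in_codon(330)
is_synonymous = CODON_TO_AA["ATT"] == CODON_TO_AA["ATC"]
print(codon_number, position_in_codon, is_synonymous)
```

Under that numbering, nucleotide 330 is the third (wobble) base of codon 110, which agrees with the Ile110 assignment and with an unchanged amino acid.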
Several aminoaldehydes were screened as potential substrates of ALDH9A1 using a fixed 1 mM concentration of the NAD+ coenzyme. The enzyme displays a wide substrate specificity (Figure 2B). Indeed, although ALDH9A1 shows the highest activity with TMABAL and BAL, it can oxidize other aminoaldehydes such as APAL, ABAL, γ-dimethylaminobutyraldehyde (DMABAL), or GBAL. In line with previous studies, where specific activities of 14, 1.8, and 2.3 nmol.s⁻¹.mg⁻¹ using ABAL were reported for the enzyme isolated from liver and brain [3,6,8], we measured a final activity of ∼1.8 nmol.s⁻¹.mg⁻¹. The enzyme can also convert aliphatic aldehydes such as acetaldehyde, hexanal, or 2-hexanal (lipid peroxidation product) as well as aromatic aldehydes such as benzaldehyde and DOPAL. The Km value for NAD+ of 32 ± 2 μM correlates well with the measured Kd value of 16 ± 3 μM and the previously reported Km value of 13 μM [3]. As the activity with NADP+ is only ∼2-3% of that with NAD+, NAD+ is the preferred coenzyme for HsALDH9A1. Kinetic properties of ALDH9A1 were further explored (Table 1).
Table 1. Kinetic and affinity parameters for human ALDH9A1 and selected substrates. Vmax/Km ratios are expressed in relative values referring to the best substrate, TMABAL (Vmax/Km = 1). Saturation curves for aldehydes were measured in 150 mM sodium pyrophosphate, pH 7.5, using 1 mM NAD+; the saturation curve for NAD+ was measured using 150 μM TMABAL. Kinetic constants, including their standard error values and the substrate inhibition constant Ki, were determined using GraphPad Prism 5.0 software. The lower Vmax value for NAD+ (indicated by asterisks) compared with that for TMABAL results from using a fixed sub-saturating TMABAL concentration in the saturation of the enzyme by NAD+. Kd values were determined using MST. Abbreviations: n.d., not determined; TMAPAL, 3-(trimethylamino)propionaldehyde.
A comparison of the catalytic efficiency values (Vmax/Km) shows that TMABAL is the best substrate in vitro. The enzyme displays a Km value of 6 ± 1 μM (Vmax ∼ 9.8 nmol.s⁻¹.mg⁻¹) for this substrate, while that for BAL is much higher, 216 ± 16 μM (Vmax ∼ 6.3 nmol.s⁻¹.mg⁻¹). Dissociation constants (Kd) of 6 ± 3 μM for TMABAL (Figure 2C) and 171 ± 35 μM for BAL were measured by microscale thermophoresis (MST) in the absence of the coenzyme. These values are close to the respective Km values. Together with the catalytic efficiency values, they indicate that ALDH9A1 should primarily act as a TMABALDH in vivo. Nonetheless, the putative in vivo substrates ABAL, APAL, and GBAL, which share similar saturation curves, can be oxidized with Vmax values approximately five-fold lower than that for TMABAL and with Km values in the low micromolar range (Figure 2D and Table 1). Indeed, a Km value of 13 μM for ABAL with a Vmax value of 33 nmol.s⁻¹.mg⁻¹ was previously reported for the native enzyme [3], as well as Km values of 5 and 260 μM for TMABAL [11] and BAL [8], respectively.
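The relative catalytic efficiency quoted for Table 1 can be reproduced from the rounded Km and Vmax values given in the text. The following is a minimal sketch; the function name is illustrative, and the result may differ slightly from the tabulated value because the inputs are rounded.

```python
def catalytic_efficiency(vmax, km):
    """Return Vmax/Km (here in nmol.s^-1.mg^-1 per uM)."""
    return vmax / km


# Rounded values quoted in the text.
tmabal = catalytic_efficiency(vmax=9.8, km=6.0)
bal = catalytic_efficiency(vmax=6.3, km=216.0)

# Efficiency relative to the best substrate (TMABAL = 1).
relative_bal = bal / tmabal
print(f"BAL relative Vmax/Km: {relative_bal:.3f}")
```

With these inputs, BAL reaches roughly 2% of the catalytic efficiency observed for TMABAL.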

Crystal structure of HsALDH9A1
Crystallization of the HsALDH9A1 apoform was not successful. Co-crystallization with 50 mM NAD+, with or without 10 mM TMABA product, resulted in different crystal forms. However, the three solved structures from the P2₁2₁2, P2₁, and C2 space groups at 2.5, 2.9, and 2.3 Å, respectively (Table 2), reveal an ALDH9A1 apoform without bound NAD+ or TMABA. The structures were determined by molecular replacement using the structure of BADH from cod (Gadus morhua subsp. callarias) liver as a search model (PDB IDs: 1BPW and 1A4S; 71% sequence identity) [19]. This enzyme is also annotated as GmALDH9A2 [1]. The asymmetric units of the P2₁2₁2 and P2₁ structures contain two similar tetramers (dimer-of-dimers) and that of C2 only one (Figure 3A). The tetrameric form in solution, with a molecular mass of 214 ± 12 kDa, was also confirmed by gel permeation chromatography (monomer of 55.4 kDa including the His-tag). All monomers are very similar to each other, with an RMSD of up to 0.24 Å. Superposition with monomers of GmALDH9A2 gives an RMSD of 1.3-1.6 Å. Each monomer displays the classical ALDH fold consisting of a catalytic domain (residues 258-448) with the catalytic Cys288, a coenzyme binding domain (residues 1-127, 146-257, 470-478), and an oligomerization domain (residues 128-145 and 479-494), which wraps over the groove between the catalytic and coenzyme domains of the other monomer forming the dimer (Figure 3B). The first remarkable difference concerns the region comprising residues 232-256 of HsALDH9A1, which belongs to the coenzyme domain and usually forms the dimer interface in other known ALDHs. The region αβE, named according to the nomenclature for GmALDH9A2 [19], is composed of the αE helix followed by the βE strand. The αβE region is not visible in the electron density maps and is thus highly mobile (Figure 3C,D).
The αE helix delineates the coenzyme cavity and possesses conserved residues known to bind the pyrophosphate moiety, such as Ser233 and Thr236 (Ser242 and Thr245 in GmALDH9A2). The βE strand, defined by residues 250-254, would be a part of the central five-stranded pleated β-sheet of the coenzyme domain (the Rossmann fold). The last residue of this strand is the conserved active-site base Glu254. The remaining residues 255-257 bridge the gap over the nicotinamide riboside moiety and connect to the catalytic domain.
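The tetramer assignment from gel permeation chromatography reported above can be checked with a short calculation. This sketch simply divides the apparent native mass by the monomer mass, both taken from the text, and propagates the stated ±12 kDa uncertainty; the function name is illustrative.

```python
def subunit_count(native_kda, monomer_kda):
    """Estimate the number of subunits from native and monomer masses."""
    return native_kda / monomer_kda


MONOMER_KDA = 55.4  # including the His-tag, from the text
NATIVE_KDA = 214.0  # gel permeation chromatography, from the text
ERROR_KDA = 12.0

best = subunit_count(NATIVE_KDA, MONOMER_KDA)
low = subunit_count(NATIVE_KDA - ERROR_KDA, MONOMER_KDA)
high = subunit_count(NATIVE_KDA + ERROR_KDA, MONOMER_KDA)
print(f"{best:.2f} subunits (range {low:.2f}-{high:.2f})")
```

The estimate rounds to four subunits, and the uncertainty range includes four, in line with the dimer-of-dimers arrangement seen in the crystals.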

Inter-domain linker and active site
The inter-domain linker (residues 449-470), which does not adopt the typical fold observed so far in all known ALDH structures (Figure 3E,F), represents the second major difference in HsALDH9A1. In the classical fold, this highly conserved region has an important role both in stabilizing the coenzyme binding site and in interacting with the bound substrate. Indeed, it protrudes along the edge between the coenzyme and the catalytic domains and establishes several H-bonds, including with the βE strand forming the coenzyme binding site. In the HsALDH9A1 structure, this inter-domain region adopts a new position and fold, associated with large conformational changes compared with the GmALDH9A2 structure [19]. This conformation is not compatible with a bound substrate (Figure 3G). For example, Lys461-Lys462-Ser463-Gly464 are located between 23 and 30 Å away from their equivalent residues in GmALDH9A2, and Pro452-Val453 and Glu454 (10 Å away from their equivalent residues in GmALDH9A2) occupy the substrate binding site.
NanoDSF experiments with the purified apo-enzyme and with the enzyme in the presence of 5 mM NAD+, 5 mM substrate, or 10 mM product were performed to check transitions in the folding state of HsALDH9A1. While a high single peak was present for the ALDH9A1 apoform as well as for the coenzyme complex (69.5 °C, Figure 4A), the enzyme became slightly destabilized with the substrate TMABAL (63 °C) and the product TMABA (66 °C). Moreover, a broad transition was observed at lower temperatures, between 45 and 52 °C, for both the substrate and product complexes, indicating possible conformational changes compared with the apoform.

Discussion
In the present study, using kinetic assays and affinity measurements, we showed that among the various available aminoaldehydes, ALDH9A1 preferentially oxidizes TMABAL, with the highest rates reached at very low, micromolar saturating concentrations. This suggests that the major in vivo role of this enzyme is its TMABALDH activity, resulting in TMABA and subsequent carnitine production, especially in the liver. However, in other organs, the enzyme may be involved in the oxidation of other aminoaldehydes and aliphatic aldehydes. While the involvement of ALDH9A1 in the production of the GABA neurotransmitter from ABAL was discussed in the past [6], its role in the degradation of APAL or GBAL has not been considered at all. These aldehydes were previously studied in depth for the plant ALDH10 family, the closest relative of the ALDH9 family [20,21].
APAL is known to cause apoptotic and necrotic death of both neurons and glial cells during cerebral ischemia [22]. As ALDH9A1 is expressed in the brain, it is very likely that it oxidizes ABAL as well as APAL in this organ. It is well known that APAL is produced by polyamine oxidase mediating the oxidation of spermine and spermidine [23] and by spermine oxidase releasing spermidine and hydrogen peroxide [24]. Moreover, APAL can be non-enzymatically converted into acrolein, which is even more toxic than hydrogen peroxide [25]. On the other hand, GBAL oxidation represents another route of GABA production (GABA is predominantly generated by the cytosolic glutamate decarboxylase, E.C. 4.1.1.15). GBAL is produced from agmatine, as demonstrated with swine kidney diamine oxidase [26]. Kidneys may be a site of further GBAL conversion into γ-guanidinobutyrate, which is likely hydrolyzed to GABA by agmatinase. Again, this enzyme is highly abundant in the liver and kidney [27]. Oxidation of BAL to glycine betaine apparently requires higher BAL concentrations, in contrast with the other aminoaldehydes. A Km value of 182 μM was reported for BAL with ALDH9A1, with a catalytic efficiency similar to that for ABAL and only 3% of that for TMABAL [11]. Glycine betaine is known to contribute to normal homocysteine metabolism by donating its methyl group for the remethylation of homocysteine to methionine. Again, the major site of betaine metabolism is the liver [28,29].
The structural comparison with the cod liver ALDH9 (PDB ID: 1BPW) revealed two major differences that are likely linked to each other. The first one concerns the αβE region, which is composed of the α-helix E and β-strand E, located at the dimer interface. This region interacts with the bound NAD+ when present in ALDH structures, and the absence of a bound NAD+ in the HsALDH9A1 structure is consistent with this largely disordered region. Moreover, with or without a bound coenzyme, this region has always been observed in X-ray structures in contact with the inter-domain linker located in the C-terminal part of the enzyme. The active site residues as well as those forming the inter-domain linker are highly conserved among ALDH9 family members (Figure 4B,C). Only two residues are different in the active site of GmALDH9A2 compared with HsALDH9A1. First, the cysteine that nearly always follows the catalytic cysteine at the neighboring position (Cys289 in HsALDH9A1) is replaced by a threonine residue (Thr298) in the cod enzyme. Second, the highly conserved glutamine Gln161 in ALDH9A1 is substituted by the methionine Met170 in GmALDH9A2. The cod enzyme, which has only been studied with six substrates [30], displays the highest catalytic efficiency for BAL followed by benzaldehyde. As no other aminoaldehydes, including TMABAL, were tested, it is difficult to deduce any effect of these two substitutions on substrate specificity. The Km value of 140 μM and Vmax of 15 nmol.s⁻¹.mg⁻¹ for BAL are comparable with those obtained in the present study for HsALDH9A1 (216 μM and 6.3 nmol.s⁻¹.mg⁻¹).
There are also four differences in sequence between the inter-domain linkers of HsALDH9A1 and GmALDH9A2 (Figure 5), but none of them involves residues establishing H-bonds to the βE strand or to residues from the catalytic and coenzyme domains. This inter-domain linker serves as a hook in the formation and stabilization of the coenzyme binding site by interacting with the region αβE, as well as with a loop within the active site that interacts with the aldehyde substrate. In our three HsALDH9A1 structures, the inter-domain linker adopts a unique fold, never observed so far, that prevents the binding of any substrate. A drastic rearrangement of up to 30 Å is therefore required to access the substrate channel. In order to check whether the observed position of the inter-domain linker was independent of the crystallization conditions at acidic pH, we measured the enzyme activity in sodium citrate, pH 5.6. The enzyme still displayed 15% activity, and TMABA binding could be measured at this pH by MST (Kd of 2.6 ± 0.2 mM). Therefore, HsALDH9A1, which crystallized in three different crystal forms, remained active at this acidic pH. The same fold of the inter-domain linker in the three crystal forms suggests that the linker adopts this conformation in the apoform. Therefore, a switch corresponding to a rearrangement of up to 30 Å must occur for substrate binding followed by NAD+ binding. A fine control mechanism of the enzyme may exist, as previously reported for ALDH5 [31], in which the substrate channel is blocked by the catalytic loop through disulfide bond formation between the catalytic cysteine Cys340 and the neighboring Cys342. However, no similar disulfide bond is formed in HsALDH9A1.

Expression and purification
The human ALDH9A1 ORF was amplified using gene-specific primers (5′-TTAGGATCCGATGAGCACTGGCACCTTC-3′ and 5′-ACACTCGAGGTCAAAAAGCAGATTCCACA-3′) and Accuprime Pfx polymerase (Life Technologies), ligated into a pCDFDuet vector (Merck Millipore) using BamHI and XhoI, and transformed into T7 express Escherichia coli cells (New England Biolabs) and Rosetta2 (DE3) pLysS cells (Merck Millipore). Cells were grown at 37 °C in LB medium; at OD600 = 0.5, the cultures were supplemented with 0.5 mM isopropyl-β-thiogalactopyranoside for protein expression and incubated at 20 °C overnight. Recombinant ALDHs were purified on HisPur Cobalt or NiNTA Spin Columns (Thermo Fisher Scientific) in 20-50 mM Tris/HCl, pH 7.5, with or without 150 mM NaCl. Enzymes were concentrated using Amicon 30 kDa filters (Merck Millipore) and further purified by gel filtration chromatography on a HiLoad 26/60 Superdex 200 column.
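As a quick sanity check on the cloning design, the two primers can be scanned for the recognition sequences of the enzymes used for ligation. The sketch below uses the primer sequences from the text and the standard BamHI (GGATCC) and XhoI (CTCGAG) recognition sites; both sites are palindromic, so a single-strand search is sufficient. The helper name is illustrative.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "forward": "TTAGGATCCGATGAGCACTGGCACCTTC",
    "reverse": "ACACTCGAGGTCAAAAAGCAGATTCCACA",
}


def find_sites(primer):
    """Return {enzyme: 0-based index} for every recognition site found in the primer."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
```

Each primer carries exactly one of the two sites, placed after a three-nucleotide 5′ overhang, which matches the BamHI/XhoI cloning described above.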

Affinity and thermal stability determination
The MST method was used to determine binding affinities for TMABAL, BAL, and NAD+. HsALDH9A1 was labeled with the His-Tag Labeling Kit RED-tris-NTA (100 nM dye + 200 nM His-tagged protein) for 30 min. The labeled protein was adjusted to 50 nM with 50 mM Tris/HCl buffer, pH 7.5, supplemented with 0.05% Tween. A series of sixteen 1:1 ligand dilutions was prepared using the identical buffer. Measurements were done on a Monolith NT.115 instrument (NanoTemper Technologies). Data from three independently pipetted measurements were analyzed. Thermostability was measured by nano-differential scanning fluorimetry on a Prometheus NT.48 instrument (NanoTemper Technologies) with back-reflection aggregation detection over a range from 20 to 95 °C and with a heating rate of 1 °C.min⁻¹. Protein unfolding was followed by tryptophan fluorescence intensity at 330 and 350 nm in various buffers covering a pH range of 7.0-9.0 in the presence or absence of 100 mM NaCl and 5% (v/v) glycerol. The melting temperature (Tm) was determined by detecting the maximum of the first derivative of the fluorescence ratio (F350/F330) after fitting the experimental data with a polynomial function. Data were measured in triplicate. The effect of the presence of coenzyme, substrate, or product was measured using a Tycho NT.6 instrument with a heating rate of 30 °C.min⁻¹.
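The Tm determination from the first derivative of the F350/F330 ratio can be illustrated as follows. This is a simplified sketch, not the instrument software's procedure: it uses a central-difference derivative instead of a polynomial fit, and the unfolding curve is synthetic (a two-state sigmoid with an assumed midpoint of 69.5 °C and arbitrary baseline and amplitude), not measured data.

```python
import math


def melting_temperature(temps, ratios):
    """Return the temperature at which the central-difference derivative is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (ratios[i + 1] - ratios[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic F350/F330 curve from 20 to 95 degrees C in 0.5 degree steps.
# Midpoint, width, baseline and amplitude are illustrative assumptions.
temps = [20 + 0.5 * i for i in range(151)]
ratios = [0.8 + 0.3 / (1 + math.exp(-(t - 69.5) / 1.5)) for t in temps]

tm = melting_temperature(temps, ratios)
print(f"Tm = {tm:.1f} degrees C")
```

For noisy experimental traces, smoothing or fitting the ratio before differentiation (as described in the text) would be needed, because finite differences amplify noise.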

Enzyme kinetics
Enzyme activity was measured by monitoring NAD(P)H formation (ε340 = 6.22 mM⁻¹.cm⁻¹) on an Agilent 8453 UV-Vis spectrophotometer (Agilent) at 30 °C. Britton-Robinson buffers in the pH range of 6-10, adjusted to a constant ionic strength of 0.15 M, were used to determine the pH optimum.
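The conversion from an absorbance slope at 340 nm to a specific activity follows from the Beer-Lambert law. The sketch below uses the extinction coefficient given in the text; the path length, assay volume, enzyme amount, and absorbance slope in the example are hypothetical values chosen only for illustration.

```python
EPSILON_NADH_340 = 6.22  # mM^-1.cm^-1, from the text


def specific_activity(delta_a340_per_min, volume_ml, enzyme_mg, path_cm=1.0):
    """Return specific activity in nmol NAD(P)H per second per mg of protein."""
    # Beer-Lambert law: concentration change (mM/min) = dA / (epsilon * path length)
    mm_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_cm)
    # 1 mM in 1 mL corresponds to 1 umol, i.e. 1000 nmol
    nmol_per_min = mm_per_min * volume_ml * 1000.0
    return nmol_per_min / 60.0 / enzyme_mg


# Hypothetical assay: 0.0622 A/min in a 1 mL cuvette (1 cm path) with 0.01 mg enzyme.
activity = specific_activity(delta_a340_per_min=0.0622, volume_ml=1.0, enzyme_mg=0.01)
print(f"{activity:.2f} nmol.s^-1.mg^-1")
```

The result is expressed in the same units (nmol.s⁻¹.mg⁻¹) as the Vmax values reported in this work.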
Aliphatic and aromatic aldehydes, BAL chloride, and the diethylacetals of APAL and ABAL were purchased from Sigma-Aldrich. Diethylacetals of GBAL, 3-guanidinopropionaldehyde (GPAL), 3-(trimethylamino)propionaldehyde (TMAPAL), and 4-(trimethylamino)butyraldehyde (TMABAL) were synthetic preparations [11,32]. Free aminoaldehydes were prepared by heating their acetals in a plugged test tube with 0.2 M HCl for 10 min [33]. Substrate screening was done by adding various aldehydes at a final concentration of 1 mM in 150 mM sodium pyrophosphate, pH 7.5, and 1 mM NAD+. Saturation curves for substrates were measured using 1 mM NAD+. Kinetic constants were determined using GraphPad Prism 5.0 data analysis software (www.graphpad.com) by fitting data to the Michaelis-Menten equation. When substrate inhibition was observed, data were analyzed by nonlinear regression using the Michaelis-Menten equation that accounts for partial substrate inhibition:
$$v = \frac{V_{max}[S]}{K_m + [S]\left(1 + \frac{[S]}{K_i}\right)}$$
where v is the determined initial velocity, Vmax is the maximal velocity, [S] is the concentration of the substrate, Km is the substrate concentration at half-maximal velocity, and Ki is the substrate inhibition constant. The saturation curve for NAD+ was measured using 0.1 mM TMABAL, which is a sub-saturating concentration providing the maximal experimentally attainable activity and is not affected by substrate inhibition. Therefore, the kinetic constants calculated for the coenzyme are only apparent.
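The behavior of a substrate-inhibition rate law can be explored numerically. The sketch below assumes the commonly used form v = Vmax[S]/(Km + [S](1 + [S]/Ki)), which involves only the symbols defined above; the Km and Vmax values are the rounded TMABAL values from the text, while the Ki value is a hypothetical placeholder, not a measured constant.

```python
import math


def velocity(s, vmax, km, ki=math.inf):
    """Initial velocity with substrate inhibition; ki=inf gives plain Michaelis-Menten."""
    return vmax * s / (km + s * (1.0 + s / ki))


VMAX = 9.8   # nmol.s^-1.mg^-1, rounded value from the text (TMABAL)
KM = 6.0     # uM, rounded value from the text (TMABAL)
KI = 1000.0  # uM, hypothetical value for illustration only

# For this rate law the velocity is maximal at [S] = sqrt(Km * Ki).
s_opt = math.sqrt(KM * KI)
v_opt = velocity(s_opt, VMAX, KM, KI)
print(f"optimum [S] = {s_opt:.1f} uM, v = {v_opt:.2f} nmol.s^-1.mg^-1")
```

Because the velocity declines above this optimum, a saturation curve for the coenzyme has to be recorded at a fixed substrate concentration below full saturation, which is why the resulting constants are only apparent.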