AFPs (antifreeze proteins) are produced by many organisms that inhabit ice-laden environments. They facilitate survival at sub-zero temperatures by binding to, and inhibiting the growth of, ice crystals in solution. The Antarctic bacterium Marinomonas primoryensis produces an exceptionally large (>1 MDa) hyperactive Ca2+-dependent AFP. We have cloned, expressed and characterized a 322-amino-acid region of the protein where the antifreeze activity is localized that shows similarity to the RTX (repeats-in-toxin) family of proteins. The recombinant protein requires Ca2+ for structure and activity, and it is capable of depressing the freezing point of a solution in excess of 2 °C at a concentration of 0.5 mg/ml, therefore classifying it as a hyperactive AFP. We have developed a homology-guided model of the antifreeze region based partly on the Ca2+-bound β-roll from alkaline protease. The model has identified both a novel β-helical fold and an ice-binding site. The interior of the β-helix contains a single row of bound Ca2+ ions down one side of the structure and a hydrophobic core down the opposite side. The ice-binding surface consists of parallel repetitive arrays of threonine and aspartic acid/asparagine residues located down the Ca2+-bound side of the structure. The model was tested and validated by site-directed mutagenesis. It explains the Ca2+-dependency of the region, as well as its hyperactive antifreeze activity. This is the first bacterial AFP to be structurally characterized and is one of only five hyperactive AFPs identified to date.
Since their discovery over 30 years ago in Antarctic fish, antifreeze proteins (AFPs) have been found in a wide array of organisms that inhabit freezing environments. AFPs enhance survival at sub-zero temperatures by inhibiting the growth of potentially harmful ice crystals within the organism. If left uncontrolled, ice growth can lead to cellular dehydration, tissue damage and death of the organism. By binding to the surface of ice crystals, AFPs are able to depress the freezing point of a solution below its melting point. The difference between these temperatures is termed ‘thermal hysteresis’ (TH) and is used as a measure of antifreeze activity.
Fish and insects that avoid freezing typically have AFPs with high TH activity. Plants and micro-organisms that are unable to avoid freezing are thought to use AFPs with weaker TH activity to inhibit recrystallization of ice [5,6]. It is remarkable, therefore, that the Antarctic bacterium Marinomonas primoryensis produces a Ca2+-dependent AFP with a strong TH activity. This could potentially help it avoid freezing in the brackish ice-covered lakes from which it was isolated. MpAFP (M. primoryensis AFP) is an extremely large protein with unusual properties that make it refractory to conventional purification techniques. Nevertheless, we were able to sequence a number of tryptic peptides from gel-purified MpAFP, some of which resemble sequences from the Ca2+-binding regions of RTX (repeats-in-toxin) proteins (J.A. Gilbert, C.P. Garnham, L.A. Graham, J. Laybourn-Parry and P.L. Davies, unpublished work). A genomic lambda library of M. primoryensis was constructed and probed with a PCR-amplified DNA sequence developed from tryptic peptides in the RTX-like region. Preliminary analysis shows the gene encodes a huge (>1 MDa) protein that contains two repetitive sequences that divide the protein into five distinct regions (Figure 1A). Region II of the protein contains an unknown number of tandem repeats of a 104-aa (104-amino-acid) sequence. Each 104-aa repeat is perfectly conserved, even at the DNA level. This has made it difficult to determine the exact copy number of 104-aa repeats, and therefore difficult to estimate the size of MpAFP.
Sequence, expression and purification of MpAFP Region IV
In the interim, we have focused our research on Region IV, the second repetitive region of MpAFP. Region IV consists of tandem 19-aa repeats, each beginning with the consensus sequence XGTGND, where X is usually alanine or glycine. This region of MpAFP is clearly homologous with the RTX proteins. RTX proteins are a diverse family of secreted virulence factors found in Gram-negative gammaproteobacteria. They are so called because of the presence of a tandemly repeated nonapeptide motif located towards their C-terminus. The motif has a consensus sequence of GGXGXDXUX, where X represents any residue and U is a large hydrophobic residue. AP (alkaline protease), an RTX-like virulence factor secreted by the Gram-negative bacterium Pseudomonas aeruginosa, contains five copies of this nonapeptide motif. The X-ray-crystallographic structure of this protein revealed that these repeats fold into a parallel β-roll stabilized by internal Ca2+ ions sandwiched between the turns of the structure.
The wild-type MpAFP is dependent on Ca2+ for TH activity, and this activity is resistant to proteolysis by trypsin only if Ca2+ is present. This suggests a structural role for the cation, as opposed to its cofactor role in the Ca2+-dependent type II fish AFP from the Atlantic herring (Clupea harengus) [11,12]. On the basis of these data and the similarity of Region IV to the Ca2+-binding β-roll from AP, we hypothesized that Region IV was the antifreeze-active region of MpAFP and that its structure would be similar to the β-roll of AP. Indeed, cloning and expression of Region IV in Escherichia coli revealed it to be a potent AFP, capable of depressing the freezing temperature of a solution in excess of 2 °C at a concentration of 0.5 mg/ml. This level of TH is quantitatively the same as the wild-type MpAFP and indicated that Region IV could account for the antifreeze activity of MpAFP.
Attempts to solve the three-dimensional structure of MpAFP Region IV have been unsuccessful. The isolated domain has a propensity to aggregate and precipitate above 1 mg/ml, a concentration well below that needed for solution NMR or X-ray crystallography. However, the repetitive nature of Region IV and its homology with the known β-roll structure in AP make it an ideal candidate for molecular modelling.
In this paper we present a β-helical model of MpAFP Region IV based partly on the Ca2+-bound β-roll from the RTX-like AP. The model contains both a novel ice-binding motif and β-helical fold that explains the Ca2+-dependency and hyperactivity of the protein. The modelled structure revealed a putative ice-binding site along one edge of the β-helix that is flat and is comprised of parallel ranks of equally spaced threonine and asparagine residues. The threonine and asparagine residues originate from each XGTGND repeat that forms a Ca2+-bound turn in the β-helix similar to the GGXGXD Ca2+-bound turns in AP. The IBF (ice-binding face) of the model is similar to the IBFs of the hyperactive β-helical insect AFPs. They each contain a repetitive TXT motif located in a single flat β-sheet on one side of their respective structures [13,14]. It is hypothesized that this allows the insect AFPs to bind both primary prism and basal planes of an ice crystal by mimicking the distance of oxygen atoms in the ice lattice. Our model of Region IV predicts that it, too, should bind both basal and prism planes of an ice crystal lattice by the same mechanism. However, the location of the IBF in the Ca2+-bound turn of the model, as well as the use of threonine and asparagine/aspartic acid as ice-binding residues, is novel compared with the insect AFPs and highlights the plasticity of the β-helix as an AFP structural fold. Extensive site-directed mutagenesis was performed in order to validate the putative IBF of the model and its overall fold. Mutation of specific threonine and asparagine residues on the putative IBF of the protein to tyrosine resulted in large decreases in TH activity, whereas TH activity was left unchanged when similar substitutions were made off the IBF.
MATERIALS AND METHODS
Cloning and expression of MpAFP Region IV
Two primers were designed to clone and express a 322-aa portion of MpAFP as a C-terminally His6 (hexahistidine)-tagged protein. The 322-aa segment corresponds to residues 997–1318 inclusively of the sequence deposited in the NCBI protein database under accession number ABL74378. This portion contained a short segment of Region III (74 aa), followed by the entire Region IV (207 aa, corresponding to residues 1071–1277 of ABL74378) and a short segment of Region V (41 aa). The forward primer was 5′-ACGTCATATGAATGTGTCGCAATCAAATTCG-3′, which contained an NdeI site, and the reverse primer was 5′-TGCACTCGAGATAGTCAGCAAAGTCCGCAGG-3′, which contained an XhoI site. Following digestion with NdeI and XhoI, the PCR product obtained from the genomic DNA template was cloned into the corresponding sites of the expression vector pET-24a. Positive clones were inoculated into 1 litre of Luria–Bertani medium with kanamycin (100 μg/ml) and grown at 37 °C and at 200 rev./min agitation until the attenuance (D600) reached 0.5. The cells were allowed to grow for another 0.5 h at 23 °C (or until the D600 reached 1.0), then isopropyl β-D-thiogalactoside was added to a final concentration of 1 mM to induce expression overnight at 23 °C. Following incubation, the cells were recovered by centrifugation (2500 g for 30 min at 4 °C) and resuspended in 25 ml of buffer A (50 mM Tris/HCl, pH 7.5, 150 mM NaCl, and 2 mM CaCl2). The resuspension was sonicated [Fisher Scientific (Waltham, MA, U.S.A.) Sonic Dismembrator, model 5000; five bursts of 45 s each at 50% amplitude] to break open the cells and centrifuged at 21000 g and 4 °C for 1 h in a JA20 rotor to remove cellular debris.
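The primer design and residue bookkeeping described above can be checked with a short script. This is only an illustrative sketch: the primer sequences and residue numbers are taken from the text, the recognition sequences for NdeI (CATATG) and XhoI (CTCGAG) are the standard ones, and the helper names are hypothetical.

```python
# Standard recognition sequences for the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

# Primer sequences (5'->3') as quoted in the text.
FORWARD = "ACGTCATATGAATGTGTCGCAATCAAATTCG"
REVERSE = "TGCACTCGAGATAGTCAGCAAAGTCCGCAGG"


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.find(SITES[enzyme])


def inclusive_length(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


# Residue numbering refers to accession ABL74378, as stated in the text.
construct_len = inclusive_length(997, 1318)   # whole cloned segment
region_iv_len = inclusive_length(1071, 1277)  # Region IV only
flank_n = 1071 - 997                          # Region III residues before Region IV
flank_c = 1318 - 1277                         # Region V residues after Region IV

if __name__ == "__main__":
    print("NdeI site at index", site_position(FORWARD, "NdeI"))
    print("XhoI site at index", site_position(REVERSE, "XhoI"))
    print(flank_n, "+", region_iv_len, "+", flank_c, "=", construct_len)
```

In both primers the restriction site sits immediately after a four-nucleotide 5′ overhang, and the three segment lengths sum to the 322 aa stated for the construct.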
Purification of His6-tagged Region IV of MpAFP
The crude cellular lysate was mixed with 10 ml of Ni-NTA (Ni2+-nitrilotriacetate) resin (Qiagen, Mississauga, ON, Canada) and stirred at 4 °C for 30 min, loaded into a column, washed with buffer N [0.5 M NaCl, 50 mM Tris/HCl, pH 7.5, 2% (v/v) glycerol, 2 mM CaCl2 and 5 mM imidazole] and eluted with buffer N containing 250 mM imidazole. Fractions displaying TH were pooled and diluted 10-fold in buffer A to decrease the NaCl concentration. This material was applied to a HiLoad 16/10 Q-Sepharose High Performance column (Amersham) equilibrated with buffer A and eluted with a 0–1 M NaCl gradient in buffer A over 10 column volumes. Samples displaying TH activity were again pooled and loaded on to a HiLoad 16/60 Superdex 75 prep grade size-exclusion column (Amersham) equilibrated in buffer A. Fractions were analysed for purity by SDS/PAGE.
All mutagenesis was performed by using the QuikChange® II XL Site Directed Mutagenesis Kit as described by the manufacturer (Stratagene, La Jolla, CA, U.S.A.) and checked by DNA sequencing (Cortec, Kingston, ON, Canada). Each mutant of Region IV was purified in the same manner as wild-type recombinant Region IV.
CD spectra were measured on an Olis Rapid Scanning Monochromator with a DSM (digital subtractive method) CD module (Olis, Bogart, GA, U.S.A.) using a 121-QS 1-mm-pathlength quartz cuvette (Hellma Optik GmbH, Jena, Germany). The temperature of the cuvette during scans was maintained at 4 °C using a Peltier platform. Initial CD experiments on MpAFP Region IV in the presence of 5 mM Tris/HCl, pH 7.5, and 2 mM CaCl2 or 2 mM EDTA were carried out at a concentration of about 0.4 mg/ml. Since the exact concentration of the samples was unknown, the results are plotted solely as ellipticity versus wavelength. All subsequent CD experiments on Region IV and its mutants were carried out at a protein concentration of 1 mg/ml in buffer containing 5 mM Tris/HCl, pH 7.5, and 2 mM CaCl2. Protein concentration was determined by measuring the A280 and using the protein's molar absorption coefficient (0.666 mol·litre−1·cm−1).
Modelling and molecular dynamics
Modelling was performed using the program Modeller8v2. Input alignments were manually created as outlined in the Results section. PyMol v0.99 was used to assemble and visualize modelled segments. Molecular-dynamics simulations were performed using Gromacs v3.3. The model was solvated in a cube containing 9105 water molecules, with all sides of the cube 6.5 nm in length. Because the overall charge of the model was −10, ten water molecules were replaced by ten Na+ ions to neutralize the system. The solvated model was subjected to energy minimization by steepest descents, then to position-restrained molecular dynamics to relax the solvent, followed by full molecular dynamics. The GROMOS96 43a1 force field was chosen. Berendsen temperature coupling was applied. Simulations were performed at 277 K (4 °C) for 2 ns.
TH activity was measured using a Clifton Nanolitre Osmometer (Clifton Technical Physics, Hartford, NY, U.S.A.) as previously described. Digital images of ice crystals were recorded using a Nikon COOLPIX 4500 digital camera mounted on a Leitz Dialux 22 microscope with a Leitz Wetzlar 160/- EF 32/0.40 objective.
Ice-crystal morphology produced by Region IV was examined using a 0.08 mg/ml solution of MpAFP Region IV in 25 mM Tris/HCl (pH 7.5)/2 mM CaCl2. TH measurements of Region IV and its mutants were conducted in the same buffer at concentrations from 0.9 down to 0.025 mg/ml. The TH activity of Region IV as a function of Ca2+ concentration was determined by dialysing a 0.1 mg/ml solution of Region IV against 25 mM Tris/HCl (pH 7.5)/100 μM EDTA. Increasing amounts of Ca2+ were titrated into the solution and measurements were taken at each concentration.
Q-TOF (quadrupole time-of-flight) MS
The mass of MpAFP Region IV in EDTA and Ca2+ was determined using a Waters Q-TOF Global MS instrument. A 100 μM solution of Region IV was dialysed against 10 mM Tris/HCl, pH 7.5, and either 2 mM CaCl2 or 2 mM EDTA. Both the EDTA and CaCl2 samples were diluted to 10 pmol/μl in 10 mM ammonium formate and were subjected to Q-TOF MS in the static-spray mode to determine their respective masses. The Waters MassLynx 4.0 software package was used for data analysis and acquisition.
Cloning, expression and purification of MpAFP Region IV
Shotgun sequencing of selected clones from a genomic lambda library of M. primoryensis identified an extremely large protein that contained five distinct regions (Figure 1A). The 207-aa Region IV was tentatively identified as the antifreeze-active portion of the protein because of its perceived dependence on Ca2+ for structural integrity based on homology with the Ca2+-dependent RTX family of proteins and, in particular, the Ca2+-bound β-roll from the RTX-like AP. The sequence of Region IV is shown in Figure 1(B). Region IV was cloned as a C-terminal His6-tagged protein with an extra 74 N-terminal residues and 41 C-terminal residues to help ensure that enough sequence was present at either end of Region IV to allow for proper folding of that domain. Region IV was purified to homogeneity in three sequential chromatographic steps: Ni-NTA–agarose, Q-Sepharose and Superdex G-75. An SDS/PAGE analysis of the purification shows that the nickel column chromatography produced a very substantial enrichment of a 40 kDa band from the lysate supernatant (Figure 1C). The next two chromatographic steps removed minor impurities such that the Superdex G-75 fraction contained a single intense protein band at 40 kDa.
The TH activity of MpAFP Region IV is hyperactive and Ca2+-dependent
The purified Region IV was tested for TH activity as a function of its concentration in buffer containing 2 mM CaCl2 (Figure 2A). TH levels of just under 2 °C were obtained at a concentration of 0.1 mg/ml (2.9 μM). At this concentration (low μM levels), a typical fish AFP would only produce about 0.1 °C of TH, whereas hyperactive AFPs produce TH levels at least 10 times greater. The overall shape of the activity curve was sigmoidal. At concentrations below 0.025 mg/ml, TH activity was undetectable, but above 0.025 mg/ml, TH activity increased rapidly and approached a plateau at about 2 °C at concentrations above 0.1 mg/ml. Region IV produced hexagonally shaped ice crystals that ‘burst’ out of the sides of the hexagon at temperatures below the non-equilibrium freezing point (Figure 2B). This same phenomenon, namely of hexagonally shaped ice crystals that burst along the a-axes, was first documented with the β-helical AFP from the tortricid moth Choristoneura fumiferana (spruce budworm) and is characteristic of all hyperactive AFPs.
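The conversion between the mass and molar concentrations quoted above (0.1 mg/ml, 2.9 μM) can be reproduced as follows. This is a minimal sketch that assumes the predicted mass of the recombinant construct reported in the Q-TOF MS section (34 405 Da); the function name is hypothetical.

```python
# Predicted mass of the recombinant Region IV construct (Da), as given in the Q-TOF MS section.
REGION_IV_MASS_DA = 34405.0


def mg_per_ml_to_micromolar(conc_mg_per_ml: float, mass_da: float = REGION_IV_MASS_DA) -> float:
    """Convert a mass concentration to a molar concentration.

    mg/ml is numerically equal to g/litre; dividing by the molar mass (g/mol)
    gives mol/litre, which is then scaled to micromol/litre.
    """
    return conc_mg_per_ml / mass_da * 1e6


if __name__ == "__main__":
    for conc in (0.025, 0.1, 0.5, 0.9):
        print(f"{conc} mg/ml = {mg_per_ml_to_micromolar(conc):.1f} uM")
```

On this basis, the threshold concentration of 0.025 mg/ml lies below 1 μM.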
Antifreeze activity of MpAFP Region IV
The TH activity of Region IV was also tested as a function of Ca2+ concentration (Figure 3). All TH activity was lost in the presence of EDTA. A rapid hyperbolic increase in activity was seen as the Ca2+ concentration increased, levelling out at about 10 mM Ca2+. Even 100 μM Ca2+ produced appreciable TH activity (>1 °C).
Ca2+-dependency of MpAFP Region IV
MpAFP Region IV has Ca2+-dependent β-sheet structure
The secondary structure of Region IV in the presence of either Ca2+ or EDTA was monitored via CD (Figure 4). In the presence of Ca2+, the spectrum was similar to that observed with proteins containing primarily β-sheet, indicated by the maximum at 196 nm and the minimum at 218 nm. The spectrum in the presence of EDTA was that of a random coil, with a large minimum at 200 nm. This indicated that Region IV adopted a Ca2+-dependent predominantly β-sheet fold that could be disrupted by EDTA.
CD spectra of MpAFP Region IV in Ca2+ and EDTA
On the basis of its hyperactivity, Ca2+-dependence, hexagonal ice-crystal shape and CD spectrum, we hypothesized that Region IV of MpAFP folded as a β-helix similar to the hyperactive insect AFPs, but one that also requires Ca2+ for structural stability. The similarity of Region IV to the Ca2+-bound β-roll from AP (Figure 5A), and in particular each XGTGND repeat of Region IV to the GGXGXD repeats of AP, suggested that the β-roll from AP could be used as a valid starting template for a homology-based model of Region IV.
Ca2+-bound β-roll from AP
Description of β-roll from AP and comparison to MpAFP Region IV
Each loop in the β-roll of AP consists of two identical GGXGXDXUX nonapeptide motifs repeated in tandem (Figure 5B). This creates one loop in the structure 18 aa in length (Figure 5A). The GGXGXD sequence within each nonapeptide motif of AP forms a turn stabilized by internal Ca2+ ions sandwiched between loops of the β-roll. This Ca2+-bound turn is followed by a three-residue XUX β-strand, the X residues of which point outwards and the U residues of which point inwards, contributing to the hydrophobic core of the β-roll. Each nonapeptide motif creates one half of an 18-aa loop, with an ensuing nonapeptide motif completing the second half. This creates a β-roll with Ca2+ bound down both sides of the structure and with a narrow hydrophobic core in between, consisting of the U residues from each XUX β-strand.
Each 19-aa repeat of MpAFP Region IV consists of a decapeptide motif with a consensus sequence of XGTGNDXUXU, followed by a nonapeptide motif the consensus sequence of which is GGXUXGXUX (Figure 1B). Each one of these motifs can be viewed as one half of a loop in a β-helical structure, similar to the β-roll of AP, yet differing in three key areas (Figure 5B). The initial decapeptide motif of each 19-aa repeat of Region IV contains a Ca2+-binding XGTGND sequence, followed by a four-residue XUXU β-strand. This is very similar to the nonapeptide motif of AP (GGXGXDXUX), except for the presence of an extra hydrophobic residue at the end of the β-strand in Region IV (i.e. XUXU). The ensuing GGXUXGXUX nonapeptide motif of each 19-aa repeat differs from the nonapeptide repeat of AP by the presence of a hydrophobic U residue at position 4 of the repeat (GGXUXGXUX) and by an aspartic acid-to-glycine substitution at position 6 of the motif (GGXUXGXUX). This creates a nine-residue sequence that contains two three-residue XUX β-strands (GGXUXGXUX), as opposed to the nonapeptide motif of AP, which contains a GGXGXD Ca2+-bound turn followed by a single three-residue XUX β-strand.
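The consensus motifs compared above can be written as regular expressions, which makes the differences between the AP nonapeptide and the two halves of the Region IV repeat explicit. This is a sketch under stated assumptions: the set of residues counted as 'U' (large hydrophobic) is an illustrative choice, not one defined in the text, and the test peptides are synthetic strings built to fit the consensus, not the real MpAFP or AP sequences.

```python
import re

# ASSUMPTION: residues treated as 'U' (large hydrophobic); the text does not enumerate them.
HYDROPHOBIC = "LIVFYWM"
ANY = "[A-Z]"              # 'X' in the consensus: any residue
U = f"[{HYDROPHOBIC}]"

# Region IV 19-aa repeat = decapeptide XGTGNDXUXU + nonapeptide GGXUXGXUX.
DECAPEPTIDE = f"{ANY}GTGND{ANY}{U}{ANY}{U}"
NONAPEPTIDE = f"GG{ANY}{U}{ANY}G{ANY}{U}{ANY}"
REPEAT_19 = re.compile(DECAPEPTIDE + NONAPEPTIDE)

# AP (RTX) nonapeptide GGXGXDXUX.
AP_NONAPEPTIDE = re.compile(f"GG{ANY}G{ANY}D{ANY}{U}{ANY}")


def count_repeats(sequence: str) -> int:
    """Count non-overlapping 19-aa consensus repeats in a one-letter sequence."""
    return len(REPEAT_19.findall(sequence))


# SYNTHETIC examples that merely satisfy the consensus patterns.
SYNTHETIC_REPEAT = "AGTGNDSVTL" + "GGSVAGTLT"
SYNTHETIC_AP_MOTIF = "GGSGNDTLT"

if __name__ == "__main__":
    print(count_repeats(SYNTHETIC_REPEAT * 3))
    print(bool(AP_NONAPEPTIDE.fullmatch(SYNTHETIC_AP_MOTIF)))
    print(AP_NONAPEPTIDE.search(SYNTHETIC_REPEAT))
```

The Region IV nonapeptide fails the AP pattern at positions 4 and 6, which are the two substitutions discussed above (glycine to U, and aspartic acid to glycine).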
Initial modelling of MpAFP Region IV
The program Modeller was used to model MpAFP Region IV. The β-roll segment from AP (residues 330–378) was excised on the computer and used as the template. It contains five GGXGXD Ca2+-bound turns that create a 2½-loop β-roll (Figure 5A). This template posed two potential problems for the model, as it was much shorter than the 11-loop MpAFP Region IV and it contained Ca2+ ions bound down both sides of the structure. The aspartic acid-to-glycine substitution at position 6 of the nonapeptide motif in each 19-aa repeat (GGXUXGXUX) of Region IV as well as the presence of two XUX β-strands in that motif (GGXUXGXUX) as compared with AP (GGXGXDXUX) indicated that Ca2+ was likely not to be bound down that side of the model. However, the sequence of each 19-aa repeat of MpAFP Region IV was initially threaded on to the β-roll of AP by aligning each XGTGND sequence with a GGXGXD Ca2+-bound turn in AP (Figure 5B). We also introduced a gap in the alignment to accommodate the extra hydrophobic residue of each loop of Region IV (XGTGNDXUXU). Tandem 2½-loop segments of Region IV were modelled in an iterative fashion with Ca2+ present down both sides of the structure. Each segment was joined manually via a peptide bond using the program PyMol and, once complete, the model was energy-minimized and subjected to a 2 ns solvated molecular-dynamics simulation run at 4 °C using the program GROMACS. It became immediately apparent from the simulation that the model was unable to support a second row of Ca2+ ions, as they were ejected during the course of the simulation (results not shown). However, each XGTGND turn of the model was able properly to co-ordinate and retain a Ca2+ ion during the simulation, and the β-strands both immediately before and after the XGTGND turn (XUXXGTGNDXUXUGGXUXG) were stable, with their U residues creating a hydrophobic core for the protein. The main problem was with the first XUX β-strand of each nonapeptide motif (GGXUXGXUX). 
When aligned with AP, it was constrained as a Ca2+-bound turn instead of an XUX β-strand. This placed the side chain of each U residue in a solvent-exposed orientation, away from the hydrophobic core of the model, a situation that had a destabilizing effect on the structure. To remedy the problem, the original alignment was kept the same; however, the first XUX sequence from each nonapeptide motif of Region IV was restrained in Modeller to a β-strand conformation rather than a Ca2+-bound turn. This pointed the U residue of the XUX motif towards the interior of the structure, allowing it to interact with the other hydrophobic residues from the other β-strands.
Description of the model
The final model of MpAFP Region IV is a right-handed β-helix made up of 11 Ca2+-binding loops (Figure 6A). Each loop contains one XGTGND Ca2+-binding turn followed by three β-strands, with the U residues from each β-strand contributing to the hydrophobic core of the model (Figure 6B). This precludes a second row of bound Ca2+ ions as is seen in AP. There are three main factors that stabilize the structure:
(i) Ca2+ ions, which co-ordinate the proper folding of each XGTGND turn;
(ii) a conserved hydrophobic core located throughout the length of the model;
(iii) hydrogen bonding between backbone carbonyl oxygen atoms and the amide nitrogen atoms from the parallel β-strands of each loop.
β-Helix model of MpAFP Region IV
The architecture of the model also allows for main-chain hydrogen bonding in the turns between the β-strands of each loop, therefore increasing the stability of each turn and hence the overall structure (Figure 6B). The most striking feature of the model is the conserved row of threonine and aspartic acid/asparagine residues that align down the Ca2+-bound side of the model (Figure 6C). This immediately drew our attention, because flat conserved arrays of threonine residues have been identified as the IBFs of the β-helical insect AFPs [13,14].
A solvated 2 ns molecular-dynamics simulation of the model was performed at 4 °C to test the feasibility of the structure (Figure 6A). The overall architecture of the model was maintained throughout the course of the simulation, with Ca2+ remaining bound in each XGTGND turn and the hydrophobic U residues of each β-strand creating the hydrophobic core. The slight twist present in the model prior to the simulation was relieved during the course of the simulation. The maintenance of the fold demonstrated that the model was stable and feasible. This was encouraging, as it identified a novel ice-binding motif (the threonine and aspartic acid/asparagine residues of each XGTGND Ca2+-bound turn) as well as a novel β-helical fold (Ca2+ bound solely down one side of the model; hydrophobic core located throughout the rest). We therefore decided to test the validity of the model using site-directed mutagenesis to confirm the location of the putative IBF and to verify its overall fold.
Site-directed mutagenesis and TH values for mutants
The interaction between an AFP and ice has been described using the analogy of a receptor (AFP) binding to its ligand (ice). This binding is dependent upon the intimate surface/surface complementarity between the protein and the ice lattice. Therefore, the introduction of a large bulky residue to the relatively flat IBF of an AFP is very effective at lowering antifreeze activity by hindering the simultaneous engagement of all residues that normally contact the surface of an ice crystal [21–23]. We therefore made a series of sterically hindering tyrosine mutations to residues located both on and off the putative IBF of MpAFP Region IV to confirm the location of the IBF and to validate the structure of the theoretical model.
Four IBF mutants were created [two single IBF mutants (T81Y and N65Y) and two double IBF mutants (T81Y/N65Y and T81Y/T141Y)] (Figure 7A). Each mutant was purified in the same manner as wild-type Region IV and assayed for TH activity over a similar range of concentrations (0.9 down to 0.025 mg/ml). As seen in Figure 7(B), both single IBF mutants (T81Y and N65Y) showed TH activity to be decreased by 50% compared with that of wild-type Region IV. This indicated that both the threonine and aspartic acid/asparagine residues of each XGTGND Ca2+-bound turn contribute to the antifreeze activity of the protein. The double IBF mutants (T81Y/T141Y and T81Y/N65Y) each showed a decrease in the residual TH activity of a further 50%. Two single non-IBF mutants were also made (T105Y and T111Y). Thr105 lies just off the IBF and Thr111 is on the opposite surface (Figure 7A). Neither mutant showed any appreciable change in TH activity as compared with the wild-type, with values of about 2 °C at a concentration of 0.5 mg/ml. This was as expected, as they are located away from the XGTGND Ca2+-bound turn of the model and are therefore not part of the IBF.
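The relative activities reported above can be summarized numerically. The sketch below is only a bookkeeping aid for the quoted percentages (a 50% loss for a single IBF mutation and a further 50% loss of the residual activity for a second one). It assumes a wild-type value of about 2 °C, as quoted in the text, and it is not a mechanistic model of ice binding; the function name is hypothetical.

```python
# Approximate wild-type TH at 0.5 mg/ml, as quoted in the text (degrees C).
WILD_TYPE_TH_C = 2.0

# Fraction of activity retained per IBF tyrosine substitution, per the reported 50% decreases.
RETAINED_PER_IBF_MUTATION = 0.5


def approximate_th(n_ibf_mutations: int, wild_type_th: float = WILD_TYPE_TH_C) -> float:
    """Approximate TH implied by the reported percentages for 0, 1 or 2 IBF mutations."""
    if n_ibf_mutations not in (0, 1, 2):
        # The text reports only single and double IBF mutants; do not extrapolate further.
        raise ValueError("only 0, 1 or 2 IBF mutations are described in the text")
    return wild_type_th * RETAINED_PER_IBF_MUTATION ** n_ibf_mutations


if __name__ == "__main__":
    for label, n in (("wild-type / non-IBF mutant", 0), ("single IBF mutant", 1), ("double IBF mutant", 2)):
        print(f"{label}: ~{approximate_th(n):.1f} C")
```

On these figures, a double IBF mutant retains roughly one-quarter of the wild-type activity.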
TH activity of MpAFP Region IV mutants
A final mutant (V93R) was created in order to probe the orientation of the side chain of the U residue at position 4 of the nonapeptide motif in the 19-aa loop (GGXUXGXUX) (Figure 7A). We were originally unsure as to the orientation of the side chain of this residue, as it is normally present as glycine in AP (i.e. GGXGXDXUX), a situation that allows for the proper folding of a Ca2+-bound turn. However, if the side chain of the U residue points towards the interior of the structure as part of a β-strand instead of a Ca2+-bound turn, changing it to a long positively charged arginine residue should disrupt the hydrophobic core of the protein and affect its folding. This would diminish the intimate surface/surface complementarity required for MpAFP Region IV to bind an ice crystal and therefore decrease its TH activity. TH values of the V93R mutant were the lowest of any mutant tested, with a decrease to less than 25% of wild-type values at a concentration of 0.9 mg/ml.
To validate the mutagenesis data further, the folding of wild-type Region IV and the T105Y, T81Y/N65Y and V93R mutants was analysed via CD spectroscopy (Figure 8). Wild-type Region IV displayed a distinctive β-sheet spectrum, with a maximum molar ellipticity at 196 nm and minimum at 218 nm. Both the double IBF mutant T81Y/N65Y and the single non-IBF mutant T105Y displayed CD spectra virtually identical with that of wild-type Region IV. This indicated correct folding of the mutated constructs. The spectrum of the V93R mutant showed a dramatic change, with a large decrease in the amplitude of both the maximum and minimum molar ellipticity. This indicated a decrease in the β-sheet content of the protein and an increase in the random-coil content. These data strongly suggested that the side chain of the U residue at position 4 of the nonapeptide motif of each 19-aa loop (GGXUXGXUX) points towards the interior of the structure, therefore confirming the notion that, unlike the situation in AP, Ca2+ is not present in that side of the model. As further evidence for the destabilizing effect of the V93R mutation on the structure of Region IV, the mutant was eluted from a Superdex G-75 size-exclusion column a full 10 ml earlier (50 ml as against 60 ml) than any other mutant or the wild-type (results not shown). Clearly MpAFP Region IV is destabilized by the V93R mutation, with a resulting increase in the Stokes radius of the protein.
CD spectra of MpAFP Region IV mutants
Q-TOF MS on MpAFP Region IV
Further confirmation for the presence of only a single row of Ca2+ in the β-helix of MpAFP Region IV was obtained by Q-TOF MS (Table 1). Two distinct masses were observed for Region IV in the presence of EDTA. The first mass of 34 405.6 Da corresponds to the predicted mass of Region IV (34 405 Da). The second mass of 34 423.4 Da corresponds to the predicted mass of Region IV with a single ammonium adduct (+18 Da, due to the presence of NH4+ ions in the sample buffer). A distribution of masses was observed for Region IV in the presence of CaCl2 centring on 34 822.1 Da and increasing and decreasing by 40 Da, the mass of a single Ca2+ ion. The theoretical average mass of Region IV with ten Ca2+ ions bound (the number of Ca2+ ions predicted by the model) is 34 805 Da (an additional 10×40 Da). Each peak observed with the CaCl2 sample had a single NH4+ ion bound, increasing the mass of each species by 18 Da. Therefore the difference between the mass of the predominant species in the CaCl2 sample and the mass of the EDTA sample with an ammonium adduct is 399 Da. This mass divided by the mass of a single Ca2+ ion is 9.9, which corresponds to 10 Ca2+ ions bound, exactly the number predicted by the model for Region IV. The distribution of peaks observed with the CaCl2 sample is most likely the result of dynamic exchange of Ca2+ ions at the ends of the β-helix. (Note: there would be half Ca2+-binding sites at each end of the β-helix.) Also, the mass value corresponding to 11 Ca2+ ions bound would indicate that the sequence before or after Region IV is able to bind Ca2+ as well. However, the manner in which it does so cannot be predicted by homology-based modelling. If indeed MpAFP Region IV did bind two Ca2+ ions per loop of the structure, a distribution of masses centring on 35 205 Da would be expected (additional 20×40=800 Da). This is clearly not the case.
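The mass arithmetic in this paragraph can be reproduced in a few lines. This is a minimal sketch using the values quoted in the text, including the 40 Da mass per Ca2+ ion that the text itself uses; the function names are hypothetical.

```python
# Values quoted in the text (Da).
APO_PREDICTED = 34405.0        # predicted mass of Region IV without Ca2+
EDTA_WITH_ADDUCT = 34423.4     # observed EDTA-sample peak carrying one NH4+ adduct
CA_PREDOMINANT = 34822.1       # predominant CaCl2-sample peak (also carries one NH4+ adduct)
CA_MASS = 40.0                 # mass of one Ca2+ ion, as used in the text


def calcium_equivalents(mass_with_ca: float, mass_without_ca: float, ca_mass: float = CA_MASS) -> float:
    """Mass difference expressed as a (fractional) number of Ca2+ ions.

    Both masses must carry the same adducts so that the adduct mass cancels.
    """
    return (mass_with_ca - mass_without_ca) / ca_mass


def predicted_mass(n_ca: int, apo_mass: float = APO_PREDICTED, ca_mass: float = CA_MASS) -> float:
    """Predicted mass of Region IV with n_ca bound Ca2+ ions (no adduct)."""
    return apo_mass + n_ca * ca_mass


if __name__ == "__main__":
    n = calcium_equivalents(CA_PREDOMINANT, EDTA_WITH_ADDUCT)
    print(f"Ca2+ equivalents: {n:.2f} (nearest integer {round(n)})")
    print("one Ca2+ per loop :", predicted_mass(10))
    print("two Ca2+ per loop :", predicted_mass(20))
```

The observed difference rounds to ten ions, and the predicted value for two rows of Ca2+ (35 205 Da) is 400 Da above the one-row prediction, so the two cases are easily distinguished.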
|Sample||Peak no.||Mass observed (±1 Da)||NH4+ adduct (+18 Da)||Predicted mass (Da)||Difference from observed mass of Region IV in EDTA + 1 NH4+ adduct (Da)||Difference divided by mass of single Ca2+ ion||Number of Ca2+ ions bound|
|2 mM EDTA||1||34405.6||No||34405||N/A||N/A||0|
|2 mM CaCl2||1||34783.9||Yes||34783||360.5||9.0||9|
General discussion of the model and comparison with hyperactive insect AFPs
Our model of MpAFP Region IV, based partly on the Ca2+-bound β-roll from AP, is able to explain both the Ca2+-dependency and hyperactivity of this AFP. Ca2+ is necessary to co-ordinate the proper folding of each XGTGND turn in the model. This, in combination with the repetitive nature of Region IV, creates a long flat IBF on the protein, consisting of the outward-projecting threonine and aspartic acid/asparagine residues from each XGTGND Ca2+-bound turn on that side of the β-helix. The threonine and aspartic acid/asparagine residues create a novel ice-binding motif that is similar to the TXT ice-binding motif of the hyperactive β-helical insect AFPs from both C. fumiferana and the yellow mealworm beetle, Tenebrio molitor. The hyperactivity of the insect AFPs is attributed to their ability to bind the basal planes of ice in addition to prism planes [13,14,24], a property not attributed to the moderately active fish AFPs. The insect AFPs contain a repetitive array of TXT β-strands that align in a flat parallel β-sheet on one side of the protein [13,14,24]. The distance between side-chain oxygen atoms of threonine in the same β-strand is about 7.35 Å (1 Å=0.1 nm), whereas the distance between side-chain oxygen atoms of threonine from adjacent β-strands is about 4.5 Å. These distances match the spacing of oxygen atoms in an ice lattice on both the primary prism and basal planes. Oxygen atoms repeat at 4.52 Å along the a-axis in both the primary prism and basal planes. Oxygen atoms also repeat at 7.35 Å along the c-axis of the primary prism plane and at 7.8 Å along the a-axis of the basal plane. In the model of MpAFP Region IV, the spacing between threonine and asparagine side-chain oxygen atoms in the same Ca2+-bound turn is about 7.35 Å, whereas the spacing between the side-chain oxygen atoms of both threonine and asparagine from adjacent Ca2+-bound turns is about 4.5 Å. This spacing, which could only occur in the Ca2+-bound form of the protein, would allow Region IV to bind both the primary prism and basal planes of an ice crystal, therefore explaining its Ca2+-dependent hyperactive antifreeze activity.
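The quality of the lattice match can be expressed as a percentage mismatch between each model spacing and the ice repeat it is compared with. The sketch below uses only the distances quoted above; the pairing of spacings with ice repeats and the names used are illustrative assumptions, and no acceptance threshold is implied, since none is given in the text.

```python
# Distances in Å, as quoted in the text. Names and pairings are illustrative.
MODEL_SAME_TURN = 7.35       # Thr/Asn side-chain oxygens within one Ca2+-bound turn
MODEL_ADJACENT_TURNS = 4.5   # equivalent oxygens on adjacent Ca2+-bound turns

ICE_A_AXIS = 4.52            # a-axis repeat, primary prism and basal planes
ICE_C_AXIS_PRISM = 7.35      # c-axis repeat, primary prism plane
ICE_A_AXIS_BASAL = 7.8       # second repeat quoted for the basal plane


def mismatch_percent(protein_spacing, ice_spacing):
    """Absolute difference between two spacings as a percentage of the ice repeat."""
    return abs(protein_spacing - ice_spacing) / ice_spacing * 100.0


comparisons = {
    "adjacent turns vs a-axis (4.52 Å)": mismatch_percent(MODEL_ADJACENT_TURNS, ICE_A_AXIS),
    "same turn vs prism c-axis (7.35 Å)": mismatch_percent(MODEL_SAME_TURN, ICE_C_AXIS_PRISM),
    "same turn vs basal repeat (7.8 Å)": mismatch_percent(MODEL_SAME_TURN, ICE_A_AXIS_BASAL),
}

for label, value in comparisons.items():
    print(f"{label}: {value:.1f}% mismatch")
```

On these figures the fit to the primary prism plane is essentially exact in both directions, whereas the fit to the basal plane is close along one direction and looser along the other.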
We have chosen this uniform orientation of the threonine residues that places the methyl groups on the exterior of the IBF, because the spacing between oxygen atoms matches that seen with the two insect AFPs. However, although this orientation is consistent with the molecular-dynamics simulations performed, there is no clear indication that it is preferred over one where the methyl groups are on the inside in place of the hydroxy groups. Another reason to choose the first orientation is that the linear array of threonine hydroxy groups is then flanked, and perhaps insulated, by two hydrophobic zones, one formed by the outer rank of threonine methyl groups and the other by the rank of backbone methylene groups from the Cα atoms of the glycine residues located between the threonine and aspartic acid/asparagine residues in the Ca2+-bound turns of the β-helix. Here the methylene groups may serve a similar function to that of the methyl groups of the second threonine array in the insect AFPs [13,14]. Thus, although at first sight the IBF of MpAFP appears to be dominated by hydrogen-bonding groups, the threonine hydroxy groups are flanked by hydrophobic zones, as are the threonine hydroxy groups of the longest threonine array in the insect AFPs. The relative contribution of hydrogen-bonding and hydrophobic interactions to ice binding continues to be a contentious issue and is an active area of investigation in the antifreeze field [25–28].
The difference in location and aa composition of the IBF of MpAFP Region IV compared with that in the hyperactive β-helical insect AFPs is noteworthy. MpAFP Region IV is the first putative β-helical AFP to locate its IBF in a Ca2+-bound turn as opposed to a flat parallel β-sheet, as is seen with the insect AFPs. Region IV is also the first AFP to use a TGN ice-binding motif. This is in contrast with the TXT sequences that define the ice-binding motifs of the insect AFPs. The X residue of each insect TXT motif points towards the interior of the structure, helping to stabilize the β-helix and to avoid steric clashes on the IBF of the protein. Glycine is necessary in the TGN ice-binding motif of Region IV to allow proper folding of each XGTGND Ca2+-bound turn and to maintain a flat ice-binding surface. The fact that such different ice-binding motifs in different locations of a β-helix are able to lower the freezing point of a solution in a hyperactive manner is a testament to the plasticity of the β-helix as an AFP structural fold.
The aa composition of MpAFP Region IV is rich in acidic residues (pI 3.67, calculated from the aa composition). Aspartic acid accounts for 13.5% of the residues, whereas lysine and arginine combined account for only 2.9% of the total residues. Glycine is the most prevalent aa, at 18.4%. The region is also high in threonine, serine, (iso)leucine and valine, and it does not contain a single proline or cysteine residue. Proteins high in these specific residues have a propensity to fold as β-helices. The high proportion of glycine is necessary to allow proper turns in the helix, whereas valine and (iso)leucine have high β-sheet propensities and their side chains create the hydrophobic core for the protein. Proline residues are under-represented, owing to their destabilizing effects on β-sheets. The actual pI of Ca2+-bound Region IV is probably higher than 3.67, because 11 of its 28 aspartic acid residues point inwards to co-ordinate Ca2+ ions in the XGTGND turns. Their charge is offset by the Ca2+ ions and they are not exposed to the solvent.
We were originally unsure of the contribution that aspartic acid/asparagine residues of the XGTGND Ca2+-bound turns made towards ice binding. However, the mutagenesis data clearly show that the threonine and aspartic acid/asparagine residues contribute roughly equally to the ice-binding ability of this protein. Single mutations of residues located in the middle of the flat IBF, namely T81Y or N65Y, both lowered TH levels by 50%. The double threonine or threonine/asparagine mutants, namely T81Y/N65Y and T81Y/T141Y, both lowered TH levels to 25%. This clearly implicates the aspartic acid/asparagine residues of each XGTGND Ca2+-bound turn in ice-binding, therefore making this the first report of a repetitive array of both threonine and aspartic acid/asparagine constituting the IBF of an AFP.
Structural basis for the novel β-helical fold
The presence of a single row of bound Ca2+ ions in Region IV, as opposed to two, is understandable when its aa sequence is compared with that of AP. Each 19-aa loop of Region IV contains only one XGTGND Ca2+-binding motif. This is in contrast with each 18-aa loop of AP, which contains two GGXGXD Ca2+-binding motifs. Region IV requires Ca2+ only down the side of the IBF, because it is stabilized by a more extensive hydrophobic core down the opposite side of the structure. Every 18-aa loop of AP contains two inward-pointing hydrophobic U residues, one in each XUX β-strand. The 19-aa loops of Region IV contain four hydrophobic U residues that point inwards. The two extra hydrophobic residues are located at the end of each decapeptide motif (that is, XGTGNDXUXU) and in the first β-strand of each nonapeptide motif (that is, GGXUXGXUX). The aspartic acid-to-glycine mutation at position 6 of each nonapeptide motif of Region IV (that is, GGXUXGXUX) was perhaps the strongest indicator that Ca2+ cannot be bound down the side opposite the IBF, since aspartic acid residues are crucial in the co-ordination of Ca2+ in a GGXGXD turn. Molecular-dynamics simulations suggested that, in the absence of the aspartic acid residues, the remaining interactions with the carbonyl oxygen atoms of the glycine and ‘X’ residues are insufficient to keep the Ca2+ in place (results not shown). Instead, the conserved glycine residue at that position facilitates the turn between XUX β-strands in each GGXUXGXUX nonapeptide motif of Region IV.
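The comparison of the two loop architectures can be made concrete by counting features in the consensus motifs themselves. The following sketch operates on the consensus strings given above (X, any residue; U, hydrophobic residue), not on the actual protein sequences. The regular expression, which treats any six-residue window with glycine at positions 2 and 4 and aspartic acid at position 6 as a potential Ca2+-binding turn, is a simplifying assumption made here so that both GGXGXD and XGTGND are recognized.

```python
import re

# Consensus loops assembled from the motifs quoted in the text.
AP_LOOP = "GGXGXDXUX" * 2                   # two RTX nonapeptides per 18-aa loop
REGION_IV_LOOP = "XGTGNDXUXU" + "GGXUXGXUX"  # decapeptide + nonapeptide, 19 aa

# Simplified Ca2+-binding turn: G at positions 2 and 4, D at position 6.
# The lookahead allows overlapping windows to be tested (assumption, see lead-in).
CA_TURN = re.compile(r"(?=.G.G.D)")


def summarize(loop):
    """Count loop length, candidate Ca2+-binding turns and hydrophobic U positions."""
    return {
        "length": len(loop),
        "ca_turns": len(CA_TURN.findall(loop)),
        "hydrophobic_u": loop.count("U"),
    }


ap_summary = summarize(AP_LOOP)
region_iv_summary = summarize(REGION_IV_LOOP)

print("AP loop:       ", ap_summary)
print("Region IV loop:", region_iv_summary)
```

The counts reproduce the argument in the text: the AP consensus loop has two candidate Ca2+-binding turns and two U positions, whereas the Region IV consensus loop has one turn and four U positions, because the aspartic acid at position 6 of the second nonapeptide is replaced by glycine.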
Differences between Region IV and other β-helical proteins
MpAFP is the first reported instance of a β-helix that contains a single Ca2+-binding turn per loop. There are over 100 structures in the PDB (Protein Data Bank) today that contain some variation of a β-helix fold. The only β-helical proteins identified to date that bind Ca2+ and utilize it in a structural role are AP and its homologues, as well as C5-epimerase from Azotobacter vinelandii. They all contain the archetypal RTX nonapeptide motif of a GGXGXD Ca2+-binding turn followed by a three-residue XUX β-strand. The 19-aa repeats of MpAFP Region IV therefore represent a novel β-helical fold that contains a single RTX Ca2+-binding motif per loop of the structure. These novel β-helical 19-aa repeats are much less prevalent in other proteins in the database than the classic tandemly repeated RTX nonapeptide motifs. The top three BLASTp hits to MpAFP Region IV are RTX-like proteins produced by alphaproteobacteria from the Magnetospirillum genus. All three contain 12 copies of the 19-aa β-helix repeat that are nearly identical with those of MpAFP Region IV. However, they do not contain the conserved threonine and asparagine residues at the XGTGND position of the Ca2+-bound turn that are seen in Region IV. We have cloned and recombinantly expressed the top BLASTp hit to Region IV (gi=83311100) in E. coli as a C-terminal His6-tagged protein (results not shown). The protein did not display any TH activity, further demonstrating the necessity of the conserved threonine and aspartic acid/asparagine residues for ice binding.
Comparison of MpAFP with models of plant AFPs
Asparagine residues have been implicated in ice binding in a carrot (Daucus carota) AFP. A theoretical model exists for the protein, which has been predicted to fold as a β-helical leucine-rich repeat with loops of 23–25 aa in length. A well-conserved leucine/isoleucine hydrophobic core stabilizes the structure, and the putative IBF of the model consists of an asparagine residue in each loop flanked by hydrophilic residues on either side. The TH activity of the wild-type protein is very low, with TH levels of about 0.4 °C at a concentration of 2.5 mg/ml, although the protein does have appreciable ice-recrystallization-inhibition activity. The authors of that study support the contention that the low TH activity of the protein is the result of the plant having evolved a freeze-tolerance strategy as opposed to a freeze-avoidance strategy, where high levels of TH are actually detrimental to the organism.
A theoretical β-roll model also exists for the AFP from Lolium perenne (perennial ryegrass). It contains a conserved hydrophobic core of valine residues, with an asparagine ‘ladder’ at each end of the roll that helps to stabilize the structure. These asparagine residues are internal and make hydrogen-bond contacts with the β-roll backbone. The model contains two potential ice-binding surfaces on either side of the structure, which consist of two rows of solvent-accessible residues similar to those of the insect AFPs, but less well conserved. As was the case for the carrot AFP, this AFP has very low TH activity, but is very effective at inhibiting ice recrystallization, a common trend among plant AFPs.
Comparison of MpAFP with other bacterial AFPs
Our biophysical characterization, modelling and site-directed mutagenesis of MpAFP Region IV have generated the first plausible structure of a bacterial AFP. Prior to the present study, only two other bacterial AFPs had been characterized at the sequence level. The arctic plant-growth-promoting rhizobacterium Pseudomonas putida GR12-2 secretes a 164 kDa lipoglycoprotein (afpA) that contains five RTX-like XGXGXD repeats within its sequence. AfpA shows affinity for ice by being able to shape ice crystals, but its antifreeze activity is questionable, since no TH value has been reported. It is unclear whether afpA's XGXGXD repeats play an active role in its ability to shape ice. TH assays of the recombinantly expressed protein were performed in 1 mM EDTA, which, in the case of MpAFP Region IV, would have destroyed its ice-shaping ability. However, afpA was still able to shape ice crystals. It is therefore possible that afpA's XGXGXD repeats perform some function involved in the secretion and subsequent extracellular targeting of the protein and not in its antifreeze activity. This idea is bolstered by the fact that these repeats do not contain the requisite threonine and aspartic acid/asparagine residues at the corresponding XGXGXD positions of the motif that have been shown to be vital for TH activity in MpAFP Region IV. Also, if afpA's XGXGXD repeats do indeed fold as a Ca2+-stabilized β-roll, the size of the roll, and hence of the putative IBF, would probably be too small to inhibit the growth of ice effectively.
More recently, an Antarctic strain of the Gram-negative bacterium Colwellia sp. SLW05 was found to produce a 26 kDa IBP (ice-binding protein) that it secretes into the medium. The protein is able to shape ice crystals, but it displays extremely low TH values (<0.1 °C). The aa sequence of the protein is similar to those of IBPs of other sea-ice bacteria (Polaribacter irgensii and Psychromonas ingrahamii), a sea-ice diatom (Navicula glaciei) and a snow-mould fungus (Typhula ishikariensis). There is, however, no similarity between it and any region of MpAFP. Like the plant AFPs from D. carota and L. perenne, it is hypothesized to function as an inhibitor of ice recrystallization, limiting the growth of large ice crystals at the expense of smaller ones.
In conclusion, we have identified a novel β-helical ice-binding fold found in the region of MpAFP responsible for its antifreeze activity. On the basis, partly, of the β-roll from AP, the model for Region IV identified a repetitive array of threonine and aspartic acid/asparagine found in each XGTGND Ca2+-bound turn of the β-helix that constitutes the IBF of the protein. A single row of Ca2+ ions is required in the structure to co-ordinate properly the folding of each XGTGND turn. This turn is followed by three β-strands in each loop, the hydrophobic residues of which point inwards, creating a structurally stabilizing hydrophobic core for the protein. The Ca2+-dependent hyperactivity of the protein is explained by the ability of the side-chain oxygen atoms of the threonine and aspartic acid/asparagine residues located in the Ca2+-bound turns of the β-helix to mimic the distance of oxygen atoms in an ice crystal in multiple planes of the ice lattice, a mechanism similar to that employed by the hyperactive β-helical insect AFPs.
Note added in proof (Received 7 February 2008)
Since this paper was proofed out, the crystal structure of an extracellular lipase from Serratia marcescens was solved. The structure reveals a β-roll with Ca2+ ions bound internally down one side of the roll, lending further support to the plausibility of the model put forward in the present paper.
We thank Mr Kim Munro and Dr David Hyndman from the Protein Function Discovery facility at Queen's University for help with acquiring and interpreting CD and MS data respectively. We thank Mrs Sherry Gauthier of this Department for her expert technical assistance with the mutagenesis, and Dr Laurie A. Graham, also of this Department, for her thoughtful critique of this manuscript before its submission. This research was funded by a grant to P. L. D. from the Canadian Institutes for Health Research. P. L. D. holds a Canada Research Chair in Protein Engineering. C. P. G. was supported by a Natural Sciences and Engineering Research Council of Canada 3-Year Postgraduate Doctoral Scholarship (NSERC-PGSD3).
Present address: Plymouth Marine Laboratory, Prospect Place, The Hoe, Plymouth, PL1 3DH, U.K.
Present address: Private Bag 51, University of Tasmania, Hobart, Tasmania 701, Australia.