A decade ago, motif at N-terminus with eight-cysteines (MANEC) was defined as a new protein domain family. This domain is found exclusively at the N-terminus of >400 multi-domain type-1 transmembrane proteins from animals. Despite the large number of MANEC-containing proteins, only one has been characterized at the protein level: hepatocyte growth factor activator inhibitor-1 (HAI-1). HAI-1 is an essential protein, as knockout mice die in utero due to placental defects. HAI-1 is an inhibitor of matriptase, hepsin and hepatocyte growth factor (HGF) activator, all serine proteases with important roles in epithelial development, cell growth and homoeostasis. Dysregulation of these proteases has been causatively implicated in pathological conditions such as skin diseases and cancer. Detailed functional understanding of HAI-1 and other MANEC-containing proteins is hampered by the lack of structural information on MANEC. Although many MANEC sequences exist, sequence-based database searches fail to predict structural homology. In the present paper, we present the NMR solution structure of the MANEC domain from HAI-1, the first three-dimensional (3D) structure from the MANEC domain family. Unexpectedly, MANEC is a new subclass of the PAN/apple domain family, with its own unifying features, such as two additional disulfide bonds, two extended loop regions and additional α-helical elements. As shown for other PAN/apple domain-containing proteins, we propose a similar active role of the MANEC domain in intramolecular and intermolecular interactions. The structure provides a tool for the further elucidation of HAI-1 function as well as a reference for the study of other MANEC-containing proteins.

INTRODUCTION

Motif at N-terminus with eight-cysteines (MANEC) was discovered in 2004 as a new class of protein domains with unknown function and structure. Initially, the domain was believed to contain seven cysteines, hence the early name MANSC [1]; however, subsequent sequence analysis added the eighth cysteine to the conserved pattern (BLAST and SMART, http://smart.embl-heidelberg.de/) [2]. The domain encompasses 80–100 amino acid residues and is found in more than 420 proteins exclusively from animals. Apart from the uncharacterized protein UPI0002B46A4F from Hydra vulgaris, which is predicted to contain 11 MANEC domains, MANEC domains occur as a single domain motif in the N-terminal part of type-I transmembrane multi-domain proteins.

Only one MANEC-containing protein, hepatocyte growth factor activator inhibitor-1 (HAI-1), has been functionally characterized. The functions of two other MANEC-containing proteins, LRP-11 and KIAA0319, were predicted from sequence homology (see Supplementary Figure S1). The three proteins are all among the five MANEC-containing proteins in humans (Supplementary Figure S2A). HAI-1 was originally identified as an inhibitor of hepatocyte growth factor activator (HGFA) [3,4]. The inhibitory repertoire of HAI-1 has subsequently expanded to include the type-II transmembrane serine proteases matriptase [4,5] and hepsin [6,7] as well as the membrane-associated serine protease prostasin [6]. The observation that HAI-1-deficient mice die in utero due to placental defects further supported the importance of its protease inhibitory activity [8]. A mild overexpression of matriptase relative to HAI-1 results in ∼70% of the mice developing spontaneous squamous cell carcinoma. The oncogenic effects of matriptase were abolished by additional expression of HAI-1 [9]. Thus, HAI-1 appears to be an important regulator of the delicate balance between homoeostasis and the development of cancer. Accordingly, low HAI-1 levels have been proposed as a marker of poor prognosis in several cancers [10,11].

The mature HAI-1 protein has a molecular mass of ∼53 kDa and encompasses a large uncharacterized N-terminal region containing the MANEC domain followed by a predicted ‘internal’ domain, a Kunitz-type inhibitor domain, an LDLR (low-density lipoprotein receptor) class A domain, a second Kunitz-type inhibitor domain, a transmembrane region, and a short C-terminal cytoplasmic tail (Figure 1 and Supplementary Figure S1). HAI-1 exists in two splice variants, the full-length isoform 1 [3] and isoform 2 [12], which has a 16 amino acid deletion (306–321) between the first Kunitz-type inhibitor domain and the LDLR class A domain. Three potential N-glycosylation sites are found at positions 66, 235 and 523 (507 in isoform 2).

Figure 1
HAI-1 full length protein sequence with domain locations

Signal sequence, domain architecture, and transmembrane region are shown as labelled boxes below the corresponding amino acid residues (isoform 1, Uniprot #O43278-1). The locations of the signal sequence, the Kunitz-type inhibitor and LDLR class A domains and the transmembrane region are as previously reported in Shimomura et al. (1997) [3]. The location of the MANEC sequence corresponds to what was determined in the present study. The location of the internal domain is our prediction based on Kojima et al. (2008) [14]. The positions of the three potential N-glycosylation sites are indicated with an asterisk (*).

The protease inhibitory activity resides strictly in the first Kunitz-type inhibitor domain. The N-terminal region encompassing the MANEC domain and the internal domain has been proposed to play a role in the regulation of the inhibitory activity of HAI-1. Moreover, the MANEC domain has been suggested to participate in both intramolecular interactions with other domains of HAI-1 and intermolecular interactions with the target protease [13,14].

Characterization of the MANEC domain remains a challenge for a comprehensive biochemical understanding of MANEC-containing proteins. A keystone for elucidating protein function is the availability of structural information. Unfortunately, sequence-based homology searches failed to predict the MANEC fold. We have generated the MANEC domain from human HAI-1 and show that the expressed protein domain is soluble and stable. Using state-of-the-art four-dimensional NMR technology we solved the three-dimensional solution structure. The structure revealed a well-defined fold with a four disulfide bond pattern. Unexpectedly, structure-based homology searches revealed a close homology to the PAN/apple domain family but also distinct differences from proteins of this family. Based on our data, we identify MANEC domains as a new subclass of the PAN/apple domain family. The homology to the PAN/apple domains suggests a similar role of the MANEC domain as a mediator of molecular interactions with potential regulatory properties for the activity of the parent protein.

EXPERIMENTAL

Protein production

The DNA sequence encoding the MANEC domain (G47-L152) was amplified from full-length human HAI-1 cDNA (isoform 1, Uniprot #O43278-1). The N66Q variant was generated by site-directed mutagenesis. The PCR product was subcloned into the XhoI–SalI sites of the pPICZαA expression vector (Invitrogen) and linearized by digestion with the restriction enzyme SacI prior to transformation into Pichia pastoris X-33 strain (Invitrogen). Protein-producing clones were stored at −80°C in 15% glycerol. Protein production followed the manufacturer's recommendations (Invitrogen).

Protein purification

The medium was cleared of yeast cells by centrifugation and concentrated >20-fold using a stirring cell concentrator (3.5 kDa cut-off). The concentrate was loaded on to a Ni-NTA (nitrilotriacetate) column equilibrated with 20 mM NaH2PO4 pH 6.5, 100 mM NaCl, and the column was washed with buffer supplemented with 20 mM imidazole. Bound protein was then eluted by increasing the imidazole concentration to 300 mM. The sample was dialysed against 20 mM NaH2PO4 pH 6.5, 100 mM NaCl. The identity and integrity of the purified protein were verified by Western blotting and mass spectrometry (MS). Protein concentration was measured by absorbance or attenuance at 280 nm using a calculated molar absorption coefficient of 11770 M−1·cm−1 from the Protparam service (http://web.expasy.org/protparam/).
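The concentration determination above follows the Beer–Lambert law, A = ε·c·l. The short Python sketch below illustrates the arithmetic only; the absorbance reading and the 1 cm path length are hypothetical, and using the 12815 Da intact mass reported in the Results to convert to mg/ml is an assumption made for illustration.

```python
# Beer-Lambert law: A = epsilon * c * l
EPSILON_280 = 11770.0  # M^-1 cm^-1, calculated coefficient quoted in the text
MOLAR_MASS = 12815.0   # g/mol; intact mass from the Results (assumed applicable here)


def molar_concentration(a280, path_cm=1.0, epsilon=EPSILON_280):
    """Return the molar concentration (mol/l) for an absorbance at 280 nm."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and epsilon must be positive")
    return a280 / (epsilon * path_cm)


def mass_concentration(a280, path_cm=1.0, epsilon=EPSILON_280,
                       molar_mass=MOLAR_MASS):
    """Return the concentration in mg/ml (numerically equal to g/l)."""
    return molar_concentration(a280, path_cm, epsilon) * molar_mass


example_a280 = 0.50  # hypothetical reading in a 1 cm cuvette
c_molar = molar_concentration(example_a280)
c_mg_ml = mass_concentration(example_a280)
print(f"{c_molar * 1e6:.1f} uM, {c_mg_ml:.3f} mg/ml")
```

A sample that is too concentrated for a reliable reading would be diluted first and the result multiplied by the dilution factor.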

Uniform 13C and 15N protein labelling

MANEC-producing yeast was inoculated in 5 ml of YPD (yeast extract-peptone-dextrose) medium and grown for 1 day. The yeast cells (250 μl) were pelleted by centrifugation at 5000 g, resuspended in 5 ml of buffered growth medium containing 100 mM KH2PO4 pH 6.0, 0.34% yeast nitrogen base (YNB) [lacking (NH4)2SO4], 0.5% (w/v) (15NH4)2SO4 and 2% (w/v) 13C-glucose (99%, Cambridge Isotope Laboratories), and grown for an additional day. The following day, an additional 100 ml of labelled growth medium was added and the cells were allowed to grow for 1 more day. The yeast cells were harvested by centrifugation at 5000 g, resuspended in 500 ml of induction medium with a composition similar to the growth medium but with methanol instead of glucose as the carbon source, and supplemented daily with 1% (v/v) 13C-methanol for 4 days. Suspension cultures were grown at 28°C with orbital shaking at 300 rev./min in baffled conical flasks. Labelling efficiency was determined to be >99% by MS (Supplementary Figure S3A). Protein labelled uniformly with 15N only was obtained by a similar procedure (mass spectrum not shown).

Mass spectrometry

Approximately 2 μg of purified intact MANEC protein was acidified and desalted using Poros50 R1 micro columns essentially as described by Gobom et al. [15]. Bound protein was eluted using 90% acetonitrile, 0.1% TFA (trifluoroacetic acid), lyophilized, resuspended in 2 μl of 1% TFA and mixed with an equal volume of DHAP (2,5-dihydroxyacetophenone) matrix solution prepared in 20 mM diammonium hydrogen citrate, 75% (v/v) ethanol. After thorough mixing, samples were spotted on to a matrix-assisted laser desorption-ionization (MALDI) target and mass spectra were acquired using a Bruker Autoflex III instrument operated in linear mode and calibrated in the mass range of 5000–17500 Da using Protein calibration standard I (Bruker Daltonics). The degree of 13C and/or 15N labelling was estimated on the basis of the mass increase relative to the unlabelled material. For fragmentation studies, purified MANEC (3 μg) was digested for 1 h at 37°C using porcine trypsin (Promega) in 50 mM ammonium bicarbonate containing 5 mM iodoacetamide to block any free cysteines. Approximately 20 pmol of protein was removed and desalted using a StageTip (C8, Thermo Scientific). Bound peptides were eluted directly on to the MALDI target plate using α-cyano-4-hydroxy-cinnamic acid in 70% acetonitrile, 0.1% TFA. Peptides were subsequently analysed using an Autoflex Smartbeam III instrument (Bruker) operated in positive and linear mode. Prior to analysis, the instrument was calibrated by external calibration using a peptide mixture containing seven calibrants (Bruker). The obtained data were evaluated using the GPMAW software (gpmaw.com).

CD spectroscopy

CD spectroscopy was performed on a JASCO-810 circular dichroism (CD) system with Peltier temperature control. Wavelength scans were averages of five scans between 190 nm and 350 nm collected at 25°C in a quartz cuvette with a 1-mm path length with 5 μM protein in 10 mM NaH2PO4 pH 7.4. Secondary structure analysis was performed using an online service (http://dichroweb.cryst.bbk.ac.uk/html/home.shtml). For thermal denaturation experiments, 10 μM samples were heated from 25°C to 95°C at 1°C/min with or without 10 mM DTT (dithiothreitol). All CD data were buffer subtracted. Tm values were determined by sigmoidal curve fit between the low (native) and high (unfolded) temperature plateaus, as averages of at least three experiments.
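The sigmoidal fit used to extract Tm can be sketched as follows. The exact model and fitting software are not stated in the text, so this example assumes a two-state sigmoid with flat native and unfolded baselines, and it replaces a nonlinear optimizer with a plain grid search over the midpoint and width. The data are synthetic and all function names are illustrative.

```python
import math


def two_state(t, tm, k):
    """Fraction unfolded at temperature t for midpoint tm and width k."""
    return 1.0 / (1.0 + math.exp((tm - t) / k))


def fit_tm(temps, signal):
    """Grid search over (tm, k); both plateaus solved by linear least squares."""
    n = len(temps)
    mean_y = sum(signal) / n
    best = None
    for i in range(501):          # tm candidates: 40.0 to 90.0 in 0.1 steps
        tm = 40.0 + 0.1 * i
        for j in range(31):       # k candidates: 0.5 to 8.0 in 0.25 steps
            k = 0.5 + 0.25 * j
            f = [two_state(t, tm, k) for t in temps]
            mean_f = sum(f) / n
            sff = sum((x - mean_f) ** 2 for x in f)
            if sff == 0.0:
                continue          # flat curve: baselines are not identifiable
            slope = sum((x - mean_f) * (y - mean_y)
                        for x, y in zip(f, signal)) / sff
            intercept = mean_y - slope * mean_f
            sse = sum((intercept + slope * x - y) ** 2
                      for x, y in zip(f, signal))
            if best is None or sse < best[0]:
                # intercept is the native plateau, intercept + slope the unfolded one
                best = (sse, tm, k, intercept, intercept + slope)
    return best[1:]


# Synthetic, noise-free melting curve (hypothetical values, not measured data)
temps = list(range(25, 96, 2))
signal = [-20.0 + 12.0 * two_state(t, 64.0, 2.5) for t in temps]

tm_fit, k_fit, native, unfolded = fit_tm(temps, signal)
print(f"Tm = {tm_fit:.1f} C, width = {k_fit:.2f} C")
```

With real data, sloping baselines and noise usually call for a proper nonlinear least-squares routine; the grid search is used here only to keep the sketch dependency-free.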

NMR sequence-specific resonance assignment

NMR measurements used for backbone resonance assignment were acquired at 15°C and 37°C on a Bruker 500-MHz spectrometer equipped with a triple resonance probe. The 3D experiments were processed with NMRpipe [16] and analysed with SPARKY [17]. Sequence-specific backbone resonance assignments were obtained from 3D HNCACB, HN(CO)CACB, HNCO and HN(CO)CA.

NMR side chain resonance assignment and distance restraint extraction from NOESY spectra

NMR measurements used for side chain resonance assignment and collection of distance restraints were acquired at 25°C on an Agilent DDR2 800 MHz spectrometer equipped with a cryogenic probe-head and an Agilent DDR2 600 MHz spectrometer equipped with a ‘PENTA’ (1H,13C,15N, 31P, 2H) probe-head.

Both 3D and 4D experiments were performed using sparse random sampling of indirectly detected dimensions to increase resolution (relevant parameters of all NMR spectra measured with NUS (non-uniform sampling) are given in Supplementary Table S4). Chemical shifts in 1H NMR spectra were reported with respect to external deuterated 4,4-dimethyl-4-silapentane-1-sulfonic acid. Chemical shifts of 13C and 15N signals were referenced indirectly using the 0.251449530 and 0.101329118 frequency ratios for 13C/1H and 15N/1H, respectively [18]. The 2D and conventionally-sampled 3D experiments were processed with NMRpipe [16], whereas 3D and 4D NUS experiments were processed with the SSA software package available at http://nmr.cent3.uw.edu.pl/software [19–21]. Processed spectra were analysed with SPARKY [17]. Aliphatic side chain assignment was achieved using 1H–13C HSQC, 4D HabCab(CO)NH [22,23] and 4D HCCH-TOCSY [21,24,25]. The aromatic side chain resonances were assigned from the analysis of 1H–13C HSQC tuned to aromatic carbons, 2D (HB)CB(CGCD)HD, 2D (HB)CB(CGCDCE)HE, 3D HBCB(CGCD)HD, 3D HBCB(CGCDCE)HE [26] acquired at 800 MHz, 3D 13C-edited NOESY HSQC tuned to aromatic carbons (measured at 600 MHz) [27] and 800 MHz 4D 13Cali,13Caro-edited HMQC–NOESY–HSQC [28]. Distance constraints were obtained at 800 MHz from the 15N-edited NOESY–HSQC [29], 4D 13Cali,13Caro-edited HMQC–NOESY–HSQC, 4D 13Cali,13Cali-edited HMQC–NOESY–HMQC [30,31], and 4D 15N,13C-edited HMQC–NOESY–HSQC [21].
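Indirect referencing with the frequency ratios quoted above amounts to scaling the 1H zero-ppm frequency to obtain the 13C and 15N zero-ppm frequencies. The Python sketch below shows that calculation; the 800.000000 MHz proton zero-ppm frequency is a hypothetical value chosen for illustration, and the function names are not taken from any NMR package.

```python
XI_13C = 0.251449530  # 13C/1H frequency ratio quoted in the text
XI_15N = 0.101329118  # 15N/1H frequency ratio quoted in the text


def zero_ppm_frequency(h1_zero_mhz, xi):
    """Heteronuclear 0 ppm frequency (MHz) from the 1H 0 ppm frequency."""
    return h1_zero_mhz * xi


def chemical_shift_ppm(observed_mhz, zero_mhz):
    """Chemical shift in ppm of a signal relative to the 0 ppm frequency."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


h1_zero = 800.000000  # hypothetical 1H 0 ppm frequency of the reference (MHz)
c13_zero = zero_ppm_frequency(h1_zero, XI_13C)
n15_zero = zero_ppm_frequency(h1_zero, XI_15N)
print(f"13C 0 ppm: {c13_zero:.6f} MHz, 15N 0 ppm: {n15_zero:.6f} MHz")

# A carbon signal 55 ppm downfield of the derived 13C zero frequency
carbon_signal = c13_zero * (1.0 + 55e-6)
print(f"{chemical_shift_ppm(carbon_signal, c13_zero):.2f} ppm")
```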

NMR structure calculation procedure

The NMR structures were calculated using the CYANA 3.96 automated NOE assignment and structure-calculation protocol [32]. Input for the structure calculations were peak lists from the following five NOESY spectra: 4D 13C-edited aliphatic–aliphatic NOESY (2479 peaks); 4D 13C-edited aliphatic–15N-edited NOESY (811 peaks); 4D 13C-edited aromatic–aliphatic NOESY (190 peaks); 3D 15N-edited NOESY (1662 peaks); 3D 13C-edited aromatic NOESY (59 peaks).

Tolerances for chemical shift assignments were set to 0.02 ppm for all 1H dimensions, and 0.4 ppm for 15N and 13C dimensions in all spectra. In each structure calculation cycle, 80 structures were calculated, of which 20 were selected for CYANA analysis. Ambiguous assignments were kept in the final cycle. In order to cross-validate the final structures, no dihedral-angle restraints from TALOS+ (http://spin.niddk.nih.gov/bax/nmrserver/talos/) [33] were included in the structure calculations.
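To illustrate what these tolerance windows mean in practice, the sketch below lists the assigned atoms whose chemical shift lies within the tolerance of a peak coordinate. It is not CYANA's assignment algorithm, which additionally scores candidates and iterates over structure cycles; the atom labels and shift values are hypothetical and do not come from the deposited assignment.

```python
# Matching windows from the text (ppm); heavy atoms use the wider window
TOLERANCE_PPM = {"1H": 0.02, "13C": 0.4, "15N": 0.4}

# Hypothetical assignment table: atom label -> (nucleus, shift in ppm)
ASSIGNED = {
    "res10.H": ("1H", 8.215),
    "res11.H": ("1H", 8.228),
    "res12.H": ("1H", 8.310),
    "res10.N": ("15N", 123.10),
    "res11.N": ("15N", 109.45),
}


def candidates(position_ppm, nucleus, assigned=ASSIGNED):
    """Return sorted atom labels whose shift is within tolerance of the peak."""
    tol = TOLERANCE_PPM[nucleus]
    return sorted(
        atom
        for atom, (nuc, shift) in assigned.items()
        if nuc == nucleus and abs(shift - position_ppm) <= tol
    )


proton_hits = candidates(8.22, "1H")      # two amide protons fall in the window
nitrogen_hits = candidates(123.3, "15N")  # only one nitrogen is close enough
print(proton_hits, nitrogen_hits)
```

A peak with more than one candidate in a dimension stays ambiguous until other dimensions or the evolving structure narrow the choice, which is one reason additional dimensions reduce ambiguity.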

NMR structure refinement and validation

Structure refinement was done using a simulated-annealing protocol with explicit solvent in the software YASARA Structure [34]. The protocol employed the YASARA2 force field with potentials for Van der Waals and electrostatic interactions to improve non-bonded interactions, and knowledge-based potentials to define rotamer states. For structure refinement, the 20 best (lowest energy) conformers were used, together with the upper-distance restraint limits of the last CYANA cycle. The upper-distance restraint limits were exported to XPLOR format for use in YASARA. Structures were refined until no violations of distance restraints above 0.5 Å (1 Å=0.1 nm) were found. Local and global geometry of the final ensemble of 20 refined structures were checked using the WHATIF web server (http://swift.cmbi.ru.nl/whatif/). In addition, two independent measures of validation using experimental data were applied to the structures: (1) determination of the disulfide bonding partners by MS and (2) comparison of TALOS+ chemical shift-based secondary-structure predictions with secondary-structure elements in the final refined structure ensemble.

SAXS data acquisition and analysis

The SAXS data were collected on the laboratory-based instrument at iNANO, Aarhus University, Denmark as described in [35]. Samples were placed in re-usable quartz capillaries at a controlled temperature of 25°C. The acquisition time was 3600 s for a MANEC protein sample of 1.0 mg/ml in 20 mM NaH2PO4, pH 7.4, 100 mM NaCl. Corresponding buffer data were collected. Background buffer subtraction and conversion of the data to absolute scale by use of water as a primary standard was performed using the SUPERSAXS program package (Jan S. Pedersen and Cristiano L. Pinto Oliveira, Aarhus University, unpublished). The instrumental sample-to-detector distance was set to 640 cm, giving a q range of 0.01–0.345 Å−1, where q is the length of the scattering vector, defined as q=4πsin(θ)/λ, where λ is the X-ray wavelength at 1.54 Å and 2θ is the scattering angle between the incident and scattered beam. The final intensity is I(q), in units of cm−1. Indirect Fourier transformation analysis was performed using the program WIFT ([36] and Jan S. Pedersen and Cristiano L. Pinto Oliveira, Aarhus University, unpublished) to obtain the pair distance distribution function, p(r), along with the characteristic parameters: the maximum diameter, Dmax, the radius of gyration, Rg and the forward scattering, I(q=0). The theoretical scattering profile of the NMR structure was computed and compared to the experimental data using the program CRYSOL [37]. To obtain a structural model of the scattering molecule in solution without the use of any assumptions other than Dmax, ab initio modelling was performed using the program DAMMIF [38]. Finally, our NMR structure ensemble was directly compared to the ab initio SAXS model using part of the DAMAVER program package, specifically the programs SUPCOMB [39] and DAMSEL [40].
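The definition q = 4πsin(θ)/λ can be turned into a small conversion between the scattering angle 2θ and q. The following Python sketch uses the wavelength and q range quoted above; the function names are illustrative, and 2π/q is included only as the conventional real-space length probed at a given q.

```python
import math

WAVELENGTH_A = 1.54  # X-ray wavelength in angstrom, from the text


def q_from_two_theta(two_theta_deg, wavelength=WAVELENGTH_A):
    """Scattering vector length q (1/angstrom) from the scattering angle 2theta."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


def two_theta_from_q(q, wavelength=WAVELENGTH_A):
    """Scattering angle 2theta (degrees) for a given q (1/angstrom)."""
    return 2.0 * math.degrees(math.asin(q * wavelength / (4.0 * math.pi)))


for q in (0.01, 0.345):  # limits of the q range quoted in the text
    angle = two_theta_from_q(q)
    length = 2.0 * math.pi / q
    print(f"q = {q:.3f} 1/A -> 2theta = {angle:.3f} deg, 2*pi/q = {length:.1f} A")
```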

ACCESSION NUMBERS

The atomic coordinates, resonance assignment lists and distance restraints lists (PDB ID code: 2msx) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ, U.S.A. (http://www.rcsb.org/) and to BMRB (accession code: 25139), University of Wisconsin, WI, U.S.A. (http://www.bmrb.wisc.edu).

RESULTS

The MANEC domain from HAI-1 is expressed as a soluble entity

To allow for the expression of the MANEC domain, we aligned all available HAI-1 sequences to obtain a consensus sequence of the MANEC domain. This analysis identified G47-L152 as the MANEC domain of human HAI-1 (Supplementary Figure S2B). The N-terminal region of human HAI-1 encompassing M1-A35 is predicted to represent the signal peptide, producing mature HAI-1 containing a short segment of 11 amino acid residues preceding the MANEC domain. This segment, P36-A46, was predicted to be unstructured due to its high content of Pro and Gly/Ala (I-TASSER [41]). Likewise, the segment succeeding L152 is also predicted to be unstructured (I-TASSER), and is a good candidate for an interdomain-linker region between MANEC and the predicted internal domain (Figure 1). Hence, we decided to clone and express the 106 amino acid residue segment representing G47-L152. The MANEC domain was expressed in P. pastoris to facilitate correct folding including the establishment of four predicted disulfide bonds. An N66Q substitution was introduced to remove the predicted glycosylation site, thus producing a protein migrating similarly to deglycosylated wild-type MANEC on SDS/PAGE (Figure 2A). Analysis by MS produced the expected mass of 12815 Da, showing the absence of any post-translational modifications (Supplementary Figure S3A). When subjected to size-exclusion chromatography, MANEC eluted as a single peak with an elution time corresponding well to the theoretical mass of the monomer (12.8 kDa) (Figure 2B) with no tendency for interdomain cysteine bond formation (Figure 2A, lane 5). The purified protein was soluble with no sign of precipitation at high concentrations (≤40 mg/ml). In summary, the purified HAI-1 MANEC domain appears as a well-behaved isolated protein suitable for structural studies.

Figure 2
Biochemical and biophysical characterization of the purified MANEC

(A) SDS/PAGE analysis (15%) of MANEC purified from Pichia pastoris. Lanes 1–5 contain: 3 μg wild-type MANEC (lane 1), wild-type MANEC incubated with PNGaseF to remove N-linked glycans (lane 2), N66Q MANEC (lane 3), PNGaseF (lane 4) and N66Q MANEC under nonreducing conditions (lane 5). The samples in all lanes but lane 5 were incubated with 1% 2-mercaptoethanol to disrupt disulfide bridges. Prestained protein marker (Fermentas) is shown between lanes 4 and 5 with corresponding masses shown on the right. (B) Elution profile of MANEC protein established by size-exclusion chromatography using a Superdex75 column (GE Healthcare). MANEC eluted as a single peak at 13.8 ml. The void volume and elution volumes of protein standards are indicated above the profile. (C) A representative CD wavelength scan spectrum for a 5 μM MANEC sample recorded at 25°C. The ellipticity (mdeg) is given as a function of the wavelength of the circularly polarized light (nm). (D) Thermal denaturation curve for MANEC protein by CD spectroscopy. The change in CD signal as a function of increasing temperature was monitored at 220 nm. A representative thermal scan performed in standard phosphate buffer is shown with grey circles, and a scan in buffer supplemented with 10 mM DTT (dithiothreitol) to reduce disulfide bonds is shown with black squares. The corresponding fit to a sigmoidal curve to obtain the Tm value for the DTT experiment is shown as a solid grey line.

The MANEC domain is thermodynamically stable and homogeneously folded

The secondary structure content of MANEC was investigated by CD spectroscopy. From the shape of the wavelength scan spectrum, it was predicted that the protein contains both α-helical and β-sheet structures (Figure 2C). The thermal denaturation profile of MANEC evaluated by CD spectroscopy revealed a minimal loss of secondary structure up to ∼80°C (Figure 2D). When the experiment was performed under reducing conditions, a single melting point transition was observed with a Tm of 64.0±0.1°C (Figure 2D). This is a surprisingly high thermal stability for a small protein in the absence of disulfide bonds. Finally, a two-dimensional [1H–15N] NMR HSQC spectrum was recorded on a uniformly 15N-labelled sample. In the well-dispersed spectrum, each amino acid was represented by a unique peak as expected for a homogeneous sample of a folded protein (Figure 3). In concert, the CD and NMR data show that purified MANEC domain is a stable protein with a defined fold.

Figure 3
Two-dimensional 1H–15N HSQC NMR spectrum of MANEC

The spectrum shows the relative positions of the peaks, each representing an N–H pair of the amides in the protein main chain. Assignments of individual peaks are shown as labels containing the single letter amino acid code and the corresponding residue number. The peak positions are relative to the 1H and 15N chemical shift scales of the ω1 and ω2 dimensions, respectively (ppm scales). The spectrum was recorded at 500 MHz and 298 K and displayed using SPARKY, which revealed a well-dispersed spectrum with a high signal-to-noise ratio.

NMR reveals a well-defined MANEC structure stabilized by four disulfide bonds

Uniformly 13C- and 15N-labelled MANEC was prepared (Supplementary Figure S3A) and state-of-the-art 3D and 4D liquid-state NMR experiments were used to obtain 88% complete chemical shift assignments and NOESY peak lists. The assignments and the NOESY data from 3D and the highly unambiguous and information-rich 4D experiments were used as input for automated NOE assignment and structure calculation in CYANA. An example of NOESY spectra quality and NOE assignment is given in Supplementary Figure S4. The carbon-13 chemical shift values of all assigned cysteines were in agreement with an oxidized state [42]. Initial structure calculations immediately revealed three of the four possible disulfide bridges: C87–C116, C91–C97, and C121–C129, with a consistent maximum distance of 4 Å between the β-carbon atoms within each cysteine pair. This emphasized the quality of the NMR data and was a strong indication of the validity of the calculated structure. Additional restraints for the four disulfide bonds were included in the subsequent CYANA calculations. The assignment of ∼40 distance restraints per residue (average) (Supplementary Figure S5A) resulted in a well-defined structure with low ensemble-average backbone and heavy-atom RMSD values of 0.47±0.13 Å and 0.97±0.12 Å, respectively (Supplementary Table S1). High conformational variability is only observed for the three N-terminal residues and the C-terminal hexa-His-tag (Figure 4), for which only few or no NOE restraints were observed (Supplementary Figure S5A). To further validate the convergence of the structure calculations, five additional independent CYANA calculations were performed using different random seeds. All resulting 120 structures agreed very well, with average backbone and heavy-atom RMSD values of 0.68±0.16 Å and 1.20±0.16 Å, respectively.
Overall, the secondary structure elements predicted from sequence by I-TASSER (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) [41] as well as those predicted by the TALOS+ analysis of the backbone chemical shifts (Supplementary Figure S5B) agree well with the calculated structure. However, in two cases, TALOS+ predicted α-helical (96–97) and β-strand (119–123) secondary structure for an unstructured loop region in the structure. In both cases, the presence of disulfide bridges affects the local structure and the chemical shifts in the vicinity of the cysteine residues (Cys97 and Cys121). These segments may therefore not adopt canonical secondary structure, or may no longer be identified as such on the basis of chemical shifts alone.

Figure 4
The MANEC NMR solution structure ensemble

The Figure shows an alignment of the 20 structures with the lowest energy from the final CYANA calculation (PDB ID: 2msx). The backbone traces are shown as a blue ribbon. The flexible N- and C-termini are shown to the left and right, respectively. The structure Figures and alignments were prepared using PyMOL (the PyMOL Molecular Graphics System, Version 1.5.0.4 Schrödinger, LLC).

The NMR-derived disulfide bond pattern was verified by mass spectrometry

To verify the connectivity of the disulfide bonds in the NMR structure, a MANEC sample was digested with trypsin and analysed by MALDI-MS (Supplementary Figure S3B). The ion of m/z 3194.0 corresponds to the disulfide-linked double peptide encompassing G84–R89 and G109–R130, corroborating the disulfide bridge between C87 and C116 identified by NMR analysis (Table 1). The absence of any S-carbamidomethylated cysteine residues confirmed the intrapeptide bond between C121 and C129 within the G109–R130 peptide. A second ion of m/z 3350.2 represented the same disulfide linkage in the double peptide encompassing R83–R89 and G109–R130 (Table 1). The remaining two disulfide bonds were represented by m/z 5713.1, corresponding to the disulfide-linked G47–R82 and A90–R108 peptides with no S-carbamidomethylated cysteine residues (Table 1). The presence of this ion validated the NMR data, establishing the C50–C92 and C91–C97 disulfide bridges. In addition to ions representing disulfide-linked peptides, a triplet of ions of m/z 3594.6, 3627.6 and 3660.1 was observed. This cluster represented the MALDI-induced fragmentation of the C50–C92 interpeptide disulfide bond present in the m/z 5713.1 ion, producing the G47–R82 peptide without sulfur (m/z 3594.6), with sulfur (m/z 3627.6) or with an additional sulfur atom originating from C92 (m/z 3660.1) [43]. In summary, the data obtained by MALDI-MS analysis agree with the findings by NMR analysis, establishing disulfide bonds between C50–C92, C87–C116, C91–C97 and C121–C129.

Table 1
MANEC disulfide bonds analysis by MS

The assignment of [M+H]+ ions observed in the range m/z 3000–6000 corresponding to disulfide-linked fragments following tryptic digest. The observed and theoretically calculated molecular masses of the indicated peptide sequences are shown in Da.

Observed mass (Da)a Calculated mass (Da)b Disulfide-linked peptides 
3193.0 3193.4  
 733.3 Gly84–Arg89 (Cys87) 
 2460.1 Gly109–Arg130 (Cys116, Cys121, Cys129) 
3349.2 3349.5  
 889.4 Arg83–Arg89 (Cys87) 
 2460.1 Gly109–Arg130 (Cys116, Cys121, Cys129) 
3593.6  Gly47–Arg82 (Cys50) 
3626.6  Ions generated by in source disruption of the disulfide-linkage of 5712.1 Da 
3659.1   
5712.1 5714.7  
 3626.8 Gly47–Arg82 (Cys50) 
 2087.9 Ala90–Arg108 (Cys91, Cys92, Cys97) 
Observed mass (Da)a Calculated mass (Da)b Disulfide-linked peptides 
3193.0 3193.4  
 733.3 Gly84–Arg89 (Cys87) 
 2460.1 Gly109–Arg130 (Cys116, Cys121, Cys129) 
3349.2 3349.5  
 889.4 Arg83–Arg89 (Cys87) 
 2460.1 Gly109–Arg130 (Cys116, Cys121, Cys129) 
3593.6  Gly47–Arg82 (Cys50) 
3626.6  Ions generated by in source disruption of the disulfide-linkage of 5712.1 Da 
3659.1   
5712.1 5714.7  
 3626.8 Gly47–Arg82 (Cys50) 
 2087.9 Ala90–Arg108 (Cys91, Cys92, Cys97) 
a

The given mass is calculated on the basis of the observed m/z value ([M+H]+).

b

The calculated mass includes all cysteines in oxidized state. The mass of the individual peptides are given in italics.
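
The arithmetic behind Table 1 can be retraced with a few lines of Python. The sketch below is only an illustrative check, not the software used for the analysis: it removes one proton mass from each [M+H]+ value quoted in the text, sums the tabulated masses of the two linked peptides, and reports the spacing within the in-source fragmentation triplet. The function and variable names are hypothetical.

```python
# Illustrative check of the Table 1 arithmetic (not the authors' software).
PROTON_MASS = 1.00728  # Da, mass of a proton


def neutral_mass(mz: float) -> float:
    """Neutral mass for a singly protonated ion [M+H]+, rounded to 0.1 Da."""
    return round(mz - PROTON_MASS, 1)


# m/z of each [M+H]+ ion from the text -> tabulated masses of the two linked peptides
ASSIGNMENTS = {
    3194.0: (733.3, 2460.1),   # Gly84-Arg89 + Gly109-Arg130
    3350.2: (889.4, 2460.1),   # Arg83-Arg89 + Gly109-Arg130
    5713.1: (3626.8, 2087.9),  # Gly47-Arg82 + Ala90-Arg108
}


def summarize():
    """Return (observed mass, calculated mass, difference) for each assignment."""
    rows = []
    for mz, peptides in ASSIGNMENTS.items():
        observed = neutral_mass(mz)
        calculated = round(sum(peptides), 1)
        rows.append((observed, calculated, round(observed - calculated, 1)))
    return rows


# Triplet produced by in-source cleavage of the interpeptide disulfide bond
TRIPLET_MZ = [3594.6, 3627.6, 3660.1]
SPACINGS = [round(b - a, 1) for a, b in zip(TRIPLET_MZ, TRIPLET_MZ[1:])]

if __name__ == "__main__":
    for observed, calculated, diff in summarize():
        print(f"observed {observed:7.1f}  calculated {calculated:7.1f}  diff {diff:+.1f}")
    print("triplet spacings (Da):", SPACINGS)
```

The two spacings within the triplet come out at roughly the mass of one sulfur atom (about 32 Da), in line with the assignment of the three ions to the G47–R82 peptide without sulfur, with one sulfur and with two sulfur atoms.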

MANEC domains represent a new subclass of the PAN/apple domain family

Next, we wanted to investigate whether our new structure represented a novel fold, as suggested by the sequence-based homology searches. A three-dimensional structure-based search using the DALI server (http://ekhidna.biocenter.helsinki.fi/dali_server/start) [44] identified several proteins with homologous structures including proteins from the plasminogen subfamily of the serine proteases, plasminogen, hepatocyte growth factor (HGF) and coagulation factor XI (Supplementary Table S2). In all cases, the MANEC structure matched the fold of a PAN or apple domain from the respective proteins. The PAN domain is a divergent subclass of the apple domain family defined by a single central five-stranded β-sheet, a short α-helix connecting strands 5 and 3 (equivalent to MANEC α-helix 2), and a short α-helix in the C-terminus (equivalent to MANEC α-helix 3) [45]. The domain is stabilized by the presence of two conserved disulfide bonds interlocking the fold (Figures 5B and 5D). The apple domain differs from the PAN domain by encompassing an additional conserved disulfide bond bridging the N- and the C-terminus to form a structure with an ‘apple-like’ appearance (Figures 5C and 5D). Another characteristic feature of the common PAN/apple fold is the way β-strands 1 and 2 bend and adopt a conformation in which strand 2 ‘wraps around’ strand 1 (Figures 5B and 5C). By comparing our structure with the HGF PAN domain, which gave the best fit (Supplementary Table S2), it was possible to identify several additional features of the HAI-1 MANEC domain not found in the common PAN/apple domain fold (Figures 5B–5D). The most obvious difference is the four disulfide bonds, of which only two are shared with the PAN/apple domain consensus. The two additional disulfide bonds are between C50 and C92, locking the N-terminus of MANEC to α-helix 2, and between C121 and C129, pinching the long loop between strand 4 and strand 2 (Figures 5A and 5D). Additional unique features of MANEC include a short α-helix (α-helix 1) between strand 1 and strand 5 containing the N-glycosylation site, an extra-long α-helix (α-helix 3) in the extended C-terminus, and finally the two protruding loop regions between strand 3 and strand 4 (105–112) and between strand 4 and strand 2 (122–129). Looking at the sequence alignments (Figure 5D and Supplementary Table S2), it is now clear why sequence-based homology searches fail to predict homology to the PAN/apple domain. Only 11 amino acid residues, including four cysteines, are conserved, with an overall sequence identity as low as 13% between the HAI-1 MANEC domain and HGF PAN, the closest structural homologue based on RMSD. Impressively, an in silico structure predicted by the QUARK server (http://zhanglab.ccmb.med.umich.edu/QUARK/) [46] compared very well with our NMR structure (Supplementary Figure S7), with an RMSD of 2.5 Å. A DALI search using the predicted structure was again able to pick up homology to the PAN/apple domain family (Supplementary Table S3), showing that in silico structure folding may represent a useful tool to predict structural homologies where sequence-based searches fail.
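
To make the percentage quoted above concrete, the following minimal Python sketch shows one common way of computing sequence identity from a pairwise alignment: identical residues divided by the number of columns in which both sequences carry a residue. The text does not state which denominator convention was used, so this choice is an assumption, and the two aligned strings are invented toy sequences, not the MANEC and HGF PAN sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences have a residue
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only
    toy_a = "ACDE-FGHIK"
    toy_b = "ACNEWFG-LK"
    print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

With this convention, 11 identical residues at 13% identity would correspond to roughly 85 aligned positions, which is in line with the 80–100 residues typical of a MANEC domain; this is a back-calculation, not a figure reported in the text.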

Figure 5
MANEC represents a new subclass of the PAN/apple domain family

(A) A representative structure solution (#2) of the MANEC domain is shown in cartoon representation (PDB ID: 2msx). Secondary structure elements are numbered and highlighted by different colours: α-helices in sky blue, β-strands in green and loops in wheat. Cysteine side chains, including disulfide bonds, are shown as yellow sticks. The β-carbon of the potential glycosylation site at residue 66 is shown as a purple sphere. The locations of the N- and C-termini are highlighted by labels. (B) Alignment of the MANEC domain (colours as in A) with the PAN domain from HGF (PDB ID: 1NK1) in red cartoon reveals the high degree of overall structural similarity, with an RMSD of 2.3 Å. The N- and C-terminal labels and the cysteine side chains shown as yellow sticks belong to the PAN domain. (C) Alignment of the MANEC domain (colours as in A) with the apple domain 1 from coagulation factor XI (PDB ID: 2F83) in red cartoon, with an RMSD of 2.9 Å. The N- and C-terminal labels and the cysteine side chains shown as yellow sticks belong to the apple domain. The structure orientation in (B) and (C) is as in (A) left panel. (D) Alignment of sequence and secondary structure elements of the MANEC domain of HAI-1, the PAN domain of HGF and apple domain 1 of coagulation factor XI using the structure-based DALI server. HAI-1 numbering and secondary structural elements including numbers are indicated above the sequence: α-helices are drawn as helices, β-strands as arrows, other elements as solid lines. Partially defined elements in our MANEC structure are shown as dashed lines. Disulfide connection lines are shown below the sequences. The glycosylation site (N66) in HAI-1 MANEC is highlighted by an asterisk (*). The structure figures and alignments were prepared using PyMOL (the PyMOL Molecular Graphics System, Version 1.5.0.4 Schrödinger, LLC).

SAXS analysis shows that the recombinant HAI-1 MANEC domain is a monomer

It is well documented that the PAN domain of HGF supports the formation of an HGF homodimer as a crucial step for receptor activation [47,48]. In order to verify that our recombinant MANEC domain is monomeric in solution, as suggested by our data obtained by size-exclusion chromatography, we decided to analyse our MANEC preparation by small angle X-ray scattering (SAXS). The shape of the scattering curve for MANEC showed a well-behaved homogeneous sample with only a low tendency to aggregate (upturn at low q values) (Figure 6A). When the theoretically calculated scattering profile of our monomeric NMR structure was compared with the SAXS data using CRYSOL, an almost perfect fit was obtained (Figure 6A). By indirect Fourier transformation of the SAXS data, a Dmax of 43 Å, a radius of gyration of 16.3±0.9 Å and the pair distance distribution function, p(r) (Supplementary Figure S6), were determined. The shape of the p(r) function, in combination with the small radius of gyration, clearly shows that MANEC in solution is a small globular domain. From the SAXS data, it was also possible to construct an ab initio low-resolution molecular model representing the overall shape of MANEC, which fitted very well to our NMR ensemble (Figure 6B). In summary, the SAXS data confirmed that the recombinant HAI-1 MANEC domain is a monomer in solution.
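
For readers less familiar with SAXS, the sketch below illustrates two simple textbook relations that bear on the numbers above. It is not the indirect Fourier transformation used here. The first function estimates the radius of gyration by a Guinier fit, i.e. a straight-line fit of ln I(q) against q², whose slope equals −Rg²/3 at low q. The second gives the radius of gyration of a uniform sphere, Rg = √(3/5)·R, as a rough plausibility check of Rg against Dmax. The scattering data in the example are synthetic and all names are hypothetical.

```python
import math


def guinier_rg(q, intensity):
    """Estimate (Rg, I0) from low-q data using ln I(q) = ln I0 - (Rg**2 / 3) * q**2.

    q is in inverse angstroms; a plain least-squares line is fitted to ln I vs q**2.
    """
    x = [qi ** 2 for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def sphere_rg_from_dmax(dmax: float) -> float:
    """Radius of gyration of a uniform sphere whose diameter equals dmax."""
    return math.sqrt(3.0 / 5.0) * dmax / 2.0


if __name__ == "__main__":
    # Synthetic, noise-free Guinier-region data for Rg = 16.3 A (q * Rg < 1.3)
    true_rg, i0 = 16.3, 1.0
    q_values = [0.01 * k for k in range(1, 8)]
    intensities = [i0 * math.exp(-(qi * true_rg) ** 2 / 3.0) for qi in q_values]
    rg, fitted_i0 = guinier_rg(q_values, intensities)
    print(f"Guinier Rg = {rg:.2f} A, I0 = {fitted_i0:.3f}")
    print(f"Uniform sphere with Dmax 43 A: Rg = {sphere_rg_from_dmax(43.0):.2f} A")
```

A uniform sphere with a diameter of 43 Å would have a radius of gyration of about 16.7 Å, close to the measured 16.3±0.9 Å. MANEC is not a perfect sphere, so this is only a consistency check, but it agrees with the description of a compact globular domain.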

Figure 6
SAXS solution data demonstrate a monomeric state of HAI-1 MANEC

(A) The experimental SAXS data (black circles) overlaid with the theoretical scattering curve (solid black line) of our NMR structure as calculated by CRYSOL. (B) The ab initio molecular shape of the MANEC protein constructed from the experimental SAXS data using the program DAMMIF, superimposed on the 20-structure NMR ensemble of the MANEC domain.

DISCUSSION

The hitherto uncharacterized N-terminal region of HAI-1, encompassing the MANEC domain, constitutes a large portion of the extracellular part of the protein (Figure 1). To facilitate a deeper insight into the biological properties of HAI-1 and the MANEC domains as a family, we decided to perform a structural characterization of the MANEC domain of HAI-1.

The purified HAI-1 MANEC domain proved to be highly soluble with a thermodynamically stable fold, allowing structure determination by liquid-state NMR. Our studies produced a structure of the MANEC domain with a resolution of 1.8 Å (calculated during the validation and upload process to http://www.bmrb.wisc.edu). As observed in our structure presentation (Figure 5A) and shown in the secondary-structure labels above the sequence alignment (Figure 5D), some of our β-sheet strands have only partially ideal hydrogen-bonding patterns and are thus not picked up by PDB viewing software using standard settings.

After determining the structure, it was possible to use a structure-based homology search with the DALI server, and to demonstrate structural homology between MANEC and members of the PAN/apple domain family. PAN/apple domains have been ascribed important functions in numerous studies, all linked to the ability to mediate protein–protein or protein–oligosaccharide interactions. Deletion of the PAN domain from HGF abolishes both receptor binding and heparin binding [49,50]. Our MANEC domain shows no binding to heparin, as it passes unaffected through heparin sepharose columns at physiological ionic strength and elutes as a particle of monomeric size from a size-exclusion column in the presence of excess heparin (data not shown). HGF has two natural splice variants, NK1 and NK2, which contain the N-terminal PAN domain, followed by one or two kringle domains, respectively, and finally the inactive protease domain (Supplementary Figure S1). NK1, which is a receptor agonist, has been shown to form a head-to-tail dimer by X-ray crystallography, partly via the PAN domain. Mutations in the NK1 dimer interface convert NK1 into an antagonist [51]. The crystal structure of NK2 was shown to form a ‘closed’ monomeric conformation through interdomain interactions between the PAN domain and the second kringle domain [52]. In the case of plasminogen, the PAN domain is involved in intramolecular domain–domain interactions determining the overall conformation and thus the activation of the proform of the serine protease. In the structure of the full-length ‘closed form’ of plasminogen, the PAN domain appears to interact with kringle IV and kringle V out of the five kringle domains between the PAN and the protease domain (Supplementary Figure S1) [53]. Through its role in the formation of the ‘closed form’ that exhibits weak affinity towards fibrin, the PAN domain is also important for the biological localization of the enzyme [54]. The apple domain of plasma prekallikrein (Supplementary Figure S1) is known to mediate the binding of high molecular mass kininogen [55]. Factor XI of the coagulation cascade comprises four tandem apple domains and a C-terminal serine protease domain (Supplementary Figure S1). The apple domains of factor XI have been shown to contain binding sites for factor XIIa, platelets, kininogen, factor IX and heparin [56]. In an NMR solution structure of the apple domain IV of factor XI, it was shown that the domain mediates the formation of the disulfide-linked factor XI dimer [57]. These observations underline essential roles of the PAN/apple domain, primarily as a mediator of protein–protein interactions. Also, the primo location in the N-terminus of a multi-domain protein appears in many cases to be important in order to potentiate a structural effect of interdomain interactions on the overall function of the protein.

In 2008, Kojima et al. proposed a model in which the MANEC domain of HAI-1 plays an active role in the regulation of the inhibitory activity. It was proposed that the tertiary structure of HAI-1 collapses via direct interaction between the MANEC domain and the second Kunitz-type inhibitor domain, resulting in steric interference with the protease–inhibitor interaction [14]. Our findings support this model by introducing homology to other proteins, such as plasminogen and certain forms of HGF, in which intramolecular interactions involving the PAN/apple domain have been shown to regulate protein function. Our MANEC structure also provides clues to where such interaction surfaces may be localized. The presence of two additional conserved disulfide bonds not only stabilizes the already well-defined PAN-like fold, but also appears to introduce the aforementioned unique structural features. The additional fourth disulfide bond between C121 and C129 forces a seven-residue loop between the two cysteines into an extended conformation projecting out from the overall globular shape of the MANEC fold. This loop region (L122–Y128) and the loop region Q105–A112 are both longer than the corresponding linkers of the typical PAN domain (Figure 5D) and project in opposite directions from the two poles of the central β-sheet (Figure 4A). As both extended loops contain multiple charged side chains, they both represent likely areas for mediating protein–protein interactions. Furthermore, the N- and C-termini and the additional α-helix with a verified N-linked glycosylation appear to shield a large part of the remaining MANEC surface. We also postulate that the conserved primo position of the MANEC domain in the N-terminus of the parent proteins may reflect a general mechanism by which it participates in the regulation of the tertiary fold and thus the function of these multi-domain proteins. Why a MANEC domain is preferred by membrane proteins over the common PAN/apple fold, as found in circulating proteins, remains to be uncovered.

In conclusion, we provide here the first structural characterization of the MANEC domain. The structure will allow for a detailed mapping of potential protein–protein interaction sites in the parent protein HAI-1 by site-directed mutagenesis. The structure will also become a general reference structure for all predicted MANEC domains from the many so far uncharacterized proteins. We were able to use our structure to define MANEC domains as a new subclass of the PAN/apple domain family. The homology to the PAN/apple domains seems to suggest a similar function of the MANEC domain as a mediator of protein–protein interactions with potential regulatory properties for the parent protein.

Abbreviations

     
  • HAI-1: hepatocyte growth factor activator inhibitor-1
  • HGF: hepatocyte growth factor
  • LDLR: low-density lipoprotein receptor
  • MALDI: matrix-assisted laser desorption-ionization
  • MANEC: motif at N-terminus with eight-cysteines
  • NUS: non-uniform sampling
  • TFA: trifluoroacetic acid

AUTHOR CONTRIBUTION

Zebin Hong carried out the majority of the experiments and the initial NMR assignment and participated in the preparation of the manuscript. Michal Nowakowski performed NMR and side chain assignment. Chris Spronk supported NMR analysis and structure calculation. Steen Petersen performed mass spectrometry analyses. Wiktor Koźmiński supported NMR setup and supplied instrumentation. Frans Mulder performed initial NMR experiments and NMR data analysis. Jan Jensen designed the experiments, analysed the data and wrote the manuscript.

We thank Professor Jan Skov Pedersen for access to SAXS equipment and Christine R. Schar for useful comments on the manuscript.

FUNDING

This work was supported by the Danish National Research Foundation [grant number 26-331-6]; the Lundbeck Foundation [grant number R34-A3528]; and the Danish Cancer Society [grant number R56-A2997-12-S2] and Foundation. M.N. and W.K. were supported by the Foundation for Polish Science, TEAM programme.

References

1. Guo J., Chen S., Huang C., Chen L., Studholme D.J., Zhao S., Yu L. MANSC: a seven-cysteine-containing domain present in animal membrane and extracellular proteins. Trends Biochem. Sci. 2004, vol. 29 (pg. 172–174).
2. Letunic I., Doerks T., Bork P. SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res. 2012, vol. 40 (pg. D302–D305).
3. Shimomura T., Denda K., Kitamura A., Kawaguchi T., Kito M., Kondo J., Kagaya S., Qin L., Takata H., Miyazawa K., Kitamura N. Hepatocyte growth factor activator inhibitor, a novel Kunitz-type serine protease inhibitor. J. Biol. Chem. 1997, vol. 272 (pg. 6370–6376).
4. Kataoka H., Shimomura T., Kawaguchi T., Hamasuna R., Itoh H., Kitamura N., Miyazawa K., Koono M. Hepatocyte growth factor activator inhibitor type 1 is a specific cell surface binding protein of hepatocyte growth factor activator (HGFA) and regulates HGFA activity in the pericellular microenvironment. J. Biol. Chem. 2000, vol. 275 (pg. 40453–40462).
5. Szabo R., Hobson J.P., List K., Molinolo A., Lin C.Y., Bugge T.H. Potent inhibition and global co-localization implicate the transmembrane Kunitz-type serine protease inhibitor hepatocyte growth factor activator inhibitor-2 in the regulation of epithelial matriptase activity. J. Biol. Chem. 2008, vol. 283 (pg. 29495–29504).
6. Fan B., Wu T.D., Li W., Kirchhofer D. Identification of hepatocyte growth factor activator inhibitor-1B as a potential physiological inhibitor of prostasin. J. Biol. Chem. 2005, vol. 280 (pg. 34513–34520).
7. Kirchhofer D., Peek M., Lipari M.T., Billeci K., Fan B., Moran P. Hepsin activates pro-hepatocyte growth factor and is inhibited by hepatocyte growth factor activator inhibitor-1B (HAI-1B) and HAI-2. FEBS Lett. 2005, vol. 579 (pg. 1945–1950).
8. Fan B., Brennan J., Grant D., Peale F., Rangell L., Kirchhofer D. Hepatocyte growth factor activator inhibitor-1 (HAI-1) is essential for the integrity of basement membranes in the developing placental labyrinth. Dev. Biol. 2007, vol. 303 (pg. 222–230).
9. List K., Szabo R., Molinolo A., Sriuranpong V., Redeye V., Murdock T., Burke B., Nielsen B.S., Gutkind J.S., Bugge T.H. Deregulated matriptase causes ras-independent multistage carcinogenesis and promotes ras-mediated malignant transformation. Genes Dev. 2005, vol. 19 (pg. 1934–1950).
10. Nakamura K., Abarzua F., Hongo A., Kodama J., Nasu Y., Kumon H., Hiramatsu Y. The role of hepatocyte growth factor activator inhibitor-1 (HAI-1) as a prognostic indicator in cervical cancer. Int. J. Oncol. 2009, vol. 35 (pg. 239–248).
11. Hu C.Y., Jiang N., Wang G.Z., Zheng J.C., Yang W., Yang J.W. Expression of hepatocyte growth factor activator inhibitor-1 (HAI-1) gene in prostate cancer: clinical and biological significance. J. BUON 2014, vol. 19 (pg. 215–220).
12. Kirchhofer D., Peek M., Li W., Stamos J., Eigenbrot C., Kadkhodayan S., Elliott J.M., Corpuz R.T., Lazarus R.A., Moran P. Tissue expression, protease specificity, and Kunitz domain functions of hepatocyte growth factor activator inhibitor-1B (HAI-1B), a new splice variant of HAI-1. J. Biol. Chem. 2003, vol. 278 (pg. 36341–36349).
13. Denda K., Shimomura T., Kawaguchi T., Miyazawa K., Kitamura N. Functional characterization of Kunitz domains in hepatocyte growth factor activator inhibitor type 1. J. Biol. Chem. 2002, vol. 277 (pg. 14053–14059).
14. Kojima K., Tsuzuki S., Fushiki T., Inouye K. Roles of functional and structural domains of hepatocyte growth factor activator inhibitor type 1 in the inhibition of matriptase. J. Biol. Chem. 2008, vol. 283 (pg. 2478–2487).
15. Gobom J., Nordhoff E., Mirgorodskaya E., Ekman R., Roepstorff P. Sample purification and preparation technique based on nano-scale reversed-phase columns for the sensitive analysis of complex peptide mixtures by matrix-assisted laser desorption/ionization mass spectrometry. J. Mass Spectrom. 1999, vol. 34 (pg. 105–116).
16. Delaglio F., Grzesiek S., Vuister G.W., Zhu G., Pfeifer J., Bax A. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 1995, vol. 6 (pg. 277–293).
17. Goddard T.D., Kneller D.G. SPARKY3. 2002, San Francisco: University of California.
18. Wishart D.S., Bigam C.G., Yao J., Abildgaard F., Dyson H.J., Oldfield E., Markley J.L., Sykes B.D. 1H, 13C and 15N chemical shift referencing in biomolecular NMR. J. Biomol. NMR 1995, vol. 6 (pg. 135–140).
19. Kazimierczuk K., Zawadzka A., Kozminski W., Zhukov I. Random sampling of evolution time space and Fourier transform processing. J. Biomol. NMR 2006, vol. 36 (pg. 157–168).
20. Stanek J., Kozminski W. Iterative algorithm of discrete Fourier transform for processing randomly sampled NMR data sets. J. Biomol. NMR 2010, vol. 47 (pg. 65–77).
21. Stanek J., Augustyniak R., Kozminski W. Suppression of sampling artefacts in high-resolution four-dimensional NMR spectra using signal separation algorithm. J. Magn. Reson. 2012, vol. 214 (pg. 91–102).
22. Atreya H.S., Szyperski T. G-matrix Fourier transform NMR spectroscopy for complete protein resonance assignment. Proc. Natl. Acad. Sci. U.S.A. 2004, vol. 101 (pg. 9642–9647).
23. Kazimierczuk K., Zawadzka-Kazimierczuk A., Kozminski W. Non-uniform frequency domain for optimal exploitation of non-uniform sampling. J. Magn. Reson. 2010, vol. 205 (pg. 286–292).
24. Olejniczak E.T., Xu R.X., Fesik S.W. A 4D HCCH-TOCSY experiment for assigning the side chain 1H and 13C resonances of proteins. J. Biomol. NMR 1992, vol. 2 (pg. 655–659).
25. Coggins B.E., Zhou P. High resolution 4-D spectroscopy with sparse concentric shell sampling and FFT-CLEAN. J. Biomol. NMR 2008, vol. 42 (pg. 225–239).
26. Yamazaki T., Forman-Kay J.D., Kay L.E. Two-dimensional NMR experiments for correlating 13Cβ and 1Hδ/ε chemical shifts of aromatic residues in 13C-labeled proteins via scalar couplings. J. Am. Chem. Soc. 1993, vol. 115 (pg. 11054–11055).
27. Muhandiram D.R., Farrow N.A., Xu G.-Y., Smallcombe S.H., Kay L.E. A gradient 13C NOESY-HSQC experiment for recording NOESY spectra of 13C-labeled proteins dissolved in H2O. J. Magn. Reson. B 1993, vol. 102 (pg. 317–321).
28. Stanek J., Nowakowski M., Saxena S., Ruszczynska-Bartnik K., Ejchart A., Kozminski W. Selective diagonal-free 13C,13C-edited aliphatic-aromatic NOESY experiment with non-uniform sampling. J. Biomol. NMR 2013, vol. 56 (pg. 217–226).
29. Zhang O., Kay L.E., Olivier J.P., Forman-Kay J.D. Backbone 1H and 15N resonance assignments of the N-terminal SH3 domain of drk in folded and unfolded states using enhanced-sensitivity pulsed field gradient NMR techniques. J. Biomol. NMR 1994, vol. 4 (pg. 845–858).
30. Vuister G.W., Clore G.M., Gronenborn A.M., Powers R., Garrett D.S., Tschudin R., Bax A. Increased resolution and improved spectral quality in four-dimensional 13C/13C-separated HMQC-NOESY-HMQC spectra using pulsed field gradients. J. Magn. Reson. B 1993, vol. 101 (pg. 210–213).
31. Tugarinov V., Kay L.E., Ibraghimov I., Orekhov V.Y. High-resolution four-dimensional 1H–13C NOE spectroscopy using methyl-TROSY, sparse data acquisition, and multidimensional decomposition. J. Am. Chem. Soc. 2005, vol. 127 (pg. 2767–2775).
32. Guntert P. Automated NMR structure calculation with CYANA. Methods Mol. Biol. 2004, vol. 278 (pg. 353–378).
33. Shen Y., Delaglio F., Cornilescu G., Bax A. TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J. Biomol. NMR 2009, vol. 44 (pg. 213–223).
34. Krieger E., Koraimann G., Vriend G. Increasing the precision of comparative models with YASARA NOVA–a self-parameterizing force field. Proteins 2002, vol. 47 (pg. 393–402).
35. Pedersen J.S. A flux- and background-optimized version of the NanoSTAR small-angle X-ray scattering camera for solution scattering. J. Appl. Crystallogr. 2004, vol. 37 (pg. 369–380).
36. Pedersen J.S., Hansen S., Bauer R. The aggregation behavior of zinc-free insulin studied by small-angle neutron scattering. Eur. Biophys. J. 1994, vol. 22 (pg. 379–389).
37. Svergun D., Barberato C., Koch M.H.J. CRYSOL–a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. J. Appl. Crystallogr. 1995, vol. 28 (pg. 768–773).
38. Franke D., Svergun D.I. DAMMIF, a program for rapid ab-initio shape determination in small-angle scattering. J. Appl. Crystallogr. 2009, vol. 42 (pg. 342–346).
39. Kozin M.B., Svergun D.I. Automated matching of high- and low-resolution structural models. J. Appl. Crystallogr. 2001, vol. 34 (pg. 33–41).
40. Volkov V.V., Svergun D.I. Uniqueness of ab initio shape determination in small-angle scattering. J. Appl. Crystallogr. 2003, vol. 36 (pg. 860–864).
41. Roy A., Kucukural A., Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 2010, vol. 5 (pg. 725–738).
42. Sharma D., Rajarathnam K. 13C NMR chemical shifts can predict disulfide bond formation. J. Biomol. NMR 2000, vol. 18 (pg. 165–171).
43. Bean M.F., Carr S.A. Characterization of disulfide bond position in proteins and sequence analysis of cystine-bridged peptides by tandem mass spectrometry. Anal. Biochem. 1992, vol. 201 (pg. 216–226).
44. Holm L., Rosenstrom P. Dali server: conservation mapping in 3D. Nucleic Acids Res. 2010, vol. 38 (pg. W545–W549).
45. Tordai H., Banyai L., Patthy L. The PAN module: the N-terminal domains of plasminogen and hepatocyte growth factor are homologous with the apple domains of the prekallikrein family and with a novel domain found in numerous nematode proteins. FEBS Lett. 1999, vol. 461 (pg. 63–67).
46. Xu D., Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 2012, vol. 80 (pg. 1715–1735).
47. Gherardi E., Sandin S., Petoukhov M.V., Finch J., Youles M.E., Ofverstedt L.G., Miguel R.N., Blundell T.L., Vande Woude G.F., Skoglund U., Svergun D.I. Structural basis of hepatocyte growth factor/scatter factor and MET signalling. Proc. Natl. Acad. Sci. U.S.A. 2006, vol. 103 (pg. 4046–4051).
48. Tolbert W.D., Daugherty J., Gao C., Xie Q., Miranti C., Gherardi E., Vande Woude G., Xu H.E. A mechanistic basis for converting a receptor tyrosine kinase agonist to an antagonist. Proc. Natl. Acad. Sci. U.S.A. 2007, vol. 104 (pg. 14592–14597).
49. Lokker N.A., Presta L.G., Godowski P.J. Mutational analysis and molecular modeling of the N-terminal kringle-containing domain of hepatocyte growth factor identifies amino acid side chains important for interaction with the c-Met receptor. Protein Eng. 1994, vol. 7 (pg. 895–903).
50. Sakata H., Stahl S.J., Taylor W.G., Rosenberg J.M., Sakaguchi K., Wingfield P.T., Rubin J.S. Heparin binding and oligomerization of hepatocyte growth factor/scatter factor isoforms. Heparan sulfate glycosaminoglycan requirement for Met binding and signaling. J. Biol. Chem. 1997, vol. 272 (pg. 9457–9463).
51. Ultsch M., Lokker N.A., Godowski P.J., de Vos A.M. Crystal structure of the NK1 fragment of human hepatocyte growth factor at 2.0 Å resolution. Structure 1998, vol. 6 (pg. 1383–1393).
52. Tolbert W.D., Daugherty-Holtrop J., Gherardi E., Vande Woude G., Xu H.E. Structural basis for agonism and antagonism of hepatocyte growth factor. Proc. Natl. Acad. Sci. U.S.A. 2010, vol. 107 (pg. 13264–13269).
53. Law R.H., Caradoc-Davies T., Cowieson N., Horvath A.J., Quek A.J., Encarnacao J.A., Steer D., Cowan A., Zhang Q., Lu B.G., et al. The X-ray crystal structure of full-length human plasminogen. Cell Rep. 2012, vol. 1 (pg. 185–190).
54. Banyai L., Patthy L. Importance of intramolecular interactions in the control of the fibrin affinity and activation of human plasminogen. J. Biol. Chem. 1984, vol. 259 (pg. 6466–6471).
55. Herwald H., Renne T., Meijers J.C., Chung D.W., Page J.D., Colman R.W., Muller-Esterl W. Mapping of the discontinuous kininogen binding site of prekallikrein. A distal binding segment is located in the heavy chain domain A4. J. Biol. Chem. 1996, vol. 271 (pg. 13061–13067).
56. Ho D.H., Badellino K., Baglia F.A., Walsh P.N. A binding site for heparin in the apple 3 domain of factor XI. J. Biol. Chem. 1998, vol. 273 (pg. 16382–16390).
57. Samuel D., Cheng H., Riley P.W., Canutescu A.A., Nagaswami C., Weisel J.W., Bu Z., Walsh P.N., Roder H. Solution structure of the A4 domain of factor XI sheds light on the mechanism of zymogen activation. Proc. Natl. Acad. Sci. U.S.A. 2007, vol. 104 (pg. 15693–15698).

Author notes

The co-ordinates reported for the Protein Data Bank will appear in the PDB under accession codes 2msx, 1NK1, 2F83.

Supplementary data