Compound phenotype of osteogenesis imperfecta and Ehlers–Danlos syndrome caused by combined mutations in COL1A1 and COL5A1

Osteogenesis imperfecta (OI) is an inherited connective tissue disorder with a broad clinical spectrum that can overlap with Ehlers–Danlos syndrome (EDS). To date, patients with both OI and EDS have rarely been reported. In the present study, we investigated a family of four members (one healthy individual, one displaying OI only, and two displaying the compound phenotype of OI and EDS) and identified the pathogenic mutations. Whole exome sequencing was applied to the proband and her brother. To verify that the mutations were responsible for the pathogenesis, conventional Sanger sequencing was performed for all members of the family. We identified a known COL1A1 (encoding collagen type I α 1 chain) mutation (c.2010delT, p.Gly671Alafs*95) in all three patients in this family (the proband, her brother, and her mother), and a novel heterozygous COL5A1 (encoding collagen type V α 1 chain) mutation (c.5335A>G, p.N1779D) in the region encoding the C-terminal propeptide domain in the proband and her mother, who both had the compound phenotype of OI and EDS. The results of the present study suggest that the compound OI–EDS phenotype of the proband and her mother was caused by pathogenic mutations in COL5A1 and COL1A1.


Introduction
Osteogenesis imperfecta (OI), or brittle bone disease, is a clinically and genetically heterogeneous disorder that mainly results in osteopenia, bone fragility, blue sclerae, dentinogenesis imperfecta, and hearing loss [1]. OI can be classified into types I-IV, and approximately 85-90% of individuals with OI have a mutation in either collagen type I α 1 chain (COL1A1) or collagen type I α 2 chain (COL1A2). Type I collagen is the most abundant protein in the extracellular matrix of bone, skin, and tendon [2]. The OI Mutation Consortium, an international collaboration of many laboratories that identify OI mutations, has found that 80% of COL1A1/COL1A2 mutations give rise to substitution of glycine residues in the type I collagen chain, and the remaining 20% of mutations result in abnormalities of mRNA splicing [3]. Mutations in the genes encoding type I procollagen produce a range of disorders, including autosomal dominant OI and the rare arthrochalasis subtype of Ehlers-Danlos syndrome (EDS) [4][5][6]. EDS is a connective tissue disorder that is characterized by abnormal wound healing, easy bruising, atrophic scarring, and joint hypermobility [7]. Classic-type EDS (cEDS) occurs because of a COL5A1/2 (encoding collagen type V α 1 or 2 chain) mutation and is inherited in an autosomal dominant manner. It is estimated that approximately 50% of patients with cEDS harbor a COL5A1 or COL5A2 mutation [7].
In the present study, we report a rare family presenting with a compound phenotype of OI and EDS, including multiple fractures, blue sclerae, atrophic scarring, easy bruising, and joint hypermobility. To understand the basis of the disorder, to provide a theoretical foundation for genetic counseling, and to determine whether the patients simultaneously harbor pathogenic mutations in genes associated with OI and EDS, whole exome sequencing was performed. We identified a COL1A1 mutation known to be responsible for OI, and a novel C-propeptide domain mutation in COL5A1. To the best of our knowledge, this is the first report of patients with compound phenotypes of OI and EDS who harbor both COL1A1 and COL5A1 mutations.

Materials and methods
The study was approved by the Ethics Committee of the Second Hospital, Shantou University Medical College. Written informed consent was obtained from each individual for their DNA to be used for research purposes.

DNA extraction
DNA extraction was performed using a QIAamp DNA Mini Kit (Cat. No. 51104, Qiagen, Hilden, Germany) according to the manufacturer's protocol. Genomic DNA was obtained from peripheral blood samples from the family members (Figure 1), including the proband (III-1), her brother (III-2), and their parents (II-1, II-2).

Whole exome sequencing
Library preparation and sequencing
Whole exome sequencing was performed for the two affected children (III-1, III-2) at the Beijing Novogene Bioinformatics Technology Co., Ltd (Beijing, China). Exome sequences were enriched using an Agilent liquid capture system (Agilent SureSelect Human All Exon V6; Agilent Technologies, Santa Clara, CA, U.S.A.) according to the manufacturer's protocol. First, genomic DNA was randomly fragmented to an average size of 180-280 bp using a Covaris S220 sonicator (Covaris, Brighton, U.K.). Second, the DNA fragments were end-repaired and phosphorylated, followed by A-tailing and ligation at the 3′ ends with paired-end adaptors (Illumina, San Diego, CA, U.S.A.) with a single 'T' base overhang, and purified using AMPure SPRI beads from Agencourt (Azincourt, France). Third, the size distribution and concentration of the libraries were determined using an Agilent 2100 Bioanalyzer, and the libraries were qualified using real-time PCR. Finally, the DNA libraries were sequenced on an Illumina HiSeq 4000 sequencer to generate paired-end 150 bp reads. The raw data were saved as FASTQ (fq) format files.

Selection of valid sequencing data
Initially, reads with adapter contamination were filtered out. Then, reads that contained more than 10% uncertain nucleotides and paired reads with single reads of low quality (Phred-like quality score (Q score) <5) were also discarded.
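The two read-level rules described above can be illustrated with a minimal Python sketch. It is not the sequencing provider's pipeline: the function names are hypothetical, Phred+33 quality encoding is assumed, and because the text does not state what fraction of bases with Q < 5 makes a read "low quality", the 50% threshold (`max_lowq_frac`) is an assumption. Adapter filtering is omitted.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def read_is_valid(seq, qual, max_n_frac=0.10, min_q=5, max_lowq_frac=0.50):
    """Apply the two per-read rules to one read.

    max_n_frac    : reads with more than 10% uncertain bases (N) are discarded.
    min_q         : bases with a quality score below 5 count as low quality.
    max_lowq_frac : ASSUMPTION, not stated in the text; a read is treated as
                    low quality when more than this fraction of bases is low.
    """
    if not seq or len(seq) != len(qual):
        return False
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    low_q_bases = sum(q < min_q for q in phred33_scores(qual))
    return low_q_bases / len(seq) <= max_lowq_frac


def pair_is_valid(read1, read2):
    """A read pair is kept only if both mates pass; each read is (seq, qual)."""
    return read_is_valid(*read1) and read_is_valid(*read2)
```

Discarding the whole pair when one mate fails mirrors the statement that paired reads were removed when a single read was of low quality.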

Sequencing data mapping to reference sequences and variant calling
The valid sequencing data were mapped to the reference human genome (UCSC hg19) using Burrows-Wheeler Aligner (BWA) software (version 0.7.8; https://sourceforge.net/p/bio-bwa/mailman/message/32169236/). Subsequently, Samtools software 1.0 (also from Sourceforge) was used to sort the BAM files. Picard (http://broadinstitute.github.io/picard) was then employed to identify and remove duplicate reads. Finally, Samtools mpileup and BCFtools were used to perform variant calling and identify single nucleotide polymorphisms (SNPs) and indels, which were stored in a variant call format (VCF) file.

Functional annotation and variant filter
ANNOVAR (http://annovar.openbioinformatics.org/en/latest/) was used to annotate the VCF file. The variant position, variant type, conservation prediction, and other information were obtained at this step using a variety of databases, such as dbSNP, 1000 Genomes, ExAC, CADD, and HGMD. Gene transcript annotation databases, such as Consensus CDS, RefSeq, Ensembl, and UCSC, were also applied for annotation to identify amino acid alterations. Variants with a minor allele frequency (MAF) > 0.1% in the 1000 Genomes database (1000 Genomes Project Consortium) were filtered out. Then, synonymous single nucleotide variants (SNVs) were discarded and the retained nonsynonymous SNVs were submitted to PolyPhen-2, SIFT, MutationTaster, and CADD for functional prediction. A nonsynonymous SNV was retained if at least two of the four software programs showed it to be 'not benign'. Finally, we focused on genes known to be associated with OI and EDS.
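The filtering cascade described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in the study: the record layout, field names, label vocabulary, and gene list are hypothetical (the gene list contains only genes named in this report), and because the text does not say how indels were treated at the prediction step, the sketch lets them bypass it.

```python
# ASSUMPTION: illustrative gene list limited to genes mentioned in this report.
OI_EDS_GENES = {"COL1A1", "COL1A2", "COL5A1", "COL5A2", "TNXB"}

# ASSUMPTION: labels treated as a 'benign' call; real tools use their own terms.
BENIGN_LABELS = {"benign", "tolerated", "polymorphism"}


def keep_variant(variant, maf_cutoff=0.001, min_not_benign=2):
    """Return True if a variant record survives every filtering step.

    `variant` is a hypothetical dict with keys 'gene', 'effect',
    'maf_1000g' (None if absent from 1000 Genomes) and 'predictions'
    (tool name -> label, or None when no call is available).
    """
    maf = variant.get("maf_1000g")
    if maf is not None and maf > maf_cutoff:      # drop variants with MAF > 0.1%
        return False
    if variant["effect"] == "synonymous":         # drop synonymous SNVs
        return False
    if variant["gene"] not in OI_EDS_GENES:       # keep OI/EDS genes only
        return False
    if variant["effect"] != "nonsynonymous":      # ASSUMPTION: indels skip predictors
        return True
    calls = [c for c in variant.get("predictions", {}).values() if c is not None]
    not_benign = sum(c.lower() not in BENIGN_LABELS for c in calls)
    return not_benign >= min_not_benign           # at least two of four tools
```

A missing prediction is simply not counted, so a variant still needs two explicit 'not benign' calls to be retained.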

Sanger sequencing of candidate variants
To confirm the candidate variants identified by whole exome sequencing, Sanger sequencing was performed for all four members of this family with compound OI and EDS phenotypes. Primers were designed using Primer Premier 5 software (PREMIER Biosoft International, Palo Alto, CA, U.S.A.). We used genomic DNA to amplify the region of the respective variant using Takara Ex Taq® Hot Start Version (RR006A; Takara, Shiga, Japan). The Beijing Genomics Institute performed the purification of the PCR-amplified DNA and Sanger sequencing (using an ABI 3730XL sequencer).

Evolutionary conservation analysis
To evaluate the evolutionary conservation of the site of the novel COL5A1 mutation, the protein sequences of COL5A1 from eight animal species, including human, rhesus, mouse, elephant, opossum, chicken, Xenopus laevis, and zebrafish, were aligned using ClustalW embedded in MEGA7 (https://www.megasoftware.net/).
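Once a multiple alignment is available, checking conservation at a single site comes down to reading one alignment column. The Python sketch below illustrates that step only. It does not reproduce ClustalW or MEGA7, the function names are hypothetical, and the toy sequences in the usage lines are made up for illustration and are not COL5A1 sequences.

```python
from collections import Counter


def alignment_column(aligned_reference, residue_number):
    """Map a 1-based ungapped residue number in the reference sequence
    to the 0-based column index in the gapped alignment."""
    seen = 0
    for column, symbol in enumerate(aligned_reference):
        if symbol != "-":
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("residue number exceeds reference length")


def column_conservation(alignment, reference_name, residue_number):
    """Return (most common symbol, fraction of sequences sharing it)
    at the column holding the given reference residue."""
    column = alignment_column(alignment[reference_name], residue_number)
    symbols = [sequence[column] for sequence in alignment.values()]
    symbol, count = Counter(symbols).most_common(1)[0]
    return symbol, count / len(symbols)


# Made-up toy alignment (NOT real COL5A1 sequences).
toy_alignment = {
    "human": "MK-NLC",
    "mouse": "MKANLC",
    "zebrafish": "MR-NLC",
}
print(column_conservation(toy_alignment, "human", 3))
```

Counting residues in the ungapped reference sequence keeps the site numbering tied to the human protein even when other species introduce gaps.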

Results
Clinical characteristics of a Chinese family with OI
This OI family had two patients with a compound OI and EDS phenotype, a patient with only OI, and a healthy individual (Figure 1). In this family, the female proband (III-1), who was 18 years old, appeared to be healthy before the age of 12 years. Thereafter, fractures of long bones occurred three times, and her knee ligament ruptured when she was involved in a car accident at 17 years old. She also presented with blue sclerae, atrophic scarring, joint hypermobility, prominent ears, and easy bruising (Table 1 and Figure 2). Similarly, the proband's mother (II-1), 40 years old, not only presented with multiple fractures of long bones and blue sclerae, but also suffered from easy bruising after minor trauma, atrophic scarring, joint hypermobility, joint laxity, and prominent ears (Table 1). The proband's brother (III-2) was 14 years old, and showed multiple fractures of long bones and blue sclerae, but did not have easy bruising, atrophic scarring, joint hypermobility, or joint laxity (Table 1). In this family, the proband's father (II-2), 42 years old, was healthy (i.e., presented no symptoms of OI or EDS); all patients had normal mental development and normal hearing (Table 1).

Identification of mutations
To identify the causative mutations of OI in the family, whole exome sequencing was performed on the proband (III-1) and her brother (III-2). The total raw data comprised 27.35 GB for the proband and her brother. An average of 98.2% of the reads had a Q score greater than 20, and 95.5% of the reads had a Q score greater than 30. Quality control indicated that 99.4% of the raw data were valid sequencing data. For III-1 and III-2, the average sequencing depth on target was 165.5× and 129.4×, respectively, and the fractions of the target covered at a depth of at least 10× were 99.5% and 99.6%, respectively.
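The two coverage figures reported above (average depth on target, and fraction of target covered at a depth of at least 10×) are simple summaries of per-base depth. A minimal Python sketch of how such summaries are defined follows; the function name is hypothetical and the depth values in the usage line are invented for illustration, not taken from this data set.

```python
def depth_summary(per_base_depths, min_depth=10):
    """Return (mean depth, fraction of positions with depth >= min_depth)
    for a list holding the read depth at each targeted base."""
    if not per_base_depths:
        raise ValueError("no target positions supplied")
    total_positions = len(per_base_depths)
    mean_depth = sum(per_base_depths) / total_positions
    covered = sum(depth >= min_depth for depth in per_base_depths)
    return mean_depth, covered / total_positions


# Invented depths for four target positions, for illustration only.
print(depth_summary([0, 10, 20, 30]))
```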

Discussion
We characterized three patients with OI and a healthy individual from the same Chinese family. Among them, two patients had a compound OI and EDS phenotype, manifesting as multiple fractures, blue sclerae, atrophic scarring, easy bruising, and joint hypermobility. To understand the cause of the phenotypic variability, we performed whole exome sequencing and identified a COL1A1 mutation and a novel COL5A1 mutation in the patients with the compound OI/EDS phenotype.
Type I collagen is the most abundant organic component of bone, skin, and tendon extracellular matrix [2]. To date, more than 1000 COL1A1/2 mutations have been identified in patients with OI. OI caused by COL1A1/2 mutations is classified into two types. The first type involves the substitution of a glycine within the Gly-x-y triplet domain of the triple helix, which can give rise to the abnormal synthesis of collagen fibrils. The second type of mutation takes the form of frameshift, nonsense, and splice-site mutations, which can result in haploinsufficiency [1,8].
In this family, the three patients (the proband (III-1), her mother (II-1), and her brother (III-2)) carry the heterozygous COL1A1 mutation (c.2010delT), and the COL1A1 mutation in the proband (III-1) and her brother (III-2) was inherited from their affected mother. This frameshift mutation (c.2010delT) has been reported in type I/IV OI [9,10], and is predicted to cause premature termination at codon 94 [10], which could result in haploinsufficiency. Thus, our results further support the view that the COL1A1 mutation (c.2010delT) can result in OI.
The proband's brother harbors only the c.2010delT COL1A1 mutation, and has only the clinical symptoms of OI. However, the phenotypes of the proband (III-1) and her mother are different from those of her brother. Interestingly, in the proband and her mother we identified an additional heterozygous mutation in the region of COL5A1 encoding the C-terminal propeptide domain, which correlates with the EDS phenotype; the COL5A1 mutation in the proband (III-1) was inherited from her affected mother. This finding may explain why the phenotypes of the proband (III-1) and her mother are different from those of the proband's brother. The molecular basis of cEDS is essentially a deficiency of type V collagen, which is a quantitatively minor fibrillar collagen that is widely distributed in a variety of connective tissues [11]. The major variant of type V collagen is a heterotrimer that is composed of two pro-α1 (V) chains and a single pro-α2 (V) chain, which are encoded by the COL5A1 and COL5A2 genes, respectively [12,13].
The COL5A1 gene encodes the α1 chain of type V collagen, which is a minor fibrillar collagen found in ligaments and tendons, as well as other tissues [14]. Mutations in the COL5A1 C-terminal propeptide domain can cause 'functional' haploinsufficiency of type V collagen [15,16], either because of inefficient trafficking of the mutant protein through the endoplasmic reticulum [17] or because of impaired incorporation of the mutant α1 chain into type V collagen; this haploinsufficiency is an important factor in the pathogenesis of cEDS [16,18]. The mutant α1 chain of type V collagen also plays a negative role by disrupting the interactions with other ECM components, which can be observed in patients with COL5A1 mutations in the C-terminal propeptide domain [19].
Type I and V collagens are the two main components of ligaments. Thus, mutations in COL1A1 and COL5A1 are potential risk factors for ligament rupture [20], as displayed by our proband. An external force could make such a ligament more likely to rupture. The proband experienced ligament rupture caused by trauma resulting from a car accident, not a spontaneous rupture. In addition, the proband's mother had no history of trauma, and had not suffered ligament rupture. Type V collagen plays a central role in collagen fibrillogenesis and co-assembles with type I collagen to form heterotypic fibrils [12,23]. Type V collagen intercalates into the core of type I collagen fibrils, where it is involved in the organization and regulation of type I collagen fibril diameter [21]. In col5a1 +/− mice, tendons have larger diameter fibrils, resulting in an irregular shape [22]. Irregularly shaped fibrils generate a diminished dynamic mechanical response of col5a1 +/− tendons [22], and COL5A1 mutations give rise to structural tendon pathology and low tendon stiffness responsible for joint hypermobility [23]. The COL5A1 mutation (c.5335A>G) results in a change from asparagine to aspartic acid. However, interpreting the COL5A1 mutation as disease causing without functional studies is a limitation of our report. Mutations described previously in the C-propeptide domain of the COL5A1 gene have involved cysteine residues. Cysteine residues form disulfide bonds between collagen chains, and changing a cysteine to another amino acid interferes with the ability of the individual collagen molecule to assemble into a trimer [18]. The effect of an asparagine to aspartic acid change in the C-propeptide domain of COL5A1 has not been reported.
However, the c.5335A>G mutation in COL5A1 was predicted by SIFT, MutationTaster, and CADD to be deleterious, disease causing, and damaging, respectively, indicating that this mutation could be causative of disease, consistent with a role in EDS. In support of this, the site of this COL5A1 mutation is evolutionarily conserved, suggesting that it has an important biological function. Other known and potential pathogenic variants for OI and EDS were not identified by whole exome sequencing in this family. In addition, the COL5A1 mutation was not found in the proband's brother, who did not display the clinical symptoms of EDS. Finally, mutations in TNXB, which has been associated with EDS, and in COL1A1 can give rise to overlapping phenotypes of OI and EDS [24]. Therefore, we speculated that the combined mutations in COL1A1 and the COL5A1 C-terminal propeptide domain could result in the hybrid EDS-OI phenotype of the proband (III-1) and her mother, who are the only family members carrying the COL5A1 (c.5335A>G) mutation. Further functional studies with a larger sample size are needed to confirm the results.
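The coding-level and protein-level descriptions of the two variants reported here can be cross-checked with simple codon arithmetic. The Python sketch below is an illustrative consistency check, not part of the study's analysis; the function names are hypothetical, and only the standard genetic code for asparagine (AAT, AAC) and aspartic acid (GAT, GAC) is assumed. Position 5335 is the first base of codon 1779, and an A to G change at the first base of either asparagine codon yields an aspartic acid codon, in agreement with p.N1779D. Position 2010 is the third base of codon 670, so a one-base deletion there shifts the reading frame from the next codon onward, which is compatible with the first altered residue being reported at position 671; whether residue 670 itself is unchanged depends on the actual sequence, which is not examined here.

```python
def codon_of(cds_position):
    """Return (codon number, position within the codon, 1 to 3)
    for a 1-based coding-sequence position."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, position_in_codon, ref_base, alt_base):
    """Apply a single-base substitution to a codon after checking that
    the reference base matches."""
    if codon[position_in_codon - 1] != ref_base:
        raise ValueError("reference base does not match codon")
    return codon[:position_in_codon - 1] + alt_base + codon[position_in_codon:]


ASPARAGINE_CODONS = {"AAT", "AAC"}      # standard genetic code
ASPARTIC_ACID_CODONS = {"GAT", "GAC"}   # standard genetic code

print(codon_of(5335))   # COL5A1 c.5335A>G
print(codon_of(2010))   # COL1A1 c.2010delT
print(sorted(substitute(c, 1, "A", "G") for c in ASPARAGINE_CODONS))
```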
In conclusion, in a family with OI, we identified a COL1A1 mutation known to be responsible for OI, and further identified a novel second mutation in the COL5A1 C-terminal propeptide domain in patients with OI and EDS. These results suggest that a combination of COL5A1 and COL1A1 mutations might lead to compound phenotypes of OI and EDS. To the best of our knowledge, this is the first report showing that members of a family with both COL1A1 and COL5A1 mutations present with a hybrid phenotype of OI and EDS. In addition, our results support the conclusion that COL1A1 (c.2010delT) can result in OI. Whole exome sequencing can help us to understand the basis of diseases with compound phenotypes and provide a theoretical foundation for genetic counseling.