Comprehensive functional network analysis and screening of deleterious pathogenic variants in non-syndromic hearing loss causative genes

Abstract Hearing loss (HL) is a significant public health problem and the most frequent congenital disability in developed societies. Genetic analysis of non-syndromic hearing loss (NSHL) can complement the many diagnostic modalities already available. The present study explores additional target genes, and their non-synonymous single nucleotide polymorphisms (nsSNPs), involved in the development of NSHL. Functional network analysis and a variant study were carried out on a gene pool retrieved from research articles published over the last decade. The network analyses were performed with STRING. The NSHL causative genes were classified according to the predicted biological processes, and several variant analysis tools were used to identify deleterious nsSNPs. Among the predicted pathogenic nsSNPs, rs80356586 (I515T), rs80356596 (L1011P), and rs80356606 (P1987R) in OTOF have previously been reported in NSHL. In contrast, rs121909642 (P722S) and rs267606805 (P722H) in FGFR1, and rs121918506 (E565A) and rs121918509 (A628T, A629T) in FGFR2, have not yet been reported in NSHL and should be evaluated clinically; their association with NSHL would therefore be novel. The findings and the analyzed data offer insight into the genetic pathogenesis of NSHL and might be used for diagnostic and prognostic purposes in non-syndromic congenitally deaf children.


Introduction
Hearing impairment ranks highest among age-standardized disabilities globally [1]. It affects nearly 1 in every 1000 live births worldwide [2]. Hearing impairment affects speech development and language acquisition and hinders children's education [3]. The causes of hearing loss (HL) can broadly be classified as conductive, sensorineural, and mixed HL. HL of predominantly genetic etiology usually presents early in life without any additional clinical phenotypes.
Seventy percent of neonates with HL are presumed to have inherited HL classified as non-syndromic hearing loss (NSHL), which is not associated with other distinguishing physical findings [1]. NSHL generally follows simple Mendelian inheritance: 75-80% of cases are autosomal recessive, 20% autosomal dominant, 2-5% X-linked, and the remaining 1% are due to mitochondrial mutations [1,2,4]. Fifty percent of congenital sensorineural HL is hereditary, caused by mutations in a single gene or in a combination of multiple genes [3].
The explosion of genetic information and advances in technology have radically deepened the understanding of inherited diseases. In NSHL, establishing genetic correlations is a significant challenge because of wide clinical and genetic heterogeneity. Because of this genetic diversity, the broad subsets of mutated genes associated with the initial development and progression of HL are often indistinguishable [4].
Management options include surgical treatment of craniofacial abnormalities, hearing aids, and cochlear implants, depending on the degree and type of HL. Wider use of genetic testing, however, can improve the understanding of the pathophysiology and molecular mechanisms underlying HL and support the development of new treatments [5]. It also enables earlier detection of HL, so that intervention can begin early, with better outcomes for the development of speech and hearing.
Improved diagnostic, prognostic, and therapeutic options are the potential translational outcomes of a systematic elucidation of NSHL genes [4]. Many gene-encoded proteins are expected to be involved in hearing function because the inner ear and the hearing mechanism are structurally very complex [11]. The present study therefore explores target genes other than GJB2, and the most deleterious mutations in them, that might lead to genetic NSHL, through a systematic review of the literature from the last decade (2009-2020) and in silico analyses such as functional network analysis and a variant study.

Filtering of data
The curated articles were screened against inclusion and exclusion criteria to satisfy the aim and objectives of the present work.

Inclusion criteria
All the available full-length original research articles and case reports were included. The following criteria were followed for selecting the articles:

Network analysis
The network analysis was carried out on the screened unique genes, using the STRING database (https://string-db.org/) [12], to analyze the functional and physical associations between them. A high confidence score of 0.700 was used to build the network between the genes.
The linked genes were grouped according to their involvement in the major NSHL-associated biological processes reported by STRING. Functional protein-protein interaction (PPI) networks were then built again, at the same high confidence score of 0.700, between the genes grouped under each biological process. The highly interacting co-expressed genes were analyzed and carried forward to the variant study.
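The thresholding and hub-ranking step described above can be illustrated with a minimal Python sketch. It is not the STRING implementation: it assumes the interactions have already been exported as (gene, gene, combined score) triples with scores on a 0-1 scale, and the function name `hub_genes`, the toy edge list, and its scores are hypothetical values chosen only for illustration.

```python
from collections import Counter


def hub_genes(edges, min_score=0.700, top_n=5):
    """Rank genes by their number of distinct partners among edges
    whose score is at or above min_score (assumed 0-1 scale)."""
    partners = {}
    for gene_a, gene_b, score in edges:
        # Drop low-confidence edges and self-loops.
        if score < min_score or gene_a == gene_b:
            continue
        partners.setdefault(gene_a, set()).add(gene_b)
        partners.setdefault(gene_b, set()).add(gene_a)
    degree = Counter({gene: len(p) for gene, p in partners.items()})
    # Highest degree first; ties broken alphabetically for stable output.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy edges; the scores are invented for illustration only.
toy_edges = [
    ("MYO7A", "CDH23", 0.95),
    ("MYO7A", "PCDH15", 0.92),
    ("CDH23", "PCDH15", 0.99),
    ("MYO7A", "SOX2", 0.40),  # below the 0.700 cutoff, so ignored
    ("CDH23", "USH1C", 0.90),
]

ranking = hub_genes(toy_edges)
print(ranking)
```

Counting distinct partners rather than raw edge rows keeps the degree stable if an exported file lists the same pair in both directions.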

Article selection and gene sorting
The PRISMA guideline-based search methodology identified a total of 14216 articles from the PubMed (3172), JSTOR (209), ScienceDirect (10827), and Cochrane (7) databases using the selected MeSH terms. Only 787 unique studies out of the 14216 satisfied all the inclusion criteria and were shortlisted for further study (Figure 1). A total of 2707 genes were collected from the 787 original research articles and subjected to duplicate removal, which resulted in 423 unique genes. Among these, mitochondrial RNA (5), DNA (1), and reported miRNAs (7) were removed, and the remaining 382 genes were taken as the final unique target genes for network analysis.

Network analysis
All 382 NSHL-associated target genes were subjected to network analysis through the STRING database, which built a strong functional association network between the genes with 255 nodes, 571 edges, and a significant PPI enrichment P-value of <1.0e-16. At the high confidence score of 0.700, UBC (28 interactions) was the most highly interacting gene (Figure 2), followed by CDH23 (20 interactions), SOX2 (19 interactions), MYO7A (17 interactions), PCDH15 (15 interactions), etc. The biological processes returned by STRING were grouped into four categories on the basis of important processes associated with HL (Table 1): Group I, ear development (ear morphogenesis, inner ear morphogenesis, inner ear receptor cell development, inner ear auditory receptor cell differentiation, inner ear receptor cell stereocilium organization, auditory receptor cell morphogenesis, auditory receptor cell stereocilium organization, cochlea morphogenesis, ear development, outer ear morphogenesis, middle ear morphogenesis, inner ear receptor cell differentiation, cochlea development, vestibulocochlear nerve development, vestibulocochlear nerve formation, auditory receptor cell development, inner ear development, auditory receptor cell fate commitment); Group II, ion transport (ion transport, cell junction organization, regulation of ion transmembrane transport, cell junction assembly, cell-cell signaling, gap junction assembly, regulation of potassium ion transmembrane transport, potassium ion transmembrane transport, chemical homeostasis, sodium ion transmembrane transport, regulation of cell junction assembly, ion transmembrane transport); Group III, sensory organ development (sensory organ development, sensory organ morphogenesis, sensory system development); and Group IV, sensory signaling pathways (sensory perception of sound, sensory perception, detection of mechanical stimulus involved in sensory perception of sound, response to auditory stimulus).
Groups I, II, III, and IV contained 60, 113, 68, and 95 genes, respectively. Of these, 62 genes were uniquely involved in group II, 2 genes in group III, and 32 genes in group IV, whereas no gene was unique to group I. In addition, 15 genes were shared by groups I, III, and IV; 7 genes by groups I, II, and III; 1 gene by groups II, III, and IV; 23 genes by groups II and IV; 14 genes by groups I and III; and 5 genes by groups II and III. Lastly, 15 genes were common to all four groups of biological processes analyzed in the present study (Table 1 and Figure 3; Supplementary File S1).
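The partition of genes into group-specific and shared sets described above amounts to computing the exclusive regions of a four-set Venn diagram. The following Python sketch shows one way to do this under stated assumptions: the gene names and group memberships are hypothetical placeholders rather than the study's data, and the helper names `exclusive_regions` and `group_size_from_regions` are illustrative.

```python
def exclusive_regions(groups):
    """Map each combination of group labels to the genes that occur
    in exactly those groups and in no others."""
    membership = {}
    for label, genes in groups.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(label)
    regions = {}
    for gene, labels in membership.items():
        regions.setdefault(tuple(sorted(labels)), set()).add(gene)
    return regions


def group_size_from_regions(regions, label):
    """Rebuild a group's total size by summing every region that includes it."""
    return sum(len(genes) for combo, genes in regions.items() if label in combo)


# Hypothetical placeholder genes, used only to illustrate the bookkeeping.
toy_groups = {
    "I": {"GENE_A", "GENE_B"},
    "II": {"GENE_B", "GENE_C", "GENE_D"},
    "III": {"GENE_A", "GENE_B"},
    "IV": {"GENE_B", "GENE_D", "GENE_E"},
}

regions = exclusive_regions(toy_groups)
for combo in sorted(regions):
    print(combo, sorted(regions[combo]))
```

Because every gene falls into exactly one region, summing the regions that include a given group should reproduce that group's total, which gives a simple consistency check on reported overlap counts.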
The genes associated with these biological processes were then processed again in the network analysis, generating one network for each of the four groups at high confidence (0.700). The PPI network for the ear development group was built with 40 nodes, 72 edges, and a P-value of <1.0e-16; SOX2 (14 interactions) was the most highly interacting gene, followed by MYO7A and PCDH15 (10 interactions each) (Figure 4). The ion transport group produced a PPI network with 86 nodes, 128 edges, and a significant P-value of <1.0e-16, in which HGF and UBC (10 interactions each) were the most highly interacting genes, followed by CDH23 and RAC1 (9 interactions each) (Figure 5). For the sensory organ development group, SOX2 (14 interactions) was the most highly interacting gene, followed by MYO7A and PCDH15 (10 interactions each), in a PPI network built with 45 nodes, 75 edges, and a significant P-value of <1.0e-16 (Figure 6). Lastly, for the sensory signaling group, the PPI network was built with 54 nodes, 128 edges, and a significant P-value of <1.0e-16; CDH23 (16 interactions) was the most highly interacting gene, followed by MYO7A (14 interactions) (Figure 7).

Screening of deleterious SNPs
All the deleterious nsSNPs identified in OTOF, FGFR1, and FGFR2 have been reported in the UniProt and ClinVar databases. However, only the OTOF nsSNPs rs80356586 (Ile 515 Thr), rs80356596 (Leu 1011 Pro), and rs80356606 (Pro 1987 Arg) have been reported specifically in NSHL. The FGFR1 nsSNP rs121909642 (Pro 722 Ser) has been reported in hypogonadotropic hypogonadism 2 with anosmia, whereas rs267606805 (Pro 722 His) has not been reported in any disease condition. Likewise, among the FGFR2 nsSNPs, rs121918506 (Glu 565 Ala) has been reported in Pfeiffer syndrome and craniosynostosis syndrome, and rs121918509 (Ala 628 Thr, Ala 629 Thr) has been found in LADD syndrome.

Discussion
Congenital NSHL has long been attributed to genetic mutations, and its genetics need further exploration to explain its genetic diversity. According to previously reported data, GJB2 is the target gene most frequently associated with NSHL. In the GJB2 mutant condition, the formation of functional gap junction channels is defective [18]. Researchers have reported the role of GJB2 mutations in the pathogenesis of NSHL through different experimental analyses [6,19-23]. Although GJB2 mutations are the most common worldwide across different populations, other important genes, such as SLC26A4, GJB3, GJB6, MYO15A, MYO7A, TMC1, and CDH23, have also been identified in the pathogenesis of NSHL [24-28]. This study reviewed all the target genes identified experimentally in NSHL between 2009 and 2020, studied the interactions between them through functional network analysis, and sought to identify possible deleterious variants. Congenital HL may be due to defects in the outer or middle ear (conductive) or in the inner ear (sensorineural) [18]. Sensorineural HL involves multiple mechanisms in which genetic defects lead to abnormal cellular processes. The associations between the identified target genes were analyzed on the basis of biological processes in the functional network obtained at high confidence through STRING. This provided an opportunity to identify further vital genes and their underlying mechanisms in ear development and morphogenesis (dysfunction of the ear and/or cochlea; abnormality in the structural morphogenesis or transformation of the ear) [18,29], ion homeostasis/transport [30], and the auditory sensory system/sensory signaling [31], which contribute to disease pathology. The systems biology approach offers the advantages of gene regulatory network analysis for understanding the interactive roles of genes in disease pathogenesis [32].
The gene-gene interaction network describes the close connections between the genes in a particular pathway, which makes the research objectives easier to interpret and validate [33]. Previous in silico studies have reported other genes as hubs in PPI networks; for example, Fan et al. (2014) identified TMPRSS3 (interacting with GJB2, SLC26A4, MYO7A, and DFNB59) as the most highly interacting target gene in NSHL through a network analysis of 98 genes using STRING 9.0 [34].
Likewise, Lebeko et al. (2017) reported MYO7A (interacting with MYO6, KCTD3, NUMA1, MYH9, KCNQ1, UBC, DIAPH1, PSMC2, and RDX) as the most highly interacting central hub gene in a PPI network of 116 HL genes generated through the Enrichr and PANTHER databases [35]. In another study, network analysis was performed with the Ingenuity Pathway Analysis (IPA) software on three groups of HL genes: (i) a non-syndromic group (63 genes), (ii) a syndromic or non-syndromic group (107 genes), and (iii) an otic capsule development and malformation group (112 genes). TGFB1 (35 connections) was the central node of the network in the first group (NSHL genes), and the MAP kinases MAPK3/MAPK1 (33 gene connections) formed the central node in the second group (both syndromic and NSHL genes) [23]. The present study explored gene network analysis through the STRING database with 382 genes involved in NSHL at a high confidence score, and the highly interacting HL target genes UBC, HGF, CDH23, RAC1, SOX2, MYO7A, and PCDH15, identified across the four biological process groups, were chosen as hub genes in the NSHL target gene panel. In addition to these hub genes, co-expressed target genes are also important in disease pathogenesis. Co-expression networks can identify possible gene pairs, regulatory genes, similar gene matrices, etc., in disease conditions, indicating which target genes are simultaneously active during disease progression [36]. Thus, the present study also selected the associated co-expressed genes ACTB, OTOF, ATP2B2, EYA1, FGFR1, FGFR2, POU3F4, CHD7, and SALL1, along with the target hub genes, across all possible functional groups of hearing impairment.
Inner ear dysfunction has been reported to be a relatively common consequence of human genetic mutation [37]. Among such mutations, nsSNPs play a vital role in damaging or modifying protein-coding sites, consequently affecting the structure and function of the protein [38]. When the most functionally interactive genes were analyzed with in silico tools based on scoring algorithms, possible pathogenic non-synonymous variants were identified only in OTOF, FGFR1, and FGFR2. The expression of these three genes and their mutational effects in the progression of NSHL have been reported over the last decade. Some pathogenic (Arg 798 X, Gly 829 X, Leu 391 Arg, Glu 747 X, Arg 425 X, Tyr 474 X, Trp 717 X, Tyr 1064 X, Gln 1072 X, Arg 1856 Gln, Arg 1172 Gln) and likely pathogenic (Pro 489 Ser, His 513 Arg, Arg 1583 His, Arg 1792 Cys, Arg 1792 His) variants in OTOF have been reported in autosomal recessive NSHL cases in populations from Texas, Qatar, and Japan [9,39,40].
The pathogenic nsSNPs for all the highly interacting and co-expressed genes were collected from dbSNP and analyzed with SIFT, PredictSNP1, and PredictSNP2. Among the analyzed nsSNPs, only rs80356586, rs80356596, and rs80356606 in OTOF, rs121909642 in FGFR1, and rs121918506 and rs121918509 in FGFR2 were identified as deleterious by all the prediction algorithms. These variants can therefore be regarded as the most deleterious SNPs in the respective genes, which might lead to NSHL. OTOF is responsible for the composition of ribbon synaptic vesicles in cochlear inner hair cells, and mutations in OTOF account for 2-3% of NSHL [9,41]. The association of OTOF with NSHL has been studied experimentally in immortalized lymphoblastoid cell lines, inner hair cells (IHCs), and human embryonic kidney (HEK) cells [42,43]. In this study, the most deleterious nsSNPs found in OTOF, rs80356586 (Ile 515 Thr) [44], rs80356596 (Leu 1011 Pro) [45,46], and rs80356606 (Pro 1987 Arg) [47], are reported in the UniProt and ClinVar datasets for NSHL cases. Of these three variants, two were identified in the Turkish population (Ile 515 Thr, Leu 1011 Pro) and one in a northern Lebanese population (Pro 1987 Arg).
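The consensus rule applied above, retaining only variants called deleterious by every prediction tool, can be sketched in a few lines of Python. This is an illustrative sketch rather than the pipeline used in the study: it assumes each tool's output has already been reduced to a "deleterious" or "neutral" label per variant, and the variant identifiers, the labels, and the function name `consensus_deleterious` are hypothetical.

```python
def consensus_deleterious(predictions, tools):
    """Return, in sorted order, the variants that every listed tool labels
    'deleterious'. A variant with a missing call for any tool is excluded."""
    return sorted(
        variant
        for variant, calls in predictions.items()
        if all(calls.get(tool) == "deleterious" for tool in tools)
    )


TOOLS = ("SIFT", "PredictSNP1", "PredictSNP2")

# Hypothetical placeholder variants and calls, for illustration only.
toy_predictions = {
    "variant_1": {"SIFT": "deleterious", "PredictSNP1": "deleterious",
                  "PredictSNP2": "deleterious"},
    "variant_2": {"SIFT": "deleterious", "PredictSNP1": "neutral",
                  "PredictSNP2": "deleterious"},
    "variant_3": {"SIFT": "deleterious", "PredictSNP1": "deleterious"},
}

consensus = consensus_deleterious(toy_predictions, TOOLS)
print(consensus)  # ['variant_1']
```

Treating a missing call as a failure to reach consensus is the conservative choice here; a majority-vote rule would be a different design and would retain more variants.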
The involvement of FGFR1 in the development of the auditory sensory epithelium has been reported in in vitro studies on mice [48]. FGFR1 and FGFR2 have also been included in a reference gene panel used earlier for the genomic diagnosis of NSHL cases in the Spanish population [42]. No reports have described either the role of FGFR1 and FGFR2 or the presently predicted nsSNPs rs121909642 (Pro 722 Ser) of FGFR1 and rs121918506 (Glu 565 Ala) and rs121918509 (Ala 628 Thr, Ala 629 Thr) of FGFR2 in NSHL in humans.
These deleterious variants might have structural and functional effects on the respective proteins, leading to NSHL. Thus, the variants identified in the respective genes could be considered potential targets for NSHL after clinical validation.

Conclusion
Genetic counselling remains crucial for patients with NSHL. Advances in in silico tools and techniques, including GWAS and NGS technologies, are excellent resources for the research community in the present and