Bioinformatics analysis and genetic polymorphisms in the genomic region of the bovine SH2B2 gene and their associations with molecular breeding for body size traits in Qinchuan beef cattle

Abstract The Src homology 2 B 2 (SH2B2) gene regulates energy balance and body weight, at least partially by enhancing Janus kinase-2 (JAK2)-mediated cytokine signaling, including leptin and/or growth hormone (GH) signaling. Leptin is an adipose hormone that controls body weight. The objective of the present study was to evaluate the association between body measurement traits and SH2B2 gene polymorphisms as candidate causal mutations. For this purpose, we selected four single-nucleotide polymorphisms (SNPs) in the SH2B2 gene, including two in intron 5 (g.20545A>G and g.20570G>A), one synonymous SNP in exon 6 (g.20693T>C), and one in intron 8 (g.24070C>A), and genotyped them in Qinchuan cattle. The SNPs in the sampled population were at a medium polymorphism level (0.250 < PIC < 0.500). Association analysis indicated that g.20570G>A, g.20693T>C, and g.24070C>A were significantly (P < 0.05) associated with body length (BL) and chest circumference (CC) in Qinchuan cattle. In addition, the H4H3 and H5H5 diplotypes had highly significantly (P < 0.01) greater body length (BL), rump length (RL), and chest circumference (CC) than H4H2. Our investigation not only extends the spectrum of genetic variation of the bovine SH2B2 gene, but also provides useful information for marker-assisted selection in beef cattle breeding programs.


Introduction
Selective breeding is used to obtain long-term improvement in growth and in carcass characteristics of economic importance, but efficient genetic gain is difficult to achieve with traditional breeding methods because of the long periods required to finish progeny and obtain performance information [1,2]. Marker-assisted selection (MAS) is a powerful and efficient way to improve desirable traits [3,4]. Genes involved in meat quality traits or body measurements of production animals can be identified on the basis of their biological function [5,6]. Qinchuan cattle, used in this research, are an indigenous Chinese breed known for good meat quality, adaptability to farming systems, and desirable physical features [7][8][9]. It would therefore be valuable to understand the biological function of genes associated with carcass characteristics and body or growth traits [10]. In livestock breeding, body measurement and meat quality traits are used to assess the economic value of animals. Many genes have been shown to be related to meat production [11], growth [12], and meat quality traits [13]. The SH2B family has three members (SH2B1, SH2B2, and SH2B3) that contain conserved dimerization (DD), pleckstrin homology (PH), and SH2 domains. SH2B2, previously categorized as an adapter protein with a PH and SH2 domain (APS), is a member of the Src homology 2 B (SH2B) family and has a conserved structure consisting of an N-terminal dimerization domain (DD), a central pleckstrin homology (PH) domain, and a C-terminal Src homology 2 (SH2) domain [14]. SH2B2 may regulate energy balance and body weight partially by enhancing Janus kinase-2 (JAK2)-mediated cytokine signaling, including leptin and growth hormone signaling. In cultured cells, SH2B2 binds via its SH2 domain to JAK2, potentiating JAK2 activation [15,16], and also binds to the insulin receptor, promoting the insulin signaling pathway [17,18].
Moreover, SH2B2 did not affect insulin receptor numbers or insulin receptor turnover either in vivo or in vitro; however, SH2B2 increased insulin sensitivity in mice [19]. SH2B2 therefore mediates the insulin-stimulated activation of the c-Cbl/CAP/TC10 pathway, which appears to play an important role in regulating glucose uptake in cultured adipocytes [20]. SH2B2 is expressed in multiple tissues, including targets of insulin, GH, and leptin (e.g. the brain, adipose tissue, and skeletal muscle) [14,21,22]. On the other hand, SH2B2 is known as a negative regulator of B-cell proliferation [23,24]. In the mRNA expression analysis we performed in Qinchuan beef cattle, we found high expression of SH2B2 not only in fat but also in kidney and other splanchnic tissues, which might influence animal traits. Thus, we hypothesized that SH2B2 might be associated with conformation and carcass traits in beef cattle.
Information on the association of bovine SH2B2 genotypes with body measurement traits in Qinchuan cattle has been lacking. The present study was therefore designed to identify polymorphisms of SH2B2 in 468 Chinese Qinchuan cattle, to analyze tissue expression patterns by real-time PCR, and to establish correlations between bovine SH2B2 gene mutations and body measurements, in order to identify associated quantitative traits for the benefit of cattle breeding and genetics.

Bioinformatics analyses
Bioinformatics techniques were used to measure the degree of conservation and the biological evolution of the SH2B2 protein in different species, with Jalview (http://www.jalview.org/). For the analysis of protein structure and function, motifs were searched and conserved domains were identified through the online MEME Suite website [25].

Feeding and management of Qinchuan cattle and phenotypic data collection
A total of 468 non-pregnant female Qinchuan cattle maintained at the experimental farm of the National Beef Cattle Improvement Research Centre, Yangling, China, were selected for this study. All experimental animals were between 18 and 24 months of age and were randomly selected from Qinchuan cattle breeding populations. The animals were fed a total mixed ration (TMR) containing 25% concentrate and 75% roughage (dry straw and corn silage), and water was offered ad libitum. Feeding was based on NRC standards (Nutrient Requirements of Beef Cattle) [26]. Moreover, all animals were kept under a uniform management system in the same shed environment (i.e. temperature and humidity). Animals were stunned with a captive bolt and slaughtered by exsanguination; the collected samples were then snap-frozen in liquid nitrogen for tissue RNA isolation. All samples were stored at −80 °C until subsequent analyses.

Primer design and PCR conditions
The primers to amplify the bovine SH2B2 gene were designed based on the NCBI database sequence (GenBank accession number NC_037352.1), CDS region of ∼1949 kb, using Primer v5.0 software (PREMIER Biosoft International, California, U.S.A.). First, we mixed the 468 DNA samples in equal molar ratio to constitute a DNA pool [27]. Then, DNA from the 468 Qinchuan cattle was subjected to PCR. Primers, annealing temperatures, regions, and fragment sizes are shown in Table 1. PCR was carried out in a total volume of 20 μl containing 50 ng DNA, 10 pM of each primer, 0.20 mM dNTP, 2.5 mM MgCl2, and 0.5 U Taq DNA polymerase (TaKaRa, Dalian, China). The PCR protocol was 5 min at 95 °C; 35 cycles of 30 s at 94 °C, 35 s at the corresponding annealing temperature, and 40 s at 72 °C; and a final extension step at 72 °C for 10 min. The products were detected by electrophoresis in 0.8% agarose gels containing 0.5 μg/ml ethidium bromide. The PCR products were sequenced by Sangon (Shanghai, China) to screen for polymorphisms. All sequences were checked using SeqMan software (DNASTAR, Inc., U.S.A.), and the SNPs were identified.

Figure 4. SH2B2 Protein Sequences (Multiple sequence alignment) of 11 species
The conserved properties are marked with different background shading: blue, 100%; gray with blue, 80%; gray with yellow, 60%; and white, not conserved.

Figure 5. Conserved structural motifs of 11 species
The P-value shows the significance of the motif site. The length of the color block shows the position, strength, and significance of a particular motif site. The length of a motif site is proportional to the negative logarithm of its P-value. The colors were assigned by the motif analysis performed with the MEME Suite.

Analysis of mRNA relative expression and real-time PCR
Eight tissue specimens (muscle, rumen, fat, abomasum, heart, spleen, kidney, and small intestine) were collected from three female Qinchuan cattle aged 18 months (n = 3). After collection, samples were preserved in liquid nitrogen and transferred immediately in frozen form to the molecular laboratory for the extraction of total RNA. Total RNA was extracted from each tissue sample using TRIzol™ Reagent (Invitrogen, ThermoFisher Scientific, Inc., U.S.A.) and the Trizol reagent kit (TIANGEN, China), and was subjected to reverse transcription (RT) to obtain the corresponding cDNA (TaKaRa, Dalian, China). Data were normalized to GAPDH (GenBank accession no. NM_001034034), used as the endogenous control gene. The primers used are given in Table 1. Real-time quantitative PCR was performed on the ABI 7500 RT-PCR system (Applied Biosystems, NY, U.S.A.) with TB Green Premix Ex Taq II (Takara, Kusatsu, Japan), and relative expression was calculated using the 2^−ΔΔCt method [28].
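The relative quantification step can be illustrated with a minimal sketch, assuming the standard comparative Ct form (2^−ΔΔCt) with GAPDH as the reference gene. The function name and all Ct values below are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene by the comparative Ct (2^-ddCt) approach.

    ct_target / ct_reference: Ct of the target gene (e.g. SH2B2) and the
    reference gene (e.g. GAPDH) in the tissue of interest.
    ct_target_cal / ct_reference_cal: the same two Ct values in the
    calibrator sample (the tissue chosen as the baseline).
    """
    delta_ct_sample = ct_target - ct_reference              # dCt in the sample
    delta_ct_calibrator = ct_target_cal - ct_reference_cal  # dCt in the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator  # ddCt
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier,
# relative to GAPDH, in the sample than in the calibrator.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0
```

The sketch assumes roughly 100% amplification efficiency for both primer pairs, which is the usual assumption behind the base of 2.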

Statistical analysis
Genotype and allele frequencies of the four SNPs were determined, and Hardy-Weinberg equilibrium (HWE) was tested with the χ2 test in PopGene software [29]. Linkage disequilibrium (LD), expressed as D′ and r2, was evaluated with HAPLOVIEW (version 3.32) (Barrett 2005). Other population genetic parameters, such as gene heterozygosity (He) and polymorphism information content (PIC), were calculated according to established methods [30]. Haplotype data were analyzed with the online tool SHEsis [31,32]. Associations between SNP genotypes and body measurement traits were analyzed with the GLM procedure in SPSS software (version 13.0) using the following model: $Y_{ij} = \mu + G_i + A_j + E_{ij}$, where $Y_{ij}$ is the trait measured on each individual, $\mu$ is the overall population mean for the trait, $G_i$ is the fixed effect of genotype, $A_j$ is the fixed effect of age, and $E_{ij}$ is the random residual error.
The mean relative mRNA expression level of the SH2B2 gene in different tissues and at different age groups was analyzed by ANOVA using SAS software (version 8.1).
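For a single biallelic SNP, the population parameters named above can be computed directly from genotype counts. The following is a minimal sketch, not the PopGene implementation: it assumes the textbook formulas (He = 1 − Σp², PIC = He − 2p²q² for two alleles, and a 1-df χ2 test with a critical value of 3.841 at P = 0.05). The function names, the low/high PIC cut-offs outside the 0.250 to 0.500 medium band, and the genotype counts are illustrative assumptions.

```python
def snp_summary(n_aa, n_ab, n_bb):
    """Allele frequencies, HWE chi-square, He and PIC for one biallelic SNP.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped animals")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    he = 1.0 - (p * p + q * q)        # expected heterozygosity
    pic = he - 2.0 * p * p * q * q    # PIC, two-allele case
    return {"p": p, "q": q, "chi2": chi2, "He": he, "PIC": pic,
            "in_hwe": chi2 < 3.841}   # 1 df, P = 0.05


def pic_level(pic):
    """Conventional classification; only the medium band is stated in the text."""
    if pic < 0.25:
        return "low"
    if pic < 0.5:
        return "medium"
    return "high"


# Hypothetical counts in exact Hardy-Weinberg proportions.
stats = snp_summary(25, 50, 25)
print(stats["chi2"], stats["He"], stats["PIC"], pic_level(stats["PIC"]))
```

Note that for two alleles PIC cannot exceed 0.375 (reached at p = q = 0.5), so a biallelic SNP can be at most "medium" under this classification.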

Polymorphisms and genetic diversity
Four polymorphic sites in the SH2B2 gene (snp1 g.20545A>G, snp2 g.20570G>A, snp3 g.20693T>C, and snp4 g.24070C>A) were identified by sequencing. Genotype and allele frequencies for the four loci are shown in Table 2. The A allele of g.20545A>G, g.20570G>A, and g.24070C>A, and the T allele of g.20693T>C, were predominant at the four SNPs. The PIC value is an effective index for assessing the genetic diversity of different loci of a candidate gene. Our results showed that these SNPs were at a medium polymorphism level (0.250 < PIC < 0.500). By χ2 test, the genotypic distributions of g.20570G>A and g.20693T>C differed significantly from Hardy-Weinberg equilibrium (P < 0.05) (Table 3). Genetic parameters, including genotype and allele frequencies, were calculated from a total of 468 Qinchuan cattle.

LD and haplotype analysis
Two indicators are most commonly used to describe linkage disequilibrium (LD): D′ and r2. Researchers generally agree that r2 is the more commonly used for pairwise measurement of LD, as it is considered less sensitive to allele frequencies than D′ [33,34].
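As an illustration of how the two indicators are defined, the sketch below computes D′ and r2 for one pair of biallelic loci from a haplotype frequency and the two allele frequencies. It uses the standard textbook definitions rather than the HAPLOVIEW implementation; the function name and the frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Pairwise LD between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    Returns (D_prime, r_squared).
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the alleles at the two loci.
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


# Hypothetical frequencies: allele B only ever occurs together with allele A.
d_prime, r_squared = ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2)
print(round(d_prime, 3), round(r_squared, 3))  # 1.0 0.25
```

The example shows that the two measures can diverge for the same pair of loci, which is why studies typically report both.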

Effects of single markers/ haplotype combinations on growth traits in Qinchuan cattle
In this study, the four polymorphisms appear to mainly affect bovine body measurement traits (Table 5). At the g.20570G>A locus, individuals with genotype GG had higher values for BL and CC than those with GA (P < 0.05). At the g.20693T>C locus, genotype CC had higher mean values for BL and CC than genotype TT (P < 0.05). At the g.24070C>A locus, significant differences in BL, RL, and CC were observed between the CC and AA genotypes (P < 0.05). No significant associations were observed for the remaining traits at the four SNPs. The combined effects of the four SNPs were evaluated in Table 6. The H4H3 and H5H5 diplotypes had highly significantly greater BL, RL, and CC than H4H2 (P < 0.05); similar results were found between H4H3 and H1H1 (P < 0.05).

SH2B2 gene expression profile
Results for SH2B2 relative expression levels in each tissue are shown in Figure 2A,B. SH2B2 has a wide distribution in the bovine tissues examined, with the highest expression in small intestine, muscle, and fat. The mRNA expression levels in abomasum, rumen, and spleen were the second highest. SH2B2 was expressed only slightly in heart and kidney. There could be both direct and indirect relationships between body size and metabolism due to physiological modulation by SH2B2. We also analyzed the expression level of SH2B2 in bovine preadipocytes and adipocytes at different time points (Figure 6B). The expression level of SH2B2 in differentiated adipocytes was decreased on day 2 (D2) compared with day 0 (D0) preadipocytes. Interestingly, we found an increasing trend in SH2B2 expression from D2 to day 10 of adipocyte differentiation.

Biological evolution and conservation of SH2B2
The SH2B2 gene is located on chromosome 25 of the bovine genome. The total length of SH2B2 is 25296 bp, spanning genomic coordinates 34677735 to 34703030 (NC_037352.1, reference genome Bos taurus ARS-UCD1.2). The gene comprises 11 exons; the ORF, from the start codon to the stop codon, is 2040 bp, and the putative protein contains 679 amino acids (Figure 3A). The predicted interaction network of SH2B2 with other genes shows 67.64% physical interactions, while co-expression, co-localization, and shared protein domains account for 13.50%, 6.17%, and 0.59%, respectively (Figure 3B). In the multiple sequence alignment, SH2B2 protein sequences from 11 species were aligned. The conserved properties are marked with different background shading: blue, 100%; gray with blue, 80%; gray with yellow, 60%; and white, not conserved (Figure 4). The MEME Suite online tool was used to find common significant motifs in the super-secondary protein structure of SH2B2 in the 11 target species (Figure 5). We found many similar structures between bovine SH2B2 and that of other species. The secondary structure of the bovine SH2B2 protein was predicted using the Protean program in DNASTAR 6.0 software. The online tool SWISS-MODEL was used to predict the tertiary structure of the protein, and the α-helix, β-sheet, β-turn, coil, and other structures of the SH2B2 protein were predicted. As shown in Figure 6, comparative genomics of the SH2B2 gene was searched through the Ensembl database (ensembl.org/Bos taurus). Genomic alignment showed a total of 521 genes, with 454 speciation nodes, 35 duplications, and 31 ambiguous genes. The SH2B2 of cattle and goat had the closest phylogeny, whereas the SH2B2 of elephant and hagfish was much more distant from the bovine branch of the phylogenetic tree (Figure 7). The SH2B2 domain hits were found not to be conserved in mice, while for the rest of the species all domain hits were conserved. In total, 10 significant motifs were found among the 11 species (Figure 8), which indicates functional similarity among the selected species at the level of protein super-secondary structure.
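The gene length and protein length quoted above follow from simple arithmetic, sketched here as a consistency check. The helper names are hypothetical; the only assumptions are that the genomic coordinates are 1-based and inclusive and that the ORF length includes the stop codon.

```python
def span_length(start, end):
    """Length in bp of a 1-based, inclusive genomic interval."""
    return end - start + 1


def protein_length(orf_bp):
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the stop codon encodes no residue


print(span_length(34677735, 34703030))  # 25296
print(protein_length(2040))             # 679
```

Both results agree with the figures reported for bovine SH2B2 (25296 bp and 679 amino acids).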

Discussion
Body measurement and carcass quality traits are used to assess the value of animals. Loin muscle area and intramuscular fat content are the key indicators in meat quality grading. These traits are affected mostly by the age of the animals, by management conditions such as nutrition, and by genetics. Selective breeding is an effective strategy for sustainable improvement in these economically important traits, but efficient genetic gain takes a very long time because of the long generation interval in cattle. The candidate gene strategy is an efficient tool for measuring associations between genetic polymorphisms and traits of economic importance in marker-assisted selection [1].
In the present study, four SNPs (snp1 g.20545A>G, snp2 g.20570G>A, snp3 g.20693T>C, and snp4 g.24070C>A) detected in the bovine SH2B2 gene coding sequence (CDS) region possibly affect body measurement traits (BMTs) and meat quality traits (MQTs). To reveal the linkage relationships among these four SNPs, the linkage disequilibrium (LD) between the four sites was estimated; the r2 values ranged from 0.000 to 0.343. Based on the D′ and r2 values, three closely linked loci were revealed in the Qinchuan breed. According to earlier research, LD is considered strong if the value of r2 is over 0.33 [41]. Our results revealed strong linkage between g.20693T>C and g.24070C>A, whereas the other linkages, with pairwise r2 < 0.33, were weak.
In the present study, we found significant associations of g.20570G>A and g.24070C>A genotypes with body measurement and carcass quality traits. Both g.20570G>A and g.24070C>A are located in intronic regions and do not change the structure of the encoded protein, but our results demonstrated that they were still associated with several growth traits. Such associations may result from linkage disequilibrium between these SNPs and other genes on the same chromosome that have a significant effect on the growth traits studied here [42]. Another possible reason is that mutations within introns can affect the splice donor site or nearby regions, as well as regulatory motifs within introns [43].
Thus, we further analyzed the association between the combined genotypes above and growth traits in cattle. Haplotypes composed of SNPs can provide more accurate information than single-marker analysis for economic trait associations, owing to the ancestral structure captured in the distribution of haplotypes. Hap1 (-AATC-) had the highest haplotype frequency (33.70%). The probable cause could be artificial selection in the Qinchuan cattle population, particularly on genomic regions influencing traits of economic importance [44,45].
Moreover, to further explore the function of the SH2B2 gene in the growth and development of Qinchuan cattle, mRNA expression was investigated in different tissues and in adipocytes of Qinchuan cattle. The highest expression was found in small intestine, muscle, and fat. These findings point to a role of SH2B2 in metabolism, growth, and development, which is supported by previously published literature showing that SH2B2 is a positive regulator of energy and glucose metabolism [46]. In addition, we found high expression of SH2B2 in the proliferation stage of preadipocytes, a slight decrease on day 2 of differentiation, and then an increasing trend from day 2 to day 10 of adipocyte differentiation. These findings suggest a role of SH2B2 in the proliferation and differentiation of bovine adipocytes in Qinchuan cattle. Our results are in line with previously published findings [14]. Similarly, a previous study reported that g.1220C>T and g.21049C>T showed significant associations with body weight, average daily gain, body height, body length, and hucklebone width of Nanyang cattle at different ages [47].

Conclusion
In conclusion, association analysis between SH2B2 gene polymorphisms and body measurement traits indicated that g.20570G>A, g.20693T>C, and g.24070C>A were significantly associated with growth traits in Qinchuan cattle. In addition, the H4H3 and H5H5 diplotypes had highly significantly (P < 0.01) greater body length (BL), rump length (RL), and chest circumference (CC) than H4H2. Our investigation not only extends the spectrum of genetic variation of the bovine SH2B2 gene, but also provides useful information for marker-assisted selection in beef cattle breeding programs.