Effects of drought stress on global gene expression profile in leaf and root samples of Dongxiang wild rice (Oryza rufipogon)

Drought is a serious constraint to rice production throughout the world, and although Dongxiang wild rice (Oryza rufipogon, DXWR) possesses a high degree of drought resistance, the underlying mechanisms of this trait remains unclear. In the present study, cDNA libraries were constructed from the leaf and root tissues of drought-stressed and untreated DXWR seedlings, and transcriptome sequencing was performed with the goal of elucidating the molecular mechanisms involved in drought-stress response. The results indicated that 11231 transcripts were differentially expressed in the leaves (4040 up-regulated and 7191 down-regulated) and 7025 transcripts were differentially expressed in the roots (3097 up-regulated and 3928 down-regulated). Among these differentially expressed genes (DEGs), the detection of many transcriptional factors and functional genes demonstrated that multiple regulatory pathways were involved in drought resistance. Meanwhile, the DEGs were also annotated with gene ontology (GO) terms and key pathways via functional classification and Kyoto Encyclopedia of Gene and Genomes (KEGG) pathway mapping, respectively. A set of the most interesting candidate genes was then identified by combining the DEGs with previously identified drought-resistant quantitative trait loci (QTL). The present work provides abundant genomic information for functional dissection of the drought resistance of DXWR, and findings will further help the current understanding of the biological regulatory mechanisms of drought resistance in plants and facilitate the breeding of new drought-resistant rice cultivars.


Introduction
Drought, which remains one of the most important natural hazards facing our world today, affects agricultural productivity and can potentially reduce global crop yield by up to 20% annually [1,2]. Among the world's most important food crops, rice (Oryza sativa) serves as the staple food for more than half the world's population, and as a typical semiaquatic plant, the crop is able to grow well in waterlogged soil [3]. However, rice is highly sensitive to drought, and when rice plants are exposed to drought stress, they respond by activating a series of complicated regulatory mechanisms [4].
In rice, drought resistance is a complex trait that is regulated by polygenes or gene complexes, and recent advances in molecular biology have facilitated the isolation and identification of increasing numbers of these genes [4,5]. Such genes have been reported to encode a variety of proteins, including transcription factors (TFs), protein kinases, enzymes related to plant hormone synthesis and other regulatory and functional proteins [6][7][8]. However, the regulatory mechanisms of drought resistance have not been fully understood. Thus, a better understanding of drought-resistance mechanisms would be helpful for breeding drought-resistant rice cultivars, in order to stabilize rice production.
Our previous studies demonstrated that an accession of wild rice (Oryza rufipogon) collected from Dongxiang County, Jiangxi Province, China (Dongxiang wild rice, i.e. DXWR) is found in more northern habitats (28 • 14 N) than other common wild rice species in China and even in the world [9]. DXWR possesses strong drought resistance, with an ability to survive under extreme drought conditions [10,11]. However, the molecular mechanism and regulatory network of the accession's drought resistance remain unclear, and to date, only 12 quantitative trait loci (QTLs) related to drought resistance in DXWR have been identified [12].
Gene expression profiling can accelerate the progress toward a comprehensive understanding of the molecular mechanisms that control responses to environmental stress. Recently, high-throughput sequencing technology has been widely applied to investigate drought stress-induced transcription in a variety of plants, including coffee [13], cassava [14], sorghum [15], peanut [16], and maize [17], but the transcription profile of drought-stressed DXWR has yet to be reported. Therefore, the main objective of the present study was to investigate the expression profiles of drought-stressed DXWR and to uncover candidate genes by combining differentially expressed genes (DEGs) with previously identified drought resistance-related QTL intervals. The results will provide new insights into the molecular mechanism of drought resistance and will accelerate the utilization of genetic resources from DXWR for drought-resistant breeding in rice.

Plant materials, growth conditions, and drought treatment
DXWR is conserved ex situ at the Jiangxi Academy of Agricultural Sciences (JXAAS, http://www.jxaas.com/), Nanchang, China, and the seeds of DXWR are freely available for scientific research. In the present study, the plants were grown and the drought treatments performed following the methods described in the previous paper [11]. Briefly, the uniformly germinated seeds of DXWR were grown in a plastic pot in a plant growth chamber at day/night temperature of 30  [18]. At the four-leaf stage, the half of the seedlings were air dried in the growth chamber until the leaves appeared wrinkled for drought-stress treatment, whereas, the other half of the seedlings were allowed to continue growth in the IRRI nutrient solution for control treatment (Supplementary Figure S1). Leaf (leaves with dry treatment (LD)) and root (roots with dry treatment (RD)) tissues from the drought-stressed group and leaf (leaves without dry treatment (LCK)) and root (roots without dry treatment (RCK)) tissues from the control group were collected and immediately frozen in liquid nitrogen. For each group and tissue type, samples were collected from ten plants and mixed, in order to minimize the effect of transcriptome unevenness among plants.

RNA extraction, cDNA library preparation, and transcriptome sequencing
Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, U.S.A.), according to the manufacturer's instructions, and the quality and quantity of the resulting RNA were examined using agarose gel electrophoresis and an ND-1000 spectrophotometer (NanoDrop Technology, Wilmington, DE, U.S.A.). Next, cDNA libraries were constructed from the RNA of the four tissue samples (LD, RD, LCK, and RCK) and sequenced by the Beijing Genomics Institute (BGI, Shenzhen, China). Briefly, mRNA was isolated from the total RNA using magnetic beads and fragmented into short sequences that were used as templates for cDNA synthesis. After end repair, adapter ligation, and PCR amplification, paired-end cDNA libraries were constructed and sequenced using an Illumina HiSeq TM 2000 sequencing platform.

Reads filtration and functional annotation of unigenes
High-quality clean reads were obtained by removing adapter sequences from the raw reads and by excluding sequences that contained >10% unknown bases and >50% low-quality bases with a Q-value <20 [19]. The retained high-quality reads were mapped to the Nipponbare reference genome by TopHat software (-N, 2-read gap length, 3-read edit distance, 3-read realign edit distance, 0-report secondary alignments coverage search, microexon-search library type, fr-unstranded, b2-sensitive) [20,21], and a RABT (Reference Annotation Based Transcript) assembly was generated from the aligned reads using Cufflinks [22]. For functional annotation, all the unigenes were searched against the NCBI Nt (https://www.ncbi.nlm.nih.gov/) and Nr (http://www.ncbi.nlm.nih.gov/) databases, as well as   [23]. The Blast2GO program was used to assign gene ontology (GO) terms to the unigenes from the leaf and root tissues, using a P-value cutoff of 0.05 [24].

Identification of differentially expressed unigenes
The expression of each gene was calculated by quantitating the reads according to the RPKM (reads per kilobase per million reads) method [25]. An FDR (false discovery rate) of 0.001 and absolute log 2 RPKM ratio of 1 were used as significance thresholds for differences in gene expression [26].

Validation of transcriptome sequencing
To confirm the gene expression data, quantitative real-time PCR (qRT-PCR) was performed on 30 randomly selected DEGs (15 up-regulated and 15 down-regulated). The qRT-PCR was implemented using the SYBR Premix Ex Taq Kit on a Chromo 4 real-time PCR detection system (Bio-Rad, Hercules, CA, U.S.A.) and the primers are listed in Supplementary Table S1. The relative expression values were calculated using the 2 − C t method, with OsActin1 as the internal control [27]. All reactions were performed using one biological sample and three technical replicates.

Transcriptome sequencing statistics
In total, 46.8, 48.7, 43.6, and 46.7 million high-quality paired-end reads were generated by Illumina sequencing the LCK, LD, RCK, and RD cDNA libraries, respectively (Table 1). Now, it is well-accepted that cultivated rice was domesticated from common wild rice thousands of years ago [28]. Furthermore, Xie et al. [9] revealed that DXWR was genetically closer to Oryza sativa ssp. japonica than indica. The Nipponbare (O. sativa ssp. japonica) genome has been completely sequenced with Sanger sequencing technology and is ranked as the best assembled and annotated one out of all rice genomes [29,30]. Therefore, in the present study, we used the Nipponbare genome as a reference to map the reads. The alignment results indicated that 70.22-75.39% (67.85-73.17% uniquely matched) of the total reads were mapped to the Nipponbare reference genome and 57.95-63.34% (35.90-38.57% uniquely matched) were mapped to the gene regions (Tables 1 and 2). Meanwhile, there was a significant difference in the percentage of reads LD, leaves with dry treatment; RD, roots with dry treatment; LCK, leaves without dry treatment; RCK, roots without dry treatment.
mapped to the genome and gene regions, especially for the uniquely mapped reads. These results were similar to those of a previous transcriptome sequencing study of Sorghum propinquum, which reported that 72 and 46% of the total reads mapped to the genome and gene regions, respectively, with 68 and 38% of the reads being uniquely matched [31]. This suggested that it might result from reads mapping to intergenic regions and alternative mRNA splicing. Meanwhile, 24.61-29.78% of the DXWR reads remained unmapped, probably due to gaps in the reference genome sequences and diversity between the DXWR and Nipponbare genomes ( Table 1). On an average, at least 50% of more than 60% of the mapped genes were covered by uniquely mapped reads, and only approximately 15% of the genes had coverage of 20% or less (Supplementary Figure S2), which indicated that the transcriptome sequences were of high quality.

Analysis of DEGs
Based on the number of reads, gene expression levels can be estimated from Illumina sequencing with great accuracy. Putative DEGs from the drought-stressed and control samples (LD compared with LCK and RD compared with RCK) were identified, and 4040 and 7191 transcripts were up- (Supplementary Table S2) and down-regulated (Supplementary Table S3), respectively, in the LD sample, when compared with the LCK sample, whereas 3097 and 3928 transcripts were up- (Supplementary Table S4) and down-regulated (Supplementary Table S5), respectively, in the RD sample, when compared with the RCK sample ( Figure 1). Among these DEGs, 1519 and 1459 transcripts were up-(Supplementary Table S6) and down-regulated (Supplementary Table S7), respectively, in both the LD and RD samples, when compared with the control samples (LCK and RCK). Interestingly, 223 transcripts were up-regulated in the LD sample but down-regulated in the RD sample (Supplementary Table S8), and 262 transcripts were down-regulated in the LD sample but up-regulated in the RD sample (Supplementary Table S9).
Quantitative RT-PCR, which was performed to validate the expression data of 30 randomly selected DEGs, yielded results that were highly consistent with those of transcriptome sequencing (R 2 =0.9159) (Figure 2). Thus, the DEGs detected in the present study can be considered to have a high accuracy.
Meanwhile, many DEGs have been previously identified and reported to play roles in the drought stress responses of cultivated rice (Table 3). TFs, for example, have been reported to play essential roles in abiotic stress responses by regulating a large spectrum of downstream stress-responsive genes. Among TF families, the basic leucine zipper (bZIP) family is one of the largest and diverse groups. In plants, bZIP TFs are involved in multiple biological processes that include flower development, seed germination, light signaling, and hormone signaling. In addition, bZIPs are involved in responses to abiotic and biotic stresses [32]. Chen et al. [33] reported that the expression of OsbZIP16 was dramatically induced under drought conditions and that overexpressing OsbZIP16 in the japonica rice cultivar Zhonghua 11 significantly improved drought resistance at both the seedling and tillering stages. Similarly, Xiang et al. [34] reported that the expression of OsbZIP23 was strongly induced by a wide spectrum of stresses, including drought, salt, abscisic acid (ABA), and PEG treatments. Furthermore, the overexpression of OsbZIP23 in the japonica rice cultivar Zhonghua 11 significantly improved resistance to drought and high salinity stresses. In the present study, the expression of both OsbZIP16 and OsbZIP23 were also significantly induced under drought condition in both leaves and roots, which is consistent with previous reports. In addition to OsbZIP16 and OsbZIP23, the present study also identified six other bZIP TFs, i.e. OsbZIP4 (LOC Os01g36220), OsbZIP17 (LOC Os02g10140),  OsbZIP25 (LOC Os03g03550), OsbZIP48 (LOC Os06g39960), OsbZIP62 (LOC Os07g48660), and OsbZIP68 (LOC Os08g43090). Among them, OsbZIP25, OsbZIP48, and OsbZIP68 were up-regulated in both the leaves and roots of drought-stressed plants, whereas OsbZIP17 and OsbZIP62 were down-regulated; and OsbZIP4 was up-regulated in the leaves of drought-stressed plants but down-regulated in the roots.
Members of another large plant-specific TF superfamily, the NAC TFs, were also identified as DEGs in the present study. Several NAC TFs have previously been demonstrated to function as important regulators of stress responses in cultivated rice. For example, Hu et al. [35] reported that the expression of SNAC1 was predominantly induced in guard cells by drought stress and that overexpression of SNAC1 in the japonica cultivar Nipponbare significantly enhanced drought resistance in the field, and in the same cultivar, Takasaki et al. [36] reported that the expression of OsNAC5 was induced by drought, cold, high salinity, ABA, and methyl jasmonic acid, which subsequently enhanced the cultivar's resistance to drought and high salinity. In addition, Nakashima et al. [37] suggested that the expression of  the Nipponbare cultivar's resistance to drought and high salinity, and Jeong et al. [38] reported that the expression of OsNAC10 was induced by drought, high salinity, and ABA and that the overexpression of OsNAC10 increased the resistance of Nipponbare plants to drought, high salinity, and low temperature. The present study confirmed that drought stress induces the expression SNAC1, OsNAC5, OsNAC6, and OsNAC10 and found that drought stress also affected the expression of two other NAC TFs: LOC Os01g60020, which was up-regulated in both leaves and roots, and LOC Os01g50360, which was up-regulated in leaves but down-regulated in roots. The present study also found that drought stress affects many other TF families, including the MYB, zinc finger, whirly, and WRKY TFs, a finding that is consistent with the opinion that the TFs play important roles in drought resistance in plants [39,40].

Functional classification by GO
In the present study, a total of 29350 leaf transcripts in LD compared with LCK and 26656 root transcripts in RD compared with RCK were assigned GO terms. Among the 29350 leaf transcripts, 10360 transcripts were annotated for their cellular component, 9719 were annotated for their molecular function, and 9271 were annotated for their biological process ( Figure 3A). Among the 26656 transcripts from the root samples, 9073 transcripts were annotated for their cellular component, 9046 were annotated for their molecular function, and 8537 were annotated for their biological process ( Figure 3B). Within the biological process category, cellular and metabolic processes were the most highly represented groups, which suggests that extensive metabolic activities were taking place both in the leaves and roots of the drought-stressed DXWR plants. Meanwhile, within the cellular component category, transcripts that corresponded to cell, cell parts, and cell organelles were the most abundant; and binding and catalytic activities were the most abundant groups within the molecular function category.
We also identified biological process GO terms that were over-represented (P<0.05) among the DEGs of LD compared with LCK and RD compared with RCK, respectively (Supplementary Tables S10 and S11). These terms served as indicators of significant biological processes underlying the specific drought-stress responses of leaves and roots. Both sets of transcripts were enriched in GO terms for response to wounding, negative regulation of ABA-mediated signaling pathway, response to inorganic substance, negative regulation of response to alcohol, and response to oxygen-containing compound ( Table 4), which suggests that the same biological processes might be involved in the drought responses of both the tissues.
Meanwhile, in regard to the over-representation of cellular component terms (P<0.05) among the LD, LCK, RD, and RCK samples (Supplementary Tables S12-S15, respectively), we found chloroplast thylakoid, chloroplast thylakoid membrane, thylakoid, plastid thylakoid, thylakoid membrane, thylakoid part, and plastid thylakoid membrane were enriched in both sets of transcripts from each tissue (Supplementary Table S16). This is consistent with the opinion that the thylakoid is one of the most important cellular components for the normal growth and development of plants and that it is also involved in drought resistance [41,42]. Tian et al. [43] reported that the stay-green wheat mutant tasg1 exhibited greater functional stability of its thylakoid membrane proteins than the wild-type parent and that tasg1 mutants maintained higher Hill activity, actual efficiency ( PSII ), maximal photochemical efficiency of PSII (F v /F m ), and Ca 2+ -ATPase and Mg 2+ -ATPase activities under drought stress, thus conferring improved drought resistance.
For the molecular function category, endopeptidase inhibitor activity, endopeptidase regulator activity, and serine-type endopeptidase inhibitor activity were enriched in both tissues (Supplementary Table S17). This suggested that the expression of some stress-responsive genes might be regulated by post-translational modification. In addition, iron ion binding and electron carrier activity were also enriched in both tissues, which is consistent with recent studies and suggests that these DEGs play key roles in mitigating the harmful effects of drought stress [44,45].

KEGG pathway mapping
The KEGG pathway based analysis indicated that 7056 of the 11231 leaf DEGs and 4115 of the 7025 root DEGs could be classified into 20 functional categories and 128 and 126 subcategories, respectively. Furthermore, the over-represented KEGG orthology (KO) terms (Q-value <0.05) could be classified into 10 and 12 categories, respectively (Figure 4), and the most common KO terms represented by both the leaf and root DEGs included global map, translation, lipid metabolism, signal transduction, biosynthesis of other secondary metabolites, metabolism of terpenoids and polyketides, carbohydrate metabolism, and amino acid metabolism.
The over-represented KO terms for the leaf and root DEGs were further classified into 23 and 28 subcategories, respectively (Supplementary Tables S18 and S19). Among these subcategories, 11 subcategories were over-represented among both the leaf and root DEGs (Supplementary Figure S3), including tryptophan metabolism; RNA transport; plant hormone signal transduction; mRNA surveillance pathway; limonene and pinene degradation; isoflavonoid biosynthesis; flavone and flavonol biosynthesis; cutin, suberine, and wax biosynthesis; biosynthesis of secondary metabolites; benzoxazinoid biosynthesis and stilbenoid, and diarylheptanoid and gingerol biosynthesis. Among these terms, RNA transport and the mRNA surveillance pathway are related to transcription. Thus, the pathways might modulate the regulation of drought-induced gene expression.
Furthermore, 17 KO terms were exclusively among the root DEGs and 12 KO terms were exclusively enriched among the leaf DEGs. This finding suggests that there could be considerable differences in the biochemical and physiological processes involved in the drought responses of leaves and roots, and these annotations provide a valuable resource for investigating the specific processes, functions, and pathways involved in such differences.

DEGs and previously identified drought resistance related QTL intervals
Until now, only 12 drought resistance-related QTLs had been identified in DXWR [12]. These were located on chromosomes 2, 4, 5, 6, 8, 9, and 12; and among the 12 QTLs, qSDT2-1 and qSDT12-2 had been identified as the two most significant QTLs for drought resistance, explaining up to 7 and 14% of the phenotypic variance and possessing positive additive effects of 1.2 and 1.38, respectively [12]. Furthermore, qSDT2-1 was mapped within 15.9 cM on chromosome 2, and qSDT12-2 was mapped within 6.7 cM on chromosome 12 [12].
In the present study, a total of 146 and 81 DEGs were co-localized within the qSDT2-1 and qSDT12-2 intervals, respectively (Supplementary Table S20). To identify the most interesting candidate genes, the DEGs with absolute log 2 RPKM ratio values of 2 (in at least one tissue) were isolated for further analysis. This screening reduced the total number of DEGs in the qSDT2-1 interval to 71, which included 42 leaf DEGs (16 up-regulated and 26 down-regulated), 22 root DEGs (8 up-regulated and 14 down-regulated), and 7 DEGs that were differentially expressed in both leaves and roots (5 up-regulated and 2 down-regulated; Supplementary Table S20). Similarly, screening reduced the total number of DEGs in the qSDT12-2 interval to 43, which included 24 leaf DEGs (5 up-regulated and 19 down-regulated), 15 root DEGs (5 up-regulated and 10 down-regulated), 3 DEGs that were differentially expressed in both leaves and roots (2 up-regulated and 1 down-regulated), and 1 DEG that exhibited a contrasting expression pattern, being up-regulated in leaves and down-regulated in roots (Supplementary Table S20).
Many of these DEGs are involved in the resistance of plants to abiotic stress. For example, Cui et al. [46] reported that OsSGL (LOC Os02g04130), which encodes a DUF1645 domain containing protein, can be induced by multiple stresses, and the over-or hetero-expression of OsSGL have been reported to significantly improve the drought resistance of transgenic rice and Arabidopsis thaliana, respectively. Here, in the present study, the expression of LOC Os02g04130 was also significantly up-regulated by drought stress, in both leaves and roots. Previous studies have also suggested that the genes encoding 9-cis-epoxycarotenoid dioxygenase play important roles in regulating endogenous ABA levels and contributes to drought stress resistance [47][48][49]. Indeed, the present study observed that the expression of LOC Os12g42280, which putatively encodes 9-cis-epoxycarotenoid dioxygenase 1, was up-regulated in both leaves and roots under drought stress. Furthermore, the ubiquitin system plays a key role in plant biology by interfering with key components of important pathways, including those associated with abiotic stress, immunity, and hormonal signaling [50]. The present study also found that LOC Os02g06640, which putatively encodes a ubiquitin family protein, was significantly up-regulated in leaves under drought stress.
In addition, protein kinases, which constitute one of the largest gene families in plants, play critical roles in the regulation of cellular pathways, signal transduction, and plant development, and many protein kinases are involved in the drought-stress responses [51]. For example, overexpression of an Arabidopsis cysteine-rich receptor-like protein kinase, CRK5, was reported to enhance ABA sensitivity and confer drought resistance [52]. In the present study, eight protein kinase-encoding genes were significantly affected by drought stress: LOC Os02g04240, LOC Os02g05820, LOC Os02g06930, LOC Os02g08530, LOC Os12g41090, LOC Os12g41510, LOC Os12g42040, and LOC Os12g42070. Transposons and retrotransposon, which are also important upstream regulators of plant drought stress responses [53,54], were identified by the present study, as well (LOC Os02g04150, LOC Os02g04540, LOC Os02g05550, LOC Os02g07270, LOC Os02g07370, LOC Os02g07380, LOC Os02g07400, and LOC Os02g07570), as were various other types of TFs, including zinc finger TFs (LOC Os02g07930, LOC Os02g08510, and LOC Os12g42970), bZip TFs (LOC Os02g05640 and LOC Os12g40920), and MYB TFs (LOC Os02g04640 and LOC Os02g07170). These co-localized and drought-affected DEGs identified in the present study provide a basis for future studies aiming to elucidate the molecular mechanisms of drought resistance in rice.

Conclusion
The present study investigated the superior drought resistance of DXWR, which is highly valuable for genetic research and the breeding of drought resistance in rice. This work presented an original transcriptome analysis of drought-stressed DXWR leaf and root tissues. A large number of DEGs were identified; and several key pathways, including those for RNA transport, mRNA surveillance, secondary metabolite biosynthesis, and plant hormone signal transduction, and various processes, including endopeptidase inhibitor activity, serine-type endopeptidase inhibitor activity, iron ion binding, and electron carrier activity were identified as involved in drought resistance. By combining the DEGs identified in the present study with previously identified drought-resistance QTLs from DXWR, the most interesting candidate genes were identified, including various TFs, protein kinases, and other functional proteins. These findings will be useful in future studies of molecular adaptations to drought stress and will facilitate the genetic manipulation of rice to improve its drought resistance.