Which sample type is better for Xpert MTB/RIF to diagnose adult and pediatric pulmonary tuberculosis?

Abstract Objective: This review aimed to identify proper respiratory-related sample types for adult and pediatric pulmonary tuberculosis (PTB), respectively, by comparing performance of Xpert MTB/RIF when using bronchoalveolar lavage (BAL), induced sputum (IS), expectorated sputum (ES), nasopharyngeal aspirates (NPAs), and gastric aspiration (GA) as sample. Methods: Articles were searched in Web of Science, PubMed, and Ovid from inception up to 29 June 2020. Pooled sensitivity and specificity were calculated, each with a 95% confidence interval (CI). Quality assessment and heterogeneity evaluation across included studies were performed. Results: A total of 50 articles were included. The respective sensitivity and specificity were 87% (95% CI: 0.84–0.89), 91% (95% CI: 0.90–0.92) and 95% (95% CI: 0.93–0.97) in the adult BAL group; 90% (95% CI: 0.88–0.91), 98% (95% CI: 0.97–0.98) and 97% (95% CI: 0.95–0.99) in the adult ES group; 86% (95% CI: 0.84–0.89) and 97% (95% CI: 0.96–0.98) in the adult IS group. Xpert MTB/RIF showed the sensitivity and specificity of 14% (95% CI: 0.10–0.19) and 99% (95% CI: 0.97–1.00) in the pediatric ES group; 80% (95% CI: 0.72–0.87) and 94% (95% CI: 0.92–0.95) in the pediatric GA group; 67% (95% CI: 0.62–0.72) and 99% (95% CI: 0.98–0.99) in the pediatric IS group; and 54% (95% CI: 0.43–0.64) and 99% (95% CI: 0.97–0.99) in the pediatric NPA group. The heterogeneity across included studies was deemed acceptable. Conclusion: Considering diagnostic accuracy, cost and sampling process, ES was a better choice than other sample types for diagnosing adult PTB, especially HIV-associated PTB. GA might be more suitable than other sample types for diagnosing pediatric PTB. The actual choice of sample types should also consider the needs of specific situations.


Authors
Mengyuan Lyu, Jian Zhou, Yuhui Cheng, Weelic Chong, Kang Wu, Teng Fang, Tianbo Fu, and Binwu Ying

This article is available at Jefferson Digital Commons: https://jdc.jefferson.edu/medfp/273

Introduction
Globally, there were 10 million new tuberculosis (TB) cases and 1.24 million deaths in 2018 alone [1]. A body of studies has confirmed that early diagnosis and treatment can prevent most TB deaths [2][3][4], and thus excellent diagnostic tools need to be developed. The Xpert MTB/RIF (Cepheid, Sunnyvale, CA, United States) was endorsed as a diagnostic test for use in TB endemic countries by the World Health Organization (WHO) in 2010 [5]. A systematic review of Xpert MTB/RIF studies has reported sensitivities of Xpert MTB/RIF ranging from 25 to 100% [6]. A study has shown that the sensitivity of Xpert MTB/RIF varies with the sample type, sample quality, and bacterial load of samples [7]. Thus, choosing proper sample types is critical to improving the diagnostic performance of Xpert MTB/RIF. Given the high prevalence of pulmonary TB (PTB) [8], we focused on the selection of respiratory-related sample types to better diagnose PTB.
The use of urine and stool for diagnosing PTB has been reported [9,10]. However, because high TB burden areas are usually economically underdeveloped, Xpert MTB/RIF is costly, and the detection rate in non-respiratory samples is relatively low [11], respiratory-related samples remain the first choice for detecting Mycobacterium tuberculosis (MTB) in clinical practice, at least for now. Expectorated sputum (ES), induced sputum (IS), bronchoalveolar lavage (BAL) fluid, gastric aspiration (GA), and nasopharyngeal aspirates (NPAs) are considered suitable samples to detect PTB and are regularly collected in clinical practice. ES is readily available but has a low bacterial load. IS, sputum induced by the inhalation of hypertonic saline, usually has a larger sample volume and higher quality than ES [12]. BAL has been widely accepted as the most specific sample type for accurate and timely diagnosis of lung diseases [13,14]. The WHO supports the collection of BAL to diagnose PTB when possible [15]. However, high cost, invasive sampling and the potential risk of hemorrhage, pneumothorax, laryngospasm and other adverse reactions limit its application [16]. It is also worth noting that, in clinical practice, bronchoscopy and lavage are not considered feasible in young children because of the potential risks of anesthesia and the need for specialized procedural expertise, unless otherwise specified [17]. NPA is sampled after stimulation of the cough reflex by inserting a small feeding catheter into the nasopharynx [18]. Low operational requirements, less invasive sampling, and ready access make NPA an alternative to BAL for pediatric PTB, but not for adult PTB, because the airway microbial composition differs significantly between NPA and BAL [17].
Young children, particularly those under 5 years, are unable to expectorate sputum and tend to swallow it instead; thus GA is collected through a nasogastric tube on three consecutive mornings to detect MTB in pediatric PTB [19,20]. Nonetheless, MTB is less likely to be detected in GA than in smear in adults, and thus GA is not regarded as an option for detecting MTB in adult PTB [21]. Sufficient volume and consecutive samples of GA allow repeat tests to improve the detection rate [22].
Although there are many sample types to choose from, they vary greatly in cost, requirements for operators and operating environment, degree of sampling invasiveness, quality, volume, and other respects. It is hard to assess which is the best choice for diagnosing PTB. In addition, few studies have assessed the impact of different sample types on the performance of Xpert MTB/RIF when diagnosing PTB. Therefore, we undertook this systematic review and meta-analysis to compare the detection capacity of Xpert MTB/RIF when using ES, IS, and BAL as samples for diagnosing adult PTB, as well as ES, IS, GA, and NPA for diagnosing pediatric PTB. We aimed to summarize and curate reliable evidence for choosing the proper sample types to diagnose PTB patients of different ages.

Inclusion and exclusion criteria
Included studies needed to: (i) be cross-sectional studies, cohort studies, or randomized controlled trials; (ii) focus on PTB patients with or without other diseases; (iii) use two diagnostic methods, Xpert MTB/RIF and culture, with culture as the reference standard; (iv) collect GA or NPA from children, BAL from adults, or IS or ES from both; (v) state clearly whether participants were adults or children. Exclusion criteria were: (i) reviews, case reports, letters, conference abstracts, and animal experiments; (ii) articles not written in English; (iii) articles with incomplete basic data.

Quality assessment
Two independent reviewers assessed the quality of included studies according to the Revised Tool for Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 [23]. Both risk of bias and applicability concerns were rated as 'high', 'unclear' or 'low'. A third reviewer was consulted if necessary.

Data extraction and management
All data were collected and entered into Microsoft Excel version 2016. The basic information mainly comprised the first author, publication year, study population's characteristics, and so on. Diagnostic information included true-positive, false-positive, true-negative, and false-negative. If necessary, we contacted the authors for more essential information, otherwise the trials with incomplete information would be excluded.
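As an illustration of how the four extracted counts relate to the accuracy measures used later, the following minimal sketch computes sensitivity and specificity for a single study, with culture as the reference standard. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for one study: Xpert MTB/RIF result versus culture (reference)."""
    tp: int  # Xpert positive, culture positive
    fp: int  # Xpert positive, culture negative
    tn: int  # Xpert negative, culture negative
    fn: int  # Xpert negative, culture positive

    def sensitivity(self) -> float:
        # Proportion of culture-positive patients detected by Xpert MTB/RIF.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of culture-negative patients with a negative Xpert result.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only.
study = TwoByTwo(tp=45, fp=3, tn=97, fn=5)
print(f"sensitivity = {study.sensitivity():.2f}")
print(f"specificity = {study.specificity():.2f}")
```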

Heterogeneity
Heterogeneity could arise from a threshold effect or from non-threshold effects. Heterogeneity caused by the threshold effect was explored by calculating the Spearman correlation coefficient or by plotting summary receiver operating characteristic (sROC) curves. A threshold effect was considered present if the P-value of the Spearman correlation coefficient was less than 0.05 or the points in the sROC plot showed a curvilinear ('shoulder-arm') pattern. If the Chi-square P-value was less than 0.10, heterogeneity might be caused by a non-threshold effect and a random-effects model was chosen.
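The threshold-effect check can be sketched as follows. This assumes, as is conventional for this check, that the Spearman coefficient is taken between the logit of the true-positive rate and the logit of the false-positive rate across studies. The text does not state which variables were correlated, so that choice, the 0.5 continuity correction, the function names and the counts are all assumptions. Only the coefficient is computed here; the P-value would come from a statistics package.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined if either input is constant


def spearman_threshold(studies):
    """studies: list of (tp, fp, tn, fn). Returns Spearman's rho between
    logit(TPR) and logit(FPR), using a 0.5 continuity correction (assumed)."""
    tpr = [logit((tp + 0.5) / (tp + fn + 1.0)) for tp, fp, tn, fn in studies]
    fpr = [logit((fp + 0.5) / (fp + tn + 1.0)) for tp, fp, tn, fn in studies]
    # Spearman's rho is Pearson's r computed on the ranks.
    return pearson(ranks(tpr), ranks(fpr))


# Hypothetical counts, for illustration only.
studies = [(45, 3, 97, 5), (30, 1, 80, 10), (60, 8, 120, 4),
           (25, 2, 70, 9), (50, 5, 90, 6)]
rho = spearman_threshold(studies)
print(f"Spearman rho = {rho:.3f}")
```

A strong positive coefficient with a P-value below 0.05 would point to a threshold effect; a weak or non-significant one, as reported for the groups in this review, would not.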

Sensitivity analysis
Sensitivity analysis was performed to determine the robustness of the pooled estimates by removing low-quality studies.
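The idea can be sketched as recomputing a pooled estimate with and without the studies flagged as low quality and comparing the two. The pooling by simple summation of counts, the `low_quality` flag and the data below are illustrative assumptions, not the procedure or data of this review.

```python
def pooled_sensitivity(studies):
    """Pool by summing counts across studies (a simplifying assumption)."""
    tp = sum(s["tp"] for s in studies)
    fn = sum(s["fn"] for s in studies)
    return tp / (tp + fn)


# Hypothetical studies; 'low_quality' stands in for a QUADAS-2 based judgment.
studies = [
    {"name": "A", "tp": 45, "fn": 5, "low_quality": False},
    {"name": "B", "tp": 30, "fn": 10, "low_quality": True},
    {"name": "C", "tp": 60, "fn": 4, "low_quality": False},
]

all_studies = pooled_sensitivity(studies)
kept = [s for s in studies if not s["low_quality"]]
without_low_quality = pooled_sensitivity(kept)

print(f"all studies:          {all_studies:.3f}")
print(f"low-quality removed:  {without_low_quality:.3f}")
```

If the two estimates are close, the pooled result can be regarded as robust to the exclusion of low-quality studies.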

Statistical analysis
The diagnostic performance of Xpert MTB/RIF was assessed by calculating the pooled sensitivity, specificity, and area under the curve (AUC), with a 95% confidence interval (CI). The forest plots and sROC were also plotted. We performed statistical analysis with Meta-DiSc 1.4 (XI Cochrane Colloquium, Barcelona, Spain) and Review Manager V.5.3 (The Cochrane Collaboration, Software Update, Oxford, U.K.).
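A pooled proportion with a 95% CI can be sketched as below. This is a simplified illustration that sums counts across studies and uses a Wilson score interval; the pooling and interval methods actually implemented in Meta-DiSc may differ, and the function name and counts are hypothetical.

```python
import math


def pooled_proportion(events, totals, z=1.96):
    """Pool counts across studies and return (estimate, lower, upper),
    using a Wilson score interval (an assumed choice of interval)."""
    x = sum(events)
    n = sum(totals)
    p = x / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return p, centre - half, centre + half


# Hypothetical per-study counts: true positives and culture-positive totals.
tp = [45, 30, 60]
culture_positive = [50, 40, 64]

sens, lower, upper = pooled_proportion(tp, culture_positive)
print(f"pooled sensitivity = {sens:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

The same function gives a pooled specificity when passed true negatives and culture-negative totals.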
The flow diagram of articles included in this meta-analysis is presented in Figure 1. The main characteristics of the 50 included articles are summarized in Table 1.

Risk of bias
The risk of bias of the included studies is shown in Table 2.
Among the 17 studies in the adult BAL group, 5 were assessed as low risk for patient selection and the rest as unclear risk. For the index test, 13 publications had a low risk and 4 had an unclear risk. For the reference standard and for flow and timing, all studies gave a clear description of the trial details, except for two [27,29]. A total of 14 studies had low concern for applicability; however, the concern for patient selection was unclear in 2 studies [37,72] and high in Ullah et al. [38].
Of the 14 studies in the adult ES group, 10 were assessed as low risk for patient selection, 3 as high risk and the remaining one as unclear risk. All of them showed low risk for the index test, reference standard, and flow and timing. Low concern for patient selection and index test appeared in all 14 publications. All studies were evaluated as low risk on reference standard except for two [40,51].
Of the seven studies in the adult IS group, three had low risk for patient selection, three had unclear risk and only one had high risk. Two studies [50,56], which did not state whether the results from the two methods were double-blind, were assessed as unclear risk for the index test and reference standard. In addition, Luetkemeyer et al. [45] had an unclear risk for the reference standard. All articles in this group were considered low risk on applicability concerns.
In the pediatric ES group, two studies showed low risk in risk of bias and applicability concerns.
In the pediatric GA group, two articles had low risk of patient selection while others were of unclear risk. Unclear risk of the index test and reference standard only appeared in one publication. Five trials had low risk of flow and timing and applicability concerns, and one [62] was evaluated as high risk for patient selection.
In the pediatric IS group, all studies had low risk of patient selection except two [59,68]. Das et al. [59], LaCourse et al. [64], and Nicol et al. [66] had unclear risk of the index test and reference standard. Five articles had low concern of applicability, while another two studies [64,69] were assessed as high risk of patient selection.
In the pediatric NPA group, two studies showed low risk of bias and applicability concerns.

Heterogeneity
The value of Spearman correlation coefficient showed that the threshold effects did not exist in the adult BAL group (Spearman correlation coefficient: 0.158; P=0.727), adult ES group (Spearman correlation coefficient: 0.251;

Results on diagnostic accuracy
The  Figures 3 and 4, respectively. Subgroup analysis was carried out to determine the potential source of heterogeneity among included articles. The detailed information is presented in Supplementary Table S1.

Discussion
Since we only included studies using Xpert MTB/RIF, all comparisons made were between the types of biological samples used. For adults, choosing ES as sample provided better diagnostic performance than BAL or IS. For children, GA showed superior diagnostic capacity to ES, IS, or NPA. Of note, GA also has its own disadvantages including invasive sampling and needing approximately three consecutive days for collecting samples. The actual choice needs to be decided according to specific situations.
It is well known that microbiological methods (including sputum microscopy and culture), immunological methods [interferon-γ release assay (IGRA) and T-cell spot (T-SPOT)] and molecular methods [polymerase chain reaction (PCR)] were widely applied worldwide before Xpert MTB/RIF came to market. The speed and convenience of sputum microscopy make it the most commonly used method in clinical practice; nonetheless, its low sensitivity (20-40%) is a serious drawback [73]. Culture, the reference standard for the diagnosis of TB, requires a long detection time and a highly equipped laboratory. Thus, the use of culture is greatly limited in some developing regions, which typically face a high TB burden. IGRA and T-SPOT are effective ways to screen for latent TB infection. However, Bacille Calmette-Guérin (BCG) vaccination, prior infection and inhalation of non-pathogenic mycobacteria can result in false-positive IGRA and T-SPOT results [74]. Thus, the clinical value of these two diagnostic methods is somewhat discounted in TB endemic areas. Although PCR is a relatively sensitive method to detect MTB, its high requirements for operator skill and laboratory environment also restrict its application. The emergence of Xpert MTB/RIF meets the urgent demand for a rapid, simple, integrated, and fully automated method of MTB detection. A myriad of randomized controlled trials (RCTs) have been carried out to assess the diagnostic performance of Xpert MTB/RIF and have found that the detection rate of MTB is dramatically improved by this novel molecular tool [75]. The apparent benefit of Xpert MTB/RIF to individual PTB patients and to public health has enabled this molecular tool to be rapidly accepted and applied all over the world.
However, Xpert MTB/RIF still shows inadequate sensitivity when diagnosing smear-negative PTB, pediatric PTB, HIV-associated PTB, and extra-PTB [76,77]. To increase diagnostic yield, the Xpert MTB/RIF Ultra (Ultra; Cepheid) was developed. Based on the same platform as the Xpert MTB/RIF, the Ultra incorporates improvements including two PCR amplification targets (IS6110 and IS1081) and melt curve analysis [78]. These two changes give Ultra a lower limit of detection than Xpert MTB/RIF and culture [78]. Subsequent clinical trials have found that improper sample types and poor sample quality still adversely affect the sensitivity of Ultra [79,80]. Clearly, no matter how sensitive the detection method is, choosing a proper sample type and ensuring sample quality are crucial.
In our systematic review, using ES as the sample provided better diagnostic performance (sensitivity, specificity, and AUC) than BAL or IS for adult PTB. Since both BAL and IS have a higher bacterial load than ES [81,82], they should theoretically provide better diagnostic performance than ES. However, our study showed the opposite. One reason is the smaller volume of IS, which ranges from 0.5 to 2 ml, whereas the volume of ES ranges from 1 ml to more than 2 ml. This inconsistency may also be explained by the ratio of processed samples, which was higher in the adult ES group than in the adult BAL group. In addition, we found high agreement between IS and BAL as samples for diagnosing adult PTB. Considering cost, sampling method, acceptance and other factors, IS might be a better choice than BAL to diagnose PTB. Blau et al. [83] offered an explanation for this result: sampling from only one or two parts of the lung leads to poor repeatability and a limited detection rate when BAL is used to diagnose infectious lung diseases. Garcia et al. [84] found that the gene transcriptional patterns of MTB in BAL and sputum were similar and, additionally, that sputum could reflect the physiological state of MTB in the lower airway of PTB patients. Conde et al. [85] conducted a trial and reconfirmed the higher diagnostic yield of IS (relative to BAL) as a sample for diagnosing PTB, in line with our results.
For pediatric PTB, specificity and AUC were similar for IS and GA; nonetheless, Xpert MTB/RIF showed much higher sensitivity with GA. Although GA is sampled invasively and needs approximately three consecutive days for sample collection, its lower probability of contamination, easy sample collection, and sufficient sample volume still make GA an ideal sample type for increasing the detection rate in pediatric PTB. Ruiz Jimenez et al. [86] appear to agree with our result that GA is an excellent sample for diagnosing PTB, with IS considered a supplementary sample type. However, Zar et al. [87] drew the opposite conclusion from a prospective study, favoring IS for diagnosing PTB with GA as a supplement. The differences may result from different diagnostic tools and variation in sample collection. Culture, used in their study [87], cannot detect dead MTB caused by the neutralization of GA [88], whereas Xpert MTB/RIF, used in our included studies, can detect both dead and live MTB and is not affected by sample neutralization. Furthermore, sample collection, technician skill level, the time from sampling to testing and the choice of pre-processing method also influence final accuracy [89]. Thus, the standardization of GA sampling deserves attention.
Subgroup analysis showed that, for adult PTB, diagnostic performance was superior in the smear-positive subgroup whether BAL, ES, or IS was used as the sample. Xpert MTB/RIF showed similar sensitivity with smear-positive BAL, smear-positive ES and smear-positive IS, while the highest AUC (90%) was observed when using BAL to diagnose smear-negative PTB. BAL is sampled directly from the lesion region, which makes it more suitable and reliable for detecting PTB in patients with low bacterial load [90,91]. In addition, Xpert MTB/RIF performed better when detecting MTB in IS or ES than in BAL for HIV-positive patients. Singer-Leshinsky et al. [92] obtained results consistent with ours, namely that collecting IS from HIV patients may be more beneficial than collecting other samples. Sample pre-processing steps, the number of samples, the immune status of the patient, and disease severity are potential confounding factors. Sound evidence supporting the value of ES in diagnosing HIV-associated adult PTB is still lacking, and this value needs to be further confirmed.
For pediatric PTB, we reconfirmed the strong association between smear status, bacterial load, and the diagnostic performance of Xpert MTB/RIF. In addition, testing for MTB in IS from children provided a small additional diagnostic yield from the perspective of specificity and AUC when diagnosing smear-negative PTB; however, Xpert MTB/RIF was more likely to detect MTB in GA from smear-negative children who truly had PTB. Maynard-Smith et al. [6] reported that detecting MTB in GA was a good choice for sputum-scarce PTB. However, they did not analyze the diagnostic accuracy of Xpert MTB/RIF for detecting MTB in IS from children or make further comparisons. The diagnostic value of GA and IS for smear-negative PTB in children still needs to be explored. In the pediatric IS group, subgroup analysis showed that the pooled sensitivity was very heterogeneous between the HIV-positive and HIV-negative subgroups, with higher sensitivity in the HIV-positive subgroup. This result was contrary to Connell et al. [93] but supported by Nicol et al. [65]. Smear status, disease severity and sample pre-processing may affect the results, and further research is warranted.

Strengths and limitations
Our study made a rigorous and comprehensive analysis and comparison about the diagnostic performance of Xpert MTB/RIF using different sample types for PTB patients with different ages by including as many studies as possible. Furthermore, we performed reasonable subgroup analysis to identify the source of heterogeneity and potential factors. However, the present study also has some limitations. Limited articles in the pediatric ES group and pediatric NPA group restricted us from further analysis. Some factors including cost, the tolerance of patients and sample contamination rate should be taken into consideration for final choice of diagnostics. Most included articles only used one sample type to analyze the performance of Xpert MTB/RIF, and thus the comparison of different sample types may be influenced by a number of potential confounding factors such as patient characteristics and the process of sampling. Thus, more eligible trials are needed.

Conclusion
When considering diagnostic performance of Xpert MTB/RIF, sampling method and patients' tolerance, using ES as sample might be a better choice to diagnose adult PTB than using BAL and IS. For pediatric PTB, GA was superior to other samples. However, the invasive sampling and relatively long time of collecting samples also should be considered. The actual choice of sample types needs to be decided according to specific situations.

Data Availability
All data generated or analyzed in the present study are contained and presented in this document.