Introduction
According to a report of the International Agency for Research on Cancer, colorectal cancer (CRC) is the third most frequently diagnosed cancer and the second leading cause of cancer deaths worldwide, with more than 1.9 million new cases and 935,000 deaths estimated in 2020.1 CRC is a heterogeneous disorder arising through different precursor lesions, different molecular pathways, and different end-stage carcinomas.2,3 Colorectal adenomas or adenomatous polyps are the most common precursors for CRC,4,5 which is the well-known “adenoma-carcinoma sequence.”6 It is estimated that over 50% of the screening-age population have one or more precancerous adenomas or polyps.7 Furthermore, since the size of the adenoma is considered one of the important markers for the potential risk of cancerization,8 adenomatous polyps are considered advanced adenomas (Adv_CRA) when the size is equal to or larger than 10 mm in diameter.9
The 5-year survival rate is around 13% when CRC is detected at the advanced metastatic stage, but it exceeds 90% if the tumor is detected and treated at an early, localized stage.10 The early detection of colorectal tumors, especially adenomas, can significantly facilitate successful treatment and is important for decreasing CRC morbidity, mortality, and economic burden.11 Colonoscopy is recognized as the gold standard for CRC screening. However, adherence to this test is poor because of its invasiveness, frequency, and high cost. For example, in a recent survey in China, only 14% of people classified as high risk by a scoring system went on to undergo colonoscopy screening.12 Other widely used noninvasive tests, including the fecal immunochemical test and the fecal occult blood test, show unsatisfactory sensitivity for CRC and low sensitivity for colorectal adenomas or precancers.13,14 Relatively new tests based on multi-target stool DNA, such as Cologuard, still have low sensitivity for nonadvanced adenomas and are too expensive for large-scale screening.15 These shortcomings highlight the urgent need to develop noninvasive and sensitive tests for CRC and precancerous lesions to improve the screening rate.
Acting as environmental factors of the human body, the gut microbiota are frequently reported to play important roles in the initiation and progression of CRC16–19 and have been extensively studied to identify noninvasive biomarkers reflecting the disease,10,20–23 including Fusobacterium nucleatum,24,25 Peptostreptococcus sp., Porphyromonas, Campylobacter jejuni,26 and some other specific genes.27 Recently, microbe-derived metabolites have also been reported to serve as biomarkers of CRC.28 However, unifying microbial signatures for CRC have not been identified across studies. Furthermore, it is not clear whether these individual biomarkers of CRC can effectively predict or classify adenomas, which appear at the early stage of CRC. In fact, current knowledge of the associations between the microbiota and adenomas is limited,11,23 since only a few studies have investigated the microbial alterations in adenomas.29–31 Moreover, few studies have explored the shifts of the gut microbiota in subjects with colorectal hyperplastic or inflammatory polyps (CRP),30,32 which are usually benign, or focused on the differences between CRP and adenomas.
In this study, we collected fecal samples across colorectal carcinogenesis and analyzed the fecal microbiota of participants with CRP, adenomas smaller than 10 mm (CRA), Adv_CRA, CRC, or a normal colonoscopy (healthy subjects; HS) by 16S rRNA gene sequencing (Fig. 1). The aims of this study were as follows: 1) to elucidate the shifts and characteristics of gut bacterial communities across the adenoma-carcinoma sequence with comprehensive stages, 2) to determine whether gut bacterial features can be used to classify colorectal tumors, and 3) to explore the differences between CRP vs. Adv_CRA and CRP vs. CRA, and to evaluate the performances of the bacterial models in classifying them.
Methods
Subject enrollment and sample collection
All participants were voluntarily enrolled in this study before the colonoscopy. Exclusion criteria were as follows: bloodstream or gastrointestinal infection; use of antibiotics or probiotics within one month before enrollment; prior colorectal resection; preparation for pregnancy; a history of other diseases affecting the gut microbiota, such as metabolic syndromes and autoimmune diseases; and contraindication to colonoscopy. Fecal samples were prospectively collected by the participants before bowel preparation. Briefly, fecal samples were mixed and collected in sterile tubes after defecation and were immediately stored at −80°C until DNA extraction. During colonoscopy, the location, size, number, and architecture of lesions were assessed. Lesions were removed from the colon mucosa under colonoscopic guidance and were submitted for histological classification. Based on the diagnoses of pathologists with an average of five years of experience in the field, samples were grouped into CRP, CRA (size <10 mm, including adenomas with a tubular, tubulovillous, villous, or serrated growth pattern), Adv_CRA (size ≥10 mm, including adenomas with a tubular, tubulovillous, villous, or serrated growth pattern), CRC, or HS. Smaller sample sizes were set for the CRC and HS groups, since the gut microbiota of these two groups have been widely studied.
DNA extraction and amplicon sequencing
The fecal samples were thawed and homogenized, and DNA was extracted using a PowerFecal Kit (QIAGEN, Hilden, Germany) and quality checked as previously described.33 Extracted DNA samples were amplified by polymerase chain reaction with the forward primer 5′-CCTACGGGNBGCASCAG-3′ and the reverse primer 5′-GGACTACNVGGGTWTCTAAT-3′, which target the V3 and V4 regions of the 16S rRNA gene. The products were purified and checked with Qubit 3.0 (Thermo Fisher Scientific, Waltham, MA, USA) and then sequenced on a HiSeq 2500 platform (Illumina, San Diego, CA, USA) using a 250-bp paired-end sequencing protocol at Xiamen Treatgut Biotechnology Co.
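The two primers above contain degenerate IUPAC bases (N, B, S, V, W), so each one stands for a small family of sequences. As an illustration only, the following Python sketch expands a degenerate primer into a regular expression and counts how many concrete sequences it represents. The function names are hypothetical and are not part of the study's pipeline.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer):
    """Translate a degenerate primer into a regular-expression pattern."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


def degeneracy(primer):
    """Count the distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


FORWARD = "CCTACGGGNBGCASCAG"     # forward primer from the text
REVERSE = "GGACTACNVGGGTWTCTAAT"  # reverse primer from the text

forward_pattern = re.compile(primer_to_regex(FORWARD))
print(primer_to_regex(FORWARD), degeneracy(FORWARD))
print(primer_to_regex(REVERSE), degeneracy(REVERSE))
```

A pattern built this way can be used to locate or trim the primer at the start of a read, although dedicated trimming tools also tolerate mismatches, which this sketch does not.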
Bioinformatic analyses
The raw paired-end reads were assembled using FLASH34 with default parameters except for -M 200 and -x 0.15, and were further filtered using Usearch35 with the parameter -fastq_maxee 0.5. High-quality reads were denoised into zero-radius operational taxonomic units (ZOTUs) with UNOISE3.36 The ZOTU table was rarefied to a sequencing depth of 13,793 reads per sample for all downstream analyses. Taxonomic assignment of ZOTUs was performed in QIIME37 (v1.9.1) using the SILVA 132 database.38 Microbial function was predicted with PICRUSt2.39
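Rarefying to a fixed depth means drawing the same number of reads from every sample without replacement. The text does not state which tool performed this step, so the following Python sketch is only an illustration of the idea; the function name `rarefy` and the toy counts are assumptions.

```python
import random
from typing import Dict


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Subsample `depth` reads without replacement from one sample's counts."""
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has %d reads, fewer than depth %d" % (total, depth))
    rng = random.Random(seed)
    # Expand the counts into one label per read, then draw without replacement.
    pool = [zotu for zotu, n in counts.items() for _ in range(n)]
    rarefied: Dict[str, int] = {}
    for zotu in rng.sample(pool, depth):
        rarefied[zotu] = rarefied.get(zotu, 0) + 1
    return rarefied


# Toy sample with 100 reads (hypothetical values), rarefied to 40 reads.
toy_sample = {"ZOTU_1": 50, "ZOTU_2": 30, "ZOTU_3": 20}
toy_rarefied = rarefy(toy_sample, depth=40, seed=1)
print(toy_rarefied, sum(toy_rarefied.values()))
```

In the study the depth was 13,793 reads, the size of the smallest sample, so no sample had to be discarded for having too few reads.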
Developing machine learning models
To train multivariable statistical models for the prediction of different stages (HS, CRP, CRA, Adv_CRA, and CRC), three levels of bacterial features (genus, OTU, or pathway) and age were permuted and combined to develop prediction models separately. Data were randomly split into training and testing sets in a 5×-repeated 5-fold cross-validation, followed by the generation of random forest models using the randomForest R package v4.6-14. Finally, all predictions were used to calculate the area under the receiver operating characteristic curve (AUC) using the pROC R package v1.17.0.1. To optimize the performance, a feature selection step was developed for each model. Briefly, the importance ranking of each potential feature was first obtained from the random forest importance measures, either the mean decrease in accuracy or the mean decrease in Gini values. Features were then filtered within the cross-validation (that is, for each training set) by first calculating the AUC of the top-ranked feature and then adding the remaining features one at a time in rank order, removing any feature whose addition caused the AUC to drop, so that only informative features were kept in the model.
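The cross-validation scheme and the feature filtering described above can be sketched in a few lines. The following Python example is a simplified illustration, not the authors' R code: it computes a rank-based AUC, generates the splits of a 5×-repeated 5-fold cross-validation, and walks a ranked feature list while dropping any feature that lowers the score. The scoring callback `score_fn` stands in for training a random forest and measuring its AUC, keeping a feature on a tie is an assumption, and all names and toy values are hypothetical.

```python
import random
from typing import Callable, Iterator, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_kfold(n: int, k: int = 5, repeats: int = 5,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for a repeats x k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


def greedy_filter(ranked: Sequence[str],
                  score_fn: Callable[[List[str]], float]) -> Tuple[List[str], float]:
    """Add features in importance order; drop any whose addition lowers the score."""
    kept = [ranked[0]]
    best = score_fn(kept)
    for feature in ranked[1:]:
        trial = score_fn(kept + [feature])
        if trial >= best:  # assumption: a tie keeps the feature
            kept.append(feature)
            best = trial
    return kept, best


# Toy scoring function (hypothetical gains), standing in for a model's AUC.
toy_gain = {"OTU_a": 0.70, "OTU_b": 0.10, "pathway_x": -0.05, "age": 0.02}


def toy_score(features: List[str]) -> float:
    return round(sum(toy_gain[f] for f in features), 2)


kept, best = greedy_filter(["OTU_a", "OTU_b", "pathway_x", "age"], toy_score)
print(kept, best)
```

In a real run, `score_fn` would be evaluated only on the training portion of each split, so that the held-out fold never influences which features are kept.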
Statistical analyses
Alpha diversity indexes, including observed ZOTUs (Obs), Chao1, Shannon, and Pielou’s evenness, were computed based on the ZOTU table using the vegan package.40 The differences in the diversity indexes and individual taxa were determined using the nonparametric Wilcoxon rank-sum test for two groups or the Kruskal–Wallis rank-sum test with Benjamini–Hochberg corrections for multiple groups using the agricolae package.41 The beta diversity of the overall bacterial communities was measured and visualized by distance-based redundancy analysis using the Euclidean distance, and the significance was determined by PERMANOVA with 9999 permutations using the vegan package. Visualization was mainly based on the ggplot242 and VennDiagram43 packages. All of these analyses were performed in R.44
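For readers unfamiliar with the four alpha diversity indexes, the following Python sketch computes them from a single sample's count vector. It is an illustration under stated assumptions rather than the vegan implementation: Chao1 uses the bias-corrected form, Shannon uses the natural logarithm, and the function name is hypothetical.

```python
import math
from typing import Dict, Sequence


def alpha_diversity(counts: Sequence[int]) -> Dict[str, float]:
    """Observed richness, Chao1, Shannon, and Pielou's evenness for one sample."""
    present = [c for c in counts if c > 0]
    if not present:
        raise ValueError("sample contains no reads")
    n = sum(present)
    observed = len(present)
    singletons = sum(1 for c in present if c == 1)
    doubletons = sum(1 for c in present if c == 2)
    # Bias-corrected Chao1, which stays defined when there are no doubletons.
    chao1 = observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
    # Shannon index with natural logarithm.
    shannon = -sum((c / n) * math.log(c / n) for c in present)
    # Pielou's evenness: Shannon divided by its maximum, log(observed).
    pielou = shannon / math.log(observed) if observed > 1 else 0.0
    return {"observed": float(observed), "chao1": chao1,
            "shannon": shannon, "pielou": pielou}


even_sample = alpha_diversity([10, 10, 10, 10])
skewed_sample = alpha_diversity([5, 1, 1, 2, 0])
print(even_sample)
print(skewed_sample)
```

A perfectly even sample gives an evenness of 1, while rare ZOTUs (singletons) raise the Chao1 estimate above the observed richness.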
Results
Demographic and clinical information
In total, fecal samples from 490 participants were prospectively collected, and 463 samples were included and subjected to 16S rRNA gene sequencing after a strict pathological diagnosis and exclusion process. Briefly, 45 HS, 120 CRP, 150 CRA, 113 Adv_CRA, and 35 CRC patients were included and randomly divided into the discovery phase (training set, 371 samples) and the validation phase (testing set, 92 samples) in this study. The ages of the patients were matched and were not significantly different among the five groups. The male percentages of CRC (60%), Adv_CRA (63%), CRA (73%), and CRP (67%) likely reflect the male preponderance of colorectal tumors (Table 1).
Table 1. Clinical characteristics of the enrolled participants in the training set
Clinical group | Mean age, years (±SD) | n | Male, n (%) | Female, n (%) |
---|---|---|---|---|
Normal colonoscopy (HS) | 54 (±9) | 36 | 21 (58%) | 15 (42%) |
Hyperplastic or inflammatory polyps (CRP) | 57 (±11) | 95 | 64 (67%) | 31 (33%) |
Adenomas (CRA, diameter < 10 mm) | 57 (±12) | 120 | 87 (73%) | 33 (27%) |
Advanced adenomas (Adv_CRA, diameter ≥ 10 mm) | 56 (±11) | 90 | 57 (63%) | 33 (37%) |
Colorectal cancer (CRC) | 57 (±11) | 30 | 18 (60%) | 12 (40%) |
Shifts in gut microbial diversity
A total of 58,185,919 high-quality reads were obtained from 463 samples (mean = 125,672). We subsampled 13,793 reads for each participant, corresponding to the sample with the lowest sequence number. Compared with the HS group, the fecal bacterial richness (Obs and Chao1) was significantly (p < 0.05) increased in patients with colorectal tumors (Fig. 2a). Among the four disease groups, bacterial richness was significantly decreased in CRA (n = 120) vs. CRP (n = 95) and was significantly increased in CRC (n = 30) vs. CRA or Adv_CRA (n = 90). Shannon diversity showed only a marginal difference among the five groups (p = 0.07), and evenness was not significantly different. Moreover, a Venn diagram showed that 1,289 of 4,689 OTUs were shared among the five groups, while 51 (2.80%), 321 (10.25%), 445 (13.29%), 330 (10.84%), and 238 (10.00%) OTUs were unique to HS, CRP, CRA, Adv_CRA, and CRC, respectively (Fig. 2b). The beta diversity was visualized by db-RDA and indicated distinct clustering of samples from different groups, in which HS was associated with a greater abundance of Faecalibacterium, Roseburia, and Ruminococcus_2 among the top 10 genera, while Escherichia_Shigella was more abundant in CRC (Fig. 2c). Permutation analysis showed significant differences in the overall bacterial communities among samples from the five groups (PERMANOVA, F = 1.60, p < 0.001).
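The shared and group-specific OTU counts in a Venn diagram come down to set operations on the OTUs detected in each group. The following Python sketch illustrates this with toy data; the function name and the OTU labels are hypothetical, and how an OTU was judged present in a group is not specified in the text.

```python
from typing import Dict, Set, Tuple


def shared_and_unique(presence: Dict[str, Set[str]]) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return OTUs found in every group and OTUs found in exactly one group."""
    shared = set.intersection(*presence.values())
    unique = {}
    for group, otus in presence.items():
        # Everything detected in any other group.
        others = set().union(*(o for g, o in presence.items() if g != group))
        unique[group] = otus - others
    return shared, unique


# Toy presence sets for three groups (hypothetical OTU labels).
toy_presence = {
    "HS": {"otu_a", "otu_b", "otu_c"},
    "CRP": {"otu_a", "otu_c", "otu_d"},
    "CRC": {"otu_a", "otu_e"},
}
shared, unique = shared_and_unique(toy_presence)
print(shared, unique)
```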
Phylogenetic profiles of fecal microbial communities
The gut bacterial profiles were dominated by Bacteroidetes, Firmicutes, and Proteobacteria at the phylum level, together accounting for more than 90% of sequences (Fig. 3a). On average, Bacteroides, Phascolarctobacterium, un_f_Lachnospiraceae, Prevotella_9, and Faecalibacterium were the top five genera (Fig. 3b). Firmicutes was significantly decreased in the colorectal tumor groups, while Bacteroidetes and Verrucomicrobia were significantly increased (Supplementary Fig. 1a). A total of 81 genera were significantly different among the five groups (Supplementary Table 1). Phascolarctobacterium, Megasphaera, and Desulfovibrio displayed increasing trends (enriched) along with the development of the disease, while un_f_Lachnospiraceae, Anaerostipes, Butyricimonas, and Dorea were significantly decreased in the disease groups (Fig. 3c). Among the four disease groups, Parabacteroides decreased along with the progression of disease. At the finer amplicon sequence variant (ZOTU) level, 589 of 4,689 amplicon sequence variants were significantly different among the five groups (Supplementary Table 2). Moreover, a total of 409 microbial functional pathways were predicted, 157 of which were significantly different among the five groups (Supplementary Table 3). Pyruvate fermentation to isobutanol (PWY-7111), pyruvate fermentation to acetate and lactate II (PWY-5100), and galactose degradation I (PWY-6317) were depleted in the disease groups (Supplementary Fig. 1d and Supplementary Table 3). Additionally, the bacterial differences at the class and family levels among the five groups are shown in the online supplementary figure (Supplementary Fig. 1b and c).
Classification of colorectal tumors
To illustrate the diagnostic value of fecal bacteria for colonic tumors, we constructed a random forest classifier model that could specifically identify patients with colorectal lesions (non_HS) from the HS group as well as the four individual stages from HS. The combination of bacterial features showed non-HS prediction accuracy with an AUC of 0.922 (95% CI: 0.901–0.944) for the training set and 0.882 (95% CI: 0.780–0.983) for the testing set (Fig. 4a). This performance resulted from feature selection based on genera, OTU, pathways, or age with mean decrease accuracy or mean decrease Gini measures (Fig. 4c). Finally, a total of 35 features were identified, including 21 OTUs, 13 pathways, and age (Fig. 4d). Specifically, 12 OTUs from Ruminococcus_2, Lachnoclostridium, Akkermansia, etc. were depleted in the non-HS groups, while 9 OTUs from Desulfovibrio, Phascolarctobacterium, etc. were enriched in the non-HS groups (Supplementary Fig. 2). Eight pathways, including galactose degradation I (Leloir pathway) (PWY-6317), L-lysine biosynthesis I (DAPLYSINESYN-PWY), etc. had a lower abundance in the non-HS groups, while the superpathway of pyrimidine deoxyribonucleotides de novo biosynthesis (E. coli) (PWY0-166), pyrimidine deoxyribonucleotides de novo biosynthesis I (PWY-7184), etc. were upregulated in the non-HS groups (Supplementary Fig. 2). High performance of the random forest models was obtained for classifying CRC (AUC: 0.952, 95% CI: 0.931–0.972), Adv_CRA (AUC: 0.902, 95% CI: 0.877–0.927), CRA (AUC: 0.924, 95% CI: 0.903–0.945), and CRP (AUC: 0.959, 95% CI: 0.945–0.973) from HS (Fig. 4a, Supplementary Table 4). Next, we trained random forest models for differentiating CRC from individual precancerous stages. 
With a similar strategy, 31 OTU features, 27 OTU features, and 21 OTU features (Supplementary Table 5) were finally selected and achieved high performance in classifying Adv_CRA (AUC: 0.942, 95% CI: 0.919–0.966), CRA (AUC: 0.94, 95% CI: 0.917–0.964), and CRP (AUC: 0.91, 95% CI: 0.885–0.935) from CRC (Fig. 4b).
Bacterial differences between CRP, CRA, and Adv_CRA
We further explored the alterations of the gut microbial composition from benign colorectal polyps to adenomas and advanced adenomas. Analysis of beta diversity via principal component analysis revealed no significant differences in bacterial communities among the CRP, CRA, and Adv_CRA groups (PERMANOVA, F = 1.034, p = 0.357; Fig. 5a). No significant difference was observed in the alpha-diversity indexes (Fig. 2a). A total of 4 families and 12 genera were significantly different, with no differences at the phylum or class level (Supplementary Fig. 3). These results indicated relatively weak differences between polyps and adenomas or advanced adenomas. As expected, these more biologically similar outcomes were more difficult to differentiate but might still be accessible via some bacterial features. Thus, we went on to identify specific taxa at the finer OTU level that were significantly enriched/depleted between CRP and Adv_CRA or CRA. There were 117 OTUs and 91 OTUs that were significantly different in relative abundances in CRP vs. CRA and CRP vs. Adv_CRA (Fig. 5b), of which 18 OTUs were shared and assigned as Phascolarctobacterium, Lachnoclostridium, Fusobacterium, Butyricimonas, Subdoligranulum, etc. (Supplementary Table 6). The relative abundances and fold changes of the top 20 OTUs that were different between CRP and Adv_CRA are displayed in Figure 5c. Based on these altered OTUs, we performed feature selection using the mean decrease in Gini value ranking to build microbial models for the classifications of CRP and Adv_CRA or CRA (Fig. 5c–d), including feature engineering by the combination of OTUs enriched (C1) or depleted (C2) in Adv_CRA into new features. To classify Adv_CRA from CRP, 19 features were finally selected as markers with OTUs from Lachnoclostridium, Bacteroides, Ruminiclostridium_5, etc., and AUC values of 0.802 (95% CI: 0.774–0.830) and 0.762 (95% CI: 0.612–0.902) for the training and testing sets, respectively, were achieved (Fig. 5e). 
Similarly, 14 genera including Butyricimonas, Porphyromonas, Akkermansia, etc. (Fig. 5f) were identified to discriminate CRP from CRA, with an AUC of 0.697 (95% CI: 0.666–0.728) for the training set and 0.706 (95% CI: 0.569–0.843) for the testing set (Fig. 5d). Both models reflected the potential of bacterial characteristics to distinguish advanced adenomas or adenomas from polyps.
Discussion
In this study, we profiled and analyzed the fecal bacterial communities and predicted the metabolic pathways of participants across five different stages of colorectal tumorigenesis, with a particular focus on the differences between benign polyps (hyperplastic or inflammatory) and precancerous adenomatous polyps. The overall bacterial communities were significantly different among the healthy controls and patients with colorectal tumors, and the patients with CRC had a greater fecal bacterial richness than the healthy controls and patients with polyps or adenomas. A total of 81 genera, 589 ZOTUs, and 157 predicted pathways were significantly different in relative abundances among the five groups. Importantly, the combination of bacterial genera, ZOTUs, pathways, or clinical information showed a promising potential for the noninvasive diagnosis of lesions. Based on feature selection, the bacterial models could achieve an average AUC of 0.92 for classifying colorectal tumors vs. HS, 0.91 for precancerous tumors vs. CRC among colorectal tumors, 0.80 for Adv_CRA vs. CRP, and 0.70 for CRA vs. CRP. Our findings suggest that alterations in the bacterial structures and pathways are associated with the occurrence and development of colorectal tumors and that the selected bacterial features may be a potential noninvasive predictor of colorectal lesions, especially in discriminating benign polyps (CRP) and precancerous adenomatous polyps (CRA or Adv_CRA).
Accumulating evidence has revealed that variations in the gut microbiota are associated with colorectal tumors. We observed significant differences in the overall bacterial communities among the five groups at various stages of colorectal tumorigenesis. Although the gut microbial characteristics of patients with hyperplastic or inflammatory polyps have rarely been described, individuals with CRC and adenomas have been extensively reported to have different taxonomic compositions of fecal microbiota compared to healthy controls,10,18,21,28 which is referred to as “dysbiosis.”45 Functionally, the gavage of fecal samples from patients with CRC promotes intestinal tumorigenesis in mice, increasing the number of polyps, the level of intestinal dysplasia, and proliferation.46 In terms of bacterial diversity, our finding of increased species richness in adenoma, and particularly in CRC, vs. HS is contrary to some previous reports.47,48 However, Nina et al. have reported a greater diversity in adenoma than in controls, which is consistent with the current study,49 and two subsequent studies have reported that gut microbial richness is greater in CRC than in adenoma.50,51 Similarly, bacterial diversity has been reported to be significantly increased in early hepatocellular carcinoma compared to that in liver cirrhosis.52 The increased richness and diversity at the severe stage of disease may be due to the recruitment and overgrowth of various pathogenic or harmful bacteria;52 this interpretation is consistent with the high proportion (more than 10%) of ZOTUs that were unique to each tumor group in this study.
We detected many bacterial characteristics at the genus, ZOTU, and pathway levels that were significantly different across the stages. Individual taxa with abnormal abundances have been extensively reported to be associated with CRC and even with adenoma.20,21 The genera Anaerostipes and Butyricimonas decreased along with the tumor stage in this study; these taxa are well known to produce short-chain fatty acids,53 which are essential to maintaining human health by providing energy to the intestinal epithelium, modulating the immune system, and affecting diverse metabolic routes. In fact, the microbial pathways that produce short-chain fatty acids, such as pyruvate fermentation to acetate and lactate II (PWY-5100), were depleted in patients with tumors. Desulfovibrio, including OTU_97 and OTU_716 selected as model features, can produce hydrogen sulfide,54 a genotoxic insult to the colonic epithelium,55 and thus represents a potential pathogen that may directly increase the risk of developing colorectal tumors. Other enriched genera were also identified in this study. Phascolarctobacterium has been reported to abundantly colonize the human gastrointestinal tract56 and has been positively associated with autism spectrum disorder57 and Alzheimer’s disease.58 Although less investigated, Megasphaera has been reported to increase in abundance after appendectomy in both children and adults,33,59 while it seems to be beneficial for those with diarrheal cryptosporidiosis.60 Interestingly, we found that the galactose degradation I pathway (PWY-6317) was depleted in the disease stages, and it was selected as an important feature for the model classifying tumors.
Galactose from fruits and vegetables can prevent CRC by binding and inhibiting lectins that can stimulate colon epithelial proliferation.61 Recently, β-galactosidase, which hydrolyzes lactose into galactose, has been reported to prevent tumor formation by inhibiting cell proliferation, promoting apoptosis of CRC cells, and retarding the growth of CRC xenografts.62 Certainly, more studies are needed to illustrate the mechanisms of the individual taxa and pathways acting on tumorigenesis.
As expected, good performances were obtained for the random forest models based on a combination of genera, OTUs, pathways, and/or age after feature selection when classifying individual stage of lesions (AUCs: 0.84–0.96) or overall colorectal tumors (AUC = 0.88) vs. healthy controls, as well as further discriminating CRC vs. other precancerous lesions (AUCs: 0.74–0.88). Good performance of microbiome-based models for classifying CRC vs. healthy controls has been published previously, with AUCs greater than 0.8 based on the meta-analysis of metagenomic20,21 and 16S rRNA gene sequencing datasets.10,23 Unfortunately, models for adenoma have been less investigated and usually provide a lower performance.11,23 Recently, Young et al. have reported 16S rRNA sequencing-based models distinguishing neoplasm (CRC or adenoma) vs. blood-negative guaiac fecal occult blood tests, with an AUC of 0.78 in a large-scale (more than 2,000 samples) bowel cancer screening program.63 Our favorable results suggest the importance of feature selection for bacterial markers in improving the performance of noninvasive diagnosis of colorectal tumors. Previous studies have shown the reduced discriminatory power of microbiome-based models to detect adenomas.21 Our analysis profiling gut microbiome-associated characteristics has the potential for the diagnosis of adenoma from polyps, including advanced adenoma. Adv_CRA was classified from CRP with 19 markers, with an AUC of approximately 0.80. Ten bacterial genera distinguished CRA from CRP, with an AUC of 0.70. To the best of our knowledge, this is the first study to explore microbial signatures between polyps and adenomas or advanced adenomas. Although the performance of the model in this study is lower than that of other models, it reflects the potential of identifying adenomas by bacterial markers.
The following limitations should be considered in this study. First, limited clinical metrics were collected due to the prospective collection of fecal samples in the hospital before colonoscopy. The addition of more clinical indexes and an independent cohort validation may further improve and verify the performance of the classifying models in the future. Second, nonbalanced samples were collected, with small sample sizes for the HS and CRC groups. Our original intention was to reveal the shifts, performance, and potentials of the bacterial communities in the noninvasive screening of colorectal tumors, with a particular focus (large sample size) on precancerous stages, including adenoma and polyps, since comparisons between CRC and HS have been well studied. Third, 16S rRNA gene sequencing was applied in this study. Metagenomic sequencing and metabolomics would provide more insights and further reveal the shifts of microbial features. This study should be extended in terms of sample size, multi-center verification with more baseline clinical characteristics as well as sample collection and storage, and shotgun metagenomic sequencing analysis for optimization to benefit patients in clinical practice.
Conclusions
In conclusion, we observed dynamic shifts in the fecal bacterial diversity, bacterial composition, and predicted pathways across multistep colorectal tumorigenesis. Additionally, after feature selection based on genera, OTUs, pathways, and age, we built models with good performance for classifying overall colorectal tumors vs. healthy controls and precancerous tumors vs. CRC. More importantly, for the first time, we explored the differences in bacterial communities between benign polyps (hyperplastic or inflammatory) and precancerous adenomatous polyps and built noninvasive models to distinguish them, which is clinically meaningful for noninvasively identifying the risk of progression from polyps to cancer.
Supporting information
Supplementary material for this article is available at https://doi.org/10.14218/CSP.2022.00017 .
Supplementary Fig. 1
Relative abundances of phyla (a), classes (b), the top 10 families (c), and the top 20 pathways (d) among patients with colorectal tumors and healthy subjects.
(TIF)
Supplementary Fig. 2
Relative abundances of OTUs that were selected as important features for classifying colorectal lesions (non_HS) from HS group.
(TIF)
Supplementary Fig. 3
Relative abundances of families (a) and genera (b) among patients with CRP, CRA, and Adv_CRA.
(TIF)
Supplementary Table 1
Relative abundances of the 81 genera that were significantly different among HS, CRP, CRA, Adv_CRA, and CRC.
(XLSX)
Supplementary Table 2
Relative abundances of the 589 OTUs that were significantly different among HS, CRP, CRA, Adv_CRA, and CRC.
(XLSX)
Supplementary Table 3
Relative abundances of the 157 pathways that were significantly different among HS, CRP, CRA, Adv_CRA, and CRC.
(XLSX)
Supplementary Table 4
Taxonomy of the OTU features selected for models in classifying CRC, Adv_CRA, CRA and CRP from HS.
(XLSX)
Supplementary Table 5
Taxonomy of the OTU features selected for models in classifying Adv_CRA, CRA and CRP from CRC.
(XLSX)
Supplementary Table 6
Taxonomy of the OTUs shared between the differential OTUs from CRP vs. Adv_CRA and those from CRP vs. CRA.
(XLSX)
Abbreviations
- Adv_CRA:
colorectal adenomas equal to or larger than 10 mm
- AUC:
area under the receiver operating characteristic curve
- CRA:
colorectal adenomas smaller than 10 mm
- CRC:
colorectal cancer
- CRP:
colorectal hyperplastic or inflammatory polyp
- HS:
normal colonoscopy
- OTU:
operational taxonomic unit
- ZOTU:
zero-radius operational taxonomic unit
Declarations
Acknowledgement
The authors thank the lab members and clinicians of the Gastroenterology Department at Zhongshan Hospital of Xiamen University for thoughtful comments on the manuscript and for helping to manage the patients.
Ethical statement
The Ethics Committee of Zhongshan Hospital, Xiamen University approved this study.
Data sharing statement
The raw sequences used to support the findings of this study were deposited in the National Center for Biotechnology Information Sequence Read Archive under accession number PRJNA869338.
Funding
This work was supported by the Xiamen Key Programs of National Health (3502Z20204007), the Xiamen Priority Programs of Medical Health (3502Z20199172), the Fujian Provincial Natural Science Foundation (2021J011329), and the Fundamental Research Funds for the Central Universities.
Conflict of interest
Prof. Jianlin Ren has been an editorial board member of Cancer Screening and Prevention since March 2022. The authors have no other conflicts of interest related to this publication.
Authors’ contributions
Study concept and design (JLR, HZX, BZZ), acquisition of data (QYC, YYF, YQZ, CSY, XNY), analysis and interpretation of data (BZZ, MC, QYC, YYF), drafting of the manuscript (BZZ, QYC, MC, YYF), administrative and technical support (JJL, YYF), and study supervision (JLR, HZX). All authors contributed significantly to this study and approved the final manuscript.