Research status and trends of deep learning in colorectal cancer (2011-2023): Bibliometric analysis and visualization

World Journal of Gastroenterology, 2025

Plain-English Explanations
Pages 1-2
Why Map the Landscape of Deep Learning in Colorectal Cancer?

Colorectal cancer (CRC) is the third most prevalent cancer worldwide and the second leading cause of cancer-related death, with more than 1.92 million new cases and 900,000 deaths annually according to GLOBOCAN 2022. This enormous disease burden has driven researchers to explore every available technological advantage, and deep learning (DL), a subset of machine learning that excels at recognizing complex patterns in data, has become increasingly central to CRC research over the past decade.

Deep learning models, particularly convolutional neural networks (CNNs), have proven effective at automatically extracting features from medical images such as radiology scans, pathology slides, and colonoscopy video. More recently, transformer-based neural networks have entered the medical research space, offering improvements in performance, resilience, and data efficiency. These architectures have been applied to CRC tasks including image-based diagnosis, polyp detection, tumor classification, risk stratification, and prognostic biomarker prediction.

Despite this growing body of work, no prior bibliometric analysis had comprehensively mapped the research landscape of DL specifically in CRC. The authors set out to fill that gap by analyzing all relevant literature from 2011 to 2023, aiming to identify leading countries, institutions, authors, journals, and keywords, and to reveal the field's hotspots and future directions.

Bibliometrics is a quantitative research strategy that uses mathematical and statistical tools to analyze publication patterns across a body of literature. It complements traditional reviews by providing objective, data-driven visualizations of how a field has evolved, which collaborations exist, and where the research frontier is moving.

TL;DR: CRC is one of the deadliest cancers globally, and deep learning has become a powerful tool for its diagnosis and prognosis. This study is the first bibliometric analysis to comprehensively map the DL-in-CRC research field from 2011 to 2023, revealing trends, key contributors, and emerging hotspots.
Pages 2-3
How the Bibliometric Analysis Was Conducted

Data source and search strategy: The authors retrieved literature from the Web of Science Core Collection on October 20, 2024. The search used the terms (Malignant Colorectal Neoplasm* OR Malignant Colorectal Tumor* OR Colorectal Cancer* OR Colorectal Carcinoma* OR CRC) AND (Deep Learning OR DL). The time window was January 1, 2011 to December 31, 2023. Only English-language articles and review articles were included, resulting in 1,275 publications after screening from an initial pool of 1,345.
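
As a small illustration of the search strategy above, the boolean query can be assembled from its two term groups, and the screening yield follows from the reported counts. This is a sketch only: the helper name `build_query` is hypothetical, and the string is the plain expression quoted above, not a field-tagged Web of Science query.

```python
def build_query(disease_terms, method_terms):
    """Join each term list with OR, then combine the two groups with AND."""
    disease = " OR ".join(disease_terms)
    method = " OR ".join(method_terms)
    return f"({disease}) AND ({method})"


# Term groups exactly as listed in the search strategy.
disease_terms = [
    "Malignant Colorectal Neoplasm*",
    "Malignant Colorectal Tumor*",
    "Colorectal Cancer*",
    "Colorectal Carcinoma*",
    "CRC",
]
method_terms = ["Deep Learning", "DL"]

query = build_query(disease_terms, method_terms)
print(query)

# Screening yield from the counts reported above.
initial, included = 1345, 1275
excluded = initial - included
print(f"excluded: {excluded}, retained: {included / initial:.1%}")
```

With the reported figures, screening removed 70 records and kept roughly 95% of the initial pool.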

Visualization tools: Three specialized bibliometric software platforms were used. Scimago Graphica (version 1.0.45) was employed for national-level indicators and geographic cooperation maps. CiteSpace (version 6.3.1) was used to analyze institutions, references, keywords, and to generate co-occurrence charts, timeline diagrams, and citation burst analyses. VOSviewer (version 1.6.20) was used for journal, author, and reference network visualization maps. Additionally, Origin (2022) was used for plotting and Excel (2021) for constructing tables.

Interpretation of visualizations: In the network maps produced by these tools, node size indicates frequency of occurrence (publications, citations, or keyword appearances). Lines between nodes represent collaboration or co-occurrence relationships, and thicker lines indicate closer relationships. Different colors represent different clusters. CiteSpace's "mediator centrality" metric, more commonly called betweenness centrality, was particularly important: it reflects how often a node lies on the shortest paths between other nodes, and a value greater than 0.1 indicates a node with a significant bridging role in the network, marked by a purple-red circle in the visualization.
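
The bridging metric described above is usually computed as betweenness centrality. The sketch below implements Brandes' algorithm in plain Python for an unweighted, undirected graph and applies the 0.1 threshold to a toy network. The graph and node names are hypothetical, and the normalization to the range 0 to 1 is an assumption about how the scores are scaled; CiteSpace's own implementation may differ in detail.

```python
from collections import deque


def betweenness_centrality(graph):
    """Normalized betweenness centrality for an undirected, unweighted graph.

    graph: dict mapping each node to the set of its neighbours.
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        predecessors = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from source
        sigma[source] = 1
        dist = {v: -1 for v in graph}
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from the source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    n = len(graph)
    # Every unordered pair is visited from both endpoints, so dividing by
    # (n - 1) * (n - 2) maps the scores into [0, 1].
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: c * scale for v, c in centrality.items()}


# Hypothetical toy network: two triangles joined only through node "C".
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D", "E"},
    "D": {"C", "E"},
    "E": {"C", "D"},
}
scores = betweenness_centrality(graph)
bridges = [node for node, score in scores.items() if score > 0.1]
print(scores, bridges)
```

In this toy network only "C" exceeds the threshold, because every shortest path between the two triangles passes through it.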

The combination of these three tools provided complementary perspectives. Scimago Graphica excelled at geographic visualization, CiteSpace at temporal analysis and burst detection, and VOSviewer at network clustering. Together, they offered a thorough picture of the field's bibliometric structure.

TL;DR: The study searched Web of Science for DL-in-CRC literature (2011-2023), finding 1,275 publications. Three visualization tools (Scimago Graphica, CiteSpace, VOSviewer) were used to analyze country, institution, journal, author, reference, and keyword indicators through network maps, timeline charts, and citation burst analyses.
Pages 3-4
A Tenfold Increase in Publications After 2019

Growth trajectory: The 1,275 included publications spanned 538 journals from 74 countries and 2,267 institutions. The number of annual publications grew slowly from 2011 to 2019, then surged dramatically. After 2019, more than 100 papers were published every year, and the 2023 publication count was approximately ten times that of 2011. The year 2019 represented a clear inflection point for the field.
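
To put the tenfold figure in perspective, the sketch below computes the constant yearly growth rate that would produce the same overall increase. It assumes "approximately ten times" means exactly 10x over the 12 yearly steps from 2011 to 2023. The actual growth was not constant (slow until 2019, then steep), so the result is an average, not a description of any single year.

```python
def implied_annual_growth(ratio, years):
    """Constant yearly growth rate that yields `ratio` after `years` steps."""
    return ratio ** (1 / years) - 1


# Assumption: exactly 10x growth between 2011 and 2023.
rate = implied_annual_growth(10, 2023 - 2011)
print(f"{rate:.1%}")
```

Under these assumptions the average works out to roughly 21% per year.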

Quality indicators: When the publications were classified by Journal Citation Reports (JCR) quartiles, the number of Q1 articles increased particularly rapidly after 2019, reaching 141 in 2023. This indicates not just growth in volume but also in quality, as Q1 journals represent the top tier of their respective fields. Growth was also observed in Q2 and Q3 publications, though Q1 growth was the most pronounced.

Why 2019 was the turning point: The authors attribute the post-2019 surge to two converging factors. First, breakthroughs in deep learning and CNNs enabled more sophisticated processing of complex medical data such as radiology images and pathology slides. Second, CRC-related examination modalities (imaging, pathology, colonoscopy) became increasingly integrated with DL methods, creating a fertile ground for applied research. The continued upward trajectory through 2023 suggests that DL in CRC research will remain a highly active field.

TL;DR: Publications on DL in CRC grew slowly from 2011 to 2019, then exploded, reaching about ten times the 2011 count by 2023. Q1 journal articles reached 141 in 2023, signaling both quantity and quality growth. The 2019 inflection point reflects maturing DL technology meeting growing clinical need.
Pages 3-5
Countries, Institutions, and International Collaboration Networks

Leading countries: China led with 371 publications (29.1%), followed by the United States with 265 (20.8%) and Japan with 155 (12.2%). Together, these three countries accounted for 62.1% of all publications. South Korea (121), the United Kingdom (101), Germany (96), Italy (71), Spain (52), India (49), and the Netherlands (45) rounded out the top ten. Notably, the top five countries each exceeded 100 publications.
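
The percentages above can be re-derived from the reported counts. The sketch below uses only the figures given in this section. It also shows that the combined share depends on whether the rounded percentages or the raw counts are summed.

```python
TOTAL = 1275  # publications included in the analysis

# Counts for the top three countries as reported above.
counts = {"China": 371, "United States": 265, "Japan": 155}

shares = {country: round(100 * n / TOTAL, 1) for country, n in counts.items()}
print(shares)

# Two ways to get the combined share of the top three.
sum_of_rounded = round(sum(shares.values()), 1)
share_of_sum = round(100 * sum(counts.values()) / TOTAL, 1)
print(sum_of_rounded, share_of_sum)
```

The individual shares come out as 29.1%, 20.8%, and 12.2%. Adding those rounded values gives the 62.1% quoted above, whereas the raw counts (791 of 1,275) give about 62.0%. If the underlying country counts credit internationally co-authored papers to more than one country, which is common in bibliometric tools but not stated here, the combined share should be read as an upper bound.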

Collaboration networks: The United States had the strongest international collaboration ties, establishing cooperation networks with dozens of countries. The most closely connected nations were the United States, the United Kingdom, and Germany. China, despite leading in publication volume, had relatively limited international collaboration, suggesting that Chinese scholars could benefit from strengthening cross-border research partnerships.

Top institutions: Sun Yat-sen University in China led with 32 publications, followed by the Chinese Academy of Sciences, Harvard Medical School, and Seoul National University with 20 each. Zhejiang University (19), Catholic University of Korea (18), Southern Medical University (18), German Cancer Research Center (17), National Cancer Centre of Japan (16), and RWTH Aachen (16) completed the top ten. China hosted four of these top ten institutions, Korea and Germany each had two, and the United States and Japan had one each.

Centrality analysis: Harvard Medical School had the highest centrality score (0.12), followed by Sun Yat-sen University (0.10). High centrality indicates an institution's bridging role in the research network, meaning Harvard served as a key intermediary connecting different research groups and potentially guiding new ideas in the field. The combination of China's publication volume and the United States' institutional centrality has been a major driver of the field's development.

TL;DR: China, the US, and Japan produced 62.1% of all publications. The US had the strongest international collaboration network, while China led in volume but had comparatively few cross-border ties. Sun Yat-sen University published the most papers (32), while Harvard Medical School had the highest network centrality (0.12), acting as a key bridge in global research.
Pages 5-6
Where the Research Was Published and Who Led It

Top publishing journals: Among 538 journals in the dataset, Scientific Reports published the most articles (34), followed by IEEE Access (31) and Frontiers in Oncology (29). Medical Image Analysis had the highest impact factor (10.7) among top publishing venues, followed by Computers in Biology and Medicine (7.0). The journal distribution was unequal, with the majority of papers appearing in a relatively small number of journals, a pattern consistent with Bradford's law of bibliometrics.
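
Bradford's law is often illustrated by ranking journals by output and splitting them into zones that each hold about the same number of articles; the first zone then contains only a few journals, and each later zone many more. The sketch below shows that partition. The journal names and counts are hypothetical, not taken from the study, and `bradford_zones` is an illustrative helper.

```python
def bradford_zones(journal_counts, n_zones=3):
    """Split journals, ranked by output, into zones of about equal article totals."""
    ranked = sorted(journal_counts.items(), key=lambda kv: kv[1], reverse=True)
    target = sum(journal_counts.values()) / n_zones
    zones = [[] for _ in range(n_zones)]
    running = 0
    for name, count in ranked:
        # The zone is chosen from the cumulative article count
        # accumulated before this journal is added.
        idx = min(int(running // target), n_zones - 1)
        zones[idx].append(name)
        running += count
    return zones


# Hypothetical article counts per journal (90 articles in total).
journal_counts = {
    "J1": 30, "J2": 20, "J3": 10, "J4": 10, "J5": 5,
    "J6": 5, "J7": 4, "J8": 3, "J9": 2, "J10": 1,
}
zones = bradford_zones(journal_counts)
print([len(zone) for zone in zones])
```

In this toy data each zone holds 30 articles, but the zones contain 1, 2, and 7 journals respectively, which is the concentration pattern the law describes.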

Co-citation analysis: There were 7,739 co-cited journals. Gastrointestinal Endoscopy had the most co-citations (1,053), followed by Scientific Reports (969) and Proceedings of the IEEE (946). The New England Journal of Medicine had the highest impact factor (96.2) among co-cited journals, followed by the Journal of Clinical Oncology (42.1). Nine of the top ten co-cited journals were Q1-ranked, underscoring their significant influence on the field.
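
Two sources are co-cited when they appear together in the reference list of the same paper. The sketch below counts such pairs for a handful of hypothetical reference lists. It illustrates the general idea only and is not a reconstruction of how VOSviewer or CiteSpace derived the totals reported above.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of sources shares a reference list."""
    pairs = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical reference lists of three citing papers.
reference_lists = [
    ["Gastrointest Endosc", "Sci Rep", "NEJM"],
    ["Gastrointest Endosc", "Sci Rep"],
    ["Sci Rep", "NEJM"],
]
pairs = cocitation_counts(reference_lists)
print(pairs.most_common())
```

Pairs with high counts become thick links in a co-citation map, and sources that take part in many pairs become large nodes.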

Leading publishers: Elsevier and Springer Nature tied for the most publications (239 each), followed by MDPI (135), Wiley (99), and IEEE (68). The double map overlay analysis showed that citing journals were mainly distributed in molecular biology, immunology, and clinical medicine, while cited journals concentrated in molecular biology, genetics, health, and nursing.

Key authors: The analysis covered 8,101 authors. Jakob Nikolas Kather (Germany, Memorial Sloan Kettering Cancer Center) led with 12 publications and the highest co-citation count (287). Other prolific authors included Lee SH from Korea (9 papers), Liu Z from China (9), and Pickhardt PJ from the United States (9). Kather's research focused on applying DL to cancer treatment, particularly transformer-based biomarker prediction from histology images. The author collaboration network revealed 16 distinct research groups. The groups of Kather JN and Li X sat at the center of the collaborative work, but connections between groups were sparse.

TL;DR: Scientific Reports led in publication count (34), while Medical Image Analysis had the highest impact factor (10.7). Elsevier and Springer Nature each published 239 papers. Jakob Nikolas Kather was the most prolific and cited author (12 papers, 287 co-citations), specializing in DL-based cancer biomarker prediction from histopathology.
Pages 6-8
The Most Influential Papers Shaping the Field

Top cited references: The included publications cited 40,035 references in total. The most-cited reference was "Deep Residual Learning for Image Recognition" by Kaiming He (149 citations), which introduced the ResNet architecture that revolutionized deep learning for image tasks. Second was "U-Net: Convolutional Networks for Biomedical Image Segmentation" by Olaf Ronneberger (124 citations), a foundational architecture for medical image segmentation. Third was the "Global Cancer Statistics" paper by Ahmedin Jemal (117 citations).

Clinical DL references: Among the top ten, several papers directly addressed DL applications in CRC. Kather JN's "Predicting survival from colorectal cancer histology slides using deep learning" (87 citations) used CNN-based tissue slice identification to predict CRC survival in a retrospective multicenter study. His other highly cited work demonstrated that deep learning could predict microsatellite instability directly from histology in gastrointestinal cancer (79 citations). Gregor Urban's paper on real-time polyp detection achieved 96% accuracy in screening colonoscopy (75 citations).

Reference clusters and evolution: References cited more than 20 times (122 total) formed three main clusters: a red cluster (58 articles) focused on DL in cancer, a green cluster (36 articles) on DL and computers, and a blue cluster (28 articles) on DL and colonoscopy. The timeline analysis revealed that citation types diversified dramatically after 2015. Eight reference clusters emerged, including artificial intelligence (cluster 0), feature extraction (cluster 1), microsatellites (cluster 2), colonoscopy (cluster 4), and polyp segmentation (cluster 5). Research orientation shifted from CRC treatment before 2015 to AI-driven diagnostic approaches after 2015.

Citation bursts: The top 25 papers with the highest citation bursts had burst strengths ranging from 3.99 to 11.96. "Deep Residual Learning for Image Recognition" had the strongest citation burst, confirming it as a foundational work that scholars repeatedly referenced when building new DL models for CRC applications.

TL;DR: ResNet (He, 149 citations) and U-Net (Ronneberger, 124 citations) were the most cited foundational DL architectures. Kather's studies on predicting CRC survival and microsatellite instability from histology were the top clinical DL references. After 2015, the field shifted from treatment-focused to AI-driven diagnostic research.
Pages 8-10
Six Research Clusters and Emerging Hotspots

High-frequency keywords: Among 4,869 keywords in 1,275 articles, 81 appeared more than 20 times. The most frequent were "deep learning" (354), "colorectal-cancer" (297), "colorectal cancer" (293), "cancer" (172), "classification" (147), "artificial intelligence" (144), "colonoscopy" (136), "survival" (121), "risk" (106), and "diagnosis" (98). The keywords with the highest centrality were "colorectal cancer" (0.16), "colon" (0.13), "cancer" (0.12), and "survival" (0.11).
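
The list above contains both "colorectal-cancer" and "colorectal cancer", which suggests that spelling variants were counted separately. A common preprocessing step is to normalize keywords before counting. The sketch below is one minimal way to do that, under the assumption that hyphens and letter case carry no meaning; the keyword lists are hypothetical.

```python
from collections import Counter


def normalize_keyword(keyword):
    """Lowercase, turn hyphens into spaces, and collapse repeated whitespace."""
    return " ".join(keyword.lower().replace("-", " ").split())


def keyword_frequencies(keyword_lists):
    """Count each normalized keyword at most once per paper."""
    counts = Counter()
    for keywords in keyword_lists:
        counts.update({normalize_keyword(k) for k in keywords})
    return counts


# Hypothetical keyword lists for three papers.
keyword_lists = [
    ["Deep Learning", "colorectal-cancer"],
    ["deep learning", "Colorectal Cancer", "colorectal-cancer"],
    ["colonoscopy"],
]
freq = keyword_frequencies(keyword_lists)
print(freq)
```

Counting per paper matters here: a paper that carries both variants should contribute once, so the two reported totals (297 and 293) cannot simply be added to obtain a merged frequency.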

Six keyword clusters: CiteSpace clustering analysis identified six distinct research areas. Cluster #0 ("artificial intelligence") and Cluster #1 ("colorectal polyps") were the largest, representing the most active research domains. Cluster #2 ("metastatic colorectal cancer") focused on advanced disease. Cluster #3 ("microsatellite instability") emerged as a newer topic, with research only beginning in 2017. Cluster #4 ("prognostic score") addressed outcome prediction. Cluster #5 ("autoencoder-based model") reflected interest in generative and unsupervised DL approaches. Clusters #0, #3, and #5 were primarily DL-oriented, while #1, #2, and #4 were CRC-oriented.

Keyword burst analysis: The keyword with the highest burst intensity was "colorectal-cancer" (12.34), and the longest-lasting burst keyword was "survival" (2011-2018). The most recent burst keywords were "validation" (2019-2020), "feature extraction" (2020-2021), and "system" (2020-2021). Keyword burst duration was longer before 2019 and shorter afterward, indicating that the field entered a rapid development phase after 2019 with faster topic turnover.
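
A burst is a period in which a keyword appears far more often than its usual rate. CiteSpace's burst detection is generally described as based on Kleinberg's algorithm; the sketch below is a deliberately simplified stand-in, not that algorithm. It flags runs of consecutive years in which a keyword's share of that year's papers is at least `factor` times its share over the whole period. The function name, the threshold, and all counts are hypothetical.

```python
def simple_bursts(keyword_counts, totals, factor=1.5):
    """Return (start_year, end_year) runs of unusually high keyword share.

    keyword_counts: papers per year that use the keyword.
    totals: all papers per year (assumed to cover consecutive years).
    """
    years = sorted(totals)
    baseline = sum(keyword_counts.get(y, 0) for y in years) / sum(totals.values())
    bursts, start, prev = [], None, None
    for y in years:
        share = keyword_counts.get(y, 0) / totals[y] if totals[y] else 0.0
        if share >= factor * baseline:
            if start is None:
                start = y  # a new run begins
        elif start is not None:
            bursts.append((start, prev))  # the run ended in the previous year
            start = None
        prev = y
    if start is not None:
        bursts.append((start, prev))  # run still open at the last year
    return bursts


# Hypothetical yearly counts for one keyword and for all papers.
totals = {2017: 10, 2018: 20, 2019: 40, 2020: 100, 2021: 100}
keyword_counts = {2017: 0, 2018: 1, 2019: 14, 2020: 30, 2021: 5}
bursts = simple_bursts(keyword_counts, totals)
print(bursts)
```

With these toy numbers the keyword bursts in 2019 and 2020 and then falls back to its baseline, which mirrors the short, recent bursts described above.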

Timeline trends: The keyword timeline chart showed that research on microsatellite instability and autoencoder-based models is increasingly active, representing the field's transition toward molecular-level prediction and generative modeling. This shift suggests the field is moving from purely image-based detection toward integrating molecular biomarkers with DL architectures for more precise clinical decision-making.

TL;DR: Six keyword clusters were identified: AI, colorectal polyps, metastatic CRC, microsatellite instability, prognostic scores, and autoencoder models. "Colorectal-cancer" had the highest burst intensity (12.34). The field is shifting toward microsatellite instability prediction and autoencoder-based generative models as emerging frontiers.
Pages 10-13
How Deep Learning Is Transforming CRC Diagnosis, Staging, and Prognosis

CT-based DL models: Deep learning applied to contrast-enhanced computed tomography (CT) imaging enables radiologists to perform early screening, lesion detection, localization, tumor staging, and prognosis prediction for CRC. Lu et al. developed a DL system that uses contrast-enhanced CT to predict disease stage and RAS mutation status before surgery. Multi-size DL approaches applied to preoperative CT have also shown promise for prognosis prediction.

Histopathology-based DL models: Pathology image analysis has been one of the richest application areas. DL models help pathologists with diagnostic identification, lymph node metastasis detection, gene mutation prediction, tumor classification, and survival prognosis. Semi-supervised deep learning on pathological images has achieved accurate CRC recognition, while annotation-free whole-slide image analysis has identified nodal micrometastasis. Kather's work demonstrated that CNN-based tissue classification could predict survival from histology slides across multiple centers.

Colonoscopy-based DL models: Endoscopic applications focus on real-time polyp detection and on assessing the depth of submucosal infiltration. These models assist gastroenterologists in detecting colorectal polyps during routine screening, with one landmark study achieving 96% accuracy in real-time polyp localization. On the histology side, Wagner et al. showed that transformer-based DL outperformed other methods in predicting CRC biomarker status, increasing diagnostic speed and accuracy.

Molecular and biomarker prediction: A particularly exciting frontier is the ability of DL to predict molecular features directly from images. Kather's work on predicting microsatellite instability from standard histology images is notable because MSI status traditionally requires expensive molecular testing. Transformer-based models have also been applied to predict biomarker status from colorectal cancer histology in large-scale multicentric studies. DL-based prognostic risk scores have demonstrated predictive efficacy independent of established clinical risk markers.

TL;DR: DL is being applied across three major CRC modalities: CT imaging (for staging and lesion detection), histopathology (for survival prediction, gene mutation, and lymph node metastasis), and colonoscopy (for real-time polyp detection at 96% accuracy). Predicting microsatellite instability directly from histology images is a particularly promising frontier.
Pages 13-16
What This Study Could Not Capture and Where the Field Is Heading

Limitations of this bibliometric analysis: The study has two notable constraints. First, only the Web of Science Core Collection was searched, excluding databases such as PubMed, Scopus, and Embase, which may contain additional relevant literature. Second, the analysis period ended in December 2023, so publications from 2024 onward were not included. Given the field's rapid growth rate, this likely means a substantial number of recent studies were missed. These factors could introduce selection bias into the results.

Collaboration gaps: The author collaboration network analysis revealed that research groups in this field are relatively scattered, with few connections between them. The study strongly recommends strengthening multicenter and international collaboration, particularly for Chinese institutions, which lead in publication volume but lag in international cooperation. Increased cross-institutional research could help standardize datasets, validate models across diverse populations, and accelerate clinical translation.

Architectural evolution: The study recommends optimizing DL models, particularly CNNs and transformers, for CRC applications. The keyword timeline analysis suggests that autoencoder-based models and other generative architectures are gaining traction and may offer new capabilities for data augmentation, unsupervised feature learning, and synthetic data generation in settings where labeled training data is scarce.

Future research priorities: Based on the keyword cluster and burst analyses, the authors identify microsatellite instability prediction as a key emerging hotspot. The transition from image-level analysis to molecular-level prediction represents a fundamental shift in how DL is applied to CRC. Future work is expected to focus on integrating multimodal data (combining imaging, pathology, genomics, and clinical records), improving model interpretability for clinical acceptance, and conducting prospective validation studies that move DL tools from research prototypes to deployable clinical aids.

TL;DR: The study was limited to Web of Science data through 2023. Key recommendations include strengthening international collaboration (especially for Chinese institutions), optimizing CNN and transformer architectures, and focusing on emerging hotspots like microsatellite instability prediction and autoencoder-based models. Multimodal data integration and prospective clinical validation are critical next steps.
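Citation: Qi LY, Li BW, Chen JQ, Bian HP, Xue JN, Zhao HX. Research status and trends of deep learning in colorectal cancer (2011-2023): Bibliometric analysis and visualization. Open Access, 2025. Available at: PMC12142239. DOI: 10.4251/wjgo.v17.i5.103667. License: CC BY-NC.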