AI in Hematology: Current Trends

Overview & Background

Pages 1-2

AI Is Reshaping Hematology, and This Study Maps How

This bibliometric analysis set out to map the entire landscape of artificial intelligence research in hematology, covering 45 years of publications from 1980 through 2025. The authors searched the Web of Science Core Collection on June 1, 2025, starting with 1,362 AI-related publications in the hematology research area. After filtering out non-original work such as letters, review articles, and proceedings, they retained 376 original research articles for analysis.

The motivation behind this study is straightforward: AI technologies, including machine learning (ML), deep learning (DL), and large language models (LLMs), have rapidly expanded into clinical hematology. Automated peripheral blood smear analysis can now flag leukemia and lymphoma. Digital pathology tools classify bone marrow biopsies for myeloproliferative neoplasms. LLMs like GPT-4 have been tested for blood transfusion decision-making and were reported to perform with high accuracy. The question is no longer whether AI has a role in hematology, but where exactly that role is concentrated and how it has evolved.

The study specifically aimed to answer three questions: What are the dominant AI techniques in hematology research after 2020? In which hematologic subfields are AI applications most intensely investigated? And with which clinical terms or diseases is AI most frequently associated? To answer these, the authors employed bibliometric techniques including trend keyword analysis and factor analysis, using the Bibliometrix package (Biblioshiny interface) in RStudio.

Geographic distribution: The USA led with 111 publications, followed by China (79), the United Kingdom (22), and Germany (21). Harvard University was the single most productive institution. This concentration of output in the USA and China mirrors patterns seen in AI research across other medical specialties.

TL;DR: A bibliometric study analyzed 376 original research articles on AI in hematology from 1980 to 2025, sourced from the Web of Science. The USA (111 articles) and China (79) dominated output. The study used keyword trend analysis and factor analysis to map where AI is being applied across blood disorders.

Methodology

Pages 2-3

Search Strategy, Data Filtering, and Analytical Approach

The authors used the Web of Science Core Collection (WoSCC) as their sole data source, selecting it for its rigorous coverage of high-impact journals, reliable citation indexing, and long historical record. They acknowledge that Scopus and PubMed offer complementary strengths (PubMed's MeSH-based indexing captures clinical trials more precisely, and Scopus has broader interdisciplinary coverage), but note that WoSCC remains the standard in bibliometric research and supports the co-authorship, co-citation, and network analyses central to this study.

Search design: The search was restricted to publications classified under the "Hematology" research area. AI-related literature was identified using Boolean-combined keywords including "artificial intelligence," "AI," "ChatGPT," "ML," "DL," "neural networks," and "large language model." This yielded 1,362 publications, which were then filtered down to 376 original research articles after excluding letters, proceedings, and reviews. A flowchart and full search strategy were provided in the supplementary materials for reproducibility.
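The filtering step described above can be sketched in a few lines. This is only an illustration: the record layout, the field name `doc_type`, and the sample rows are assumptions, and the authors' actual export format and exclusion rules are documented in their supplementary materials, not here.

```python
# Hypothetical records: the field name "doc_type" and these sample rows are
# assumptions for illustration, not the study's actual export format.
records = [
    {"title": "ML for DVT risk", "doc_type": "Article"},
    {"title": "AI in transfusion: an overview", "doc_type": "Review"},
    {"title": "Reply to a comment", "doc_type": "Letter"},
    {"title": "CNN for AML smears", "doc_type": "Article"},
    {"title": "Meeting abstract", "doc_type": "Proceedings Paper"},
]

# Document types the text says were excluded (labels are assumed spellings).
EXCLUDED_TYPES = {"Letter", "Review", "Proceedings Paper"}


def keep_original_research(items):
    """Return only the records whose document type is not excluded."""
    return [r for r in items if r["doc_type"] not in EXCLUDED_TYPES]


retained = keep_original_research(records)

# Retention rate implied by the counts reported in the text (376 of 1,362).
study_retention = 376 / 1362

print(len(retained), "of", len(records), "toy records retained")
print(f"Study retention: {study_retention:.1%}")
```

Using the counts reported in the text, roughly 27.6% of the initial records survived the filter, which gives a sense of how much of the indexed AI-in-hematology output is non-original material.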

Analysis tools: Basic statistical analyses were performed in SPSS (Version 25.0). Publication timelines were visualized using Microsoft Excel. The core bibliometric analyses, including keyword frequency, trend analysis, and factor analysis, were carried out using the Bibliometrix package via the Biblioshiny interface in RStudio. Bibliometrix is widely regarded as one of the most robust tools for science mapping and literature network visualization, and it offers greater analytical depth than most alternative tools.

A total of 957 different author keywords were identified across the 376 articles. Before analysis, synonymous terms were consolidated (e.g., "haemorrhage" and "hemorrhage," "leukaemia" and "leukemia," "MRI" and "magnetic resonance imaging") to avoid fragmentation. The top 50 keywords appearing in five or more articles were then analyzed for frequency and temporal trends.
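The consolidation and counting step can be sketched as follows. The synonym map contains only the pairs named in the text, the choice of which spelling is canonical is an assumption, and the toy article list is invented for illustration. The authors did this work inside Bibliometrix, so this is a rough Python equivalent and not their code.

```python
from collections import Counter

# Synonym pairs named in the text; which form is treated as canonical is an
# assumption. A real study would need a much longer map.
SYNONYMS = {
    "haemorrhage": "hemorrhage",
    "leukaemia": "leukemia",
    "magnetic resonance imaging": "mri",
}


def normalize(keyword):
    """Lower-case a keyword and map known synonyms to one canonical form."""
    key = keyword.strip().lower()
    return SYNONYMS.get(key, key)


def keyword_frequencies(articles, min_articles=5, top_n=50):
    """Count how many articles use each normalized keyword.

    Returns up to top_n (keyword, count) pairs with count >= min_articles.
    """
    counts = Counter()
    for keywords in articles:
        # Use a set so a keyword is counted at most once per article.
        counts.update({normalize(k) for k in keywords})
    frequent = [(k, n) for k, n in counts.most_common() if n >= min_articles]
    return frequent[:top_n]


# Invented toy data: each inner list is one article's author keywords.
toy_articles = [
    ["Leukaemia", "Machine learning"],
    ["leukemia", "MRI"],
    ["Magnetic Resonance Imaging", "machine learning"],
]
result = keyword_frequencies(toy_articles, min_articles=2)
print(result)
```

Without the normalization step, "Leukaemia" and "leukemia" would be counted as two keywords with one article each, and neither would clear the frequency threshold.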

TL;DR: WoSCC was the sole database. From 1,362 initial publications, 376 original research articles were retained. Analysis used Bibliometrix in RStudio, with 957 unique keywords consolidated for synonyms before trend and factor analyses.

Keyword Trends

Pages 3-5

How AI Topics in Hematology Evolved from 2009 to 2025

The trend keyword analysis reveals a clear evolution from basic statistical methods toward advanced AI techniques. In 2009, the dominant terms were "statistics" and "warfarin," reflecting early, narrow applications. By 2016, "flow cytometry" emerged as a diagnostic focus. In 2017, "cardiovascular disease" became a trending keyword, followed in 2018 by "acute leukemia," "donor selection," and "survival," signaling a shift toward clinical and prognostic themes.

The 2019-2021 inflection point: Starting in 2019, research volume surged. "Atherosclerosis" and "venous thromboembolism (VTE)" became prominent. In 2020, both disease-related and methodological topics trended, including "leukaemia/acute myeloid leukaemia," "cerebral blood flow," and "logistic regression." By 2021, AI technique-specific terms took center stage: "diagnosis," "morphological analysis," "natural language processing (NLP)," "artificial neural networks," and "image analysis" all drew significant attention. This marks the point where AI moved from a background analytical tool to a primary research focus.

2022-2024 expansion: In 2022, clinical and disease-specific applications dominated, with "atrial fibrillation," "anticoagulants," "electronic health records," "hemodialysis," and "sickle cell disease" emerging as key topics. In 2023, "pulmonary embolism," "risk assessment," and "hematopoietic stem cell transplantation (HSCT)" were frequently studied alongside "ML" and "DL" as method-level keywords. By 2024, there was a marked concentration on AI-driven applications, with "prediction modeling," "thrombosis," "VTE," and "clinical decision support" trending. The prominence of VTE and thrombosis in 2024 likely reflects interest in post-COVID-19 thrombotic complications.

2025 frontiers: The most recent trends point toward "nomogram" and "transcriptome," signaling a shift toward personalized medicine and genomics-based approaches. This progression from retrospective statistical analyses to prospective, genomics-driven research represents a fundamental change in how AI is being applied in hematology.
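One simple way to place a keyword on a timeline like the one above is to take the median publication year of the articles that use it. The sketch below assumes that rule. The source does not spell out how Bibliometrix assigned keywords to years, so treat the rule, the function name, and the toy data as assumptions.

```python
from statistics import median_low


def trending_year(occurrences):
    """Map each keyword to the median publication year of articles using it.

    occurrences is a list of (year, [keywords]) pairs, one per article.
    median_low keeps the result an actual publication year for even counts.
    """
    years_by_keyword = {}
    for year, keywords in occurrences:
        for kw in keywords:
            years_by_keyword.setdefault(kw, []).append(year)
    return {kw: median_low(years) for kw, years in years_by_keyword.items()}


# Invented toy data, loosely echoing the keywords mentioned in the text.
toy_occurrences = [
    (2008, ["warfarin"]),
    (2009, ["warfarin", "statistics"]),
    (2010, ["warfarin"]),
    (2021, ["natural language processing"]),
    (2023, ["natural language processing", "thrombosis"]),
    (2024, ["thrombosis"]),
    (2024, ["thrombosis"]),
]
trend = trending_year(toy_occurrences)
for keyword, year in sorted(trend.items(), key=lambda item: item[1]):
    print(year, keyword)
```

A median-based rule means a keyword's "trending year" can lag its first appearance, which is worth keeping in mind when reading statements such as "VTE became prominent in 2019" alongside "VTE trended in 2024".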

TL;DR: AI in hematology evolved from warfarin/statistics (2009) through flow cytometry (2016) to NLP and morphological analysis (2021). By 2024, thrombosis, VTE, and clinical decision support dominated. In 2025, nomograms and transcriptomics indicate a pivot toward genomics-driven personalized medicine.

Thrombosis & VTE

Pages 5-6

AI in Thrombosis and Venous Thromboembolism: The Most Studied Application Area

Thrombosis and venous thromboembolism (VTE) emerged as the single most intensively studied application area for AI in hematology. Several high-performing models illustrate why. Ding et al. (2023) developed an XGBoost model to predict deep vein thrombosis (DVT) and pulmonary embolism after hip arthroplasty, achieving 91.3% sensitivity and 99.8% specificity, significantly outperforming conventional scoring systems. Wang et al. (2023) built a model for DVT risk after hip and knee arthroplasty that achieved an AUC of 92% and a sensitivity of 80.3%.
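For readers less familiar with these metrics, sensitivity and specificity both come straight from a confusion matrix. The counts below are hypothetical, chosen only so the arithmetic lands near the percentages quoted above. They are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positive cases the model correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of actual negative cases the model correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (NOT from Ding et al.): 23 patients with DVT/PE,
# 500 patients without.
tp, fn = 21, 2
tn, fp = 499, 1

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

In this toy cohort, a sensitivity near 91% still means 2 of 23 true events were missed, while a specificity near 99.8% means only 1 false alarm in 500 event-free patients. Which of the two matters more depends on the cost of a missed clot versus an unnecessary work-up.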

Early warning capability: Ryan et al. (2024) demonstrated that AI models could predict DVT 12 to 24 hours in advance, potentially reducing mortality through earlier clinical intervention. These models leveraged electronic health records (EHRs) to deliver stronger predictive performance than traditional scoring systems like the Caprini score, improving both diagnostic accuracy and prognostic assessment at the bedside.

Gil et al. (2023) provided a comprehensive overview of how AI algorithms are being used across the full VTE spectrum, from screening through treatment management. Their work emphasized the revolutionary potential of large language models and generative AI for processing healthcare data in this domain. The clinical significance is clear: AI-assisted early warning systems give clinicians critical advantages in detecting complications early and intervening before outcomes deteriorate.

COVID-19 connection: The surge in thrombosis-related AI research in 2024 appears closely tied to post-COVID-19 era complications. AI-based modules were used to analyze COVID-19-related thrombotic complications, particularly in pediatric patients, to improve VTE prophylaxis. In a comparison of seven AI models for predicting VTE in COVID-19 patients, ensemble learning strategies emerged as particularly effective, highlighting how AI responds to newly recognized clinical challenges.

TL;DR: AI models for thrombosis/VTE achieved striking performance: 91.3% sensitivity and 99.8% specificity (XGBoost for DVT), 92% AUC for post-arthroplasty DVT risk, and the ability to predict DVT 12-24 hours in advance. COVID-19-era thrombotic complications drove further research into ensemble AI models.

HSCT & Transplantation

Pages 6-7

AI in Hematopoietic Stem Cell Transplantation: Donor Selection and Risk Stratification

Hematopoietic stem cell transplantation (HSCT) represents another major domain where AI has demonstrated clear clinical value. Pagliuca et al. (2024) employed Random Survival Forest (RSF) and Lasso/Elastic Net models to investigate how immunogenetic variables, particularly HLA Evolutionary Divergence (HED), affect outcomes after allogeneic stem cell transplantation. Their findings showed that HED values were predictive of critical outcomes including relapse, acute and chronic graft-versus-host disease (GVHD), and overall survival. Importantly, this study used a multi-center dataset, which strengthens generalizability.

The MatchGraft.AI model: Built on a random forest algorithm, MatchGraft.AI predicted individual acute GVHD (aGvHD) risk in an international retrospective cohort of 1,035 patients, achieving an AUC of 70%. While the predictive power is modest, the model demonstrates how AI can complement existing donor-selection strategies, enabling broader and more precise donor matching and facilitating individualized conditioning and treatment plans.
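To make the 70% figure concrete: an AUC can be read as the probability that a randomly chosen patient who developed the outcome received a higher predicted risk than a randomly chosen patient who did not. The sketch below computes AUC directly from that definition. The risk scores are invented for illustration and have nothing to do with MatchGraft.AI's actual outputs.

```python
def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented predicted risks: two patients with the outcome, five without.
with_outcome = [0.8, 0.4]
without_outcome = [0.9, 0.5, 0.3, 0.2, 0.1]

toy_auc = auc(with_outcome, without_outcome)
print(f"AUC = {toy_auc:.2f}")
```

An AUC of 0.5 corresponds to coin-flip ranking and 1.0 to perfect ranking, so a value around 0.70 means the model orders roughly 7 of every 10 such pairs correctly. That is useful as one input among several, but not enough to act on alone.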

Large-scale CNN approaches: A convolutional neural network (CNN) model developed using the Japanese national registry integrated NLP and interpretable AI to classify aGvHD risk among 18,763 patients. In the test cohort, the group the model flagged as high-risk had a 28.8% incidence of aGvHD, and the model was reported to significantly outperform traditional methods. Separately, Spellman et al. (2024) applied a "Nonparametric Failure Time Bayesian Additive Regression Trees" algorithm to a dataset of over 11,000 patients. Their analysis showed that overall survival was comparable across donors aged 18 to 30, while male donors were associated with significantly better event-free survival (EFS).

Collectively, these HSCT studies demonstrate that AI can refine donor matching, stratify risks more precisely, and potentially improve individualized conditioning strategies. However, prospective validation remains a gap across all of these models, and the modest AUC of 70% in the MatchGraft.AI model underscores that these tools are best positioned as complements to, not replacements for, existing clinical judgment.

TL;DR: AI models for HSCT include MatchGraft.AI (AUC 70%, 1,035-patient cohort for aGvHD prediction), a CNN-based model classifying aGvHD risk across 18,763 patients (28.8% incidence in high-risk group), and Random Survival Forest models linking HLA Evolutionary Divergence to transplant outcomes. Prospective validation is still needed.

Leukemia & Diagnostics

Pages 7-8

AI-Driven Diagnostics and Prognosis in Leukemia

Acute myeloid leukemia (AML): AI applications in AML have focused primarily on two fronts: diagnosis from blood or marrow images and prognostic modeling. A meta-analysis by Al-Obeidat et al. (2025) demonstrated that deep learning algorithms such as CNNs can detect AML cases from microscopic blood images with over 95% accuracy and high sensitivity. Achir et al. (2024), in a systematic review covering more than 25,000 studies, confirmed that AI and image processing techniques enhance not only diagnostic accuracy but also speed and reliability, making them particularly valuable in low-resource healthcare settings where pathologist access is limited.

AML-NOS risk stratification: For genetically undefined or clinically heterogeneous subtypes like AML not otherwise specified (AML-NOS), AI-supported prognostic models show real promise. Lopez-Caro et al. (2024) developed a Random Survival Forest algorithm tested on 286 patients that could stratify AML-NOS cases into low-, intermediate-, and high-risk groups. The model achieved a concordance index (c-index) exceeding 0.77 in both clinical-only and genomics-integrated versions. Despite these results, most AML models remain limited by small sample sizes or retrospective-only validation, underscoring the need for prospective clinical trials.
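The concordance index is the survival-analysis counterpart of AUC: among pairs of patients whose ordering can be determined despite censoring, it is the share in which the patient with the higher predicted risk had the event first. Below is a minimal sketch of Harrell's c-index on invented data. It is not the evaluation code used by Lopez-Caro et al., and the follow-up times, event flags, and risk scores are made up.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored survival data.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risks:  predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Invented toy cohort of five patients.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.5, 0.6, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
print(f"c-index = {c_index:.2f}")
```

As with AUC, 0.5 is chance-level ordering and 1.0 is perfect, so a c-index above 0.77 indicates that the model ranks a clear majority of comparable patient pairs correctly.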

Chronic myeloid leukemia (CML): By contrast, AI's clinical impact in CML remains exploratory. Bernardi et al. (2024) showed that AI contributes to CML diagnosis and monitoring through virtual patient cohort generation and big data analytics. However, given the high success rates of existing treatment regimens (particularly tyrosine kinase inhibitors), AI has not yet achieved direct clinical translation in CML. The technology remains in the "potential tool" category rather than an active clinical asset for this disease.

The broader pattern across leukemia subtypes is consistent: AI is furthest along in AML, where diagnostic complexity and prognostic heterogeneity create obvious use cases, while CML's relative treatment success leaves fewer urgent clinical gaps for AI to fill.

TL;DR: CNNs detect AML from blood images with over 95% accuracy. A Random Survival Forest model stratified AML-NOS patients into risk groups with a c-index above 0.77 (286 patients). CML AI remains exploratory due to the high success of current treatments.

Factor Analysis & Broader Applications

Pages 8-9

Five Research Clusters and AI's Expanding Clinical Footprint

Factor analysis using the 80 most influential keywords divided the AI-in-hematology literature into five distinct thematic clusters, revealing the conceptual structure of the field. Cluster 1 linked advanced technologies (AI, ML, DL) with core hematologic diseases: leukemia, sickle cell disease, HSCT, and blood transfusion. This cluster represents the strongest integration of AI into diagnostic, prognostic, and therapeutic workflows. Cluster 2 focused on quality improvement and epidemiological applications, showing AI's contributions to clinical decision support systems and outcome monitoring.

Cluster 3 emphasized data-driven approaches, particularly NLP and electronic health records, and the development of risk prediction models and personalized treatment strategies. Cluster 4 addressed AI in intensive care and critical patient management. Cluster 5 reflected AI-based solutions for the interaction between hematologic diseases and the cardiovascular system, including atrial fibrillation, anticoagulant therapy management, and stroke prevention.
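Thematic clusters like these are derived from how often keywords appear together in the same articles. The sketch below shows only that first building block, pairwise keyword co-occurrence counting. It is not the factor analysis the authors ran in Bibliometrix, and the toy articles are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(articles):
    """Count, for every keyword pair, how many articles contain both.

    Pairs are stored as alphabetically ordered tuples so that (a, b) and
    (b, a) share a single entry.
    """
    pairs = Counter()
    for keywords in articles:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


# Invented toy data: two articles from a "core disease" theme and two from a
# "cardiovascular" theme.
toy_articles = [
    ["machine learning", "leukemia", "hsct"],
    ["machine learning", "leukemia"],
    ["atrial fibrillation", "anticoagulants", "stroke"],
    ["atrial fibrillation", "anticoagulants"],
]
pairs = cooccurrence(toy_articles)
for pair, count in pairs.most_common(2):
    print(pair, count)
```

Keywords that co-occur often (here, "leukemia" with "machine learning") end up close together in a factor map, while pairs that never co-occur (such as "leukemia" and "atrial fibrillation") fall into separate clusters.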

Broader clinical applications: Beyond the major domains of VTE, HSCT, and leukemia, several notable studies demonstrate AI's expanding scope. Zhao et al. (2023) developed an ML model to predict left atrial appendage thrombosis in atrial fibrillation patients, enabling personalized risk assessment. Fard et al. (2024) showed AI-based bleeding risk models outperformed conventional methods during extended anticoagulation therapy. Chrysafi et al. (2024) used ML to support the design of novel anticoagulant molecules, pushing AI into pharmacological innovation. In sickle cell disease, Patel et al. (2024) predicted hospital readmissions more accurately than traditional LACE and HOSPITAL scores, and Padrao et al. (2021) stratified mortality risks in intensive care using phenotype-based unsupervised clustering.

This clustering structure demonstrates that AI in hematology is not confined to a single application. It spans diagnosis, prognosis, drug development, quality improvement, and personalized care across conditions ranging from sickle cell disease to atrial fibrillation to post-transplant complications.

TL;DR: Factor analysis identified five thematic clusters: (1) AI plus core blood diseases, (2) clinical quality improvement, (3) NLP and EHR-based risk prediction, (4) intensive care applications, and (5) cardiovascular-hematologic interactions. AI now extends into drug design, readmission prediction, and phenotype-based ICU risk stratification.

Limitations & Future Directions

Pages 9-11

Database Constraints, Retrospective Bias, and the Path Forward

Single-database limitation: The study relied exclusively on the Web of Science Core Collection (WoSCC), which means relevant studies indexed only in Scopus or PubMed may have been missed. PubMed's MeSH-based indexing captures clinical trials more precisely, while Scopus provides broader interdisciplinary and regional coverage with more frequently updated citation data. The authors acknowledge this may affect absolute comprehensiveness, though WoSCC's rigorous indexing and citation tracking make it the most widely adopted standard for bibliometric research.

Cross-sectional design: Because the data were collected on a single date (June 1, 2025), the analysis provides a snapshot rather than a dynamic view. Newly published articles or database updates could shift the findings. This is an inherent characteristic of bibliometric studies, not a flaw specific to this work, but it means the trends captured here may already be evolving.

Retrospective validation gap: A recurring theme across the clinical studies reviewed is the reliance on retrospective datasets and the absence of external or prospective validation. The XGBoost model for DVT (91.3% sensitivity, 99.8% specificity), the MatchGraft.AI system (70% AUC), and the AML-NOS risk stratification model (c-index above 0.77) range from modest to very strong on paper, but all of these figures come from retrospective data, and none of the models has been validated in prospective clinical trials. Most models were developed at single centers or from registry data, which limits generalizability to diverse patient populations and healthcare systems.

Where the field is heading: The emergence of "nomogram" and "transcriptome" as trending topics in 2025 signals a clear shift toward genomics-driven personalized medicine. The digital transformation accelerated by the COVID-19 pandemic (2020-2021) appears to have contributed significantly to the growth in AI research volume. The authors foresee this momentum continuing, with AI applications becoming increasingly central to hematology practice. The five thematic clusters suggest that the next frontier will involve integrating AI across the full patient journey, from risk prediction and diagnosis to treatment optimization, ICU management, and long-term cardiovascular monitoring.

TL;DR: Key limitations include reliance on a single database (WoSCC), cross-sectional design (data from June 1, 2025 only), and the lack of prospective validation across nearly all clinical AI models reviewed. Future directions point toward genomics, transcriptomics, and nomogram-based personalized approaches, with AI expected to become increasingly embedded in routine hematology practice.

Artificial intelligence in hematology: current trends and application areas
