B-cell non-Hodgkin lymphomas (B-NHLs) are a diverse group of cancers that originate from B lymphocytes, which are key components of the immune system. The most common subtypes are diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma (FL), but the spectrum also includes mantle cell lymphoma and Burkitt lymphoma. These cancers vary widely in severity, treatment response, and prognosis, both across subtypes and among individual patients. Late diagnosis and frequent relapse further complicate clinical management.
Standard treatments: Frontline therapy typically involves CHOP (cyclophosphamide, doxorubicin, vincristine, prednisone) or R-CHOP (CHOP plus the monoclonal antibody rituximab). When these fail, additional options include antibody-drug conjugates (ADCs), radiation therapy, targeted agents such as ibrutinib, PI3K inhibitors, and venetoclax, and high-dose chemotherapy followed by stem cell transplantation for aggressive or relapsed cases.
The multimodal AI premise: This review, authored by Isavand, Aghamiri, and Amin from the University of Nebraska and Zanjan University, explores how multimodal AI can integrate data from imaging, pathology, genomics, transcriptomics, proteomics, metabolomics, epigenomics, and electronic health records (EHRs) into unified analytical frameworks. The core argument is that no single data source captures the full complexity of B-NHL tumors and their surrounding microenvironment. By fusing multiple modalities, AI systems can uncover subtle patterns that would be missed by analyzing any single modality alone.
The review covers multimodal AI frameworks and data fusion strategies, then presents case studies in B-NHLs organized around four themes: investigating the tumor-TME ecosystem, immune biomarker discovery, therapy optimization, and clinical investigations. The authors also address limitations related to data quality, AI interpretability, regulatory gaps, and ethical considerations.
Imaging modalities: The review describes several imaging technologies used in NHL. PET/CT with fluorodeoxyglucose (FDG) is the standard for staging FDG-avid NHL and assessing post-treatment response. CT remains primary for non-FDG-avid lymphomas and targeted surveillance. MRI with diffusion-weighted imaging (DWI) detects restricted water molecule motion in high-cellularity NHL tumors, providing complementary staging information. Whole-body MRI with short inversion time inversion recovery (STIR) sequences shows high concordance with FDG PET/CT for disease detection.
Digital pathology: Definitive lymphoma diagnosis requires microscopic examination of tissue sections stained with hematoxylin and eosin (H&E). Advances in whole-slide imaging (WSI) have driven a shift toward digital pathology, which creates high-resolution digital images from glass slides. This enables faster turnaround, remote access, easier data management, and potentially more objective, AI-assisted analysis.
Electronic health records (EHRs): EHRs serve as digital archives of a patient's medical history, integrating longitudinal data including demographics, diagnoses, clinical notes, and objective findings. They contain both structured data (medication orders, dosages) and unstructured data (clinical notes detailing treatment rationale).
Multi-omics data: The review covers five omics layers: genomics (identifying driver mutations through DNA sequencing), transcriptomics (revealing dynamic gene regulation through RNA expression analysis), epigenomics (exploring DNA methylation and chromatin accessibility, both of which are disrupted in cancer), proteomics (studying the full protein activity landscape), and metabolomics (identifying major metabolic pathways that tumors rely on). While any single modality provides useful clinical information, combining multiple omics layers enhances accuracy in understanding tumor biology and patient-specific treatment responses.
Early fusion (data-level fusion): This approach combines multiple data sources into a unified information space before analysis. A single model is trained on quantitative features extracted jointly across modalities, capturing low-level interactions to reveal subtle patterns. Early fusion promotes a holistic understanding of data and can improve processing efficiency. However, integrating diverse data types is challenging due to differences in data scale, dimensions, and sample rates. Incomplete or noisy datasets can reduce robustness and introduce biases.
Late fusion: Each modality is processed independently to train separate models, and their predictions or representations are combined at a later stage. This allows customized models for each modality and enhances robustness to missing or corrupted data in individual channels. The downside is that late fusion may miss subtle cross-modal interactions and can increase computational demands when managing separate models for each data type.
Intermediate fusion (hybrid fusion): This strategy combines data modalities during the learning process at an intermediate stage, with each modality processed individually before integration at a higher level of abstraction. It balances the benefits of early and late fusion by capturing a wide range of inter-modal interactions at various abstraction levels while offering design flexibility. The tradeoff is greater engineering complexity, larger computational requirements, and increased overfitting risk, especially with limited training data.
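The contrast between the first two strategies can be made concrete with a minimal sketch: early fusion concatenates feature vectors from two modalities and trains one joint model, while late fusion trains one model per modality and averages their predicted probabilities. The "imaging" and "omics" matrices below are synthetic stand-ins, not data from the review.

```python
# Early vs. late fusion sketch on synthetic two-modality data.
# All data and feature counts here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)                       # binary outcome (e.g., relapse)
imaging = rng.normal(y[:, None], 1.0, (n, 5))   # modality 1: 5 imaging features
omics = rng.normal(y[:, None], 1.0, (n, 20))    # modality 2: 20 expression features

# Early fusion: concatenate modalities into one feature space, train one model.
X_early = np.hstack([imaging, omics])
early_model = LogisticRegression(max_iter=1000).fit(X_early, y)

# Late fusion: one model per modality, combine predicted probabilities afterward.
m_img = LogisticRegression(max_iter=1000).fit(imaging, y)
m_omx = LogisticRegression(max_iter=1000).fit(omics, y)
p_late = (m_img.predict_proba(imaging)[:, 1] + m_omx.predict_proba(omics)[:, 1]) / 2
y_late = (p_late > 0.5).astype(int)
```

Note how the late-fusion path degrades gracefully if one modality is missing (drop its term from the average), whereas the early-fusion model needs the full concatenated vector at prediction time.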
Learning paradigms: The review covers three training approaches. Supervised learning uses labeled data to learn associations between inputs and outcomes, suitable for classification and prediction tasks. Unsupervised learning operates on unlabeled data using generative modeling, clustering, and dimensionality reduction to identify hidden patterns. Semi-supervised learning bridges the gap by leveraging partially labeled or noisy data, which is particularly useful when complete annotations are expensive, as is common with pixel-level annotations in medical imaging.
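The semi-supervised setting the review highlights can be sketched with scikit-learn's self-training wrapper, which iteratively pseudo-labels confident unlabeled samples. The data here are synthetic, and hiding 90% of labels is an arbitrary illustration of the "expensive annotation" scenario.

```python
# Semi-supervised sketch: self-training with mostly-unlabeled synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(1)
n = 300
y_true = rng.integers(0, 2, n)
X = rng.normal(y_true[:, None] * 2, 1.0, (n, 4))   # 4 informative features

# Hide 90% of labels; scikit-learn marks unlabeled samples with -1.
y_partial = y_true.copy()
y_partial[rng.random(n) < 0.9] = -1

# The base classifier is refit as confident pseudo-labels are added.
model = SelfTrainingClassifier(LogisticRegression()).fit(X, y_partial)
acc = (model.predict(X) == y_true).mean()
```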
Liquid biopsy: Zhang et al. demonstrated a multimodal AI approach for early lung cancer diagnosis that illustrates the broader principle. By fusing extracellular vesicle long RNA (evlRNA) from liquid biopsy, CT imaging features, and expert tissue analysis, they achieved 93.4% accuracy with senior expert input, 92.4% with junior expert input, and 91.9% with evlRNA plus imaging alone. Individual modalities performed markedly worse: evlRNA alone reached only 79.2% and imaging alone only 77.6%, illustrating the advantage of multimodal fusion over any single modality.
Immunotherapy response prediction: The National Institutes of Health developed LORIS (logistic regression-based immunotherapy-response score), a model that integrates six commonly measured patient features: neutrophil-to-lymphocyte ratio, age, cancer type, therapeutic history, albumin levels, and tumor mutational burden (TMB). LORIS outperformed TMB alone as a predictor of checkpoint inhibitor response and successfully identified patients previously considered poor candidates for immunotherapy. The multimodal approach, combining clinical, pathologic, and genomic features, enhanced the ability to personalize immunotherapy treatment selection.
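A LORIS-style score is, at its core, a logistic regression over a handful of routine clinical features. The sketch below shows the shape of such a model; the feature values, the synthetic response rule, and the fitted coefficients are illustrative assumptions, not the published LORIS model.

```python
# Hedged sketch of a logistic-regression-based immunotherapy-response score.
# Features echo the six LORIS inputs, but all values here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 500
nlr = rng.lognormal(1.0, 0.5, n)          # neutrophil-to-lymphocyte ratio
age = rng.normal(62, 10, n)
albumin = rng.normal(4.0, 0.5, n)         # g/dL
tmb = rng.lognormal(1.5, 1.0, n)          # tumor mutational burden
prior_therapy = rng.integers(0, 2, n)     # therapeutic history flag
cancer_type = rng.integers(0, 5, n)       # coded cancer type
X = np.column_stack([nlr, age, albumin, tmb, prior_therapy, cancer_type])

# Synthetic ground truth: higher TMB and albumin favor response, high NLR opposes it.
logit = 0.8 * np.log(tmb) + 1.0 * (albumin - 4.0) - 0.6 * np.log(nlr)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

score_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
score_model.fit(X, y)
risk_score = score_model.predict_proba(X)[:, 1]   # per-patient score in [0, 1]
```

Because the model is a plain logistic regression, its coefficients remain inspectable, which is one reason a score like this is easier to audit than a deep network over the same features.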
Surgical applications: An open clinical trial (NCT06478368) is investigating multimodal AI for early detection of occult peritoneal metastases in gastric cancer surgery. The study integrates AI with dynamic video recordings of the abdominal cavity during surgery, supplemented by imaging, histopathology, and clinical data. Another trial (NCT05426135) aims to build a medical platform integrating clinical, imaging, pathological, and multi-omics data for lung, stomach, and colorectal cancers, with the goal of creating accurate diagnosis and treatment prediction models using deep learning.
Transcriptome mapping of lymphoma subtypes: Loeffler-Wirth et al. created a comprehensive transcriptome map covering six lymphoma subtypes, including DLBCL and FL, using 873 biopsy specimens with clinical, pathological, genetic, and transcriptome data. They employed the self-organizing map (SOM) machine learning technique to create low-dimensional representations of high-dimensional expression data. The SOM-generated portraits revealed that lymphoma subtypes are arranged as a spectrum of molecular expression levels rather than distinct phenotypes. Key patterns included proliferation, inflammation, and stroma signatures. Poor survival correlated with proliferation, inflammation, and plasma cell features, while stroma signatures linked to healthy B cells were associated with higher overall survival.
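The SOM idea is simple enough to sketch directly: a grid of weight vectors is pulled toward the data, and each sample's "portrait" position is the grid cell whose weights best match it. Grid size, learning schedule, and data below are illustrative choices, not the paper's settings.

```python
# Minimal self-organizing map (SOM) in NumPy: project high-dimensional
# "expression" vectors onto a 2-D grid of prototype units.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(0, 1, (200, 10))           # 200 samples x 10 expression features
grid_h, grid_w = 5, 5
weights = rng.normal(0, 1, (grid_h, grid_w, X.shape[1]))
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                              indexing="ij"), axis=-1)

epochs = 20
for epoch in range(epochs):
    lr = 0.5 * (1 - epoch / epochs)               # decaying learning rate
    sigma = 2.0 * (1 - epoch / epochs) + 0.5      # decaying neighborhood radius
    for x in X:
        # Best-matching unit (BMU): grid cell with the closest weight vector.
        d = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(d), d.shape)
        # Pull the BMU and its grid neighbors toward the sample.
        grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
        h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)

# Each sample's low-dimensional "portrait" position is its BMU.
bmus = np.array([np.unravel_index(np.argmin(np.linalg.norm(weights - x, axis=-1)),
                                  (grid_h, grid_w)) for x in X])
```

The key property, preserved even in this toy version, is that neighboring grid cells learn similar prototypes, so nearby map positions correspond to similar expression patterns.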
Cell-of-origin classification in DLBCL: Xu-Monette et al. developed a refined cell-of-origin (COO) classifier for DLBCL using a cohort of 418 DLBCLs with transcriptomic and genomic data. They used autoencoders (unsupervised neural networks) for dimensionality reduction, transforming selected features into a two-dimensional latent space. Logistic regression and Cox proportional hazards models built the COO classifier and predicted clinical risk. The classifier showed strong agreement with the established Nanostring Lymph2Cx assay. Results confirmed that ABC (activated B-cell-like) subtypes had much shorter survival than GCB (germinal center B-cell-like) subtypes, with 30% of patients classified as high-risk with poor survival.
Follicular lymphoma multiscale atlas: Radtke et al. built a cellular and molecular atlas of the FL environment using genomic, transcriptomic, clinical, and pathological imaging modalities. The Kassandra algorithm deconvolved bulk RNA-seq data to predict cell percentages. An adversarially regularized variational graph autoencoder analyzed spatial relationships between cells, and a convolutional neural network (CNN) segmented individual cells across different imaging modalities. The study revealed that immune signaling activation, cytokine signaling, extracellular matrix remodeling, and B-cell receptor signaling were notably increased in early relapsers, driven by interactions between FL B cells and myeloid/stromal cell niches. Unique follicular growth patterns were detected twenty months before initial relapse.
EcoTyper platform for DLBCL ecosystems: Steen et al. created the EcoTyper machine learning platform that integrates transcriptome deconvolution and single-cell RNA-seq to characterize clinically relevant DLBCL cell states and ecosystems. After estimating cell type proportions in bulk tissue transcriptomes, non-negative matrix factorization (NMF) identified distinct cell states within each cell type, and a community detection algorithm based on Jaccard indices and hierarchical clustering defined multicellular ecosystems (ecotypes). The platform recognized five cell states connected to somatic subtypes and overall survival, plus 39 TME cell states from immunological populations (CD4, CD8, plasma cells, natural killer, mast cells, monocytes, dendritic cells, neutrophils) and stromal populations (fibroblasts, endothelial cells).
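The within-cell-type factorization step can be sketched with scikit-learn's NMF: a nonnegative expression matrix is decomposed into per-sample loadings on a small number of latent programs, each a candidate "cell state." Matrix sizes and the number of states below are arbitrary illustrations, not EcoTyper's configuration.

```python
# NMF sketch: recover latent "cell state" programs from a nonnegative
# synthetic expression matrix built from 3 hidden programs.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(4)
true_W = rng.gamma(2.0, 1.0, (100, 3))            # hidden per-sample loadings
true_H = rng.gamma(2.0, 1.0, (3, 50))             # hidden gene programs
V = true_W @ true_H + rng.gamma(1.0, 0.1, (100, 50))   # 100 samples x 50 genes

nmf = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(V)    # per-sample loadings on each putative cell state
H = nmf.components_         # per-state gene programs
states = W.argmax(axis=1)   # assign each sample to its dominant state
```

The nonnegativity constraint is what makes the factors interpretable as additive programs, which is why NMF is a common choice for cell-state discovery over, say, PCA.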
Pan-cancer biomarker analysis: Carreras et al. conducted a pan-cancer analysis on 233 DLBCL patients treated with R-CHOP therapy, integrating gene expression and proteomic immunohistochemical data. Using a suite of ML techniques and two neural networks, they selected 25 genes from 54,614 gene probes to precisely predict prognosis for 100 DLBCL cases. The top three algorithms by accuracy were XGBoost tree (100%), random forest (98.3%), and random trees (97.1%). Immune profiling enriched markers associated with tumor-associated macrophages, T lymphocytes, and regulatory T lymphocytes, identifying potential therapeutic targets through immune cell inhibitors.
Follicular lymphoma transcriptional states: Krull et al. analyzed 87 newly diagnosed or untreated FL biopsies using NMF and the cophenetic correlation coefficient for B-cell gene clustering. By combining transcriptomic, genome sequencing, and TME analysis, they identified three distinct transcriptional states in FL B cells: inflammatory, proliferative, and chromatin remodeling. Genomic and TME profiling revealed enrichment with immune evasion and high T-cell infiltration, and linked cell-of-origin to a prior germinal center B phenotype.
Drug target discovery in DLBCL subtypes: Yeh et al. constructed comprehensive genetic and epigenetic networks for ABC and GCB DLBCL subtypes through extensive data mining. A deep neural network was trained on known drug-target interactions, learning complex relationships between drug and target features encoded as numerical vectors. The model projected drug candidates likely to interact with discovered biomarkers by considering drug-target interaction, drug regulation, and drug toxicity. The authors identified five pathogenic biomarkers and proposed four FDA drug candidates specifically tailored to the molecular pathways unique to ABC and GCB subtypes.
LymForest-25 prognostic model: Mosquera Orgueira et al. developed LymForest-25, a model integrating transcriptome and clinical information using a random forest algorithm (combining multiple decision trees). Trained on R-CHOP-treated patients, LymForest-25 predicted survival outcomes for patients receiving bortezomib plus R-CHOP. The model showed that incorporating bortezomib into R-CHOP may reduce the risk of death or disease progression by 30% in high-molecular-risk DLBCL patients, demonstrating how multimodal AI can evaluate treatment response using pre-existing clinical data.
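The model's basic shape, a random forest over 25 mixed clinical and transcriptome features, can be sketched as follows. All features, the synthetic outcome, and the forest settings are illustrative assumptions; this is not the published LymForest-25 model.

```python
# Sketch of a LymForest-25-style random forest over 25 combined features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(6)
n = 400
clinical = rng.normal(0, 1, (n, 3))      # e.g., standardized age, stage, LDH
expression = rng.normal(0, 1, (n, 22))   # 22 transcript-level features
X = np.hstack([clinical, expression])    # 25 features total, echoing the name

# Synthetic outcome driven by one clinical and one expression feature.
y = (X[:, 0] + X[:, 3] + rng.normal(0, 1, n) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = forest.feature_importances_   # which features drive predictions
```

Feature importances are a practical reason to reach for forests in this setting: they give a quick, if rough, ranking of which clinical and molecular inputs the ensemble actually uses.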
CAR T-cell response prediction: Tong et al. designed a non-invasive AI method using three clinical imaging modalities (CT, low-dose CT, and 18F-FDG-PET) from FL and DLBCL patients treated with CD19 CAR T-cells. A deep learning-based image analysis model combined with rule-based reasoning predicted lesion-level and patient-level response. In a testing cohort of 26 DLBCL patients (10 responders, 16 non-responders), the model demonstrated promising accuracy surpassing the traditional International Prognostic Index (IPI).
Digital pathology for DLBCL: Hong Lee et al. used clinical data and digital pathology from 216 DLBCL patients treated with chemotherapy and immunochemotherapy. They extracted features from tissue slides using self-supervised contrastive learning with the DINO (self-distillation with no labels) method, then employed a dual-stream multiple-instance learning network with attention-based pooling. The multimodal predictive model was strongly associated with IPI and predicted both immunotherapy response and relapse-free survival without requiring labeled histopathology annotations.
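Attention-based pooling, the aggregation step in such multiple-instance networks, reduces to a softmax-weighted average over tile embeddings. The sketch below uses random weights as stand-ins for trained parameters, so it shows only the mechanics, not a trained model.

```python
# Attention-based multiple-instance pooling: a slide is a "bag" of tile
# embeddings, weighted by a learned attention vector before pooling.
import numpy as np

rng = np.random.default_rng(5)
tiles = rng.normal(0, 1, (64, 128))   # 64 tile embeddings, 128-D each
w_att = rng.normal(0, 0.1, 128)       # attention scoring vector (untrained stand-in)

scores = tiles @ w_att                # one relevance score per tile
att = np.exp(scores - scores.max())
att /= att.sum()                      # softmax over tiles in the bag
slide_embedding = att @ tiles         # attention-weighted slide-level vector
```

Because the attention weights sum to one, the same mechanism also yields a per-tile heatmap "for free," indicating which regions of the slide drove the slide-level prediction.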
NCT04154228: This prospective study is developing an AI-based tool to measure early treatment response and differentiate benign from tumorous masses in lymphoma patients. The approach collects clinical records and FDG PET/CT and PET/MR scans at three time points: before initial therapy, after 2-3 treatment cycles, and at end of treatment. By analyzing imaging changes over time, the tool aims to stratify lymphoma patients and identify high-risk cases earlier than conventional assessment methods.
SYNERGY-AI Registry (NCT03452774): This international, prospective, observational cohort study evaluates AI's clinical utility in precision oncology. It is testing ML-based tools to enhance clinical trial enrollment (CTE) for patients with advanced cancers, including B-NHLs. The study collects EHRs and genetic mutation data from adult and pediatric patients with advanced solid and hematological malignancies. Over 36 months (24 months enrollment, 12 months data collection), the study measures time to CTE initiation, progression-free survival, and overall survival. ML algorithms match patient clinical profiles with ongoing trials to maximize enrollment in appropriate studies.
NCT06241092: This investigation explores the molecular heterogeneity of papillary thyroid carcinoma using AI-based multimodal analysis of genomic, transcriptomic, and pathological images to predict lymph node metastasis and survival. Using deep learning methods, the authors generated heatmaps highlighting high-risk tumor regions with significant accuracy for predicting disease-free survival at 1, 3, and 5 years. The study also demonstrated that lymph node metastasis is associated with macrophages, cancer-associated fibroblasts, and T cells, illustrating how multimodal AI can identify multiple biomarkers to enhance prognosis.
Data quality and availability: Variations in sample collection, storage, and processing affect omics data reliability. Differences in sequencing platforms lead to sensitivity and specificity variations. Inconsistent preprocessing and normalization methods across studies make it difficult to compare outcomes. For imaging, lack of standardization between laboratories means ML/DL algorithms trained at one center may not generalize to data from another. Patient movement, image acquisition settings, and post-processing techniques all impact image quality and model interpretability.
AI methods and interpretability: Deep learning algorithms function as "black boxes," making it difficult to understand how they arrive at conclusions. Biases in training data may result in less accurate predictions for specific populations or conditions. There is no universal best fusion strategy for B-NHLs; the optimal approach depends on the clinical question, the dataset, and available expertise. Discrepancies in temporal or spatial alignment of data from different modalities create additional integration challenges.
Clinical implementation barriers: As of March 2024, the FDA reports 882 AI/ML-enabled medical devices approved in the US, with radiology representing 80%. However, no regulatory frameworks exist specifically for assessing multimodal AI algorithms for clinical use. Financial barriers include substantial investment for testing, validation, regulation, and training healthcare personnel. Ethical concerns include patient privacy, data security, liability, and the need for patient consent and transparency about AI's role in their care.
Future directions: The review highlights three promising areas. Cancer patient digital twins (CPDTs) would create virtual replicas of a patient's cancer that dynamically evolve with tumor progression and therapy response, enabling real-time treatment optimization. Virtual clinical trials could overcome the cost and enrollment limitations of randomized trials by using ML to generate digitized cohorts from retrospective data, improving phenotyping for underrepresented populations. Healthcare system integration would automate repetitive tasks like data entry, appointment scheduling, and patient triage, allowing professionals to focus more on direct patient care while AI-based monitoring systems serve as virtual assistants for early detection and proactive intervention.