AI Predicted Survival and Classified B-Cell Neoplasms

Plain-English Explanations

Overview

Pages 1-3

What This Study Does and Why It Matters

This 2022 study by Carreras, Roncador, and Hamoudi integrates multiple previous analyses to show how artificial intelligence (AI) can predict overall survival and classify subtypes of mature B-cell neoplasms, which are the most common forms of non-Hodgkin lymphoma (NHL). The neoplasms studied include chronic lymphocytic leukemia (CLL), mantle cell lymphoma (MCL), follicular lymphoma (FL), Burkitt lymphoma, diffuse large B-cell lymphoma (DLBCL), marginal zone lymphoma, and multiple myeloma. Acute myeloid leukemia and pan-cancer series were analyzed as well.

The AI methods used span two major categories. The first is machine learning, with techniques including C5 tree, Bayesian network, C&R tree, CHAID tree, discriminant analysis, KNN, logistic regression, LSVM, Quest tree, random forest, random trees, SVM, tree-AS, XGBoost linear, and XGBoost tree. The second is artificial neural networks, including the multilayer perceptron (MLP) and radial basis function (RBF) networks. These were applied to gene expression (transcriptomic) data and protein-level immunohistochemistry data.

The research focused on immuno-oncology and immune checkpoint markers, which are molecules that cancer cells exploit to evade the immune system. The study also adds a new analysis of tumor-associated macrophages (TAMs), including 3D rendering of their physical structures within the tumor microenvironment (TME). The key finding is that AI using these immune markers is a powerful predictive tool for both survival outcomes and disease classification across multiple lymphoma subtypes.

TL;DR: This study applies 16 machine learning and neural network models to gene expression and immunohistochemistry data from multiple NHL subtypes (DLBCL, FL, MCL, CLL, Burkitt, and others). Using immuno-oncology and immune checkpoint panels, AI accurately predicts overall survival and classifies lymphoma subtypes.

Methods

Pages 3-7

Neural Network Architecture, Machine Learning Pipeline, and Validation Strategy

Multilayer perceptron design: The primary neural network architecture was the multilayer perceptron (MLP), a feed-forward network where connections flow from the input layer (gene predictors) through a hidden layer to the output layer (overall survival as dead vs. alive). The activation function for the hidden layer was the hyperbolic tangent, and the output layer used softmax for categorical classification. Cases were randomly split into training (70%) and testing (30%) datasets. Network performance was assessed using ROC curves, cumulative gains charts, lift charts, and sensitivity analysis that ranked gene importance.
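The design described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the software used in the study: the function names, the toy weights, and the three-gene input are all hypothetical, and no training step is shown. It only demonstrates the 70/30 random split and a forward pass through a hyperbolic tangent hidden layer into a softmax output.

```python
import math
import random

def split_train_test(cases, train_fraction=0.7, seed=0):
    """Randomly split cases into training and testing lists."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """One forward pass: input -> tanh hidden layer -> softmax output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    scores = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(output_weights, output_biases)
    ]
    return softmax(scores)

# Hypothetical toy network: 3 gene predictors, 2 hidden nodes, 2 outputs (dead, alive).
rng = random.Random(42)
hidden_weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
hidden_biases = [0.0, 0.0]
output_weights = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
output_biases = [0.0, 0.0]

probabilities = mlp_forward([0.5, -1.2, 0.3], hidden_weights, hidden_biases,
                            output_weights, output_biases)
train_cases, test_cases = split_train_test(range(100))
```

Here `probabilities` holds one value per outcome class, and `train_cases` and `test_cases` hold 70 and 30 case indices respectively.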

Complementary analyses: Beyond neural networks, the study employed Gene Set Enrichment Analysis (GSEA) to confirm whether highlighted gene sets were enriched in specific biological states (e.g., dead vs. alive). Cox regression was used to create risk scores by multiplying beta coefficients by gene expression values, producing high-risk and low-risk patient groups. Differential gene expression used the GEO2R software with the Benjamini-Hochberg false discovery rate correction.
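The Benjamini-Hochberg correction mentioned above adjusts each p-value according to its rank among all tested genes. The sketch below is a plain Python illustration of the standard procedure, not GEO2R's own code; the function name and the example p-values are made up for demonstration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical p-values for four genes.
example_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(example_p)
```

A gene is then called differentially expressed when its adjusted p-value falls below the chosen false discovery rate threshold.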

Immunohistochemical validation: Markers identified by AI were validated at the protein level using immunohistochemistry on independent case series from Tokai University Hospital, ranging from 97 to 293 cases depending on the project. Digital image quantification used both conventional and AI-based methods, the latter via the Waikato Environment for Knowledge Analysis (Weka) with a random forest as the trained classifier. A comprehensive panel of 37 primary antibodies was used, targeting apoptosis (BCL2, caspase-3, caspase-8, PARP), germinal center markers (BCL6, CD10, LMO2), macrophage subtypes (CD68, CD16, CD163, MARCO, CSF1R, MITF), immune checkpoint molecules (PD-L1), and signaling pathways (STAT3, NFKB, MAPK).

Datasets: The study used publicly available datasets from the Gene Expression Omnibus (GEO) repository plus Tokai University Hospital's own transcriptomic and proteomic datasets. These covered 290 NHL cases, 414 DLBCL cases, 180 FL cases, 123 MCL cases, 308 CLL cases, 559 multiple myeloma cases, and 149 acute myeloid leukemia cases, among others.

TL;DR: The MLP neural network used hyperbolic tangent activation with 70/30 train/test splits, complemented by GSEA and Cox regression. Markers were validated by immunohistochemistry in independent Tokai University cohorts (97 to 293 cases). Datasets covered over 2,000 hematological neoplasia cases from GEO and institutional repositories.

NHL Classification

Pages 7-9

Neural Network Classification of NHL Subtypes with High Accuracy

Pan-cancer transcriptome panel: Using the entire array of 20,863 genes and a cancer transcriptome panel, a multilayer perceptron predicted the different non-Hodgkin lymphoma subtypes with high accuracy. The network architecture had 1,769 nodes in the input layer, a hidden layer of 16 nodes, and an output layer with 5 nodes representing follicular lymphoma, mantle cell lymphoma, DLBCL, Burkitt lymphoma, and marginal zone lymphoma. The ROC curve showed an area under the curve near 1.0, indicating excellent classification performance.
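To give a sense of the size of this network, the number of trainable parameters in a fully connected 1,769-16-5 architecture can be counted directly. The sketch assumes one bias term per hidden and output node, which the source does not state; the function name is hypothetical.

```python
def count_mlp_parameters(layer_sizes, with_bias=True):
    """Count weights (and optional biases) in a fully connected feed-forward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Each output node has one weight per input node, plus one bias if assumed.
        total += (n_in + (1 if with_bias else 0)) * n_out
    return total

# Input, hidden, and output layer sizes reported for the NHL classifier.
weights_and_biases = count_mlp_parameters([1769, 16, 5])
```

Under the bias assumption this gives 28,405 parameters, nearly all of them in the connections between the input and hidden layers.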

Top predictive genes: The sensitivity analysis ranked all genes by their normalized importance for predicting lymphoma subtype. The most relevant gene was ARG1, followed by MAGEA3, AKT2, and IL1B. Remarkably, the top 30 genes from this neural network not only classified lymphoma subtypes but also predicted the overall survival of an independent pan-cancer series of 7,441 cases from The Cancer Genome Atlas (TCGA).

Cross-disease generalizability: Using a risk score formula derived from Cox regression (risk score = sum of beta coefficients multiplied by gene expression values), patients in each series were stratified into high-risk and low-risk groups with statistically significant survival differences. The fact that these 30 genes from a pan-cancer transcriptome panel predicted survival across multiple cancer types suggests that common cancer mechanisms exist across all human neoplasia, not just lymphoma.
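The risk score formula can be illustrated with a short example. The gene names, beta coefficients, expression values, and the median cutoff below are all assumptions for demonstration; the source states the formula but not the cutoff used to separate the groups.

```python
from statistics import median

def risk_score(betas, expression):
    """Sum of Cox beta coefficients multiplied by gene expression values."""
    return sum(betas[gene] * expression[gene] for gene in betas)

def stratify(scores):
    """Label each patient high or low risk relative to the median score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}

# Hypothetical coefficients and expression values for two placeholder genes.
betas = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "P2": {"GENE_A": 0.5, "GENE_B": 2.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 1.0},
    "P4": {"GENE_A": 3.0, "GENE_B": 0.0},
}

scores = {pid: risk_score(betas, values) for pid, values in patients.items()}
groups = stratify(scores)
```

A positive beta raises the score as expression rises (a poor-prognosis gene), while a negative beta lowers it (a good-prognosis gene).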

TL;DR: An MLP with 1,769 input nodes classified five NHL subtypes with near-perfect ROC performance. The top 30 genes (led by ARG1, MAGEA3, AKT2, IL1B) also predicted overall survival in a 7,441-case pan-cancer TCGA series, suggesting shared cancer mechanisms across tumor types.

Follicular Lymphoma

Pages 9-12

Predicting FL Survival and Linking Prognosis to the Immune Microenvironment

Combined neural network algorithm: For follicular lymphoma, an algorithm combining multilayer perceptron (MLP) and radial basis function (RBF) neural networks predicted overall survival alongside clinically relevant variables: age greater than 60 years, extranodal sites greater than 1, LDH-level ratio greater than 1, stage greater than 2, IPI score 2-3, translocation (14;18) status, immune response ratio, and survival timepoints. From an initial set of 22,215 genes, dimensionality reduction highlighted 43 genes: 18 associated with poor prognosis and 25 with good prognosis.

Immune microenvironment correlation: The prognostic genes were correlated with the immune microenvironment, specifically M2-like tumor-associated macrophages (TAMs). GSEA confirmed these associations. The identified genes were also correlated with immuno-oncology markers including CD163, CSF1R, FOXP3, PDCD1 (PD-1), TNFRSF14 (HVEM), and IL10.

Random number generator strategy: A separate analysis used a random number generator to create 120 independent MLP solutions, ranking all 22,215 gene probes by averaged normalized importance for predicting overall survival. The final neural network with 17 genes achieved an AUC of 0.89. A comparison with other machine learning techniques (Bayesian network, C&R tree, C5 tree, CHAID, KNN, logistic regression, LSVM, random forest, SVM, XGBoost linear, XGBoost tree) was also performed.
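The idea of averaging normalized importance over many independent solutions can be sketched as follows. This assumes normalized importance means each value divided by the largest value in that run, expressed as a percentage; the probe names and raw importance values are hypothetical, and only two runs are shown instead of 120.

```python
def normalize_importance(raw):
    """Scale importances so the most important predictor equals 100%."""
    top = max(raw.values())
    return {gene: 100.0 * value / top for gene, value in raw.items()}

def average_importance(runs):
    """Average normalized importance across runs and rank predictors, highest first."""
    normalized = [normalize_importance(run) for run in runs]
    genes = runs[0].keys()
    averaged = {g: sum(n[g] for n in normalized) / len(normalized) for g in genes}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)

# Hypothetical raw importances from two independently seeded networks.
runs = [
    {"PROBE_1": 0.4, "PROBE_2": 0.2, "PROBE_3": 0.1},
    {"PROBE_1": 0.3, "PROBE_2": 0.3, "PROBE_3": 0.15},
]
ranking = average_importance(runs)
```

Averaging over many seeds reduces the chance that a gene ranks highly only because of one lucky random initialization or data split.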

3D macrophage analysis: Three-dimensional (3D) confocal microscopy analysis of TAMs in follicular lymphoma showed that progression from low grade to high grade and transformation to DLBCL were associated with increased numbers of TAMs that created a physical network-like structure. This finding suggests that TAMs may actively contribute to disease pathogenesis through their spatial organization within the tumor microenvironment.

TL;DR: Combined MLP and RBF neural networks reduced 22,215 genes to 43 prognostic markers for FL, linked to M2-like TAMs and immune checkpoint markers (CD163, PD-1, FOXP3). A random number generator strategy with 120 MLP solutions yielded a 17-gene model with AUC 0.89. 3D confocal imaging revealed that TAMs form network-like structures during FL transformation to DLBCL.

Mantle Cell Lymphoma

Pages 12-14

MCL Survival Prediction Using Immuno-Oncology Gene Panels

Two algorithmic approaches: For mantle cell lymphoma (MCL), two methods were designed. Method 1 used 20,862 genes as input to predict overall survival outcome (dead vs. alive) and other prognostic markers. Through dimensionality reduction, a final set of 19 genes was highlighted. Method 2 used several specific gene panels to predict overall survival, identifying 125 pan-cancer and immuno-oncology genes.

Pathway associations: The highlighted genes were related to the cell cycle, apoptosis, and metabolism. The genes not only predicted the survival of MCL but also of DLBCL and a large pan-cancer series from the TCGA. A neural network algorithm combining 10 oncology and immuno-oncology panels predicted overall survival. The association with the patients' overall survival was confirmed by GSEA and conventional survival analysis with log-rank test.
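The log-rank test used in conventional survival analysis compares observed and expected deaths between two groups at every event time. The sketch below is a minimal two-group version in plain Python, assuming right-censored data given as (time, event) pairs; the function name and the toy follow-up data are hypothetical.

```python
import math

def logrank_test(group_a, group_b):
    """Two-group log-rank test. Each case is (time, event); event=1 means death observed."""
    event_times = sorted({t for t, e in group_a + group_b if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for time, _ in group_a if time >= t)
        at_risk_b = sum(1 for time, _ in group_b if time >= t)
        deaths_a = sum(1 for time, e in group_a if time == t and e == 1)
        deaths_b = sum(1 for time, e in group_b if time == t and e == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A at this time.
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("groups never overlap at risk; the test is undefined")
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

# Hypothetical follow-up times (months) for high-risk and low-risk patients.
high_risk = [(2, 1), (3, 1), (4, 1), (5, 1), (6, 0)]
low_risk = [(8, 1), (10, 0), (12, 1), (14, 0), (15, 0)]
chi_square, p_value = logrank_test(high_risk, low_risk)
```

A small p-value indicates that the survival curves of the two groups differ more than would be expected by chance.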

MCL35 correlation: The analysis also included a correlation with the MCL35 proliferation assay, which was created by the Lymphoma/Leukemia Molecular Profiling Project. This assay is a validated prognostic tool for MCL based on proliferation gene signatures. The neural network results were complemented by multiple machine learning techniques and validated against this established prognostic standard.

TL;DR: Two MLP-based algorithms for MCL identified 19 prognostic genes (Method 1) and 125 immuno-oncology genes (Method 2) from over 20,000 gene inputs. The genes predicted MCL survival and generalized to DLBCL and pan-cancer TCGA series. Results were validated against the MCL35 proliferation assay.

DLBCL: Key Markers

Pages 15-20

DLBCL Survival Prediction: ENO3, MYC, BCL2, TNFAIP8, and Caspase-8

The 25-gene survival predictor: A multilayer perceptron analysis of 100 DLBCL cases using 54,614 gene probes highlighted 25 genes with prognostic value. Among these, ENO3 (a metabolism gene), MYC (a proto-oncogene), and BCL2 (an anti-apoptosis gene) were the most important. High expression of all three was associated with worse outcome. Notably, these three genes alone were sufficient to predict patient survival. The 25-gene set also predicted prognosis across other hematological neoplasias: CLL (n = 308), MCL (n = 92), FL (n = 180), multiple myeloma (n = 559), and acute myeloid leukemia (n = 149), all with p less than 0.001.

TNFAIP8 as a novel prognostic marker: TNFAIP8, an anti-apoptosis protein, was highlighted and validated by immunohistochemistry in 97 DLBCL cases from Tokai University. High TNFAIP8 protein expression correlated with poor overall survival, and in multivariate Cox regression analysis against the International Prognostic Index (IPI), only TNFAIP8 retained independent prognostic value (HR = 3.5, p = 0.040). TNFAIP8 also positively correlated with high M2-like CD163-positive TAMs, non-GCB cell-of-origin phenotype, and moderately with MYC (Spearman 0.389) and Ki67 (Spearman 0.48).
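Spearman correlation, used above to relate TNFAIP8 to MYC and Ki67, is the Pearson correlation computed on ranks rather than raw values. A minimal sketch in plain Python follows; the function names are hypothetical and the example values are not study data.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical marker percentages for five cases.
rho = spearman([10, 25, 40, 55, 70], [5, 30, 20, 60, 80])
```

Because it works on ranks, the coefficient captures any monotonic relationship between two markers, not only a linear one.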

Caspase-8 and the apoptotic pathway: High caspase-8 (which is inhibited by TNFAIP8) protein expression correlated with favorable overall and progression-free survival (p = 0.005) in 97 DLBCL cases. Caspase-8 was correlated with its pathway markers including BCL2, caspase-3, CDK6, cleaved PARP, E2F1, Ki67, LMO2, MDM2, MYB, MYC, TNFAIP8, and TP53. The caspase-8 protein expression was successfully modeled using several machine learning techniques and neural networks.

TL;DR: An MLP identified 25 prognostic genes for DLBCL, with ENO3, MYC, and BCL2 as the top three. In multivariate Cox analysis against the IPI, only TNFAIP8 retained independent prognostic value (HR = 3.5, p = 0.040). High caspase-8 predicted favorable survival (p = 0.005). The 25-gene set generalized across CLL, MCL, FL, myeloma, and AML (all p < 0.001).

DLBCL: Immune Checkpoint

Pages 20-25

PD-L1, IKAROS, CSF1R, and the SIRPA/CD47 Immune Checkpoint Axis

PD-L1 and IKAROS: An algorithm combining MLP, RBF, GSEA, Cox regression, and multiple machine learning techniques (including Bayesian network, C5.0, CHAID, C&R tree, logistic regression, SVM, and others) predicted overall survival of 414 DLBCL cases using 54,613 gene probes. The association of PD-L1 (CD274) and IKAROS with overall survival was validated by immunohistochemistry in 113 independent cases. High PD-L1 correlated with poor survival, while high IKAROS was associated with favorable survival. AI-based digital quantification correlated well with conventional methods.

CSF1R and tumor-associated macrophages: In 198 DLBCL cases, high CSF1R-positive TAMs were associated with poor progression-free survival, while a less frequent pattern of CSF1R-positive B-cells (around 30% of cases) correlated with favorable outcome. A neural network predicted CSF1R protein expression using 10 related markers: CSF1, STAT3, NFKB1, Ki67, MYC, PD-L1, TNFAIP8, IKAROS, CD163, and CD68. The authors suggested that a CSF1R inhibitor such as Pexidartinib could potentially benefit patients with the CSF1R-positive TAM pattern.

SIRPA/CD47 axis: The immunohistochemical pattern of CSF1R-positive TAMs suggested a relationship with SIRPA, a marker expressed by macrophages that mediates negative regulation of phagocytosis. SIRPA displayed a TAM pattern similar to PD-L1, CD85A, and MARCO. The ligand for SIRPA is CD47, which showed a B-lymphocyte pattern. In the LLMPP series of R-CHOP-treated patients, high CD47 combined with low SIRPA correlated with poor overall survival, and SIRPA positively correlated with CSF1R. SIRPA and CD47 belong to the immune checkpoint pathway and together mediate negative regulation of phagocytosis, making the axis a potential therapeutic target.

TL;DR: High PD-L1 predicted poor DLBCL survival while high IKAROS was favorable, validated in 113 cases. CSF1R-positive TAMs predicted poor outcome in 198 cases, with Pexidartinib suggested as a therapeutic option. The SIRPA/CD47 immune checkpoint axis was identified as a phagocytosis-regulating prognostic target in DLBCL.

Macrophage Pathway

Pages 25-30

M2c-like Macrophages, Immuno-Oncology Panel Performance, and Model Comparison

Macrophage pathway in DLBCL: Gene expression profiling of 233 DLBCL patients (GSE10846 dataset, R-CHOP treated) was analyzed using CD163 expression as the primary marker. High CD163 (an M2-like macrophage marker) was associated with poor prognosis. A protein-protein functional network using six macrophage and Treg markers (CD68, CD16, CD163, PTX3, MITF, and FOXP3) generated a 57-marker pathway. GSEA confirmed this pathway was enriched in the high-risk phenotype (NOM p-val less than 0.001, FDR q-val less than 0.001), with IL10 identified at fifth position. FOXP3 was notably outside the enrichment set.

Immunohistochemical validation of macrophage subtypes: In the Tokai University series (n = 132), the MLP predicted overall survival using macrophage markers. The normalized importance ranking was: PTX3 Total (100%), IL10 (95.9%), FOXP3 (48.9%), MITF (35.8%), and CD163 (6.3%). PTX3 and IL10 characterize immune regulatory M2c-like macrophages, confirming that this specific macrophage subtype is most relevant to DLBCL prognosis. A 730-gene immuno-oncology panel predicted overall survival and cell-of-origin phenotype (Lymph2Cx assay) for 106 DLBCL cases, achieving 0.89 performance for survival and 0.99 for cell-of-origin classification.

Comprehensive model comparison: Using 25 immuno-oncology gene predictors, 16 machine learning models were compared. XGBoost Tree achieved 100% accuracy, random forest reached 98.3%, random trees 97.1%, Bayesian network 89.3%, SVM 84.5%, KNN 81.9%, CHAID 79.8% (using only 6 predictors), LSVM 78.5%, logistic regression 78.1%, C5 tree 75.9% (3 predictors), tree-AS 74.3% (2 predictors), XGBoost Linear 74.3%, Quest 74.3%, C&R tree 74.3%, neural network 74.3%, and discriminant analysis 72.9%. The results confirmed that the choice of optimal model depends on data type, number of cases, and number of variables.

TL;DR: M2c-like TAMs (marked by PTX3 and IL10) were the strongest macrophage predictors of DLBCL survival. XGBoost Tree achieved 100% accuracy and random forest 98.3% among 16 models tested with 25 immuno-oncology genes. A 730-gene panel classified DLBCL cell-of-origin at 0.99 performance.

Discussion and Literature Review

Pages 31-39

Therapeutic Implications, Literature Context, and Future Directions

Immuno-oncology markers as therapeutic targets: The study highlights a comprehensive set of immuno-oncology markers as potential therapeutic targets. These include TAM markers (CD163, CSF1R, CSF1, SIRPA, MARCO, IL10, PD-L1), T-cell markers (PD-1, BTLA, FOXP3), apoptosis regulators (BCL2, caspase-3, caspase-8, PARP, TNFAIP8), signaling pathway markers (STAT3, NFKB, MAPK, IKAROS), and metabolic markers (ENO3, GGA3). Many of these can be targeted using inhibitors, and in DLBCL specifically, immunomodulatory drugs and immune checkpoint inhibitors represent a promising field beyond classical R-CHOP therapy.

Cross-disease applicability: Some identified markers were also relevant for nonhematological neoplasia prognosis, suggesting common pathogenic mechanisms across all types of neoplasia. The 25-gene set from the DLBCL analysis was tested on breast cancer (1,215 TCGA patients) using the same 16 models. Only random forest achieved suitable modeling (98.4% accuracy) for breast cancer, with PD-L1 (CD274) ranked as the most important predictor, followed by FOXP3 and ENO3. The MLP failed to properly predict breast cancer survival (AUC = 0.61), highlighting that model performance is data-dependent.
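To put an AUC of 0.61 in context, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, so 0.5 corresponds to chance and 1.0 to perfect ranking. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are hypothetical and not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical outcomes (1 = dead, 0 = alive) and predicted probabilities of death.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this toy example eight of the nine positive/negative pairs are ordered correctly, giving an AUC of about 0.89; a model at 0.61 orders only slightly more pairs correctly than a coin flip would.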

Literature review findings: A systematic review of PubMed literature on AI in hematopathology cataloged recent advances across five input data types: PET/CT scan-based AI (e.g., CNN-based DLBCL staging, MCL relapse prediction with 64-70% accuracy), histological image-based AI (e.g., 17 CNN types achieving 99.7-100% DLBCL classification), immunophenotype-based AI (e.g., deep learning on multiparameter flow cytometry with F1 score 0.94 across 18,274 cases), clinicopathological variable-based AI, and gene expression/mutational analysis-based AI. The Schmitz et al. 2018 NEJM study using random forest on 574 DLBCL cases with exome and transcriptome sequencing was highlighted as a landmark work.

Practical considerations: Each machine learning method has specific strengths and weaknesses. Decision trees have difficulty handling large variable sets. Bayesian networks provide acceptable but not superior results. Logistic regression typically achieves high accuracy with many variables. The most practical strategy is to test all available methods and select those that perform best for a given dataset. The study used "basic" but robust neural network architectures as building blocks for more complex, multivariate algorithms, making the analysis comparable to classical multivariate analysis while leveraging the flexibility of AI.

TL;DR: The identified immuno-oncology markers (PD-L1, PD-1, CSF1R, SIRPA/CD47, IL10, TNFAIP8) represent therapeutic targets beyond R-CHOP. The 25-gene set generalized to breast cancer with random forest (98.4% accuracy), though MLP failed. Literature review shows AI in hematopathology spans PET/CT, histology, flow cytometry, and genomics, with CNN-based models achieving up to 100% classification accuracy.