Artificial Intelligence in Lung Cancer Screening: The Future Is Now


Plain-English Explanations
Pages 1-3
Why Lung Cancer Screening Needs AI Now

Lung cancer is the leading cause of malignancy-related deaths worldwide, with 1.8 million new diagnoses and 1.6 million deaths per year. The net five-year survival rate sits at just 13.8%, and a staggering 75% of cases are discovered only at advanced stages when treatment options are limited. In Italy alone, lung cancer accounts for 20% of all cancer deaths and is the third most prevalent neoplasm (11% of all cancers as of 2018). The National Lung Screening Trial (NLST) demonstrated a statistically significant 20% reduction in lung cancer mortality among high-risk adults who received three consecutive annual low-dose computed tomography (LDCT) screenings versus those who received chest X-rays. The European NELSON trial confirmed these findings, and multiple randomized controlled trials now support that LDCT screening reduces lung cancer mortality by 20-30%, with particularly strong benefits in women.

The screening landscape: Currently, only the United States and China offer nationwide lung cancer screening programs. The Netherlands and the UK offer screening only in select locations, and Europe has not yet established a shared program despite published guidelines. The European Commission recommends screening for current and former smokers (quit within 15 years), aged 50-75, with a 30-pack-year smoking history. The American Cancer Society guideline supports yearly LDCT for adults aged 50-80 with a 20-pack-year history who currently smoke or have quit within 15 years. Italian screening trials, including MILD, ITALUNG, and DANTE, have produced promising results, and non-randomized studies (COSMOS and BioMILD) have shown the value of integrating blood biomarkers with LDCT every 3 years.

The radiologist bottleneck: Even with LDCT screening established as effective, the practical challenge remains enormous. Radiologists face a high workload evaluating, characterizing, and detecting lung nodules across large screening populations. Given that 90% of CT screening results are negative, the labor of sifting through normal scans is massive. The rising demand for medical imaging has limited radiologists' ability to interpret lung cancer screening tests in many countries. This creates a natural opening for AI-based tools that can automate detection, reduce inter-observer variability, and maintain low false-positive rates while increasing screening sensitivity.

This narrative review by Cellina et al. (2023, published in Cancers) aims to provide a comprehensive overview of all possible AI applications in lung cancer screening. The authors searched PubMed and Google Scholar using terms combining "lung cancer screening," "lung nodule detection," "lung nodule characterization," and "low-dose chest CT reconstruction" with "artificial intelligence," "deep learning," or "machine learning." The paper covers the full screening workflow: image reconstruction, personalized screening programs, computer-aided detection (CAD) systems, nodule segmentation, and nodule characterization including radiomics and virtual biopsy.

TL;DR: Lung cancer kills 1.6 million per year with only 13.8% five-year survival. LDCT screening reduces mortality by 20-30%, but 75% of cases are still caught late. Only the US and China offer nationwide screening. This 2023 narrative review covers all AI applications across the lung cancer screening workflow, from image reconstruction to virtual biopsy.
Pages 3-5
Machine Learning, Deep Learning, and CNNs in Radiology

The paper provides a structured overview of AI terminology relevant to lung cancer screening. Machine learning (ML), a term rooted in the convergence of statistics and computer science, is divided into two primary categories. Supervised learning uses labeled training data (input paired with correct labels) to train models that predict or classify new data. In a medical context, this means training a model on images labeled with the presence or absence of a condition, enabling it to generalize to unseen cases. Unsupervised learning uses unlabeled data to discover hidden patterns, structures, or clusters without specific prediction objectives. For example, unsupervised clustering can uncover natural groupings within patient data, aiding identification of disease subtypes or patient cohorts.
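A toy sketch of the two categories, using plain NumPy rather than any of the frameworks the paper discusses (the data and models here are invented for illustration): a nearest-centroid classifier stands in for supervised learning on labeled examples, and a two-cluster k-means loop stands in for unsupervised pattern discovery on unlabeled ones.

```python
import numpy as np

# --- Supervised: nearest-centroid classifier on labeled 2-D points ---
# Toy stand-in for "images labeled with presence/absence of a condition".
X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_train = np.array([0, 0, 1, 1])          # 0 = "no finding", 1 = "finding"
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    """Assign the label of the nearest class centroid."""
    return int(np.argmin(np.linalg.norm(centroids - np.asarray(x), axis=1)))

# --- Unsupervised: 2-means clustering on unlabeled points ---
# No labels provided; the algorithm discovers the two natural groupings.
X_unlabeled = np.array([[0.0, 0.1], [0.15, 0.05], [0.85, 0.95], [1.0, 0.9]])
means = X_unlabeled[[0, -1]].astype(float)        # crude initialisation
for _ in range(10):
    assign = np.argmin(
        np.linalg.norm(X_unlabeled[:, None] - means[None], axis=2), axis=1)
    means = np.array([X_unlabeled[assign == k].mean(axis=0) for k in (0, 1)])
```

The supervised model can now label unseen points, while the unsupervised loop has grouped the unlabeled data without ever being told what the groups mean.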

Deep learning and neural networks: Deep learning is a subset of ML that trains artificial neural networks with many layers to recognize hierarchical data representations. These deep neural networks consist of an input layer, one or more hidden layers, and an output layer. During training, the network iteratively adjusts weights and biases based on provided data. Deeper layers learn increasingly complex representations, enabling the network to capture intricate relationships. A key advantage of deep learning is its ability to automatically learn features from raw data, eliminating the need for manual feature engineering. In computer vision, deep learning models have achieved state-of-the-art performance in object detection, image classification, and image segmentation.
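The layered structure described above can be sketched as a forward pass through a tiny fully connected network. The weights below are random placeholders; real training would iteratively adjust them to minimize a loss:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# A tiny network: 4 inputs -> 8 hidden -> 8 hidden -> 2 outputs.
# Weights and biases are the parameters training would iteratively adjust.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(8, 2)), np.zeros(2)

def forward(x):
    h1 = relu(x @ W1 + b1)      # first hidden layer: simple features
    h2 = relu(h1 @ W2 + b2)     # deeper layer: compositions of those features
    logits = h2 @ W3 + b3       # output layer: one score per class
    e = np.exp(logits - logits.max())   # softmax -> class probabilities
    return e / e.sum()

probs = forward(np.array([0.5, -1.0, 0.3, 0.8]))
```

Each successive layer re-represents the previous one, which is the sense in which deeper layers capture increasingly complex relationships.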

Applications in radiology: Convolutional neural networks (CNNs) are particularly well-suited for medical imaging tasks. They can automatically classify anatomical structures, identify abnormalities, detect tumors, and accurately segment organs. ML algorithms also contribute to quality control and image enhancement by reducing noise, enhancing details, and standardizing image acquisition protocols. The authors emphasize that while ML algorithms have tremendous potential, they are designed to complement radiologists rather than replace them. Radiologists' expertise and clinical judgment remain essential for accurate diagnosis and effective patient care.
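The core CNN operation is a small kernel slid across the image. A minimal sketch, assuming nothing beyond NumPy, applies a hand-crafted vertical-edge kernel to a toy two-tone "slice"; a real CNN learns its kernels from data rather than having them written by hand:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy "scan slice": dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A vertical-edge kernel: responds where intensity jumps left-to-right.
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

response = conv2d(image, edge_kernel)   # peaks along the dark/bright boundary
```

The response map is strongest exactly at the intensity boundary, which is why stacks of such filters are effective at picking out structures like nodule edges.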

TL;DR: Supervised ML uses labeled data for prediction and classification; unsupervised ML discovers hidden patterns in unlabeled data. Deep learning trains multi-layer neural networks to learn hierarchical features automatically. CNNs are the primary architecture for medical image classification, segmentation, and nodule detection, but they complement rather than replace radiologist expertise.
Pages 5-7
Risk Stratification and Deep Learning Reconstruction for Dose Reduction

Risk-stratified screening: The paper discusses multiple risk prediction models used to determine screening eligibility and interval, including the Bach model, the Lung Cancer Risk Assessment Tool (LCRAT), the Lung Cancer Death Risk Assessment Tool (LCDRAT), the Liverpool Lung Project (LLP) model, and the PLCOm2012 model. Schreuder et al. created a model based on NLST data that could prevent 10.4% of all second-round screening exams without delaying lung cancer diagnosis. Tammemagi et al. expanded the PLCOm2012 model by incorporating reclassified Lung-RADS results from the NLST, finding that positive screening results indicated increased lung cancer risk regardless of initial risk assessment. For patients below a certain PLCOm2012 threshold, screening intervals could be extended, but notably, some individuals with high baseline risk still had elevated cancer incidence after three consecutive negative screens.
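Models like PLCOm2012 are, at heart, logistic regressions over clinical risk factors. The sketch below illustrates that shape only; the coefficients and variable set are invented placeholders, not the published PLCOm2012 values:

```python
import math

# Hypothetical coefficients -- NOT the published PLCOm2012 values.
COEF = {
    "intercept": -6.0,
    "age_per_year_over_60": 0.08,
    "pack_years_per_10": 0.30,
    "years_quit_per_5": -0.10,   # risk decays after quitting
    "family_history": 0.50,
}

def toy_risk(age, pack_years, years_quit, family_history):
    """Illustrative logistic risk model; returns a probability in [0, 1]."""
    z = (COEF["intercept"]
         + COEF["age_per_year_over_60"] * (age - 60)
         + COEF["pack_years_per_10"] * (pack_years / 10)
         + COEF["years_quit_per_5"] * (years_quit / 5)
         + COEF["family_history"] * (1 if family_history else 0))
    return 1.0 / (1.0 + math.exp(-z))

high = toy_risk(age=68, pack_years=40, years_quit=0, family_history=True)
low = toy_risk(age=55, pack_years=20, years_quit=14, family_history=False)
```

In a risk-stratified program, people above a chosen probability threshold would screen annually while those below it could have intervals extended, which is the mechanism behind results like Schreuder et al.'s 10.4% reduction in second-round exams.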

Blood biomarkers for risk stratification: Laboratory parameters such as autoantibodies, DNA fragments, microRNA, and other blood-circulating components are under investigation for their role in risk stratification. Preliminary results from the Bio-MILD study show that integrating blood microRNA with CT screening results helps determine the correct screening interval. MicroRNA profiles appear important in distinguishing symptomatic lung cancer patients from controls, and polygenic risk scores may help identify a person's absolute risk. These biomarkers are not yet validated for clinical use but represent a promising direction for personalizing screening programs.

Deep learning reconstruction (DLR): Repeated LDCT scans expose patients to cumulative radiation risk, even at low-dose protocols (typically 1-4 millisieverts per scan). Traditional reconstruction techniques like model-based iterative reconstruction (MBIR) and hybrid iterative reconstruction (HIR) reduce noise and artifacts but have limitations at very low doses. DLR represents a new approach that trains on pairs of noisy and noise-free images to extract true information. In comparative studies, DLR outperformed both MBIR and HIR in accurately measuring artificial lung nodule sizes on ultra-low-dose CT, with lower volume measurement errors and better inter-observer agreement between different readers.
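The paired-training idea behind DLR can be illustrated with a deliberately shallow analogue: fitting a linear map from noisy 3x3 patches to their clean centre pixels by least squares on one (noisy, clean) training pair. A real DLR model replaces the linear map with a deep network, but the supervision signal is the same:

```python
import numpy as np

rng = np.random.default_rng(42)

def patches(img, k=3):
    """All k x k patches of img, flattened to rows (valid region only)."""
    H, W = img.shape
    rows = [img[i:i + k, j:j + k].ravel()
            for i in range(H - k + 1) for j in range(W - k + 1)]
    return np.array(rows)

# Training pair: a smooth "clean" image and a noisy version of it.
x = np.linspace(0, 1, 32)
clean = np.outer(x, x)                        # smooth intensity gradient
noisy = clean + rng.normal(scale=0.05, size=clean.shape)

# Fit a linear map from each noisy 3x3 patch to its clean centre pixel --
# a very shallow analogue of training a denoising network on pairs.
A = patches(noisy)
b = clean[1:-1, 1:-1].ravel()                 # centre pixel of each patch
w, *_ = np.linalg.lstsq(A, b, rcond=None)

denoised = (A @ w).reshape(30, 30)
err_noisy = np.abs(noisy[1:-1, 1:-1] - clean[1:-1, 1:-1]).mean()
err_denoised = np.abs(denoised - clean[1:-1, 1:-1]).mean()
```

Even this nine-weight "model" recovers a noticeably cleaner image than the raw noisy input, because averaging over the patch suppresses noise while the smooth underlying signal survives.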

Vendor-agnostic solutions: Vendor-agnostic deep learning models can enhance image quality and reduce radiation dose across different scanner manufacturers. ClariCT.AI (ClariPI) is one such model that works in post-processing and does not require projection data. Nam et al. demonstrated that for ultra-low-dose CT, vendor-agnostic deep learning was superior to vendor-specific DLR, though the study evaluated image quality subjectively at a single radiation dose without assessing diagnostic performance. These developments are especially relevant given the high demand for chest CT during the COVID-19 pandemic.

TL;DR: Risk models like PLCOm2012 can extend screening intervals, with one NLST-based model preventing 10.4% of second-round screens without delaying diagnosis. Blood microRNA from Bio-MILD helps determine screening intervals. Deep learning reconstruction outperforms MBIR and HIR for nodule measurement on ultra-low-dose CT, with lower volume errors and better inter-reader agreement. Vendor-agnostic DL models further enable dose reduction across different scanners.
Pages 7-11
CAD Systems: Architecture, Public Datasets, and Detection Accuracy

CAD system structure: Computer-aided detection (CAD) systems consist of several general components including data collection and pre-processing, followed by lung-specific functions: lung segmentation (separating the lungs from surrounding thoracic structures), nodule detection (locating suspicious masses), false-positive reduction (separating real nodules from candidate artifacts), nodule segmentation (isolating each nodule from parenchyma), feature extraction (measuring nodule characteristics), and final classification (benign vs. malignant). CAD systems reduce observational errors and false-negative rates, provide quantitative support for biopsy decisions, and help differentiate between malignant and benign tumors.
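The stages listed above can be laid out as a pipeline skeleton. Every function here is a hypothetical stub with hard-coded toy outputs, meant only to show how the components chain together, not how any real CAD system implements them:

```python
def segment_lungs(scan):
    """Separate the lungs from surrounding thoracic structures."""
    return scan                       # stub: would return a lung mask

def detect_candidates(lungs):
    """Locate suspicious regions; tuned for high sensitivity."""
    return [{"center": (10, 20, 30), "score": 0.4},
            {"center": (40, 50, 60), "score": 0.9}]

def reduce_false_positives(candidates, threshold=0.5):
    """Discard candidates that look like vessels or other artifacts."""
    return [c for c in candidates if c["score"] >= threshold]

def segment_nodule(lungs, candidate):
    """Isolate the nodule from the surrounding parenchyma."""
    return {"mask": ..., **candidate}

def extract_features(nodule):
    """Measure size, shape, density, texture, and so on."""
    return {"diameter_mm": 7.2, **nodule}

def classify(features):
    """Final benign-vs-malignant call."""
    return "malignant" if features["score"] > 0.8 else "benign"

def cad_pipeline(scan):
    lungs = segment_lungs(scan)
    nodules = reduce_false_positives(detect_candidates(lungs))
    return [classify(extract_features(segment_nodule(lungs, n)))
            for n in nodules]

results = cad_pipeline(scan=None)     # one candidate survives FP reduction
```

The design point is that detection runs with high sensitivity and a separate false-positive-reduction stage restores specificity, which is why FP reduction appears as its own component in the pipeline.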

Public benchmark datasets: The paper details the key datasets driving CAD development. The LIDC-IDRI database, established by the National Cancer Institute with FDA and NIH support, includes 1,018 chest CT scans annotated by four expert radiologists. It contains 7,371 lesions classified as nodules by at least one radiologist, of which 2,669 were nodules 3 mm or larger, and 928 were unanimously classified as nodules by all four radiologists. The LUNA16 dataset, derived from LIDC by selecting all thin-slice CT scans (less than 3 mm slice thickness), contains 888 exams. The ELCAP database has 50 LDCT exams (1.25 mm slice thickness) from I-ELCAP. The ANODE09 database from the NELSON trial, Europe's largest CT lung cancer screening database, provided 55 CT scans (0.71 mm average slice thickness) from male heavy smokers aged 50-75. The NSCLC dataset includes 1,355 CT exams from 211 patients.

Detection performance: Chi et al. developed a three-cascaded-network framework using U-Net architectures trained on LUNA16 and Ali Tianchi, achieving precision of 0.8792, sensitivity of 0.8878, and specificity of 0.9590. Its main limitation was difficulty distinguishing low-density nodules at parenchyma edges. Khosravan et al.'s S4ND framework on LUNA16 achieved 95.2% sensitivity without requiring post-processing or user guidance. Nasrullah et al.'s CMixNet system on LIDC-IDRI reached 94% sensitivity and 91% specificity, with the added strength of integrating patient factors (symptoms, age, smoking history, biomarkers) to reduce false positives. Cai et al.'s MaskRCNN with feature pyramid network achieved 88.70% sensitivity on LUNA16 while also providing segmentation and 3D visualization.

High-accuracy classification systems: Manickavasagam et al. developed a 5-layer CNN on LIDC/IDRI that achieved 98.88% accuracy, 99.62% sensitivity, 93.73% specificity, and AUC of 0.928. Tran et al.'s system divided nodules from non-nodules with 97.2% accuracy, 96.0% sensitivity, and 97.3% specificity. Wu et al. reached 98.23% average accuracy with a 1.65% false-positive rate, and Mastouri et al. achieved 91.99% accuracy. For benign vs. malignant classification specifically, Zhang et al., Al-Shabi et al., and Liu et al. achieved 92.4%, 92.57%, and 90% accuracy respectively. Despite these results, the main limitation of CAD systems remains the high number of false positives related to blood vessels and other soft tissue structures that the systems misinterpret.
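The sensitivity, specificity, and accuracy figures quoted throughout this section all derive from the same confusion-matrix arithmetic, shown here on a toy set of screening labels:

```python
def detection_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision, accuracy from binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),   # nodules actually found
        "specificity": tn / (tn + fp),   # normal scans correctly cleared
        "precision":   tp / (tp + fp),   # flagged findings that are real
        "accuracy":    (tp + tn) / len(y_true),
    }

# Toy screen: 1 = nodule present, 0 = no nodule.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
m = detection_metrics(y_true, y_pred)
```

Because roughly 90% of screening CTs are negative, even a small false-positive rate produces many flagged normals in absolute terms, which is why the vessel-related false positives above matter so much at population scale.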

TL;DR: LIDC-IDRI contains 1,018 CT scans with 7,371 annotated lesions; LUNA16 has 888 thin-slice exams. Top CAD results include 98.88% accuracy and 0.928 AUC (5-layer CNN on LIDC/IDRI), 97.2% accuracy with 97.3% specificity (Tran et al.), 95.2% sensitivity (S4ND on LUNA16), and 94% sensitivity with 91% specificity (CMixNet on LIDC-IDRI). False positives from blood vessels remain the primary CAD limitation.
Pages 11-13
U-Net, FCN, and Multi-View Architectures for Precise Nodule Segmentation

Precise lung nodule segmentation is challenging due to the typically small size of nodules and their proximity to lung edges or blood vessels. Two commonly used base architectures are U-Net and Fully Convolutional Neural Networks (FCN), both of which follow a two-step process: down-sampling to extract feature maps while filtering irrelevant information, then up-sampling to achieve higher-resolution output. Huang et al. customized an FCN for detection, merging, false-positive reduction, and segmentation, achieving an average Dice Similarity Coefficient (DSC) of 0.793 on the LIDC-IDRI dataset. Usman et al. introduced a dynamic modified ROI algorithm using Deep Res-UNet architecture, reaching an average DSC of 87.55%, sensitivity of 91.62%, and a positive predictive value (PPV) of 88.24%, though the method struggled with nodules attached to non-nodule structures and very small diameters.
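The Dice Similarity Coefficient reported for every segmentation model in this section is defined as 2|A ∩ B| / (|A| + |B|) over the predicted and ground-truth masks. A minimal NumPy version on toy 8x8 masks:

```python
import numpy as np

def dice(pred, truth):
    """Dice Similarity Coefficient for two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy masks: a 4x4 "nodule" vs a prediction shifted down by one row.
truth = np.zeros((8, 8), dtype=int)
truth[2:6, 2:6] = 1                      # 16 voxels
pred = np.zeros((8, 8), dtype=int)
pred[3:7, 2:6] = 1                       # same size, shifted one row

score = dice(pred, truth)                # overlap of 12 -> 2*12/32 = 0.75
```

A one-voxel shift already costs a quarter of the score on a small nodule, which is why DSC values in the high 0.80s to 0.90s on sub-centimetre nodules are considered strong results.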

GAN-augmented and V-Net approaches: Zhao et al. implemented a patch-based 3D U-Net combined with generative adversarial networks (GANs) and contextual CNN for automatic segmentation and classification. Their multi-step pipeline handled segmentation first, then benign-vs-malignant classification, all in a single integrated algorithm. Kumar et al. utilized the V-Net architecture (originally developed for prostate MRI segmentation) and achieved a high DSC of 0.9615 on the LUNA16 dataset by focusing on convolutional layers and omitting pooling layers. Keetha et al. integrated U-Net with Bi-FPN to create a resource-efficient U-Det architecture that achieved an average DSC of 82.82%, sensitivity of 92.25%, and PPV of 78.92% on LUNA16, with particular strength in segmenting challenging cases such as cavitary nodules, ground glass nodules, and small peripheral nodules.

Multi-view and multi-scale techniques: Zhang et al. employed a multiscale Laplacian of Gaussian filter achieving a detection score of 0.947 on LUNA16. Cao et al. presented DB-ResNet (dual-branch residual network), integrating multi-view and multi-scale CT nodule features with CNN intensity features, obtaining an average sensitivity of 89.35% and DSC of 82.74%, which exceeded human expert performance. Wu et al. developed PN-SAMP (Pulmonary Nodule Segmentation Attributes and Malignancy Prediction), a multitask U-Net model achieving 73.89% DSC but an impressive 97.58% sensitivity while also estimating malignancy risk. Wang et al. proposed a two-stage central-focused CNN (CF-CNN) tested on LIDC-IDRI, reaching 82.15% DSC and 92.75% sensitivity by capturing nodule characteristics from both 3D and 2D images simultaneously.

Boundary enhancement approaches: Detecting nodules near blood vessels and pleura poses the greatest segmentation challenge. Pezzano et al. proposed a U-Net with multiple convolutional layers (MCL) module on LIDC-IDRI to better define boundaries and morphological edges while reducing image resolution loss. Dong et al. included voxel heterogeneity (VH) and shape heterogeneity (SH) properties in their model, finding that VH effectively learned grey-level information while SH excelled at capturing border information. Cao et al. introduced a central intensity-pooling layer (CIP) in their DBResNet model, which was particularly effective for juxta-pleural and small nodule evaluation. Al-Shabi et al. used non-local blocks for global features and residual blocks with 3x3 kernel size for local features, achieving an AUC of 95.62% for transfer learning on LIDC-IDRI, outperforming DenseNet and ResNet.

TL;DR: V-Net achieved the highest DSC of 0.9615 on LUNA16. DB-ResNet surpassed human experts with 82.74% DSC and 89.35% sensitivity. PN-SAMP reached 97.58% sensitivity for combined segmentation and malignancy prediction. Deep Res-UNet hit 87.55% DSC with 91.62% sensitivity. Al-Shabi et al.'s transfer learning approach achieved 95.62% AUC on LIDC-IDRI. Key challenge remains segmenting nodules near vessels and pleura.
Pages 13-16
Radiomics, Feature Extraction, and the Path to Virtual Biopsy

Once a pulmonary nodule has been detected and segmented, the next critical step is characterizing it as benign or malignant. The paper describes two AI-based approaches: (1) automatic segmentation with assessment of lesion size, volume, and densitometric features, and (2) segmentation followed by radiomic feature extraction to provide a "virtual biopsy." Radiomics introduces quantitative evaluation into what has traditionally been qualitative radiological interpretation. Unlike other "-omics" fields (genomics, proteomics, transcriptomics), radiomics is based on radiological imaging analysis and extraction of quantitative features rather than invasive biopsy or molecular assays. These quantitative image features serve as non-invasive biomarkers reflecting underlying tumor pathophysiology and heterogeneity.

Feature extraction and model building: Radiomics processes multiple imaging modalities (CT, PET, MRI, ultrasound) by analyzing the selected region of interest to obtain features using data-characterization algorithms. Features include first-order histogram features, second-order texture features, and higher-order texture features, analyzed through univariate and multivariate statistical models. Tu et al. demonstrated that radiomic analysis of grey-scale intensity from thin-section CT is useful for differentiating benign and malignant nodules. Chae et al. used texture-based features to differentiate pre-invasive from invasive lung adenocarcinoma, achieving an AUC of 0.981 in a study of 86 part-solid ground glass nodules. Perez-Morales et al. estimated radiomic properties from both intratumoral and peritumoral regions to model lung cancer outcomes when tumors were identified during screening.
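First-order histogram features are the simplest of the feature families mentioned above. The sketch below computes a handful of them from toy ROIs; real radiomics platforms compute hundreds of standardized features, so treat these formulas as a small illustrative subset:

```python
import numpy as np

def first_order_features(roi):
    """A few first-order (histogram) radiomic features of an ROI."""
    roi = roi.astype(float).ravel()
    mean, std = roi.mean(), roi.std()
    skewness = ((roi - mean) ** 3).mean() / std ** 3 if std else 0.0
    kurtosis = ((roi - mean) ** 4).mean() / std ** 4 if std else 0.0
    hist, _ = np.histogram(roi, bins=16)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()      # histogram (Shannon) entropy
    return {"mean": mean, "std": std, "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy}

rng = np.random.default_rng(7)
# Uniform-looking tissue vs a bimodal, heterogeneous lesion.
homogeneous = rng.normal(100, 2, size=(16, 16))
heterogeneous = np.concatenate(
    [rng.normal(60, 2, 128), rng.normal(140, 2, 128)]).reshape(16, 16)

f_hom = first_order_features(homogeneous)
f_het = first_order_features(heterogeneous)
```

The heterogeneous ROI shows a far larger intensity spread, illustrating how such features act as quantitative proxies for the tumor heterogeneity that visual reading describes only qualitatively.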

Clinical applications of radiomics: Radiomics has expanded well beyond simple benign-vs-malignant classification. Coroller et al. found that a radiomic signature could predict response to chemoradiotherapy in NSCLC patients. Cousin et al. developed a CT-based delta-radiomics signature to identify patients likely to benefit from PD-1/PD-L1 inhibitors in advanced or recurrent NSCLC. Hou et al. constructed a combined deep learning model with clinical and radiomic features for NSCLC survival prediction. Dou et al. demonstrated that peritumoral rim radiomic features are significantly associated with distant metastasis from lung adenocarcinoma. Huang et al. introduced a radiomic nomogram combining nodular and peri-nodular radiomic signatures to distinguish pre-invasive from invasive pulmonary lesions preoperatively.

Segmentation for radiomics: Image segmentation for feature extraction can be manual, semi-automatic, or fully automatic through deep learning algorithms. Manual and semi-automatic approaches are most common but are time-consuming and influenced by observer bias, making it important to verify intra- and inter-observer reproducibility of radiomic features. Fully automatic deep learning-based segmentation is still in development. Some AI techniques can not only detect and quantify lung nodules but also recommend Lung-RADS classification by combining mean nodule diameter measurement with nodule type. Most radiomics studies on lung cancer remain retrospective and based on conventional clinical imaging, with performance dependent on thin-section CT image quality.
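The Lung-RADS recommendation step can be pictured as a size-based lookup. The sketch below covers only baseline solid nodules and paraphrases the ACR size cut-offs for that one case; the actual system also weighs nodule type, growth, and associated findings, so treat this as illustrative, not clinical logic:

```python
def lung_rads_baseline_solid(mean_diameter_mm):
    """Simplified Lung-RADS bucket for a baseline solid nodule, by size alone.
    Illustrative paraphrase of the ACR cut-offs -- not clinical decision logic."""
    if mean_diameter_mm < 6:
        return "2"    # benign appearance -> continue annual screening
    if mean_diameter_mm < 8:
        return "3"    # probably benign -> short-interval follow-up LDCT
    if mean_diameter_mm < 15:
        return "4A"   # suspicious -> closer follow-up or PET-CT
    return "4B"       # very suspicious -> further work-up

category = lung_rads_baseline_solid(7.2)
```

An AI tool that measures mean diameter and identifies nodule type automatically can therefore propose a Lung-RADS category, with the radiologist confirming or overriding it.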

TL;DR: Radiomics extracts quantitative features from imaging as non-invasive biomarkers. Chae et al. achieved AUC of 0.981 for differentiating pre-invasive from invasive adenocarcinoma in 86 ground glass nodules. Clinical applications extend to chemoradiotherapy response prediction, PD-1/PD-L1 inhibitor selection, survival prediction, and metastasis risk assessment. Most studies remain retrospective, and fully automatic segmentation for radiomics is still developing.
Pages 16-17
Non-Invasive Tumor Profiling Through Imaging-Genomics Integration

Virtual biopsy is an emerging concept that leverages the spatial and temporal heterogeneity inherent in solid tumors. Traditional tissue biopsy has fundamental limitations because genes, proteins, cells, microenvironment, tissues, and organs all exhibit heterogeneity that a single biopsy sample may not capture. In contrast, non-invasive imaging can capture intra-tumoral heterogeneity across the entire lesion. The introduction of targeted immunotherapies for lung cancer has transformed the treatment landscape, making identification of targetable mutations and expression levels a critical step in patient management. This provides additional justification for developing virtual biopsy approaches that could reduce the need for physical tissue sampling.

Radiogenomics and radioproteomics: These fields aim to associate imaging phenotypes with genetic and protein expression patterns through bioinformatics approaches. Radiogenomics specifically concerns the relationship between radiology and genomics, non-invasively researching biological features related to clinical outcomes. At the Mayo Clinic, Lee et al. created a machine learning method called Computer-Aided Nodule Analysis and Risk Yield (CANARY), which discovered 9 distinct radiomic exemplars ("radiomic fingerprints") defining the lung cancer spectrum. CANARY has been demonstrated to correlate directly with adenocarcinoma invasion, functioning as a true virtual biopsy technique.

Biomarker prediction from imaging: Nair et al. showed that radiomic signals derived from diagnostic CT and PET-CT images have the potential to be used as biomarkers for predicting EGFR mutations in NSCLC. Lafata et al. demonstrated preliminary feasibility of an integrated radiomic, cell-free DNA (cfDNA), and circulating tumor DNA (ctDNA) liquid biopsy analysis in patients with locally advanced lung cancer. Their research found that tumors appearing more homogeneous and attenuated on CT imaging had detectable ctDNA TP53 mutations and static changes in cfDNA content early in therapy. These findings illustrate how imaging-based and molecular biomarkers can be combined for a more comprehensive tumor profile.

The authors note that the quality and sophistication of published radiomics studies are increasing, yielding a growing volume of radiomics-based findings in lung cancer. The era of precision medicine demands radiophenotyping for accurate patient classification, and radiomics characteristics and signatures can ideally serve as imaging biomarkers. The next challenge for radiologists will be keeping pace with the rapid advancement of these technologies.

TL;DR: Virtual biopsy uses imaging to non-invasively capture tumor heterogeneity. CANARY identified 9 radiomic fingerprints across the lung cancer spectrum and correlates directly with adenocarcinoma invasion. Radiomic signals from CT and PET-CT can predict EGFR mutations in NSCLC. Integrated radiomic and liquid biopsy (cfDNA/ctDNA) analysis is feasible and can detect TP53 mutations early in therapy.
Pages 17-18
Validation Gaps, Data Curation Challenges, and Clinical Readiness

The authors identify several significant barriers preventing AI tools from entering routine clinical practice. The most fundamental issue is that AI algorithms remain difficult to create and validate due to the disorganized curation of imaging and clinical data. Without standardized, well-curated datasets, training robust models that generalize across different institutions and populations is extremely challenging. Most radiomics studies on lung cancer are retrospective and based on conventional medical images from routine clinical protocols, meaning their real-world applicability remains uncertain.

False-positive burden: The high number of false positives remains the primary limitation of CAD systems in lung cancer screening. Blood vessels and other soft tissue structures are frequently misinterpreted by detection algorithms, reducing the accuracy and efficacy of CAD screening tools when applied to large populations. While effective classification techniques (such as Tran et al.'s 97.2% accuracy system) can reduce false-positive rates, this problem has not been fully solved and continues to undermine confidence in automated screening at scale.

Segmentation challenges: Precise nodule segmentation remains difficult due to the small size of nodules and their frequent proximity to lung edges, blood vessels, or pleura. Nodules attached to non-nodule structures pose particular problems for automated systems. While models like V-Net (DSC 0.9615) and DB-ResNet (exceeding human expert performance) show promise, segmenting juxta-vascular, juxta-pleural, and ground glass opacity nodules continues to challenge all approaches. The shift from analyzing cubes containing individual nodules to processing whole 3D lung volumes represents an unsolved technical hurdle.

Lack of prospective validation: The absence of prospective, multi-center clinical trials confirming that retrospective performance metrics translate to real-world screening settings is perhaps the most critical limitation. Manual and semi-automatic segmentation, while most commonly used, are time-consuming and influenced by observer bias. Fully automatic deep learning segmentation is still in development. Furthermore, the integration of multimodality data and the creation of tested, validated, efficient models have not yet been achieved at a level that permits routine clinical deployment. Collaboration between radiologists, clinicians, and AI developers is necessary but has not been systematically organized.

TL;DR: Key limitations include disorganized data curation preventing robust model development, high false-positive rates from blood vessels in CAD systems, difficulty segmenting nodules near vessels and pleura, and the absence of prospective multi-center validation trials. Most radiomics studies remain retrospective, and fully automatic segmentation for clinical use is still under development.
Pages 17-18
Toward an Integrated AI Screening Pipeline

The authors outline a four-step vision for integrating AI into the complete lung cancer screening workflow. Step 1: Pre-screening AI applications for personalized risk assessment, optimizing patient eligibility criteria using models like PLCOm2012 combined with blood biomarkers and polygenic risk scores. Step 2: Image acquisition using low-dose protocols with deep learning-based reconstruction algorithms that maintain optimal image quality at reduced radiation exposure. Step 3: AI-based automated nodule detection to reduce the radiologist's workload, functioning as a concurrent or second reader. Step 4: AI-driven nodule characterization (benign vs. malignant) to optimize resource utilization and costs and reduce the chance of unnecessary biopsy or surgery.

Radiomics and radiogenomics potential: In future perspectives, the authors anticipate that radiomics and radiogenomics analysis will enable personalized patient profiles, therapy selection, and prognosis prediction, though clear accuracy benchmarks are not yet available. The integration of radiomic signatures with liquid biopsy data (cfDNA, ctDNA) and clinical data promises a comprehensive "virtual biopsy" approach that could reduce reliance on invasive tissue sampling. Automating time-consuming tasks like identifying image-based biomarkers and checking for nodules will eventually be possible, bringing personalized medicine with non-invasive, repeatable disease characterization closer to reality.

Infrastructure requirements: The paper stresses that platforms for choosing between different AI applications, and the integration of AI technology into picture archiving and communication systems (PACS), are essential to bring AI into routine practice. The gap between research-level performance (e.g., 98.88% accuracy, AUC of 0.981, DSC of 0.9615) and clinical-grade deployment remains significant. Collaboration, cooperation, and integration between radiologists and clinicians is identified as the necessary bridge. The authors conclude that AI has huge potential clinical applications in chest imaging and that this integration represents one of the most important challenges of future medicine.

The overarching message of the review is one of cautious optimism: AI technologies have demonstrated impressive performance across every stage of the lung cancer screening pipeline, from dose reduction and risk stratification through detection, segmentation, characterization, and virtual biopsy. However, these capabilities remain siloed in individual research studies. The critical next step is to validate and integrate these tools into a unified, clinically deployed screening workflow that is accessible across healthcare systems and demonstrates real-world mortality reduction.

TL;DR: The authors envision a 4-step integrated AI screening pipeline: personalized risk assessment, DLR-based low-dose imaging, automated nodule detection, and AI-driven characterization. Top performance metrics across the review include 98.88% detection accuracy, 0.981 AUC for adenocarcinoma classification, and 0.9615 DSC for segmentation. Clinical deployment requires PACS integration, prospective validation, and radiologist-clinician-AI collaboration.
Citation: Cellina M, Cacioppa LM, Cè M, et al. Artificial Intelligence in Lung Cancer Screening: The Future Is Now. Cancers, 2023. Open access (CC BY). PMC10486721. DOI: 10.3390/cancers15174344.