MRI Deep Learning for Endometrial Cancer Evaluation

Plain-English Explanations

Overview & Background

Pages 1-2

Why MRI and Deep Learning Matter for Endometrial Cancer Staging

Endometrial cancer (EC) is one of the most common malignant tumors of the female reproductive system, with an incidence second only to cervical cancer and accounting for 15% to 20% of all gynecological malignancies. EC is most frequently seen in women aged 50 to 60 years, with an average age of onset around 61 years. Risk factors include obesity, menopause, hypertension, polycystic ovary syndrome, family history of cancer, and prior pelvic radiotherapy. Most cases are endometrioid carcinoma, though mucinous adenocarcinoma, serous adenocarcinoma, and clear cell carcinoma also occur. Accurate preoperative staging is essential because the treatment plan depends on the precise extent and scope of the lesion.

Limitations of conventional imaging: Standard CT imaging struggles to delineate the extent of endometrial lesions and has difficulty differentiating between stage II, III, and IV disease. Ultrasound also provides limited staging accuracy for EC. Magnetic resonance imaging (MRI), by contrast, offers excellent soft tissue resolution, multidirectional imaging capability, and high accuracy. MRI is considered one of the best imaging modalities for preoperative staging of EC, capable of assessing endometrial thickness, cervical canal width, fibrous stromal ring integrity, signal characteristics, and lymph node metastasis.

The deep learning opportunity: While MRI produces high-quality images, its interpretation has traditionally been limited to qualitative assessments, and many research results lack quantitative measurement. Deep learning neural networks offer fast nonlinear mapping and ultrafast propagation abilities that are well suited for image recognition and information processing. This study aims to combine deep learning with MRI to achieve quantitative and timed imaging for EC, providing a more objective and reproducible basis for diagnosis and staging.

Study design: The researchers selected 80 patients with EC from a hospital in Wuhan, China. All patients underwent preoperative MRI examination with coronal and sagittal T1WI and T2WI imaging. The study applied a ResNet-based deep learning network to optimize MRI image recognition and compared three different network architectures for classification accuracy. Surgical pathology served as the gold standard for validating MRI staging results.

TL;DR: Endometrial cancer accounts for 15-20% of female reproductive malignancies. MRI is the preferred preoperative staging tool but lacks quantitative rigor. This study applied deep learning (ResNet) to MRI images of 80 EC patients to improve image classification and staging accuracy, using surgical pathology as the gold standard.

Methodology

Pages 2-3

Patient Selection, MRI Protocol, and Image Analysis

Patient cohort: Eighty EC patients were enrolled, aged 42 to 70 years with a mean age of 51.67 +/- 3.13 years. The reported breakdown, which mixes histological type with differentiation grade and accounts for 59 of the 80 patients, included 32 cases of endometrioid adenocarcinoma, 6 cases of serous adenocarcinoma, 9 cases of low differentiation, 10 cases of high differentiation, and 2 cases of carcinosarcoma. The study was approved by the hospital's medical ethics committee, and all patients or their representatives signed informed consent forms.

Inclusion and exclusion criteria: Patients were included if they met the 2009 FIGO (International Federation of Gynecology and Obstetrics) staging standard for EC, had complete clinical data, could cooperate independently with medical staff, did not interrupt treatment, and met treatment indications. Exclusion criteria covered patients with severe organic diseases or hematopoietic system disorders, mental illness, poor treatment compliance, communication barriers, treatment interruptions, or tumors at other sites or metastatic tumors.

MRI examination protocol: A 1.5 T MRI machine was used for pelvic examination. The protocol included conventional coronal and sagittal T2WI (fs FRFSE-XL, TR/TE of 3500-3800 ms/130 ms, NEX 2-4, FOV 26), cross-sectional T1WI (fs FSE-XL, TR/TE of 400 ms/8 ms, NEX 2, FOV 32), cross-sectional diffusion-weighted imaging (DWI, b=700, TR/TE of 4000 ms/73 ms), and cross-sectional T2WI (fs FRFSE-XL, TR/TE of 4000 ms/130 ms, NEX 4, FOV 32). LAVA cross-sectional mask images were acquired before contrast administration, and Gd-DTPA contrast was then injected intravenously for multiphase dynamic contrast-enhanced scanning. Sagittal and coronal LAVA enhancement followed.

Image interpretation: Two experienced physicians of intermediate and senior rank analyzed the images independently using a double-blind method, following the 2009 FIGO diagnostic criteria for EC. They assessed endometrial thickness, cervical canal width, fibrous stromal ring integrity, signal characteristics, lymph node metastasis, and invasion of the bilateral adnexa and adjacent organs. Discontinuous enhancement of the cervical mucosal epithelium on dynamic contrast-enhanced scanning indicated cervical involvement, and patients were grouped accordingly into a control group (without cervical infiltration) and an observation group (with cervical infiltration).

TL;DR: The study enrolled 80 EC patients (mean age 51.67, range 42-70) scanned on a 1.5 T MRI with T1WI, T2WI, DWI, and Gd-DTPA contrast-enhanced sequences. Two senior physicians independently interpreted images using double-blind FIGO 2009 criteria. Patients were grouped by cervical infiltration status.

Deep Learning Architecture

Pages 3-5

CNN, ResNet, and the Optimized Network Architecture

Convolutional neural network fundamentals: The study employed a deep learning network commonly used in medical image segmentation, classification, and target localization. In these networks, pixel information at different scales is combined to extract optimal size information, and convolution kernel size directly affects the neural network's processing speed. The researchers used the overlapping-echo detachment (OLED) method within the deep learning framework, which enables quantitative T2 imaging using two or four overlapping echoes. This approach adds quantitative measurement capability to single-scan MRI, though reconstruction quality for long parameter values in regions of interest can be relatively poor with fewer echoes.

ResNet architecture and residual learning: The ResNet (Residual Network) architecture was chosen because it avoids the vanishing gradient problem that limits architectures such as AlexNet and GoogLeNet, where classification accuracy declines as layers increase beyond a certain depth. ResNet uses skip connections that add each block's input directly to its output, allowing the network to learn a residual function F(X) rather than a complete mapping. The residual block computes F(X) = W2 * ReLU(W1 * X), and the sum F(X) + X passes through a second ReLU activation to give the final output Y = ReLU(F(X) + X). This design enables the network to be much deeper without the gradients vanishing, and ResNet's accuracy exceeds that of AlexNet and GoogLeNet.
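The residual computation described above can be sketched in a few lines of NumPy. This is a minimal illustration with dense weight matrices rather than convolutions; the function names and the toy input are assumptions made for the example, not code from the study.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU activation."""
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Return Y = ReLU(F(X) + X) with F(X) = W2 @ ReLU(W1 @ X)."""
    f = w2 @ relu(w1 @ x)   # residual function F(X)
    return relu(f + x)      # skip connection adds the input back

x = np.array([1.0, -2.0, 3.0])
zero = np.zeros((3, 3))
eye = np.eye(3)

# With zero weights F(X) = 0, so the block reduces to ReLU(X).
y_zero = residual_block(x, zero, zero)
# With identity weights F(X) = ReLU(X), so Y = ReLU(ReLU(X) + X).
y_eye = residual_block(x, eye, eye)
```

The zero-weight case shows why the skip connection helps: even when the learned residual contributes nothing, the input still passes through the block, so stacking more blocks does not have to degrade the signal.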

Optimized ResNet network: The authors built an optimized version of ResNet specifically for MRI image recognition. The input images were preprocessed to 100 x 100 pixels. In the first convolutional layer, the convolution kernel was configured to increase the receptive field, and the stride in the first bottleneck residual block was set to 1. Every layer used the ReLU activation function, and the Adam optimizer was used for gradient-based training. The network structure consisted of five groups of bottleneck blocks with progressively increasing channel counts (128, 256, 512, 1024, and 2048), each containing multiple 1x1 convolution layers with ReLU activations.
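The summary names Adam but gives no hyperparameters. As a rough sketch of what one Adam update does, assuming the commonly used defaults (learning rate 0.001, beta1 0.9, beta2 0.999, epsilon 1e-8, none of which come from the study), the step could look like this. The function name `adam_step` and the toy quadratic objective are hypothetical.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count. Defaults are assumed values."""
    m = beta1 * m + (1.0 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize (w - 3)^2 starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.1)
```

Because the update divides by the root of the squared-gradient average, the very first step has a size close to the learning rate regardless of the gradient's magnitude, which is one reason Adam is less sensitive to gradient scale than plain gradient descent.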

Loss function and training: The loss function was defined as a distance function measuring the mean squared error (MSE) between the reconstructed image and the ground-truth label. Network parameters were iteratively updated using gradient descent to minimize this distance. The training process involved initializing the network topology, inputting sequences, computing hidden layer (H) and output layer (Y) values, calculating the MSE between output and expected values, and adjusting the weight coefficients using a learning rate (eta). The Sigmoid activation function was used for precise parameter control, and the number of hidden neuron nodes was determined using an empirical equation based on input and output dimensions.
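The training loop described above can be sketched on a toy network with one sigmoid hidden layer (H) and a linear output layer (Y). Everything here is an assumption made for illustration: the layer sizes, the random data, the value of eta, and the function names are not taken from the study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse(y_pred, y_true):
    """Mean squared error between output and expected values."""
    return float(np.mean((y_pred - y_true) ** 2))

def train_step(x, target, w_hidden, w_out, eta):
    """One gradient-descent update; returns new weights and the loss before the update."""
    h = sigmoid(w_hidden @ x)                  # hidden layer values H
    y = w_out @ h                              # output layer values Y
    loss = mse(y, target)
    grad_y = 2.0 * (y - target) / y.size       # dLoss/dY
    grad_w_out = np.outer(grad_y, h)
    grad_h = w_out.T @ grad_y
    grad_w_hidden = np.outer(grad_h * h * (1.0 - h), x)  # sigmoid derivative is h(1-h)
    return w_hidden - eta * grad_w_hidden, w_out - eta * grad_w_out, loss

rng = np.random.default_rng(0)
x = rng.normal(size=4)                         # toy input vector
target = np.array([0.5, -0.5])                 # toy expected output
w_hidden = rng.normal(scale=0.5, size=(3, 4))
w_out = rng.normal(scale=0.5, size=(2, 3))

losses = []
for _ in range(200):
    w_hidden, w_out, loss = train_step(x, target, w_hidden, w_out, eta=0.1)
    losses.append(loss)
```

Plotting `losses` would show the MSE falling as the weights are adjusted, which is the behavior the iterative update is meant to produce.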

TL;DR: The study compared a shallow CNN, standard ResNet, and an optimized ResNet with 5 bottleneck groups (up to 2048 channels), Adam optimizer, ReLU activations, and 100x100 pixel inputs. ResNet's skip connections solve the vanishing gradient problem, and the optimized version was specifically tuned for MRI image recognition with MSE-based loss and gradient descent training.

Model Performance

Pages 5-6

Comparing Three Network Models on MRI Image Recognition

Experimental setup: The experiments were run on an Intel Core i7-8750H CPU at 2.20 GHz with a 1 TB SSD, using Python 3.6.5, CUDA/CUDNN 10.0.130, OpenCV-Python 4.1.0.25, NumPy 1.18.1, and Matplotlib 2.2.2. The dataset consisted of 867 training images and 436 test images, which were used identically across all three network architectures to ensure a fair comparison.

Three-model comparison: The study evaluated three architectures for MRI image identification: a shallow CNN network, a standard ResNet network, and the optimized ResNet network. Average identification times per image were 0.14 seconds for the shallow CNN, 0.39 seconds for ResNet, and 0.48 seconds for the optimized ResNet. While the optimized network was the slowest per image, it delivered the best overall performance in terms of recognition accuracy. The deeper architecture and additional optimization layers required more computation time but extracted richer feature representations from the MRI images.

Recognition accuracy gains: The optimized ResNet network outperformed both the shallow CNN and the standard ResNet in classification accuracy. The key architectural improvements included the Adam optimizer for more efficient gradient updates, the bottleneck residual structure with progressive channel expansion from 128 to 2048, and the stride-1 configuration in the first bottleneck block that preserved spatial information. These modifications allowed the optimized network to capture fine-grained texture differences in endometrial tissue that the shallower networks missed.

The trade-off between speed and accuracy is a recurring theme in deep learning for medical imaging. The shallow CNN processed images roughly 3.4 times faster than the optimized ResNet (0.14s vs. 0.48s per image), but the accuracy gap justified the additional computation time, particularly in a clinical context where diagnostic precision is more important than processing speed. With a dataset of 867 training images and 436 test images, the optimized ResNet demonstrated that deeper, well-configured architectures can extract meaningful features even from moderately sized medical imaging datasets.
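The speed figures can be checked with simple arithmetic. The sketch below recomputes the speed ratio and estimates the total time to process the 436 test images, assuming images are processed one after another with no batching (an assumption; the summary does not say how inference was run).

```python
# Reported average identification time per image, in seconds.
times = {"shallow CNN": 0.14, "ResNet": 0.39, "optimized ResNet": 0.48}
n_test = 436  # number of test images reported

# How many times faster the shallow CNN is than the optimized ResNet.
speedup = times["optimized ResNet"] / times["shallow CNN"]

# Estimated total time per model if images are processed sequentially.
totals = {name: t * n_test for name, t in times.items()}

for name, total in totals.items():
    print(f"{name}: {total:.2f} s for {n_test} images")
print(f"shallow CNN speedup over optimized ResNet: {speedup:.1f}x")
```

Under that assumption the optimized ResNet needs about three and a half minutes for the full test set versus about one minute for the shallow CNN, a difference that is small next to the time an MRI examination itself takes.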

TL;DR: Three models were tested on 867 training and 436 test MRI images. The optimized ResNet achieved the best recognition accuracy despite being slower (0.48s/image vs. 0.14s for shallow CNN and 0.39s for standard ResNet). The Adam optimizer and bottleneck residual blocks with up to 2048 channels drove the accuracy gains.

MRI Findings

Pages 6-7

MRI Image Characteristics and Staging Patterns in Endometrial Cancer

Normal vs. cancerous endometrium on MRI: The study presented detailed MRI image comparisons between normal endometrial tissue and EC lesions. In healthy patients, the endometrium appears as a well-defined layer with consistent signal intensity. In EC patients, MRI reveals characteristic findings including widened uterine cavity, thickened endometrium with uneven enhancement, and altered signal patterns on T2-weighted imaging. The high-signal area widening in the central uterine body on sagittal T2WI exceeds the normal range and serves as a key diagnostic indicator.

Case examples: One illustrative case was a 52-year-old female with early symptoms of heavy menstrual volume and intermittent abdominal pain lasting over one year. MRI scanning was performed in three orientations: sagittal, coronal, and axial, with the scanning direction parallel to the uterine body long axis and perpendicular to the short axis. Another case involved a 50-year-old female with incomplete menstruation for 13 days, where detection showed hypoechoic and unevenly distributed uterine cavity contents. This patient had stage II EC with medium-to-high differentiation, cancer infiltration depth less than half of the muscular layer, and involvement of the cervical canal. MRI images showed dilated uterine cavity and thinning of the endometrial junction zone.

MRI manifestations by invasion range: The study systematically categorized MRI appearances by invasion depth. Tumors limited to the endometrium showed normal or thickened endometrium (normal thickness being less than 5 mm after menopause and less than 10 mm before menopause) with intact subendometrial enhancement. Shallow muscle layer invasion showed partial or full-thickness disruption of the junction zone with invasion not exceeding half the myometrium. Deep muscular layer invasion showed complete disruption of the junction zone with tumor extending beyond 50% of the myometrium. Cervical mucosal involvement showed cervical canal widening beyond 3 mm with an intact fibrous stromal ring, while cervical stromal invasion showed tumor signals within the fibrous stromal ring itself.

Lymph node and distant assessment: For lymph node involvement, MRI identified pelvic or para-aortic lymph nodes greater than 10 mm in diameter as suspicious. Vaginal, bladder, and rectal involvement was characterized by segmental disruption of the normal signal, replaced by tumor signal patterns. Serosal invasion showed discontinuous outer myometrial edges with tumor extending beyond the uterine contour. These detailed MRI criteria, when combined with the deep learning classification system, provided a structured framework for systematic staging assessment.
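The numeric thresholds in these criteria lend themselves to a small rule-based sketch. The function names and the idea of expressing invasion depth as a fraction of myometrial thickness are assumptions made for illustration; the study describes these criteria for physician interpretation, not as code.

```python
def myometrial_invasion(depth_fraction):
    """Classify invasion from tumor depth divided by myometrial thickness.

    Follows the summary's wording: shallow invasion does not exceed half
    the myometrium, deep invasion extends beyond 50%.
    """
    if depth_fraction <= 0:
        return "confined to endometrium"
    if depth_fraction <= 0.5:
        return "shallow muscle layer invasion"
    return "deep muscle layer invasion"

def suspicious_lymph_node(diameter_mm):
    """Pelvic or para-aortic nodes greater than 10 mm are flagged as suspicious."""
    return diameter_mm > 10

def cervical_canal_widened(width_mm):
    """Cervical canal widening beyond 3 mm suggests cervical mucosal involvement."""
    return width_mm > 3

print(myometrial_invasion(0.3))    # shallow muscle layer invasion
print(suspicious_lymph_node(12))   # True
print(cervical_canal_widened(2.5)) # False
```

Size thresholds like these capture only part of the criteria; findings such as junction zone disruption or fibrous stromal ring integrity are qualitative and are not represented in this sketch.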

TL;DR: MRI findings were systematically mapped to FIGO stages: normal or thickened endometrium with intact subendometrial enhancement for confined tumors (normal thickness being under 5 mm post-menopause and under 10 mm pre-menopause), junction zone disruption for myometrial invasion depth, cervical canal widening beyond 3 mm for mucosal involvement, and lymph nodes greater than 10 mm for nodal disease. Case examples illustrated stage II disease with cervical canal involvement and junction zone thinning.

Diagnostic Performance

Page 7

MRI Diagnostic Accuracy: 88.75% Accuracy and 95% Specificity

Overall diagnostic metrics: For MRI T2-weighted imaging combined with enhanced scanning, the study reported accuracy of 88.75%, specificity of 95%, sensitivity of 87.5%, negative predictive value of 93.75%, and positive predictive value of 86.25% for assessing the depth of muscular infiltration. These results demonstrate that MRI, aided by the deep learning classification framework, provides strong diagnostic capability for distinguishing invasion depth in EC patients.

Staging concordance with pathology: Among the 80 EC patients, MRI diagnosis identified 72 cases (90%) as stage Ia (shallow muscle layer infiltration), 2 cases as stage Ib (deep muscle layer infiltration), 3 cases as stage II (cervical infiltration), 2 cases as stage III (adnexal or vaginal involvement), and 1 case as stage IV (bladder and rectum involvement). When compared against surgical pathological findings, the MRI staging showed the highest concordance for stage Ia disease. The pathological control table revealed that of the 72 MRI-diagnosed stage Ia cases, 56 were confirmed as true stage Ia, while 15 were actually stage II and 1 was stage III, indicating some understaging of cervical and adnexal involvement.
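The staging counts can be turned into concordance percentages with a short calculation. The counts below are the ones reported in the paragraph above; the variable names are chosen for this example.

```python
# MRI-assigned stage for all 80 patients (as reported).
mri_stage_counts = {"Ia": 72, "Ib": 2, "II": 3, "III": 2, "IV": 1}

# Pathology results for the 72 cases MRI called stage Ia (as reported).
pathology_of_mri_ia = {"Ia": 56, "II": 15, "III": 1}

total_patients = sum(mri_stage_counts.values())
ia_called = sum(pathology_of_mri_ia.values())

# Share of MRI stage Ia calls confirmed by pathology, and share understaged.
ia_concordance = pathology_of_mri_ia["Ia"] / ia_called
ia_understaged = (ia_called - pathology_of_mri_ia["Ia"]) / ia_called

print(f"patients: {total_patients}")
print(f"stage Ia concordance: {ia_concordance:.1%}")
print(f"stage Ia calls understaged: {ia_understaged:.1%}")
```

On these numbers roughly 78% of MRI stage Ia calls were confirmed and about 22% turned out to be more advanced disease at pathology.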

Clinical significance of the metrics: The 95% specificity means that among patients who do not have deep muscular infiltration, MRI correctly reports its absence in 95 out of 100. The 93.75% negative predictive value answers the complementary question: when MRI indicates no deep invasion, that result is correct in nearly 94 out of 100 cases. The sensitivity of 87.5% means that MRI correctly detects deep muscular invasion in most cases where it is actually present, though roughly 1 in 8 cases of true deep invasion may be missed.

The positive predictive value of 86.25% indicates that when MRI identifies deep muscular infiltration, it is correct approximately 86% of the time. This combination of high specificity and strong negative predictive value makes MRI with deep learning assistance particularly useful as a screening tool for identifying low-risk patients who may not need extensive surgical intervention, while the somewhat lower positive predictive value suggests that positive findings should be interpreted with additional clinical context.
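The five metrics discussed above all derive from the four cells of a confusion matrix. The sketch below shows the standard formulas. The example counts are hypothetical, chosen only so that sensitivity and specificity match the reported 87.5% and 95%; they are not the study's data and do not reproduce the other three reported figures.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),               # true positives among diseased
        "specificity": tn / (tn + fp),               # true negatives among non-diseased
        "ppv": tp / (tp + fp),                       # correct among positive calls
        "npv": tn / (tn + fn),                       # correct among negative calls
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only (not from the study).
example = diagnostic_metrics(tp=7, fp=1, tn=19, fn=1)

for name, value in example.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity are conditioned on the true disease state, while the predictive values are conditioned on the test result, which is why the predictive values shift with disease prevalence even when sensitivity and specificity stay fixed.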

TL;DR: MRI with T2WI and enhanced scanning achieved 88.75% accuracy, 95% specificity, 87.5% sensitivity, 93.75% negative predictive value, and 86.25% positive predictive value for muscular infiltration depth. Of 80 patients, 90% were diagnosed as stage Ia by MRI, with 56 of 72 stage Ia cases confirmed by pathology. The 93.75% negative predictive value makes a negative MRI result reliable for ruling out deep invasion.

Discussion

Pages 7-8

Contextualizing Results: MRI Performance in the Broader Literature

Deep vs. shallow myometrial infiltration accuracy: The authors situate their findings within the established literature on MRI accuracy for EC staging. Prior studies have reported that MRI achieves 92% to 97% accuracy for diagnosing deep myometrial infiltration but only 69% to 74% accuracy for shallow myometrial infiltration. This discrepancy arises because deep invasion produces more obvious disruption of the junction zone and subendometrial enhancement bands, while shallow invasion can be subtle and easily confused with benign endometrial thickening or adenomyosis.

Comparison with other MRI studies: The study by Tsuyoshi et al. used MRI to detect EC and reported sensitivity of 100%, specificity of 96.9%, and accuracy of 97.0% for lesion-based detection of regional lymph node metastasis. Bi et al. (2019) demonstrated that combining T2-weighted imaging, dynamic contrast-enhanced MRI, and diffusion-weighted imaging produces the highest diagnostic accuracy with high specificity for evaluating muscular infiltration. Fasmer et al. recommended preoperative pelvic MRI for local EC staging and found that whole-tumor radiological characteristics produce medium-to-high diagnostic performance for predicting invasive EC, enabling personalized treatment strategies.

The role of multi-sequence MRI: Dynamic contrast-enhanced MRI and diffusion-weighted imaging can improve both sensitivity and specificity for detecting myometrial infiltration beyond what T2WI alone provides. MRI also demonstrates high specificity for detecting cervical infiltration and lymph node metastasis. The combination of MRI with positron emission tomography (PET) further enhances diagnostic value for evaluating primary tumors in EC patients. The study emphasizes that MRI can provide a comprehensive diagnostic strategy that may replace conventional imaging approaches for EC staging.

Deep learning's contribution: The addition of the ResNet-based network to the imaging pipeline effectively improved model accuracy beyond what standard image interpretation alone could achieve. The study positions its deep learning approach as a tool for improving quantitative parameter extraction from MRI images, moving beyond subjective qualitative assessment toward more objective and reproducible measurements. This shift is particularly important given that MRI measurement of uterine size is influenced by multiple confounding factors including uterine position, fibroids, patient age, and hormone therapy history.

TL;DR: Prior literature reports MRI accuracy of 92-97% for deep myometrial infiltration and 69-74% for shallow infiltration. Tsuyoshi et al. achieved 97% accuracy for lymph node metastasis detection. Combining T2WI with dynamic contrast-enhanced MRI and DWI yields the highest diagnostic accuracy. The ResNet-based deep learning approach adds quantitative rigor to what has traditionally been a subjective qualitative assessment.

Limitations & Future Directions

Pages 8-9

Study Limitations and Opportunities for Improvement

Small sample size: With only 80 patients, this study has a limited cohort that constrains the generalizability of its findings. A dataset of 867 training images and 436 test images, while sufficient for proof-of-concept deep learning experiments, is small by the standards of modern medical imaging AI, where successful models typically train on thousands to tens of thousands of annotated images. The small sample increases the risk of overfitting, where the model memorizes training data patterns rather than learning generalizable features.

Subjective factor interference: The authors acknowledge that the experimental data cannot completely eliminate the interference of subjective factors. MRI image quality is affected by patient movement (both conscious and unconscious), which creates artifacts that can complicate both human and AI interpretation. The double-blind interpretation by two physicians helps reduce subjectivity in the reference standard, but the 2009 FIGO staging criteria themselves involve qualitative judgments about tissue signal characteristics and invasion depth that inherently carry some subjectivity.

Single-center design: The study was conducted at a single hospital using a single 1.5 T MRI scanner with a specific protocol. This limits external validity because MRI image characteristics vary across different scanner manufacturers, field strengths (1.5 T vs. 3 T), coil configurations, and sequence parameters. A model optimized for one scanner's output may not perform equally well on images from a different institution with different equipment. Multi-center validation across diverse imaging environments would be needed to establish clinical reliability.

Lack of standardized evaluation indicators: The authors note that evaluation indicators need to be standardized in future work. The current study used accuracy, sensitivity, and specificity as primary metrics, but did not report AUC values, confidence intervals, or perform cross-validation experiments that would provide more robust estimates of model performance. Future studies should incorporate larger and more diverse patient populations, additional physiological and clinical data, and standardized quantitative evaluation frameworks to strengthen the evidence base for deep learning-assisted MRI in EC staging.

Future directions: The researchers hope to expand the sample size, incorporate more physiological and clinical data, and further develop single-scan quantitative MRI techniques using the OLED framework. The U-Net network approach for image segmentation, combined with kernel space neighborhood information, represents a promising direction for improving endometrial image analysis. Broader adoption would also require integration of the deep learning tools into existing clinical imaging workflows and prospective validation studies comparing AI-assisted staging against standard-of-care approaches in terms of patient outcomes.

TL;DR: Key limitations include a small 80-patient single-center cohort, only 867 training and 436 test images, inability to fully eliminate subjective factors, and lack of standardized evaluation metrics such as AUC and confidence intervals. Future work should expand sample sizes, add multi-center validation, incorporate cross-validation, and integrate the deep learning pipeline into clinical imaging workflows.

Evaluation and Monitoring of Endometrial Cancer Based on Magnetic Resonance Imaging Features of Deep Learning
