Latest Papers on Radiology AI. Tags: Mammography, Order: Best Match, Limit: 10.

Robust evaluation of tissue-specific radiomic features for classifying breast tissue density grades.

Dong V, Mankowski W, Silva Filho TM, McCarthy AM, Kontos D, Maidment ADA, Barufaldi B

•papers•Nov 1 2025

Breast cancer risk depends on an accurate assessment of breast density due to lesion masking. Although governed by standardized guidelines, radiologist assessment of breast density is still highly variable. Automated breast density assessment tools leverage deep learning but are limited by model robustness and interpretability. We assessed the robustness of a feature selection methodology (RFE-SHAP) for classifying breast density grades using tissue-specific radiomic features extracted from raw central projections of digital breast tomosynthesis screenings ($n_I = 651$, $n_{II} = 100$). RFE-SHAP leverages traditional and explainable AI methods to identify highly predictive and influential features. A simple logistic regression (LR) classifier was used to assess classification performance, and unsupervised clustering was employed to investigate the intrinsic separability of density grade classes.
LR classifiers yielded cross-validated areas under the receiver operating characteristic (AUCs) per density grade of [A: 0.909 ± 0.032, B: 0.858 ± 0.027, C: 0.927 ± 0.013, D: 0.890 ± 0.089] and an AUC of 0.936 ± 0.016 for classifying patients as nondense or dense. In external validation, we observed per density grade AUCs of [A: 0.880, B: 0.779, C: 0.878, D: 0.673] and nondense/dense AUC of 0.823. Unsupervised clustering highlighted the ability of these features to characterize different density grades. Our RFE-SHAP feature selection methodology for classifying breast tissue density generalized well to validation datasets after accounting for natural class imbalance, and the identified radiomic features properly captured the progression of density grades.
Our results potentiate future research into correlating selected radiomic features with clinical descriptors of breast tissue density.

Mammography Classification Breast Retrospective Clinical In Silico Academic Lab Reproducibility
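
The per-grade AUCs reported in the abstract above are one-vs-rest quantities: each density grade is scored against all other grades. The sketch below illustrates that computation in plain Python using the rank interpretation of AUC (the probability that a positive sample outscores a negative one). The function names, grades, and scores are hypothetical toy values chosen for illustration; they are not the authors' pipeline or data.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(
    grades: Sequence[str], class_scores: Dict[str, List[float]]
) -> Dict[str, float]:
    """For each grade, treat it as the positive class and all others as negative."""
    return {
        grade: binary_auc([1 if g == grade else 0 for g in grades], scores)
        for grade, scores in class_scores.items()
    }


# Hypothetical toy data: six samples with their true density grades and,
# per grade, the classifier's score that each sample belongs to that grade.
grades = ["A", "B", "B", "C", "C", "D"]
class_scores = {
    "A": [0.9, 0.3, 0.2, 0.1, 0.1, 0.0],
    "C": [0.1, 0.5, 0.2, 0.6, 0.3, 0.4],
}
per_grade_auc = one_vs_rest_auc(grades, class_scores)
print(per_grade_auc)
```

With rare classes (grade D in the abstract has a much wider spread), a one-vs-rest AUC rests on few positives, which is one plausible reason its estimate can be less stable.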

Comparing percent breast density assessments of an AI-based method with expert reader estimates: inter-observer variability.

Romanov S, Howell S, Harkness E, Gareth Evans D, Astley S, Fergie M

•papers•Nov 1 2025

Breast density estimation is an important part of breast cancer risk assessment, as mammographic density is associated with risk. However, density assessed by multiple experts can be subject to high inter-observer variability, so automated methods are increasingly used. We investigate the inter-reader variability and risk prediction for expert assessors and a deep learning approach. Screening data from a cohort of 1328 women, case-control matched, was used to compare between two expert readers and between a single reader and a deep learning model, Manchester artificial intelligence - visual analog scale (MAI-VAS). Bland-Altman analysis was used to assess the variability and matched concordance index to assess risk. Although the mean differences for the two experiments were alike, the limits of agreement between MAI-VAS and a single reader are substantially lower at +SD (standard deviation) 21 (95% CI: 19.65, 21.69) -SD 22 (95% CI: -22.71, -20.68) than between two expert readers +SD 31 (95% CI: 32.08, 29.23) -SD 29 (95% CI: -29.94, -27.09). In addition, breast cancer risk discrimination for the deep learning method and density readings from a single expert was similar, with a matched concordance of 0.628 (95% CI: 0.598, 0.658) and 0.624 (95% CI: 0.595, 0.654), respectively. The automatic method had a similar inter-view agreement to experts and maintained consistency across density quartiles. The artificial intelligence breast density assessment tool MAI-VAS has a better inter-observer agreement with a randomly selected expert reader than that between two expert readers.
Deep learning-based density methods provide consistent density scores without compromising on breast cancer risk discrimination.

Mammography Classification Breast Retrospective Clinical In Silico Academic Lab
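
For readers unfamiliar with the Bland-Altman analysis mentioned in the abstract above, the following minimal sketch computes the mean difference (bias) and limits of agreement between two sets of density readings. It assumes the conventional limits of bias ± 1.96 standard deviations of the paired differences; the abstract does not state which multiplier the authors used, and the readings below are hypothetical toy values.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(
    reader_a: Sequence[float], reader_b: Sequence[float], z: float = 1.96
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)      # systematic offset between the two readers
    spread = stdev(diffs)   # sample standard deviation of the differences
    return bias, bias - z * spread, bias + z * spread


# Hypothetical percent-density readings for four women (not study data).
reader_a = [10, 20, 30, 40]
reader_b = [12, 18, 33, 37]
bias, lower, upper = bland_altman(reader_a, reader_b)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Narrower limits of agreement mean the two raters disagree less on individual cases, which is the comparison the abstract draws between reader-versus-model and reader-versus-reader pairs.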

MammosighTR: Nationwide Breast Cancer Screening Mammogram Dataset with BI-RADS Annotations for Artificial Intelligence Applications.

Koç U, Beşler MS, Sezer EA, Karakaş E, Özkaya YA, Evrimler Ş, Yalçın A, Kızıloğlu A, Kesimal U, Oruç M, Çankaya İ, Koç Keleş D, Merd N, Özkan E, Çevik Nİ, Gökhan MB, Boyraz Hayat B, Özer M, Tokur O, Işık F, Tezcan A, Battal F, Yüzkat M, Sebik NB, Karademir F, Topuz Y, Sezer Ö, Varlı S, Ülgü MM, Akdoğan E, Birinci Ş

•papers•Aug 13 2025

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. The MammosighTR dataset, derived from Türkiye's national breast cancer screening mammography program, provides BI-RADS-labeled mammograms with detailed annotations on breast composition and lesion quadrant location, which may be useful for developing and testing AI models in breast cancer detection. ©RSNA, 2025.

Mammography Detection Breast Dataset Release In Silico Open Dataset

Improving discriminative ability in mammographic microcalcification classification using deep learning: a novel double transfer learning approach validated with an explainable artificial intelligence technique

Arlan, K., Bjornstrom, M., Makela, T., Meretoja, T. J., Hukkinen, K.

•preprint•Aug 11 2025

Background: Breast microcalcification diagnostics are challenging due to their subtle presentation, overlap with benign findings, and high inter-reader variability, often leading to unnecessary biopsies. While deep learning (DL) models, particularly deep convolutional neural networks (DCNNs), have shown potential to improve diagnostic accuracy, their clinical application remains limited by the need for large annotated datasets and the "black box" nature of their decision-making. Purpose: To develop and validate a deep learning model (DCNN) using a double transfer learning (d-TL) strategy for classifying suspected mammographic microcalcifications, with explainable AI (XAI) techniques to support model interpretability. Material and methods: A retrospective dataset of 396 annotated regions of interest (ROIs) from full-field digital mammography (FFDM) images of 194 patients who underwent stereotactic vacuum-assisted biopsy at the Women's Hospital radiological department, Helsinki University Hospital, was collected. The dataset was randomly split into training and test sets (24% test set, balanced for benign and malignant cases). A ResNeXt-based DCNN was developed using a d-TL approach: first pretrained on ImageNet, then adapted using an intermediate mammography dataset before fine-tuning on the target microcalcification data. Saliency maps were generated using Gradient-weighted Class Activation Mapping (Grad-CAM) to evaluate the visual relevance of model predictions. Diagnostic performance was compared to a radiologist's BI-RADS-based assessment, using final histopathology as the reference standard. Results: The ensemble DCNN achieved an area under the ROC curve (AUC) of 0.76, with 65% sensitivity, 83% specificity, 79% positive predictive value (PPV), and 70% accuracy. The radiologist achieved an AUC of 0.65 with 100% sensitivity but lower specificity (30%) and PPV (59%).
Grad-CAM visualizations showed consistent activation of the correct ROIs, even in misclassified cases where confidence scores fell below the threshold. Conclusion: The DCNN model utilizing d-TL achieved performance comparable to radiologists, with higher specificity and PPV than BI-RADS. The approach addresses data limitation issues and may help reduce additional imaging and unnecessary biopsies.

Mammography Classification Breast Retrospective Clinical In Silico Academic Lab GenAI
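
The sensitivity, specificity, PPV, and accuracy figures quoted in the abstract above all derive from the four cells of a confusion matrix. The sketch below shows those definitions in plain Python. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not reconstructed from the study.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: malignant cases called malignant/benign.
    tn/fp: benign cases called benign/malignant.
    """
    if min(tp + fn, tn + fp, tp + fp) == 0:
        raise ValueError("a denominator is zero; metrics are undefined")
    return {
        "sensitivity": tp / (tp + fn),   # share of malignant cases detected
        "specificity": tn / (tn + fp),   # share of benign cases cleared
        "ppv": tp / (tp + fp),           # share of positive calls that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical test set: 50 malignant and 50 benign ROIs.
metrics = diagnostic_metrics(tp=30, fp=10, tn=40, fn=20)

# A reader who calls nearly everything suspicious keeps sensitivity at 100%
# but pays for it in specificity and PPV (again, hypothetical counts).
cautious_reader = diagnostic_metrics(tp=50, fp=35, tn=15, fn=0)
print(metrics)
print(cautious_reader)
```

The second call mirrors the trade-off described in the abstract, where the radiologist's 100% sensitivity came with lower specificity and PPV than the model.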

Dense breasts and women's health: which screenings are essential?

Mota BS, Shimizu C, Reis YN, Gonçalves R, Soares Junior JM, Baracat EC, Filassi JR

•papers•Aug 9 2025

This review synthesizes current evidence regarding optimal breast cancer screening strategies for women with dense breasts, a population at increased risk due to decreased mammographic sensitivity. A systematic literature review was performed in accordance with PRISMA criteria, covering MEDLINE, EMBASE, CINAHL Plus, Scopus, and Web of Science until May 2025. The analysis examines advanced imaging techniques such as digital breast tomosynthesis (DBT), contrast-enhanced spectral mammography (CESM), ultrasound, and magnetic resonance imaging (MRI), assessing their effectiveness in addressing the shortcomings of traditional mammography in dense breast tissue. The review rigorously evaluates the incorporation of risk stratification models, such as the BCSC, in customizing screening regimens, in conjunction with innovative technologies like liquid biopsy and artificial intelligence-based image analysis for improved risk prediction. A key emphasis is placed on the heterogeneity in international screening guidelines and the challenges in translating research findings to diverse clinical settings, particularly in resource-constrained environments. The discussion includes ethical implications regarding compulsory breast density notification and the possibility of intensifying disparities in health care. The review ultimately encourages the development of evidence-based, context-specific guidelines that facilitate equitable access to effective breast cancer screening for all women with dense breasts.

Mammography Classification Breast Review In Silico Ethics Policy

Transformer-Based Explainable Deep Learning for Breast Cancer Detection in Mammography: The MammoFormer Framework

Ojonugwa Oluwafemi Ejiga Peter, Daniel Emakporuena, Bamidele Dayo Tunde, Maryam Abdulkarim, Abdullahi Bn Umar

•preprint•Aug 8 2025

Breast cancer detection through mammography interpretation remains difficult because of the minimal nature of abnormalities that experts need to identify alongside the variable interpretations between readers. The potential of CNNs for medical image analysis faces two limitations: they fail to process both local information and wide contextual data adequately, and do not provide explainable AI (XAI) operations that doctors need to accept them in clinics. The researcher developed the MammoFormer framework, which unites transformer-based architecture with multi-feature enhancement components and XAI functionalities within one framework. Seven different architectures consisting of CNNs, Vision Transformer, Swin Transformer, and ConvNeXt were tested alongside four enhancement techniques, including original images, negative transformation, adaptive histogram equalization, and histogram of oriented gradients. The MammoFormer framework addresses critical clinical adoption barriers of AI mammography systems through: (1) systematic optimization of transformer architectures via architecture-specific feature enhancement, achieving up to 13% performance improvement, (2) comprehensive explainable AI integration providing multi-perspective diagnostic interpretability, and (3) a clinically deployable ensemble system combining CNN reliability with transformer global context modeling. The combination of transformer models with suitable feature enhancements enables them to achieve equal or better results than CNN approaches. ViT achieves 98.3% accuracy alongside AHE while Swin Transformer gains a 13.0% advantage through HOG enhancements.

Mammography Classification Breast Methodology In Silico GenAI Ethics
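
Two of the enhancement techniques named in the abstract above can be illustrated on a tiny grayscale image. The sketch below shows a negative transformation and a global histogram equalization in plain Python. Note the simplification: the abstract refers to adaptive histogram equalization, which applies equalization per local tile; the global version here is a stand-in chosen for brevity and is not the authors' preprocessing. The function names and pixel values are hypothetical.

```python
from typing import List

Image = List[List[int]]


def negative(image: Image, max_value: int = 255) -> Image:
    """Invert intensities so bright structures become dark and vice versa."""
    return [[max_value - px for px in row] for row in image]


def equalize_histogram(image: Image, levels: int = 256) -> Image:
    """Global histogram equalization (simplified stand-in for adaptive HE)."""
    flat = [px for row in image for px in row]
    hist = [0] * levels
    for px in flat:
        hist[px] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(v for v in cdf if v > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]
    # Map each intensity so the output CDF is approximately linear.
    lut = [
        max(0, round((cdf[v] - cdf_min) * (levels - 1) / (n - cdf_min)))
        for v in range(levels)
    ]
    return [[lut[px] for px in row] for row in image]


# Hypothetical 2x2 low-contrast patch.
patch = [[10, 20], [30, 40]]
inverted = negative(patch)
equalized = equalize_histogram(patch)
print(inverted, equalized)
```

Equalization stretches the narrow 10 to 40 range across the full 0 to 255 scale, which is the kind of contrast change an enhancement step feeds to the downstream classifier.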

Advanced Multi-Architecture Deep Learning Framework for BIRADS-Based Mammographic Image Retrieval: Comprehensive Performance Analysis with Super-Ensemble Optimization

MD Shaikh Rahman, Feiroz Humayara, Syed Maudud E Rabbi, Muhammad Mahbubur Rashid

•preprint•Aug 6 2025

Content-based mammographic image retrieval systems require exact BIRADS categorical matching across five distinct classes, presenting significantly greater complexity than binary classification tasks commonly addressed in literature. Current medical image retrieval studies suffer from methodological limitations including inadequate sample sizes, improper data splitting, and insufficient statistical validation that hinder clinical translation. We developed a comprehensive evaluation framework systematically comparing CNN architectures (DenseNet121, ResNet50, VGG16) with advanced training strategies including sophisticated fine-tuning, metric learning, and super-ensemble optimization. Our evaluation employed rigorous stratified data splitting (50%/20%/30% train/validation/test), 602 test queries, and systematic validation using bootstrap confidence intervals with 1,000 samples. Advanced fine-tuning with differential learning rates achieved substantial improvements: DenseNet121 (34.79% precision@10, 19.64% improvement) and ResNet50 (34.54%, 19.58% improvement). Super-ensemble optimization combining complementary architectures achieved 36.33% precision@10 (95% CI: [34.78%, 37.88%]), representing 24.93% improvement over baseline and providing 3.6 relevant cases per query. Statistical analysis revealed significant performance differences between optimization strategies (p<0.001) with large effect sizes (Cohen's d>0.8), while maintaining practical search efficiency (2.8 milliseconds). Performance significantly exceeds realistic expectations for 5-class medical retrieval tasks, where literature suggests 20-25% precision@10 represents achievable performance for exact BIRADS matching. Our framework establishes new performance benchmarks while providing evidence-based architecture selection guidelines for clinical deployment in diagnostic support and quality assurance applications.

Mammography Classification Breast Methodology In Silico Benchmark SOTA
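
The retrieval abstract above reports precision@10 with bootstrap confidence intervals. As a rough illustration, the sketch below computes precision@k for one query (the fraction of the top k retrieved images whose BIRADS category exactly matches the query) and a percentile bootstrap interval over per-query values. The function names, labels, and per-query values are hypothetical, and the percentile method is an assumption; the abstract does not say which bootstrap variant was used.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple


def precision_at_k(query_label: str, retrieved_labels: Sequence[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved items sharing the query's category."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for label in retrieved_labels[:k] if label == query_label) / k


def bootstrap_ci(
    values: Sequence[float], n_resamples: int = 1000, alpha: float = 0.05, seed: int = 0
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample queries with replacement and record the mean each time.
    means: List[float] = sorted(
        mean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[round((alpha / 2) * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical ranked result list for one query of BIRADS category "4".
retrieved = ["4", "4", "3", "4", "2", "1", "4", "5", "3", "4"]
p10 = precision_at_k("4", retrieved)

# Hypothetical per-query precision@10 values for eight queries.
per_query = [0.5, 0.3, 0.4, 0.2, 0.6, 0.4, 0.3, 0.5]
ci_low, ci_high = bootstrap_ci(per_query)
print(p10, mean(per_query), (ci_low, ci_high))
```

Because precision@10 is a fraction of ten results, the reported 36.33% corresponds to 0.3633 × 10 ≈ 3.6 relevant cases per query, which matches the figure in the abstract.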

Retrospective evaluation of interval breast cancer screening mammograms by radiologists and AI.

Subelack J, Morant R, Blum M, Gräwingholt A, Vogel J, Geissler A, Ehlig D

•papers•Aug 4 2025

To determine whether an AI system can identify breast cancer risk in interval breast cancer (IBC) screening mammograms. IBC screening mammograms from a Swiss screening program were retrospectively analyzed by radiologists and an AI system. Radiologists determined whether the IBC mammogram showed human-visible signs of breast cancer (potentially missed IBCs) or not (IBCs without retrospective abnormalities). The AI system provided a case score and a prognostic risk category per mammogram. 119 IBC cases (mean age 57.3 (5.4)) were available with complete retrospective evaluations by radiologists and the AI system. 82 (68.9%) were classified as IBCs without retrospective abnormalities and 37 (31.1%) as potentially missed IBCs. 46.2% of all IBCs received a case score ≥ 25, 25.2% ≥ 50, and 13.4% ≥ 75. Of the 25.2% of the IBCs ≥ 50 (vs. 13.4% of a no breast cancer population), 45.2% had not been discussed during a consensus conference, reflecting 11.4% of all IBC cases. The potentially missed IBCs received significantly higher case scores and risk classifications than IBCs without retrospective abnormalities (case score mean: 54.1 vs. 23.1; high risk: 48.7% vs. 14.7%; p < 0.05). 13.4% of the IBCs without retrospective abnormalities received a case score ≥ 50, of which 62.5% had not been discussed during a consensus conference. An AI system can identify IBC screening mammograms with a higher risk for breast cancer, particularly in potentially missed IBCs but also in some IBCs without retrospective abnormalities where radiologists did not see anything, indicating its ability to improve mammography screening quality. Question: AI presents a promising opportunity to enhance breast cancer screening in general, but evidence is missing regarding its ability to reduce interval breast cancers. Findings: The AI system detected a high risk of breast cancer in most interval breast cancer screening mammograms where radiologists retrospectively detected abnormalities.
Clinical relevance: Utilization of an AI system in mammography screening programs can identify breast cancer risk in many interval breast cancer screening mammograms and thus potentially reduce the number of interval breast cancers.

Mammography Classification Breast Retrospective Clinical In Silico

A RF-based end-to-end Breast Cancer Prediction algorithm.

Win KN

•papers•Aug 1 2025

Breast cancer has become the primary cause of cancer-related deaths among women. Early detection and accurate prediction of breast cancer play a crucial role in improving quality of life. Many scientists have concentrated on developing algorithms and advancing computer-aided diagnosis applications. Although much research has been conducted, feature research on cancer diagnosis is rare, especially regarding predicting the desired features by feeding breast cancer features into the system. In this regard, this paper proposes a Breast Cancer Prediction (RF-BCP) algorithm based on Random Forest that takes these features as inputs to predict cancer. To evaluate the proposed algorithm, two datasets were used, namely the Breast Cancer dataset and a curated mammography dataset, and its accuracy was compared with that of the SVM, Gaussian NB, and KNN algorithms. Experimental results show that the proposed algorithm predicts well and outperforms the other machine learning algorithms in supporting decision-making.

Mammography Classification Breast Methodology In Silico

Application of Tuning-Ensemble N-Best in Auto-Sklearn for Mammographic Radiomic Analysis for Breast Cancer Prediction.

Ismail FA, Karim MKA, Zaidon SIA, Noor KA

•papers•Jul 31 2025

Breast cancer is a major cause of mortality among women globally. While mammography remains the gold standard for detection, its interpretation is often limited by radiologist variability and the challenge of differentiating benign and malignant lesions. The study explores the use of Auto-Sklearn, an automated machine learning (AutoML) framework, for breast tumor classification based on mammographic radiomic features. 244 mammographic images were enhanced using Contrast Limited Adaptive Histogram Equalization (CLAHE) and segmented with the Active Contour Method (ACM). Thirty-seven radiomic features, including first-order statistics, Gray-Level Co-occurrence Matrix (GLCM) texture, and shape features, were extracted and standardized. Auto-Sklearn was employed to automate model selection, hyperparameter tuning, and ensemble construction. The dataset was divided into an 80% training set and a 20% testing set. The initial Auto-Sklearn model achieved an accuracy of 88.71% on the training set and 55.10% on the testing set. After the resampling strategy was applied, the accuracy for the training set and testing set increased to 95.26% and 76.16%, respectively. The area under the receiver operating characteristic curve (ROC-AUC) for the standard and resampling strategies of Auto-Sklearn was 0.660 and 0.840, respectively, outperforming conventional models and demonstrating its efficiency in automating radiomic classification tasks. The findings underscore Auto-Sklearn's ability to automate and enhance tumor classification performance using handcrafted radiomic features. Limitations include dataset size and absence of clinical metadata. This study highlights the application of Auto-Sklearn as a scalable, automated, and clinically relevant tool for breast cancer classification using mammographic radiomics.
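
To make the GLCM texture features mentioned in the abstract above more concrete, here is a minimal plain-Python sketch that builds a normalized, symmetric gray-level co-occurrence matrix for one pixel offset and derives the contrast feature from it. The 4x4 image, the four gray levels, and the horizontal offset are hypothetical illustration choices; the abstract does not specify the offsets, quantization, or software the authors used.

```python
from typing import List

Matrix = List[List[float]]


def glcm(image: List[List[int]], levels: int, dx: int = 1, dy: int = 0) -> Matrix:
    """Normalized symmetric co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p: Matrix) -> float:
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 4x4 patch quantized to gray levels 0..3.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
contrast = glcm_contrast(p)
print(round(contrast, 4))
```

Contrast is low when neighboring pixels tend to share a gray level and grows as adjacent intensities differ, which is why it serves as a simple texture descriptor in radiomic feature sets.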