Mammo-CLIP Dissect: A Framework for Analysing Mammography Concepts in Vision-Language Models

Suaiba Amina Salahuddin, Teresa Dorszewski, Marit Almenning Martiniussen, Tone Hovda, Antonio Portaluri, Solveig Thrun, Michael Kampffmeyer, Elisabeth Wetzer, Kristoffer Wickstrøm, Robert Jenssen

arXiv preprint, Sep 25 2025
Understanding what deep learning (DL) models learn is essential for the safe deployment of artificial intelligence (AI) in clinical settings. While previous work has focused on pixel-based explainability methods, less attention has been paid to the textual concepts learned by these models, which may better reflect the reasoning used by clinicians. We introduce Mammo-CLIP Dissect, the first concept-based explainability framework for systematically dissecting DL vision models trained for mammography. Leveraging a mammography-specific vision-language model (Mammo-CLIP) as a "dissector," our approach labels neurons at specified layers with human-interpretable textual concepts and quantifies their alignment to domain knowledge. Using Mammo-CLIP Dissect, we investigate three key questions: (1) how concept learning differs between DL vision models trained on general image datasets versus mammography-specific datasets; (2) how fine-tuning for downstream mammography tasks affects concept specialisation; and (3) which mammography-relevant concepts remain underrepresented. We show that models trained on mammography data capture more clinically relevant concepts and align more closely with radiologists' workflows than models not trained on mammography data. Fine-tuning for task-specific classification enhances the capture of certain concept categories (e.g., benign calcifications) but can reduce coverage of others (e.g., density-related features), indicating a trade-off between specialisation and generalisation. Our findings show that Mammo-CLIP Dissect provides insights into how convolutional neural networks (CNNs) capture mammography-specific knowledge. By comparing models across training data and fine-tuning regimes, we reveal how domain-specific training and task-specific adaptation shape concept learning. Code and concept set are available: https://github.com/Suaiba/Mammo-CLIP-Dissect.
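
As a rough illustration of the neuron-labelling idea described in this abstract, the sketch below assigns each neuron the concept whose per-image score correlates best with that neuron's activations. It is not the authors' implementation: the Pearson-correlation matching rule, the function names `pearson` and `label_neurons`, and the toy numbers are all assumptions made for illustration. In the described framework, the per-image concept scores would come from the Mammo-CLIP dissector rather than being typed in by hand.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def label_neurons(activations, concept_scores, concept_names):
    """Label each neuron with its best-matching concept.

    activations[i][j]    : activation of neuron j on probe image i
    concept_scores[i][c] : image-text similarity of image i to concept c
                           (assumed to be supplied by a vision-language model)
    Returns one (concept_name, correlation) pair per neuron.
    """
    labels = []
    for j in range(len(activations[0])):
        neuron = [row[j] for row in activations]
        scores = [
            pearson(neuron, [row[c] for row in concept_scores])
            for c in range(len(concept_names))
        ]
        best = max(range(len(concept_names)), key=scores.__getitem__)
        labels.append((concept_names[best], scores[best]))
    return labels


# Hypothetical toy data: 4 probe images, 2 neurons, 2 concepts.
toy_activations = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.9]]
toy_concept_scores = [[0.95, 0.05], [0.85, 0.10], [0.05, 0.80], [0.10, 0.90]]
toy_concepts = ["calcification", "mass"]

for index, (name, score) in enumerate(
    label_neurons(toy_activations, toy_concept_scores, toy_concepts)
):
    print(f"neuron {index}: {name} (r = {score:.2f})")
```

Counting how many neurons land on each concept category would then give a simple picture of which concepts a layer covers and which remain underrepresented.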

Region-of-Interest Augmentation for Mammography Classification under Patient-Level Cross-Validation

Farbod Bigdeli, Mohsen Mohammadagha, Ali Bigdeli

arXiv preprint, Sep 24 2025
Breast cancer screening with mammography remains central to early detection and mortality reduction. Deep learning has shown strong potential for automating mammogram interpretation, yet limited-resolution datasets and small sample sizes continue to restrict performance. We revisit the Mini-DDSM dataset (9,684 images; 2,414 patients) and introduce a lightweight region-of-interest (ROI) augmentation strategy. During training, full images are probabilistically replaced with random ROI crops sampled from a precomputed, label-free bounding-box bank, with optional jitter to increase variability. We evaluate under strict patient-level cross-validation and report ROC-AUC, PR-AUC, and training-time efficiency metrics (throughput and GPU memory). Because ROI augmentation is training-only, inference-time cost remains unchanged. On Mini-DDSM, ROI augmentation (best: p_roi = 0.10, alpha = 0.10) yields modest average ROC-AUC gains, with performance varying across folds; PR-AUC is flat to slightly lower. These results demonstrate that simple, data-centric ROI strategies can enhance mammography classification in constrained settings without requiring additional labels or architectural modifications.
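
The training-time replacement step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: the helper names (`jitter_box`, `clamp_box`, `roi_augment`) are hypothetical, `alpha` is assumed to be the jitter magnitude as a fraction of the box size (the abstract does not define it), and the image is a plain list of pixel rows so the example needs no imaging library.

```python
import random


def jitter_box(box, alpha, rng):
    """Shift a box (x0, y0, x1, y1) by up to +/- alpha of its width and height."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    dx = rng.uniform(-alpha, alpha) * width
    dy = rng.uniform(-alpha, alpha) * height
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)


def clamp_box(box, width, height):
    """Round a box to integer pixels and keep it non-empty and inside the image."""
    x0, y0, x1, y1 = box
    x0 = min(max(int(round(x0)), 0), width - 1)
    y0 = min(max(int(round(y0)), 0), height - 1)
    x1 = min(max(int(round(x1)), x0 + 1), width)
    y1 = min(max(int(round(y1)), y0 + 1), height)
    return (x0, y0, x1, y1)


def roi_augment(image, box_bank, p_roi=0.10, alpha=0.10, rng=random):
    """With probability p_roi, return a jittered ROI crop instead of the full image.

    image    : 2D list of pixel rows
    box_bank : precomputed list of (x0, y0, x1, y1) boxes for this image
    """
    if not box_bank or rng.random() >= p_roi:
        return image  # keep the full image most of the time
    height, width = len(image), len(image[0])
    box = jitter_box(rng.choice(box_bank), alpha, rng)
    x0, y0, x1, y1 = clamp_box(box, width, height)
    return [row[x0:x1] for row in image[y0:y1]]


# Hypothetical 8x8 image with a single box in its bank.
toy_image = [[10 * r + c for c in range(8)] for r in range(8)]
toy_bank = [(2, 2, 6, 6)]
crop = roi_augment(toy_image, toy_bank, p_roi=1.0, alpha=0.10, rng=random.Random(0))
print(len(crop), "x", len(crop[0]))
```

In a real pipeline the crop would typically be resized to the network's input resolution. Because the function is only called in the training data loader, inference sees full images and its cost is unchanged, as the abstract notes.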

A Contrastive Learning Framework for Breast Cancer Detection

Samia Saeed, Khuram Naveed

arXiv preprint, Sep 24 2025
Breast cancer, the second leading cause of cancer-related deaths globally, accounts for a quarter of all cancer cases [1]. To lower this death rate, it is crucial to detect tumors early, as early-stage detection significantly improves treatment outcomes. Advances in non-invasive imaging techniques have made early detection possible through computer-aided detection (CAD) systems, which rely on traditional image analysis to identify malignancies. However, there is a growing shift towards deep learning methods due to their superior effectiveness. Despite their potential, deep learning methods often struggle with accuracy due to the limited availability of large labeled datasets for training. To address this issue, our study introduces a Contrastive Learning (CL) framework, which excels with smaller labeled datasets. We train ResNet-50 with a semi-supervised CL approach, using a similarity index, on a large amount of unlabeled mammogram data. We apply various augmentations and transformations, which help improve the performance of our approach. Finally, we fine-tune our model on a small set of labeled data, and the resulting model outperforms the existing state of the art. Specifically, we observed 96.7% accuracy in detecting breast cancer on the benchmark datasets INbreast and MIAS.
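
The abstract does not state which contrastive objective or similarity index is used. As one plausible illustration only, the sketch below computes the NT-Xent loss popularised by SimCLR, with cosine similarity between embeddings of two augmented views of each image. The choice of loss, the temperature value, and the names `cosine` and `nt_xent` are assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z_a, z_b, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings.

    z_a[i] and z_b[i] are embeddings of two augmented views of image i.
    Each view must pick out its partner among all other views in the batch.
    """
    z = z_a + z_b
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        logits = [cosine(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        positive_logit = cosine(z[i], z[positive]) / temperature
        # Numerically stable log-sum-exp over all non-self similarities.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - positive_logit
    return total / (2 * n)


# Hypothetical 2D embeddings: matching views are close, mismatched views are not.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b_matched = [[0.9, 0.1], [0.1, 0.9]]
view_b_swapped = [[0.1, 0.9], [0.9, 0.1]]

print(nt_xent(view_a, view_b_matched) < nt_xent(view_a, view_b_swapped))
```

The loss is lower when the two views of the same image are embedded close together, which is the property that lets an encoder learn from unlabeled mammograms before fine-tuning on a small labeled set.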

A Versatile Foundation Model for AI-enabled Mammogram Interpretation

Fuxiang Huang, Jiayi Zhu, Yunfang Yu, Yu Xie, Yuan Guo, Qingcong Kong, Mingxiang Wu, Xinrui Jiang, Shu Yang, Jiabo Ma, Ziyi Liu, Zhe Xu, Zhixuan Chen, Yujie Tan, Zifan He, Luhui Mao, Xi Wang, Junlin Hou, Lei Zhang, Qiong Luo, Zhenhui Li, Herui Yao, Hao Chen

arXiv preprint, Sep 24 2025
Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related mortality in women globally. Mammography is essential for the early detection and diagnosis of breast lesions. Despite recent progress in foundation models (FMs) for mammogram analysis, their clinical translation remains constrained by several fundamental limitations, including insufficient diversity in training data, limited model generalizability, and a lack of comprehensive evaluation across clinically relevant tasks. Here, we introduce VersaMammo, a versatile foundation model for mammograms, designed to overcome these limitations. We curated the largest multi-institutional mammogram dataset to date, comprising 706,239 images from 21 sources. To improve generalization, we propose a two-stage pre-training strategy to develop VersaMammo, a mammogram foundation model. First, a teacher model is trained via self-supervised learning to extract transferable features from unlabeled mammograms. Then, supervised learning combined with knowledge distillation transfers both features and clinical knowledge into VersaMammo. To ensure a comprehensive evaluation, we established a benchmark comprising 92 specific tasks, including 68 internal tasks and 24 external validation tasks, spanning 5 major clinical task categories: lesion detection, segmentation, classification, image retrieval, and visual question answering. VersaMammo achieves state-of-the-art performance, ranking first in 50 out of 68 specific internal tasks and 20 out of 24 external validation tasks, with average ranks of 1.5 and 1.2, respectively. These results demonstrate its superior generalization and clinical utility, offering a substantial advancement toward reliable and scalable breast cancer screening and diagnosis.

The LongiMam model for improved breast cancer risk prediction using longitudinal mammograms

Manel Rakez, Thomas Louis, Julien Guillaumin, Foucauld Chamming's, Pierre Fillard, Brice Amadeo, Virginie Rondeau

arXiv preprint, Sep 23 2025
Risk-adapted breast cancer screening requires robust models that leverage longitudinal imaging data. Most current deep learning models use single or limited prior mammograms and lack adaptation for real-world settings marked by imbalanced outcome distribution and heterogeneous follow-up. We developed LongiMam, an end-to-end deep learning model that integrates both current and up to four prior mammograms. LongiMam combines a convolutional and a recurrent neural network to capture spatial and temporal patterns predictive of breast cancer. The model was trained and evaluated using a large, population-based screening dataset with disproportionate case-to-control ratio typical of clinical screening. Across several scenarios that varied in the number and composition of prior exams, LongiMam consistently improved prediction when prior mammograms were included. The addition of prior and current visits outperformed single-visit models, while priors alone performed less well, highlighting the importance of combining historical and recent information. Subgroup analyses confirmed the model's efficacy across key risk groups, including women with dense breasts and those aged 55 years or older. Moreover, the model performed best in women with observed changes in mammographic density over time. These findings demonstrate that longitudinal modeling enhances breast cancer prediction and support the use of repeated mammograms to refine risk stratification in screening programs. LongiMam is publicly available as open-source software.

Benign vs malignant tumors classification from tumor outlines in mammography scans using artificial intelligence techniques.

Beni HM, Asaei FY

PubMed paper, Sep 21 2025
Breast cancer is one of the leading causes of cancer death among women. Early diagnosis increases the probability of survival. For this purpose, medical imaging methods, especially mammography, are used for screening and early diagnosis of breast abnormalities. The main goal of this study is to distinguish benign from malignant tumors based on morphology features derived from tumor outlines in mammography images. Unlike previous studies, this study does not use the mammographic image itself but only the exact outline of the tumor. These outlines were extracted from a new, publicly available mammography database published in 2024. The outline features were calculated using well-known pre-trained Convolutional Neural Networks (CNNs), including VGG16, ResNet50, Xception65, AlexNet, DenseNet, GoogLeNet, Inception-v3, and a combination of them to improve performance. These pre-trained networks have been used in many studies in various fields. For classification, well-known Machine Learning (ML) algorithms, such as Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Neural Network (NN), Naïve Bayes (NB), Decision Tree (DT), and a combination of them, were compared on four outcome measures: accuracy, specificity, sensitivity, and precision. Data augmentation increased the dataset size about 6-8 times, and K-fold cross-validation (K = 5) was used. Based on the performed simulations, combining the features from all pre-trained deep networks with the NB classifier gave the best results: 88.13% accuracy, 92.52% specificity, 83.73% sensitivity, and 92.04% precision. Furthermore, validation on the DMID dataset using ResNet50 features with the NB classifier led to 92.03% accuracy, 95.57% specificity, 88.49% sensitivity, and 95.23% precision. This study sheds light on using AI algorithms to avoid biopsy tests and speed up breast cancer tumor classification using tumor outlines in mammographic images.
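
For readers comparing the four outcome measures reported here, the sketch below shows how each is derived from a binary confusion matrix. It assumes malignant is the positive class, which the abstract does not state, and the counts are hypothetical rather than the study's data; `binary_metrics` is an illustrative name.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and precision from confusion-matrix counts.

    tp / fn : positive (assumed malignant) cases classified correctly / incorrectly
    tn / fp : negative (assumed benign) cases classified correctly / incorrectly
    A ratio with an empty denominator is reported as None.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # recall on the positive class
        "specificity": ratio(tn, tn + fp),  # recall on the negative class
        "precision": ratio(tp, tp + fp),    # positive predictive value
    }


# Hypothetical counts for 100 positive and 100 negative cases.
for name, value in binary_metrics(tp=80, fp=10, tn=90, fn=20).items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity can diverge substantially, as in the figures above, so reporting accuracy alone would hide how often malignant cases are missed.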

Influence of Mammography Acquisition Parameters on AI and Radiologist Interpretive Performance.

Lotter W, Hippe DS, Oshiro T, Lowry KP, Milch HS, Miglioretti DL, Elmore JG, Lee CI, Hsu W

PubMed paper, Sep 17 2025
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.
Purpose: To evaluate the impact of screening mammography acquisition parameters on the interpretive performance of AI and radiologists. Materials and Methods: The associations between seven mammogram acquisition parameters (mammography machine version, kVp, x-ray exposure delivered, relative x-ray exposure, paddle size, compression force, and breast thickness) and AI and radiologist performance in interpreting two-dimensional screening mammograms acquired by a diverse health system between December 2010 and 2019 were retrospectively evaluated. The top 11 AI models and the ensemble model from the Digital Mammography DREAM Challenge were assessed. The associations between each acquisition parameter and the sensitivity and specificity of the AI models and the radiologists' interpretations were separately evaluated using generalized estimating equations-based models at the examination level, adjusted for several clinical factors. Results: The dataset included 28,278 screening two-dimensional mammograms from 22,626 women (mean age 58.5 years ± 11.5 [SD]; 4913 women had multiple mammograms). Of these, 324 examinations resulted in breast cancer diagnosis within 1 year. The acquisition parameters were significantly associated with the performance of both AI and radiologists, with absolute effect sizes reaching 10% for sensitivity and 5% for specificity; however, the associations differed between AI and radiologists for several parameters. Increased exposure delivered reduced the specificity for the ensemble AI (-4.5% per 1 SD increase; P < .001) but not radiologists (P = .44). Increased compression force reduced the specificity for radiologists (-1.3% per 1 SD increase; P < .001) but not for AI (P = .60). Conclusion: Screening mammography acquisition parameters impacted the performance of both AI and radiologists, with some parameters impacting performance differently. ©RSNA, 2025.

Predicting cardiovascular events from routine mammograms using machine learning.

Barraclough JY, Gandomkar Z, Fletcher RA, Barbieri S, Kuo NI, Rodgers A, Douglas K, Poppe KK, Woodward M, Luxan BG, Neal B, Jorm L, Brennan P, Arnott C

PubMed paper, Sep 16 2025
Cardiovascular risk is underassessed in women. Many women undergo screening mammography in midlife when the risk of cardiovascular disease rises. Mammographic features such as breast arterial calcification and tissue density are associated with cardiovascular risk. We developed and tested a deep learning algorithm for cardiovascular risk prediction based on routine mammography images. Lifepool is a cohort of women with at least one screening mammogram linked to hospitalisation and death databases. A deep learning model based on the DeepSurv architecture was developed to predict major cardiovascular events from mammography images. Model performance was compared against standard risk prediction models using the concordance index, which is comparable to Harrell's C-statistic. There were 49 196 women included, with a median follow-up of 8.8 years (IQR 7.7-10.6), among whom 3392 experienced a first major cardiovascular event. The DeepSurv model using mammography features and participant age had a concordance index of 0.72 (95% CI 0.71 to 0.73), with similar performance to modern models containing age and clinical variables including the New Zealand 'PREDICT' tool and the American Heart Association 'PREVENT' equations. A deep learning algorithm based on only mammographic features and age predicted cardiovascular risk with performance comparable to traditional cardiovascular risk equations. Risk assessments based on mammography may be a novel opportunity for improving cardiovascular risk screening in women.
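
To make the concordance index concrete, the sketch below computes a Harrell-style C-statistic for right-censored follow-up data: among comparable pairs, it is the fraction in which the person with the earlier observed event was assigned the higher risk. The function name `concordance_index`, the tie handling (tied risk scores count as half), and the toy follow-up data are assumptions for illustration; this is not the study's evaluation code.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored data.

    times       : follow-up time per person
    events      : 1 if the event was observed, 0 if censored
    risk_scores : predicted risk (higher means an earlier event is expected)
    A pair (i, j) is comparable when i had an observed event before time j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored person cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (years), event indicators, and predicted risks.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # prints 1.0
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which puts the reported 0.72 in context.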

Mammographic features in screening mammograms with high AI scores but a true-negative screening result.

Koch HW, Bergan MB, Gjesvik J, Larsen M, Bartsch H, Haldorsen IHS, Hofvind S

PubMed paper, Sep 16 2025
Background: The use of artificial intelligence (AI) in screen-reading of mammograms has shown promising results for cancer detection. However, less attention has been paid to the false positives generated by AI. Purpose: To investigate mammographic features in screening mammograms with high AI scores but a true-negative screening result. Material and Methods: In this retrospective study, 54,662 screening examinations from BreastScreen Norway 2010-2022 were analyzed with a commercially available AI system (Transpara v. 2.0.0). An AI score of 1-10 indicated the suspiciousness of malignancy. We selected examinations with an AI score of 10, with a true-negative screening result, followed by two consecutive true-negative screening examinations. Of the 2,124 examinations matching these criteria, 382 random examinations underwent blinded consensus review by three experienced breast radiologists. The examinations were classified according to mammographic features, radiologist interpretation score (1-5), and mammographic breast density (BI-RADS 5th ed. a-d). Results: The reviews classified 91.1% (348/382) of the examinations as negative (interpretation score 1). All examinations (26/26) categorized as BI-RADS d were given an interpretation score of 1. Classification of mammographic features: asymmetry = 30.6% (117/382); calcifications = 30.1% (115/382); asymmetry with calcifications = 29.3% (112/382); mass = 8.9% (34/382); distortion = 0.8% (3/382); spiculated mass = 0.3% (1/382). For examinations with calcifications, 79.1% (91/115) were classified with benign morphology. Conclusion: The majority of false-positive screening examinations generated by AI were classified as non-suspicious in a retrospective blinded consensus review and would likely not have been recalled for further assessment in a real screening setting using AI as a decision support.

GLAM: Geometry-Guided Local Alignment for Multi-View VLP in Mammography

Yuexi Du, Lihui Chen, Nicha C. Dvornek

arXiv preprint, Sep 12 2025
Mammography screening is an essential tool for early detection of breast cancer. The speed and accuracy of mammography interpretation have the potential to be improved with deep learning methods. However, the development of a foundation visual language model (VLM) is hindered by limited data and domain differences between natural and medical images. Existing mammography VLMs, adapted from natural images, often ignore domain-specific characteristics, such as multi-view relationships in mammography. Unlike radiologists who analyze both views together to process ipsilateral correspondence, current methods treat them as independent images or do not properly model the multi-view correspondence learning, losing critical geometric context and resulting in suboptimal prediction. We propose GLAM: Global and Local Alignment for Multi-view mammography for VLM pretraining using geometry guidance. By leveraging the prior knowledge about the multi-view imaging process of mammograms, our model learns local cross-view alignments and fine-grained local features through joint global and local, visual-visual, and visual-language contrastive learning. Pretrained on EMBED [14], one of the largest open mammography datasets, our model outperforms baselines across multiple datasets under different settings.