
Deep Learning-Based Breast Cancer Detection in Mammography: A Multi-Center Validation Study in Thai Population

Isarun Chamveha, Supphanut Chaiyungyuen, Sasinun Worakriangkrai, Nattawadee Prasawang, Warasinee Chaisangmongkon, Pornpim Korpraphong, Voraparee Suvannarerg, Shanigarn Thiravit, Chalermdej Kannawat, Kewalin Rungsinaporn, Suwara Issaragrisil, Payia Chadbunchachai, Pattiya Gatechumpol, Chawiporn Muktabhant, Patarachai Sereerat

arXiv preprint · May 29, 2025
This study presents a deep learning system for breast cancer detection in mammography, developed using a modified EfficientNetV2 architecture with enhanced attention mechanisms. The model was trained on mammograms from a major Thai medical center and validated on three distinct datasets: an in-domain test set (9,421 cases), a biopsy-confirmed set (883 cases), and an out-of-domain generalizability set (761 cases) collected from two different hospitals. For cancer detection, the model achieved AUROCs of 0.89, 0.96, and 0.94 on the respective datasets. The system's lesion localization capability, evaluated using metrics including Lesion Localization Fraction (LLF) and Non-Lesion Localization Fraction (NLF), demonstrated robust performance in identifying suspicious regions. Clinical validation through concordance tests showed strong agreement with radiologists: 83.5% classification and 84.0% localization concordance for biopsy-confirmed cases, and 78.1% classification and 79.6% localization concordance for out-of-domain cases. Expert radiologists' acceptance rate averaged 96.7% for biopsy-confirmed cases and 89.3% for out-of-domain cases. The system achieved a System Usability Scale score of 74.17 for the source hospital and 69.20 for the validation hospitals, indicating good clinical acceptance. These results demonstrate the model's effectiveness in assisting mammogram interpretation, with the potential to enhance breast cancer screening workflows in clinical practice.
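
The abstract does not specify how the attention mechanisms are attached to the EfficientNetV2 backbone, so the following is only a minimal sketch of the general idea, assuming a squeeze-and-excitation-style channel attention stage and a single-logit cancer/no-cancer head (PyTorch, hypothetical layer sizes):

```python
# Minimal sketch (not the authors' code): an EfficientNetV2 backbone with a
# simple channel-attention stage for binary mammogram classification. The
# paper's "enhanced attention mechanisms" are not specified in the abstract,
# so the attention block here is only illustrative.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_v2_s

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # re-weight feature maps channel-wise

class MammoClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = efficientnet_v2_s(weights=None)
        self.features = backbone.features        # convolutional feature extractor
        self.attention = ChannelAttention(1280)   # hypothetical attention stage
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(1280, 1),                   # single logit: cancer vs. no cancer
        )

    def forward(self, x):
        return self.head(self.attention(self.features(x)))

if __name__ == "__main__":
    model = MammoClassifier()
    logits = model(torch.randn(2, 3, 512, 512))
    print(logits.shape)  # torch.Size([2, 1])
```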

CT-denoimer: efficient contextual transformer network for low-dose CT denoising.

Zhang Y, Xu F, Zhang R, Guo Y, Wang H, Wei B, Ma F, Meng J, Liu J, Lu H, Chen Y

PubMed · May 29, 2025
Low-dose computed tomography (LDCT) effectively reduces radiation exposure to patients, but introduces severe noise artifacts that affect diagnostic accuracy. Recently, Transformer-based network architectures have been widely applied to LDCT image denoising, generally achieving superior results compared to traditional convolutional methods. However, these methods are often hindered by high computational costs and struggle to capture complex local contextual features, which negatively impacts denoising performance. In this work, we propose CT-Denoimer, an efficient CT Denoising Transformer network that captures both global correlations and intricate, spatially varying local contextual details in CT images, enabling the generation of high-quality images. The core of our framework is a Transformer module that consists of two key components: the Multi-Dconv head Transposed Attention (MDTA) and the Mixed Contextual Feed-forward Network (MCFN). The MDTA block captures global correlations in the image with linear computational complexity, while the MCFN block manages multi-scale local contextual information, both static and dynamic, through a series of Enhanced Contextual Transformer (eCoT) modules. In addition, we incorporate Operation-Wise Attention Layers (OWALs) to enable collaborative refinement in the proposed CT-Denoimer, enhancing its ability to more effectively handle complex and varying noise patterns in LDCT images. Extensive experimental validation on both the AAPM-Mayo public dataset and a real-world clinical dataset demonstrated the state-of-the-art performance of the proposed CT-Denoimer. It achieved a peak signal-to-noise ratio (PSNR) of 33.681 dB, a structural similarity index measure (SSIM) of 0.921, an information fidelity criterion (IFC) of 2.857, and a visual information fidelity (VIF) of 0.349. Subjective assessment by radiologists gave an average score of 4.39, confirming its clinical applicability and clear advantages over existing methods. This study presents an innovative CT denoising Transformer network that sets a new benchmark in LDCT image denoising, excelling in both noise reduction and fine structure preservation.
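
As a rough illustration of the transposed-attention idea the abstract describes (attention computed across channels, so cost scales linearly with the number of pixels), here is a minimal MDTA-style block; the dimensions and details are assumptions rather than the authors' implementation:

```python
# Illustrative sketch of a Multi-Dconv-head Transposed Attention (MDTA) block:
# the attention map is C x C (channels vs. channels) instead of pixels vs.
# pixels, which is how the linear complexity in image size arises.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MDTA(nn.Module):
    def __init__(self, dim: int = 48, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.temperature = nn.Parameter(torch.ones(heads, 1, 1))
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1)
        self.qkv_dw = nn.Conv2d(dim * 3, dim * 3, kernel_size=3, padding=1, groups=dim * 3)
        self.project = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv_dw(self.qkv(x)).chunk(3, dim=1)
        # reshape to (batch, heads, channels_per_head, pixels)
        shape = (b, self.heads, c // self.heads, h * w)
        q, k, v = (t.reshape(shape) for t in (q, k, v))
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature  # C x C attention map
        out = attn.softmax(dim=-1) @ v
        return self.project(out.reshape(b, c, h, w)) + x      # residual connection

if __name__ == "__main__":
    block = MDTA(dim=48, heads=4)
    print(block(torch.randn(1, 48, 64, 64)).shape)  # torch.Size([1, 48, 64, 64])
```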

Image Aesthetic Reasoning: A New Benchmark for Medical Image Screening with MLLMs

Zheng Sun, Yi Wei, Long Yu

arXiv preprint · May 29, 2025
Multimodal Large Language Models (MLLMs) have broad applications across many domains, such as multimodal understanding and generation. With the development of diffusion models (DM) and unified MLLMs, the performance of image generation has improved significantly; however, image screening has received little study, and MLLM performance on it is unsatisfactory due to the lack of data and the weak image aesthetic reasoning ability of MLLMs. In this work, we propose a complete solution to address these problems in terms of data and methodology. For data, we collect a comprehensive medical image screening dataset with 1500+ samples, each sample consisting of a medical image, four generated images, and a multiple-choice answer. The dataset evaluates the aesthetic reasoning ability under four aspects: (1) Appearance Deformation, (2) Principles of Physical Lighting and Shadow, (3) Placement Layout, (4) Extension Rationality. For methodology, we utilize long chains of thought (CoT) and Group Relative Policy Optimization with a Dynamic Proportional Accuracy reward, called DPA-GRPO, to enhance the image aesthetic reasoning ability of MLLMs. Our experimental results reveal that even state-of-the-art closed-source MLLMs, such as GPT-4o and Qwen-VL-Max, exhibit performance akin to random guessing in image aesthetic reasoning. In contrast, by leveraging the reinforcement learning approach, we are able to surpass the score of both large-scale models and leading closed-source models using a much smaller model. We hope our work on medical image screening will serve as a standard configuration for image aesthetic reasoning in the future.
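
The exact form of the Dynamic Proportional Accuracy reward is not given in the abstract; the sketch below only illustrates the group-relative advantage step of GRPO with a simple proportional-accuracy reward standing in for DPA:

```python
# Minimal numerical sketch of GRPO's group-relative advantage computation with
# a proportional accuracy reward for multiple-choice screening answers. The
# reward definition here (fraction of gold choices recovered, penalizing extra
# picks) is an assumption for illustration, not the paper's DPA reward.
from statistics import mean, pstdev

def proportional_accuracy(predicted: set, gold: set) -> float:
    """Reward in [0, 1]: share of gold choices recovered, penalizing extra picks."""
    if not predicted:
        return 0.0
    hits = len(predicted & gold)
    return hits / max(len(gold), len(predicted))

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantage: normalize each sampled response's reward within its group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + 1e-8) for r in rewards]

if __name__ == "__main__":
    gold = {"B", "C"}
    sampled_answers = [{"B", "C"}, {"B"}, {"A", "D"}, {"B", "C", "D"}]
    rewards = [proportional_accuracy(ans, gold) for ans in sampled_answers]
    print(rewards)                          # [1.0, 0.5, 0.0, 0.666...]
    print(group_relative_advantages(rewards))
```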

Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning

Jinquan Guan, Qi Chen, Lizhou Liang, Yuhang Liu, Vu Minh Hieu Phan, Minh-Son To, Jian Chen, Yutong Xie

arXiv preprint · May 29, 2025
Artificial intelligence (AI)-based chest X-ray (CXR) interpretation assistants have demonstrated significant progress and are increasingly being applied in clinical settings. However, contemporary medical AI models often adhere to a simplistic input-to-output paradigm, directly processing an image and an instruction to generate a result, where the instructions may be integral to the model's architecture. This approach overlooks the modeling of the inherent diagnostic reasoning in chest X-ray interpretation. Such reasoning is typically sequential, where each interpretive stage considers the images, the current task, and the contextual information from previous stages. This oversight leads to several shortcomings, including misalignment with clinical scenarios, contextless reasoning, and untraceable errors. To fill this gap, we construct CXRTrek, a new multi-stage visual question answering (VQA) dataset for CXR interpretation. The dataset is designed to explicitly simulate the diagnostic reasoning process employed by radiologists in real-world clinical settings for the first time. CXRTrek covers 8 sequential diagnostic stages, comprising 428,966 samples and over 11 million question-answer (Q&A) pairs, with an average of 26.29 Q&A pairs per sample. Building on the CXRTrek dataset, we propose a new vision-language large model (VLLM), CXRTrekNet, specifically designed to incorporate the clinical reasoning flow into the VLLM framework. CXRTrekNet effectively models the dependencies between diagnostic stages and captures reasoning patterns within the radiological context. Trained on our dataset, the model consistently outperforms existing medical VLLMs on the CXRTrek benchmarks and demonstrates superior generalization across multiple tasks on five diverse external datasets. The dataset and model can be found in our repository (https://github.com/guanjinquan/CXRTrek).
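
A schematic of the stage-by-stage reasoning flow described above might look like the following; the stage questions and the `vqa_model` callable are placeholders rather than the actual CXRTrek stages or the CXRTrekNet API:

```python
# Schematic sketch of multi-stage CXR interpretation: each diagnostic stage
# receives the image, the current stage's question, and the accumulated context
# from earlier stages. The stage list below is hypothetical (CXRTrek defines 8
# stages, not enumerated in the abstract).
from typing import Callable

STAGES = [
    "Assess image quality and technique",
    "Identify visible devices or hardware",
    "Describe lung findings",
    "Describe cardiac and mediastinal findings",
    "Summarize the overall impression",
]

def interpret_cxr(image, vqa_model: Callable[[object, str, str], str]) -> list[tuple[str, str]]:
    """Run the stages in order, feeding each stage the answers from previous stages."""
    context = ""
    transcript = []
    for question in STAGES:
        answer = vqa_model(image, question, context)     # image + task + prior context
        transcript.append((question, answer))
        context += f"Q: {question}\nA: {answer}\n"        # accumulate for later stages
    return transcript

if __name__ == "__main__":
    def dummy_model(img, question, context):
        return f"(answer to '{question}' given {len(context)} chars of context)"
    for q, a in interpret_cxr(object(), dummy_model):
        print(q, "->", a)
```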

Deep learning enables fast and accurate quantification of MRI-guided near-infrared spectral tomography for breast cancer diagnosis.

Feng J, Tang Y, Lin S, Jiang S, Xu J, Zhang W, Geng M, Dang Y, Wei C, Li Z, Sun Z, Jia K, Pogue BW, Paulsen KD

PubMed · May 29, 2025
The utilization of magnetic resonance (MR) imaging to guide near-infrared spectral tomography (NIRST) shows significant potential for improving the specificity and sensitivity of breast cancer diagnosis. However, the efficiency and accuracy of NIRST image reconstruction have been limited by the complexities of light propagation modeling and MRI image segmentation. To address these challenges, we developed and evaluated a deep learning-based approach for MR-guided 3D NIRST image reconstruction (DL-MRg-NIRST). Using a network trained on synthetic data, the DL-MRg-NIRST system reconstructed images from data acquired during 38 clinical imaging exams of patients with breast abnormalities. Statistical analysis of the results demonstrated a sensitivity of 87.5%, a specificity of 92.9%, and a diagnostic accuracy of 89.5% in distinguishing pathologically defined benign from malignant lesions. Additionally, the combined use of MRI and DL-MRg-NIRST diagnoses achieved an area under the receiver operating characteristic (ROC) curve of 0.98. Remarkably, the DL-MRg-NIRST image reconstruction process required only 1.4 seconds, significantly faster than state-of-the-art MR-guided NIRST methods.
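
As a quick sanity check of the reported operating point, the confusion matrix below is one split of the 38 exams that is consistent with the stated sensitivity, specificity, and accuracy; the actual benign/malignant case counts are not reported in the abstract and are only assumed here:

```python
# Back-of-envelope check of the reported diagnostic metrics. The split of the
# 38 exams below is an assumption chosen to be consistent with the reported
# 87.5% sensitivity, 92.9% specificity, and 89.5% accuracy.
tp, fn = 21, 3    # assumed malignant cases: 24 total, 21 detected
tn, fp = 13, 1    # assumed benign cases: 14 total, 13 correctly ruled out

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)

print(f"sensitivity = {sensitivity:.1%}")   # 87.5%
print(f"specificity = {specificity:.1%}")   # 92.9%
print(f"accuracy    = {accuracy:.1%}")      # 89.5%
```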

Integrating SEResNet101 and SE-VGG19 for advanced cervical lesion detection: a step forward in precision oncology.

Ye Y, Chen Y, Pan J, Li P, Ni F, He H

PubMed · May 28, 2025
Cervical cancer remains a significant global health issue, with accurate differentiation between low-grade (LSIL) and high-grade (HSIL) squamous intraepithelial lesions crucial for effective screening and management. Current methods, such as Pap smears and HPV testing, often fall short in sensitivity and specificity. Deep learning models hold the potential to enhance the accuracy of cervical cancer screening but require thorough evaluation to ascertain their practical utility. This study compares the performance of two advanced deep learning models, SEResNet101 and SE-VGG19, in classifying cervical lesions using a dataset of 3,305 high-quality colposcopy images. We assessed the models based on their accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The SEResNet101 model demonstrated superior performance over SE-VGG19 across all evaluated metrics. Specifically, SEResNet101 achieved a sensitivity of 95%, a specificity of 97%, and an AUC of 0.98, compared to 89% sensitivity, 93% specificity, and an AUC of 0.94 for SE-VGG19. These findings suggest that SEResNet101 could significantly reduce both over- and under-treatment rates by enhancing diagnostic precision. Our results indicate that SEResNet101 offers a promising enhancement over existing screening methods, integrating advanced deep learning algorithms to significantly improve the precision of cervical lesion classification. This study advocates for the inclusion of SEResNet101 in clinical workflows to enhance cervical cancer screening protocols, thereby improving patient outcomes. Future work should focus on multicentric trials to validate these findings and facilitate widespread clinical adoption.
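
For readers unfamiliar with the "SE" prefix, the sketch below shows a generic squeeze-and-excitation block attached to a torchvision ResNet101 feature extractor for binary lesion classification; where and how the authors insert their SE blocks is not described in the abstract, so the placement here is an assumption:

```python
# Illustrative squeeze-and-excitation (SE) block of the kind that gives
# SEResNet101 / SE-VGG19 their names, attached to a ResNet101 feature
# extractor with a two-class (LSIL vs. HSIL) head. Not the authors' code.
import torch
import torch.nn as nn
from torchvision.models import resnet101

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)                 # global spatial average
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * w

class SEResNetClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        backbone = resnet101(weights=None)
        self.stem = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc
        self.se = SEBlock(2048)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(2048, num_classes)

    def forward(self, x):
        x = self.se(self.stem(x))
        return self.fc(self.pool(x).flatten(1))

if __name__ == "__main__":
    model = SEResNetClassifier()
    print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 2])
```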

Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI.

Wang Z, Xiao M, Zhou Y, Wang C, Wu N, Li Y, Gong Y, Chang S, Chen Y, Zhu L, Zhou J, Cai C, Wang H, Jiang X, Guo D, Yang G, Qu X

PubMed · May 28, 2025
Dynamic magnetic resonance imaging (MRI) plays an indispensable role in cardiac diagnosis. To enable fast imaging, the k-space data can be undersampled but the image reconstruction poses a great challenge of high-dimensional processing. This challenge necessitates extensive training data in deep learning reconstruction methods. In this work, we propose a novel and efficient approach, leveraging a dimension-reduced separable learning scheme that can perform exceptionally well even with highly limited training data. We design this new approach by incorporating spatiotemporal priors into the development of a Deep Separable Spatiotemporal Learning network (DeepSSL), which unrolls an iteration process of a 2D spatiotemporal reconstruction model with both temporal low-rankness and spatial sparsity. Intermediate outputs can also be visualized to provide insights into the network behavior and enhance interpretability. Extensive results on cardiac cine datasets demonstrate that the proposed DeepSSL surpasses state-of-the-art methods both visually and quantitatively, while reducing the demand for training cases by up to 75%. Additionally, its preliminary adaptability to unseen cardiac patients has been verified through a blind reader study conducted by experienced radiologists and cardiologists. Furthermore, DeepSSL enhances the accuracy of the downstream task of cardiac segmentation and exhibits robustness in prospectively undersampled real-time cardiac MRI. DeepSSL is efficient under highly limited training data and adaptive to patients and prospective undersampling. This approach holds promise in addressing the escalating demand for high-dimensional data reconstruction in MRI applications.
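
DeepSSL unrolls a reconstruction model with temporal low-rankness and spatial sparsity; the NumPy sketch below shows one hand-tuned iteration of that classical model (singular-value thresholding, soft-thresholding, and k-space data consistency), which the learned unrolled network effectively replaces. Thresholds and the sampling mask are illustrative assumptions:

```python
# Conceptual sketch of the low-rank + sparse model that an unrolled network
# like DeepSSL is built around; not the authors' implementation.
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(matrix, tau):
    """Singular-value thresholding: promotes temporal low-rankness."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    return u @ np.diag(np.maximum(s - tau, 0.0)) @ vh

def one_iteration(x, y, mask, lam_lr=0.05, lam_sp=0.01):
    """x: image series (frames, H, W); y: sampled k-space; mask: sampling pattern."""
    t, h, w = x.shape
    x = svt(x.reshape(t, h * w), lam_lr).reshape(t, h, w)   # temporal low-rank prior
    x = soft_threshold(x, lam_sp)                            # spatial sparsity prior
    k = np.fft.fft2(x, norm="ortho")
    k = np.where(mask, y, k)                                 # enforce measured k-space
    return np.fft.ifft2(k, norm="ortho")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.standard_normal((8, 32, 32))
    mask = rng.random((8, 32, 32)) < 0.3                     # 30% sampling
    y = np.where(mask, np.fft.fft2(truth, norm="ortho"), 0)
    x = np.fft.ifft2(y, norm="ortho").real
    for _ in range(10):
        x = one_iteration(x, y, mask).real
    print("reconstruction shape:", x.shape)
```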

Large Scale MRI Collection and Segmentation of Cirrhotic Liver.

Jha D, Susladkar OK, Gorade V, Keles E, Antalek M, Seyithanoglu D, Cebeci T, Aktas HE, Kartal GD, Kaymakoglu S, Erturk SM, Velichko Y, Ladner DP, Borhani AA, Medetalibeyoglu A, Durak G, Bagci U

PubMed · May 28, 2025
Liver cirrhosis represents the end stage of chronic liver disease, characterized by extensive fibrosis and nodular regeneration that significantly increases mortality risk. While magnetic resonance imaging (MRI) offers a non-invasive assessment, accurately segmenting cirrhotic livers presents substantial challenges due to morphological alterations and heterogeneous signal characteristics. Deep learning approaches show promise for automating these tasks, but progress has been limited by the absence of large-scale, annotated datasets. Here, we present CirrMRI600+, the first comprehensive dataset comprising 628 high-resolution abdominal MRI scans (310 T1-weighted and 318 T2-weighted sequences, totaling nearly 40,000 annotated slices) with expert-validated segmentation labels for cirrhotic livers. The dataset includes demographic information, clinical parameters, and histopathological validation where available. Additionally, we provide benchmark results from 11 state-of-the-art deep learning experiments to establish performance standards. CirrMRI600+ enables the development and validation of advanced computational methods for cirrhotic liver analysis, potentially accelerating progress toward automated cirrhosis visual staging and personalized treatment planning.
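
A loader for a dataset of this kind might look like the sketch below; the directory layout, file naming, and normalization are assumptions for illustration, not CirrMRI600+'s documented structure:

```python
# Hypothetical loader for a CirrMRI600+-style dataset of MRI volumes with
# liver segmentation masks (NIfTI files assumed under "images/" and "masks/").
from pathlib import Path
import nibabel as nib
import numpy as np
from torch.utils.data import Dataset

class CirrhoticLiverDataset(Dataset):
    def __init__(self, root: str, sequence: str = "T1"):
        self.images = sorted(Path(root, sequence, "images").glob("*.nii.gz"))
        self.masks = sorted(Path(root, sequence, "masks").glob("*.nii.gz"))
        assert len(self.images) == len(self.masks), "every volume needs a mask"

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        volume = nib.load(str(self.images[idx])).get_fdata().astype(np.float32)
        mask = nib.load(str(self.masks[idx])).get_fdata().astype(np.uint8)
        # per-volume intensity normalization, a common choice for MRI segmentation
        volume = (volume - volume.mean()) / (volume.std() + 1e-8)
        return volume, mask
```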

Automatic identification of Parkinsonism using clinical multi-contrast brain MRI: a large self-supervised vision foundation model strategy.

Suo X, Chen M, Chen L, Luo C, Kemp GJ, Lui S, Sun H

PubMed · May 27, 2025
Valid non-invasive biomarkers for Parkinson's disease (PD) and Parkinson-plus syndrome (PPS) are urgently needed. Based on our recent self-supervised vision foundation model, the Shift Window UNET TRansformer (Swin UNETR), which uses clinical multi-contrast whole brain MRI, we aimed to develop an efficient and practical model ('SwinClassifier') for the discrimination of PD vs PPS using routine clinical MRI scans. We used 75,861 clinical head MRI scans including T1-weighted, T2-weighted and fluid attenuated inversion recovery imaging as a pre-training dataset to develop a foundation model, using self-supervised learning with a cross-contrast context recovery task. Clinical head MRI scans from n = 1992 participants with PD and n = 1989 participants with PPS were then used as a downstream PD vs PPS classification dataset. We compared SwinClassifier's performance, assessed with confusion matrices, to that of a self-supervised vanilla Vision Transformer (ViT) autoencoder ('ViTClassifier') and to two convolutional neural networks (DenseNet121 and ResNet50) trained from scratch. SwinClassifier showed very good performance (F1 score 0.83, 95% confidence interval [CI] [0.79-0.87], AUC 0.89) in PD vs PPS discrimination in independent test datasets (n = 173 participants with PD and n = 165 participants with PPS). This self-supervised classifier with pretrained weights outperformed the ViTClassifier and convolutional classifiers trained from scratch (F1 score 0.77-0.82, AUC 0.83-0.85). Occlusion sensitivity mapping in the correctly classified cases (n = 160 PD and n = 114 PPS) highlighted the brain regions guiding discrimination, mainly sensorimotor and midline structures including the cerebellum, brain stem, ventricles and basal ganglia. Our self-supervised digital model based on routine clinical head MRI discriminated PD vs PPS with good accuracy and sensitivity. With incremental improvements the approach may be diagnostically useful in early disease. Funding: National Key Research and Development Program of China.
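
The downstream classification step can be pictured as pooled features from the pretrained encoder feeding a linear head; the sketch below abstracts the Swin UNETR encoder as an arbitrary 3D feature extractor and is not the authors' code:

```python
# Schematic sketch of a "SwinClassifier"-style downstream head: features from a
# pretrained 3D encoder (any module mapping a multi-contrast volume to a feature
# map) are globally pooled and passed to a linear layer for the PD-vs-PPS
# decision. The real pretrained Swin UNETR encoder is not reproduced here.
import torch
import torch.nn as nn

class SwinClassifierHead(nn.Module):
    def __init__(self, encoder: nn.Module, feature_dim: int, num_classes: int = 2):
        super().__init__()
        self.encoder = encoder                      # pretrained, possibly frozen
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(feature_dim, num_classes)

    def forward(self, x):                           # x: (batch, contrasts, D, H, W)
        feats = self.encoder(x)                     # (batch, feature_dim, d, h, w)
        return self.fc(self.pool(feats).flatten(1))

if __name__ == "__main__":
    # stand-in encoder: 3 input contrasts (T1, T2, FLAIR) -> 96-channel feature map
    toy_encoder = nn.Sequential(nn.Conv3d(3, 96, kernel_size=3, stride=4), nn.ReLU())
    model = SwinClassifierHead(toy_encoder, feature_dim=96)
    print(model(torch.randn(1, 3, 64, 64, 64)).shape)  # torch.Size([1, 2])
```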

Improving Breast Cancer Diagnosis in Ultrasound Images Using Deep Learning with Feature Fusion and Attention Mechanism.

Asif S, Yan Y, Feng B, Wang M, Zheng Y, Jiang T, Fu R, Yao J, Lv L, Song M, Sui L, Yin Z, Wang VY, Xu D

PubMed · May 27, 2025
Early detection of malignant lesions in ultrasound images is crucial for effective cancer diagnosis and treatment. While traditional methods rely on radiologists, deep learning models can improve accuracy, reduce errors, and enhance efficiency. This study explores the application of a deep learning model for classifying benign and malignant lesions, focusing on its performance and interpretability. In this study, we proposed a feature fusion-based deep learning model for classifying benign and malignant lesions in ultrasound images. The model leverages advanced architectures such as MobileNetV2 and DenseNet121, enhanced with feature fusion and attention mechanisms to boost classification accuracy. The clinical dataset comprises 2171 images collected from 1758 patients between December 2020 and May 2024. Additionally, we utilized the publicly available BUSI dataset, consisting of 780 images from female patients aged 25 to 75, collected in 2018. To enhance interpretability, we applied Grad-CAM, Saliency Maps, and Shapley additive explanations (SHAP) techniques to explain the model's decision-making. A comparative analysis with radiologists of varying expertise levels was also conducted. The proposed model exhibited the highest performance, achieving an area under the curve (AUC) of 0.9320 on our private dataset and an AUC of 0.9834 on the public dataset, significantly outperforming traditional deep convolutional neural network models. It also exceeded the diagnostic performance of radiologists, showcasing its potential as a reliable tool for medical image classification. The model's success can be attributed to its incorporation of advanced architectures, feature fusion, and attention mechanisms. The model's decision-making process was further clarified using interpretability techniques like Grad-CAM, Saliency Maps, and SHAP, offering insights into its ability to focus on relevant image features for accurate classification. The proposed deep learning model offers superior accuracy in classifying benign and malignant lesions in ultrasound images, outperforming traditional models and radiologists. Its strong performance, coupled with interpretability techniques, demonstrates its potential as a reliable and efficient tool for medical diagnostics. The datasets generated and analyzed during the current study are not publicly available due to the nature of this research and its participants, but may be available from the corresponding author on reasonable request.
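
The abstract's dual-backbone feature fusion with attention could be sketched as follows; the concatenation-plus-gating design and layer sizes are assumptions rather than the authors' architecture:

```python
# Minimal sketch of fusing MobileNetV2 and DenseNet121 features with a simple
# attention gate for a binary benign/malignant ultrasound classifier. Not the
# authors' design; only the general idea described in the abstract.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2, densenet121

class FusionClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.mobile = mobilenet_v2(weights=None).features    # 1280-channel output
        self.dense = densenet121(weights=None).features      # 1024-channel output
        self.pool = nn.AdaptiveAvgPool2d(1)
        fused_dim = 1280 + 1024
        self.attention = nn.Sequential(                       # per-feature gating
            nn.Linear(fused_dim, fused_dim), nn.Sigmoid()
        )
        self.head = nn.Linear(fused_dim, 1)                   # benign vs. malignant logit

    def forward(self, x):
        a = self.pool(self.mobile(x)).flatten(1)
        b = self.pool(self.dense(x)).flatten(1)
        fused = torch.cat([a, b], dim=1)
        return self.head(fused * self.attention(fused))

if __name__ == "__main__":
    model = FusionClassifier()
    print(model(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 1])
```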