
Generating Synthetic MR Spectroscopic Imaging Data with Generative Adversarial Networks to Train Machine Learning Models.

Maruyama S, Takeshima H

PubMed · Sep 26 2025
To develop a new method for generating synthetic MR spectroscopic imaging (MRSI) data for training machine learning models. This study targeted routine MRI examination protocols with single voxel spectroscopy (SVS). A novel model derived from the pix2pix generative adversarial network was proposed to generate synthetic MRSI data using MRI and SVS data as inputs. T1- and T2-weighted, SVS, and reference MRSI data were acquired from healthy brains with clinically available sequences, and the proposed model was trained to generate synthetic MRSI data. Quantitative evaluation comprised the mean squared error (MSE) against the reference data and the metabolite ratio values. The effect of the location and number of SVS acquisitions on the quality of the synthetic MRSI data was investigated using the MSE. The synthetic MRSI data generated by the proposed model were visually close to the reference. The 95% confidence interval (CI) of the metabolite ratio values of the synthetic MRSI data overlapped with that of the reference for seven of eight metabolite ratios. MSEs tended to be lower when the SVS data came from the same location as the synthesized region than from different locations, and did not differ significantly among groups with different numbers of SVS acquisitions. A new method was developed to generate MRSI data by integrating MRI and SVS data. Our method can potentially increase the volume of MRSI training data for other machine learning models by adding SVS acquisition to routine MRI examinations.
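
As a rough illustration of the pix2pix-style setup described above, the sketch below outlines a conditional generator that takes stacked MRI channels plus an SVS-derived conditioning channel. The channel counts, depth, and L1-weighted objective follow the generic pix2pix recipe and are assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a pix2pix-style conditional GAN for MRSI synthesis.
# Channel counts, SVS encoding, and network depth are illustrative assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps stacked T1w/T2w slices plus an SVS conditioning channel to MRSI maps."""
    def __init__(self, in_ch=3, out_ch=1, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(base, base * 2, 4, 2, 1), nn.BatchNorm2d(base * 2), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1), nn.BatchNorm2d(base), nn.ReLU(),
            nn.ConvTranspose2d(base, out_ch, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

# pix2pix trains with an adversarial term plus an L1 term against the reference:
# loss_G = BCE(D(fake), 1) + lambda_l1 * L1(fake, reference), lambda_l1 ~ 100.
```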

Hemorica: A Comprehensive CT Scan Dataset for Automated Brain Hemorrhage Classification, Segmentation, and Detection

Kasra Davoodi, Mohammad Hoseyni, Javad Khoramdel, Reza Barati, Reihaneh Mortazavi, Amirhossein Nikoofard, Mahdi Aliyari-Shoorehdeli, Jaber Hatam Parikhan

arXiv preprint · Sep 26 2025
Timely diagnosis of intracranial hemorrhage (ICH) on computed tomography (CT) scans remains a clinical priority, yet the development of robust artificial intelligence (AI) solutions is still hindered by fragmented public data. To close this gap, we introduce Hemorica, a publicly available collection of 372 head CT examinations acquired between 2012 and 2024. Each scan has been exhaustively annotated for five ICH subtypes: epidural (EPH), subdural (SDH), subarachnoid (SAH), intraparenchymal (IPH), and intraventricular (IVH), yielding patient-wise and slice-wise classification labels, subtype-specific bounding boxes, two-dimensional pixel masks, and three-dimensional voxel masks. A double-reading workflow, preceded by a pilot consensus phase and supported by neurosurgeon adjudication, kept inter-rater variability low. Comprehensive statistical analysis confirms the clinical realism of the dataset. To establish reference baselines, standard convolutional and transformer architectures were fine-tuned for binary slice classification and hemorrhage segmentation. With only minimal fine-tuning, lightweight models such as MobileViT-XS achieved an F1 score of 87.8% in binary classification, whereas a U-Net with a DenseNet161 encoder reached a Dice score of 85.5% for binary lesion segmentation, results that validate both the quality of the annotations and the sufficiency of the sample size. Hemorica therefore offers a unified, fine-grained benchmark that supports multi-task and curriculum learning, facilitates transfer to larger but weakly labelled cohorts, and streamlines the design of AI-based assistants for ICH detection and quantification.
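
The reported baselines map naturally onto common open-source packages; the sketch below shows how they could be instantiated with timm and segmentation_models_pytorch, though whether the authors used these exact libraries and settings is an assumption.

```python
# Sketch of the reported Hemorica baselines using common open-source libraries;
# the authors' exact packages and hyperparameters are assumptions here.
import timm
import segmentation_models_pytorch as smp

# Binary slice classifier: lightweight MobileViT-XS (reported F1 = 87.8%).
clf = timm.create_model("mobilevit_xs", pretrained=True, in_chans=1, num_classes=2)

# Binary lesion segmentation: U-Net with a DenseNet161 encoder (reported Dice = 85.5%).
seg = smp.Unet(
    encoder_name="densenet161",
    encoder_weights="imagenet",
    in_channels=1,   # single-channel CT slices (assumption)
    classes=1,       # binary hemorrhage mask
)
```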

Brain Tumor Classification from MRI Scans via Transfer Learning and Enhanced Feature Representation

Ahta-Shamul Hoque Emran, Hafija Akter, Abdullah Al Shiam, Abu Saleh Musa Miah, Anichur Rahman, Fahmid Al Farid, Hezerul Abdul Karim

arXiv preprint · Sep 26 2025
Brain tumors are abnormal cell growths in the central nervous system (CNS), and their timely detection is critical for improving patient outcomes. This paper proposes an automatic and efficient deep-learning framework for brain tumor detection from magnetic resonance imaging (MRI) scans. The framework employs a pre-trained ResNet50 model for feature extraction, followed by Global Average Pooling (GAP) and linear projection to obtain compact, high-level image representations. These features are then processed by a novel Dense-Dropout sequence, a core contribution of this work, which enhances non-linear feature learning, reduces overfitting, and improves robustness through diverse feature transformations. Another major contribution is the creation of the Mymensingh Medical College Brain Tumor (MMCBT) dataset, designed to address the lack of reliable brain tumor MRI resources. The dataset comprises MRI scans from 209 subjects (ages 9 to 65), including 3,671 tumor and 13,273 non-tumor images, all clinically verified under expert supervision. To overcome class imbalance, the tumor class was augmented, resulting in a balanced dataset well suited for deep learning research.
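
A minimal sketch of the described pipeline (pretrained ResNet50 → GAP → linear projection → stacked Dense-Dropout blocks) is given below; the hidden sizes and dropout rates are assumptions, since the abstract does not specify them.

```python
# Sketch of the described head: pretrained ResNet50 features -> GAP ->
# linear projection -> Dense-Dropout blocks. Layer widths and dropout
# rates are illustrative assumptions.
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = nn.Identity()  # torchvision's ResNet50 already ends in GAP; keep 2048-d features

head = nn.Sequential(
    nn.Linear(2048, 512),                             # linear projection to a compact representation
    nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(), nn.Dropout(0.3),  # Dense-Dropout block 1
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.3),  # Dense-Dropout block 2
    nn.Linear(128, 2),                                # tumor vs. non-tumor
)
model = nn.Sequential(backbone, head)
```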

An open deep learning-based framework and model for tooth instance segmentation in dental CBCT.

Zhou Y, Xu Y, Khalil B, Nalley A, Tarce M

PubMed · Sep 25 2025
Current dental CBCT segmentation tools often lack accuracy, accessibility, or comprehensive anatomical coverage. To address this, we constructed a densely annotated dental CBCT dataset and developed a deep learning model, OralSeg, for tooth-level instance segmentation, which we deployed as a one-click tool that is freely accessible for non-commercial use. We established a standardized annotated dataset covering 35 key oral anatomical structures and employed UNETR as the backbone network, combining a Swin Transformer with a spatial Mamba module for multi-scale residual feature fusion. The OralSeg model was designed and optimized for precise instance segmentation of dental CBCT images and integrated into the 3D Slicer platform, providing a graphical user interface for one-click segmentation. OralSeg achieved a Dice similarity coefficient of 0.8316 ± 0.0305 on CBCT instance segmentation, outperforming SwinUNETR and 3D U-Net. The model significantly improves segmentation performance, especially on complex oral anatomical structures such as apical areas, alveolar bone margins, and mandibular nerve canals. The OralSeg model presented in this study provides an effective solution for instance segmentation of dental CBCT images. The tool allows clinical dentists and researchers with no AI background to perform one-click segmentation and may be applicable in various clinical and research contexts. OralSeg offers researchers and clinicians a user-friendly tool for tooth-level instance segmentation, which may assist in clinical diagnosis, educational training, and research, and contribute to the broader adoption of digital dentistry in precision medicine.
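
For context, the comparison baselines are available in MONAI; the sketch below instantiates them, while OralSeg's own Swin/Mamba fusion has no off-the-shelf equivalent. The patch size and class count are assumptions.

```python
# Sketch of the comparison baselines using MONAI; OralSeg itself (a UNETR
# backbone with Swin Transformer and spatial Mamba fusion) is not reproduced.
from monai.networks.nets import UNETR, SwinUNETR, UNet

NUM_CLASSES = 36  # 35 annotated oral structures + background (assumption)

unetr = UNETR(in_channels=1, out_channels=NUM_CLASSES, img_size=(96, 96, 96))
swin = SwinUNETR(img_size=(96, 96, 96), in_channels=1,
                 out_channels=NUM_CLASSES, feature_size=48)  # img_size is deprecated in recent MONAI
unet3d = UNet(spatial_dims=3, in_channels=1, out_channels=NUM_CLASSES,
              channels=(16, 32, 64, 128, 256), strides=(2, 2, 2, 2))
```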

Automated and Interpretable Survival Analysis from Multimodal Data

Mafalda Malafaia, Peter A. N. Bosman, Coen Rasch, Tanja Alderliesten

arXiv preprint · Sep 25 2025
Accurate and interpretable survival analysis remains a core challenge in oncology. With growing multimodal data and the clinical need for transparent models that support validation and trust, this challenge is increasing in complexity. We propose an interpretable multimodal AI framework that automates survival analysis by integrating clinical variables and computed tomography imaging. Our MultiFIX-based framework uses deep learning to infer survival-relevant features that are then explained: imaging features are interpreted via Grad-CAM, while clinical variables are modeled as symbolic expressions through genetic programming. Risk estimation employs a transparent Cox regression, enabling stratification into groups with distinct survival outcomes. On the open-source RADCURE dataset for head and neck cancer, MultiFIX achieves a C-index of 0.838 (prediction) and 0.826 (stratification), outperforming clinical and academic baseline approaches and aligning with known prognostic markers. These results highlight the promise of MultiFIX and of interpretable multimodal AI for precision oncology.
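
The transparent risk-estimation stage can be illustrated with a standard Cox fit; the sketch below uses the lifelines package on a hypothetical fused-feature table (file and column names are placeholders), not the authors' actual pipeline.

```python
# Sketch of a transparent Cox risk model evaluated by concordance index.
# The CSV path and column names are hypothetical placeholders.
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.utils import concordance_index

df = pd.read_csv("fused_features.csv")  # imaging + clinical features per patient (hypothetical)
cph = CoxPHFitter()
cph.fit(df, duration_col="survival_months", event_col="event")

# Higher score must mean longer survival, so negate the hazard.
c_index = concordance_index(df["survival_months"],
                            -cph.predict_partial_hazard(df),
                            df["event"])
print(f"C-index: {c_index:.3f}")  # the paper reports 0.838 for prediction
```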

Knowledge distillation and teacher-student learning in medical imaging: Comprehensive overview, pivotal role, and future directions.

Li X, Li L, Li M, Yan P, Feng T, Luo H, Zhao Y, Yin S

PubMed · Sep 25 2025
Knowledge distillation (KD) is a technique for transferring knowledge from a complex model to a simpler one. It has been widely used in natural language processing and computer vision, where it has achieved advanced results. Recently, research on KD in medical image analysis has grown rapidly. In the medical domain, the definition of knowledge has been further expanded, and KD's role is no longer limited to simplifying models. This paper comprehensively reviews the development and application of KD in the medical imaging field. Specifically, we first introduce the basic principles, explaining the definition of knowledge and the classical teacher-student network framework. We then present research progress in medical image classification, segmentation, detection, reconstruction, registration, radiology report generation, privacy protection, and other application scenarios, organized according to the role KD plays in each. We summarize eight main roles of KD techniques in medical image analysis, including model compression, semi-supervised learning, weakly supervised learning, and class balancing, and analyze how these roles perform across the application scenarios. Finally, we discuss the challenges in this field and propose potential solutions. KD is still developing rapidly in the medical imaging field; we identify five potential development directions and research hotspots. A comprehensive literature list for this survey is available at https://github.com/XiangQA-Q/KD-in-MIA.
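
For readers new to the technique, the classical logit-distillation objective (a temperature-softened KL term plus the supervised cross-entropy) fits in a few lines; the sketch below is the textbook formulation, with the temperature and weighting as illustrative choices.

```python
# Classical logit distillation (Hinton et al.): soften both distributions with
# temperature T, match them with KL divergence, and mix with the hard-label loss.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """alpha weights the distillation term; T softens both distributions."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # standard T^2 correction so gradients match the hard loss scale
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```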

A Versatile Foundation Model for AI-enabled Mammogram Interpretation

Fuxiang Huang, Jiayi Zhu, Yunfang Yu, Yu Xie, Yuan Guo, Qingcong Kong, Mingxiang Wu, Xinrui Jiang, Shu Yang, Jiabo Ma, Ziyi Liu, Zhe Xu, Zhixuan Chen, Yujie Tan, Zifan He, Luhui Mao, Xi Wang, Junlin Hou, Lei Zhang, Qiong Luo, Zhenhui Li, Herui Yao, Hao Chen

arXiv preprint · Sep 24 2025
Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related mortality in women globally. Mammography is essential for the early detection and diagnosis of breast lesions. Despite recent progress in foundation models (FMs) for mammogram analysis, their clinical translation remains constrained by several fundamental limitations, including insufficient diversity in training data, limited model generalizability, and a lack of comprehensive evaluation across clinically relevant tasks. Here, we introduce VersaMammo, a versatile foundation model for mammograms designed to overcome these limitations. We curated the largest multi-institutional mammogram dataset to date, comprising 706,239 images from 21 sources. To improve generalization, VersaMammo is developed with a two-stage pre-training strategy. First, a teacher model is trained via self-supervised learning to extract transferable features from unlabeled mammograms. Then, supervised learning combined with knowledge distillation transfers both features and clinical knowledge into VersaMammo. To ensure a comprehensive evaluation, we established a benchmark comprising 92 specific tasks, including 68 internal tasks and 24 external validation tasks, spanning 5 major clinical task categories: lesion detection, segmentation, classification, image retrieval, and visual question answering. VersaMammo achieves state-of-the-art performance, ranking first in 50 out of 68 internal tasks and 20 out of 24 external validation tasks, with average ranks of 1.5 and 1.2, respectively. These results demonstrate its superior generalization and clinical utility, offering a substantial advance toward reliable and scalable breast cancer screening and diagnosis.
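
The second pre-training stage, supervised learning combined with distillation from the self-supervised teacher, might look like the sketch below; the MSE feature-matching term and loss weight are assumptions, as the abstract does not detail the distillation objective.

```python
# Sketch of a stage-2 objective: supervised cross-entropy plus feature
# distillation from a frozen self-supervised teacher. The MSE matching term
# and weight beta are assumptions, not the authors' stated loss.
import torch
import torch.nn.functional as F

def stage2_loss(student_logits, student_feats, teacher_feats, labels, beta=1.0):
    supervised = F.cross_entropy(student_logits, labels)
    distill = F.mse_loss(student_feats, teacher_feats.detach())  # teacher stays frozen
    return supervised + beta * distill
```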

Region-of-Interest Augmentation for Mammography Classification under Patient-Level Cross-Validation

Farbod Bigdeli, Mohsen Mohammadagha, Ali Bigdeli

arXiv preprint · Sep 24 2025
Breast cancer screening with mammography remains central to early detection and mortality reduction. Deep learning has shown strong potential for automating mammogram interpretation, yet limited-resolution datasets and small sample sizes continue to restrict performance. We revisit the Mini-DDSM dataset (9,684 images; 2,414 patients) and introduce a lightweight region-of-interest (ROI) augmentation strategy. During training, full images are probabilistically replaced with random ROI crops sampled from a precomputed, label-free bounding-box bank, with optional jitter to increase variability. We evaluate under strict patient-level cross-validation and report ROC-AUC, PR-AUC, and training-time efficiency metrics (throughput and GPU memory). Because ROI augmentation is training-only, inference-time cost remains unchanged. On Mini-DDSM, ROI augmentation (best: p_roi = 0.10, alpha = 0.10) yields modest average ROC-AUC gains, with performance varying across folds; PR-AUC is flat to slightly lower. These results demonstrate that simple, data-centric ROI strategies can enhance mammography classification in constrained settings without requiring additional labels or architectural modifications.
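
The ROI strategy is simple enough to sketch directly: with probability p_roi, the full image is swapped for a jittered crop from the precomputed box bank. The bank format and array conventions below are assumptions.

```python
# Sketch of the training-only ROI augmentation: with probability p_roi, replace
# the full mammogram with a random crop from a precomputed, label-free
# bounding-box bank, jittered by up to `alpha` of the box size.
import random

def roi_augment(image, boxes, p_roi=0.10, alpha=0.10):
    """image: HxW array; boxes: list of (x0, y0, x1, y1) precomputed ROIs (assumed format)."""
    if not boxes or random.random() >= p_roi:
        return image  # keep the full image
    x0, y0, x1, y1 = random.choice(boxes)
    w, h = x1 - x0, y1 - y0
    dx = int(alpha * w * random.uniform(-1, 1))  # optional jitter for variability
    dy = int(alpha * h * random.uniform(-1, 1))
    H, W = image.shape[:2]
    x0 = min(max(x0 + dx, 0), W - 1); x1 = min(max(x1 + dx, x0 + 1), W)
    y0 = min(max(y0 + dy, 0), H - 1); y1 = min(max(y1 + dy, y0 + 1), H)
    return image[y0:y1, x0:x1]
```

Because the crop only happens inside the training loader, inference-time cost is unchanged, which matches the paper's framing of the method as data-centric rather than architectural.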

Generating Brain MRI with StyleGAN2-ADA: The Effect of the Training Set Size on the Quality of Synthetic Images.

Lai M, Mascalchi M, Tessa C, Diciotti S

PubMed · Sep 23 2025
The potential of deep learning for medical imaging is often constrained by limited data availability. Generative models can unlock this potential by generating synthetic data that reproduces the statistical properties of real data while being more accessible for sharing. In this study, we investigated the influence of training set size on the performance of a state-of-the-art generative adversarial network, the StyleGAN2-ADA, trained on a cohort of 3,227 subjects from the OpenBHB dataset to generate 2D slices of brain MR images from healthy subjects. The quality of the synthetic images was assessed through qualitative evaluations and state-of-the-art quantitative metrics, which are provided in a publicly accessible repository. Our results demonstrate that StyleGAN2-ADA generates realistic and high-quality images, deceiving even expert radiologists while preserving privacy, as it did not memorize training images. Notably, increasing the training set size led to slight improvements in fidelity metrics. However, training set size had no noticeable impact on diversity metrics, highlighting the persistent limitation of mode collapse. Furthermore, we observed that diversity metrics, such as coverage and β-recall, are highly sensitive to the number of synthetic images used in their computation, leading to inflated values when synthetic data significantly outnumber real ones. These findings underscore the need to carefully interpret diversity metrics and the importance of employing complementary evaluation strategies for robust assessment. Overall, while StyleGAN2-ADA shows promise as a tool for generating privacy-preserving synthetic medical images, overcoming diversity limitations will require exploring alternative generative architectures or incorporating additional regularization techniques.
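
The sensitivity of coverage and recall to the synthetic sample count can be probed with the open-source prdc package, as in the sketch below on placeholder features; whether the study used this implementation is an assumption, and beta-recall is not included in prdc.

```python
# Sketch probing how coverage/recall shift as the synthetic set grows, using
# the open-source `prdc` package on random placeholder features. In practice
# the features would come from a pretrained embedder, and the study's own
# metric implementation may differ (assumption).
import numpy as np
from prdc import compute_prdc

rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 64))       # placeholder real-image features

for n_fake in (1000, 5000, 20000):       # synthetic sets increasingly outnumbering real data
    fake = rng.normal(size=(n_fake, 64))
    m = compute_prdc(real_features=real, fake_features=fake, nearest_k=5)
    print(n_fake, f"coverage={m['coverage']:.3f}, recall={m['recall']:.3f}")
```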

Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning

Guoxin Wang, Jun Zhao, Xinyi Liu, Yanbo Liu, Xuyang Cao, Chao Li, Zhuoyun Liu, Qintian Sun, Fangru Zhou, Haoqiang Xing, Zhenhong Yang

arXiv preprint · Sep 23 2025
Medical imaging provides critical evidence for clinical diagnosis, treatment planning, and surgical decisions, yet most existing imaging models are narrowly focused and require multiple specialized networks, limiting their generalization. Although large-scale language and multimodal models exhibit strong reasoning and multi-task capabilities, real-world clinical applications demand precise visual grounding, multimodal integration, and chain-of-thought reasoning. We introduce Citrus-V, a multimodal medical foundation model that combines image analysis with textual reasoning. The model integrates detection, segmentation, and multimodal chain-of-thought reasoning, enabling pixel-level lesion localization, structured report generation, and physician-like diagnostic inference in a single framework. We propose a novel multimodal training approach and release a curated open-source data suite covering reasoning, detection, segmentation, and document understanding tasks. Evaluations demonstrate that Citrus-V outperforms existing open-source medical models and expert-level imaging systems across multiple benchmarks, delivering a unified pipeline from visual grounding to clinical reasoning and supporting precise lesion quantification, automated reporting, and reliable second opinions.