Jing Hao, Yuxuan Fan, Yanpeng Sun, Kaixin Guo, Lizhuo Lin, Jinrong Yang, Qi Yong H. Ai, Lun M. Wong, Hao Tang, Kuo Feng Hung

arXiv preprint · Sep 11, 2025
Recent advances in large vision-language models (LVLMs) have demonstrated strong performance on general-purpose medical tasks. However, their effectiveness in specialized domains such as dentistry remains underexplored. In particular, panoramic X-rays, a widely used imaging modality in oral radiology, pose interpretative challenges due to dense anatomical structures and subtle pathological cues, which are not captured by existing medical benchmarks or instruction datasets. To this end, we introduce MMOral, the first large-scale multimodal instruction dataset and benchmark tailored for panoramic X-ray interpretation. MMOral consists of 20,563 annotated images paired with 1.3 million instruction-following instances across diverse task types, including attribute extraction, report generation, visual question answering, and image-grounded dialogue. In addition, we present MMOral-Bench, a comprehensive evaluation suite covering five key diagnostic dimensions in dentistry. We evaluate 64 LVLMs on MMOral-Bench and find that even the best-performing model, i.e., GPT-4o, only achieves 41.45% accuracy, revealing significant limitations of current models in this domain. To promote progress in this specialized domain, we also propose OralGPT, which performs supervised fine-tuning (SFT) on Qwen2.5-VL-7B with our meticulously curated MMOral instruction dataset. Remarkably, a single epoch of SFT yields substantial performance enhancements for LVLMs, e.g., OralGPT demonstrates a 24.73% improvement. Both MMOral and OralGPT hold significant potential as a critical foundation for intelligent dentistry and enable more clinically impactful multimodal AI systems in the dental field. The dataset, model, benchmark, and evaluation suite are available at https://github.com/isbrycee/OralGPT.
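For readers who want to prototype the single-epoch SFT recipe in spirit, a heavily simplified sketch follows. The checkpoint id, the `mmoral_train` dataset object, and the chat-template field names are assumptions rather than the authors' released code, and in practice pad and prompt tokens would also be masked out of the loss.

```python
# Hypothetical sketch of one epoch of supervised fine-tuning on
# image + instruction -> response pairs, in the spirit of OralGPT.
import torch
from torch.utils.data import DataLoader
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.train()

def collate(batch):
    # Each item is assumed to hold a PIL image, an instruction, and a reference answer.
    texts = [processor.apply_chat_template(
        [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": ex["instruction"]}]},
         {"role": "assistant", "content": [{"type": "text", "text": ex["answer"]}]}],
        tokenize=False) for ex in batch]
    enc = processor(text=texts, images=[ex["image"] for ex in batch],
                    return_tensors="pt", padding=True)
    enc["labels"] = enc["input_ids"].clone()  # simple next-token supervision; pad tokens would be set to -100
    return enc

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loader = DataLoader(mmoral_train, batch_size=2, collate_fn=collate)  # mmoral_train: hypothetical dataset

for batch in loader:  # a single pass over the data = one epoch of SFT
    out = model(**batch)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```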

Jiesi Hu, Jianfeng Cao, Yanwu Yang, Chenfei Ye, Yixuan Zhang, Hanyang Peng, Ting Ma

arXiv preprint · Sep 11, 2025
In-context learning (ICL) offers a promising paradigm for universal medical image analysis, enabling models to perform diverse image processing tasks without retraining. However, current ICL models for medical imaging remain limited in two critical aspects: they cannot simultaneously achieve high-fidelity predictions and global anatomical understanding, and there is no unified model trained across diverse medical imaging tasks (e.g., segmentation and enhancement) and anatomical regions. As a result, the full potential of ICL in medical imaging remains underexplored. Thus, we present Medverse, a universal ICL model for 3D medical imaging, trained on 22 datasets covering diverse tasks in universal image segmentation, transformation, and enhancement across multiple organs, imaging modalities, and clinical centers. Medverse employs a next-scale autoregressive in-context learning framework that progressively refines predictions from coarse to fine, generating consistent, full-resolution volumetric outputs and enabling multi-scale anatomical awareness. We further propose a blockwise cross-attention module that facilitates long-range interactions between context and target inputs while preserving computational efficiency through spatial sparsity. Medverse is extensively evaluated on a broad collection of held-out datasets covering previously unseen clinical centers, organs, species, and imaging modalities. Results demonstrate that Medverse substantially outperforms existing ICL baselines and establishes a novel paradigm for in-context learning. Code and model weights are publicly available at https://github.com/jiesihu/Medverse.
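To make the "blockwise cross-attention with spatial sparsity" idea concrete, here is a minimal PyTorch sketch under our own assumptions (block size, head count, and layout are illustrative; this is not the authors' implementation): target-volume tokens attend only to context tokens inside the same spatial block, so the attention cost grows with the number of blocks rather than with the full volume.

```python
import torch
import torch.nn as nn

class BlockwiseCrossAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 8, block: int = 4):
        super().__init__()
        self.block = block
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def _to_blocks(self, x):
        # x: (B, C, D, H, W) -> (B * n_blocks, block^3, C)
        B, C, D, H, W = x.shape
        b = self.block
        x = x.view(B, C, D // b, b, H // b, b, W // b, b)
        return x.permute(0, 2, 4, 6, 3, 5, 7, 1).reshape(-1, b ** 3, C)

    def forward(self, target, context):
        B, C, D, H, W = target.shape
        q = self._to_blocks(target)    # queries from the target volume
        kv = self._to_blocks(context)  # keys/values from the in-context example
        out, _ = self.attn(q, kv, kv)  # sparse: attention stays inside each block
        b = self.block
        out = out.reshape(B, D // b, H // b, W // b, b, b, b, C)
        return out.permute(0, 7, 1, 4, 2, 5, 3, 6).reshape(B, C, D, H, W)

# e.g. x = torch.randn(1, 64, 16, 16, 16); BlockwiseCrossAttention(64)(x, x).shape == (1, 64, 16, 16, 16)
```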

Qiuhui Chen, Xuancheng Yao, Huping Ye, Yi Hong

arXiv preprint · Sep 11, 2025
Understanding 3D medical image volumes is critical in the medical field, yet existing 3D medical convolution and transformer-based self-supervised learning (SSL) methods often lack deep semantic comprehension. Recent advancements in multimodal large language models (MLLMs) provide a promising approach to enhance image understanding through text descriptions. To leverage these 2D MLLMs for improved 3D medical image understanding, we propose Med3DInsight, a novel pretraining framework that integrates 3D image encoders with 2D MLLMs via a specially designed plane-slice-aware transformer module. Additionally, our model employs a partial optimal transport based alignment, demonstrating greater tolerance to the noise potentially present in LLM-generated content. Med3DInsight introduces a new paradigm for scalable multimodal 3D medical representation learning without requiring human annotations. Extensive experiments demonstrate our state-of-the-art performance on two downstream tasks, i.e., segmentation and classification, across various public datasets with CT and MRI modalities, outperforming current SSL methods. Med3DInsight can be seamlessly integrated into existing 3D medical image understanding networks, potentially enhancing their performance. Our source code, generated datasets, and pre-trained models will be available at https://github.com/Qybc/Med3DInsight.
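The "partial optimal transport based alignment" can be sketched with a common construction that is not necessarily the paper's exact formulation: a Sinkhorn-style transport plan between 3D-encoder patch features and MLLM-derived features, with an extra "dustbin" column that absorbs mass from noisy tokens so that not every feature is forced to match. All hyperparameters below are illustrative.

```python
import torch
import torch.nn.functional as F

def partial_ot_alignment(x3d, x2d, eps: float = 0.1, iters: int = 50, keep: float = 0.8):
    # x3d: (N, d) 3D patch features; x2d: (M, d) MLLM-derived features.
    cost = 1.0 - F.normalize(x3d, dim=-1) @ F.normalize(x2d, dim=-1).T  # cosine cost, (N, M)
    dust = torch.full((cost.shape[0], 1), cost.mean().item(), device=cost.device)
    cost = torch.cat([cost, dust], dim=1)  # (N, M+1) with a dustbin column

    a = torch.full((cost.shape[0],), 1.0 / cost.shape[0], device=cost.device)
    b = torch.full((cost.shape[1],), keep / (cost.shape[1] - 1), device=cost.device)
    b[-1] = 1.0 - keep  # mass allowed to stay unmatched

    K = torch.exp(-cost / eps)
    u = torch.ones_like(a)
    for _ in range(iters):  # Sinkhorn scaling iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]
    return (plan[:, :-1] * cost[:, :-1]).sum()  # transport cost, dustbin excluded
```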

Yushen Xu, Xiaosong Li, Yuchun Wang, Xiaoqi Cheng, Huafeng Li, Haishu Tan

arXiv preprint · Sep 11, 2025
Different modalities of medical images provide unique physiological and anatomical information for diseases. Multi-modal medical image fusion integrates useful information from different complementary medical images with different modalities, producing a fused image that comprehensively and objectively reflects lesion characteristics to assist doctors in clinical diagnosis. However, existing fusion methods can only handle a fixed number of modality inputs, such as accepting only two-modal or tri-modal inputs, and cannot directly process varying input quantities, which hinders their application in clinical settings. To tackle this issue, we introduce FlexiD-Fuse, a diffusion-based image fusion network designed to accommodate flexible quantities of input modalities. It can process two-modal and tri-modal medical image fusion end-to-end under the same weights. FlexiD-Fuse transforms the diffusion fusion problem, which supports only fixed-condition inputs, into a maximum likelihood estimation problem based on the diffusion process and hierarchical Bayesian modeling. By incorporating the Expectation-Maximization algorithm into the diffusion sampling iteration process, FlexiD-Fuse can generate high-quality fused images with cross-modal information from source images, independently of the number of input images. We compared the latest two-modal and tri-modal medical image fusion methods, tested them on Harvard datasets, and evaluated them using nine popular metrics. The experimental results show that our method achieves the best performance in medical image fusion with varying inputs. Meanwhile, we conducted extensive extension experiments on infrared-visible, multi-exposure, and multi-focus image fusion tasks with arbitrary numbers of inputs, and compared them with the respective SOTA methods. The results of the extension experiments consistently demonstrate the effectiveness and superiority of our method.
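A toy sketch of the underlying EM-style estimation, under our own assumptions rather than the paper's exact derivation: treat each source modality as a noisy observation of the latent fused image with an unknown per-modality variance, then alternate between re-estimating the variances and updating the fused image as a precision-weighted average. This works for any number of inputs; in FlexiD-Fuse such an update would be interleaved with the diffusion sampling loop so the estimate also stays on the learned image manifold.

```python
import torch

def em_fuse(sources, iters: int = 10, eps: float = 1e-6):
    # sources: (N, H, W) stacked images, N arbitrary.
    x = sources.mean(dim=0)                                  # initial fused estimate
    var = torch.ones(sources.shape[0], device=sources.device)
    for _ in range(iters):
        resid = sources - x                                  # (N, H, W)
        var = resid.pow(2).mean(dim=(1, 2)) + eps            # update per-modality noise variance
        w = (1.0 / var) / (1.0 / var).sum()                  # precision weights, sum to 1
        x = (w[:, None, None] * sources).sum(dim=0)          # maximum-likelihood update of the fused image
    return x
```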

Mohammed Tiouti, Mohamed Bal-Ghaoui

arXiv preprint · Sep 11, 2025
Effective model and hyperparameter selection remains a major challenge in deep learning, often requiring extensive expertise and computation. While AutoML and large language models (LLMs) promise automation, current LLM-based approaches rely on trial and error and expensive APIs, which provide limited interpretability and generalizability. We propose MetaLLMiX, a zero-shot hyperparameter optimization framework combining meta-learning, explainable AI, and efficient LLM reasoning. By leveraging historical experiment outcomes with SHAP explanations, MetaLLMiX recommends optimal hyperparameters and pretrained models without additional trials. We further employ an LLM-as-judge evaluation to control output format, accuracy, and completeness. Experiments on eight medical imaging datasets using nine open-source lightweight LLMs show that MetaLLMiX achieves competitive or superior performance to traditional HPO methods while drastically reducing computational cost. Our local deployment outperforms prior API-based approaches, achieving optimal results on 5 of 8 tasks, response time reductions of 99.6-99.9%, and the fastest training times on 6 datasets (2.4-15.7x faster), maintaining accuracy within 1-5% of best-performing baselines.
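Read literally, the zero-shot recommendation step amounts to packing past runs and their SHAP importances into a prompt and asking a local LLM for one configuration. The sketch below is our illustration of that idea; `query_llm` is a hypothetical stand-in for whatever local inference endpoint is used, and the JSON schema is assumed.

```python
import json

def build_prompt(history, shap_importance, task_description):
    lines = [f"Task: {task_description}", "Past experiments (config -> score):"]
    for run in history:
        lines.append(f"  {json.dumps(run['config'])} -> {run['score']:.3f}")
    lines.append("SHAP importance of each hyperparameter on the meta-dataset:")
    for name, value in sorted(shap_importance.items(), key=lambda kv: -kv[1]):
        lines.append(f"  {name}: {value:.3f}")
    lines.append('Reply with JSON only: {"model": ..., "learning_rate": ..., "batch_size": ...}')
    return "\n".join(lines)

def recommend(history, shap_importance, task_description):
    raw = query_llm(build_prompt(history, shap_importance, task_description))  # hypothetical call
    return json.loads(raw)  # an LLM-as-judge pass could additionally check format and completeness
```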

Sarah C. Irvine, Christian Lucas, Diana Krüger, Bianca Guedert, Julian Moosmann, Berit Zeller-Plumhoff

arXiv preprint · Sep 11, 2025
Three-dimensional X-ray histology techniques offer a non-invasive alternative to conventional 2D histology, enabling volumetric imaging of biological tissues without the need for physical sectioning or chemical staining. However, the inherent greyscale image contrast of X-ray tomography limits its biochemical specificity compared to traditional histological stains. Within digital pathology, deep learning-based virtual staining has demonstrated utility in simulating stained appearances from label-free optical images. In this study, we extend virtual staining to the X-ray domain by applying cross-modality image translation to generate artificially stained slices from synchrotron-radiation-based micro-CT scans. Using over 50 co-registered image pairs of micro-CT and toluidine blue-stained histology from bone-implant samples, we trained a modified CycleGAN network tailored for limited paired data. Whole slide histology images were downsampled to match the voxel size of the CT data, with on-the-fly data augmentation for patch-based training. The model incorporates pixelwise supervision and greyscale consistency terms, producing histologically realistic colour outputs while preserving high-resolution structural detail. Our method outperformed Pix2Pix and standard CycleGAN baselines across SSIM, PSNR, and LPIPS metrics. Once trained, the model can be applied to full CT volumes to generate virtually stained 3D datasets, enhancing interpretability without additional sample preparation. While features such as new bone formation could be reproduced, some variability in the depiction of implant degradation layers highlights the need for further training data and refinement. This work introduces virtual staining to 3D X-ray imaging and offers a scalable route for chemically informative, label-free tissue characterisation in biomedical research.
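A sketch of the two extra loss terms mentioned above, under our assumptions about how they are defined (the weights and the luminance formula are illustrative, not taken from the paper): a pixelwise L1 term on the co-registered CT/histology pairs, and a greyscale-consistency term tying the luminance of the generated "stained" image back to the input micro-CT slice. These would be added to the usual CycleGAN adversarial and cycle-consistency losses.

```python
import torch

def luminance(rgb):
    # rgb: (B, 3, H, W) in [0, 1]; standard Rec.601 luminance weights.
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    return 0.299 * r + 0.587 * g + 0.114 * b

def paired_losses(fake_stain, real_stain, ct_slice, w_pix=10.0, w_grey=5.0):
    # fake_stain: generator output for ct_slice; real_stain: registered histology patch; ct_slice: (B, 1, H, W).
    pixelwise = torch.nn.functional.l1_loss(fake_stain, real_stain)
    grey_consistency = torch.nn.functional.l1_loss(luminance(fake_stain), ct_slice)
    return w_pix * pixelwise + w_grey * grey_consistency
```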

Carere SG, Jewell J, Nasute Fauerbach PV, Emerson DB, Finelli A, Ghai S, Haider MA

PubMed paper · Sep 11, 2025
Domain shift has been shown to have a major detrimental effect on AI model performance; however, prior studies on domain shift for MRI prostate cancer segmentation have been limited to small or heterogeneous cohorts. Our objective was to assess whether prostate cancer segmentation models trained on local MRI data continue to outperform those trained on external data with cohorts exceeding 1000. We simulated a multi-institutional consortium using the public PICAI dataset (PICAI-TRAIN: 1241 exams, PICAI-TEST: 259) and a local dataset (LOCAL-TRAIN: 1400 exams, LOCAL-TEST: 308). IRB approval was obtained and consent waived. We compared nnUNet-v2 models trained on the combined data (CENTRAL-TRAIN) and separately on PICAI-TRAIN and LOCAL-TRAIN. Accuracy was evaluated using the open-source PICAI Score on LOCAL-TEST. Significance was tested using bootstrapping. Just 22% (309/1400) of LOCAL-TRAIN exams would be sufficient to match the performance of a model trained on PICAI-TRAIN. The CENTRAL-TRAIN performance was similar to LOCAL-TRAIN performance, with PICAI Scores [95% CI] of 65 [58-71] and 66 [60-72], respectively. Both of these models exceeded the model trained on PICAI-TRAIN alone, which had a score of 58 [51-64] (P < .002). Reducing training set size did not alter these relative trends. Domain shift limits MRI prostate cancer segmentation performance even when training with over 1000 exams from 3 external institutions. Use of local data is paramount at these scales.
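The bootstrap significance test can be sketched as follows; this is our reconstruction rather than the authors' script, and `picai_score(preds, labels)` is a stand-in for the open-source PICAI scorer. Test exams are resampled with replacement, the metric is recomputed for both models on each resample, and a one-sided p-value is estimated from how often the externally trained model matches or beats the locally trained one.

```python
import numpy as np

def bootstrap_compare(preds_local, preds_external, labels, n_boot=10000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(labels)
    wins = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                        # resample exams with replacement
        s_local = picai_score(preds_local[idx], labels[idx])     # hypothetical scorer
        s_external = picai_score(preds_external[idx], labels[idx])
        wins += s_external >= s_local
    return wins / n_boot                                         # one-sided bootstrap p-value
```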

Vahdani AM, Rahmani M, Pour-Rashidi A, Ahmadian A, Farnia P

PubMed paper · Sep 11, 2025
Intraoperative tumor imaging is critical to achieving maximal safe resection during neurosurgery, especially for low-grade glioma resection. Given the convenience of ultrasound as an intraoperative imaging modality, but also the limitations of the ultrasound modality and the time-consuming process of manual tumor segmentation, we propose a learning-based model for the accurate segmentation of low-grade gliomas in ultrasound images. We developed a novel U-net-based architecture adopting the block architecture of the ConvNext V2 model, titled U-ConvNext, which also incorporates various architectural improvements including global response normalization, fine-tuned kernel sizes, and inception layers. We also adopted the CutMix data augmentation technique for semantic segmentation, aiming for enhanced texture detection. Conformal segmentation, a novel approach to conformal prediction for binary semantic segmentation, was also developed for uncertainty quantification, providing calibrated measures of model uncertainty in a visual format. The proposed models were trained and evaluated on three subsets of images in the RESECT dataset and achieved hold-out test Dice scores of 84.63%, 74.52%, and 90.82% on the "before," "during," and "after" subsets, respectively, which indicates increases of ~13-31% compared to the state of the art. Furthermore, external evaluation on the ReMIND dataset indicated robust performance (Dice score of 79.17% [95% CI: 77.82-81.62]) and only a moderate decline of < 3% in expected calibration error. Our approach integrates various innovations in model design, model training, and uncertainty quantification, achieving improved results on the segmentation of low-grade glioma in ultrasound images during neurosurgery.
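For orientation, here is one standard split-conformal construction for binary segmentation; the paper's "conformal segmentation" may differ, so treat this as a hedged sketch. A probability threshold is calibrated on held-out cases so that, with probability at least 1 - alpha, the predicted region contains every true tumor pixel of a new image.

```python
import numpy as np

def calibrate_threshold(prob_maps, gt_masks, alpha=0.1):
    # prob_maps, gt_masks: lists of (H, W) arrays from a held-out calibration set.
    scores = []
    for p, m in zip(prob_maps, gt_masks):
        if m.any():
            scores.append(1.0 - p[m.astype(bool)].min())  # worst miss on true foreground
    scores = np.array(scores)
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample conformal quantile level
    q = np.quantile(scores, level, method="higher")
    return 1.0 - q                                         # keep pixels with probability >= 1 - q

def conformal_mask(prob_map, threshold):
    return prob_map >= threshold                           # calibrated, uncertainty-aware region
```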

Bottino A, Botrugno C, Casciaro E, Conversano F, Lay-Ekuakille A, Lombardi FA, Morello R, Pisani P, Vetrugno L, Casciaro S

PubMed paper · Sep 11, 2025
B-lines are among the key artifact signs observed in Lung Ultrasound (LUS), playing a critical role in differentiating pulmonary diseases and assessing overall lung condition. However, their accurate detection and quantification can be time-consuming and technically challenging, especially for less experienced operators. This study aims to evaluate the performance of a YOLO (You Only Look Once)-based algorithm for the automated detection of B-lines, offering a novel tool to support clinical decision-making. The proposed approach is designed to improve the efficiency and consistency of LUS interpretation, particularly for non-expert practitioners, and to enhance its utility in guiding respiratory management. In this observational agreement study, 644 images from an anonymized internal database and a clinical online database were evaluated. After a quality selection step, 386 images from 46 patients remained available for analysis. Ground truth was established by a blinded expert sonographer identifying B-lines within a rectangular Region Of Interest (ROI) on each frame. Algorithm performance was assessed through Precision, Recall, and F1 Score, whereas weighted kappa (kw) statistics were employed to quantify the agreement between the YOLO-based algorithm and the expert operator. The algorithm achieved a precision of 0.92 (95% CI 0.89-0.94), recall of 0.81 (95% CI 0.77-0.85), and F1-score of 0.86 (95% CI 0.83-0.88). The weighted kappa was 0.68 (95% CI 0.64-0.72), indicating substantial agreement between the algorithm and expert annotations. The proposed algorithm has demonstrated its potential to significantly enhance diagnostic support by accurately detecting B-lines in LUS images.
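A short sketch of the reported agreement analysis, under the assumption that the weighted kappa is computed on per-frame B-line counts binned into ordinal categories (the binning and the linear weighting are our assumptions; the detection metrics come from matched TP/FP/FN counts).

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def detection_metrics(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def agreement(counts_algorithm, counts_expert, max_bin=3):
    # Per-frame B-line counts, binned as 0, 1, 2, >=3 before computing weighted kappa.
    a = np.clip(counts_algorithm, 0, max_bin)
    e = np.clip(counts_expert, 0, max_bin)
    return cohen_kappa_score(a, e, weights="linear")
```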

Tendero R, Larroza A, Pérez-Benito FJ, Perez-Cortes JC, Román M, Llobet R

PubMed paper · Sep 11, 2025
This study evaluates whether integrating clinical data with mammographic features using artificial intelligence (AI) improves 2-year breast cancer risk prediction compared to using either data type alone. This retrospective nested case-control study included 2193 women (mean age, 59 ± 5 years) screened at Hospital del Mar, Spain (2013-2020), with 418 cases (mammograms taken 2 years before diagnosis) and 1775 controls (cancer-free for ≥ 2 years). Three models were evaluated: (1) ERTpd + im, based on Extremely Randomized Trees (ERT), split into sub-models for personal data (ERTpd) and image features (ERTim); (2) an image-only model (CNN); and (3) a hybrid model (ERTpd + im + CNN). Five-fold cross-validation, area under the receiver operating characteristic curve (AUC), bootstrapping for confidence intervals, and DeLong tests for paired data assessed performance. Robustness was evaluated across breast density quartiles and detection type (screen-detected vs. interval cancers). The hybrid model achieved an AUC of 0.75 (95% CI: 0.71-0.76), significantly outperforming the CNN model (AUC, 0.74; 95% CI: 0.70-0.75; p < 0.05) and slightly surpassing ERTpd + im (AUC, 0.74; 95% CI: 0.70-0.76). Sub-models ERTpd and ERTim had AUCs of 0.59 and 0.73, respectively. The hybrid model performed consistently across breast density quartiles (p > 0.05) and better for screen-detected (AUC, 0.79) than interval cancers (AUC, 0.59; p < 0.001). This study shows that integrating clinical and mammographic data with AI improves 2-year breast cancer risk prediction, outperforming single-source models. The hybrid model demonstrated higher accuracy and robustness across breast density quartiles, with better performance for screen-detected cancers. Question: Current breast cancer risk models have limitations in accuracy. Can integrating clinical and mammographic data using artificial intelligence (AI) improve short-term risk prediction? Findings: A hybrid model combining clinical and imaging data achieved the highest accuracy in predicting 2-year breast cancer risk, outperforming models using either data type alone. Clinical relevance: Integrating clinical and mammographic data with AI improves breast cancer risk prediction. This approach enables personalized screening strategies and supports early detection. It helps identify high-risk women and optimizes the use of additional assessments within screening programs.
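A simplified sketch of the hybrid idea follows; the exact fusion scheme is not specified in the abstract, so simple probability averaging is assumed, with Extremely Randomized Trees on personal data and on precomputed image features, a precomputed CNN risk score, and an AUC with a bootstrap confidence interval.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import roc_auc_score

def hybrid_risk(personal_data, image_features, cnn_scores, y, train_idx, test_idx):
    ert_pd = ExtraTreesClassifier(n_estimators=500, random_state=0).fit(personal_data[train_idx], y[train_idx])
    ert_im = ExtraTreesClassifier(n_estimators=500, random_state=0).fit(image_features[train_idx], y[train_idx])
    p = (ert_pd.predict_proba(personal_data[test_idx])[:, 1]
         + ert_im.predict_proba(image_features[test_idx])[:, 1]
         + cnn_scores[test_idx]) / 3.0                      # assumed late fusion of the three scores
    return p, roc_auc_score(y[test_idx], p)

def bootstrap_auc_ci(y_true, scores, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) == 2:                # need both classes in the resample
            aucs.append(roc_auc_score(y_true[idx], scores[idx]))
    return np.percentile(aucs, [2.5, 97.5])
```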