Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning

Negin Baghbanzadeh, Sajad Ashkezari, Elham Dolatabadi, Arash Afkanpour

arXiv preprint · Jun 3, 2025
Compound figures, multi-panel composites containing diverse subfigures, are ubiquitous in the biomedical literature, yet large-scale subfigure extraction remains largely unaddressed. Prior work on subfigure extraction has been limited in both dataset size and generalizability, leaving a critical open question: how does high-fidelity image-text alignment via large-scale subfigure extraction affect representation learning in vision-language models? We address this gap by introducing a scalable subfigure extraction pipeline based on transformer-based object detection, trained on a synthetic corpus of 500,000 compound figures, that achieves state-of-the-art performance on both ImageCLEF 2016 and synthetic benchmarks. Using this pipeline, we release OPEN-PMC-18M, a large-scale, high-quality biomedical vision-language dataset comprising 18 million clinically relevant subfigure-caption pairs spanning radiology, microscopy, and visible-light photography. We train and evaluate vision-language models on our curated datasets and show improved performance across retrieval, zero-shot classification, and robustness benchmarks, outperforming existing baselines. We release our dataset, models, and code to support reproducible benchmarks and further study of biomedical vision-language modeling and representation learning.
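
The pipeline described above rests on transformer-based object detection over compound figures. As a rough illustration of that idea (not the authors' released model), the following sketch runs an off-the-shelf DETR checkpoint on a compound figure and crops the detected boxes into subfigures; the checkpoint name, confidence threshold, and file path are assumptions, and the paper trains its own detector on 500,000 synthetic compound figures rather than using COCO weights.

```python
# Sketch: transformer-based subfigure detection with an off-the-shelf DETR.
# The authors train their own detector on synthetic compound figures; the
# COCO-pretrained checkpoint here only stands in for the general recipe.
import torch
from PIL import Image
from transformers import DetrForObjectDetection, DetrImageProcessor

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

image = Image.open("compound_figure.png").convert("RGB")  # hypothetical file
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Keep detections above a confidence threshold and crop them as subfigures.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
detections = processor.post_process_object_detection(
    outputs, target_sizes=target_sizes, threshold=0.7
)[0]
subfigures = [
    image.crop(tuple(round(v) for v in box.tolist()))
    for box in detections["boxes"]  # boxes are (x_min, y_min, x_max, y_max)
]
```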

Lymph node ultrasound in lymphoproliferative disorders: clinical characteristics and applications.

Tavarozzi R, Lombardi A, Scarano F, Staiano L, Trattelli G, Farro M, Castellino A, Coppola C

PubMed paper · Jun 3, 2025
Superficial lymph node (LN) enlargement is a common ultrasonographic finding and can be associated with a broad spectrum of conditions, from benign reactive hyperplasia to malignant lymphoproliferative disorders (LPDs). LPDs, which include various hematologic malignancies affecting lymphoid tissue, present with diverse immune-morphological and clinical features, making differentiation from other malignant causes of lymphadenopathy challenging. Radiologic assessment is crucial in characterizing lymphadenopathy, with ultrasonography serving as a noninvasive and widely available imaging modality. High-resolution ultrasound allows the evaluation of key features such as LN size, shape, border definition, echogenicity, and the presence of abnormal cortical thickening, loss of the fatty hilum, or altered vascular patterns, which aid in distinguishing benign from malignant processes. This review aims to describe the ultrasonographic characteristics of lymphadenopathy, offering essential diagnostic insights to differentiate malignant disorders, particularly LPDs. We will discuss standard ultrasound techniques, including grayscale imaging and Doppler ultrasound, and explore more advanced methods such as contrast-enhanced ultrasound (CEUS), elastography, and artificial intelligence-assisted imaging, which are gaining prominence in LN evaluation. By highlighting these imaging modalities, we aim to enhance the diagnostic accuracy of ultrasonography in lymphadenopathy assessment and improve early detection of LPDs and other malignant conditions.

Machine learning model for preoperative classification of stromal subtypes in salivary gland pleomorphic adenoma based on ultrasound histogram analysis.

Su HZ, Yang DH, Hong LC, Wu YH, Yu K, Zhang ZB, Zhang XD

PubMed paper · Jun 3, 2025
Accurate preoperative discrimination of salivary gland pleomorphic adenoma (SPA) stromal subtypes is essential for therapeutic planning. We aimed to establish and test machine learning (ML) models for classifying stromal subtypes in SPA based on ultrasound histogram analysis. A total of 256 SPA patients were enrolled in the study and categorized into two groups: stroma-low and stroma-high. The dataset was split into a training cohort of 177 patients and a validation cohort of 79 patients. Least absolute shrinkage and selection operator (LASSO) regression identified the optimal features, which were then used to build predictive models with logistic regression (LR) and eight ML algorithms. The models were evaluated using a range of performance metrics, with a particular focus on the area under the receiver operating characteristic curve (AUC). After LASSO regression, six key features (lesion size, shape, cystic areas, vascularity, mean, and skewness) were selected to develop the predictive models. The AUCs of the nine models ranged from 0.575 to 0.827. The support vector machine (SVM) algorithm achieved the highest performance, with an AUC of 0.827, an accuracy of 0.798, a precision of 0.792, a recall of 0.862, and an F1 score of 0.826. The LR algorithm also performed robustly, achieving an AUC of 0.818, slightly behind the SVM. Decision curve analysis indicated that the SVM-based model provided superior clinical utility compared with the other models. The ML model based on ultrasound histogram analysis offers a precise and non-invasive approach for preoperative categorization of stromal subtypes in SPA.
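
The two-stage recipe this abstract describes (LASSO feature selection feeding a classifier, scored by AUC) maps naturally onto scikit-learn. The sketch below is a minimal illustration under stated assumptions: synthetic features stand in for the ultrasound histogram and B-mode measurements, an L1-penalized logistic regression serves as the LASSO-style selector for the binary target, and the regularization strength is arbitrary.

```python
# Minimal sketch of LASSO-style feature selection + SVM classification,
# mirroring the paper's pipeline on synthetic stand-in data.
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20))  # stand-in for histogram + B-mode features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=256) > 0).astype(int)
# 0 = stroma-low, 1 = stroma-high; 177/79 split as in the paper
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=79, random_state=0)

model = make_pipeline(
    StandardScaler(),
    # L1-penalized logistic regression stands in for LASSO on a binary target.
    SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
    SVC(kernel="rbf", probability=True),
)
model.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```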

Deep learning model for differentiating thyroid eye disease and orbital myositis on computed tomography (CT) imaging.

Ha SK, Lin LY, Shi M, Wang M, Han JY, Lee NG

PubMed paper · Jun 3, 2025
To develop a deep learning model using orbital computed tomography (CT) imaging to accurately distinguish thyroid eye disease (TED) and orbital myositis, two conditions with overlapping clinical presentations. This retrospective, single-center cohort study spanned 12 years and included normal controls and patients with TED or orbital myositis who had orbital imaging and an examination by an oculoplastic surgeon. A deep learning model employing a Visual Geometry Group-16 (VGG-16) network was trained on various binary combinations of TED, orbital myositis, and controls using single slices of coronal orbital CT images. A total of 1628 images from 192 patients (110 TED, 51 orbital myositis, 31 controls) were included. The primary model comparing orbital myositis and TED had an accuracy of 98.4% and an area under the receiver operating characteristic curve (AUC) of 0.999. In detecting orbital myositis, it had a sensitivity, specificity, and F1 score of 0.964, 0.994, and 0.984, respectively. Deep learning models can differentiate TED and orbital myositis from a single coronal orbital CT image with high accuracy. Their ability to distinguish these conditions based not only on extraocular muscle enlargement but also on other salient features suggests potential applications in diagnostics and treatment beyond these conditions.
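
Since the abstract names the architecture (VGG-16 on single coronal CT slices), a minimal sketch of such a binary classifier is easy to pin down; everything beyond the backbone and the two-class head (pretrained weights, input size, channel replication) is an assumption here, and training code is omitted.

```python
# Sketch: VGG-16 with a two-logit head for TED vs. orbital myositis on
# single coronal CT slices. Weights, input size, and channel handling
# are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 2)  # replace the 1000-way ImageNet head

slice_batch = torch.randn(4, 3, 224, 224)  # CT slices replicated to 3 channels
logits = model(slice_batch)
probs = torch.softmax(logits, dim=1)       # per-slice class probabilities
```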

Co-Evidential Fusion with Information Volume for Medical Image Segmentation

Yuanpeng He, Lijian Li, Tianxiang Zhan, Chi-Man Pun, Wenpin Jiao, Zhi Jin

arXiv preprint · Jun 3, 2025
Although existing semi-supervised image segmentation methods have achieved good performance, they cannot effectively utilize multiple sources of voxel-level uncertainty for targeted learning. We therefore propose two main improvements. First, we introduce a novel pignistic co-evidential fusion strategy using generalized evidential deep learning, which extends traditional Dempster-Shafer (D-S) evidence theory, to obtain a more precise uncertainty measure for each voxel in medical samples. This helps the model learn from mixed labeled information and establish semantic associations between labeled and unlabeled data. Second, we introduce the concept of the information volume of a mass function (IVUM) to evaluate the constructed evidence, implementing two evidential learning schemes. One optimizes evidential deep learning by combining the information volume of the mass function with the original uncertainty measures. The other integrates the learning pattern based on the co-evidential fusion strategy, using IVUM to design a new optimization objective. Experiments on four datasets demonstrate the competitive performance of our method.
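
As a toy illustration of the machinery this abstract builds on, the sketch below combines two per-voxel subjective-logic opinions (class beliefs plus an uncertainty mass) with the reduced Dempster's rule and then applies a pignistic transform. It shows only the classical D-S core; the paper's generalized pignistic co-evidential fusion and IVUM objective are more elaborate than this.

```python
# Toy sketch: Dempster's rule + pignistic transform for per-voxel opinions.
import torch

def dempster_combine(b1, u1, b2, u2):
    """Reduced Dempster's rule for two opinions over K singleton classes.

    b1, b2: (N, K) per-voxel class beliefs; u1, u2: (N,) uncertainty masses,
    with b.sum(-1) + u == 1 for each source.
    """
    # Conflict: belief the two sources assign to different singleton classes.
    conflict = b1.sum(-1) * b2.sum(-1) - (b1 * b2).sum(-1)          # (N,)
    scale = 1.0 / (1.0 - conflict)
    b = scale[:, None] * (b1 * b2 + b1 * u2[:, None] + b2 * u1[:, None])
    u = scale * u1 * u2
    return b, u

def pignistic(b, u):
    """Pignistic probability: spread the uncertainty mass evenly over classes."""
    return b + u[:, None] / b.shape[-1]

# Two evidence sources for 3 voxels over K = 2 classes.
b1 = torch.tensor([[0.7, 0.1], [0.2, 0.6], [0.4, 0.4]]); u1 = 1 - b1.sum(-1)
b2 = torch.tensor([[0.6, 0.2], [0.1, 0.7], [0.3, 0.3]]); u2 = 1 - b2.sum(-1)
b, u = dempster_combine(b1, u1, b2, u2)
print(pignistic(b, u))  # fused per-voxel class probabilities
```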

PARADIM: A Platform to Support Research at the Interface of Data Science and Medical Imaging.

Lemaréchal Y, Couture G, Pelletier F, Lefol R, Asselin PL, Ouellet S, Bernard J, Ebrahimpour L, Manem VSK, Topalis J, Schachtner B, Jodogne S, Joubert P, Jeblick K, Ingrisch M, Després P

PubMed paper · Jun 3, 2025
This paper describes PARADIM, a digital infrastructure designed to support research at the interface of data science and medical imaging, with a focus on Research Data Management best practices. The platform is built from open-source components and rooted in the FAIR principles through strict compliance with the DICOM standard. It addresses key needs in data curation, governance, privacy, and scalable resource management. Supporting every stage of the data science discovery cycle, the platform offers robust functionalities for user identity and access management, data de-identification, storage, and annotation, as well as model training and evaluation. Rich metadata are generated throughout the research lifecycle to ensure the traceability and reproducibility of results. PARADIM hosts several medical image collections and allows the automation of large-scale, computationally intensive pipelines (e.g., automatic segmentation, dose calculations, AI model evaluation). The platform fills a gap at the interface of data science and medical imaging, where digital infrastructures are key to the development, evaluation, and deployment of innovative real-world solutions.
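
One concrete step a platform like this automates is DICOM de-identification. The snippet below is a minimal pydicom sketch of that idea, not PARADIM's actual implementation: the tag list is a small illustrative subset rather than a full DICOM PS3.15 confidentiality profile, and the file paths are hypothetical.

```python
# Sketch: blanking a few direct identifiers in a DICOM file with pydicom.
# A real pipeline would apply a full PS3.15-style de-identification profile.
from pydicom import dcmread

ds = dcmread("study/slice_001.dcm")  # hypothetical path
for keyword in ("PatientName", "PatientBirthDate", "PatientID",
                "ReferringPhysicianName", "InstitutionName"):
    if keyword in ds:
        ds.data_element(keyword).value = ""
ds.remove_private_tags()             # drop vendor-specific private elements
ds.save_as("deid/slice_001.dcm")
```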

Robust multi-coil MRI reconstruction via self-supervised denoising.

Aali A, Arvinte M, Kumar S, Arefeen YI, Tamir JI

PubMed paper · Jun 2, 2025
To examine the effect of incorporating self-supervised denoising as a pre-processing step for training deep learning (DL) based reconstruction methods on data corrupted by Gaussian noise. K-space data employed for training are typically multi-coil and inherently noisy. Although DL-based reconstruction methods trained on fully sampled data can achieve high reconstruction quality, obtaining large, noise-free datasets is impractical. We leverage Generalized Stein's Unbiased Risk Estimate (GSURE) for denoising. We evaluate two DL-based reconstruction methods, Diffusion Probabilistic Models (DPMs) and Model-Based Deep Learning (MoDL), and assess the impact of denoising on their performance in solving accelerated multi-coil magnetic resonance imaging (MRI) reconstruction. The experiments were carried out on T2-weighted brain and fat-suppressed proton-density knee scans. We observed that self-supervised denoising enhances the quality and efficiency of MRI reconstructions across various scenarios. Specifically, training DL networks on denoised rather than noisy images results in lower normalized root mean squared error (NRMSE) and higher structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR) across SNR levels of 32, 22, and 12 dB for T2-weighted brain data and 24, 14, and 4 dB for fat-suppressed knee data. We showed that denoising is an essential pre-processing technique capable of improving the efficacy of DL-based MRI reconstruction methods under diverse conditions. By refining the quality of input data, denoising enables the training of more effective DL networks, potentially bypassing the need for noise-free reference MRI scans.
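
The core trick behind SURE-style self-supervised denoising is that, for Gaussian noise of known variance, the mean-squared error to the unseen clean image can be estimated from the noisy image alone. The sketch below implements a Monte Carlo SURE loss for the plain Gaussian-noise image case; it is a simplified stand-in for the paper's GSURE formulation, which handles multi-coil k-space.

```python
# Sketch: Monte Carlo SURE loss for training a denoiser on noisy data only.
# E||f(y) - x||^2 = E||f(y) - y||^2 - n*sigma^2 + 2*sigma^2 * E[div f(y)],
# with the divergence estimated by a single random probe.
import torch

def mc_sure_loss(denoiser, y, sigma, eps=1e-3):
    """y: (B, C, H, W) noisy input; sigma: known Gaussian noise std."""
    fy = denoiser(y)
    probe = torch.randn_like(y)
    # Directional-derivative trick: div f(y) ~ probe . (f(y+eps*probe)-f(y))/eps
    div = (probe * (denoiser(y + eps * probe) - fy)).sum() / eps
    return ((fy - y) ** 2).sum() - y.numel() * sigma**2 + 2 * sigma**2 * div
```

Minimizing this loss over the denoiser's parameters is, in expectation, equivalent to minimizing the supervised MSE against the clean image, which is why no noise-free references are needed.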

Disease-Grading Networks with Asymmetric Gaussian Distribution for Medical Imaging.

Tang W, Yang Z

PubMed paper · Jun 2, 2025
Deep learning-based disease grading technologies facilitate timely medical intervention due to their high efficiency and accuracy. Recent advancements have enhanced grading performance by incorporating the ordinal relationships of disease labels. However, existing methods often assume the same probability distribution of disease labels for all instances within a category, overlooking variations in label distributions. Additionally, the hyperparameters of these distributions are typically determined empirically, which may not accurately reflect the true distribution. To address these limitations, we propose a disease grading network utilizing a sample-aware asymmetric Gaussian label distribution, termed DGN-AGLD. This approach includes a variance predictor designed to learn and predict parameters that control the asymmetry of the Gaussian distribution, enabling distinct label distributions within the same category. This module can be seamlessly integrated into standard deep learning networks. Experimental results on four disease datasets validate the effectiveness and superiority of the proposed method, particularly on the IDRiD dataset, where it achieves a diabetic retinopathy grading accuracy of 77.67%. Furthermore, our method extends to joint disease grading tasks, yielding superior results and demonstrating significant generalization capabilities. Visual analysis indicates that our method more accurately captures the trend of disease progression by leveraging the asymmetry in label distribution. Our code is publicly available at https://github.com/ahtwq/AGNet.
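
The key object in this abstract is the soft label itself: a Gaussian over the ordinal grades whose left and right spreads differ. A minimal sketch of constructing such a target follows, with the caveat that in the paper the two spreads come from a learned, sample-aware variance predictor, whereas here they are fixed illustrative values.

```python
# Sketch: an asymmetric Gaussian soft-label distribution over ordinal grades.
import torch

def asymmetric_gaussian_labels(true_grade, num_grades, sigma_left, sigma_right):
    grades = torch.arange(num_grades, dtype=torch.float32)
    # Different spread below vs. above the true grade.
    sigma = torch.where(grades < true_grade,
                        torch.tensor(sigma_left), torch.tensor(sigma_right))
    dist = torch.exp(-0.5 * ((grades - true_grade) / sigma) ** 2)
    return dist / dist.sum()  # normalized soft label over the K grades

# Diabetic-retinopathy-style example: 5 grades, true grade 2, with milder
# grades treated as closer to the truth than more severe ones.
target = asymmetric_gaussian_labels(2, 5, sigma_left=1.2, sigma_right=0.6)
print(target)  # e.g., train with KL(target || softmax(logits))
```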

Robust Uncertainty-Informed Glaucoma Classification Under Data Shift.

Rashidisabet H, Chan RVP, Leiderman YI, Vajaranant TS, Yi D

PubMed paper · Jun 2, 2025
Standard deep learning (DL) models often suffer significant performance degradation on out-of-distribution (OOD) data, where test data differ from training data, a common challenge in medical imaging due to real-world variations. We propose a unified self-censorship framework as an alternative to standard DL models for glaucoma classification, using deep evidential uncertainty quantification. Our approach detects OOD samples at both the dataset and image levels. Dataset-level self-censorship enables users to accept or reject predictions for an entire new dataset based on model uncertainty, whereas image-level self-censorship refrains from making predictions on individual OOD images rather than risking incorrect classifications. We validated our approach across diverse datasets. Our dataset-level self-censorship method outperforms the standard DL model in OOD detection, achieving an average 11.93% higher area under the curve (AUC) across 14 OOD datasets. Similarly, our image-level self-censorship model improves glaucoma classification accuracy by an average of 17.22% across 4 external glaucoma datasets against baselines while censoring 28.25% more data. Our approach addresses the challenge of generalization in standard DL models for glaucoma classification across diverse datasets by selectively withholding predictions when the model is uncertain. This method reduces misclassification errors compared with state-of-the-art baselines, particularly for OOD cases. This study introduces a tunable framework that explores the trade-off between prediction accuracy and data retention in glaucoma prediction. By managing uncertainty in model outputs, the approach lays a foundation for future decision-support tools aimed at improving the reliability of automated glaucoma diagnosis.
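
The image-level self-censorship idea reduces to a simple rule once the network produces evidential (Dirichlet) outputs: predict only when the per-image uncertainty mass falls below a threshold. The sketch below uses the standard evidential deep learning parameterization; the ReLU evidence head, the threshold value, and the -1 sentinel are assumptions rather than the paper's exact choices.

```python
# Sketch: image-level self-censorship with Dirichlet evidential outputs.
import torch

def censored_predict(logits, threshold=0.3):
    """logits: (B, K) raw outputs interpreted as class evidence."""
    evidence = torch.relu(logits)              # non-negative evidence per class
    alpha = evidence + 1.0                     # Dirichlet concentration
    strength = alpha.sum(-1)                   # total evidence + K
    uncertainty = logits.shape[-1] / strength  # u = K / S, in (0, 1]
    probs = alpha / strength[:, None]          # expected class probabilities
    preds = probs.argmax(-1)
    preds[uncertainty > threshold] = -1        # -1 marks a censored image
    return preds, uncertainty

preds, u = censored_predict(torch.randn(8, 2))  # K = 2: glaucoma vs. normal
```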