
Segmentation of temporomandibular joint structures on MRI images using neural networks for diagnosis of pathologies

Maksim I. Ivanov, Olga E. Mendybaeva, Yuri E. Karyakin, Igor N. Glukhikh, Aleksey V. Lebedev

arXiv preprint · May 19, 2025
This article explores the use of artificial intelligence for the diagnosis of pathologies of the temporomandibular joint (TMJ), in particular for the segmentation of the articular disc on MRI images. The work is motivated by the high prevalence of TMJ pathologies and the need to improve the accuracy and speed of diagnosis in medical institutions. Existing solutions (Diagnocat, MandSeg) were analyzed and found unsuitable for studying the articular disc because of their orientation towards bone structures. To solve this problem, an original dataset of 94 images was collected with the classes "temporomandibular joint" and "jaw", and augmentation methods were used to increase the amount of data. U-Net, YOLOv8n, YOLOv11n, and Roboflow models were then trained and compared, with evaluation based on the Dice Score, Precision, Sensitivity, Specificity, and Mean Average Precision metrics. The results confirm the potential of the Roboflow model for segmentation of the temporomandibular joint. Future work will develop an algorithm for measuring the distance between the jaws and determining the position of the articular disc, which should further improve the diagnosis of TMJ pathologies.
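The evaluation metrics named above are standard for binary segmentation masks. A minimal NumPy sketch of how they can be computed (the toy masks and shapes are illustrative, not the paper's data):

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """Compute Dice, precision, sensitivity, and specificity for boolean
    predicted and ground-truth masks of the same shape."""
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()

    dice = 2 * tp / (2 * tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    sensitivity = tp / (tp + fn + eps)   # recall
    specificity = tn / (tn + fp + eps)
    return dice, precision, sensitivity, specificity

# Toy example: two 4x4 masks that overlap on two pixels.
pred = np.zeros((4, 4), dtype=bool); pred[1:3, 1:3] = True
gt = np.zeros((4, 4), dtype=bool); gt[1:3, 0:2] = True
print(segmentation_metrics(pred, gt))  # Dice = 0.5 for this toy pair
```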

The Role of Machine Learning to Detect Occult Neck Lymph Node Metastases in Early-Stage (T1-T2/N0) Oral Cavity Carcinomas.

Troise S, Ugga L, Esposito M, Positano M, Elefante A, Capasso S, Cuocolo R, Merola R, Committeri U, Abbate V, Bonavolontà P, Nocini R, Dell'Aversana Orabona G

PubMed · May 19, 2025
Oral cavity carcinomas (OCCs) represent roughly 50% of all head and neck cancers. The risk of occult neck metastases for early-stage OCCs ranges from 15% to 35%, hence the need for tools that can support the detection of these metastases. Machine learning and radiomic features are emerging as effective tools in this field. The aim of this study is therefore to demonstrate the effectiveness of radiomic features in predicting the risk of occult neck metastases in early-stage (T1-T2/N0) OCCs. This was a retrospective, single-institution study (Maxillo-facial Surgery Unit, University of Naples Federico II) of 75 patients surgically treated for early-stage OCC. For all patients, TNM data, in particular pN status after histopathological examination, were obtained, and radiomic features were extracted from MRI. Fifty-six patients were confirmed pN0 after surgery, while 19 were pN+. A machine-learning algorithm using the radiomic features discriminated occult neck metastases preoperatively with a sensitivity of 78%, a specificity of 83%, an AUC of 86%, an accuracy of 80%, and a positive predictive value (PPV) of 63%. Our results seem to confirm that radiomic features, extracted by machine learning methods, are effective tools for detecting occult neck metastases in early-stage OCCs. The clinical relevance of this study is that radiomics could be used routinely as a preoperative tool to support diagnosis and to help surgeons in the surgical decision-making process, particularly regarding surgical indications for neck lymph node treatment.
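The paper's pipeline is not reproduced here, but the general workflow it describes, radiomic features from MRI feeding a machine-learning classifier evaluated out-of-sample, can be sketched with scikit-learn. The random feature matrix below is a placeholder that mirrors only the cohort sizes from the abstract; the model choice and decision threshold are assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Placeholder radiomic feature matrix: 75 patients x 100 features,
# with 19 pN+ cases, matching the cohort sizes reported in the abstract.
X = rng.normal(size=(75, 100))
y = np.array([1] * 19 + [0] * 56)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
probs = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")[:, 1]

preds = probs >= 0.5  # assumed operating point
tn, fp, fn, tp = confusion_matrix(y, preds).ravel()
print("AUC:", roc_auc_score(y, probs))
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
```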

Effectiveness of Artificial Intelligence in detecting sinonasal pathology using clinical imaging modalities: a systematic review.

Petsiou DP, Spinos D, Martinos A, Muzaffar J, Garas G, Georgalas C

PubMed · May 19, 2025
Sinonasal pathology can be complex and requires a systematic and meticulous approach. Artificial Intelligence (AI) has the potential to improve diagnostic accuracy and efficiency in sinonasal imaging, but its clinical applicability remains an area of ongoing research. This systematic review evaluates the methodologies and clinical relevance of AI in detecting sinonasal pathology through radiological imaging. Key search terms included "artificial intelligence," "deep learning," "machine learning," "neural network," and "paranasal sinuses." Abstract and full-text screening was conducted using predefined inclusion and exclusion criteria. Data were extracted on study design, AI architectures used (e.g., Convolutional Neural Networks (CNN), Machine Learning classifiers), and clinical characteristics, such as imaging modality (e.g., Computed Tomography (CT), Magnetic Resonance Imaging (MRI)). A total of 53 studies were analyzed, of which 85% were retrospective, 68% single-center, and 92.5% based on internal databases. CT was the most common imaging modality (60.4%), and chronic rhinosinusitis without nasal polyposis (CRSsNP) was the most studied condition (34.0%). Forty-one studies employed neural networks, with classification as the most frequent AI task (35.8%). Key performance metrics included Area Under the Curve (AUC), accuracy, sensitivity, specificity, precision, and F1-score. Quality assessment based on CONSORT-AI yielded a mean score of 16.0 ± 2. AI shows promise in improving sinonasal imaging interpretation. However, as existing research is predominantly retrospective and single-center, further studies are needed to evaluate AI's generalizability and applicability. More research is also required to explore AI's role in treatment planning and post-treatment prediction for clinical integration.

CTLformer: A Hybrid Denoising Model Combining Convolutional Layers and Self-Attention for Enhanced CT Image Reconstruction

Zhiting Zheng, Shuqi Wu, Wen Ding

arXiv preprint · May 18, 2025
Low-dose CT (LDCT) images are often accompanied by significant noise, which negatively impacts image quality and subsequent diagnostic accuracy. To address the challenges of multi-scale feature fusion and diverse noise distribution patterns in LDCT denoising, this paper introduces an innovative model, CTLformer, which combines convolutional structures with transformer architecture. Two key innovations are proposed: a multi-scale attention mechanism and a dynamic attention control mechanism. The multi-scale attention mechanism, implemented through the Token2Token mechanism and self-attention interaction modules, effectively captures both fine details and global structures at different scales, enhancing relevant features and suppressing noise. The dynamic attention control mechanism adapts the attention distribution based on the noise characteristics of the input image, focusing on high-noise regions while preserving details in low-noise areas, thereby enhancing robustness and improving denoising performance. Furthermore, CTLformer integrates convolutional layers for efficient feature extraction and uses overlapping inference to mitigate boundary artifacts, further strengthening its denoising capability. Experimental results on the 2016 National Institutes of Health AAPM Mayo Clinic LDCT Challenge dataset demonstrate that CTLformer significantly outperforms existing methods in both denoising performance and model efficiency, greatly improving the quality of LDCT images. The proposed CTLformer not only provides an efficient solution for LDCT denoising but also shows broad potential in medical image analysis, especially for clinical applications dealing with complex noise patterns.
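The "overlapping inference" mentioned above is a general sliding-window strategy: denoise overlapping patches and average them so seams at patch borders cancel out. A minimal NumPy sketch of the idea (patch size, stride, and the identity stand-in for the denoiser are assumptions, not the CTLformer implementation):

```python
import numpy as np

def overlapped_inference(image, denoise_fn, patch=64, stride=32):
    """Denoise overlapping patches and average them, weighting each pixel
    by how many patches cover it, to suppress boundary artifacts."""
    h, w = image.shape
    out = np.zeros_like(image, dtype=np.float64)
    weight = np.zeros_like(image, dtype=np.float64)
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            tile = image[y:y + patch, x:x + patch]
            out[y:y + patch, x:x + patch] += denoise_fn(tile)
            weight[y:y + patch, x:x + patch] += 1.0
    return out / np.maximum(weight, 1.0)

# Identity stand-in for a trained denoising model such as CTLformer.
demo = overlapped_inference(np.random.rand(128, 128), lambda t: t)
```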

SMFusion: Semantic-Preserving Fusion of Multimodal Medical Images for Enhanced Clinical Diagnosis

Haozhe Xiang, Han Zhang, Yu Cheng, Xiongwen Quan, Wanwan Huang

arXiv preprint · May 18, 2025
Multimodal medical image fusion plays a crucial role in medical diagnosis by integrating complementary information from different modalities to enhance image readability and clinical applicability. However, existing methods mainly follow computer vision standards for feature extraction and fusion strategy formulation, overlooking the rich semantic information inherent in medical images. To address this limitation, we propose a novel semantic-guided medical image fusion approach that, for the first time, incorporates medical prior knowledge into the fusion process. Specifically, we construct a publicly available multimodal medical image-text dataset, upon which text descriptions generated by BiomedGPT are encoded and semantically aligned with image features in a high-dimensional space via a semantic interaction alignment module. During this process, a cross-attention-based linear transformation automatically maps the relationship between textual and visual features to facilitate comprehensive learning. The aligned features are then embedded into a text-injection module for further feature-level fusion. Unlike traditional methods, we further generate diagnostic reports from the fused images to assess the preservation of medical information. Additionally, we design a medical semantic loss function to enhance the retention of textual cues from the source images. Experimental results on test datasets demonstrate that the proposed method achieves superior performance in both qualitative and quantitative evaluations while preserving more critical medical information.
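The semantic interaction alignment described above rests on cross-attention between textual and visual features. A minimal PyTorch sketch of such a module (the single-head design, dimensions, and residual connection are illustrative assumptions, not the SMFusion architecture):

```python
import torch
import torch.nn as nn

class CrossAttentionAlign(nn.Module):
    """Align image features to text features with single-head cross-attention:
    image tokens act as queries, text tokens as keys and values."""
    def __init__(self, img_dim=256, txt_dim=768, dim=256):
        super().__init__()
        self.q = nn.Linear(img_dim, dim)
        self.k = nn.Linear(txt_dim, dim)
        self.v = nn.Linear(txt_dim, dim)
        self.out = nn.Linear(dim, img_dim)

    def forward(self, img_tokens, txt_tokens):
        q, k, v = self.q(img_tokens), self.k(txt_tokens), self.v(txt_tokens)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return img_tokens + self.out(attn @ v)  # residual connection

align = CrossAttentionAlign()
fused = align(torch.randn(1, 196, 256), torch.randn(1, 32, 768))
```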

The effect of medical explanations from large language models on diagnostic decisions in radiology

Spitzer, P., Hendriks, D., Rudolph, J., Schläger, S., Ricke, J., Kühl, N., Hoppe, B., Feuerriegel, S.

medRxiv preprint · May 18, 2025
Large language models (LLMs) are increasingly used by physicians for diagnostic support. A key advantage of LLMs is the ability to generate explanations that can help physicians understand the reasoning behind a diagnosis. However, the best-suited format for LLM-generated explanations remains unclear. In this large-scale study, we examined the effect of different formats for LLM explanations on clinical decision-making. For this, we conducted a randomized experiment with radiologists reviewing patient cases with radiological images (N = 2020 assessments). Participants received either no LLM support (control group) or were supported by one of three LLM-generated explanations: (1) a standard output providing the diagnosis without explanation; (2) a differential diagnosis comparing multiple possible diagnoses; or (3) a chain-of-thought explanation offering a detailed reasoning process for the diagnosis. We find that the format of explanations significantly influences diagnostic accuracy. The chain-of-thought explanations yielded the best performance, improving the diagnostic accuracy by 12.2% compared to the control condition without LLM support (P = 0.001). The chain-of-thought explanations are also superior to the standard output without explanation (+7.2%; P = 0.040) and the differential diagnosis format (+9.7%; P = 0.004). We further assessed the robustness of these findings across case difficulty and different physician backgrounds such as general vs. specialized radiologists. Evidently, explaining the reasoning for a diagnosis helps physicians to identify and correct potential errors in LLM predictions and thus improve overall decisions. Altogether, the results highlight the importance of how explanations in medical LLMs are generated to maximize their utility in clinical practice. By designing explanations to support the reasoning processes of physicians, LLMs can improve diagnostic performance and, ultimately, patient outcomes.
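The three explanation formats compared in the study correspond to different prompting strategies. A hedged illustration of how such prompts might be phrased (the wording below is an assumption for illustration; the study's actual instructions are not reproduced here):

```python
# Illustrative prompt templates for the three LLM output formats compared
# in the study; the exact wording used by the authors is not published here.
PROMPTS = {
    "standard": "State the single most likely diagnosis for this case.",
    "differential": (
        "List the most likely diagnoses and, for each, briefly state "
        "the findings that support or argue against it."
    ),
    "chain_of_thought": (
        "Reason step by step: describe the relevant imaging findings, "
        "relate them to the clinical history, weigh the alternatives, "
        "and only then state the final diagnosis."
    ),
}
```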

MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks

Yinghao Zhu, Ziyi He, Haoran Hu, Xiaochen Zheng, Xichen Zhang, Zixiang Wang, Junyi Gao, Liantao Ma, Lequan Yu

arXiv preprint · May 18, 2025
The rapid advancement of Large Language Models (LLMs) has stimulated interest in multi-agent collaboration for addressing complex medical tasks. However, the practical advantages of multi-agent collaboration approaches remain insufficiently understood. Existing evaluations often lack generalizability, failing to cover diverse tasks reflective of real-world clinical practice, and frequently omit rigorous comparisons against both single-LLM-based and established conventional methods. To address this critical gap, we introduce MedAgentBoard, a comprehensive benchmark for the systematic evaluation of multi-agent collaboration, single-LLM, and conventional approaches. MedAgentBoard encompasses four diverse medical task categories: (1) medical (visual) question answering, (2) lay summary generation, (3) structured Electronic Health Record (EHR) predictive modeling, and (4) clinical workflow automation, across text, medical images, and structured EHR data. Our extensive experiments reveal a nuanced landscape: while multi-agent collaboration demonstrates benefits in specific scenarios, such as enhancing task completeness in clinical workflow automation, it does not consistently outperform advanced single LLMs (e.g., in textual medical QA) or, critically, specialized conventional methods that generally maintain better performance in tasks like medical VQA and EHR-based prediction. MedAgentBoard offers a vital resource and actionable insights, emphasizing the necessity of a task-specific, evidence-based approach to selecting and developing AI solutions in medicine. It underscores that the inherent complexity and overhead of multi-agent collaboration must be carefully weighed against tangible performance gains. All code, datasets, detailed prompts, and experimental results are open-sourced at https://medagentboard.netlify.app/.

Fair ultrasound diagnosis via adversarial protected attribute aware perturbations on latent embeddings.

Xu Z, Tang F, Quan Q, Yao Q, Kong Q, Ding J, Ning C, Zhou SK

PubMed · May 17, 2025
Deep learning techniques have significantly enhanced the convenience and precision of ultrasound image diagnosis, particularly in the crucial step of lesion segmentation. However, recent studies reveal that both train-from-scratch models and pre-trained models often exhibit performance disparities across sex and age attributes, leading to biased diagnoses for different subgroups. In this paper, we propose APPLE, a novel approach designed to mitigate unfairness without altering the parameters of the base model. APPLE achieves this by learning fair perturbations in the latent space through a generative adversarial network. Extensive experiments on both a publicly available dataset and an in-house ultrasound image dataset demonstrate that our method improves segmentation and diagnostic fairness across all sensitive attributes and various backbone architectures compared to the base models. Through this study, we aim to highlight the critical importance of fairness in medical segmentation and contribute to the development of a more equitable healthcare system.
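The abstract describes learning perturbations in the latent space adversarially so that a protected attribute can no longer be recovered from the embeddings. A minimal PyTorch sketch of that adversarial game (dimensions, losses, and network sizes are illustrative assumptions, not the APPLE implementation, which would also include a task loss to preserve segmentation quality):

```python
import torch
import torch.nn as nn

dim = 256
# Generator that perturbs frozen base-model embeddings, and a discriminator
# that tries to recover the protected attribute (e.g., sex) from them.
perturb = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
attr_disc = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 2))

opt_g = torch.optim.Adam(perturb.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(attr_disc.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

z = torch.randn(32, dim)        # placeholder for frozen latent embeddings
a = torch.randint(0, 2, (32,))  # placeholder protected-attribute labels

# Discriminator step: learn to predict the attribute from perturbed embeddings.
d_loss = ce(attr_disc(z + perturb(z).detach()), a)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: perturb so the attribute becomes unrecoverable
# (maximize the discriminator's loss).
g_loss = -ce(attr_disc(z + perturb(z)), a)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```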

Feasibility of improving vocal fold pathology image classification with synthetic images generated by DDPM-based GenAI: a pilot study.

Khazrak I, Zainaee S, M Rezaee M, Ghasemi M, C Green R

PubMed · May 17, 2025
Voice disorders (VD) are often linked to vocal fold structural pathologies (VFSP). Laryngeal imaging plays a vital role in assessing VFSPs and VD in clinical and research settings, but challenges like scarce and imbalanced datasets can limit the generalizability of findings. Denoising Diffusion Probabilistic Models (DDPMs), a subtype of Generative AI, have gained attention for their ability to generate high-quality and realistic synthetic images to address these challenges. This study explores the feasibility of improving VFSP image classification by generating synthetic images using DDPMs. A total of 404 laryngoscopic images depicting vocal folds with and without VFSPs were included. DDPMs were used to generate synthetic images to augment the original dataset. Two convolutional neural network architectures, VGG16 and ResNet50, were applied for model training. The models were initially trained only on the original dataset and then on the augmented datasets. Evaluation metrics were analyzed to assess the performance of the models for both binary classification (with/without VFSPs) and multi-class classification (seven specific VFSPs). Realistic and high-quality synthetic images were generated for dataset augmentation. The models initially failed to converge when trained only on the original dataset, but they converged and achieved low loss and high accuracy when trained on the augmented datasets. The best performance was achieved for both binary and multi-class classification when the models were trained on an augmented dataset. Generating realistic images of VFSP using DDPMs is feasible, can enhance the classification of VFSPs by an AI model, and may support VD screening and diagnosis.
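The classification stage described above is conventional transfer learning over an augmented dataset. A minimal PyTorch/torchvision sketch of one training step (the seven-class head for the multi-class task and the placeholder batch are illustrative assumptions):

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNet50 with a seven-class head for the multi-class VFSP task; in practice
# ImageNet-pretrained weights would be loaded rather than weights=None.
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, 7)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Placeholder batch standing in for a mix of original laryngoscopic images
# and DDPM-generated synthetic images.
images, labels = torch.randn(4, 3, 224, 224), torch.randint(0, 7, (4,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```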

MedVKAN: Efficient Feature Extraction with Mamba and KAN for Medical Image Segmentation

Hancan Zhu, Jinhao Chen, Guanghua He

arXiv preprint · May 17, 2025
Medical image segmentation relies heavily on convolutional neural networks (CNNs) and Transformer-based models. However, CNNs are constrained by limited receptive fields, while Transformers suffer from scalability challenges due to their quadratic computational complexity. To address these limitations, recent advances have explored alternative architectures. The state-space model Mamba offers near-linear complexity while capturing long-range dependencies, and the Kolmogorov-Arnold Network (KAN) enhances nonlinear expressiveness by replacing fixed activation functions with learnable ones. Building on these strengths, we propose MedVKAN, an efficient feature extraction model integrating Mamba and KAN. Specifically, we introduce the EFC-KAN module, which enhances KAN with convolutional operations to improve local pixel interaction. We further design the VKAN module, integrating Mamba with EFC-KAN as a replacement for Transformer modules, significantly improving feature extraction. Extensive experiments on five public medical image segmentation datasets show that MedVKAN achieves state-of-the-art performance on four datasets and ranks second on the remaining one. These results validate the potential of Mamba and KAN for medical image segmentation while introducing an innovative and computationally efficient feature extraction framework. The code is available at: https://github.com/beginner-cjh/MedVKAN.
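A KAN replaces fixed activation functions with learnable ones, and the EFC-KAN module adds convolution for local pixel interaction. A heavily simplified PyTorch sketch of that combination (the four fixed basis functions stand in for the learnable splines of real KANs; all sizes are illustrative assumptions, not the MedVKAN code, which is available at the link above):

```python
import torch
import torch.nn as nn

class SimpleKANLayer(nn.Module):
    """Much-simplified KAN-style layer: each feature passes through a
    learnable activation (a weighted sum of four fixed basis functions)
    before a linear mix; real KANs use spline-parameterized activations."""
    def __init__(self, dim, n_basis=4):
        super().__init__()
        self.coef = nn.Parameter(torch.randn(dim, n_basis) * 0.1)
        self.mix = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (B, N, dim)
        basis = torch.stack([x, torch.sin(x), torch.tanh(x), x ** 2], dim=-1)
        act = (basis * self.coef).sum(-1)      # learnable per-feature activation
        return self.mix(act)

class ConvKANBlock(nn.Module):
    """Illustrative analogue of the EFC-KAN idea: a depthwise convolution
    restores local pixel interaction, then the KAN-style layer mixes channels."""
    def __init__(self, dim):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.kan = SimpleKANLayer(dim)

    def forward(self, x):                      # x: (B, C, H, W)
        x = self.dw(x)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        return self.kan(tokens).transpose(1, 2).reshape(b, c, h, w)

out = ConvKANBlock(32)(torch.randn(1, 32, 16, 16))
```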