Precision Diagnosis and Treatment Monitoring of Glioma via PET Radiomics.

Zhou C, Ji P, Gong B, Kou Y, Fan Z, Wang L

PubMed · Jul 17 2025
Glioma, the most common primary intracranial tumor, poses significant challenges to precision diagnosis and treatment due to its heterogeneity and invasiveness. With the introduction of the 2021 WHO classification standard based on molecular biomarkers, the role of imaging in non-invasive subtyping and therapeutic monitoring of gliomas has become increasingly crucial. While conventional MRI shows limitations in assessing metabolic status and differentiating tumor recurrence, positron emission tomography (PET) combined with radiomics and artificial intelligence technologies offers a novel paradigm for precise diagnosis and treatment monitoring through quantitative extraction of multimodal imaging features (e.g., intensity, texture, dynamic parameters). This review systematically summarizes the technical workflow of PET radiomics (including tracer selection, image segmentation, feature extraction, and model construction) and its applications in predicting molecular subtypes (such as IDH mutation and MGMT methylation), distinguishing recurrence from treatment-related changes, and prognostic stratification. Studies demonstrate that amino acid tracers (e.g., ¹⁸F-FET, ¹¹C-MET) combined with multimodal radiomics models significantly outperform traditional parametric analysis in diagnostic efficacy. Nevertheless, current research still faces challenges including data heterogeneity, insufficient model interpretability, and lack of clinical validation. Future advancements require multicenter standardized protocols, open-source algorithm frameworks, and multi-omics integration to facilitate the clinical translation of PET radiomics from research to practice.
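
As a concrete illustration of the workflow summarized above (segmentation → feature extraction → model construction), the sketch below chains pyradiomics feature extraction with a scikit-learn classifier for a molecular label such as IDH status. The file names, the case list, and the choice of classifier are hypothetical placeholders, not the pipeline of any study cited in this review.

```python
# Hedged sketch of a generic PET-radiomics pipeline: extract intensity/texture
# features from a PET volume and its tumour mask with pyradiomics, then fit a
# simple classifier for a binary molecular label (e.g. IDH-mutant vs wild-type).
import numpy as np
from radiomics import featureextractor            # pip install pyradiomics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.disableAllFeatures()
extractor.enableFeatureClassByName("firstorder")  # intensity statistics
extractor.enableFeatureClassByName("glcm")        # texture features

def pet_features(image_path, mask_path):
    """Radiomic features for one PET volume + tumour segmentation (NIfTI paths)."""
    result = extractor.execute(image_path, mask_path)
    # keep the numeric radiomic features, drop the diagnostic metadata entries
    return {k: float(v) for k, v in result.items() if not k.startswith("diagnostics")}

def fit_molecular_model(cases):
    """cases: list of (pet_path, mask_path, label) tuples -- hypothetical inputs."""
    rows = [pet_features(img, msk) for img, msk, _ in cases]
    names = sorted(rows[0])
    X = np.array([[r[k] for k in names] for r in rows])
    y = np.array([label for _, _, label in cases])
    model = LogisticRegression(max_iter=1000)
    print("cross-validated AUC:", cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())
    return model.fit(X, y)
```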

DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model

Han Zhang, Xiangde Luo, Yong Chen, Kang Li

arXiv preprint · Jul 17 2025
Annotation variability remains a substantial challenge in medical image segmentation, stemming from ambiguous imaging boundaries and diverse clinical expertise. Traditional deep learning methods that produce a single deterministic segmentation prediction often fail to capture these annotator biases. Although recent studies have explored multi-rater segmentation, existing methods typically focus on a single perspective, either generating a probabilistic "gold standard" consensus or preserving expert-specific preferences, and thus struggle to provide an omni view that covers both. In this study, we propose DiffOSeg, a two-stage diffusion-based framework which aims to simultaneously achieve both consensus-driven (combining all experts' opinions) and preference-driven (reflecting experts' individual assessments) segmentation. Stage I establishes population consensus through a probabilistic consensus strategy, while Stage II captures expert-specific preferences via adaptive prompts. Demonstrated on two public datasets (LIDC-IDRI and NPC-170), our model outperforms existing state-of-the-art methods across all evaluated metrics. Source code is available at https://github.com/string-ellipses/DiffOSeg.
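
The consensus-driven idea of Stage I can be illustrated with a toy sketch: average the experts' binary masks into a per-pixel probability map and, optionally, sample hard masks from it during training. This is only a generic illustration of probabilistic consensus, assuming binary 2D masks; DiffOSeg's actual strategy may differ in detail.

```python
# Toy illustration of a probabilistic consensus over multiple expert masks
# (the general idea behind consensus-driven multi-rater segmentation).
import numpy as np

def probabilistic_consensus(expert_masks):
    """expert_masks: list of binary (H, W) arrays, one per annotator."""
    stack = np.stack(expert_masks).astype(np.float32)
    return stack.mean(axis=0)                      # per-pixel P(foreground)

def sample_consensus_target(consensus, rng=None):
    """Sample a hard mask from the consensus map so training sees the full
    spread of annotator variability rather than a single majority vote."""
    rng = rng if rng is not None else np.random.default_rng()
    return (rng.random(consensus.shape) < consensus).astype(np.uint8)

# example with three disagreeing annotators on a 4x4 patch
masks = [np.random.default_rng(s).integers(0, 2, (4, 4)) for s in range(3)]
print(probabilistic_consensus(masks))
```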

Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images

Zahra TehraniNasab, Amar Kumar, Tal Arbel

arXiv preprint · Jul 17 2025
Medical image synthesis presents unique challenges due to the inherent complexity and high-resolution details required in clinical contexts. Traditional generative architectures such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs) have shown great promise for high-resolution image generation but struggle with preserving fine-grained details that are key for accurate diagnosis. To address this issue, we introduce Pixel Perfect MegaMed, the first vision-language foundation model to synthesize images at a resolution of 1024x1024. Our method deploys a multi-scale transformer architecture designed specifically for ultra-high resolution medical image generation, enabling the preservation of both global anatomical context and local image-level details. By leveraging vision-language alignment techniques tailored to medical terminology and imaging modalities, Pixel Perfect MegaMed bridges the gap between textual descriptions and visual representations at unprecedented resolution levels. We apply our model to the CheXpert dataset and demonstrate its ability to generate clinically faithful chest X-rays from text prompts. Beyond visual quality, these high-resolution synthetic images prove valuable for downstream tasks such as classification, showing measurable performance gains when used for data augmentation, particularly in low-data regimes. Our code is accessible through the project website: https://tehraninasab.github.io/pixelperfect-megamed.

Insights into a radiology-specialised multimodal large language model with sparse autoencoders

Kenza Bouzid, Shruthi Bannur, Felix Meissen, Daniel Coelho de Castro, Anton Schwaighofer, Javier Alvarez-Valle, Stephanie L. Hyland

arXiv preprint · Jul 17 2025
Interpretability can improve the safety, transparency and trust of AI models, which is especially important in healthcare applications where decisions often carry significant consequences. Mechanistic interpretability, particularly through the use of sparse autoencoders (SAEs), offers a promising approach for uncovering human-interpretable features within large transformer-based models. In this study, we apply Matryoshka-SAE to the radiology-specialised multimodal large language model, MAIRA-2, to interpret its internal representations. Using large-scale automated interpretability of the SAE features, we identify a range of clinically relevant concepts - including medical devices (e.g., line and tube placements, pacemaker presence), pathologies such as pleural effusion and cardiomegaly, longitudinal changes and textual features. We further examine the influence of these features on model behaviour through steering, demonstrating directional control over generations with mixed success. Our results reveal practical and methodological challenges, yet they offer initial insights into the internal concepts learned by MAIRA-2 - marking a step toward deeper mechanistic understanding and interpretability of a radiology-adapted multimodal large language model, and paving the way for improved model transparency. We release the trained SAEs and interpretations: https://huggingface.co/microsoft/maira-2-sae.
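
For readers unfamiliar with SAEs, the sketch below shows the generic recipe (an overcomplete dictionary trained with a reconstruction plus L1 sparsity objective) on hypothetical residual-stream activations. The Matryoshka-SAE used here adds nested dictionary sizes and other refinements not reproduced in this minimal PyTorch version.

```python
# Minimal sparse autoencoder in PyTorch: an overcomplete dictionary trained
# with reconstruction + L1 sparsity on transformer activations. The actual
# Matryoshka-SAE adds nested dictionary sizes not shown here.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model)

    def forward(self, activations: torch.Tensor):
        codes = torch.relu(self.encoder(activations))   # sparse feature codes
        return self.decoder(codes), codes

def sae_loss(recon, codes, activations, l1_coeff=1e-3):
    # reconstruction error plus an L1 penalty that keeps few features active
    return ((recon - activations) ** 2).mean() + l1_coeff * codes.abs().mean()

# hypothetical residual-stream activations of shape (batch, d_model)
sae = SparseAutoencoder(d_model=4096, d_dict=65536)
acts = torch.randn(8, 4096)
recon, codes = sae(acts)
print(sae_loss(recon, codes, acts))
```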

CytoSAE: Interpretable Cell Embeddings for Hematology

Muhammed Furkan Dasdelen, Hyesu Lim, Michele Buck, Katharina S. Götze, Carsten Marr, Steffen Schneider

arXiv preprint · Jul 16 2025
Sparse autoencoders (SAEs) have emerged as a promising tool for mechanistic interpretability of transformer-based foundation models. Very recently, SAEs were also adopted for the visual domain, enabling the discovery of visual concepts and their patch-wise attribution to tokens in the transformer model. While a growing number of foundation models have emerged for medical imaging, tools for explaining their inferences are still lacking. In this work, we show the applicability of SAEs for hematology. We propose CytoSAE, a sparse autoencoder trained on over 40,000 peripheral blood single-cell images. CytoSAE generalizes to diverse and out-of-domain datasets, including bone marrow cytology, where it identifies morphologically relevant concepts that we validated with medical experts. Furthermore, we demonstrate scenarios in which CytoSAE can generate patient-specific and disease-specific concepts, enabling the detection of pathognomonic cells and localized cellular abnormalities at the patch level. We quantify the effect of concepts on a patient-level AML subtype classification task and show that CytoSAE concepts reach performance comparable to the state-of-the-art, while offering explainability on the sub-cellular level. Source code and model weights are available at https://github.com/dynamical-inference/cytosae.
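
The patch-level localization described above can be pictured as reading out one SAE feature ("concept") across a ViT's patch tokens and reshaping it into a heatmap. The sketch below assumes a trained SAE encoder and patch embeddings with the class token already removed; shapes and the concept index are illustrative only.

```python
# Sketch of patch-level concept attribution: read one SAE feature across the
# ViT patch tokens of an image and reshape it into a heatmap. Shapes and the
# concept index are illustrative; patch_tokens excludes the class token.
import torch

def concept_heatmap(sae_encode, patch_tokens, concept_idx, grid_hw=(14, 14)):
    """
    sae_encode:   callable mapping (n_patches, d_model) -> (n_patches, d_dict)
    patch_tokens: patch embeddings for one image, shape (n_patches, d_model)
    concept_idx:  index of the SAE feature ("concept") to visualise
    grid_hw:      patch grid of the backbone, with rows * cols == n_patches
    """
    with torch.no_grad():
        codes = sae_encode(patch_tokens)             # sparse codes per patch
    heat = codes[:, concept_idx].reshape(grid_hw)    # activation per patch
    return heat / (heat.max() + 1e-8)                # normalised heatmap
```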

Generate to Ground: Multimodal Text Conditioning Boosts Phrase Grounding in Medical Vision-Language Models

Felix Nützel, Mischa Dombrowski, Bernhard Kainz

arXiv preprint · Jul 16 2025
Phrase grounding, i.e., mapping natural language phrases to specific image regions, holds significant potential for disease localization in medical imaging through clinical reports. While current state-of-the-art methods rely on discriminative, self-supervised contrastive models, we demonstrate that generative text-to-image diffusion models, leveraging cross-attention maps, can achieve superior zero-shot phrase grounding performance. Contrary to prior assumptions, we show that fine-tuning diffusion models with a frozen, domain-specific language model, such as CXR-BERT, substantially outperforms domain-agnostic counterparts. This setup achieves remarkable improvements, with mIoU scores doubling those of current discriminative methods. These findings highlight the underexplored potential of generative models for phrase grounding tasks. To further enhance performance, we introduce Bimodal Bias Merging (BBM), a novel post-processing technique that aligns text and image biases to identify regions of high certainty. BBM refines cross-attention maps, achieving even greater localization accuracy. Our results establish generative approaches as a more effective paradigm for phrase grounding in the medical imaging domain, paving the way for more robust and interpretable applications in clinical practice. The source code and model weights are available at https://github.com/Felix-012/generate_to_ground.
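
Conceptually, grounding a phrase with a diffusion model amounts to aggregating the cross-attention maps of that phrase's tokens into a heatmap and thresholding it for IoU scoring. The sketch below assumes the maps have already been averaged over layers, heads, and timesteps, and it does not reproduce the BBM bias-merging step.

```python
# Sketch of turning per-token cross-attention maps into a phrase-grounding
# heatmap: average the maps of the phrase's tokens, normalise, and threshold.
# The maps are assumed to be pre-aggregated; BBM is not reproduced here.
import torch

def phrase_heatmap(cross_attn, phrase_token_ids):
    """
    cross_attn:       (n_text_tokens, H, W) cross-attention maps, already
                      averaged over layers, heads and diffusion timesteps
    phrase_token_ids: indices of the tokens that make up the grounded phrase
    """
    heat = cross_attn[phrase_token_ids].mean(dim=0)
    return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)

def grounding_mask(heat, threshold=0.5):
    return (heat >= threshold).to(torch.uint8)       # binary region for IoU
```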

Benchmarking and Explaining Deep Learning Cortical Lesion MRI Segmentation in Multiple Sclerosis

Nataliia Molchanova, Alessandro Cagol, Mario Ocampo-Pineda, Po-Jui Lu, Matthias Weigel, Xinjie Chen, Erin Beck, Charidimos Tsagkas, Daniel Reich, Colin Vanden Bulcke, Anna Stolting, Serena Borrelli, Pietro Maggi, Adrien Depeursinge, Cristina Granziera, Henning Mueller, Pedro M. Gordaliza, Meritxell Bach Cuadra

arXiv preprint · Jul 16 2025
Cortical lesions (CLs) have emerged as valuable biomarkers in multiple sclerosis (MS), offering high diagnostic specificity and prognostic relevance. However, their routine clinical integration remains limited due to subtle magnetic resonance imaging (MRI) appearance, challenges in expert annotation, and a lack of standardized automated methods. We propose a comprehensive multi-centric benchmark of CL detection and segmentation in MRI. A total of 656 MRI scans, including clinical trial and research data from four institutions, were acquired at 3T and 7T using MP2RAGE and MPRAGE sequences with expert-consensus annotations. We rely on the self-configuring nnU-Net framework, designed for medical imaging segmentation, and propose adaptations tailored to improve CL detection. We evaluated model generalization through out-of-distribution testing, demonstrating strong lesion detection capabilities with F1-scores of 0.64 in-domain and 0.50 out-of-domain. We also analyze internal model features and model errors for a better understanding of AI decision-making. Our study examines how data variability, lesion ambiguity, and protocol differences impact model performance, offering recommendations for addressing these barriers to clinical adoption. To reinforce reproducibility, the implementation and models will be publicly accessible and ready to use at https://github.com/Medical-Image-Analysis-Laboratory/ and https://doi.org/10.5281/zenodo.15911797.
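
Lesion-wise detection scores such as the F1 values above are commonly computed by matching connected components between the predicted and reference masks, as in the sketch below; the exact overlap criterion used in this benchmark may differ.

```python
# Sketch of a lesion-wise F1 score via connected components: a predicted
# lesion counts as a true positive if it overlaps any reference lesion.
import numpy as np
from scipy import ndimage

def lesion_f1(pred_mask, gt_mask):
    """pred_mask, gt_mask: binary 3D numpy arrays."""
    pred_lbl, n_pred = ndimage.label(pred_mask)
    gt_lbl, n_gt = ndimage.label(gt_mask)
    tp = sum(1 for i in range(1, n_pred + 1) if gt_mask[pred_lbl == i].any())
    fp = n_pred - tp
    fn = sum(1 for j in range(1, n_gt + 1) if not pred_mask[gt_lbl == j].any())
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
```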

Non-invasive liver fibrosis screening on CT images using radiomics.

Yoo JJ, Namdar K, Carey S, Fischer SE, McIntosh C, Khalvati F, Rogalla P

PubMed · Jul 15 2025
This study aimed to develop a radiomics machine learning model for detecting liver fibrosis on CT images of the liver. With Ethics Board approval, 169 patients (68 women, 101 men; mean age, 51.2 years ± 14.7 [SD]) underwent an ultrasound-guided liver biopsy with simultaneous CT acquisitions without and following intravenous contrast material administration. Radiomic features were extracted from two regions of interest (ROIs) on the CT images, one placed at the biopsy site and another distant from the biopsy site. A development cohort, which was split further into training and validation cohorts across 100 trials, was used to determine the optimal combinations of contrast, normalization, machine learning model, and radiomic features for liver fibrosis detection based on their area under the receiver operating characteristic curve (AUC) on the validation cohort. The optimal combinations were then used to develop one final liver fibrosis model, which was evaluated on a test cohort. When averaging the AUC across all combinations, non-contrast enhanced (NC) CT (AUC, 0.6100; 95% CI: 0.5897, 0.6303) outperformed contrast-enhanced CT (AUC, 0.5680; 95% CI: 0.5471, 0.5890). The most effective model was found to be a logistic regression model with input features of maximum, energy, kurtosis, skewness, and small area high gray level emphasis extracted from NC CT normalized using gamma correction with γ = 1.5 (AUC, 0.7833; 95% CI: 0.7821, 0.7845). The presented radiomics-based logistic regression model holds promise as a non-invasive detection tool for subclinical, asymptomatic liver fibrosis. The model may serve as an opportunistic liver fibrosis screening tool when operated in the background during routine CT examinations covering the liver parenchyma. The final liver fibrosis detection model is made publicly available at: https://github.com/IMICSLab/RadiomicsLiverFibrosisDetection.
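
The reported winning configuration (gamma-normalized NC CT plus logistic regression on five radiomic features) can be sketched as follows. The intensity rescaling and the way the feature matrix is assembled are simplifying assumptions, not the study's released code.

```python
# Sketch of the reported configuration: gamma-correct the ROI intensities
# (gamma = 1.5) and fit logistic regression on five radiomic features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def gamma_normalise(roi, gamma=1.5):
    """Rescale ROI voxel intensities to [0, 1], then apply gamma correction."""
    scaled = (roi - roi.min()) / (roi.max() - roi.min() + 1e-8)
    return scaled ** gamma

# X: one row per patient with [maximum, energy, kurtosis, skewness,
#    small-area high gray-level emphasis]; y: biopsy-proven fibrosis (0/1)
def train_fibrosis_model(X, y):
    return LogisticRegression(max_iter=1000).fit(X, y)
```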

Latent Space Consistency for Sparse-View CT Reconstruction

Duoyou Chen, Yunqing Chen, Can Zhang, Zhou Wang, Cheng Chen, Ruoxiu Xiao

arXiv preprint · Jul 15 2025
Computed Tomography (CT) is a widely utilized imaging modality in clinical settings. Using densely acquired rotational X-ray arrays, CT can capture 3D spatial features. However, it is confronted with challenges such as long acquisition times and high radiation exposure. CT reconstruction methods based on sparse-view X-ray images have garnered substantial attention from researchers as they present a means to mitigate costs and risks. In recent years, diffusion models, particularly the Latent Diffusion Model (LDM), have demonstrated promising potential in the domain of 3D CT reconstruction. Nonetheless, due to the substantial differences between the 2D latent representation of X-ray modalities and the 3D latent representation of CT modalities, the vanilla LDM is incapable of achieving effective alignment within the latent space. To address this issue, we propose the Consistent Latent Space Diffusion Model (CLS-DM), which incorporates cross-modal feature contrastive learning to efficiently extract latent 3D information from 2D X-ray images and achieve latent space alignment between modalities. Experimental results indicate that CLS-DM outperforms classical and state-of-the-art generative models in terms of standard voxel-level metrics (PSNR, SSIM) on the LIDC-IDRI and CTSpine1K datasets. This methodology not only aids in enhancing the effectiveness and economic viability of sparse-view CT reconstruction but can also be generalized to other cross-modal transformation tasks, such as text-to-image synthesis. We have made our code publicly available at https://anonymous.4open.science/r/CLS-DM-50D6/ to facilitate further research and applications in other domains.
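
Cross-modal latent alignment of the kind described above is often implemented with a symmetric InfoNCE-style contrastive loss between matched X-ray and CT latents, as in the sketch below; CLS-DM's actual objective and architecture may differ.

```python
# Sketch of a symmetric InfoNCE-style contrastive loss for aligning 2D X-ray
# latents with 3D CT latents (flattened to vectors).
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(xray_latents, ct_latents, temperature=0.07):
    """xray_latents, ct_latents: matched pairs, both of shape (batch, d)."""
    x = F.normalize(xray_latents, dim=-1)
    c = F.normalize(ct_latents, dim=-1)
    logits = x @ c.t() / temperature                 # pairwise similarities
    targets = torch.arange(x.size(0), device=x.device)
    # pull matched X-ray/CT pairs together, push mismatched pairs apart
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```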