Latest Papers on Radiology AI. Tags: In Silico

Improving data-driven gated (DDG) PET and CT registration in thoracic lesions: a comparison of AI registration and DDG CT.

Pan T, Thomas MA, Lu Y, Luo D

•papers•Sep 30 2025

Misregistration between CT and PET can result in mis-localization and inaccurate quantification of the tracer uptake in PET. Data-driven gated (DDG) CT can correct registration and quantification but requires a radiation dose of 1.3 mSv and 1 min of acquisition time. AI registration (AIR) does not require an additional CT and has been validated to improve registration and reduce the 'banana' misregistration artifacts around the diaphragm. We aimed to compare a validated AIR and DDG CT in registration and quantification of avid thoracic lesions misregistered in DDG PET scans. Thirty PET/CT patient data (23 with 18F-FDG, 4 with 68Ga-Dotatate, and 3 with 18F-PSMA piflufolastat) with at least one misregistered avid lesion in the thorax were recruited. Patient studies were conducted using DDG CT to correct misregistration with DDG PET data of the phases 30 to 80% on GE Discovery MI PET/CT scanners. Non-attenuation correction DDG PET and misregistered CT were input to AIR and the AIR-corrected CT data were output to register and quantify the DDG PET data. Registration and quantification of lesion SUVmax and signal-to-background ratio (SBR) of the lesion SUVmax to the 2-cm background mean SUV were compared for each of the 51 avid lesions. DDG CT outperformed AIR in misregistration correction and quantification of avid thoracic lesions (1.16 ± 0.45 cm). Most lesions (46/51, 90%) showed improved registration from DDG CT relative to AIR, with 10% (5/51) being similar between AIR and DDG CT. The lesions in the baseline CT were an average of 2.06 ± 1.0 cm from their corresponding lesions in the DDG CT, while those in the AIR CT were an average of 0.97 ± 0.54 cm away. AIR significantly improved lesion registration compared to the baseline CT (P < 0.0001). SUVmax increased by 18.1 ± 15.3% with AIR, but a statistically significantly larger increase of 34.4 ± 25.4% was observed with DDG CT (P < 0.0001). 
A statistically significant increase in SBR was also observed, rising from 10.5 ± 12.1% of AIR to 21.1 ± 20.5% of DDG CT (P < 0.0001). Many registration improvements by AIR were still left with misregistration. AIR could mis-localize a lymph node to the lung parenchyma or the ribs, and could also mis-localize a lung nodule to the left atrium. AIR could also distort the rib cage and the circular shape of the aorta cross section. DDG CT outperformed AIR in both localization and quantification of the thoracic avid lesions. AIR improved registration of the misregistered PET/CT. Registered lymph nodes could be falsely misregistered by AIR. AIR-induced distortion of the rib cage can also negatively impact image quality. Further research on AIR's accuracy in modeling true patient respiratory motion without introducing new misregistration or anatomical distortion is warranted.
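The two lesion metrics compared in this abstract can be sketched in a few lines. This is a minimal illustration, assuming SBR is the lesion SUVmax divided by the background mean SUV as the abstract describes; the function names and the numeric values are hypothetical and are not taken from the study.

```python
def signal_to_background_ratio(lesion_suv_max: float, background_mean_suv: float) -> float:
    """SBR: lesion SUVmax divided by the mean SUV of the surrounding background."""
    if background_mean_suv <= 0:
        raise ValueError("background mean SUV must be positive")
    return lesion_suv_max / background_mean_suv


def percent_change(baseline: float, corrected: float) -> float:
    """Relative change (in %) of a metric after registration correction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (corrected - baseline) / baseline


# Hypothetical values for one lesion (not from the paper).
baseline_suv_max = 5.0    # SUVmax with the misregistered CT
corrected_suv_max = 6.0   # SUVmax after registration correction
background_mean = 2.0     # mean SUV in the background region

suv_gain = percent_change(baseline_suv_max, corrected_suv_max)
sbr_after = signal_to_background_ratio(corrected_suv_max, background_mean)
print(f"SUVmax change: {suv_gain:.1f}%, SBR after correction: {sbr_after:.2f}")
```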

PET Registration Chest Retrospective Clinical In Silico Academic Lab

A phase-aware Cross-Scale U-MAMba with uncertainty-aware segmentation and Switch Atrous Bifovea EfficientNetB7 classification of kidney lesion subtype.

Rmr SS, Mb S, R D, M T, P V

•papers•Sep 30 2025

Kidney lesion subtype identification is essential for precise diagnosis and personalized treatment planning. However, achieving reliable classification remains challenging due to factors such as inter-patient anatomical variability, incomplete multi-phase CT acquisitions, and ill-defined or overlapping lesion boundaries. In addition, genetic and ethnic morphological variations introduce inconsistent imaging patterns, reducing the generalizability of conventional deep learning models. To address these challenges, we introduce a unified framework called Phase-aware Cross-Scale U-MAMba and Switch Atrous Bifovea EfficientNet B7 (PCU-SABENet), which integrates multi-phase reconstruction, fine-grained lesion segmentation, and robust subtype classification. The PhaseGAN-3D synthesizes missing CT phases using binary mask-guided inter-phase priors, enabling complete four-phase reconstruction even under partial acquisition conditions. The PCU segmentation module combines Contextual Attention Blocks, Cross-Scale Skip Connections, and uncertainty-aware pseudo-labeling to delineate lesion boundaries with high anatomical fidelity. These enhancements help mitigate low contrast and intra-class ambiguity. For classification, SABENet employs Switch Atrous Convolution for multi-scale receptive field adaptation, Hierarchical Tree Pooling for structure-aware abstraction, and Bi-Fovea Self-Attention to emphasize fine lesion cues and global morphology. This configuration is particularly effective in addressing morphological diversity across patient populations. Experimental results show that the proposed model achieves state-of-the-art performance, with 99.3% classification accuracy, 94.8% Dice similarity, 89.3% IoU, 98.8% precision, 99.2% recall, a phase-consistency score of 0.94, and a subtype confidence deviation of 0.08. 
Moreover, the model generalizes well on external datasets (TCIA) with 98.6% accuracy and maintains efficient computational performance, requiring only 0.138 GFLOPs and 8.2 ms inference time. These outcomes confirm the model's robustness in phase-incomplete settings and its adaptability to diverse patient cohorts. The PCU-SABENet framework sets a new standard in kidney lesion subtype analysis, combining segmentation precision with clinically actionable classification, thus offering a powerful tool for enhancing diagnostic accuracy and decision-making in real-world renal cancer management.
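For readers unfamiliar with the two overlap metrics reported above, the sketch below computes Dice and IoU for a single pair of masks. It assumes masks are represented as sets of voxel coordinates, which is an illustrative choice and not the paper's implementation. For one mask pair, Dice = 2·IoU / (1 + IoU); averages over a dataset need not satisfy this identity exactly.

```python
def dice_and_iou(pred: set, truth: set) -> tuple:
    """Return (Dice, IoU) for two masks given as sets of voxel coordinates."""
    intersection = len(pred & truth)
    total = len(pred) + len(truth)
    union = total - intersection
    if union == 0:
        # Both masks empty: treat as perfect agreement.
        return 1.0, 1.0
    return 2.0 * intersection / total, intersection / union


# Hypothetical 2D masks (not from the paper).
predicted = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (1, 2)}

dice, iou = dice_and_iou(predicted, reference)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```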

CT Segmentation Abdominal Methodology In Silico Benchmark SOTA

A Multimodal LLM Approach for Visual Question Answering on Multiparametric 3D Brain MRI

Arvind Murari Vepa, Yannan Yu, Jingru Gan, Anthony Cuturrufo, Weikai Li, Wei Wang, Fabien Scalzo, Yizhou Sun

•preprint•Sep 30 2025

We introduce mpLLM, a prompt-conditioned hierarchical mixture-of-experts (MoE) architecture for visual question answering over multi-parametric 3D brain MRI (mpMRI). mpLLM routes across modality-level and token-level projection experts to fuse multiple interrelated 3D modalities, enabling efficient training without image-report pretraining. To address limited image-text paired supervision, mpLLM integrates a synthetic visual question answering (VQA) protocol that generates medically relevant VQA from segmentation annotations, and we collaborate with medical experts for clinical validation. mpLLM outperforms strong medical VLM baselines by 5.3% on average across multiple mpMRI datasets. Our study features three main contributions: (1) the first clinically validated VQA dataset for 3D brain mpMRI, (2) a novel multimodal LLM that handles multiple interrelated 3D modalities, and (3) strong empirical results that demonstrate the medical utility of our methodology. Ablations highlight the importance of modality-level and token-level experts and prompt-conditioned routing. We have included our source code in the supplementary materials and will release our dataset upon publication.

MRI LLM Radiology Report Neurological Methodology In Silico Academic Lab Open Code Open Dataset GenAI

Automated detection of bottom-of-sulcus dysplasia on magnetic resonance imaging-positron emission tomography in patients with drug-resistant focal epilepsy.

Macdonald-Laurs E, Warren AEL, Mito R, Genc S, Alexander B, Barton S, Yang JY, Francis P, Pardoe HR, Jackson G, Harvey AS

•papers•Sep 30 2025

Bottom-of-sulcus dysplasia (BOSD) is a diagnostically challenging subtype of focal cortical dysplasia, 60% being missed on magnetic resonance imaging (MRI). Automated MRI-based detection methods have been developed for focal cortical dysplasia, but not BOSD specifically, and few methods incorporate fluorodeoxyglucose positron emission tomography (FDG-PET) alongside MRI features. We report the development and performance of an automated BOSD detector using combined MRI + PET. The training set comprised 54 patients with focal epilepsy and BOSD. The test sets comprised 17 subsequently diagnosed patients with BOSD from the same center, and 12 published patients from a different center. Across training and test sets, 81% of patients had normal initial MRIs and most BOSDs were <1.5 cm³. In the training set, 12 features from T1-MRI, fluid-attenuated inversion recovery-MRI, and FDG-PET were evaluated to determine which features best distinguished dysplastic from normal-appearing cortex. Using the Multi-centre Epilepsy Lesion Detection group's machine-learning detection method with the addition of FDG-PET, neural network classifiers were then trained and tested on MRI + PET, MRI-only, and PET-only features. The proportions of patients whose BOSD was overlapped by the top output cluster, and by one of the top five output clusters, were determined. Cortical and subcortical hypometabolism on FDG-PET was superior in discriminating dysplastic from normal-appearing cortex compared to MRI features. When the BOSD detector was trained on MRI + PET features, 87% of BOSDs were overlapped by one of the top five clusters (69% top cluster) in the training set, 94% in the prospective test set (88% top cluster), and 75% in the published test set (58% top cluster). Cluster overlap was generally lower when the detector was trained and tested on PET-only or MRI-only features. 
Detection of BOSD is possible using established MRI-based automated detection methods, supplemented with FDG-PET features and trained on a BOSD-specific cohort. In clinically appropriate patients with seemingly negative MRI, the detector could suggest MRI regions to scrutinize for possible BOSD.
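The "top cluster" and "top five clusters" figures above are per-patient hit rates. A minimal sketch of that bookkeeping follows, assuming each patient is summarized by the rank of the best-ranked output cluster that overlaps the lesion (or `None` if no cluster overlaps). The function name, data layout, and example ranks are hypothetical.

```python
from typing import Optional, Sequence


def topk_overlap_rate(best_overlap_ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of patients whose lesion is overlapped by one of the top-k clusters.

    best_overlap_ranks holds, per patient, the rank (1 = top cluster) of the
    highest-ranked cluster overlapping the lesion, or None if none overlaps.
    """
    if not best_overlap_ranks:
        raise ValueError("need at least one patient")
    hits = sum(1 for rank in best_overlap_ranks if rank is not None and rank <= k)
    return hits / len(best_overlap_ranks)


# Hypothetical ranks for eight patients (not from the paper).
ranks = [1, 1, 3, None, 2, 6, 1, 5]

top1 = topk_overlap_rate(ranks, k=1)
top5 = topk_overlap_rate(ranks, k=5)
print(f"top-1 overlap: {top1:.1%}, top-5 overlap: {top5:.1%}")
```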

Mixed Modality Detection Neurological Retrospective Clinical In Silico Academic Lab

An interpretable generative multimodal neuroimaging-genomics framework for decoding Alzheimer's disease.

Dolci G, Cruciani F, Abdur Rahaman M, Abrol A, Chen J, Fu Z, Boscolo Galazzo I, Menegaz G, Calhoun VD

•papers•Sep 30 2025

Objective. Alzheimer's disease (AD) is the most prevalent form of dementia worldwide, encompassing a prodromal stage known as mild cognitive impairment (MCI), where patients may either progress to AD or remain stable. The objective of the work was to capture structural and functional modulations of the brain relying on multimodal MRI data and single nucleotide polymorphisms, including in the case of missing views, with the twofold goal of classifying AD patients versus healthy controls and detecting MCI converters. Approach. We propose a multimodal deep learning (DL)-based classification framework where a generative module employing cycle generative adversarial networks was introduced in the latent space for imputing missing data (a common issue of multimodal approaches). An explainable AI method was then used to extract the relevance of input features, allowing for post-hoc validation and enhancing the interpretability of the learned representations. Main results. Experimental results on two tasks, AD detection and MCI conversion, showed that our framework reached performance competitive with the state of the art, with an accuracy of 0.926 ± 0.02 (CI [0.90, 0.95]) and 0.711 ± 0.01 (CI [0.70, 0.72]) in the two tasks, respectively. The interpretability analysis revealed gray matter modulations in cortical and subcortical brain areas typically associated with AD. Moreover, impairments in sensory-motor and visual resting state networks along the disease continuum, as well as genetic mutations defining biological processes linked to endocytosis, amyloid-beta, and cholesterol, were identified. Significance. Our integrative and interpretable DL approach shows promising performance for AD detection and MCI prediction while shedding light on important biological insights.

MRI Classification Neurological Methodology In Silico GenAI

Enhancing Microscopic Image Quality With DiffusionFormer and Crow Search Optimization.

Patel SC, Kamath RN, Murthy TSN, Subash K, Avanija J, Sangeetha M

•papers•Sep 30 2025

Medical imaging plays a vital role in diagnosis, but noise in patient scans severely affects the accuracy and quality of images. Denoising methods are important to increase the clarity of these images, particularly in low-resource settings where current diagnostic roles are inaccessible. Pneumonia is a widespread disease that presents significant diagnostic challenges due to the high similarity between its various types and the lack of medical images for emerging variants. This study introduces a novel diffusion and Swin Transformer-based method, tuned with an optimized crow search algorithm, to increase image quality and reliability. The technique uses four datasets: a brain tumor MRI dataset, chest X-ray images, chest CT scans, and BUSI. The preprocessing steps involve conversion to grayscale, resizing, and normalization to improve image quality in medical image (MI) datasets. Gaussian noise is introduced to further enhance image quality. The method incorporates a diffusion process, Swin Transformer networks, and an optimized crow search algorithm to improve the denoising of medical images. The diffusion process reduces noise by iteratively refining images, while the Swin Transformer captures complex image features that help differentiate between noise and essential diagnostic information. The crow search optimization algorithm fine-tunes the hyperparameters, minimizing the fitness function for optimal denoising performance. The method is tested across the four datasets, indicating its effectiveness against other techniques. The proposed method achieves a peak signal-to-noise ratio of 38.47 dB, a structural similarity index measure of 98.14%, a mean squared error of 0.55, and a feature similarity index measure of 0.980, which outperforms existing techniques. These outcomes reflect that the proposed approach effectively enhances the quality of images, supporting precise and dependable diagnoses.
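Two of the reported denoising metrics, mean squared error and peak signal-to-noise ratio, are related by PSNR = 10·log10(MAX² / MSE). The sketch below illustrates that relation on flattened pixel lists. The intensity range (`max_value`) and the sample pixels are assumptions for illustration; the abstract does not state which range the authors used.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], denoised: Sequence[float]) -> float:
    """Mean squared error between two equally sized, flattened images."""
    if len(reference) != len(denoised) or not reference:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - d) ** 2 for r, d in zip(reference, denoised)) / len(reference)


def psnr(reference: Sequence[float], denoised: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; max_value is the assumed intensity range."""
    error = mse(reference, denoised)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical pixels normalized to [0, 1] (not from the paper).
clean = [0.0, 0.5]
restored = [0.1, 0.5]

print(f"MSE = {mse(clean, restored):.4f}, PSNR = {psnr(clean, restored):.2f} dB")
```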

Mixed Modality Reconstruction Methodology In Silico Academic Lab Benchmark SOTA

LMOD+: A Comprehensive Multimodal Dataset and Benchmark for Developing and Evaluating Multimodal Large Language Models in Ophthalmology

Zhenyue Qin, Yang Liu, Yu Yin, Jinyu Ding, Haoran Zhang, Anran Li, Dylan Campbell, Xuansheng Wu, Ke Zou, Tiarnan D. L. Keenan, Emily Y. Chew, Zhiyong Lu, Yih-Chung Tham, Ninghao Liu, Xiuzhen Zhang, Qingyu Chen

•preprint•Sep 30 2025

Vision-threatening eye diseases pose a major global health burden, with timely diagnosis limited by workforce shortages and restricted access to specialized care. While multimodal large language models (MLLMs) show promise for medical image interpretation, advancing MLLMs for ophthalmology is hindered by the lack of comprehensive benchmark datasets suitable for evaluating generative models. We present a large-scale multimodal ophthalmology benchmark comprising 32,633 instances with multi-granular annotations across 12 common ophthalmic conditions and 5 imaging modalities. The dataset integrates imaging, anatomical structures, demographics, and free-text annotations, supporting anatomical structure recognition, disease screening, disease staging, and demographic prediction for bias evaluation. This work extends our preliminary LMOD benchmark with three major enhancements: (1) nearly 50% dataset expansion with substantial enlargement of color fundus photography; (2) broadened task coverage including binary disease diagnosis, multi-class diagnosis, severity classification with international grading standards, and demographic prediction; and (3) systematic evaluation of 24 state-of-the-art MLLMs. Our evaluations reveal both promise and limitations. Top-performing models achieved ~58% accuracy in disease screening under zero-shot settings, and performance remained suboptimal for challenging tasks like disease staging. We will publicly release the dataset, curation pipeline, and leaderboard to potentially advance ophthalmic AI applications and reduce the global burden of vision-threatening diseases.

Mixed Modality Classification Dataset Release In Silico Academic Lab Open Dataset GenAI

Artificial Intelligence Model for Imaging-Based Extranodal Extension Detection and Outcome Prediction in Human Papillomavirus-Positive Oropharyngeal Cancer.

Dayan GS, Hénique G, Bahig H, Nelson K, Brodeur C, Christopoulos A, Filion E, Nguyen-Tan PF, O'Sullivan B, Ayad T, Bissada E, Tabet P, Guertin L, Desilets A, Kadoury S, Letourneau-Guillon L

•papers•Sep 30 2025

Although not included in the eighth edition of the American Joint Committee on Cancer Staging System, there is growing evidence suggesting that imaging-based extranodal extension (iENE) is associated with worse outcomes in HPV-associated oropharyngeal carcinoma (OPC). Key challenges with iENE include the lack of standardized criteria, reliance on radiological expertise, and interreader variability. The objective was to develop an artificial intelligence (AI)-driven pipeline for lymph node segmentation and iENE classification using pretreatment computed tomography (CT) scans, and to evaluate its association with oncologic outcomes in HPV-positive OPC. This was a single-center cohort study conducted at a tertiary oncology center in Montreal, Canada, of adult patients with HPV-positive cN+ OPC treated with up-front (chemo)radiotherapy from January 2009 to January 2020. Participants were followed up until January 2024. Data analysis was performed from March 2024 to April 2025. Pretreatment planning CT scans along with lymph node gross tumor volume segmentations performed by expert radiation oncologists were extracted. For lymph node segmentation, an nnU-Net model was developed. For iENE classification, radiomic and deep learning feature extraction methods were compared. iENE classification accuracy was assessed against 2 expert neuroradiologist evaluations using area under the receiver operating characteristic curve (AUC). Subsequently, the association of AI-predicted iENE with oncologic outcomes (ie, overall survival [OS], recurrence-free survival [RFS], distant control [DC], and locoregional control [LRC]) was assessed. Among 397 patients (mean [SD] age, 62.3 [9.1] years; 80 females [20.2%] and 317 males [79.8%]), AI-iENE classification using radiomics achieved an AUC of 0.81. Patients with AI-predicted iENE had worse 3-year OS (83.8% vs 96.8%), RFS (80.7% vs 93.7%), and DC (84.3% vs 97.1%), but similar LRC. 
AI-iENE had significantly higher concordance indices than radiologist-assessed iENE for OS (0.64 vs 0.55), RFS (0.67 vs 0.60), and DC (0.79 vs 0.68). In multivariable analysis, AI-iENE remained independently associated with OS (adjusted hazard ratio [aHR], 2.82; 95% CI, 1.21-6.57), RFS (aHR, 4.20; 95% CI, 1.93-9.11), and DC (aHR, 12.33; 95% CI, 4.15-36.67), adjusting for age, tumor category, node category, and number of lymph nodes. This single-center cohort study found that an AI-driven pipeline can successfully automate lymph node segmentation and iENE classification from pretreatment CT scans in HPV-associated OPC. Predicted iENE was independently associated with worse oncologic outcomes. External validation is required to assess generalizability and the potential for implementation in institutions without specialized imaging expertise.
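The concordance indices quoted above measure how well a risk score orders patients by time to event. A minimal sketch of Harrell's concordance index for right-censored data follows. It is a simple O(n²) illustration, not the study's analysis code: it assumes a pair is comparable when the patient with the shorter time had an observed event, it skips pairs with equal times, and it counts tied risk scores as half-concordant. All example data are hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int], risks: Sequence[float]) -> float:
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    events[i] is 1 if the event was observed for patient i, 0 if censored.
    A higher risk score should correspond to a shorter time to event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data (not from the paper).
follow_up_years = [2.0, 4.0, 6.0, 8.0]
event_observed = [1, 1, 0, 1]
risk_score = [0.9, 0.3, 0.5, 0.1]

c_index = concordance_index(follow_up_years, event_observed, risk_score)
print(f"C-index = {c_index:.2f}")
```

With a binary predictor such as a yes/no iENE call, many pairs have tied risk scores, which pulls the index toward 0.5.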

CT Segmentation Retrospective Clinical In Silico Academic Lab Benchmark SOTA

A Pretraining Approach for Small-sample Training Employing Radiographs (PASTER): a Multimodal Transformer Trained by Chest Radiography and Free-text Reports.

Chen KC, Kuo M, Lee CH, Liao HC, Tsai DJ, Lin SA, Hsiang CW, Chang CK, Ko KH, Hsu YC, Chang WC, Huang GS, Fang WH, Lin CS, Lin SH, Chen YH, Hung YJ, Tsai CS, Lin C

•papers•Sep 30 2025

While deep convolutional neural networks (DCNNs) have achieved remarkable performance in chest X-ray interpretation, their success typically depends on access to large-scale, expertly annotated datasets. However, collecting such data in real-world clinical settings can be difficult because of limited labeling resources, privacy concerns, and patient variability. In this study, we applied a multimodal Transformer pretrained on free-text reports and their paired CXRs to evaluate the effectiveness of this method in settings with limited labeled data. Our dataset consisted of more than 1 million CXRs, each accompanied by reports from board-certified radiologists and 31 structured labels. The results indicated that a linear model trained on embeddings from the pretrained model achieved AUCs of 0.907 and 0.903 on internal and external test sets, respectively, using only 128 cases and 384 controls; the results were comparable to those of DenseNet trained on the entire dataset, whose AUCs were 0.908 and 0.903, respectively. Additionally, we demonstrated similar results by extending the application of this approach to a subset annotated with structured echocardiographic reports. Furthermore, this multimodal model exhibited excellent small-sample learning capabilities when tested on external validation sets such as CheXpert and ChestX-ray14. This research significantly reduces the sample size necessary for future artificial intelligence advancements in CXR interpretation.
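The AUC values compared above can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below computes AUC directly from that pairwise definition, counting ties as one half. It is an illustrative implementation with hypothetical scores, not the evaluation code used in the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (case, control) pairs where the case scores higher."""
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier outputs (not from the paper).
model_scores = [0.1, 0.4, 0.35, 0.8]
true_labels = [0, 0, 1, 1]

auc = roc_auc(model_scores, true_labels)
print(f"AUC = {auc:.2f}")
```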

X-Ray Classification Chest Methodology In Silico Academic Lab Benchmark SOTA

Multi-modal Liver Segmentation and Fibrosis Staging Using Real-world MRI Images

Yang Zhou, Kunhao Yuan, Ye Wei, Jishizhan Chen

•preprint•Sep 30 2025

Liver fibrosis represents the accumulation of excessive extracellular matrix caused by sustained hepatic injury. It disrupts normal lobular architecture and function, increasing the chances of cirrhosis and liver failure. Precise staging of fibrosis for early diagnosis and intervention is often invasive, which carries risks and complications. To address this challenge, recent advances in artificial intelligence-based liver segmentation and fibrosis staging offer a non-invasive alternative. As a result, the CARE 2025 Challenge aimed for automated methods to quantify and analyse liver fibrosis in real-world scenarios, using multi-centre, multi-modal, and multi-phase MRI data. This challenge included tasks of precise liver segmentation (LiSeg) and fibrosis staging (LiFS). In this study, we developed an automated pipeline for both tasks across all the provided MRI modalities. This pipeline integrates pseudo-labelling based on multi-modal co-registration, liver segmentation using deep neural networks, and liver fibrosis staging based on shape, textural, appearance, and directional (STAD) features derived from segmentation masks and MRI images. By solely using the released data with limited annotations, our proposed pipeline demonstrated excellent generalisability for all MRI modalities, achieving top-tier performance across all competition subtasks. This approach provides a rapid and reproducible framework for quantitative MRI-based liver fibrosis assessment, supporting early diagnosis and clinical decision-making. Code is available at https://github.com/YangForever/care2025_liver_biodreamer.

MRI Segmentation Abdominal Methodology In Silico Academic Lab Benchmark SOTA Open Code
