
Hussain SS, Degang X, Shah PM, Khan H, Zeb A

PubMed · Aug 20 2025
Alzheimer's disease (AD) is the most common progressive neurodegenerative disorder and the fifth-leading cause of death in older people. Detecting AD is a very challenging task for clinicians and radiologists due to the complex nature of this disease, motivating automatic data-driven machine-learning models to enhance diagnostic accuracy and support expert decision-making. However, machine-learning models for AD classification are hindered by three key limitations: (i) diffuse and subtle structural changes in the brain that make global pathology difficult to capture; (ii) non-uniform alterations across MRI planes, which limit single-view learning; and (iii) the lack of deep integration of demographic context, which is often ignored despite its clinical importance. To address these challenges, we propose a novel multi-modal deep learning framework, named AlzFormer, that dynamically integrates 3D MRI with demographic features represented as knowledge graph embeddings for AD classification. Specifically, (i) to capture global and volumetric features, a 3D CNN is employed; (ii) to model plane-specific information, three parallel 2D CNNs process the axial, coronal, and sagittal planes, combined with a Transformer encoder; and (iii) to incorporate demographic context, we integrate demographic features as knowledge graph embeddings through a novel Adaptive Attention Gating mechanism that balances the contributions of the two modalities (MRI and demographics). Comprehensive experiments on two real-world datasets, including generalization tests, ablation studies, and robustness evaluation under noisy conditions, demonstrate that the proposed model provides a robust and effective solution for AD diagnosis. These results suggest strong potential for integration into Clinical Decision Support Systems (CDSS), offering a more interpretable and personalized approach to early Alzheimer's detection.
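The authors' code is not shown here, but the gating idea is easy to picture. Below is a minimal PyTorch sketch of what an Adaptive Attention Gating step could look like: a learned weight in (0, 1) that balances MRI features against demographic knowledge-graph embeddings. The module name, dimensions, and convex-combination design are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class AdaptiveAttentionGate(nn.Module):
    """Toy gating module: learns a per-sample weight balancing MRI-derived
    features against demographic knowledge-graph embeddings.
    All dimensions and design choices are assumptions, not the paper's code."""
    def __init__(self, mri_dim: int, demo_dim: int, fused_dim: int):
        super().__init__()
        self.mri_proj = nn.Linear(mri_dim, fused_dim)
        self.demo_proj = nn.Linear(demo_dim, fused_dim)
        # Gate sees both modalities and outputs a mixing weight in (0, 1).
        self.gate = nn.Sequential(
            nn.Linear(2 * fused_dim, fused_dim),
            nn.ReLU(),
            nn.Linear(fused_dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, mri_feat: torch.Tensor, demo_emb: torch.Tensor) -> torch.Tensor:
        m = self.mri_proj(mri_feat)                   # (B, fused_dim)
        d = self.demo_proj(demo_emb)                  # (B, fused_dim)
        alpha = self.gate(torch.cat([m, d], dim=-1))  # (B, 1)
        return alpha * m + (1.0 - alpha) * d          # convex combination

# Example: batch of 4 subjects, 512-d MRI features, 64-d KG embeddings.
gate = AdaptiveAttentionGate(mri_dim=512, demo_dim=64, fused_dim=256)
fused = gate(torch.randn(4, 512), torch.randn(4, 64))
print(fused.shape)  # torch.Size([4, 256])
```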

Baboli, R., Martin, E., Qiu, Q., Zhao, L., Liu, T., Li, X.

medRxiv preprint · Aug 19 2025
Family history is one of the most powerful risk factors for attention-deficit/hyperactivity disorder (ADHD), yet no study has tested whether multimodal magnetic resonance imaging (MRI) combined with deep learning can separate familial ADHD (ADHD-F) and non-familial ADHD (ADHD-NF). T1-weighted and diffusion-weighted MRI data from 438 children (129 ADHD-F, 159 ADHD-NF, and 150 controls) were parcellated into 425 cortical and white-matter metrics. Our pipeline combined three feature-selection steps (t-test filtering, mutual-information ranking, and Lasso) with an auto-encoder and applied the binary-hypothesis strategy throughout; each held-out subject was assigned both possible labels in turn and evaluated under leave-one-out testing nested within five-fold cross-validation. Accuracy, sensitivity, specificity, and area under the curve (AUC) quantified performance. The model achieved accuracies/AUCs of 0.66 / 0.67 for ADHD-F vs controls, 0.67 / 0.70 for ADHD-NF vs controls, and 0.62 / 0.67 for ADHD-F vs ADHD-NF. In classification between ADHD-F and controls, the most informative metrics were the mean diffusivity (MD) of the right fornix, the MD of the left parahippocampal cingulum, and the cortical thickness of the right inferior parietal cortex. In classification between ADHD-NF and controls, the key contributors were the fractional anisotropy (FA) of the left inferior fronto-occipital fasciculus, the MD of the right fornix, and the cortical thickness of the right medial orbitofrontal cortex. In classification between ADHD-F and ADHD-NF, the highlighted features were the volume of the left cingulate cingulum tract, the volume of the right parietal segment of the superior longitudinal fasciculus, and the cortical thickness of the right fusiform cortex. Our binary-hypothesis semi-supervised deep learning framework reliably separates familial and non-familial ADHD and shows that advanced semi-supervised deep learning techniques can deliver robust, generalizable neurobiological markers for neurodevelopmental disorders.
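For readers unfamiliar with the three-step selection, here is a hedged scikit-learn sketch of a t-test-style filter followed by mutual-information ranking and a Lasso step. The k values, cross-validation settings, and synthetic data are placeholders, not the study's actual configuration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import LassoCV

# Synthetic stand-in: 438 subjects x 425 cortical/white-matter metrics.
rng = np.random.default_rng(0)
X = rng.normal(size=(438, 425))
y = rng.integers(0, 2, size=438)  # e.g., ADHD-F vs controls

# Step 1: univariate filter (for two classes, f_classif is the t-test, F = t^2).
k1 = SelectKBest(f_classif, k=200).fit(X, y)
X1 = k1.transform(X)

# Step 2: mutual-information ranking.
k2 = SelectKBest(mutual_info_classif, k=50).fit(X1, y)
X2 = k2.transform(X1)

# Step 3: Lasso keeps only features with nonzero coefficients.
lasso = LassoCV(cv=5).fit(X2, y)
keep = np.flatnonzero(lasso.coef_)
print(f"{keep.size} features survive all three filters")
```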

Gupta P, Singh S, Gulati A, Dutta N, Aggarwal Y, Kalra N, Premkumar M, Taneja S, Verma N, De A, Duseja A

PubMed · Aug 19 2025
To develop and validate a deep learning model for automated detection of advanced liver fibrosis using standard T2-weighted MRI. We utilized two datasets: the public CirrMRI600+ dataset (n = 374), containing T2-weighted MRI scans from patients with cirrhosis (n = 318) and healthy subjects (n = 56), and an in-house dataset of chronic liver disease patients (n = 187). A two-stage deep learning pipeline was developed: first, an automated liver segmentation model based on the nnU-Net architecture, trained on CirrMRI600+ and then applied to segment livers in our in-house dataset; second, a Masked Attention ResNet classification model. For classification model training, patients with liver stiffness measurement (LSM) > 12 kPa were classified as advanced fibrosis (n = 104), while healthy subjects from CirrMRI600+ and patients with LSM ≤ 12 kPa were classified as non-advanced fibrosis (n = 116). Model validation was performed exclusively on a separate test set of 23 patients with histopathological confirmation of the degree of fibrosis (METAVIR ≥ F3 indicating advanced fibrosis). We additionally compared our two-stage approach with direct classification without segmentation and evaluated alternative architectures, including DenseNet121 and SwinTransformer. The liver segmentation model performed excellently on the test set (mean Dice score: 0.960 ± 0.009; IoU: 0.923 ± 0.016). On the pathologically confirmed independent test set (n = 23), our two-stage model achieved strong diagnostic performance (sensitivity: 0.778, specificity: 0.800, AUC: 0.811, accuracy: 0.783), significantly outperforming direct classification without segmentation (AUC: 0.743). Classification performance was highly dependent on segmentation quality: cases with excellent segmentation (Score 1) showed higher accuracy (0.818) than those with poor segmentation (Score 3, accuracy: 0.625). Alternative architectures with masked attention showed comparable but slightly lower performance (DenseNet121: AUC 0.795; SwinTransformer: AUC 0.782). Our fully automated deep learning pipeline effectively detects advanced liver fibrosis using standard non-contrast T2-weighted MRI, potentially offering a non-invasive alternative to current diagnostic approaches. The segmentation-first approach provides significant performance gains over direct classification.
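As a rough illustration of the segmentation-first idea, the sketch below hard-masks a T2-weighted slice with a stand-in liver mask before classification. A stock ResNet-18 substitutes for the paper's Masked Attention ResNet, whose internals the abstract does not describe; the shapes and the hard-masking choice are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MaskedLiverClassifier(nn.Module):
    """Sketch of segmentation-first classification: restrict the classifier's
    input to the predicted liver region. Not the paper's architecture."""
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        # Single-channel input for MRI; binary output: advanced vs non-advanced.
        backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        backbone.fc = nn.Linear(backbone.fc.in_features, 2)
        self.backbone = backbone

    def forward(self, t2_slice: torch.Tensor, liver_mask: torch.Tensor) -> torch.Tensor:
        # Hard masking: zero out everything outside the predicted liver.
        return self.backbone(t2_slice * liver_mask)

model = MaskedLiverClassifier()
t2 = torch.randn(2, 1, 224, 224)                    # T2-weighted slices
mask = (torch.rand(2, 1, 224, 224) > 0.5).float()   # stand-in nnU-Net output
print(model(t2, mask).shape)  # torch.Size([2, 2])
```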

Sng GGR, Xiang Y, Lim DYZ, Tung JYM, Tan JH, Chng CL

PubMed · Aug 19 2025
Thyroid nodules are common, with ultrasound imaging as the primary modality for their assessment. Risk stratification systems like the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) have been developed but suffer from interobserver variability and low specificity. Artificial intelligence, particularly large language models (LLMs) with multimodal capabilities, presents opportunities for efficient end-to-end diagnostic processes. However, their clinical utility remains uncertain. This study evaluates the accuracy and consistency of multimodal LLMs for thyroid nodule risk stratification using the ACR TI-RADS system, examining the effects of model fine-tuning, image annotation, and prompt engineering, and comparing open-source versus commercial models. In total, 3 multimodal vision-language models were evaluated: the open-source Large Language and Vision Assistant (LLaVA) model, its medically fine-tuned variant (Large Language and Vision Assistant for bioMedicine [LLaVA-Med]), and OpenAI's commercial o3 model. A total of 192 thyroid nodules from publicly available ultrasound image datasets were assessed. Each model was evaluated using 2 prompts (basic and modified) and 2 image scenarios (unlabeled vs radiologist-annotated), yielding 6912 responses. Model outputs were compared with expert ratings for accuracy and consistency. Statistical comparisons included chi-square tests, Mann-Whitney U tests, and Fleiss' kappa for interrater reliability. Overall, 88.4% (6110/6912) of responses were valid, with the o3 model producing the highest validity rate (2273/2304, 98.6%), followed by LLaVA (2108/2304, 91.5%) and LLaVA-Med (1729/2304, 75%; P<.001). The o3 model demonstrated the highest accuracy overall, achieving up to 57.3% accuracy in TI-RADS classification, though this remained suboptimal. Labeled images improved accuracy marginally in nodule margin assessment, and only for the LLaVA models (407/768, 53% to 447/768, 58.2%; P=.04). Prompt engineering improved accuracy for composition (649/1152, 56.3% vs 483/1152, 41.9%; P<.001) but significantly reduced accuracy for shape, margins, and overall classification. Consistency was highest with the o3 model (up to 85.4%), was comparable for LLaVA, and improved significantly with image labeling and modified prompts across multiple TI-RADS categories (P<.001). Subgroup analysis for o3 alone showed that prompt engineering did not significantly affect accuracy but markedly improved consistency across all TI-RADS categories (up to 97.1% for shape, P<.001). Interrater reliability was consistently poor across all combinations (Fleiss' kappa<0.60). The study demonstrates the comparative advantages and limitations of multimodal LLMs for thyroid nodule risk stratification. While the commercial model (o3) consistently outperformed the open-source models in accuracy and consistency, even the best-performing model's outputs remained suboptimal for direct clinical deployment. Prompt engineering significantly enhanced output consistency, particularly in the commercial model. These findings underline the importance of strategic model optimization techniques and highlight areas requiring further development before multimodal LLMs can be reliably used in clinical thyroid imaging workflows.
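Interrater reliability of the kind reported here is typically computed with Fleiss' kappa; a minimal statsmodels example follows, with random stand-in ratings rather than the study's data.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical ratings: 192 nodules x 3 raters, each assigning a
# TI-RADS level 1-5. Values are random stand-ins for illustration.
rng = np.random.default_rng(42)
ratings = rng.integers(1, 6, size=(192, 3))

table, _ = aggregate_raters(ratings)   # counts per (subject, category)
kappa = fleiss_kappa(table)
print(f"Fleiss' kappa = {kappa:.2f}")  # < 0.60 indicates poor reliability
```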

Luo X, Wang Y, Ou-Yang L

PubMed · Aug 19 2025
Accurate segmentation of ultrasound images plays a critical role in disease screening and diagnosis. Recently, neural network-based methods have garnered significant attention for their potential to improve ultrasound image segmentation. However, these methods still face significant challenges, primarily due to inherent issues in ultrasound images, such as low resolution, speckle noise, and artifacts. Additionally, ultrasound image segmentation encompasses a wide range of scenarios, including organ segmentation (e.g., cardiac and fetal head) and lesion segmentation (e.g., breast cancer and thyroid nodules), making the task highly diverse and complex. Existing methods are often designed for specific segmentation scenarios, which limits their flexibility and their ability to meet the diverse needs across scenarios. To address these challenges, we propose a novel Localized and Globalized Frequency Fusion Model (LGFFM) for ultrasound image segmentation. Specifically, we first design a Parallel Bi-Encoder (PBE) architecture that integrates Local Feature Blocks (LFB) and Global Feature Blocks (GFB) to enhance feature extraction. Additionally, we introduce a Frequency Domain Mapping Module (FDMM) to capture texture information, particularly high-frequency details such as edges. Finally, a Multi-Domain Fusion (MDF) method is developed to effectively integrate features across different domains. We conduct extensive experiments on eight representative public ultrasound datasets across four different types. The results demonstrate that LGFFM outperforms current state-of-the-art methods in both segmentation accuracy and generalization performance.
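The FDMM is described only at a high level; the following PyTorch sketch shows one plausible frequency-domain mapping, an FFT followed by a hard high-pass mask to emphasize edge and texture information. The cutoff value and the hard-mask design are assumptions, not the paper's module.

```python
import torch

def frequency_domain_mapping(x: torch.Tensor, cutoff: float = 0.1) -> torch.Tensor:
    """Toy frequency-domain branch: FFT the feature map, suppress low
    frequencies to emphasize edges/texture, and invert. Illustrative only."""
    B, C, H, W = x.shape
    freq = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
    # Centered high-pass mask: keep normalized radii above `cutoff`.
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij"
    )
    high_pass = ((yy ** 2 + xx ** 2).sqrt() > cutoff).to(x.dtype)
    freq = freq * high_pass
    out = torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1)), norm="ortho")
    return out.real

edges = frequency_domain_mapping(torch.randn(1, 3, 64, 64))
print(edges.shape)  # torch.Size([1, 3, 64, 64])
```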

Zhang J, Xu K, Lu L, Lu C, Tao X, Liu Y, Yu J, Meng J, Zhang DW, Wang T, Chen L

PubMed · Aug 19 2025
Brain-inspired neuromorphic computing offers significant potential for efficient and adaptive computational platforms. Emerging ferroelectric and antiferroelectric HfZrOₓ devices play key roles in convolutional neural network (CNN) and spiking neural network (SNN) computing thanks to their unique polarization switching characteristics. Here, we present ferroelectric/antiferroelectric HfZrOₓ devices that realize the functions of artificial synapses/neurons through element doping engineering. The HfZrOₓ-based ferroelectric and antiferroelectric devices exhibit excellent endurance characteristics of 1 × 10⁹ cycles. Based on the non-volatile polarization switching and spontaneous depolarization of ferroelectric and antiferroelectric devices, integrate-and-fire behaviors were constructed for neuromorphic computing. For the first time, a complementary ferroelectric/antiferroelectric HfZrOₓ artificial synapse/neuron-based hybrid CNN-SNN framework was constructed for energy-efficient cardiac magnetic resonance imaging (MRI) classification. The hybrid neural network breaks the limitation of pure SNNs in 3D image recognition and improves accuracy from 82.3% to 92.7% compared to a pure CNN, highlighting the potential of composition-engineered ferroelectric materials for high-efficiency neuromorphic computing.
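To make the integrate-and-fire behavior concrete, here is a toy leaky integrate-and-fire neuron in Python: a membrane variable accumulates input and resets on firing, loosely analogous to polarization build-up and spontaneous depolarization in an antiferroelectric device. All constants are illustrative, not device parameters from the paper.

```python
import numpy as np

def integrate_and_fire(inputs, threshold=1.0, leak=0.9):
    """Minimal leaky integrate-and-fire neuron: the membrane potential
    leaks, accumulates input, and resets when it crosses threshold."""
    v, spikes = 0.0, []
    for i in inputs:
        v = leak * v + i        # leaky integration
        if v >= threshold:
            spikes.append(1)
            v = 0.0             # depolarization-like reset
        else:
            spikes.append(0)
    return spikes

rng = np.random.default_rng(1)
print(integrate_and_fire(rng.uniform(0, 0.5, size=20)))
```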

Bongiolatti GMCR, Masala N, Bastiaansen JAM, Yerly J, Prša M, Rutz T, Tenisch E, Si-Mohamed S, Stuber M, Roy CW

PubMed · Aug 19 2025
To reconstruct whole-heart images from free-running acquisitions through automated selection of data acceptance windows (ES: end-systole, MD: mid-diastole, ED: end-diastole) that account for heart rate variability (HRV). SYMPHONIC was developed and validated in simulated (N = 1000) and volunteer (N = 14) data. To validate SYMPHONIC, the position of the detected acceptance windows, their total duration, and the resulting ventricular volume were compared to the simulated ground truth to establish metrics for temporal error, quiescent interval duration, and volumetric error, respectively. SYMPHONIC MD images and those using manually defined acceptance windows with fixed (MANUAL_FIXED) or adaptive (MANUAL_ADAPT) width were compared by measuring vessel sharpness (VS). The impact of HRV was assessed in patients (N = 6). Mean temporal error was larger for MD than for ES and ED in both simulations and volunteers. Mean volumetric errors were comparable. Interval duration differed for ES (p = 0.04) and ED (p < 10⁻³), but not for MD (p = 0.08). In simulations, SYMPHONIC and MANUAL_ADAPT provided consistent VS with increasing HRV, while VS decreased for MANUAL_FIXED. In volunteers, VS differed between MANUAL_ADAPT and MANUAL_FIXED (p < 0.01), but not between SYMPHONIC and MANUAL_ADAPT (p = 0.03) or MANUAL_FIXED (p = 0.42). SYMPHONIC accurately detected quiescent cardiac phases in free-running data and yielded high-quality whole-heart images despite the presence of HRV.
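A simplified picture of HRV-adaptive acceptance windows, under stated assumptions: each beat's window is centered at a fixed fraction of its own RR interval (roughly mid-diastole) and scales in width with that interval, so heart-rate variability shifts and stretches the window per beat. The fractions below are illustrative and not SYMPHONIC's actual selection logic.

```python
import numpy as np

def adaptive_windows(r_peaks, frac_center=0.7, frac_width=0.15):
    """Per-beat acceptance windows whose position and width adapt to the
    RR interval. Hypothetical constants, for illustration only."""
    windows = []
    for t0, t1 in zip(r_peaks[:-1], r_peaks[1:]):
        rr = t1 - t0
        center = t0 + frac_center * rr
        half = 0.5 * frac_width * rr
        windows.append((center - half, center + half))
    return windows

# Simulated R-peak times (seconds) with heart-rate variability.
rng = np.random.default_rng(3)
peaks = np.cumsum(rng.normal(0.9, 0.08, size=8))
for w in adaptive_windows(peaks):
    print(f"accept {w[0]:.2f}-{w[1]:.2f} s")
```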

Syme CA, Cicero MD, Adachi JD, Berger C, Morin SN, Goltzman D, Bilbily A

PubMed · Aug 19 2025
Fracture risk is commonly assessed by FRAX, a tool that estimates 10-year risk for major osteoporotic fracture (MOF) and hip fracture. FRAX scores are often refined by additionally including femoral neck (FN) bone mineral density (BMD) measured by dual-energy x-ray absorptiometry (DXA) as an input. Rho™, a novel AI-powered software tool, estimates FN BMD T-Scores from conventional x-rays, even when the FN is not in the image. Whether a FRAX score using this estimate (FRAX-Rho) can improve a FRAX score without a T-Score input (FRAX-NoT) has not been studied. We conducted a retrospective analysis of Canadian Multicentre Osteoporosis Study participants who had x-rays of the lumbar and/or thoracic spine, FRAX risk factors, and DXA T-Scores acquired at the same time point, and follow-up fracture outcomes over 9 years. In 1361 participants with lumbar x-rays, FRAX-Rho and FRAX with DXA FN T-Scores (FRAX-DXA) had very good agreement in categorizing participants by MOF risk (Cohen's weighted kappa κ=0.80 [0.77-0.82]), which tended to be better than that between FRAX-NoT and FRAX-DXA (0.76 [0.73-0.79]). Agreement in categorizing participants by hip fracture risk was significantly greater between FRAX-Rho and FRAX-DXA (0.67 [0.63-0.71]) than between FRAX-NoT and FRAX-DXA (0.52 [0.48-0.56]). In predicting true incident MOF, FRAX-Rho and FRAX-DXA did not differ in their discriminative power (c-index) (0.76 and 0.77; p=0.36), and both were significantly greater than that of FRAX-NoT (0.73; p<0.004). The accuracy of FRAX-Rho for predicting MOF (Brier Score) was better than FRAX-NoT (p<0.05) but not as good as FRAX-DXA. Similar results were observed in participants with thoracic x-rays. In conclusion, FN T-Scores estimated by Rho from lumbar and thoracic x-rays add value to FRAX-NoT estimates and may be useful for risk assessment when DXA is not available.
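The agreement and discrimination metrics reported here are standard and easy to reproduce; the sketch below computes a weighted Cohen's kappa, an AUC-style c-index (the two coincide for a binary outcome without censoring), and a Brier score on hypothetical data. The quadratic weighting is an assumption; the abstract does not state the weight scheme used.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, roc_auc_score, brier_score_loss

# Hypothetical data: risk categories (0=low, 1=moderate, 2=high) from two
# FRAX variants, plus predicted probabilities and observed fractures.
rng = np.random.default_rng(7)
frax_dxa = rng.integers(0, 3, size=300)
frax_rho = np.clip(frax_dxa + rng.integers(-1, 2, size=300), 0, 2)
fracture = rng.integers(0, 2, size=300)
p_mof = rng.uniform(0, 1, size=300)

print("weighted kappa:", cohen_kappa_score(frax_dxa, frax_rho, weights="quadratic"))
print("c-index (binary, uncensored = AUC):", roc_auc_score(fracture, p_mof))
print("Brier score:", brier_score_loss(fracture, p_mof))
```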

Kang Lin, Anselm Krainovic, Kun Wang, Reinhard Heckel

arXiv preprint · Aug 19 2025
Deep neural networks achieve state-of-the-art results for accelerated MRI reconstruction. Most research on deep-learning-based imaging focuses on improving neural network architectures trained and evaluated on fixed, homogeneous training and evaluation data. In this work, we investigate data curation strategies for improving MRI reconstruction. We assemble a large dataset of raw k-space data from 18 public sources consisting of 1.1M images and construct a diverse evaluation set comprising 48 test sets, capturing variations in anatomy, contrast, number of coils, and other key factors. We propose and study different data filtering strategies to enhance the performance of current state-of-the-art neural networks for accelerated MRI reconstruction. Our experiments show that filtering the training data leads to consistent, albeit modest, performance gains. These performance gains are robust across different training set sizes and accelerations, and we find that filtering is particularly beneficial when the proportion of in-distribution data in the unfiltered training set is low.
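The filtering strategies themselves vary, but they share a score-and-threshold mechanic, sketched below with a hypothetical per-sample quality score; the scoring function is the open design choice the paper investigates, and the names here are placeholders.

```python
import numpy as np

def filter_training_set(samples, scores, keep_fraction=0.8):
    """Generic score-and-threshold filter: rank training samples by some
    quality or distribution-match score and keep the top fraction."""
    order = np.argsort(scores)[::-1]          # best-scoring first
    n_keep = int(keep_fraction * len(samples))
    return [samples[i] for i in order[:n_keep]]

slices = [f"kspace_{i:04d}" for i in range(10)]   # hypothetical sample IDs
quality = np.random.default_rng(5).uniform(size=10)
print(filter_training_set(slices, quality, keep_fraction=0.5))
```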

Mohammad Areeb Qazi, Munachiso S Nwadike, Ibrahim Almakky, Mohammad Yaqub, Numan Saeed

arXiv preprint · Aug 19 2025
Foundational models are trained on extensive datasets to capture the general trends of a domain. However, in medical imaging, the scarcity of data makes pre-training for every domain, modality, or task challenging. Continual learning offers a solution by fine-tuning a model sequentially on different domains or tasks, enabling it to integrate new knowledge without requiring large datasets for each training phase. In this paper, we propose UNIfied CONtinual Learning for Medical Foundational Models (UNICON), a framework that enables the seamless adaptation of foundation models to diverse domains, tasks, and modalities. Unlike conventional adaptation methods that treat these changes in isolation, UNICON provides a unified, perpetually expandable framework. Through careful integration, we show that foundation models can dynamically expand across imaging modalities, anatomical regions, and clinical objectives without catastrophic forgetting or task interference. Empirically, we validate our approach by adapting a chest CT foundation model initially trained for classification to prognosis and segmentation tasks. Our results show improved performance across both additional tasks. Furthermore, we continually incorporated PET scans and achieved a 5% improvement in Dice score compared to respective baselines. These findings establish that foundation models are not inherently constrained to their initial training scope but can evolve, paving the way toward generalist AI models for medical imaging.
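One simple way to realize "expand without overwriting", in the spirit of (but much simpler than) UNICON, is a frozen backbone with per-task heads; the PyTorch sketch below illustrates that idea only and is not the paper's method.

```python
import torch
import torch.nn as nn

class ExpandableModel(nn.Module):
    """Frozen shared backbone plus per-task heads: new tasks add parameters
    instead of overwriting old ones, avoiding catastrophic forgetting in
    this toy setting. Illustrative, not UNICON's actual mechanism."""
    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False          # protect prior knowledge
        self.heads = nn.ModuleDict()
        self.feat_dim = feat_dim

    def add_task(self, name: str, out_dim: int):
        self.heads[name] = nn.Linear(self.feat_dim, out_dim)

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        return self.heads[task](self.backbone(x))

model = ExpandableModel(nn.Sequential(nn.Flatten(), nn.Linear(32, 128)), feat_dim=128)
model.add_task("classification", 4)
model.add_task("prognosis", 1)               # later task: new head only
x = torch.randn(2, 32)
print(model(x, "prognosis").shape)  # torch.Size([2, 1])
```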