
Classification of familial and non-familial ADHD using auto-encoding network and binary hypothesis testing

Baboli, R., Martin, E., Qiu, Q., Zhao, L., Liu, T., Li, X.

medrxiv preprint · Aug 19 2025
Family history is one of the most powerful risk factors for attention-deficit/hyperactivity disorder (ADHD), yet no study has tested whether multimodal magnetic resonance imaging (MRI) combined with deep learning can separate familial ADHD (ADHD-F) from non-familial ADHD (ADHD-NF). T1-weighted and diffusion-weighted MRI data from 438 children (129 ADHD-F, 159 ADHD-NF, and 150 controls) were parcellated into 425 cortical and white-matter metrics. Our pipeline combined three feature-selection steps (t-test filtering, mutual-information ranking, and Lasso) with an auto-encoder and applied the binary-hypothesis strategy throughout: each held-out subject was assigned both possible labels in turn and evaluated under leave-one-out testing nested within five-fold cross-validation. Accuracy, sensitivity, specificity, and area under the curve (AUC) quantified performance. The model achieved accuracies/AUCs of 0.66/0.67 for ADHD-F vs controls, 0.67/0.70 for ADHD-NF vs controls, and 0.62/0.67 for ADHD-F vs ADHD-NF. In classification between ADHD-F and controls, the most informative metrics were the mean diffusivity (MD) of the right fornix, the MD of the left parahippocampal cingulum, and the cortical thickness of the right inferior parietal cortex. In classification between ADHD-NF and controls, the key contributors were the fractional anisotropy (FA) of the left inferior fronto-occipital fasciculus, the MD of the right fornix, and the cortical thickness of the right medial orbitofrontal cortex. In classification between ADHD-F and ADHD-NF, the highlighted features were the volume of the left cingulate cingulum tract, the volume of the right parietal segment of the superior longitudinal fasciculus, and the cortical thickness of the right fusiform cortex. Our binary-hypothesis semi-supervised deep learning framework reliably separates familial and non-familial ADHD and shows that advanced semi-supervised deep learning techniques can deliver robust, generalizable neurobiological markers for neurodevelopmental disorders.
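For readers who want to prototype the three-step feature-selection cascade described above (t-test filter, mutual-information ranking, Lasso), a minimal scikit-learn sketch follows; the p-value threshold, top-k cutoff, and function name are illustrative assumptions, not the paper's settings.

```python
# Hypothetical sketch of a t-test -> mutual-information -> Lasso cascade.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LassoCV

def select_features(X, y, p_thresh=0.05, top_k=100):
    # Step 1: keep metrics whose group means differ (two-sample t-test).
    _, p = ttest_ind(X[y == 0], X[y == 1], axis=0)
    keep = np.where(p < p_thresh)[0]
    # Step 2: rank survivors by mutual information with the label.
    mi = mutual_info_classif(X[:, keep], y, random_state=0)
    keep = keep[np.argsort(mi)[::-1][:top_k]]
    # Step 3: Lasso zeroes out redundant metrics; keep nonzero coefficients.
    lasso = LassoCV(cv=5).fit(X[:, keep], y)
    return keep[lasso.coef_ != 0]

# Toy usage: 438 subjects x 425 cortical/white-matter metrics.
rng = np.random.default_rng(0)
X = rng.normal(size=(438, 425))
y = rng.integers(0, 2, size=438)
selected = select_features(X, y)
```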

Advanced liver fibrosis detection using a two-stage deep learning approach on standard T2-weighted MRI.

Gupta P, Singh S, Gulati A, Dutta N, Aggarwal Y, Kalra N, Premkumar M, Taneja S, Verma N, De A, Duseja A

pubmed · Aug 19 2025
To develop and validate a deep learning model for automated detection of advanced liver fibrosis using standard T2-weighted MRI. We utilized two datasets: the public CirrMRI600+ dataset (n = 374), containing T2-weighted MRI scans from patients with cirrhosis (n = 318) and healthy subjects (n = 56), and an in-house dataset of chronic liver disease patients (n = 187). A two-stage deep learning pipeline was developed: first, an automated liver segmentation model based on the nnU-Net architecture was trained on CirrMRI600+ and then applied to segment livers in our in-house dataset; second, a Masked Attention ResNet classification model operated on the segmented livers. For classification model training, patients with liver stiffness measurement (LSM) > 12 kPa were classified as advanced fibrosis (n = 104), while healthy subjects from CirrMRI600+ and patients with LSM ≤ 12 kPa were classified as non-advanced fibrosis (n = 116). Model validation was performed exclusively on a separate test set of 23 patients with histopathological confirmation of the degree of fibrosis (METAVIR ≥ F3 indicating advanced fibrosis). We additionally compared our two-stage approach with direct classification without segmentation and evaluated alternative architectures, including DenseNet121 and SwinTransformer. The liver segmentation model performed excellently on the test set (mean Dice score: 0.960 ± 0.009; IoU: 0.923 ± 0.016). On the pathologically confirmed independent test set (n = 23), our two-stage model achieved strong diagnostic performance (sensitivity: 0.778, specificity: 0.800, AUC: 0.811, accuracy: 0.783), significantly outperforming direct classification without segmentation (AUC: 0.743). Classification performance was highly dependent on segmentation quality: cases with excellent segmentation (Score 1) showed higher accuracy (0.818) than those with poor segmentation (Score 3; accuracy: 0.625). Alternative architectures with masked attention showed comparable but slightly lower performance (DenseNet121: AUC 0.795; SwinTransformer: AUC 0.782). Our fully automated deep learning pipeline effectively detects advanced liver fibrosis using standard non-contrast T2-weighted MRI, potentially offering a non-invasive alternative to current diagnostic approaches. The segmentation-first approach provides significant performance gains over direct classification.
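The segmentation-first design can be sketched roughly as follows: the predicted liver mask zeroes out non-hepatic tissue before a ResNet classifies the slice. This is a hedged approximation, assuming simple multiplicative masking; the paper's Masked Attention ResNet may integrate the mask differently.

```python
# Minimal sketch of segmentation-gated classification (assumed masking scheme).
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MaskedFibrosisClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = resnet18(num_classes=2)
        # T2-weighted slices are single-channel; adapt the first conv layer.
        self.backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                        padding=3, bias=False)

    def forward(self, image, liver_mask):
        # Zero out non-liver voxels so features come from the organ only.
        return self.backbone(image * liver_mask)

model = MaskedFibrosisClassifier()
image = torch.randn(4, 1, 224, 224)                 # batch of T2 slices
mask = (torch.rand(4, 1, 224, 224) > 0.5).float()   # stand-in nnU-Net output
logits = model(image, mask)                         # shape (4, 2)
```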

A Multimodal Large Language Model as an End-to-End Classifier of Thyroid Nodule Malignancy Risk: Usability Study.

Sng GGR, Xiang Y, Lim DYZ, Tung JYM, Tan JH, Chng CL

pubmed · Aug 19 2025
Thyroid nodules are common, with ultrasound imaging as the primary modality for their assessment. Risk stratification systems like the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) have been developed but suffer from interobserver variability and low specificity. Artificial intelligence, particularly large language models (LLMs) with multimodal capabilities, presents opportunities for efficient end-to-end diagnostic processes. However, their clinical utility remains uncertain. This study evaluates the accuracy and consistency of multimodal LLMs for thyroid nodule risk stratification using the ACR TI-RADS system, examining the effects of model fine-tuning, image annotation, and prompt engineering, and comparing open-source versus commercial models. In total, 3 multimodal vision-language models were evaluated: the open-source Large Language and Vision Assistant (LLaVA) model, its medically fine-tuned variant (Large Language and Vision Assistant for bioMedicine [LLaVA-Med]), and OpenAI's commercial o3 model. A total of 192 thyroid nodules from publicly available ultrasound image datasets were assessed. Each model was evaluated using 2 prompts (basic and modified) and 2 image scenarios (unlabeled vs radiologist-annotated), yielding 6912 responses. Model outputs were compared with expert ratings for accuracy and consistency. Statistical comparisons included chi-square tests, Mann-Whitney U tests, and Fleiss' kappa for interrater reliability. Overall, 88.4% (6110/6912) of responses were valid, with the o3 model producing the highest validity rate (2273/2304, 98.6%), followed by LLaVA (2108/2304, 91.5%) and LLaVA-Med (1729/2304, 75.0%; P<.001). The o3 model demonstrated the highest accuracy overall, achieving up to 57.3% accuracy in TI-RADS classification, although still remaining suboptimal. Labeled images improved accuracy marginally, and only in nodule margin assessment for the LLaVA models (407/768, 53.0% to 447/768, 58.2%; P=.04). Prompt engineering improved accuracy for composition (649/1152, 56.3% vs 483/1152, 41.9%; P<.001) but significantly reduced accuracy for shape, margins, and overall classification. Consistency was highest with the o3 model (up to 85.4%), was comparable for LLaVA, and improved significantly with image labeling and modified prompts across multiple TI-RADS categories (P<.001). Subgroup analysis for o3 alone showed that prompt engineering did not significantly affect accuracy but markedly improved consistency across all TI-RADS categories (up to 97.1% for shape; P<.001). The study demonstrates the comparative advantages and limitations of multimodal LLMs for thyroid nodule risk stratification. While the commercial model (o3) consistently outperformed the open-source models in accuracy and consistency, even the best-performing model's outputs remained suboptimal for direct clinical deployment (interrater reliability was consistently poor across all combinations; Fleiss' kappa < 0.60). Prompt engineering significantly enhanced output consistency, particularly in the commercial model. These findings underline the importance of strategic model optimization and highlight areas requiring further development before multimodal LLMs can be reliably used in clinical thyroid imaging workflows.
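Interrater (here, inter-run) reliability of repeated LLM outputs can be quantified with Fleiss' kappa as sketched below; the response matrix is synthetic, and the six-runs-per-nodule setup is an assumption for illustration.

```python
# Hedged sketch: tabulate repeated TI-RADS ratings and compute Fleiss' kappa.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = nodules, columns = repeated model runs, values = TI-RADS level (1-5)
rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(192, 6))

table, _ = aggregate_raters(responses)   # counts per category per nodule
kappa = fleiss_kappa(table)
print(f"Fleiss' kappa = {kappa:.2f}")    # < 0.60 would match the paper's finding
```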

LGFFM: A Localized and Globalized Frequency Fusion Model for Ultrasound Image Segmentation.

Luo X, Wang Y, Ou-Yang L

pubmed · Aug 19 2025
Accurate segmentation of ultrasound images plays a critical role in disease screening and diagnosis. Recently, neural network-based methods have garnered significant attention for their potential to improve ultrasound image segmentation. However, these methods still face significant challenges, primarily due to inherent issues in ultrasound images such as low resolution, speckle noise, and artifacts. Additionally, ultrasound image segmentation encompasses a wide range of scenarios, including organ segmentation (e.g., cardiac and fetal head) and lesion segmentation (e.g., breast cancer and thyroid nodules), making the task highly diverse and complex. Existing methods are often designed for specific segmentation scenarios, which limits their flexibility and their ability to meet the diverse needs of these scenarios. To address these challenges, we propose a novel Localized and Globalized Frequency Fusion Model (LGFFM) for ultrasound image segmentation. Specifically, we first design a Parallel Bi-Encoder (PBE) architecture that integrates Local Feature Blocks (LFB) and Global Feature Blocks (GFB) to enhance feature extraction. Additionally, we introduce a Frequency Domain Mapping Module (FDMM) to capture texture information, particularly high-frequency details such as edges. Finally, a Multi-Domain Fusion (MDF) method is developed to effectively integrate features across the different domains. We conduct extensive experiments on eight representative public ultrasound datasets of four different types. The results demonstrate that LGFFM outperforms current state-of-the-art methods in both segmentation accuracy and generalization performance.
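The frequency-domain idea behind a module like FDMM can be illustrated with a 2D FFT and a radial high-pass mask that keeps edge and texture detail; this is our assumption of one plausible form, not the paper's exact module.

```python
# Illustrative high-frequency extraction via FFT high-pass filtering (assumed).
import torch

def high_frequency_component(x, cutoff=0.1):
    """x: (B, C, H, W) feature map; cutoff: fraction of spectrum to suppress."""
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    B, C, H, W = x.shape
    yy, xx = torch.meshgrid(torch.linspace(-0.5, 0.5, H),
                            torch.linspace(-0.5, 0.5, W), indexing="ij")
    mask = ((xx ** 2 + yy ** 2).sqrt() > cutoff).to(x.dtype)  # keep high freqs
    filtered = freq * mask
    return torch.fft.ifft2(torch.fft.ifftshift(filtered, dim=(-2, -1))).real

edges = high_frequency_component(torch.randn(2, 3, 128, 128))
```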

Automated adaptive detection and reconstruction of quiescent cardiac phases in free-running whole-heart acquisitions using Synchronicity Maps from PHysiological mOtioN In Cine (SYMPHONIC) MRI.

Bongiolatti GMCR, Masala N, Bastiaansen JAM, Yerly J, Prša M, Rutz T, Tenisch E, Si-Mohamed S, Stuber M, Roy CW

pubmed · Aug 19 2025
To reconstruct whole-heart images from free-running acquisitions through automated selection of data acceptance windows (ES: end-systole, MD: mid-diastole, ED: end-diastole) that account for heart rate variability (HRV). SYMPHONIC was developed and validated in simulated (N = 1000) and volunteer (N = 14) data. To validate SYMPHONIC, the position of the detected acceptance windows, their total duration, and the resulting ventricular volume were compared to the simulated ground truth to establish metrics for temporal error, quiescent interval duration, and volumetric error, respectively. SYMPHONIC MD images and those using manually defined acceptance windows with fixed (MANUAL_FIXED) or adaptive (MANUAL_ADAPT) width were compared by measuring vessel sharpness (VS). The impact of HRV was assessed in patients (N = 6). Mean temporal error was larger for MD than for ES and ED in both simulations and volunteers. Mean volumetric errors were comparable. Interval duration differed for ES (p = 0.04) and ED (p < 10^-3), but not for MD (p = 0.08). In simulations, SYMPHONIC and MANUAL_ADAPT provided consistent VS with increasing HRV, while VS decreased for MANUAL_FIXED. In volunteers, VS differed between MANUAL_ADAPT and MANUAL_FIXED (p < 0.01), but not between SYMPHONIC and MANUAL_ADAPT (p = 0.03) or MANUAL_FIXED (p = 0.42). SYMPHONIC accurately detected quiescent cardiac phases in free-running data and yielded high-quality whole-heart images despite the presence of HRV.
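Conceptually, quiescent-phase detection amounts to finding low-motion windows in a physiological motion signal. The sketch below uses a rolling-variance heuristic as a stand-in; SYMPHONIC's synchronicity maps are considerably more sophisticated.

```python
# Simplified quiescent-window detection via rolling variance (assumption).
import numpy as np

def quiescent_windows(motion, win=15, n_windows=2):
    """motion: 1D motion amplitude over cardiac phases; returns window starts."""
    # Rolling variance: quiescence = locally flat signal.
    var = np.array([motion[i:i + win].var() for i in range(len(motion) - win)])
    order = np.argsort(var)
    starts, taken = [], np.zeros(len(var), dtype=bool)
    for i in order:                      # greedily pick non-overlapping minima
        if not taken[max(0, i - win):i + win].any():
            starts.append(i)
            taken[i] = True
            if len(starts) == n_windows:
                break
    return sorted(starts)

phases = np.linspace(0, 2 * np.pi, 200)
motion = np.abs(np.sin(phases)) + 0.02 * np.random.default_rng(0).normal(size=200)
print(quiescent_windows(motion))         # near the flat troughs of the cycle
```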

Fracture Risk Scores Using Output from an Opportunistic Screen of Low Bone Density from Conventional X-ray.

Syme CA, Cicero MD, Adachi JD, Berger C, Morin SN, Goltzman D, Bilbily A

pubmed · Aug 19 2025
Fracture risk is commonly assessed by FRAX, a tool that estimates 10-year risk for major osteoporotic fracture (MOF) and hip fracture. FRAX scores are often refined by additionally including femoral neck (FN) bone mineral density (BMD), measured by dual-energy x-ray absorptiometry (DXA), as an input. Rho™, a novel AI-powered software, estimates FN BMD T-Scores from conventional x-rays, even when the FN is not in the image. Whether a FRAX score using this estimate (FRAX-Rho) improves on a FRAX score without a T-Score input (FRAX-NoT) has not been studied. We conducted a retrospective analysis of Canadian Multicentre Osteoporosis Study participants who had x-rays of the lumbar and/or thoracic spine, FRAX risk factors, and DXA T-Scores acquired at the same time point, and follow-up fracture outcomes over 9 years. In 1361 participants with lumbar x-rays, FRAX-Rho and FRAX with DXA FN T-Scores (FRAX-DXA) had very good agreement in categorizing participants by MOF risk (Cohen's weighted kappa κ = 0.80 [0.77-0.82]), which tended to be better than the agreement between FRAX-NoT and FRAX-DXA (0.76 [0.73-0.79]). Agreement in categorizing participants by hip fracture risk was significantly greater between FRAX-Rho and FRAX-DXA (0.67 [0.63-0.71]) than between FRAX-NoT and FRAX-DXA (0.52 [0.48-0.56]). In predicting true incident MOF, FRAX-Rho and FRAX-DXA did not differ in their discriminative power (c-index: 0.76 and 0.77; p = 0.36), and both were significantly more discriminative than FRAX-NoT (0.73; p < 0.004). The accuracy of FRAX-Rho for predicting MOF (Brier score) was better than that of FRAX-NoT (p < 0.05) but not as good as that of FRAX-DXA. Similar results were observed in participants with thoracic x-rays. In conclusion, FN T-Scores estimated by Rho from lumbar and thoracic x-rays add value to FRAX-NoT estimates and may be useful for risk assessment when DXA is not available.
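The two headline metrics, the c-index and the Brier score, are straightforward to compute; the sketch below uses a censoring-free pairwise c-index (a simplification of Harrell's c-index) and synthetic scores as stand-ins.

```python
# Hedged sketch of discrimination (c-index) and calibration (Brier) metrics.
import numpy as np
from sklearn.metrics import brier_score_loss

def c_index(risk, event):
    """Fraction of (event, non-event) pairs ranked correctly by risk score."""
    pos, neg = risk[event == 1], risk[event == 0]
    pairs = pos[:, None] - neg[None, :]
    return ((pairs > 0).sum() + 0.5 * (pairs == 0).sum()) / pairs.size

rng = np.random.default_rng(0)
event = rng.integers(0, 2, size=500)                            # incident MOF yes/no
risk = np.clip(0.1 * event + rng.normal(0.2, 0.1, 500), 0, 1)   # 10-year risk

print(f"c-index: {c_index(risk, event):.2f}")
print(f"Brier:   {brier_score_loss(event, risk):.3f}")
```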

Improving Deep Learning for Accelerated MRI With Data Filtering

Kang Lin, Anselm Krainovic, Kun Wang, Reinhard Heckel

arxiv preprint · Aug 19 2025
Deep neural networks achieve state-of-the-art results for accelerated MRI reconstruction. Most research on deep-learning-based imaging focuses on improving neural network architectures, trained and evaluated on fixed and homogeneous data. In this work, we investigate data curation strategies for improving MRI reconstruction. We assemble a large dataset of raw k-space data from 18 public sources, consisting of 1.1M images, and construct a diverse evaluation set comprising 48 test sets that capture variations in anatomy, contrast, number of coils, and other key factors. We propose and study different data filtering strategies to enhance the performance of current state-of-the-art neural networks for accelerated MRI reconstruction. Our experiments show that filtering the training data leads to consistent, albeit modest, performance gains. These gains are robust across different training set sizes and accelerations, and we find that filtering is particularly beneficial when the proportion of in-distribution data in the unfiltered training set is low.
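A data filtering strategy in this spirit can be as simple as a metadata predicate over the training corpus; the field names and criterion below are illustrative assumptions, not the filters studied in the paper.

```python
# Hedged sketch: filter a k-space training corpus by (assumed) metadata fields.
from dataclasses import dataclass

@dataclass
class KSpaceVolume:
    anatomy: str
    contrast: str
    num_coils: int
    path: str

def filter_training_set(volumes, anatomies, min_coils=8):
    """Drop volumes unlikely to help the target reconstruction task."""
    return [v for v in volumes
            if v.anatomy in anatomies and v.num_coils >= min_coils]

corpus = [
    KSpaceVolume("knee", "PD", 15, "knee_001.h5"),
    KSpaceVolume("brain", "T2", 4, "brain_017.h5"),
    KSpaceVolume("brain", "T1", 12, "brain_042.h5"),
]
train = filter_training_set(corpus, anatomies={"brain"})  # -> brain_042 only
```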

UNICON: UNIfied CONtinual Learning for Medical Foundational Models

Mohammad Areeb Qazi, Munachiso S Nwadike, Ibrahim Almakky, Mohammad Yaqub, Numan Saeed

arxiv preprint · Aug 19 2025
Foundational models are trained on extensive datasets to capture the general trends of a domain. However, in medical imaging, data scarcity makes pre-training for every domain, modality, or task challenging. Continual learning offers a solution by fine-tuning a model sequentially on different domains or tasks, enabling it to integrate new knowledge without requiring large datasets for each training phase. In this paper, we propose UNIfied CONtinual Learning for Medical Foundational Models (UNICON), a framework that enables the seamless adaptation of foundation models to diverse domains, tasks, and modalities. Unlike conventional adaptation methods that treat these changes in isolation, UNICON provides a unified, perpetually expandable framework. Through careful integration, we show that foundation models can dynamically expand across imaging modalities, anatomical regions, and clinical objectives without catastrophic forgetting or task interference. Empirically, we validate our approach by adapting a chest CT foundation model, initially trained for classification, to prognosis and segmentation tasks. Our results show improved performance across both additional tasks. Furthermore, we continually incorporated PET scans and achieved a 5% improvement in Dice score over the respective baselines. These findings establish that foundation models are not inherently constrained to their initial training scope but can evolve, paving the way toward generalist AI models for medical imaging.
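One common continual-learning pattern consistent with this description is a shared backbone that grows task-specific heads over time; the sketch below is an assumption for illustration, since the abstract does not specify UNICON's mechanism.

```python
# Hedged sketch: shared backbone with per-task heads added over time.
import torch
import torch.nn as nn

class ExpandableModel(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.feat_dim = feat_dim
        self.backbone = nn.Sequential(nn.Flatten(),
                                      nn.Linear(32 * 32, feat_dim), nn.ReLU())
        self.heads = nn.ModuleDict()

    def add_task(self, name, out_dim):
        # New tasks get a fresh head; old heads (and their knowledge) persist.
        self.heads[name] = nn.Linear(self.feat_dim, out_dim)

    def forward(self, x, task):
        return self.heads[task](self.backbone(x))

model = ExpandableModel()
model.add_task("classification", 3)   # initial chest-CT task
model.add_task("prognosis", 1)        # later task; old head is untouched
x = torch.randn(2, 1, 32, 32)
print(model(x, "prognosis").shape)    # torch.Size([2, 1])
```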

MMIS-Net for Retinal Fluid Segmentation and Detection

Nchongmaje Ndipenoch, Alina Miron, Kezhi Wang, Yongmin Li

arxiv preprint · Aug 19 2025
Purpose: Deep learning methods have shown promising results in the segmentation and detection of diseases in medical images. However, most methods are trained and tested on data from a single source, modality, organ, or disease type, overlooking the combined potential of other available annotated data. Numerous small annotated medical image datasets from various modalities, organs, and diseases are publicly available. In this work, we aim to leverage the synergistic potential of these datasets to improve performance on unseen data. Approach: To this end, we propose a novel algorithm called MMIS-Net (MultiModal Medical Image Segmentation Network), which features Similarity Fusion blocks that utilize supervision and pixel-wise similarity knowledge selection for feature map fusion. Additionally, to address inconsistent class definitions and label contradictions, we created a one-hot label space that handles classes absent in one dataset but annotated in another. MMIS-Net was trained on 10 datasets encompassing 19 organs across 2 modalities to build a single model. Results: The algorithm was evaluated on the RETOUCH grand challenge hidden test set, outperforming large foundation models for medical image segmentation as well as other state-of-the-art algorithms. We achieved the best mean Dice score of 0.83 and an absolute volume difference of 0.035 for the fluid segmentation task, as well as a perfect area under the curve of 1 for the fluid detection task. Conclusion: The quantitative results highlight the effectiveness of our proposed model, owing to the incorporation of Similarity Fusion blocks into the network's backbone for supervision and similarity knowledge selection, and to the use of a one-hot label space that addresses label class inconsistencies and contradictions.
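The one-hot label-space idea can be made concrete with a per-sample class mask on the loss, so each dataset supervises only the classes it annotates; the class counts and masking scheme below are illustrative assumptions.

```python
# Hedged sketch: mask the loss so unlabeled classes contribute no gradient.
import torch
import torch.nn.functional as F

def masked_bce(logits, targets, annotated):
    """logits/targets: (B, C, H, W); annotated: (B, C), 1 = class is labeled."""
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    mask = annotated[:, :, None, None].float()      # ignore unlabeled classes
    return (loss * mask).sum() / mask.sum().clamp(min=1)

B, C, H, W = 2, 19, 64, 64                 # e.g., 19 organs in a unified space
logits = torch.randn(B, C, H, W)
targets = torch.randint(0, 2, (B, C, H, W)).float()
annotated = torch.zeros(B, C)
annotated[0, :5] = 1                       # dataset A labels classes 0-4
annotated[1, 5:] = 1                       # dataset B labels classes 5-18
print(masked_bce(logits, targets, annotated))
```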

ASDFormer: A Transformer with Mixtures of Pooling-Classifier Experts for Robust Autism Diagnosis and Biomarker Discovery

Mohammad Izadi, Mehran Safayani

arxiv preprint · Aug 19 2025
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition marked by disruptions in brain connectivity. Functional MRI (fMRI) offers a non-invasive window into large-scale neural dynamics by measuring blood-oxygen-level-dependent (BOLD) signals across the brain. These signals can be modeled as interactions among Regions of Interest (ROIs), which are grouped into functional communities based on their underlying roles in brain function. Emerging evidence suggests that connectivity patterns within and between these communities are particularly sensitive to ASD-related alterations. Effectively capturing these patterns and identifying interactions that deviate from typical development is essential for improving ASD diagnosis and enabling biomarker discovery. In this work, we introduce ASDFormer, a Transformer-based architecture that incorporates a Mixture of Pooling-Classifier Experts (MoE) to capture neural signatures associated with ASD. By integrating multiple specialized expert branches with attention mechanisms, ASDFormer adaptively emphasizes different brain regions and connectivity patterns relevant to autism. This enables both improved classification performance and more interpretable identification of disorder-related biomarkers. Applied to the ABIDE dataset, ASDFormer achieves state-of-the-art diagnostic accuracy and reveals robust insights into functional connectivity disruptions linked to ASD, highlighting its potential as a tool for biomarker discovery.
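A mixture of pooling-classifier experts can be sketched as a gate that mixes expert heads, each pooling the ROI token sequence differently; the expert set and gating below are our assumptions, not ASDFormer's exact design.

```python
# Hedged sketch: gated mixture of pooling-classifier experts over ROI tokens.
import torch
import torch.nn as nn

class PoolingMoE(nn.Module):
    def __init__(self, dim=64, n_experts=3, n_classes=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(dim, n_classes) for _ in range(n_experts)])
        # Expert 0 mean-pools, expert 1 max-pools, expert 2 uses the first token.
        self.pools = [lambda t: t.mean(1), lambda t: t.max(1).values,
                      lambda t: t[:, 0]]

    def forward(self, tokens):                           # tokens: (B, ROIs, dim)
        weights = self.gate(tokens.mean(1)).softmax(-1)  # (B, n_experts)
        outs = torch.stack([head(pool(tokens)) for head, pool
                            in zip(self.experts, self.pools)], dim=1)
        return (weights.unsqueeze(-1) * outs).sum(1)     # (B, n_classes)

logits = PoolingMoE()(torch.randn(8, 200, 64))           # 200 ROIs per subject
```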