Latest Papers on Radiology AI. Tags: Benchmark SOTA

Experimental Assessment of Conventional Features, CNN-Based Features and Ensemble Schemes for Discriminating Benign Versus Malignant Lesions on Breast Ultrasound Images.

Bianconi F, Khan MU, Du H, Jassim S

•papers•Aug 28 2025

Breast ultrasound images play a pivotal role in assessing the nature of suspicious breast lesions, particularly in patients with dense tissue. Computerized analysis of breast ultrasound images has the potential to assist physicians in clinical decision-making and improve subjective interpretation. We assess the performance of conventional features, deep learning features and ensemble schemes for discriminating benign versus malignant breast lesions on ultrasound images. A total of 19 individual feature sets (1 morphological, 2 first-order, 10 texture-based, and 6 CNN-based) were included in the analysis. Furthermore, four combined feature sets (Best by class; Top 3, 5, and 7) and four fusion schemes (feature concatenation, majority voting, sum rule, and product rule) were considered to generate ensemble models. The experiments were carried out on three independent open-access datasets containing, respectively, 252 (154 benign, 98 malignant), 232 (109 benign, 123 malignant), and 281 (187 benign, 94 malignant) lesions. CNN-based features outperformed the other individual descriptors, achieving accuracy between 77.4% and 83.6%, followed by morphological features (71.6%-80.8%) and histograms of oriented gradients (71.4%-77.6%). Ensemble models further improved the accuracy to between 80.2% and 87.5%. Fusion schemes based on the product and sum rules were generally superior to feature concatenation and majority voting. Combining individual feature sets by ensemble schemes demonstrates advantages for discriminating benign versus malignant breast lesions on ultrasound images.
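
The fusion schemes named in this abstract differ in where the combination happens. The following is a minimal sketch of the three decision-level rules (majority voting, sum rule, product rule) applied to per-classifier posterior probabilities; feature concatenation acts earlier, by joining feature vectors before a single classifier is trained, and is not shown. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import prod


def sum_rule(posteriors):
    """Pick the class with the largest summed posterior across classifiers."""
    n_classes = len(posteriors[0])
    totals = [sum(p[c] for p in posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)


def product_rule(posteriors):
    """Pick the class with the largest product of posteriors across classifiers."""
    n_classes = len(posteriors[0])
    products = [prod(p[c] for p in posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=products.__getitem__)


def majority_vote(posteriors):
    """Each classifier votes for its own top class; the most frequent vote wins."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical posteriors [P(benign), P(malignant)] from three classifiers.
example = [[0.55, 0.45], [0.60, 0.40], [0.05, 0.95]]

print(majority_vote(example), sum_rule(example), product_rule(example))
```

In this made-up example two weakly confident classifiers outvote one highly confident classifier under majority voting, while the sum and product rules follow the confident one. That is one plausible reason score-level rules can behave differently from voting, though the abstract does not state why they performed better.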

Ultrasound Classification Breast Methodology In Silico Academic Lab Benchmark SOTA

Self-Composing Neural Operators with Depth and Accuracy Scaling via Adaptive Train-and-Unroll Approach

Juncai He, Xinliang Liu, Jinchao Xu

•preprint•Aug 28 2025

In this work, we propose a novel framework to enhance the efficiency and accuracy of neural operators through self-composition, offering both theoretical guarantees and practical benefits. Inspired by iterative methods for solving numerical partial differential equations (PDEs), we design a neural operator that repeatedly applies a single neural operator block, progressively deepening the model without explicitly adding new blocks and thereby improving its capacity. To train these models efficiently, we introduce an adaptive train-and-unroll approach, where the depth of the neural operator is gradually increased during training. This approach reveals an accuracy scaling law with model depth and offers significant computational savings through our adaptive training strategy. Our architecture achieves state-of-the-art (SOTA) performance on standard benchmarks. We further demonstrate its efficacy on a challenging high-frequency ultrasound computed tomography (USCT) problem, where a multigrid-inspired backbone enables superior performance in resolving complex wave phenomena. The proposed framework provides a computationally tractable, accurate, and scalable solution for large-scale data-driven scientific machine learning applications.
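
The iterative-solver analogy behind self-composition can be illustrated without any neural network. The sketch below repeatedly applies one fixed block, here a weighted Jacobi step for a small 1D Poisson system (a classical stand-in chosen for illustration, not the authors' learned operator), and records how the residual shrinks as the number of compositions grows. All names, the grid size, and the depths 1, 4, and 16 are hypothetical.

```python
def jacobi_block(u, f, omega=2.0 / 3.0):
    """One weighted Jacobi step for the system tridiag(-1, 2, -1) u = f."""
    n = len(u)
    new = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0        # zero boundary value
        right = u[i + 1] if i < n - 1 else 0.0   # zero boundary value
        jacobi = 0.5 * (left + right + f[i])
        new.append((1.0 - omega) * u[i] + omega * jacobi)
    return new


def self_compose(block, u0, f, depth):
    """Apply the same block `depth` times; no new parameters are added."""
    u = u0
    for _ in range(depth):
        u = block(u, f)
    return u


def residual_norm(u, f):
    """Euclidean norm of f - A u for A = tridiag(-1, 2, -1)."""
    n = len(u)
    total = 0.0
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r = f[i] - (2.0 * u[i] - left - right)
        total += r * r
    return total ** 0.5


f = [1.0] * 8
u0 = [0.0] * 8
residuals = [residual_norm(self_compose(jacobi_block, u0, f, d), f) for d in (1, 4, 16)]
print(residuals)
```

In the adaptive train-and-unroll setting described in the abstract, the depth would be increased gradually during training of a learned block; here the block is fixed and only the depth varies.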

Ultrasound Reconstruction Methodology In Silico Academic Lab Benchmark SOTA

Domain Adaptation Techniques for Natural and Medical Image Classification

Ahmad Chaddad, Yihang Wu, Reem Kateb, Christian Desrosiers

•preprint•Aug 28 2025

Domain adaptation (DA) techniques have the potential to alleviate distribution differences between training and test sets in machine learning by leveraging information from source domains. In image classification, most advances in DA have been made using natural images rather than medical data, which are harder to work with. Moreover, even for natural images, the use of mainstream datasets can lead to performance bias. With the aim of better understanding the benefits of DA for both natural and medical images, this study performs 557 simulation studies using seven widely-used DA techniques for image classification in five natural and eight medical datasets that cover various scenarios, such as out-of-distribution, dynamic data streams, and limited training samples. Our experiments yield detailed results and insightful observations highlighting the performance and medical applicability of these techniques. Notably, our results have shown the outstanding performance of the Deep Subdomain Adaptation Network (DSAN) algorithm. This algorithm achieved feasible classification accuracy (91.2%) in the COVID-19 dataset using ResNet50 and showed an important accuracy improvement in the dynamic data stream DA scenario (+6.7%) compared to the baseline. Our results also demonstrate that DSAN exhibits a remarkable level of explainability when evaluated on COVID-19 and skin cancer datasets. These results contribute to the understanding of DA techniques and offer valuable insight into the effective adaptation of models to medical data.

Mixed Modality Classification Methodology In Silico Benchmark SOTA

Mask-Guided Multi-Channel SwinUNETR Framework for Robust MRI Classification

Smriti Joshi, Lidia Garrucho, Richard Osuala, Oliver Diaz, Karim Lekadir

•preprint•Aug 28 2025

Breast cancer is one of the leading causes of cancer-related mortality in women, and early detection is essential for improving outcomes. Magnetic resonance imaging (MRI) is a highly sensitive tool for breast cancer detection, particularly in women at high risk or with dense breast tissue, where mammography is less effective. The ODELIA consortium organized a multi-center challenge to foster AI-based solutions for breast cancer diagnosis and classification. The dataset included 511 studies from six European centers, acquired on scanners from multiple vendors at both 1.5 T and 3 T. Each study was labeled for the left and right breast as no lesion, benign lesion, or malignant lesion. We developed a SwinUNETR-based deep learning framework that incorporates breast region masking, extensive data augmentation, and ensemble learning to improve robustness and generalizability. Our method achieved second place on the challenge leaderboard, highlighting its potential to support clinical breast MRI interpretation. We publicly share our codebase at https://github.com/smriti-joshi/bcnaim-odelia-challenge.git.

MRI Classification Breast Retrospective Clinical In Silico Consortium Open Code Benchmark SOTA

AI-driven body composition monitoring and its prognostic role in mCRPC undergoing lutetium-177 PSMA radioligand therapy: insights from a retrospective single-center analysis.

Ruhwedel T, Rogasch J, Galler M, Schatka I, Wetz C, Furth C, Biernath N, De Santis M, Shnayien S, Kolck J, Geisel D, Amthauer H, Beetz NL

•papers•Aug 28 2025

Body composition (BC) analysis is performed to quantify the relative amounts of different body tissues as a measure of physical fitness and tumor cachexia. We hypothesized that relative changes in BC parameters, assessed by artificial intelligence-based, PACS-integrated software, between baseline imaging before the start of radioligand therapy (RLT) and interim staging after two RLT cycles could predict overall survival (OS) in patients with metastatic castration-resistant prostate cancer (mCRPC). We conducted a single-center, retrospective analysis of 92 patients with mCRPC undergoing [177Lu]Lu-PSMA RLT between September 2015 and December 2023. All patients had [68Ga]Ga-PSMA-11 PET/CT at baseline (≤ 6 weeks before the first RLT cycle) and at interim staging (6-8 weeks after the second RLT cycle), allowing for longitudinal BC assessment. During follow-up, 78 patients (85%) died. Median OS was 16.3 months. Median follow-up time in survivors was 25.6 months. The 1-year mortality rate was 32.6% (95% CI 23.0-42.2%) and the 5-year mortality rate was 92.9% (95% CI 85.8-100.0%). In multivariable regression, relative change in visceral adipose tissue (VAT) (HR: 0.26; p = 0.006), previous chemotherapy of any type (HR: 2.4; p = 0.003), the presence of liver metastases (HR: 2.4; p = 0.018) and a higher baseline De Ritis ratio (HR: 1.4; p < 0.001) remained independent predictors of OS. Patients with a higher decrease in VAT (< -20%) had a median OS of 10.2 months versus 18.5 months in patients with a lower VAT decrease or VAT increase (≥ -20%) (log-rank test: p = 0.008). In a separate Cox model, the change in VAT predicted OS (p = 0.005) independent of the best PSA response after 1-2 RLT cycles (p = 0.09), and there was no interaction between the two (p = 0.09). PACS-integrated, AI-based BC monitoring detects relative changes in the VAT, which was an independent predictor of shorter OS in our population of patients undergoing RLT.
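
The VAT stratification in this abstract rests on simple arithmetic. As a minimal sketch, assuming relative change is computed as (interim - baseline) / baseline in percent (the abstract does not spell out the formula), the -20% cut-off could be applied as follows. The function names and the example VAT values are hypothetical.

```python
def relative_change_percent(baseline, interim):
    """Relative change from baseline to interim, in percent (assumed formula)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (interim - baseline) / baseline * 100.0


def vat_group(baseline, interim, threshold=-20.0):
    """Assign the two groups described in the abstract using the -20% cut-off."""
    change = relative_change_percent(baseline, interim)
    if change < threshold:
        return "higher decrease"
    return "lower decrease or increase"


# Hypothetical VAT measurements in arbitrary units.
print(relative_change_percent(200.0, 150.0))  # a 25% loss
print(vat_group(200.0, 150.0))
print(vat_group(200.0, 180.0))
```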

CT Segmentation Abdominal Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

A Systematic Review on the Generative AI Applications in Human Medical Genomics

Anton Changalidis, Yury Barbitoff, Yulia Nasykhova, Andrey Glotov

•preprint•Aug 27 2025

Although traditional statistical techniques and machine learning methods have contributed significantly to genetics and, in particular, inherited disease diagnosis, they often struggle with complex, high-dimensional data, a challenge now addressed by state-of-the-art deep learning models. Large language models (LLMs), based on transformer architectures, have excelled in tasks requiring contextual comprehension of unstructured medical data. This systematic review examines the role of LLMs in the genetic research and diagnostics of both rare and common diseases. Automated keyword-based search in PubMed, bioRxiv, medRxiv, and arXiv was conducted, targeting studies on LLM applications in diagnostics and education within genetics and removing irrelevant or outdated models. A total of 172 studies were analyzed, highlighting applications in genomic variant identification, annotation, and interpretation, as well as medical imaging advancements through vision transformers. Key findings indicate that while transformer-based models significantly advance disease and risk stratification, variant interpretation, medical imaging analysis, and report generation, major challenges persist in integrating multimodal data (genomic sequences, imaging, and clinical records) into unified and clinically robust pipelines, facing limitations in generalizability and practical implementation in clinical settings. This review provides a comprehensive classification and assessment of the current capabilities and limitations of LLMs in transforming hereditary disease diagnostics and supporting genetic education, serving as a guide to navigate this rapidly evolving field.

Mixed Modality LLM Radiology Report Review Concept Academic Lab GenAI Benchmark SOTA

MedNet-PVS: A MedNeXt-Based Deep Learning Model for Automated Segmentation of Perivascular Spaces

Zhen Xuen Brandon Low, Rory Zhang, Hang Min, William Pham, Lucy Vivash, Jasmine Moses, Miranda Lynch, Karina Dorfman, Cassandra Marotta, Shaun Koh, Jacob Bunyamin, Ella Rowsthorn, Alex Jarema, Himashi Peiris, Zhaolin Chen, Sandy R. Shultz, David K. Wright, Dexiao Kong, Sharon L. Naismith, Terence J. O'Brien, Ying Xia, Meng Law, Benjamin Sinclair

•preprint•Aug 27 2025

Enlarged perivascular spaces (PVS) are increasingly recognized as biomarkers of cerebral small vessel disease, Alzheimer's disease, stroke, and aging-related neurodegeneration. However, manual segmentation of PVS is time-consuming and subject to moderate inter-rater reliability, while existing automated deep learning models have moderate performance and typically fail to generalize across diverse clinical and research MRI datasets. We adapted MedNeXt-L-k5, a Transformer-inspired 3D encoder-decoder convolutional network, for automated PVS segmentation. Two models were trained: one using a homogeneous dataset of 200 T2-weighted (T2w) MRI scans from the Human Connectome Project-Aging (HCP-Aging) dataset and another using 40 heterogeneous T1-weighted (T1w) MRI volumes from seven studies across six scanners. Model performance was evaluated using internal 5-fold cross validation (5FCV) and leave-one-site-out cross validation (LOSOCV). MedNeXt-L-k5 models trained on the T2w images of the HCP-Aging dataset achieved voxel-level Dice scores of 0.88+/-0.06 (white matter, WM), comparable to the reported inter-rater reliability of that dataset, and the highest yet reported in the literature. The same models trained on the T1w images of the HCP-Aging dataset achieved a substantially lower Dice score of 0.58+/-0.09 (WM). Under LOSOCV, the model had voxel-level Dice scores of 0.38+/-0.16 (WM) and 0.35+/-0.12 (BG), and cluster-level Dice scores of 0.61+/-0.19 (WM) and 0.62+/-0.21 (BG). MedNeXt-L-k5 provides an efficient solution for automated PVS segmentation across diverse T1w and T2w MRI datasets. MedNeXt-L-k5 did not outperform the nnU-Net, indicating that the attention-based mechanisms present in transformer-inspired models to provide global context are not required for high accuracy in PVS segmentation.
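
For readers comparing the Dice values in this abstract, the voxel-level Dice score is twice the overlap between predicted and reference masks divided by the sum of their sizes. The sketch below computes it on flattened binary masks; the function name, the example masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration. The cluster-level variant reported in the abstract is not shown.

```python
def dice_score(pred, truth):
    """Voxel-level Dice between two equal-length flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


# Hypothetical masks: 3 predicted voxels, 3 reference voxels, 2 shared.
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
print(dice_score(pred, truth))
```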

MRI Segmentation Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Artificial intelligence system for predicting areal bone mineral density from plain X-rays.

Nguyen HG, Nguyen DT, Tran TS, Ling SH, Ho-Pham LT, Van Nguyen T

•papers•Aug 27 2025

Dual-energy X-ray absorptiometry (DXA) is the standard method for assessing areal bone mineral density (aBMD), diagnosing osteoporosis, and predicting fracture risk. However, DXA's availability is limited in resource-poor areas. This study aimed to develop an artificial intelligence (AI) system capable of estimating aBMD from standard radiographs. The study was part of the Vietnam Osteoporosis Study, a prospective population-based study involving 3783 participants aged 18 years and older. A total of 7060 digital radiographs of the frontal pelvis and lateral spine were taken using the FCR Capsula XLII system (Fujifilm Corp., Tokyo, Japan). aBMD at the femoral neck and lumbar spine was measured with DXA (Hologic Horizon, Hologic Corp., Bedford, MA, USA). An ensemble of seven deep-learning models was used to analyze the X-rays and predict bone mineral density, termed "xBMD". The correlation between xBMD and aBMD was evaluated using Pearson's correlation coefficients. The correlation between xBMD and aBMD at the femoral neck was strong (r = 0.90; 95% CI, 0.88-0.91), and similarly high at the lumbar spine (r = 0.87; 95% CI, 0.85-0.88). This correlation remained consistent across different age groups and genders. The AI system demonstrated excellent performance in identifying individuals at high risk for hip fractures, with area under the ROC curve (AUC) values of 0.96 (95% CI, 0.95-0.98) at the femoral neck and 0.97 (95% CI, 0.96-0.99) at the lumbar spine. These findings indicate that AI can accurately predict aBMD and identify individuals at high risk of fractures. This AI system could provide an efficient alternative to DXA for osteoporosis screening in settings with limited resources and high patient demand.
An AI system developed to predict aBMD from X-rays showed strong correlations with DXA (r = 0.90 at femoral neck; r = 0.87 at lumbar spine) and high accuracy in identifying individuals at high risk for fractures (AUC = 0.96 at femoral neck; AUC = 0.97 at lumbar spine).
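
The agreement metric quoted for this study is Pearson's correlation coefficient. A minimal sketch of that calculation is given below for two paired lists of measurements. The function name and the example values are hypothetical and do not come from the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two paired samples with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical paired values standing in for predicted and measured density.
predicted = [0.70, 0.82, 0.95, 1.01, 1.10]
measured = [0.72, 0.80, 0.97, 0.99, 1.15]
print(pearson_r(predicted, measured))
```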

X-Ray Registration Musculoskeletal Prospective Clinical Pilot Academic Lab Benchmark SOTA

MRI-based machine-learning radiomics of the liver to predict liver-related events in hepatitis B virus-associated fibrosis.

Luo Y, Luo Q, Wu Y, Zhang S, Ren H, Wang X, Liu X, Yang Q, Xu W, Wu Q, Li Y

•papers•Aug 27 2025

The onset of liver-related events (LREs) in fibrosis indicates a poor prognosis and worsens patients' quality of life, making the prediction and early detection of LREs crucial. The aim of this study was to develop a radiomics model using liver magnetic resonance imaging (MRI) to predict LRE risk in patients undergoing antiviral treatment for chronic fibrosis caused by hepatitis B virus (HBV). Patients with HBV-associated liver fibrosis and liver stiffness measurements ≥ 10 kPa were included. Feature selection and dimensionality reduction techniques identified discriminative features from three MRI sequences. Radiomics models were built using eight machine learning techniques and evaluated for performance. Shapley additive explanation and permutation importance techniques were applied to interpret the model output. A total of 222 patients aged 49 ± 10 years (mean ± standard deviation), 175 males, were evaluated, with 41 experiencing LREs. The radiomics model, incorporating 58 selected features, outperformed traditional clinical tools in prediction accuracy. Developed using a support vector machine classifier, the model achieved optimal areas under the receiver operating characteristic curves of 0.94 and 0.93 in the training and test sets, respectively, and demonstrated good calibration. Machine learning techniques effectively predicted LREs in patients with fibrosis and HBV, offering comparable accuracy across algorithms and supporting personalized care decisions for HBV-related liver disease. Radiomics models based on liver multisequence MRI can improve risk prediction and management of patients with HBV-associated chronic fibrosis. In addition, they offer valuable prognostic insights and aid in making informed clinical decisions. Liver-related events (LREs) are associated with poor prognosis in chronic fibrosis. Radiomics models could predict LREs in patients with hepatitis B-associated chronic fibrosis.
Radiomics contributes to personalized care choices for patients with hepatitis B-associated fibrosis.
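
The discrimination metric reported for this model, the area under the ROC curve, can be read as the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one. The sketch below computes it by direct pair counting, with ties counted as one half. The function name, scores, and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC by pair counting: fraction of positive/negative pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores; label 1 marks a patient with an event.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))
```

Pair counting is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; rank-based formulas give the same value more efficiently on larger samples.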

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-ray Report Generation.

Wang X, Wang F, Wang H, Jiang B, Li C, Wang Y, Tian Y, Tang J

•papers•Aug 27 2025

X-ray image-based medical report generation has achieved significant progress in recent years with the help of large language models. However, these models have not fully exploited the effective information in visual image regions, resulting in reports that are linguistically sound but insufficient in describing key diseases. In this paper, we propose a novel associative memory-enhanced X-ray report generation model that effectively mimics the process of professional doctors writing medical reports. It considers both the mining of global and local visual information and associates historical report information to better complete the writing of the current report. Specifically, given an X-ray image, we first utilize a classification model along with its activation maps to accomplish the mining of visual regions highly associated with diseases and the learning of disease query tokens. Then, we employ a visual Hopfield network to establish memory associations for disease-related tokens, and a report Hopfield network to retrieve report memory information. This process facilitates the generation of high-quality reports based on a large language model and achieves state-of-the-art performance on multiple benchmark datasets, including IU X-ray, MIMIC-CXR, and CheXpert Plus. The source code and pre-trained models of this work have been released on https://github.com/Event-AHU/Medical_Image_Analysis.
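
Hopfield-style associative memory can be read as similarity-weighted retrieval. The sketch below shows one retrieval step of a modern (continuous) Hopfield layer, in which a query vector is replaced by a softmax-weighted average of stored patterns. This is a generic formulation assumed for illustration; the abstract does not specify the paper's network design, token dimensions, or the inverse temperature `beta`, and every name and value here is hypothetical.

```python
from math import exp


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(query, memory, beta=8.0):
    """One retrieval step: average stored patterns, weighted by similarity to the query."""
    similarities = [
        beta * sum(q * m for q, m in zip(query, pattern)) for pattern in memory
    ]
    weights = softmax(similarities)
    dim = len(query)
    return [
        sum(w * pattern[d] for w, pattern in zip(weights, memory))
        for d in range(dim)
    ]


# Hypothetical memory of three stored patterns and a noisy query near the first.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [0.9, 0.1, 0.0]
retrieved = hopfield_retrieve(query, memory)
print(retrieved)
```

A larger `beta` makes retrieval closer to a hard nearest-pattern lookup, while a smaller value blends several stored patterns.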

X-Ray Report Generation Chest Methodology In Silico Academic Lab Benchmark SOTA Open Code
