Improving Breast Cancer Diagnosis in Ultrasound Images Using Deep Learning with Feature Fusion and Attention Mechanism.

Asif S, Yan Y, Feng B, Wang M, Zheng Y, Jiang T, Fu R, Yao J, Lv L, Song M, Sui L, Yin Z, Wang VY, Xu D

May 27, 2025
Early detection of malignant lesions in ultrasound images is crucial for effective cancer diagnosis and treatment. While traditional methods rely on radiologists, deep learning models can improve accuracy, reduce errors, and enhance efficiency. This study explores the application of a deep learning model for classifying benign and malignant lesions, focusing on its performance and interpretability. In this study, we proposed a feature fusion-based deep learning model for classifying benign and malignant lesions in ultrasound images. The model leverages advanced architectures such as MobileNetV2 and DenseNet121, enhanced with feature fusion and attention mechanisms to boost classification accuracy. The clinical dataset comprises 2171 images collected from 1758 patients between December 2020 and May 2024. Additionally, we utilized the publicly available BUSI dataset, consisting of 780 images from female patients aged 25 to 75, collected in 2018. To enhance interpretability, we applied Grad-CAM, saliency maps, and Shapley additive explanations (SHAP) to explain the model's decision-making. A comparative analysis with radiologists of varying expertise levels was also conducted. The proposed model exhibited the highest performance, achieving an area under the curve (AUC) of 0.9320 on our private dataset and an AUC of 0.9834 on the public dataset, significantly outperforming traditional deep convolutional neural network models. It also exceeded the diagnostic performance of radiologists, showcasing its potential as a reliable tool for medical image classification. The model's success can be attributed to its incorporation of advanced architectures, feature fusion, and attention mechanisms. The model's decision-making process was further clarified using interpretability techniques such as Grad-CAM, saliency maps, and SHAP, offering insights into its ability to focus on relevant image features for accurate classification. The proposed deep learning model offers superior accuracy in classifying benign and malignant lesions in ultrasound images, outperforming traditional models and radiologists. Its strong performance, coupled with interpretability techniques, demonstrates its potential as a reliable and efficient tool for medical diagnostics. The datasets generated and analyzed during the current study are not publicly available due to the nature of this research and the participants of this study, but may be available from the corresponding author on reasonable request.
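
A minimal PyTorch sketch of the general recipe the abstract describes (two backbones, feature fusion, and an attention gate over the fused features). The class name, the squeeze-and-excitation-style gate, and all hyperparameters are assumptions for illustration, not the authors' released model.

```python
import torch
import torch.nn as nn
from torchvision import models

class FusionAttentionNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Two backbones; pretrained weights would normally be loaded here.
        self.mobilenet = models.mobilenet_v2(weights=None).features   # -> 1280 channels
        self.densenet = models.densenet121(weights=None).features     # -> 1024 channels
        self.pool = nn.AdaptiveAvgPool2d(1)
        fused = 1280 + 1024
        # Channel-attention gate over the concatenated features
        # (an assumption; the paper only states "attention mechanisms").
        self.attention = nn.Sequential(
            nn.Linear(fused, fused // 16), nn.ReLU(inplace=True),
            nn.Linear(fused // 16, fused), nn.Sigmoid(),
        )
        self.classifier = nn.Linear(fused, num_classes)

    def forward(self, x):
        f1 = self.pool(self.mobilenet(x)).flatten(1)
        f2 = self.pool(self.densenet(x)).flatten(1)
        fused = torch.cat([f1, f2], dim=1)
        return self.classifier(fused * self.attention(fused))

model = FusionAttentionNet()
logits = model(torch.randn(2, 3, 224, 224))  # dummy batch of ultrasound crops
```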

Automatic identification of Parkinsonism using clinical multi-contrast brain MRI: a large self-supervised vision foundation model strategy.

Suo X, Chen M, Chen L, Luo C, Kemp GJ, Lui S, Sun H

May 27, 2025
Valid non-invasive biomarkers for Parkinson's disease (PD) and Parkinson-plus syndrome (PPS) are urgently needed. Based on our recent self-supervised vision foundation model, the Shifted-Window UNEt TRansformer (Swin UNETR), which uses clinical multi-contrast whole-brain MRI, we aimed to develop an efficient and practical model ('SwinClassifier') for discriminating PD from PPS using routine clinical MRI scans. We used 75,861 clinical head MRI scans, including T1-weighted, T2-weighted, and fluid-attenuated inversion recovery imaging, as a pre-training dataset to develop a foundation model, using self-supervised learning with a cross-contrast context-recovery task. Clinical head MRI scans from n = 1992 participants with PD and n = 1989 participants with PPS were then used as the downstream PD vs PPS classification dataset. We assessed SwinClassifier's performance using confusion-matrix-based metrics, compared to a self-supervised vanilla Vision Transformer (ViT) autoencoder ('ViTClassifier') and to two convolutional neural networks (DenseNet121 and ResNet50) trained from scratch. SwinClassifier showed very good performance (F1 score 0.83, 95% confidence interval [CI] 0.79-0.87, AUC 0.89) in PD vs PPS discrimination in independent test datasets (n = 173 participants with PD and n = 165 participants with PPS). This self-supervised classifier with pretrained weights outperformed the ViTClassifier and the convolutional classifiers trained from scratch (F1 score 0.77-0.82, AUC 0.83-0.85). Occlusion sensitivity mapping in the correctly classified cases (n = 160 PD and n = 114 PPS) highlighted the brain regions guiding discrimination, mainly sensorimotor and midline structures including the cerebellum, brainstem, ventricles, and basal ganglia. Our self-supervised digital model based on routine clinical head MRI discriminated PD from PPS with good accuracy and sensitivity. With incremental improvements the approach may be diagnostically useful in early disease. Funding: National Key Research and Development Program of China.
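
A short sketch of the kind of test-set evaluation the abstract reports (confusion matrix, F1 with a bootstrap 95% CI, and AUC), using placeholder labels and probabilities rather than the study's data; any downstream classifier producing per-scan probabilities could be scored this way.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=338)      # placeholder labels (173 PD + 165 PPS in the paper)
y_prob = rng.random(338)                   # placeholder classifier probabilities
y_pred = (y_prob >= 0.5).astype(int)

print(confusion_matrix(y_true, y_pred))
print("F1 :", f1_score(y_true, y_pred))
print("AUC:", roc_auc_score(y_true, y_prob))

# A bootstrap over the test set yields the 95% CI quoted for the F1 score.
idx = np.arange(len(y_true))
boots = []
for _ in range(1000):
    s = rng.choice(idx, size=len(idx), replace=True)
    boots.append(f1_score(y_true[s], y_pred[s]))
print("95% CI:", np.percentile(boots, [2.5, 97.5]))
```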

A Left Atrial Positioning System to Enable Follow-Up and Cohort Studies.

Mehringer NJ, McVeigh ER

May 27, 2025
We present a new algorithm to automatically convert 3-dimensional left atrium surface meshes into a standard 2-dimensional space: a Left Atrial Positioning System (LAPS). Forty-five contrast-enhanced 4- dimensional computed tomography datasets were collected from 30 subjects. The left atrium volume was segmented using a trained neural network and converted into a surface mesh. LAPS coordinates were calculated on each mesh by computing lines of longitude and latitude on the surface of the mesh with reference to the center of the posterior wall and the mitral valve. LAPS accuracy was evaluated with one-way transfer of coordinates from a template mesh to a synthetic ground truth, which was created by registering the template mesh and pre-calculated LAPS coordinates to a target mesh. The Euclidian distance error was measured between each test node and its ground truth location. The median point transfer error was 2.13 mm between follow-up scans of the same subject (n = 15) and 3.99 mm between different subjects (n = 30). The left atrium was divided into 24 anatomic regions and represented on a 2D square diagram. The Left Atrial Positioning System is fully automatic, accurate, robust to anatomic variation, and has flexible visualization for mapping data in the left atrium. This provides a framework for comparing regional LA surface data values in both follow-up and cohort studies.
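
A conceptual NumPy sketch of the two ideas in the abstract: assigning latitude/longitude-like coordinates to mesh vertices from the posterior-wall and mitral-valve landmarks, and scoring point transfer as a median Euclidean distance. The exact LAPS coordinate definition is the authors'; the formulas below are illustrative assumptions.

```python
import numpy as np

def laps_coordinates(vertices, posterior_center, mitral_center):
    """vertices: (N, 3) surface points; landmarks: (3,) arrays."""
    axis = mitral_center - posterior_center
    axis /= np.linalg.norm(axis)
    rel = vertices - posterior_center
    # Latitude: normalized position along the posterior-wall -> mitral-valve axis.
    lat = rel @ axis
    lat = (lat - lat.min()) / (lat.max() - lat.min())
    # Longitude: angle around that axis in the orthogonal plane
    # (the reference direction here is an arbitrary choice).
    ref = np.cross(axis, [0.0, 0.0, 1.0])
    ref /= np.linalg.norm(ref)
    ortho = rel - np.outer(rel @ axis, axis)
    lon = np.arctan2(ortho @ np.cross(axis, ref), ortho @ ref)
    return lat, lon

def median_point_transfer_error(transferred, ground_truth):
    # Euclidean distance between each transferred node and its ground-truth location.
    return np.median(np.linalg.norm(transferred - ground_truth, axis=1))
```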

Estimation of time-to-total knee replacement surgery with multimodal modeling and artificial intelligence.

Cigdem O, Hedayati E, Rajamohan HR, Cho K, Chang G, Kijowski R, Deniz CM

May 27, 2025
Existing methods for predicting time-to-total knee replacement (TKR) do not provide enough information to make robust and accurate predictions. This study aimed to develop and evaluate an artificial intelligence-based model for predicting time-to-TKR by analyzing longitudinal knee data and identifying key features associated with accelerated knee osteoarthritis progression. A total of 547 subjects underwent TKR in the Osteoarthritis Initiative over nine years, and their longitudinal data were used for model training and testing. For external testing, 518 subjects from the Multicenter Osteoarthritis Study and 164 subjects from internal hospital data were used. Clinical variables, magnetic resonance (MR) images, radiographs, and quantitative and semi-quantitative image assessments were analyzed. Deep learning (DL) models were used to extract features from radiographs and MR images. DL features were combined with clinical and image assessment features for survival analysis. A Lasso Cox feature selection method combined with a random survival forest model was used to estimate time-to-TKR. Using only clinical variables for time-to-TKR prediction yielded an estimation accuracy of 60.4% and a C-index of 62.9%. Combining DL features extracted from radiographs and MR images with clinical, quantitative, and semi-quantitative image assessment features achieved the highest accuracy of 73.2% (p = .001) and a C-index of 77.3% for predicting time-to-TKR. The proposed predictive model demonstrates the potential of DL models and multimodal data fusion in accurately predicting time-to-TKR surgery, which may help physicians personalize treatment strategies and improve patient outcomes.
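
An illustrative pipeline sketch of the survival-analysis step named in the abstract: L1-penalized (lasso) Cox feature selection followed by a random survival forest scored by the C-index. The use of scikit-survival, the synthetic data, and all hyperparameters are assumptions, not the authors' implementation.

```python
import numpy as np
from sksurv.util import Surv
from sksurv.linear_model import CoxnetSurvivalAnalysis
from sksurv.ensemble import RandomSurvivalForest

rng = np.random.default_rng(0)
X = rng.normal(size=(547, 40))                       # placeholder clinical + imaging + DL features
risk = 0.8 * X[:, 0] - 0.6 * X[:, 1]                 # synthetic signal in the first two features
time = rng.exponential(scale=np.exp(-risk)) * 24     # synthetic months to TKR
event = rng.random(547) < 0.7                        # TKR observed vs censored
y = Surv.from_arrays(event=event, time=time)

# Step 1: lasso Cox keeps features with nonzero coefficients.
lasso_cox = CoxnetSurvivalAnalysis(l1_ratio=1.0, alphas=[0.05]).fit(X, y)
selected = np.flatnonzero(lasso_cox.coef_[:, 0])
selected = selected if selected.size else np.arange(X.shape[1])  # guard for the toy data

# Step 2: random survival forest on the selected features; .score() returns the C-index.
rsf = RandomSurvivalForest(n_estimators=100, random_state=0).fit(X[:, selected], y)
print("C-index:", rsf.score(X[:, selected], y))
```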

Evaluating Large Language Models for Enhancing Radiology Specialty Examination: A Comparative Study with Human Performance.

Liu HY, Chen SJ, Wang W, Lee CH, Hsu HH, Shen SH, Chiou HJ, Lee WJ

May 27, 2025
The radiology specialty examination assesses clinical decision-making, image interpretation, and diagnostic reasoning. With the expansion of medical knowledge, traditional test design faces challenges in maintaining accuracy and relevance. Large language models (LLMs) demonstrate potential in medical education. This study evaluates LLM performance on radiology specialty exams, explores their role in assessing question difficulty, and investigates their reasoning processes, aiming to develop a more objective and efficient framework for exam design. This study compared the performance of LLMs and human examinees in a radiology specialty examination. Three LLMs (GPT-4o, o1-preview, and GPT-3.5-turbo-1106) were evaluated under zero-shot conditions. Exam accuracy, examinee accuracy, the discrimination index, and the point-biserial correlation were used to assess the LLMs' ability to predict question difficulty and their reasoning processes. Data provided by the Taiwan Radiological Society ensure comparability between AI and human performance. In terms of accuracy, GPT-4o (88.0%) and o1-preview (90.9%) outperformed human examinees (76.3%), whereas GPT-3.5-turbo-1106 showed significantly lower accuracy (50.2%). Question difficulty analysis revealed that the newer LLMs excel at solving complex questions, while GPT-3.5-turbo-1106 exhibited greater performance variability. Discrimination index and point-biserial correlation analyses demonstrated that GPT-4o and o1-preview accurately identified key differentiating questions, closely mirroring human reasoning patterns. These findings suggest that advanced LLMs can assess medical examination difficulty, offering potential applications in exam standardization and question evaluation. This study evaluated the problem-solving capabilities of GPT-3.5-turbo-1106, GPT-4o, and o1-preview in a radiology specialty examination. LLMs should be utilized as tools for assessing exam question difficulty and assisting in the standardized development of medical examinations.
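
A short sketch of the two item-analysis statistics named in the abstract, on placeholder response data: the discrimination index contrasts item correctness between the top and bottom 27% of examinees by total score, and the point-biserial correlation relates item correctness to total score.

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(120, 50))   # examinees x items, 1 = correct (placeholder)
totals = responses.sum(axis=1)

def discrimination_index(item, totals, frac=0.27):
    # Difference in item pass rate between the high- and low-scoring groups.
    k = max(1, int(round(frac * len(totals))))
    order = np.argsort(totals)
    low, high = order[:k], order[-k:]
    return item[high].mean() - item[low].mean()

item0 = responses[:, 0]
print("Discrimination index:", discrimination_index(item0, totals))
print("Point-biserial r    :", pointbiserialr(item0, totals)[0])
```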

A Deep Neural Network Framework for the Detection of Bacterial Diseases from Chest X-Ray Scans.

Jain S, Jindal H, Bharti M

May 27, 2025
This research aims to develop an advanced deep-learning framework for detecting respiratory diseases, including COVID-19, pneumonia, and tuberculosis (TB), using chest X-ray scans. A Deep Neural Network (DNN)-based system was developed to analyze medical images and extract key features from chest X-rays. The system leverages various DNN learning algorithms to study X-ray scan color, curve, and edge-based features. The Adam optimizer is employed to minimize error rates and enhance model training. A dataset of 1800 chest X-ray images, consisting of COVID-19, pneumonia, TB, and typical cases, was evaluated across multiple DNN models. The highest accuracy was achieved using the VGG19 model. The proposed system demonstrated an accuracy of 94.72%, with a sensitivity of 92.73%, a specificity of 96.68%, and an F1-score of 94.66%. The error rate was 5.28% when trained with 80% of the dataset and tested on the remaining 20%. The VGG19 model showed significant accuracy improvements of 32.69%, 36.65%, 42.16%, and 8.1% over AlexNet, GoogLeNet, InceptionV3, and VGG16, respectively. The prediction time was also remarkably low, ranging between 3 and 5 seconds. The proposed deep learning model efficiently detects respiratory diseases, including COVID-19, pneumonia, and TB, within seconds. The method ensures high reliability and efficiency by optimizing feature extraction while keeping system complexity manageable, making it a valuable tool for clinicians in rapid disease diagnosis.
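
A minimal sketch (not the authors' implementation) of the best-performing configuration described above: a VGG19 classifier with a four-class head (COVID-19, pneumonia, TB, non-pathological) trained with the Adam optimizer, shown on a dummy batch.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg19(weights=None)            # pretrained weights would normally be loaded
model.classifier[6] = nn.Linear(4096, 4)      # replace the ImageNet head with 4 classes
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch of chest X-ray tensors.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 4, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```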

Deep Learning Auto-segmentation of Diffuse Midline Glioma on Multimodal Magnetic Resonance Images.

Fernández-Patón M, Montoya-Filardi A, Galiana-Bordera A, Martínez-Gironés PM, Veiga-Canuto D, Martínez de Las Heras B, Cerdá-Alberich L, Martí-Bonmatí L

May 27, 2025
Diffuse midline glioma (DMG), H3 K27M-altered, is a rare pediatric brainstem cancer with poor prognosis. To advance the development of predictive models and gain a deeper understanding of DMG, there is a crucial need for seamlessly integrating automatic and highly accurate tumor segmentation techniques. Only one existing method addresses this task in this cancer; for that reason, this study develops a modified CNN-based 3D U-Net tool to automatically and accurately segment DMG in magnetic resonance (MR) images. The dataset consisted of 52 DMG patients and 70 images, each with T1W and T2W or FLAIR images. Three different datasets were created: T1W images, T2W or FLAIR images, and a combined set of T1W and T2W/FLAIR images. Denoising, bias field correction, spatial resampling, and normalization were applied as preprocessing steps to the MR images. Patching techniques were also used to enlarge the dataset size. For tumor segmentation, a 3D U-Net architecture with residual blocks was used. The best results were obtained for the dataset composed of all T1W and T2W/FLAIR images, reaching an average Dice Similarity Coefficient (DSC) of 0.883 on the test dataset. These results are comparable to other brain tumor segmentation models and to state-of-the-art results in DMG segmentation using fewer sequences. Our results demonstrate the effectiveness of the proposed 3D U-Net architecture for DMG tumor segmentation. This advancement holds potential for enhancing the precision of diagnostic and predictive models in the context of this challenging pediatric cancer.
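
A hedged sketch of the architecture class described above, written with MONAI: a 3D U-Net with residual units taking stacked T1W and T2W/FLAIR patches and predicting a tumor mask, trained with a Dice loss. The library choice, channel/stride settings, and patch size are assumptions, not the paper's exact configuration.

```python
import torch
from monai.networks.nets import UNet
from monai.losses import DiceLoss

net = UNet(
    spatial_dims=3,
    in_channels=2,                 # T1W + T2W/FLAIR channels
    out_channels=1,                # tumor probability map
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
    num_res_units=2,               # residual blocks at each resolution level
)
loss_fn = DiceLoss(sigmoid=True)

patch = torch.randn(1, 2, 96, 96, 96)                      # one preprocessed image patch
mask = torch.randint(0, 2, (1, 1, 96, 96, 96)).float()     # corresponding label patch
loss = loss_fn(net(patch), mask)
print(float(loss))
```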

Development of a No-Reference CT Image Quality Assessment Method Using RadImageNet Pre-trained Deep Learning Models.

Ohashi K, Nagatani Y, Yamazaki A, Yoshigoe M, Iwai K, Uemura R, Shimomura M, Tanimura K, Ishida T

May 27, 2025
Accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic accuracy, optimizing imaging protocols, and preventing excessive radiation exposure. In clinical settings, where high-quality reference images are often unavailable, developing no-reference image quality assessment (NR-IQA) methods is essential. Recently, CT-NR-IQA methods using deep learning have been widely studied; however, significant challenges remain in handling multiple degradation factors and accurately reflecting real-world degradations. To address these issues, we propose a novel CT-NR-IQA method. Our approach utilizes a dataset that combines two degradation factors (noise and blur) to train convolutional neural network (CNN) models capable of handling multiple degradation factors. Additionally, we leveraged RadImageNet pre-trained models (ResNet50, DenseNet121, InceptionV3, and InceptionResNetV2), allowing the models to learn deep features from large-scale real clinical images, thus enhancing adaptability to real-world degradations without relying on artificially degraded images. The models' performances were evaluated by measuring the correlation between the subjective scores and predicted image quality scores for both artificially degraded and real clinical image datasets. The results demonstrated positive correlations between the subjective and predicted scores for both datasets. In particular, ResNet50 showed the best performance, with a correlation coefficient of 0.910 for the artificially degraded images and 0.831 for the real clinical images. These findings indicate that the proposed method could serve as a potential surrogate for subjective assessment in CT-NR-IQA.
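
A sketch of the evaluation idea only: a ResNet50 backbone with a single-output regression head predicting a quality score, correlated against subjective scores. RadImageNet weights are distributed separately; the checkpoint path below is a placeholder, and the data are dummies.

```python
import torch
import torch.nn as nn
from torchvision import models
from scipy.stats import pearsonr, spearmanr

backbone = models.resnet50(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 1)     # regress one quality score
# backbone.load_state_dict(torch.load("radimagenet_resnet50.pt"), strict=False)  # placeholder path

images = torch.randn(16, 3, 224, 224)                   # dummy degraded CT slices
with torch.no_grad():
    predicted = backbone(images).squeeze(1).numpy()

subjective = predicted + 0.1 * torch.randn(16).numpy()  # placeholder observer scores
r, _ = pearsonr(subjective, predicted)
rho, _ = spearmanr(subjective, predicted)
print("Pearson r :", r)
print("Spearman rho:", rho)
```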

Automated Body Composition Analysis Using DAFS Express on 2D MRI Slices at L3 Vertebral Level.

Akella V, Bagherinasab R, Lee H, Li JM, Nguyen L, Salehin M, Chow VTY, Popuri K, Beg MF

May 27, 2025
Body composition analysis is vital in assessing health conditions such as obesity, sarcopenia, and metabolic syndromes. MRI provides detailed images of skeletal muscle (SM), visceral adipose tissue (VAT), and subcutaneous adipose tissue (SAT), but their manual segmentation is labor-intensive and limits clinical applicability. This study validates an automated tool for MRI-based 2D body composition analysis (Data Analysis Facilitation Suite (DAFS) Express), comparing its automated measurements with expert manual segmentations using UK Biobank data. A cohort of 399 participants from the UK Biobank dataset was selected, yielding 423 single L3 slices for analysis. DAFS Express performed automated segmentations of SM, VAT, and SAT, which were then manually corrected by expert raters for validation. Evaluation metrics included Jaccard coefficients, Dice scores, intraclass correlation coefficients (ICCs), and Bland-Altman plots to assess segmentation agreement and reliability. High agreement was observed between automated and manual segmentations, with mean Jaccard scores of SM 99.03%, VAT 95.25%, and SAT 99.57%, and mean Dice scores of SM 99.51%, VAT 97.41%, and SAT 99.78%. Cross-sectional area comparisons showed consistent measurements, with automated methods closely matching manual measurements for SM and SAT, and slightly higher values for VAT (SM: auto 132.51 cm², manual 132.36 cm²; VAT: auto 137.07 cm², manual 134.46 cm²; SAT: auto 203.39 cm², manual 202.85 cm²). ICCs confirmed strong reliability (SM 0.998, VAT 0.994, SAT 0.994). Bland-Altman plots revealed minimal biases, and boxplots illustrated distribution similarities across SM, VAT, and SAT areas. On average, DAFS Express took 18 s per DICOM image (126.9 min in total for the 423 images) to output the segmentations and a measurement PDF for each DICOM. Automated segmentation of SM, VAT, and SAT from 2D MRI images using DAFS Express showed comparable accuracy to manual segmentation. This underscores its potential to streamline image analysis processes in research and clinical settings, enhancing diagnostic accuracy and efficiency. Future work should focus on further validation across diverse clinical applications and imaging conditions.
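
A small sketch of the overlap metrics reported for the automated versus manually corrected masks (Dice and Jaccard per tissue class), computed on placeholder binary masks.

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def jaccard(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

auto = np.random.default_rng(0).random((512, 512)) > 0.5   # placeholder automated SM mask
manual = auto.copy()
manual[:5] ^= True                                          # simulate small expert edits
print(f"Dice {dice(auto, manual):.4f}  Jaccard {jaccard(auto, manual):.4f}")
```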

Deep learning network enhances imaging quality of low-b-value diffusion-weighted imaging and improves lesion detection in prostate cancer.

Liu Z, Gu WJ, Wan FN, Chen ZZ, Kong YY, Liu XH, Ye DW, Dai B

May 27, 2025
Diffusion-weighted imaging (DWI) with a higher b-value improves the detection rate of prostate cancer lesions. However, obtaining high-b-value DWI requires a more advanced hardware and software configuration. Here we use a novel deep learning network, NAFNet, to generate deep-learning-reconstructed (DLR1500) images from 800 b-value acquisitions that mimic 1500 b-value images, and we evaluate its performance and lesion detection improvements against whole-slide images (WSI). We enrolled 303 prostate cancer patients with DWI at both 800 and 1500 b-values from Fudan University Shanghai Cancer Centre between 2017 and 2020. We assigned these patients to the training and validation sets in a 2:1 ratio. The testing set included 36 prostate cancer patients from an independent institution who had only preoperative DWI at the 800 b-value. Two senior and two junior radiologists read and delineated cancer lesions on the DLR1500, original 800 b-value, and 1500 b-value DWI images. WSI were used as the ground truth to assess the lesion detection improvement of the DLR1500 images in the testing set. After training and generation, among junior radiologists, the diagnostic AUC based on DLR1500 images was not inferior to that based on 1500 b-value images (0.832 (0.788-0.876) vs. 0.821 (0.747-0.899), P = 0.824). The same was observed among senior radiologists. Furthermore, in the testing set, DLR1500 images significantly enhanced junior radiologists' diagnostic performance compared with 800 b-value images (0.848 (0.758-0.938) vs. 0.752 (0.661-0.843), P = 0.043). DLR1500 DWI images were comparable in quality to the original 1500 b-value images for both junior and senior radiologists. NAFNet-based DWI enhancement can significantly improve the image quality of 800 b-value DWI and thereby improve the accuracy of prostate cancer lesion detection for junior radiologists.
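
A sketch of the reader-performance comparison on placeholder data: a reader's lesion scores on 800 b-value versus DLR1500 images are summarized by AUC, with a bootstrap providing confidence intervals of the kind quoted above; the statistical test used in the paper is not reproduced here.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)                 # WSI-confirmed lesion present/absent (placeholder)
score_b800 = y * 0.6 + rng.random(200) * 0.8     # junior reader scores on 800 b-value DWI (placeholder)
score_dlr = y * 1.0 + rng.random(200) * 0.8      # same reader on DLR1500 DWI (placeholder)

def auc_with_ci(y, s, n_boot=1000):
    idx = np.arange(len(y))
    aucs = []
    for _ in range(n_boot):
        b = rng.choice(idx, size=len(idx), replace=True)
        if len(np.unique(y[b])) == 2:            # resample must contain both classes
            aucs.append(roc_auc_score(y[b], s[b]))
    return roc_auc_score(y, s), np.percentile(aucs, [2.5, 97.5])

print("800 b-value:", auc_with_ci(y, score_b800))
print("DLR1500    :", auc_with_ci(y, score_dlr))
```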