Latest Papers on Radiology AI. Tags: In Silico

High-Performance Prompting for LLM Extraction of Compression Fracture Findings from Radiology Reports.

Kanani MM, Monawer A, Brown L, King WE, Miller ZD, Venugopal N, Heagerty PJ, Jarvik JG, Cohen T, Cross NM

•papers•May 16 2025

Extracting information from radiology reports can provide critical data to empower many radiology workflows. For spinal compression fractures, these data can facilitate evidence-based care for at-risk populations. Manual extraction from free-text reports is laborious and error-prone. Large language models (LLMs) have shown promise; however, fine-tuning strategies to optimize performance in specific tasks can be resource intensive. A variety of prompting strategies have achieved similar results with fewer demands. Our study pioneers the use of Meta's Llama 3.1, together with prompt-based strategies, for automated extraction of compression fractures from free-text radiology reports, outputting structured data without model training. We tested performance on a time-based sample of CT exams covering the spine from 2/20/2024 to 2/22/2024 acquired across our healthcare enterprise (637 anonymized reports, age 18-102, 47% female). Ground truth annotations were manually generated and compared against the performance of three models (Llama 3.1 70B, Llama 3.1 8B, and Vicuna 13B) with nine different prompting configurations for a total of 27 model/prompt experiments. The highest F1 score (0.91) was achieved by the 70B Llama 3.1 model when provided with a radiologist-written background, with similar results when the background was written by a separate LLM (0.86). The addition of few-shot examples to these prompts had variable impact on F1 measurements (0.89 and 0.84, respectively). Comparable ROC-AUC and PR-AUC performance was observed. Our work demonstrated that an open-weights LLM excelled at extracting compression fracture findings from free-text radiology reports using prompt-based techniques without requiring extensive manually labeled examples for model training.
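The F1 scores quoted above summarize how well model outputs agree with the manual annotations. As a minimal sketch of that metric (not the authors' evaluation code), the Python below computes precision, recall, and F1 for binary labels. The function name `precision_recall_f1` and the toy labels are assumptions made for illustration and are not data from the study.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 = fracture reported."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only (not data from the study).
ground_truth = [1, 0, 1, 1, 0, 0, 1, 0]
llm_output = [1, 0, 1, 0, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(ground_truth, llm_output)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, it penalizes a model that misses fractures as well as one that over-reports them.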

CT LLM Radiology Report Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI

UGoDIT: Unsupervised Group Deep Image Prior Via Transferable Weights

Shijun Liang, Ismail R. Alkhouri, Siddhant Gautam, Qing Qu, Saiprasad Ravishankar

•preprint•May 16 2025

Recent advances in data-centric deep generative models have led to significant progress in solving inverse imaging problems. However, these models (e.g., diffusion models (DMs)) typically require large amounts of fully sampled (clean) training data, which is often impractical in medical and scientific settings such as dynamic imaging. On the other hand, training-data-free approaches like the Deep Image Prior (DIP) do not require clean ground-truth images but suffer from noise overfitting and can be computationally expensive as the network parameters need to be optimized for each measurement set independently. Moreover, DIP-based methods often overlook the potential of learning a prior using a small number of sub-sampled measurements (or degraded images) available during training. In this paper, we propose UGoDIT, an Unsupervised Group DIP via Transferable weights, designed for the low-data regime where only a very small number, M, of sub-sampled measurement vectors are available during training. Our method learns a set of transferable weights by optimizing a shared encoder and M disentangled decoders. At test time, we reconstruct the unseen degraded image using a DIP network, where part of the parameters are fixed to the learned weights, while the remaining are optimized to enforce measurement consistency. We evaluate UGoDIT on both medical (multi-coil MRI) and natural (super resolution and non-linear deblurring) image recovery tasks under various settings. Compared to recent standalone DIP methods, UGoDIT provides accelerated convergence and notable improvement in reconstruction quality. Furthermore, our method achieves performance competitive with SOTA DM-based and supervised approaches, despite not requiring large amounts of clean training data.

MRI Reconstruction Methodology In Silico Academic Lab Benchmark SOTA

Impact of sarcopenia and obesity on mortality in older adults with SARS-CoV-2 infection: automated deep learning body composition analysis in the NAPKON-SUEP cohort.

Schluessel S, Mueller B, Tausendfreund O, Rippl M, Deissler L, Martini S, Schmidmaier R, Stoecklein S, Ingrisch M, Blaschke S, Brandhorst G, Spieth P, Lehnert K, Heuschmann P, de Miranda SMN, Drey M

•papers•May 16 2025

Severe respiratory infections pose a major challenge in clinical practice, especially in older adults. Body composition analysis could play a crucial role in risk assessment and therapeutic decision-making. This study investigates whether obesity or sarcopenia has a greater impact on mortality in patients with severe respiratory infections. The study focuses on the National Pandemic Cohort Network (NAPKON-SUEP) cohort, which includes patients over 60 years of age with confirmed severe COVID-19 pneumonia. An innovative approach was adopted, using pre-trained deep learning models for automated analysis of body composition based on routine thoracic CT scans. The study included 157 hospitalized patients (mean age 70 ± 8 years, 41% women, mortality rate 39%) from the NAPKON-SUEP cohort at 57 study sites. A pre-trained deep learning model was used to analyze body composition (muscle, bone, fat, and intramuscular fat volumes) from thoracic CT images of the NAPKON-SUEP cohort. Binary logistic regression was performed to investigate the association between obesity, sarcopenia, and mortality. Non-survivors exhibited lower muscle volume (p = 0.043), higher intramuscular fat volume (p = 0.041), and a higher BMI (p = 0.031) compared to survivors. Among all body composition parameters, muscle volume adjusted to weight was the strongest predictor of mortality in the logistic regression model, even after adjusting for factors such as sex, age, diabetes, chronic lung disease, and chronic kidney disease (odds ratio = 0.516). In contrast, BMI did not show significant differences after adjustment for comorbidities. This study identifies muscle volume derived from routine CT scans as a major predictor of survival in patients with severe respiratory infections. The results underscore the potential of AI-supported CT-based body composition analysis for risk stratification and clinical decision-making, not only for COVID-19 patients but also for all patients over 60 years of age with severe acute respiratory infections. The innovative application of pre-trained deep learning models opens up new possibilities for automated and standardized assessment in clinical practice.
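To make the reported odds ratio more concrete, here is a small sketch of how an odds ratio from a logistic regression maps to a coefficient and to a change in predicted probability. It rests on assumptions: the abstract does not state the unit of the weight-adjusted muscle volume predictor, and using the cohort mortality rate (39%) as a baseline is purely illustrative, since an adjusted odds ratio applies with the other covariates held fixed. The helper `shifted_probability` is a hypothetical name.

```python
import math

odds_ratio = 0.516  # value reported in the abstract

# In logistic regression, the odds ratio is exp(beta), so beta = ln(OR).
beta = math.log(odds_ratio)


def shifted_probability(baseline_prob, odds_ratio, units=1.0):
    """Probability after the predictor rises by `units`, other covariates fixed."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)


# Illustrative baseline only: the overall cohort mortality rate of 39%.
p_after = shifted_probability(0.39, odds_ratio)
print(f"beta={beta:.3f}, probability after one-unit increase={p_after:.3f}")
```

An odds ratio below 1 corresponds to a negative coefficient, which is consistent with higher muscle volume being associated with lower mortality.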

CT Segmentation Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Artificial intelligence-assisted CT radiomics: redefining preoperative prediction of lateral cervical lymph node metastasis in papillary thyroid carcinoma.

Li J, Zhang J, Zou W, Li Z, Ye J, Chen X

•papers•May 16 2025

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab

Machine learning prediction of pathological complete response to neoadjuvant chemotherapy with peritumoral breast tumor ultrasound radiomics: compare with intratumoral radiomics and clinicopathologic predictors.

Yao J, Zhou W, Jia X, Zhu Y, Chen X, Zhan W, Zhou J

•papers•May 16 2025

Noninvasive, accurate and novel approaches to predict patients who will achieve pathological complete response (pCR) after neoadjuvant chemotherapy (NAC) could assist treatment strategies. The aim of this study was to explore the application of a machine learning (ML) based peritumoral ultrasound radiomics signature (PURS), compared with an intratumoral radiomics signature (IURS) and clinicopathologic factors, for early prediction of pCR. We analyzed 358 locally advanced breast cancer patients (250 in the training set and 108 in the test set), who received NAC and post-NAC surgery at our institution. The clinical and pathological data were analyzed using the independent t test and the Chi-square test to determine the factors associated with pCR. The PURS and IURS of baseline breast tumors were extracted using 3D Slicer and PyRadiomics software. Five ML classifiers, including linear discriminant analysis (LDA), support vector machine (SVM), random forest (RF), logistic regression (LR), and adaptive boosting (AdaBoost), were applied to construct radiomics predictive models. The performance of the PURS models, IURS models, and clinicopathologic predictors was assessed with respect to sensitivity, specificity, accuracy, and the area under the curve (AUC). Ninety-seven patients achieved pCR. The clinicopathologic predictors obtained an AUC of 0.759. Among PURS models, the RF classifier achieved better efficacy (AUC of 0.889) than LR (0.849), AdaBoost (0.823), SVM (0.746), and LDA (0.732). Among IURS models in the test set, the RF classifier also obtained the highest AUC (0.931), ahead of AdaBoost (0.920), LR (0.875), SVM (0.825), and LDA (0.798). The RF-based PURS yielded higher predictive ability (AUC 0.889; 95% CI 0.814, 0.947) than clinicopathologic factors (AUC 0.759; 95% CI 0.657, 0.861; p < 0.05), but lower efficacy compared with IURS (AUC 0.931; 95% CI 0.865, 0.980; p < 0.05). The peritumoral US radiomics, as a novel potential biomarker, can assist clinical therapy decisions.
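The AUC values compared above can be read as the probability that a classifier scores a randomly chosen pCR case higher than a randomly chosen non-pCR case. The sketch below computes AUC from that pairwise definition. It is an illustration under stated assumptions, not the study's pipeline: the function name `roc_auc` and the toy labels and scores are invented here.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data, invented for illustration (1 = pCR, 0 = no pCR).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; library implementations use a rank-based computation instead.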

Ultrasound Classification Breast Retrospective Clinical In Silico Academic Lab

Assessing fetal lung maturity: Integration of ultrasound radiomics and deep learning.

Chen W, Zeng B, Ling X, Chen C, Lai J, Lin J, Liu X, Zhou H, Guo X

•papers•May 16 2025

This study built a model to predict fetal lung maturity by combining radiomics and deep learning methods. We examined ultrasound images from 263 pregnancies in the pregnancy stages. Using the GE VOLUSON E8 system, we captured images to extract and analyze radiomic features. These features were integrated with clinical data by means of deep learning algorithms such as DenseNet121 to enhance the accuracy of assessing fetal lung maturity. The combined model was validated using the receiver operating characteristic (ROC) curve, a calibration diagram, and decision curve analysis (DCA). The accuracy and reliability of the diagnosis indicated that this method significantly improves the prediction of fetal lung maturity. This novel non-invasive diagnostic technology highlights the potential advantages of integrating diverse data sources to enhance prenatal care and infant health. The study lays the groundwork for validation and refinement of the model across various healthcare settings.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab

Diff-Unfolding: A Model-Based Score Learning Framework for Inverse Problems

Yuanhao Wang, Shirin Shoushtari, Ulugbek S. Kamilov

•preprint•May 16 2025

Diffusion models are extensively used for modeling image priors for inverse problems. We introduce Diff-Unfolding, a principled framework for learning posterior score functions of conditional diffusion models by explicitly incorporating the physical measurement operator into a modular network architecture. Diff-Unfolding formulates posterior score learning as the training of an unrolled optimization scheme, where the measurement model is decoupled from the learned image prior. This design allows our method to generalize across inverse problems at inference time by simply replacing the forward operator without retraining. We theoretically justify our unrolling approach by showing that the posterior score can be derived from a composite model-based optimization formulation. Extensive experiments on image restoration and accelerated MRI show that Diff-Unfolding achieves state-of-the-art performance, improving PSNR by up to 2 dB and reducing LPIPS by 22.7%, while being both compact (47M parameters) and efficient (0.72 seconds per 256 × 256 image). An optimized C++/LibTorch implementation further reduces inference time to 0.63 seconds, underscoring the practicality of our approach.
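For readers less familiar with the PSNR figure quoted above, the following sketch shows the standard PSNR definition and what a 2 dB gain implies for mean squared error. It is not taken from the paper's code: the function name `psnr`, the assumed intensity range of [0, 1], and the toy pixel values are all illustrative assumptions.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy pixel values, invented for illustration (intensities assumed in [0, 1]).
reference = [0.0, 0.5, 1.0, 0.25]
estimate = [0.1, 0.6, 1.1, 0.35]  # every pixel off by 0.1, so MSE is about 0.01
value_db = psnr(reference, estimate)

# Because PSNR is logarithmic in MSE, a gain of 2 dB at a fixed peak value
# corresponds to dividing the MSE by 10 ** (2 / 10).
mse_ratio = 10 ** (2 / 10)
print(f"PSNR = {value_db:.2f} dB, a 2 dB gain divides MSE by {mse_ratio:.3f}")
```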

MRI Reconstruction Methodology In Silico Academic Lab Benchmark SOTA Open Code

Development and validation of clinical-radiomics deep learning model based on MRI for endometrial cancer molecular subtypes classification.

Yue W, Han R, Wang H, Liang X, Zhang H, Li H, Yang Q

•papers•May 16 2025

This study aimed to develop and validate a clinical-radiomics deep learning (DL) model based on MRI for endometrial cancer (EC) molecular subtypes classification. This multicenter retrospective study included EC patients undergoing surgery, MRI, and molecular pathology diagnosis across three institutions from January 2020 to March 2024. Patients were divided into training, internal, and external validation cohorts. A total of 386 handcrafted radiomics features were extracted from each MR sequence, and MoCo-v2 was employed for contrastive self-supervised learning to extract 2048 DL features per patient. Feature selection integrated selected features into 12 machine learning methods. Model performance was evaluated with the AUC. A total of 526 patients were included (mean age, 55.01 ± 11.07 years). The radiomics model and clinical model demonstrated comparable performance across the internal and external validation cohorts, with macro-average AUCs of 0.70 vs 0.69 and 0.70 vs 0.67 (p = 0.51), respectively. The radiomics DL model, compared to the radiomics model, improved AUCs for POLEmut (0.68 vs 0.79), NSMP (0.71 vs 0.74), and p53abn (0.76 vs 0.78) in the internal validation (p = 0.08). The clinical-radiomics DL model outperformed both the clinical model and radiomics DL model (macro-average AUC = 0.79 vs 0.69 and 0.73 in the internal validation [p = 0.02], 0.74 vs 0.67 and 0.69 in the external validation [p = 0.04]). The clinical-radiomics DL model based on MRI effectively distinguished EC molecular subtypes and demonstrated strong potential, with robust validation across multiple centers. Future research should explore larger datasets to further uncover DL's potential. Our clinical-radiomics DL model based on MRI has the potential to distinguish EC molecular subtypes. This insight aids in guiding clinicians in tailoring individualized treatments for EC patients. Accurate classification of EC molecular subtypes is crucial for prognostic risk assessment. The clinical-radiomics DL model outperformed both the clinical model and the radiomics DL model. The MRI features exhibited better diagnostic performance for POLEmut and p53abn.

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab

Lightweight hybrid transformers-based dyslexia detection using cross-modality data.

Sait ARW, Alkhurayyif Y

•papers•May 16 2025

Early and precise diagnosis of dyslexia is crucial for implementing timely intervention to reduce its effects. Timely identification can improve the individual's academic and cognitive performance. Traditional dyslexia detection (DD) relies on lengthy, subjective, restricted behavioral evaluations and interviews. Due to these limitations, deep learning (DL) models have been explored to improve DD by analyzing complex neurological, behavioral, and visual data. DL architectures, including convolutional neural networks (CNNs) and vision transformers (ViTs), encounter challenges in extracting meaningful patterns from cross-modality data. The lack of model interpretability and limited computational power restrict these models' generalizability across diverse datasets. To overcome these limitations, we propose an innovative model for DD using magnetic resonance imaging (MRI), electroencephalography (EEG), and handwriting images. We introduce a model leveraging hybrid transformer-based feature extraction, including SWIN-Linformer for MRI, LeViT-Performer for handwriting images, and graph transformer networks (GTNs) with multi-attention mechanisms for EEG data. A multi-modal attention-based feature fusion network was used to fuse the extracted features and integrate the key multi-modal features. We enhance Dartbooster XGBoost (DXB)-based classification using the Bayesian optimization with Hyperband (BOHB) algorithm. To reduce computational overhead, we employ a quantization-aware training technique. The local interpretable model-agnostic explanations (LIME) technique and gradient-weighted class activation mapping (Grad-CAM) were adopted to enable model interpretability. Five public repositories were used to train and test the proposed model. The experimental outcomes demonstrated that the proposed model achieves an accuracy of 99.8% with limited computational overhead, outperforming baseline models. It sets a novel standard for DD, offering potential for early identification and timely intervention. In the future, advanced feature fusion and quantization techniques can be utilized to achieve optimal results in resource-constrained environments.

MRI Classification Neurological Methodology In Silico Academic Lab

Impact of test set composition on AI performance in pediatric wrist fracture detection in X-rays.

Till T, Scherkl M, Stranger N, Singer G, Hankel S, Flucher C, Hržić F, Štajduhar I, Tschauner S

•papers•May 16 2025

To evaluate how different test set sampling strategies (random selection and balanced sampling) affect the performance of artificial intelligence (AI) models in pediatric wrist fracture detection using radiographs, aiming to highlight the need for standardization in test set design. This retrospective study utilized the open-source GRAZPEDWRI-DX dataset of 6091 pediatric wrist radiographs. Two test sets, each containing 4588 images, were constructed: one using a balanced approach based on case difficulty, projection type, and fracture presence, and the other using random selection. EfficientNet and YOLOv11 models were trained and validated on 18,762 radiographs and tested on both sets. Binary classification and object detection tasks were evaluated using metrics such as precision, recall, F1 score, AP50, and AP50-95. Statistical comparisons between test sets were performed using nonparametric tests. Performance metrics significantly decreased in the balanced test set with more challenging cases. For example, the precision for YOLOv11 models decreased from 0.95 in the random set to 0.83 in the balanced set. Similar trends were observed for recall, accuracy, and F1 score, indicating that models trained on easy-to-recognize cases performed poorly on more complex ones. These results were consistent across all model variants tested. AI models for pediatric wrist fracture detection exhibit reduced performance when tested on balanced datasets containing more difficult cases, compared to randomly selected cases. This highlights the importance of constructing representative and standardized test sets that account for clinical complexity to ensure robust AI performance in real-world settings. Question: Do different sampling strategies based on sample complexity influence the performance of deep learning models in fracture detection? Findings: AI performance in pediatric wrist fracture detection significantly drops when tested on balanced datasets with more challenging cases, compared to randomly selected cases. Clinical relevance: Without standardized and validated test datasets for AI that reflect clinical complexities, performance metrics may be overestimated, limiting the utility of AI in real-world settings.
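The difference between the two test set designs can be sketched in a few lines. The code below is an illustration under stated assumptions, not the study's sampling procedure: the helper names `random_test_set` and `balanced_test_set`, the synthetic case pool, and the choice of strata (fracture presence and difficulty only, without projection type) are all hypothetical.

```python
import random
from collections import defaultdict


def random_test_set(cases, n, seed=0):
    """Draw n cases uniformly at random, so the set mirrors the pool's case mix."""
    rng = random.Random(seed)
    return rng.sample(cases, n)


def balanced_test_set(cases, n_per_stratum, key, seed=0):
    """Draw up to n_per_stratum cases from each stratum defined by key(case)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[key(case)].append(case)
    selected = []
    for stratum in sorted(strata):  # sorted for a reproducible order
        members = strata[stratum]
        selected.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return selected


# Synthetic pool, invented for illustration: 10% of cases are "hard".
cases = [
    {
        "id": i,
        "fracture": i % 2 == 0,
        "difficulty": "hard" if (i // 2) % 10 == 0 else "easy",
    }
    for i in range(1000)
]

random_set = random_test_set(cases, 100)
balanced_set = balanced_test_set(
    cases, 25, key=lambda c: (c["fracture"], c["difficulty"])
)

hard_share_balanced = sum(c["difficulty"] == "hard" for c in balanced_set) / len(balanced_set)
print(f"hard cases in balanced set: {hard_share_balanced:.0%}")
```

In this toy pool, the balanced set contains far more hard cases than the pool itself, which is the kind of shift in case mix that can lower measured performance relative to a random sample.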

X-Ray Detection Musculoskeletal Retrospective Clinical In Silico Academic Lab Open Dataset
