
A deep active learning framework for mitotic figure detection with minimal manual annotation and labelling.

Liu E, Lin A, Kakodkar P, Zhao Y, Wang B, Ling C, Zhang Q

PubMed · Jul 3, 2025
Accurately and efficiently identifying mitotic figures (MFs) is crucial for diagnosing and grading various cancers, including glioblastoma (GBM), a highly aggressive brain tumour requiring precise and timely intervention. Traditional manual counting of MFs in whole slide images (WSIs) is labour-intensive and prone to interobserver variability. Our study introduces a deep active learning framework that addresses these challenges with minimal human intervention. We utilized a dataset of GBM WSIs from The Cancer Genome Atlas (TCGA). Our framework integrates convolutional neural networks (CNNs) with an active learning strategy. Initially, a CNN is trained on a small, annotated dataset. The framework then identifies uncertain samples from the unlabelled data pool, which are subsequently reviewed by experts. These ambiguous cases are verified and used for model retraining. This iterative process continues until the model achieves satisfactory performance. Our approach achieved 81.75% precision and 82.48% recall for MF detection. For MF subclass classification, it attained an accuracy of 84.1%. Furthermore, this approach significantly reduced annotation time to approximately 900 minutes across 66 WSIs, nearly halving the effort required by traditional methods. Our deep active learning framework demonstrates a substantial improvement in both efficiency and accuracy for MF detection and classification in GBM WSIs. By reducing reliance on large annotated datasets, it minimizes manual effort while maintaining high performance. This methodology can be generalized to other medical imaging tasks, supporting broader applications in the healthcare domain.
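As a hedged illustration of the uncertainty-driven loop the abstract describes (not the authors' code), the following sketch uses a scikit-learn classifier in place of the CNN and simulates expert review by revealing held-out ground-truth labels; all data, batch sizes, and thresholds are synthetic assumptions.

```python
# Toy uncertainty-sampling active learning loop (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic MF / non-MF labels

labeled = list(range(20))  # small seed annotation set
pool = [i for i in range(len(X)) if i not in labeled]

for round_id in range(5):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    # Least-confidence sampling: query cases closest to the decision boundary.
    order = np.argsort(np.abs(proba - 0.5))
    query = [pool[i] for i in order[:25]]
    labeled += query  # the simulated "expert" verifies these cases
    pool = [i for i in pool if i not in query]
    print(f"round {round_id}: {len(labeled)} labeled samples")
```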

Interpretable and generalizable deep learning model for preoperative assessment of microvascular invasion and outcome in hepatocellular carcinoma based on MRI: a multicenter study.

Dong X, Jia X, Zhang W, Zhang J, Xu H, Xu L, Ma C, Hu H, Luo J, Zhang J, Wang Z, Ji W, Yang D, Yang Z

PubMed · Jul 3, 2025
This study aimed to develop an interpretable, domain-generalizable deep learning model for microvascular invasion (MVI) assessment in hepatocellular carcinoma (HCC). Utilizing a retrospective dataset of 546 HCC patients from five centers, we developed and validated a clinical-radiological model and deep learning models aimed at MVI prediction. The models were developed on a dataset of 263 cases consisting of data from three centers, internally validated on a set of 66 patients, and externally tested on two independent sets. An adversarial network-based deep learning (AD-DL) model was developed to learn domain-invariant features from the multiple centers within the training set. The area under the receiver operating characteristic curve (AUC) was calculated using pathological MVI status. With the best-performing model, early recurrence-free survival (ERFS) stratification was validated on the external test set by the log-rank test, and differentially expressed genes (DEGs) associated with MVI status were identified from the RNA sequencing analysis of The Cancer Imaging Archive. The AD-DL model demonstrated the highest diagnostic performance and generalizability, with an AUC of 0.793 in the internal test set, 0.801 in external test set 1, and 0.773 in external test set 2. The model's prediction of MVI status also demonstrated a significant correlation with ERFS (p = 0.048). DEGs associated with MVI status were primarily enriched in metabolic processes, the Wnt signaling pathway, and the epithelial-mesenchymal transition process. The AD-DL model allows preoperative MVI prediction and ERFS stratification in HCC patients, with good generalizability and biological interpretability. The adversarial network-based deep learning model predicts MVI status well in HCC patients and demonstrates good generalizability. By integrating bioinformatics analysis of the model's predictions, it achieves biological interpretability, facilitating its clinical translation. Current MVI assessment models for HCC lack interpretability and generalizability. The adversarial network-based model's performance surpassed that of the clinical-radiological and squeeze-and-excitation network-based models. Biological function analysis was employed to enhance the interpretability and clinical translatability of the adversarial network-based model.
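The abstract does not detail the AD-DL architecture; a common way to realize adversarial domain-invariant feature learning is the DANN-style gradient reversal layer sketched below in PyTorch. The layer sizes, heads, and three-center domain labels are illustrative assumptions, not the paper's design.

```python
# Gradient-reversal sketch for adversarial domain-invariant features.
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the feature
        # extractor, pushing it toward features the domain head cannot separate.
        return -ctx.lambd * grad_output, None

features = nn.Sequential(nn.Linear(256, 128), nn.ReLU())
mvi_head = nn.Linear(128, 2)     # MVI present / absent
domain_head = nn.Linear(128, 3)  # which of the three training centers

x = torch.randn(8, 256)          # stand-in for MRI-derived features
f = features(x)
mvi_logits = mvi_head(f)
domain_logits = domain_head(GradReverse.apply(f, 1.0))
```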

Can a Whole-Thyroid-Based CT Radiomics Model Achieve the Performance of a Lesion-Based Model in Predicting Thyroid Nodule Malignancy? A Comparative Study.

Yuan W, Wu J, Mai W, Li H, Li Z

PubMed · Jul 3, 2025
Machine learning is now extensively implemented in medical imaging for preoperative risk stratification and post-therapeutic outcome assessment, enhancing clinical decision-making. Numerous studies have focused on predicting whether thyroid nodules are benign or malignant using a nodule-based approach, which is time-consuming and inefficient and overlooks the impact of the peritumoral region. To evaluate the effectiveness of using the whole thyroid as the region of interest (ROI) in differentiating between benign and malignant thyroid nodules, exploring the potential application value of the entire thyroid. This study enrolled 1121 patients with thyroid nodules between February 2017 and May 2023. All participants underwent contrast-enhanced CT scans prior to surgical intervention. Radiomics features were extracted from arterial phase images, and feature dimensionality reduction was performed using the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm. Four machine learning models were trained on the selected features within the training cohort and subsequently evaluated on the independent validation cohort. The diagnostic performance of whole-thyroid versus nodule-based radiomics models was compared through receiver operating characteristic (ROC) curve analysis and area under the curve (AUC) metrics. The nodule-based logistic regression model achieved an AUC of 0.81 in the validation set, with sensitivity, specificity, and accuracy of 78.6%, 69.4%, and 75.6%, respectively. The whole-thyroid-based random forest model attained an AUC of 0.80, with sensitivity, specificity, and accuracy of 90.0%, 51.9%, and 80.1%, respectively. The AUC differences for the LR, DT, RF, and SVM models were approximately -2.47%, 0.00%, -4.76%, and -4.94%, respectively. The DeLong test showed no significant differences among the four machine learning models regarding the region of interest defined by either the thyroid primary lesion or the whole thyroid. There was no significant difference in distinguishing between benign and malignant thyroid nodules using either a nodule-based or whole-thyroid-based strategy for ROI outlining. We hypothesize that the whole-thyroid approach provides enhanced diagnostic capability for detecting papillary thyroid carcinomas (PTCs) with ill-defined margins.
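A minimal sketch of the pipeline the abstract outlines, LASSO feature selection followed by classifier comparison via AUC, using synthetic stand-ins for the arterial-phase radiomics features; the paper's actual preprocessing and hyperparameters are not reproduced here.

```python
# LASSO selection + classifier AUC comparison on synthetic radiomics data.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1121, 100))  # stand-in radiomics features
# Synthetic labels driven by two features so LASSO has signal to find.
y = (X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.5, size=1121) > 0).astype(int)

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=1)

# LASSO on the binary label is used here as a selection heuristic,
# a common shortcut in radiomics pipelines.
lasso = LassoCV(cv=5).fit(X_tr, y_tr)
keep = np.flatnonzero(lasso.coef_)  # features surviving L1 shrinkage

for name, model in [("LR", LogisticRegression(max_iter=1000)),
                    ("RF", RandomForestClassifier(random_state=1))]:
    model.fit(X_tr[:, keep], y_tr)
    auc = roc_auc_score(y_va, model.predict_proba(X_va[:, keep])[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```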

De-speckling of medical ultrasound image using metric-optimized knowledge distillation.

Khalifa M, Hamza HM, Hosny KM

PubMed · Jul 3, 2025
Ultrasound imaging provides real-time views of internal organs, which are essential for accurate diagnosis and treatment. However, speckle noise, caused by wave interactions with tissues, creates a grainy texture that hides crucial details. This noise varies with image intensity, which limits the effectiveness of traditional denoising methods. We introduce the Metric-Optimized Knowledge Distillation (MK) model, a deep-learning approach that utilizes Knowledge Distillation (KD) for denoising ultrasound images. Our method transfers knowledge from a high-performing teacher network to a smaller student network designed for this task. By leveraging KD, the model removes speckle noise while preserving key anatomical details needed for accurate diagnosis. A key innovation of our paper is the metric-guided training strategy: the evaluation metrics used to assess the model are computed repeatedly during training and incorporated into the loss function, enabling the model to optimally reduce noise and enhance image quality. We evaluate our proposed method against state-of-the-art despeckling techniques, including DnCNN and other recent models. The results demonstrate that our approach achieves superior noise reduction and image quality preservation, making it a valuable tool for enhancing the diagnostic utility of ultrasound images.
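The metric-guided idea can be illustrated with a hedged PyTorch sketch that combines a teacher-matching distillation term with a PSNR-derived quality term; the specific metrics, weights, and normalization below are assumptions, not the MK model's published loss.

```python
# Illustrative composite loss: distillation term + metric-derived term.
import torch
import torch.nn.functional as F

def mk_style_loss(student_out, teacher_out, reference, alpha=0.5, max_val=1.0):
    distill = F.mse_loss(student_out, teacher_out)  # match the teacher
    mse_ref = F.mse_loss(student_out, reference)
    psnr = 10.0 * torch.log10(max_val ** 2 / (mse_ref + 1e-8))
    # Turn the quality metric into a penalty so lower loss = higher PSNR;
    # the 50.0 scale is an arbitrary normalization for this toy example.
    return alpha * distill + (1.0 - alpha) * (-psnr / 50.0)

student_out = torch.rand(2, 1, 64, 64, requires_grad=True)
teacher_out = torch.rand(2, 1, 64, 64)
reference = torch.rand(2, 1, 64, 64)
loss = mk_style_loss(student_out, teacher_out, reference)
loss.backward()
```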

Cross-validation of an artificial intelligence tool for fracture classification and localization on conventional radiography in Dutch population.

Ruitenbeek HC, Sahil S, Kumar A, Kushawaha RK, Tanamala S, Sathyamurthy S, Agrawal R, Chattoraj S, Paramasamy J, Bos D, Fahimi R, Oei EHG, Visser JJ

PubMed · Jul 3, 2025
The aim of this study is to validate, in a Dutch medical center, the effectiveness of an AI tool trained on Indian data, and to assess its ability to classify and localize fractures. Conventional radiographs acquired between January 2019 and November 2022 were analyzed using a multitask deep neural network. The tool, trained on Indian data, identified and localized fractures in 17 body parts. The reference standard was based on radiology reports resulting from routine clinical workflow and confirmed by an experienced musculoskeletal radiologist. The analysis included both patient-wise and fracture-wise evaluations, employing binary and Intersection over Union (IoU) metrics to assess fracture detection and localization accuracy. In total, 14,311 radiographs (median age, 48 years (range 18-98), 7265 male) were analyzed and categorized by body part: clavicle, shoulder, humerus, elbow, forearm, wrist, hand and finger, pelvis, hip, femur, knee, lower leg, ankle, foot and toe. 4156/14,311 (29%) had fractures. The AI tool demonstrated overall patient-wise sensitivity, specificity, and AUC of 87.1% (95% CI: 86.1-88.1%), 87.1% (95% CI: 86.4-87.7%), and 0.92 (95% CI: 0.91-0.93), respectively. The fracture detection rate was 60% overall, ranging from 7% for rib fractures to 90% for clavicle fractures. This study validates a fracture detection AI tool, originally trained on Indian data, on a Western-European dataset. While classification performance is robust on real clinical data, fracture-wise analysis reveals variability in localization accuracy, underscoring the need for refinement in fracture localization. AI may help by enabling optimal use of limited resources and personnel. This study evaluates an AI tool designed to aid in detecting fractures, possibly reducing reading time and optimizing radiology workflow by prioritizing fracture-positive cases. Cross-validation on a consecutive Dutch cohort confirms this AI tool's clinical robustness. The tool detected fractures with 87% sensitivity, 87% specificity, and 0.92 AUC. AI localized 60% of fractures, with the highest rate for the clavicle (90%) and the lowest for the ribs (7%).
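The two evaluation views used here, patient-wise classification metrics and fracture-wise Intersection over Union, can be made concrete with a short sketch on synthetic predictions and boxes.

```python
# Patient-wise sensitivity/specificity and box-level IoU on toy data.
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

y_true = np.array([1, 1, 0, 0, 1, 0])  # radiologist-confirmed fracture
y_pred = np.array([1, 0, 0, 1, 1, 0])  # AI patient-wise call
tp = np.sum((y_true == 1) & (y_pred == 1))
tn = np.sum((y_true == 0) & (y_pred == 0))
sensitivity = tp / np.sum(y_true == 1)
specificity = tn / np.sum(y_true == 0)
print(sensitivity, specificity, iou((10, 10, 50, 50), (20, 20, 60, 60)))
```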

Integrating MobileNetV3 and SqueezeNet for Multi-class Brain Tumor Classification.

Kantu S, Kaja HS, Kukkala V, Aly SA, Sayed K

PubMed · Jul 3, 2025
Brain tumors pose a critical health threat requiring timely and accurate classification for effective treatment. Traditional MRI analysis is labor-intensive and prone to variability, necessitating reliable automated solutions. This study explores lightweight deep learning models for multi-class brain tumor classification across four categories: glioma, meningioma, pituitary tumors, and no tumor. We investigate the performance of MobileNetV3 and SqueezeNet individually, and a feature-fusion hybrid model that combines their embedding layers. We utilized a publicly available MRI dataset containing 7023 images with a consistent internal split (65% training, 17% validation, 18% test) to ensure reliable evaluation. MobileNetV3 offers deep semantic understanding through its expressive features, while SqueezeNet provides minimal computational overhead. Their feature-level integration creates a balanced approach between diagnostic accuracy and deployment efficiency. Experiments conducted with consistent hyperparameters and preprocessing showed that MobileNetV3 achieved the highest test accuracy (99.31%) while maintaining a low parameter count (3.47M), making it suitable for real-world deployment. Grad-CAM visualizations were employed for model explainability, highlighting tumor-relevant regions and helping visualize the specific areas contributing to predictions. Our proposed models outperform several baseline architectures such as VGG16 and InceptionV3, achieving high accuracy with significantly fewer parameters. These results demonstrate that well-optimized lightweight networks can deliver accurate and interpretable brain tumor classification.
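A hedged sketch of the feature-fusion idea using torchvision backbones: pooled embeddings from MobileNetV3-Small and SqueezeNet1.1 are concatenated and fed to a four-class head. The exact variants, pooling, and head used in the paper may differ.

```python
# Feature-level fusion of two lightweight backbones (illustrative).
import torch
from torch import nn
from torchvision import models

class FusionNet(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        self.mobile = models.mobilenet_v3_small(weights=None).features
        self.squeeze = models.squeezenet1_1(weights=None).features
        self.pool = nn.AdaptiveAvgPool2d(1)
        # mobilenet_v3_small features end at 576 channels, squeezenet1_1 at 512.
        self.head = nn.Linear(576 + 512, num_classes)

    def forward(self, x):
        a = self.pool(self.mobile(x)).flatten(1)
        b = self.pool(self.squeeze(x)).flatten(1)
        return self.head(torch.cat([a, b], dim=1))

logits = FusionNet()(torch.randn(2, 3, 224, 224))  # logits shape: (2, 4)
```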

Content-based X-ray image retrieval using fusion of local neighboring patterns and deep features for lung disease detection.

Prakash A, Singh VP

PubMed · Jul 3, 2025
This paper introduces a Content-Based Medical Image Retrieval (CBMIR) system for detecting and retrieving lung disease cases to assist doctors and radiologists in clinical decision-making. The system combines texture-based features using Local Binary Patterns (LBP) with deep learning-based features extracted from pretrained CNN models, including VGG-16, DenseNet121, and InceptionV3. The objective is to identify the optimal fusion of texture and deep features to enhance image retrieval performance. Various similarity measures, including Euclidean and Manhattan distances and cosine similarity, were evaluated, with cosine similarity performing best, achieving an average precision of 65.5%. For COVID-19 cases, VGG-16 achieved a precision of 52.5%, while LBP performed best for the normal class with 85% precision. The fusion of LBP, VGG-16, and DenseNet121 excelled in pneumonia cases, with a precision of 93.5%. Overall, VGG-16 delivered the highest average precision of 74.0% across all classes, followed by LBP at 72.0%. The fusion of texture (LBP) and deep features from all CNN models achieved 86% accuracy for the retrieval of the top 10 images, supporting healthcare professionals in making more informed clinical decisions.
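The retrieval mechanics, an LBP texture histogram concatenated with a deep embedding and ranked by cosine similarity, can be sketched as below; the deep features here are random stand-ins rather than actual VGG-16 or DenseNet121 outputs, and the patch sizes are arbitrary.

```python
# LBP + deep-feature fusion with cosine-similarity retrieval (toy data).
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(img, P=8, R=1):
    codes = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

rng = np.random.default_rng(2)
images = (rng.random((50, 64, 64)) * 255).astype(np.uint8)  # stand-in patches
deep = rng.normal(size=(50, 128))                           # stand-in embeddings
texture = np.array([lbp_histogram(im) for im in images])
feats = np.hstack([texture, deep])
feats /= np.linalg.norm(feats, axis=1, keepdims=True)       # unit-normalize

query = feats[0]
scores = feats @ query                 # cosine similarity to the query image
top10 = np.argsort(scores)[::-1][1:11]  # best matches, excluding the query itself
print(top10)
```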

Predicting Ten-Year Clinical Outcomes in Multiple Sclerosis with Radiomics-Based Machine Learning Models.

Tranfa M, Petracca M, Cuocolo R, Ugga L, Morra VB, Carotenuto A, Elefante A, Falco F, Lanzillo R, Moccia M, Scaravilli A, Brunetti A, Cocozza S, Quarantelli M, Pontillo G

PubMed · Jul 3, 2025
Identifying patients with multiple sclerosis (pwMS) at higher risk of clinical progression is essential to inform clinical management. We aimed to build prognostic models using machine learning (ML) algorithms predicting long-term clinical outcomes based on a systematic mapping of volumetric, radiomic, and macrostructural disconnection features from routine brain MRI scans of pwMS. In this longitudinal monocentric study, 3T structural MRI scans of pwMS were retrospectively analyzed. Based on a ten-year clinical follow-up (average duration = 9.4 ± 1.1 years), patients were classified according to confirmed disability progression (CDP) and cognitive impairment (CI), as assessed through the Expanded Disability Status Scale (EDSS) and the Brief International Cognitive Assessment of Multiple Sclerosis (BICAMS) battery, respectively. 3D-T1w and FLAIR images were automatically segmented to obtain volumes, disconnection scores (estimated based on lesion masks and normative tractography data), and radiomic features from 116 gray matter regions defined according to the Automated Anatomical Labelling (AAL) atlas. Three ML algorithms (Extra Trees, Logistic Regression, and Support Vector Machine) were used to build models predicting long-term CDP and CI based on MRI-derived features. Feature selection was performed on the training set with a multi-step process, and models were validated with a holdout approach, randomly splitting the patients into training (75%) and test (25%) sets. We studied 177 pwMS (M/F = 51/126; mean ± SD age: 35.2 ± 8.7 years). Long-term CDP and CI were observed in 71 and 55 patients, respectively. For the CDP prediction analysis, feature selection identified 13-, 12-, and 10-feature subsets, yielding test-set accuracies of 0.71, 0.69, and 0.67 for the Extra Trees, Logistic Regression, and Support Vector Machine classifiers, respectively. Similarly, for the CI prediction, subsets of 16, 17, and 19 features were selected, with test-set accuracies of 0.69, 0.64, and 0.62, respectively. There were no significant differences in accuracy between ML models for CDP (p = 0.65) or CI (p = 0.31). Building on quantitative features derived from conventional MRI scans, we obtained long-term prognostic models, potentially informing patient stratification and clinical decision-making. MS, multiple sclerosis; pwMS, people with MS; HC, healthy controls; ML, machine learning; DD, disease duration; EDSS, Expanded Disability Status Scale; TLV, total lesion volume; CDP, confirmed disability progression; CI, cognitive impairment; BICAMS, Brief International Cognitive Assessment of Multiple Sclerosis.
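A minimal sketch of the validation scheme described, a 75/25 holdout split with the three classifiers compared on test-set accuracy, using synthetic stand-ins for the selected MRI-derived features.

```python
# 75/25 holdout comparison of three classifiers on synthetic features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
X = rng.normal(size=(177, 13))    # stand-in for the selected feature subset
y = rng.integers(0, 2, size=177)  # stand-in CDP outcome (yes/no)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.75, random_state=3)
for name, clf in [("Extra Trees", ExtraTreesClassifier(random_state=3)),
                  ("Logistic Regression", LogisticRegression(max_iter=1000)),
                  ("SVM", SVC())]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```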

Recent Advances in Applying Machine Learning to Proton Radiotherapy.

Wildman VL, Wynne J, Momin S, Kesarwala AH, Yang X

PubMed · Jul 3, 2025
In radiation oncology, precision and timeliness in both planning and treatment are paramount to patient care. Machine learning has increasingly been applied to various aspects of photon radiotherapy to reduce manual error and improve the efficiency of clinical decision making; by comparison, applications to proton therapy remain an emerging field. This systematic review aims to comprehensively cover all current and potential applications of machine learning in the proton therapy clinical workflow, an area that has not been extensively explored in the literature. PubMed and Embase were utilized to identify studies pertinent to machine learning in proton therapy from 2019 to 2024. An initial search on PubMed was made with the search strategy "'proton therapy', 'machine learning', 'deep learning'". A subsequent search on Embase was made with "("proton therapy") AND ("machine learning" OR "deep learning")". In total, 38 relevant studies were summarized and incorporated. It is observed that U-Net architectures are prevalent in the patient pre-screening process, while convolutional neural networks play an important role in dose and range prediction. Both image quality improvement and transformation between modalities to decrease extraneous radiation are popular targets of various models. To adaptively improve treatments, advanced architectures such as general deep inception or deep cascaded convolutional neural networks improve online dose verification and range monitoring. With the rising clinical usage of proton therapy, machine learning models have been increasingly proposed to facilitate both treatment and discovery. By significantly improving patient screening, planning, image quality, and dose and range calculation, machine learning is advancing the precision and personalization of proton therapy.

A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation.

Jin H, Che H, He S, Chen H

PubMed · Jul 3, 2025
Despite the progress of radiology report generation (RRG), existing works face two challenges: 1) performance in terms of clinical efficacy is unsatisfactory, especially for descriptions of lesion attributes; 2) the generated text lacks explainability, making it difficult for radiologists to trust the results. To address these challenges, we focus on a trustworthy RRG model, which not only generates accurate descriptions of abnormalities but also provides the basis for its predictions. To this end, we propose a framework named chain of diagnosis (CoD), which maintains a chain of diagnostic steps for clinically accurate and explainable RRG. It first generates question-answer (QA) pairs via diagnostic conversation to extract key findings, then prompts a large language model with the QA diagnoses for accurate generation. To enhance explainability, a diagnosis grounding module is designed to match QA diagnoses and generated sentences, where the diagnoses act as a reference. Moreover, a lesion grounding module is designed to locate abnormalities in the image, further improving the working efficiency of radiologists. To facilitate label-efficient training, we propose an omni-supervised learning strategy with clinical consistency to leverage various types of annotations from different datasets. Our efforts lead to 1) an omni-labeled RRG dataset with QA pairs and lesion boxes; 2) an evaluation tool for assessing the accuracy of reports in describing lesion location and severity; and 3) extensive experiments demonstrating the effectiveness of CoD, which consistently outperforms both specialist and generalist models on two RRG benchmarks and shows promising explainability by accurately grounding generated sentences to QA diagnoses and images.
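As a toy illustration of the chain-of-diagnosis idea, the sketch below folds extracted QA findings into a report-generation prompt; the fields and wording are hypothetical, and the paper's QA extraction and grounding modules are learned components rather than string templates.

```python
# Hypothetical prompt assembly for QA-conditioned report generation.
qa_diagnoses = [
    ("Is there a lesion in the left lower lobe?", "Yes, a 2 cm opacity."),
    ("Is pleural effusion present?", "No."),
]

def build_report_prompt(qa_pairs):
    # Collect the verified QA findings, then ask the generator to write a
    # report whose sentences stay grounded in those findings.
    lines = ["Findings established by diagnostic QA:"]
    lines += [f"Q: {q}\nA: {a}" for q, a in qa_pairs]
    lines.append("Write a radiology report consistent with these findings, "
                 "citing the QA item supporting each sentence.")
    return "\n".join(lines)

print(build_report_prompt(qa_diagnoses))
```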