
Diffusion-based Counterfactual Augmentation: Towards Robust and Interpretable Knee Osteoarthritis Grading

Zhe Wang, Yuhua Ru, Aladine Chetouani, Tina Shiang, Fang Chen, Fabian Bauer, Liping Zhang, Didier Hans, Rachid Jennane, William Ewing Palmer, Mohamed Jarraya, Yung Hsin Chen

arXiv preprint · Jun 18, 2025
Automated grading of Knee Osteoarthritis (KOA) from radiographs is challenged by significant inter-observer variability and the limited robustness of deep learning models, particularly near critical decision boundaries. To address these limitations, this paper proposes a novel framework, Diffusion-based Counterfactual Augmentation (DCA), which enhances model robustness and interpretability by generating targeted counterfactual examples. The method navigates the latent space of a diffusion model using a Stochastic Differential Equation (SDE) that balances a classifier-informed boundary drive against a manifold constraint. The resulting counterfactuals are then used within a self-corrective learning strategy that improves the classifier by focusing on its specific areas of uncertainty. Extensive experiments on the public Osteoarthritis Initiative (OAI) and Multicenter Osteoarthritis Study (MOST) datasets demonstrate that this approach significantly improves classification accuracy across multiple model architectures. Furthermore, the method provides interpretability by visualizing minimal pathological changes and by revealing that the learned latent-space topology aligns with clinical knowledge of KOA progression. The DCA framework effectively converts model uncertainty into a robust training signal, offering a promising pathway toward more accurate and trustworthy automated diagnostic systems. Our code is available at https://github.com/ZWang78/DCA.
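
The SDE-guided navigation can be pictured with a short sketch. This is a minimal illustration, not the authors' implementation: `score_model`, `classifier`, the weights `lam_boundary`/`lam_manifold`, and the Euler-Maruyama update are all assumptions standing in for the paper's actual components.

```python
import torch

def dca_sde_step(x, t, score_model, classifier, target_grade,
                 lam_boundary=1.0, lam_manifold=1.0, dt=0.01):
    """One Euler-Maruyama update: a classifier-informed drive pushes the
    latent toward `target_grade` while the diffusion score keeps it on
    the data manifold (hypothetical components throughout)."""
    x = x.detach().requires_grad_(True)

    # Boundary drive: gradient of the log-probability of the target KOA grade.
    log_probs = torch.log_softmax(classifier(x), dim=1)
    boundary_grad = torch.autograd.grad(log_probs[:, target_grade].sum(), x)[0]

    # Manifold constraint: the diffusion model's score (gradient of log-density).
    with torch.no_grad():
        score = score_model(x, t)

    drift = lam_boundary * boundary_grad + lam_manifold * score
    noise = torch.randn_like(x) * (2 * dt) ** 0.5   # stochastic diffusion term
    return (x + drift * dt + noise).detach()
```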

D2Diff: A Dual Domain Diffusion Model for Accurate Multi-Contrast MRI Synthesis

Sanuwani Dayarathna, Himashi Peiris, Kh Tohidul Islam, Tien-Tsin Wong, Zhaolin Chen

arXiv preprint · Jun 18, 2025
Multi-contrast MRI synthesis is inherently challenging due to the complex and nonlinear relationships among different contrasts. Each MRI contrast highlights unique tissue properties, but their complementary information is difficult to exploit because of variations in intensity distributions and contrast-specific textures. Existing methods for multi-contrast MRI synthesis primarily utilize spatial-domain features, which capture localized anatomical structures but struggle to model global intensity variations and distributed patterns. Conversely, frequency-domain features provide structured inter-contrast correlations but lack spatial precision, limiting their ability to retain finer details. To address this, we propose a dual-domain learning framework that integrates spatial- and frequency-domain information across multiple MRI contrasts for enhanced synthesis. Our method employs two mutually trained denoising networks, one conditioned on spatial-domain and the other on frequency-domain contrast features, coupled through a shared critic network. Additionally, an uncertainty-driven mask loss directs the model's focus toward more critical regions, further improving synthesis accuracy. Extensive experiments show that our method outperforms SOTA baselines, and the downstream segmentation performance highlights the diagnostic value of the synthetic results.
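
To make the dual-domain idea concrete, here is a minimal sketch (ours, not the authors') of how a frequency-domain conditioning signal might be derived from a contrast image; the log-magnitude/phase split and channel layout are assumptions.

```python
import torch

def frequency_features(img):
    """Illustrative frequency-domain representation of an MRI contrast:
    a 2-D FFT split into log-magnitude and phase channels, which a
    denoising network could consume alongside the spatial-domain image."""
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))  # centered spectrum
    log_mag = torch.log1p(spec.abs())     # global intensity structure
    phase = torch.angle(spec)             # structural layout
    return torch.cat([log_mag, phase], dim=1)   # (B, 2C, H, W)

x = torch.randn(1, 1, 256, 256)   # toy single-contrast slice
print(frequency_features(x).shape)  # torch.Size([1, 2, 256, 256])
```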

Deep learning model using CT images for longitudinal prediction of benign and malignant ground-glass nodules.

Yang X, Wang J, Wang P, Li Y, Wen Z, Shang J, Chen K, Tang C, Liang S, Meng W

PubMed paper · Jun 18, 2025
To develop and validate a CT image-based, multiple time-series deep learning model for the longitudinal prediction of benign and malignant pulmonary ground-glass nodules (GGNs). A total of 486 GGNs from an equal number of patients at two medical centers were included. Each nodule was surgically removed and confirmed pathologically. Patients were randomly assigned to training, validation, and test sets in a 7:2:1 ratio. We established a transformer-based deep learning framework that leverages multi-temporal CT images for the longitudinal prediction of GGNs, focusing on distinguishing benign from malignant types. Additionally, we used 13 machine learning algorithms to build clinical models, delta-radiomics models, and combined models merging deep learning with CT semantic features. Predictive performance was assessed using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The multiple time-series deep learning model surpassed both the clinical and delta-radiomics models, showing strong predictive performance for GGNs across the training, validation, and test sets, with AUCs of 0.911 (95% CI, 0.879-0.939), 0.809 (95% CI, 0.715-0.908), and 0.817 (95% CI, 0.680-0.937), respectively. Furthermore, the models integrating deep learning with CT semantic features achieved the highest performance, with AUCs of 0.960 (95% CI, 0.912-0.977), 0.878 (95% CI, 0.801-0.942), and 0.890 (95% CI, 0.790-0.968). The multiple time-series deep learning model utilizing CT images was effective in predicting benign and malignant GGNs.
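
The abstract does not publish the architecture, but a multi-time-series transformer of the kind described typically embeds each follow-up scan with a shared encoder and attends across time points. A hypothetical sketch, with all layer sizes assumed:

```python
import torch
import torch.nn as nn

class LongitudinalGGNClassifier(nn.Module):
    """Illustrative design: each follow-up CT patch is embedded by a
    shared CNN, and a transformer encoder models change across time
    points before a benign/malignant head."""
    def __init__(self, d_model=128, n_timepoints=3):
        super().__init__()
        self.encoder = nn.Sequential(              # shared per-scan CNN
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, d_model),
        )
        self.time_embed = nn.Parameter(torch.zeros(n_timepoints, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 2)          # benign vs malignant

    def forward(self, scans):                      # scans: (B, T, 1, H, W)
        B, T = scans.shape[:2]
        tokens = self.encoder(scans.flatten(0, 1)).view(B, T, -1)
        tokens = self.transformer(tokens + self.time_embed)
        return self.head(tokens.mean(dim=1))       # pool over time points
```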

MDEANet: A multi-scale deep enhanced attention net for popliteal fossa segmentation in ultrasound images.

Chen F, Fang W, Wu Q, Zhou M, Guo W, Lin L, Chen Z, Zou Z

PubMed paper · Jun 18, 2025
Popliteal sciatic nerve block is a widely used technique for lower limb anesthesia. However, despite ultrasound guidance, the complex anatomical structures of the popliteal fossa can present challenges, potentially leading to complications. To accurately identify the bifurcation of the sciatic nerve for nerve blockade, we propose MDEANet, a deep learning-based segmentation network designed for the precise localization of nerves, muscles, and arteries in ultrasound images of the popliteal region. MDEANet incorporates Cascaded Multi-scale Atrous Convolutions (CMAC) to enhance multi-scale feature extraction, Enhanced Spatial Attention Mechanism (ESAM) to focus on key anatomical regions, and Cross-level Feature Fusion (CLFF) to improve contextual representation. This integration markedly improves segmentation of nerves, muscles, and arteries. Experimental results demonstrate that MDEANet achieves an average Intersection over Union (IoU) of 88.60% and a Dice coefficient of 93.95% across all target structures, outperforming state-of-the-art models by 1.68% in IoU and 1.66% in Dice coefficient. Specifically, for nerve segmentation, the Dice coefficient reaches 93.31%, underscoring the effectiveness of our approach. MDEANet has the potential to provide decision-support assistance for anesthesiologists, thereby enhancing the accuracy and efficiency of ultrasound-guided nerve blockade procedures.
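
Cascaded multi-scale atrous convolutions are a known pattern, though the paper's exact CMAC design is not reproduced here; below is a minimal sketch under assumed dilation rates and a residual fusion.

```python
import torch
import torch.nn as nn

class CMACBlock(nn.Module):
    """A sketch of cascaded multi-scale atrous convolutions: parallel
    dilated 3x3 branches whose receptive fields grow with the dilation
    rate, fused back to the input width with a residual connection."""
    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        feats = [self.act(b(x)) for b in self.branches]   # multi-scale context
        return self.act(self.fuse(torch.cat(feats, dim=1)) + x)
```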

Multimodal MRI Marker of Cognition Explains the Association Between Cognition and Mental Health in UK Biobank

Buianova, I., Silvestrin, M., Deng, J., Pat, N.

medRxiv preprint · Jun 18, 2025
Background: Cognitive dysfunction often co-occurs with psychopathology. Advances in neuroimaging and machine learning have produced neural indicators that predict individual differences in cognition with reasonable performance. We examined whether these neural indicators explain the relationship between cognition and mental health in the UK Biobank cohort (n > 14,000). Methods: Using machine learning, we quantified the covariation between general cognition and 133 mental health indices and derived neural indicators of cognition from 72 neuroimaging phenotypes across diffusion-weighted MRI (dwMRI), resting-state functional MRI (rsMRI), and structural MRI (sMRI). With commonality analyses, we investigated how much of the cognition-mental health covariation is captured by each neural indicator and by neural indicators combined within and across MRI modalities. Results: The predictive association between mental health and cognition reached an out-of-sample r = 0.3. Individual neuroimaging phenotypes captured 2.1% to 25.8% of the cognition-mental health covariation. The highest proportion of variance explained by dwMRI was attributed to the number of streamlines connecting cortical regions (19.3%); by rsMRI, to functional connectivity between 55 large-scale networks (25.8%); and by sMRI, to the volumetric characteristics of subcortical structures (21.8%). Combining neuroimaging phenotypes within modalities improved the explanation to 25.5% for dwMRI, 29.8% for rsMRI, and 31.6% for sMRI, and combining them across all MRI modalities raised it to 48%. Conclusions: We present an integrated approach to deriving multimodal MRI markers of cognition that can be transdiagnostically linked to psychopathology. This demonstrates that the predictive ability of neural indicators extends beyond the prediction of cognition itself, enabling us to capture the cognition-mental health covariation.
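
Two-set commonality analysis has a standard variance-partitioning form; the sketch below shows the arithmetic for a single neural indicator, with all variable names hypothetical and ordinary linear regression assumed as the simplest case (the study's actual pipeline is more elaborate).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def r2(X, y):
    """In-sample R-squared of a linear fit (illustrative only)."""
    return LinearRegression().fit(X, y).score(X, y)

def commonality(cognition, mental_health, neural_indicator):
    """Partition the variance in cognition explained by mental health
    into portions unique to, and shared with, a neural indicator."""
    mh = mental_health.reshape(-1, 1)
    ni = neural_indicator.reshape(-1, 1)
    both = np.hstack([mh, ni])
    r2_mh, r2_ni, r2_both = r2(mh, cognition), r2(ni, cognition), r2(both, cognition)
    common = r2_mh + r2_ni - r2_both            # shared variance component
    return {
        "unique_mental_health": r2_both - r2_ni,
        "unique_neural": r2_both - r2_mh,
        "common": common,
        "proportion_captured": common / r2_mh,  # share of the cognition-MH link
    }
```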

Comparative analysis of transformer-based deep learning models for glioma and meningioma classification.

Nalentzi K, Gerogiannis K, Bougias H, Stogiannos N, Papavasileiou P

PubMed paper · Jun 18, 2025
This study compares the classification accuracy of transformer-based deep learning models (ViT and BEiT) on brain MRIs of gliomas and meningiomas through a feature-driven approach. Meta's Segment Anything Model was used for semi-automatic segmentation, yielding a fully neural network-based workflow for this classification task. ViT and BEiT models were fine-tuned on a publicly available brain MRI dataset. Glioma/meningioma cases (625/507) were used for training and 520 cases (260/260, gliomas/meningiomas) for testing. The deep radiomic features extracted from ViT and BEiT underwent normalization, dimensionality reduction based on the Pearson correlation coefficient (PCC), and feature selection using analysis of variance (ANOVA). A multi-layer perceptron (MLP) with 1 hidden layer of 100 units, rectified linear unit activation, and the Adam optimizer was utilized. Hyperparameter tuning was performed via 5-fold cross-validation. The ViT model achieved the highest AUC on the validation dataset using 7 features, yielding an AUC of 0.985 and an accuracy of 0.952. On the independent testing dataset, the model exhibited an AUC of 0.962 and an accuracy of 0.904. The BEiT model yielded an AUC of 0.939 and an accuracy of 0.871 on the testing dataset. This study demonstrates the effectiveness of transformer-based models, especially ViT, for glioma and meningioma classification, achieving high AUC scores and accuracy. However, the study is limited by the use of a single dataset, which may affect generalizability. Future work should focus on expanding datasets and further optimizing models to improve performance and applicability across different institutions. This study introduces a feature-driven methodology for glioma and meningioma classification, showcasing advancements in the accuracy and robustness of transformer-based models.
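
The classification stage maps cleanly onto scikit-learn. The sketch below mirrors the stated configuration (ANOVA selection of 7 features; an MLP with one 100-unit ReLU hidden layer and Adam; 5-fold CV); the PCC-based dimensionality-reduction step is omitted, and the synthetic stand-in data are our assumption.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Normalize deep features, keep the top-7 ANOVA-ranked ones, classify with
# the stated MLP configuration.
clf = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=7),                  # 7 features, as reported for ViT
    MLPClassifier(hidden_layer_sizes=(100,),      # 1 hidden layer, 100 units
                  activation="relu", solver="adam",
                  max_iter=1000, random_state=0),
)

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 768))    # stand-in for ViT/BEiT deep features
y = rng.integers(0, 2, size=120)   # 0 = glioma, 1 = meningioma (synthetic)
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")   # 5-fold CV
print(scores.mean())
```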

Imaging Epilepsy: Past, Passing, and to Come.

Theodore WH, Inati SK, Adler S, Pearl PL, Mcdonald CR

PubMed paper · Jun 18, 2025
New imaging techniques introduced over the last few decades have replaced procedures that were uncomfortable, of low specificity, and prone to adverse events. While computed tomography remains useful for imaging patients with seizures in acute settings, structural magnetic resonance imaging (MRI) has become the most important imaging modality for epilepsy evaluation, with adjunctive functional imaging also increasingly well established in presurgical evaluation, including positron emission tomography (PET), ictal-interictal subtraction single-photon emission computed tomography co-registered to MRI, and functional MRI for preoperative cognitive mapping. Neuroimaging in inherited metabolic epilepsies is integral to diagnosis, monitoring, and assessment of treatment response. Neurotransmitter-receptor PET and magnetic resonance spectroscopy can help delineate the pathophysiology of these disorders. Machine learning and artificial intelligence analyses based on large MRI datasets of healthy volunteers and people with epilepsy have been initiated to detect lesions that are not found visually, particularly focal cortical dysplasia. These methods, not yet approved for patient care, depend on careful clinical correlation and on training sets that fully sample broad populations.

A Deep Learning Lung Cancer Segmentation Pipeline to Facilitate CT-based Radiomics

So, A. C. P., Cheng, D., Aslani, S., Azimbagirad, M., Yamada, D., Dunn, R., Josephides, E., McDowall, E., Henry, A.-R., Bille, A., Sivarasan, N., Karapanagiotou, E., Jacob, J., Pennycuick, A.

medRxiv preprint · Jun 18, 2025
Background: CT-based radio-biomarkers could provide non-invasive insights into tumour biology to risk-stratify patients. One limitation is the laborious manual segmentation of regions of interest (ROIs). We present a deep learning auto-segmentation pipeline for radiomic analysis. Patients and Methods: 153 patients with resected stage 2A-3B non-small cell lung cancers (NSCLCs) had tumours segmented using nnU-Net, with review by two clinicians. The nnU-Net was pretrained with anatomical priors in non-cancerous lungs and fine-tuned on NSCLCs. Three ROIs were segmented: intra-tumoural, peri-tumoural, and whole lung. 1,967 features were extracted using PyRadiomics. Feature reproducibility was tested using segmentation perturbations. Features were selected using minimum-redundancy-maximum-relevance with Random Forest recursive feature elimination nested in 500 bootstraps. Results: Auto-segmentation time was ~36 seconds/series. Mean volumetric and surface Dice-Sørensen coefficient (DSC) scores were 0.84 (±0.28) and 0.79 (±0.34), respectively. DSC correlated significantly with tumour shape (sphericity, diameter) and location (worse with chest wall adherence), but not with batch effects (e.g., contrast, reconstruction kernel). 6.5% of cases had missed segmentations; 6.5% required major changes. Pre-training on anatomical priors produced better segmentations than training on tumour labels alone (p<0.001) or on tumour with anatomical labels (p<0.001). Most radiomic features were not reproducible following perturbations and resampling. Adding radiomic features, however, did not significantly improve the clinical model in predicting 2-year disease-free survival: AUCs 0.67 (95% CI 0.59-0.75) vs 0.63 (95% CI 0.54-0.71), respectively (p=0.28). Conclusion: Our study demonstrates that integrating auto-segmentation into radio-biomarker discovery is feasible with high efficiency and accuracy. While the radiomic analysis showed limited reproducibility, our auto-segmentation may allow more robust radio-biomarker analysis using deep learning features.
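
The PyRadiomics extraction step looks roughly like the following. The file paths are illustrative, and enabling all default feature classes is an assumption on our part (1,967 features implies additional image filters were enabled beyond the defaults).

```python
# Minimal sketch: extract radiomic features from a CT volume and an
# auto-segmented ROI mask saved as NIfTI (paths are hypothetical).
from radiomics import featureextractor

extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()                      # shape, first-order, texture classes

features = extractor.execute("ct_series.nii.gz",   # image volume
                             "tumour_roi.nii.gz")  # e.g. the intra-tumoural ROI
scalar_features = {k: v for k, v in features.items()
                   if not k.startswith("diagnostics_")}   # drop metadata entries
print(len(scalar_features), "features extracted")
```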

Artificial intelligence-based diagnosis of hallux valgus interphalangeus using anteroposterior foot radiographs.

Kwolek K, Gądek A, Kwolek K, Lechowska-Liszka A, Malczak M, Liszka H

PubMed paper · Jun 18, 2025
A recently developed method enables automated measurement of the hallux valgus angle (HVA) and the first intermetatarsal angle (IMA) from weight-bearing foot radiographs. This approach employs bone segmentation to identify anatomical landmarks and provides standardized angle measurements based on established guidelines. While effective for HVA and IMA, preoperative radiograph analysis remains complex and requires additional measurements, such as the hallux interphalangeal angle (IPA), which has received limited research attention. This study expands the previous method by incorporating automatic measurement of the IPA and evaluates its accuracy and clinical relevance. A preexisting database of manually labeled foot radiographs was used to train a U-Net neural network for segmenting bones and identifying the landmarks necessary for IPA measurement. Of the 265 radiographs in the dataset, 161 were selected for training and 20 for validation. The U-Net achieves a high mean Sørensen-Dice index (> 0.97). The remaining 84 radiographs were used to assess the reliability of automated IPA measurements against those taken manually by two orthopedic surgeons (OA and OB) using computer-based tools. Each measurement was repeated to assess intraobserver (OA1 and OA2) and interobserver (OA2 and OB) reliability. Agreement between the automated and manual methods was evaluated using the intraclass correlation coefficient (ICC), and Bland-Altman analysis identified systematic differences. The standard error of measurement (SEM) and Pearson correlation coefficients quantified precision and linearity, and measurement times were recorded to evaluate efficiency. The artificial intelligence (AI)-based system demonstrated excellent reliability, with ICC(3,1) values of 0.92 (AI vs OA2) and 0.88 (AI vs OB), both statistically significant (P < 0.001). For manual measurements, ICC values were 0.95 (OA2 vs OA1) and 0.95 (OA2 vs OB), supporting both intraobserver and interobserver reliability. Bland-Altman analysis revealed minimal biases of 1.61° (AI vs OA2) and 2.54° (AI vs OB), with clinically acceptable limits of agreement. The AI system also showed high precision, as evidenced by low SEM values: 1.22° (OA2 vs OB), 1.77° (AI vs OA2), and 2.09° (AI vs OB). Furthermore, Pearson correlation coefficients confirmed strong linear relationships between automated and manual measurements, with r = 0.85 (AI vs OA2) and r = 0.90 (AI vs OB). The AI method substantially improved efficiency, completing all 84 measurements eight times faster than the manual methods and reducing the time required from an average of 36 minutes to just 4.5 minutes. The proposed AI-assisted IPA measurement method shows strong clinical potential, corresponding well with manual measurements. Integrating IPA with HVA and IMA assessments provides a comprehensive tool for automated forefoot deformity analysis, supporting hallux valgus severity classification and preoperative planning while offering substantial time savings in high-volume clinical settings.
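
Bland-Altman agreement, as used here, reduces to a mean bias and 1.96-SD limits of agreement. A minimal sketch with illustrative angle values:

```python
import numpy as np

def bland_altman(ai_angles, manual_angles):
    """Mean bias and 95% limits of agreement between automated and
    manual angle measurements."""
    diff = np.asarray(ai_angles) - np.asarray(manual_angles)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)        # 95% limits of agreement
    return bias, (bias - half_width, bias + half_width)

ai = np.array([12.1, 9.8, 15.3, 7.4])       # AI-measured IPA (degrees), illustrative
manual = np.array([10.5, 8.0, 13.9, 6.1])   # surgeon-measured IPA, illustrative
bias, (lo, hi) = bland_altman(ai, manual)
print(f"bias {bias:.2f}°, LoA [{lo:.2f}°, {hi:.2f}°]")
```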

Using Artificial Intelligence to Predict Treatment Outcomes in Patients with Neurogenic Overactive Bladder and Multiple Sclerosis

Chang, O., Lee, J., Lane, F., Demetriou, M., Chang, P.

medRxiv preprint · Jun 18, 2025
Introduction and Objectives: Many women with multiple sclerosis (MS) experience neurogenic overactive bladder (NOAB), characterized by urinary frequency, urinary urgency, and urgency incontinence. The objective of the study was to create machine learning (ML) models utilizing clinical and imaging data to predict NOAB treatment success stratified by treatment type. Methods: This was a retrospective cohort study of female patients with a diagnosis of NOAB and MS seen at a tertiary academic center from 2017-2022. Clinical and imaging data were extracted. Three types of NOAB treatment options were evaluated: behavioral therapy, medication therapy, and minimally invasive therapies. The primary outcome, treatment success, was defined as a > 50% reduction in urinary frequency or urinary urgency, or a subjective perception of treatment success. For the construction of the logistic regression ML models, bivariate analyses were performed with backward selection of variables with p-values < 0.10 and clinically relevant variables. For ML, the cohort was split into a training dataset (70%) and a test dataset (30%). Area under the curve (AUC) scores were calculated to evaluate model performance. Results: The 110 patients included had a mean age of 59 years (SD 14 years); the cohort was predominantly White (91.8%) and post-menopausal (68.2%). Patients were stratified by the NOAB treatment therapy received: 70 patients (63.6%) with behavioral therapy, 58 (52.7%) with medication therapy, and 44 (40%) with minimally invasive therapies. On brain MRI, 63.6% of patients had > 20 lesions, though the majority were not active. The lesions were mostly located in the supratentorial (94.5%) and infratentorial (68.2%) brain, as well as in the deep white matter (53.4%). On spine MRI, most lesions were in the cervical spine (71.8%), followed by the thoracic spine (43.7%) and lumbar spine (6.4%). After feature selection, the top 10 highest-ranking features were used to train complementary LASSO-regularized logistic regression (LR) and extreme gradient-boosted tree (XGB) models. The top-performing LR models for predicting response to behavioral, medication, and minimally invasive therapies yielded AUC values of 0.74, 0.76, and 0.83, respectively. Conclusions: Using these top-ranked features, LR models achieved AUC values of 0.74-0.83 for predicting treatment success based on individual factors. Further prospective evaluation is needed to better characterize and validate these identified associations.
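
The LASSO-regularized logistic regression arm of the workflow is straightforward to reproduce in outline. The sketch below uses the reported 70/30 split and top-10 features; the synthetic stand-in data and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# X: top-10 ranked clinical/imaging features; y: 1 = treatment success.
# Synthetic stand-ins so the sketch runs end to end.
rng = np.random.default_rng(0)
X = rng.normal(size=(110, 10))
y = rng.integers(0, 2, size=110)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,   # 70/30 split
                                          stratify=y, random_state=0)

lasso_lr = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
lasso_lr.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, lasso_lr.predict_proba(X_te)[:, 1])
print(f"test AUC: {auc:.2f}")
```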