Beyond the First Read: AI-Assisted Perceptual Error Detection in Chest Radiography Accounting for Interobserver Variability

Adhrith Vutukuri, Akash Awasthi, David Yang, Carol C. Wu, Hien Van Nguyen

arXiv preprint · Jun 16 2025
Chest radiography is widely used in diagnostic imaging. However, perceptual errors -- especially overlooked but visible abnormalities -- remain common and clinically significant. Current workflows and AI systems provide limited support for detecting such errors after interpretation and often lack meaningful human--AI collaboration. We introduce RADAR (Radiologist--AI Diagnostic Assistance and Review), a post-interpretation companion system. RADAR ingests finalized radiologist annotations and CXR images, then performs regional-level analysis to detect and refer potentially missed abnormal regions. The system supports a "second-look" workflow and offers suggested regions of interest (ROIs) rather than fixed labels to accommodate inter-observer variation. We evaluated RADAR on a simulated perceptual-error dataset derived from de-identified CXR cases, using F1 score and Intersection over Union (IoU) as primary metrics. RADAR achieved a recall of 0.78, precision of 0.44, and an F1 score of 0.56 in detecting missed abnormalities in the simulated perceptual-error dataset. Although precision is moderate, the suggestion-based design encourages radiologist oversight rather than over-reliance on AI in human--AI collaboration. The median IoU was 0.78, with more than 90% of referrals exceeding 0.5 IoU, indicating accurate regional localization. RADAR effectively complements radiologist judgment, providing valuable post-read support for perceptual-error detection in CXR interpretation. Its flexible ROI suggestions and non-intrusive integration position it as a promising tool in real-world radiology workflows. To facilitate reproducibility and further evaluation, we release a fully open-source web implementation alongside a simulated error dataset. All code, data, demonstration videos, and the application are publicly available at https://github.com/avutukuri01/RADAR.
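
As a rough illustration of the reported metrics (not the authors' released code), the sketch below computes IoU between axis-aligned ROI boxes and precision/recall/F1 over a set of referrals; the box format and the 0.5 matching threshold are assumptions.

```python
# Hedged sketch: IoU between two axis-aligned boxes (x1, y1, x2, y2) and
# precision/recall/F1 over referred regions versus missed ground-truth regions.
# Not RADAR's implementation; the 0.5 IoU match threshold is an assumption.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def prf1(referred, missed_gt, thr=0.5):
    tp = sum(any(iou(r, g) >= thr for g in missed_gt) for r in referred)
    fp = len(referred) - tp
    fn = sum(not any(iou(r, g) >= thr for r in referred) for g in missed_gt)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: one referral matches one of two missed abnormalities.
print(prf1([(10, 10, 50, 50)], [(12, 12, 48, 52), (100, 100, 140, 140)]))
```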

Default Mode Network Connectivity Predicts Individual Differences in Long-Term Forgetting: Evidence for Storage Degradation, not Retrieval Failure

Xu, Y., Prat, C. S., Sense, F., van Rijn, H., Stocco, A.

bioRxiv preprint · Jun 16 2025
Despite the importance of memories in everyday life and the progress made in understanding how they are encoded and retrieved, the neural processes by which declarative memories are maintained or forgotten remain elusive. Part of the problem is that it is empirically difficult to measure the rate at which memories fade, even between repeated presentations of the source of the memory. Without such a ground-truth measure, it is hard to identify the corresponding neural correlates. This study addresses this problem by comparing individual patterns of functional connectivity against behavioral differences in forgetting speed derived from computational phenotyping. Specifically, the individual-specific values of the speed of forgetting in long-term memory (LTM) were estimated for 33 participants using a formal model fit to accuracy and response time data from an adaptive paired-associate learning task. Individual speeds of forgetting were then used to examine participant-specific patterns of resting-state fMRI connectivity, using machine learning techniques to identify the most predictive and generalizable features. Our results show that individual speeds of forgetting are associated with resting-state connectivity within the default mode network (DMN) as well as between the DMN and cortical sensory areas. Cross-validation showed that individual speeds of forgetting were predicted with high accuracy (r = .78) from these connectivity patterns alone. These results support the view that DMN activity and the associated sensory regions are actively involved in maintaining memories and preventing their decline, a view that can be seen as evidence for the hypothesis that forgetting is a result of storage degradation, rather than of retrieval failure.
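
A minimal sketch of the cross-validated prediction step, assuming ridge regression over connectivity features; the data here are synthetic and the model choice is an assumption, not the authors' pipeline.

```python
# Hedged sketch: leave-one-out cross-validated prediction of speed of forgetting
# from resting-state connectivity features, summarized by a Pearson correlation.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(33, 120))                          # 33 participants x connectivity edges (synthetic)
y = X[:, :5].mean(axis=1) + 0.1 * rng.normal(size=33)   # synthetic speed-of-forgetting target

y_hat = cross_val_predict(Ridge(alpha=1.0), X, y, cv=LeaveOneOut())
r, p = pearsonr(y, y_hat)
print(f"cross-validated r = {r:.2f} (p = {p:.3g})")
```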

An 11,000-Study Open-Access Dataset of Longitudinal Magnetic Resonance Images of Brain Metastases

Saahil Chadha, David Weiss, Anastasia Janas, Divya Ramakrishnan, Thomas Hager, Klara Osenberg, Klara Willms, Joshua Zhu, Veronica Chiang, Spyridon Bakas, Nazanin Maleki, Durga V. Sritharan, Sven Schoenherr, Malte Westerhoff, Matthew Zawalich, Melissa Davis, Ajay Malhotra, Khaled Bousabarah, Cornelius Deuschl, MingDe Lin, Sanjay Aneja, Mariam S. Aboian

arXiv preprint · Jun 16 2025
Brain metastases are a common complication of systemic cancer, affecting over 20% of patients with primary malignancies. Longitudinal magnetic resonance imaging (MRI) is essential for diagnosing patients, tracking disease progression, assessing therapeutic response, and guiding treatment selection. However, the manual review of longitudinal imaging is time-intensive, especially for patients with multifocal disease. Artificial intelligence (AI) offers opportunities to streamline image evaluation, but developing robust AI models requires comprehensive training data representative of real-world imaging studies. Thus, there is an urgent necessity for a large dataset with heterogeneity in imaging protocols and disease presentation. To address this, we present an open-access dataset of 11,884 longitudinal brain MRI studies from 1,430 patients with clinically confirmed brain metastases, paired with clinical and image metadata. The provided dataset will facilitate the development of AI models to assist in the long-term management of patients with brain metastasis.

Evaluating Explainability: A Framework for Systematic Assessment and Reporting of Explainable AI Features

Miguel A. Lago, Ghada Zamzmi, Brandon Eich, Jana G. Delfino

arXiv preprint · Jun 16 2025
Explainability features are intended to provide insight into the internal mechanisms of an AI device, but there is a lack of evaluation techniques for assessing the quality of provided explanations. We propose a framework to assess and report explainable AI features. Our evaluation framework for AI explainability is based on four criteria: 1) Consistency quantifies the variability of explanations across similar inputs, 2) Plausibility estimates how close the explanation is to the ground truth, 3) Fidelity assesses the alignment between the explanation and the model's internal mechanisms, and 4) Usefulness evaluates the impact of the explanation on task performance. Finally, we developed a scorecard for AI explainability methods that serves as a complete description and evaluation to accompany this type of algorithm. We describe these four criteria and give examples of how they can be evaluated. As a case study, we use Ablation CAM and Eigen CAM to illustrate the evaluation of explanation heatmaps for the detection of breast lesions in synthetic mammograms. The first three criteria are evaluated for clinically relevant scenarios. Our proposed framework establishes criteria through which the quality of explanations provided by AI models can be evaluated. We intend for our framework to spark a dialogue regarding the value provided by explainability features and help improve the development and evaluation of AI-based medical devices.
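
As one hedged reading of the Consistency criterion, the sketch below scores how stable an explanation heatmap is under small input perturbations; the Gaussian-noise perturbation and correlation-based similarity are illustrative assumptions, not the framework's prescribed choices.

```python
# Hedged sketch of a "Consistency" score for an explanation method such as
# Ablation CAM or Eigen CAM: correlate the heatmap for an image with the
# heatmaps for lightly perturbed copies of the same image.
import numpy as np

def consistency(explain_fn, image, n_perturb=10, sigma=0.01, seed=0):
    """explain_fn(image) -> 2D heatmap; returns mean correlation in [-1, 1]."""
    rng = np.random.default_rng(seed)
    base = explain_fn(image).ravel()
    corrs = []
    for _ in range(n_perturb):
        noisy = image + rng.normal(scale=sigma, size=image.shape)
        corrs.append(np.corrcoef(base, explain_fn(noisy).ravel())[0, 1])
    return float(np.mean(corrs))   # closer to 1 => more consistent explanations
```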

First experiences with an adaptive pelvic radiotherapy system: Analysis of treatment times and learning curve.

Benzaquen D, Taussky D, Fave V, Bouveret J, Lamine F, Letenneur G, Halley A, Solmaz Y, Champion A

PubMed · Jun 16 2025
The Varian Ethos system allows not only on-treatment-table plan adaptation but also automated contouring with the aid of artificial intelligence. This study evaluates the initial clinical implementation of an adaptive pelvic radiotherapy system, focusing on treatment times and the associated learning curve. We analyzed the data from 903 consecutive treatments for most urogenital cancers at our center. The treatment time was calculated from the time of the first cone-beam computed tomography scan used for replanning until the end of treatment. To assess whether treatments became generally shorter over time, we divided the date of the first treatment into 3-month quartiles. Differences between the groups were calculated using t-tests. The mean time from the first cone-beam computed tomography scan to the end of treatment was 25.9 min (standard deviation: 6.9 min). Treatment time depended on the number of planning target volumes and on whether the pelvic lymph nodes were treated. The mean time from cone-beam computed tomography to the end of treatment was 37 % longer if the pelvic lymph nodes were treated and 26 % longer if there were more than two planning target volumes. There was a learning curve: in linear regression analysis, both the treatment-month quartile (odds ratio [OR]: 1.3, 95 % confidence interval [CI]: 0.70-1.8, P<0.001) and the number of planning target volumes (OR: 3.0, 95 % CI: 2.6-3.4, P<0.001) were predictive of treatment time. Approximately two-thirds of the treatments were delivered within 33 min. Treatment time was strongly dependent on the number of separate planning target volumes. There was a continuous learning curve.
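
A minimal sketch of the treatment-time regression described above, with synthetic placeholder data; the variable names ('quartile', 'n_ptv', 'time_min') and the number of 3-month periods are assumptions, not the authors' analysis code.

```python
# Hedged sketch: ordinary least squares of treatment time on the 3-month
# treatment-period quartile (learning curve) and the number of planning
# target volumes. Data below are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 903
df = pd.DataFrame({
    "quartile": rng.integers(1, 9, n),   # assumed eight 3-month periods since go-live
    "n_ptv": rng.integers(1, 4, n),      # planning target volumes per session
})
df["time_min"] = 30 - 1.0 * df["quartile"] + 4.0 * df["n_ptv"] + rng.normal(0, 5, n)

model = smf.ols("time_min ~ quartile + n_ptv", data=df).fit()
print(model.params)                       # learning-curve and PTV-count effects
```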

Kernelized weighted local information based picture fuzzy clustering with multivariate coefficient of variation and modified total Bregman divergence measure for brain MRI image segmentation.

Lohit H, Kumar D

PubMed · Jun 16 2025
This paper proposes a novel clustering method for noisy image segmentation using a kernelized weighted local information approach under the Picture Fuzzy Set (PFS) framework. Existing kernel-based fuzzy clustering methods struggle with noisy environments and non-linear structures, while intuitionistic fuzzy clustering methods face limitations in handling uncertainty in real-world medical images. To address these challenges, we introduce a local picture fuzzy information measure, developed for the first time using Multivariate Coefficient of Variation (MCV) theory, enhancing robustness in segmentation. Additionally, we integrate non-Euclidean distance measures, including kernel distance for local information computation and modified total Bregman divergence (MTBD) measure for improving clustering accuracy. This combination enhances both local spatial consistency and global membership estimation, leading to precise segmentation. The proposed method is extensively evaluated on synthetic images with Gaussian, Salt and Pepper, and mixed noise, along with Brainweb, IBSR, and MRBrainS18 MRI datasets under varying Rician noise levels, and a CT image template. Furthermore, we benchmark our proposed method against two deep learning-based segmentation models, ResNet34-LinkNet and patch-based U-Net. Experimental results demonstrate significant improvements in segmentation accuracy, as validated by metrics such as Dice Score, Fuzzy Performance Index, Modified Partition Entropy, Average Volume Difference (AVD), and the XB index. Additionally, Friedman's statistical test confirms the superior performance of our approach compared to state-of-the-art clustering methods for noisy image segmentation. To facilitate reproducibility, the implementation of our proposed method is made publicly available at: Google Drive Repository.
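
For orientation only, the sketch below shows a plain Gaussian-kernel fuzzy c-means update, i.e. the "kernel distance" ingredient in isolation; the paper's full method additionally uses picture fuzzy memberships, MCV-based local information, and the MTBD measure, none of which are implemented here.

```python
# Hedged sketch of generic kernelized fuzzy c-means (Gaussian kernel), not the
# proposed PFS/MCV/MTBD method. X: (n_samples, n_features); returns memberships U
# and cluster centers V.
import numpy as np

def kernel_fcm(X, c=3, m=2.0, sigma=1.0, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), c, replace=False)].astype(float)        # initial centers
    for _ in range(n_iter):
        K = np.exp(-((X[:, None, :] - V[None, :, :]) ** 2).sum(-1) / (2 * sigma ** 2))
        D = np.maximum(2.0 * (1.0 - K), 1e-12)                        # kernel-induced distance
        U = (1.0 / D) ** (1.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)                             # fuzzy memberships
        W = (U ** m) * K
        V = (W.T @ X) / W.sum(axis=0)[:, None]                        # weighted center update
    return U, V

# Toy usage on three synthetic 2D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(mu, 0.3, size=(100, 2)) for mu in (0.0, 2.0, 4.0)])
U, V = kernel_fcm(X, c=3)
labels = U.argmax(axis=1)
```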

Roadmap analysis for coronary artery stenosis detection and percutaneous coronary intervention prediction in cardiac CT for transcatheter aortic valve replacement.

Fujito H, Jilaihawi H, Han D, Gransar H, Hashimoto H, Cho SW, Lee S, Gheyath B, Park RH, Patel D, Guo Y, Kwan AC, Hayes SW, Thomson LEJ, Slomka PJ, Dey D, Makkar R, Friedman JD, Berman DS

PubMed · Jun 16 2025
The new artificial intelligence-based software, Roadmap (HeartFlow), may assist in evaluating coronary artery stenosis during cardiac computed tomography (CT) for transcatheter aortic valve replacement (TAVR). Consecutive TAVR candidates who underwent both cardiac CT angiography (CTA) and invasive coronary angiography were enrolled. We evaluated the ability of three methods to predict obstructive coronary artery disease (CAD), defined as ≥50% stenosis on quantitative coronary angiography (QCA), and the need for percutaneous coronary intervention (PCI) within one year: Roadmap, clinician CT specialists with Roadmap, and CT specialists alone. The area under the curve (AUC) for predicting QCA ≥50% stenosis was similar for CT specialists with or without Roadmap (0.93 [0.85-0.97] vs. 0.94 [0.88-0.98], p = 0.82), both significantly higher than Roadmap alone (all p < 0.05). For PCI prediction, no significant differences were found between QCA and CT specialists, with or without Roadmap, while Roadmap's AUC was lower (all p < 0.05). The negative predictive value (NPV) of CT specialists with Roadmap for ≥50% stenosis was 97%, and for PCI prediction, the NPV was comparable to QCA (p = 1.00). In contrast, the positive predictive value (PPV) of Roadmap alone for ≥50% stenosis was 49%, the lowest among all approaches, with a similar trend observed for PCI prediction. While Roadmap alone is insufficient for clinical decision-making due to low PPV, Roadmap may serve as a "second observer", providing a supportive tool for CT specialists by flagging lesions for careful review, thereby enhancing workflow efficiency and maintaining high diagnostic accuracy with excellent NPV.
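
As a generic illustration of the summary statistics quoted above (PPV, NPV, AUC), rather than output of the Roadmap software, a minimal sketch with toy labels and scores:

```python
# Hedged sketch: PPV, NPV and ROC AUC from binary calls and continuous scores
# against a QCA >= 50% stenosis reference. Labels and scores below are toy values.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0])                    # reference (QCA)
y_pred = np.array([1, 1, 0, 1, 0, 1, 0, 1])                    # binary calls by a method/reader
y_score = np.array([0.9, 0.6, 0.2, 0.8, 0.1, 0.7, 0.3, 0.55])  # continuous scores

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(f"PPV={ppv:.2f}  NPV={npv:.2f}  AUC={roc_auc_score(y_true, y_score):.2f}")
```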

Precision Medicine and Machine Learning to predict critical disease and death due to Coronavirus disease 2019 (COVID-19).

Júnior WLDT, Danelli T, Tano ZN, Cassela PLCS, Trigo GL, Cardoso KM, Loni LP, Ahrens TM, Espinosa BR, Fernandes AJ, Almeida ERD, Lozovoy MAB, Reiche EMV, Maes M, Simão ANC

PubMed · Jun 16 2025
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes Coronavirus Disease 2019 (COVID-19) and induces activation of inflammatory pathways, including the inflammasome. The aim was to construct Machine Learning (ML) models to predict critical disease and death in patients with COVID-19. A total of 528 individuals with SARS-CoV-2 infection were included, comprising 308 with critical and 220 with non-critical COVID-19. The ML models included imaging, demographic, and inflammatory biomarkers, as well as NLRP3 (rs10754558 and rs10157379) and IL18 (rs360717 and rs187238) inflammasome variants. Compared with those with non-critical disease, individuals with critical COVID-19 were older and had a higher male/female ratio, body mass index (BMI), rate of type 2 diabetes mellitus (T2DM), hypertension, inflammatory biomarkers, need for orotracheal intubation, intensive care unit admission, incidence of death, and sickness symptom complex (SSC) scores, and lower peripheral oxygen saturation (SpO2). We found that 49.5 % of the variance in the severity of critical COVID-19 was explained by SpO2 and SSC (negatively associated) and by chest computed tomography alterations (CCTA), inflammatory biomarkers, severe acute respiratory syndrome (SARS), BMI, T2DM, and age (positively associated). In this model, the NLRP3/IL18 variants showed indirect effects on critical COVID-19 that were mediated by inflammatory biomarkers, SARS, and SSC. Neural network models yielded a prediction of critical disease and death due to COVID-19 with an area under the receiver operating characteristic curve of 0.930 and 0.927, respectively. These ML methods increase the accuracy of predicting severity, critical illness, and mortality caused by COVID-19 and show that the genetic variants contribute to the predictive power of the ML models.
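
A minimal sketch of a neural-network classifier over mixed clinical, imaging, and genotype features evaluated by ROC AUC, as in the abstract; the features, architecture, and split are assumptions and the data are synthetic.

```python
# Hedged sketch: standardized features feeding a small multilayer perceptron,
# scored by area under the ROC curve. Synthetic stand-ins for SpO2, SSC, CCTA,
# inflammatory biomarkers, BMI, age and inflammasome variants.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(528, 12))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=528) > 0).astype(int)   # synthetic "critical" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0))
clf.fit(X_tr, y_tr)
print("AUC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```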

TCFNet: Bidirectional face-bone transformation via a Transformer-based coarse-to-fine point movement network.

Zhang R, Jie B, He Y, Wang J

PubMed · Jun 16 2025
Computer-aided surgical simulation is a critical component of orthognathic surgical planning, where accurately simulating face-bone shape transformations is essential. Traditional biomechanical simulation methods are limited by long computation times, labor-intensive data processing, and low accuracy. Recently, deep learning-based simulation methods have been proposed that view this problem as a point-to-point transformation between skeletal and facial point clouds. However, these approaches cannot process large-scale point clouds, have limited receptive fields that lead to noisy points, and rely on complex registration-based preprocessing and postprocessing. These shortcomings limit the performance and widespread applicability of such methods. Therefore, we propose a Transformer-based coarse-to-fine point movement network (TCFNet) that learns unique, complicated correspondences at the patch and point levels for dense face-bone point cloud transformations. This end-to-end framework adopts a Transformer-based network in the first stage and a local information aggregation network (LIA-Net) in the second, and the two stages reinforce each other to generate precise point movement paths. LIA-Net effectively compensates for the neighborhood precision loss of the Transformer-based network by modeling local geometric structures (edges, orientations, and relative position features). The global features from the first stage guide the local displacement through a gated recurrent unit. Inspired by deformable medical image registration, we propose an auxiliary loss that incorporates expert knowledge for reconstructing critical organs; our framework is unsupervised, and this loss is optional. Compared with existing state-of-the-art (SOTA) methods on the gathered datasets, TCFNet achieves outstanding evaluation metrics and visualization results. The code is available at https://github.com/Runshi-Zhang/TCFNet.
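
One idea from the abstract, sketched in isolation: a GRU cell through which a global (Transformer-stage) feature gates the refinement of per-point local displacements. Dimensions, module names, and everything around this block are assumptions, not a reproduction of TCFNet.

```python
# Hedged sketch: the global feature is used as the GRU hidden state so it guides
# how each point's local feature is turned into a 3D displacement.
import torch
import torch.nn as nn

class GatedLocalRefiner(nn.Module):
    def __init__(self, local_dim=128, global_dim=256):
        super().__init__()
        self.gru = nn.GRUCell(input_size=local_dim, hidden_size=global_dim)
        self.head = nn.Linear(global_dim, 3)           # per-point 3D displacement

    def forward(self, local_feat, global_feat):
        # local_feat: (N, local_dim) per-point local features (LIA-Net-style aggregation)
        # global_feat: (N, global_dim) global feature broadcast to every point
        h = self.gru(local_feat, global_feat)           # global feature gates the update
        return self.head(h)                             # (N, 3) displacements

refiner = GatedLocalRefiner()
disp = refiner(torch.randn(1024, 128), torch.randn(1024, 256))
print(disp.shape)                                       # torch.Size([1024, 3])
```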

Improving Prostate Gland Segmenting Using Transformer based Architectures

Shatha Abudalou

arXiv preprint · Jun 16 2025
Inter-reader variability and cross-site domain shift challenge the automatic segmentation of prostate anatomy in T2-weighted MRI. This study investigates whether transformer models can retain precision amid such heterogeneity. We compare the performance of UNETR and SwinUNETR in prostate gland segmentation against our previous 3D UNet model [1], based on 546 T2-weighted MRI volumes annotated by two independent experts. Three training strategies were analyzed: a single-cohort dataset, a 5-fold cross-validated mixed cohort, and a gland-size-based dataset. Hyperparameters were tuned with Optuna. The test set, from an independent population of readers, served as the evaluation endpoint (Dice Similarity Coefficient). In single-reader training, SwinUNETR achieved an average Dice score of 0.816 for Reader #1 and 0.860 for Reader #2, while UNETR scored 0.8 and 0.833 for Readers #1 and #2, respectively, compared to the baseline UNet's 0.825 for Reader #1 and 0.851 for Reader #2. In cross-validated mixed training, SwinUNETR had an average Dice score of 0.8583 for Reader #1 and 0.867 for Reader #2. For the gland-size-based dataset, SwinUNETR achieved an average Dice score of 0.902 for the Reader #1 subset and 0.894 for Reader #2, using the five-fold mixed training strategy (Reader #1, n=53; Reader #2, n=87) on the larger gland-size subsets, where UNETR performed poorly. Our findings demonstrate that global and shifted-window self-attention effectively reduces label noise and class-imbalance sensitivity, improving the Dice score over CNNs by up to five points while maintaining computational efficiency. This contributes to the high robustness of SwinUNETR for clinical deployment.
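
For reference, a minimal sketch of the evaluation metric only (Dice Similarity Coefficient on binary masks); it is not the training or inference code for the UNETR/SwinUNETR models.

```python
# Hedged sketch: Dice Similarity Coefficient between a predicted and a reference
# binary segmentation mask, demonstrated on toy 3D volumes.
import numpy as np

def dice(pred, target, eps=1e-7):
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

p = np.zeros((8, 8, 8), dtype=np.uint8); p[2:6, 2:6, 2:6] = 1
g = np.zeros((8, 8, 8), dtype=np.uint8); g[3:7, 3:7, 3:7] = 1
print(round(dice(p, g), 3))   # 0.422 on these toy masks
```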