Page 122 of 292 (2917 results)

Pixel-wise Modulated Dice Loss for Medical Image Segmentation

Seyed Mohsen Hosseini

arXiv preprint, Jun 17 2025
Class imbalance and difficulty imbalance are the two types of data imbalance that affect the performance of neural networks in medical segmentation tasks. In class imbalance the loss is dominated by the majority classes, and in difficulty imbalance the loss is dominated by easy-to-classify pixels. Both lead to ineffective training. Dice loss, which is based on a geometrical metric, is very effective in addressing class imbalance compared to the cross-entropy (CE) loss, which is adopted directly from classification tasks. To address difficulty imbalance, the common approach is to employ a re-weighted CE loss or a modified Dice loss that focuses the training on difficult-to-classify areas. The existing modification methods are computationally costly and have had limited success. In this study we propose a simple modification to the Dice loss with minimal computational cost. With a pixel-level modulating term, we take advantage of the effectiveness of Dice loss in handling class imbalance to also handle difficulty imbalance. Results on three commonly used medical segmentation tasks show that the proposed Pixel-wise Modulated Dice loss (PM Dice loss) outperforms other methods designed to tackle the difficulty imbalance problem.
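The class-imbalance claim in this abstract can be illustrated with a small sketch. It compares mean binary cross-entropy with the standard soft Dice loss on a toy 100-pixel mask in which only 5 pixels are foreground. This is not the PM Dice loss itself: the abstract does not specify the modulating term, so none is shown here. The function names and the toy data are assumptions made for illustration.

```python
import math

def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def soft_dice_loss(probs, targets, eps=1e-7):
    """1 minus the soft Dice coefficient of the foreground class."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)

# Hypothetical toy mask: 5 foreground pixels, 95 background pixels.
targets = [1] * 5 + [0] * 95
# A prediction that is confidently "background" everywhere, missing the lesion.
probs = [0.1] * 5 + [0.01] * 95

ce = cross_entropy(probs, targets)
dice = soft_dice_loss(probs, targets)
print(f"CE loss:   {ce:.3f}")
print(f"Dice loss: {dice:.3f}")
```

In this toy case the many easy background pixels keep the averaged CE low even though every foreground pixel is missed, while the Dice loss stays high because it is driven by overlap with the foreground.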

SCISSOR: Mitigating Semantic Bias through Cluster-Aware Siamese Networks for Robust Classification

Shuo Yang, Bardh Prenkaj, Gjergji Kasneci

arXiv preprint, Jun 17 2025
Shortcut learning undermines model generalization to out-of-distribution data. While the literature attributes shortcuts to biases in superficial features, we show that imbalances in the semantic distribution of sample embeddings induce spurious semantic correlations, compromising model robustness. To address this issue, we propose SCISSOR (Semantic Cluster Intervention for Suppressing ShORtcut), a Siamese network-based debiasing approach that remaps the semantic space by discouraging latent clusters exploited as shortcuts. Unlike prior data-debiasing approaches, SCISSOR eliminates the need for data augmentation and rewriting. We evaluate SCISSOR on 6 models across 4 benchmarks: Chest-XRay and Not-MNIST in computer vision, and GYAFC and Yelp in NLP tasks. Compared to several baselines, SCISSOR reports +5.3 absolute points in F1 score on GYAFC, +7.3 on Yelp, +7.7 on Chest-XRay, and +1 on Not-MNIST. SCISSOR is also highly advantageous for lightweight models with ~9.5% improvement on F1 for ViT on computer vision datasets and ~11.9% for BERT on NLP. Our study redefines the landscape of model generalization by addressing overlooked semantic biases, establishing SCISSOR as a foundational framework for mitigating shortcut learning and fostering more robust, bias-resistant AI systems.

Step-by-Step Approach to Design Image Classifiers in AI: An Exemplary Application of the CNN Architecture for Breast Cancer Diagnosis

Lohani, A., Mishra, B. K., Wertheim, K. Y., Fagbola, T. M.

medRxiv preprint, Jun 17 2025
In recent years, different Convolutional Neural Network (CNN) approaches have been applied to image classification in general and to specific problems such as breast cancer diagnosis, but there is no standardised approach to facilitate comparison and synergy. This paper attempts a step-by-step approach to standardise a common application of image classification, with the specific problem of classifying breast ultrasound images for breast cancer diagnosis as an illustrative example. In this study, three distinct datasets, Breast Ultrasound Image (BUSI), Breast Ultrasound Image (BUI), and Ultrasound Breast Images for Breast Cancer (UBIBC), have been used to build and fine-tune custom and pre-trained CNN models systematically. Custom CNN models have been built, and transfer learning (TL) has then been applied to deploy a broad range of pre-trained models, optimised by applying data augmentation techniques and hyperparameter tuning. Models were trained and tested in scenarios involving limited and large datasets to gain insights into their robustness and generality. The obtained results indicated that the custom CNN and VGG19 are the two most suitable architectures for this problem. The experimental results highlight the significance of employing an effective step-by-step approach in image classification tasks to enhance the robustness and generalisation capabilities of CNN-based classifiers.

Radiologist-AI workflow can be modified to reduce the risk of medical malpractice claims

Bernstein, M., Sheppard, B., Bruno, M. A., Lay, P. S., Baird, G. L.

medRxiv preprint, Jun 16 2025
Background. Artificial Intelligence (AI) is rapidly changing the legal landscape of radiology. Results from a previous experiment suggested that providing AI error rates can reduce perceived radiologist culpability, as judged by mock jury members (4). The current study advances this work by examining whether the radiologist's behavior also impacts perceptions of liability. Methods. Participants (n=282) read about a hypothetical malpractice case where a 50-year-old who visited the Emergency Department with acute neurological symptoms received a brain CT scan to determine if bleeding was present. An AI system was used by the radiologist who interpreted the imaging. The AI system correctly flagged the case as abnormal. Nonetheless, the radiologist concluded there was no evidence of bleeding, and the blood-thinner t-PA was administered. Participants were randomly assigned to either (1) a single-read condition, where the radiologist interpreted the CT once after seeing AI feedback, or (2) a double-read condition, where the radiologist interpreted the CT twice, first without AI and then with AI feedback. Participants were then told the patient suffered irreversible brain damage due to the missed brain bleed, resulting in the patient (plaintiff) suing the radiologist (defendant). Participants indicated whether the radiologist met their duty of care to the patient (yes/no). Results. Hypothetical jurors were more likely to side with the plaintiff in the single-read condition (106/142, 74.7%) than in the double-read condition (74/140, 52.9%), p=0.0002. Conclusion. This suggests that the penalty for disagreeing with correct AI can be mitigated when images are interpreted twice, or at least if a radiologist gives an interpretation before AI is used.
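The comparison of the two conditions can be sketched as a test on two proportions. The abstract does not say which test produced p=0.0002, so the pooled two-proportion z-test below is an assumption, and its p-value need not match the reported one exactly. Only the counts (106/142 and 74/140) come from the abstract; the function name is hypothetical.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (p1, p2, z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p1, p2, z, p_value

# Counts reported in the abstract: jurors siding with the plaintiff.
p_single, p_double, z, p_value = two_proportion_z_test(106, 142, 74, 140)
print(f"single-read: {p_single:.3f}, double-read: {p_double:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.5f}")
```

An exact test or a continuity-corrected chi-square test would give a somewhat different p-value from the same counts, which is one plausible reason for small discrepancies with a reported figure.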

Next-generation machine learning model to measure the Norberg angle on canine hip radiographs increases accuracy and time to completion.

Hansen GC, Yao Y, Fischetti AJ, Gonzalez A, Porter I, Todhunter RJ, Zhang Y

PubMed paper, Jun 16 2025
To apply machine learning (ML) to measure the Norberg angle (NA) on canine ventrodorsal hip-extended pelvic radiographs. In this observational study, an NA-AI model was trained on real and synthetic radiographs. Additional radiographs were used for validation and testing. Each NA was predicted using a hybrid architecture derived from 2 ML vision models. The NAs were measured by 4 authors and by the model, and all measurements were compared to each other. The time taken to correct the NAs predicted by the model was compared to unassisted human measurements. The NA-AI model was trained on 733 real and 1,474 synthetic radiographs; 105 real radiographs were used for validation and 128 for testing. The mean absolute error between human measurements ranged from 3° to 10° ± SD = 3° to 10° with an intraclass correlation between humans of 0.38 to 0.92. The mean absolute error between the NA-AI model prediction and the human measurements was 5° to 6° ± SD = 5° (intraclass correlation, 0.39 to 0.94). Bland-Altman plots showed good agreement between human and AI measurements when the NAs were greater than 80°. The time taken to check the accuracy of the NA measurement compared to unassisted measurements was reduced by 45% to 80%. The NA-AI model proved more accurate than the original model except when the hip dysplasia was severe, and its assistance decreased the time needed to analyze radiographs. The assistance of the NA-AI model reduces the time taken for radiographic hip analysis for clinical applications. However, it is less reliable in cases involving severe osteoarthritic change, requiring manual review for such cases.

Integration of MRI radiomics and germline genetics to predict the IDH mutation status of gliomas.

Nakase T, Henderson GA, Barba T, Bareja R, Guerra G, Zhao Q, Francis SS, Gevaert O, Kachuri L

PubMed paper, Jun 16 2025
The molecular profiling of gliomas for isocitrate dehydrogenase (IDH) mutations currently relies on resected tumor samples, highlighting the need for non-invasive, preoperative biomarkers. We investigated the integration of glioma polygenic risk scores (PRS) and radiographic features for prediction of IDH mutation status. We used 256 radiomic features, a glioma PRS and demographic information in 158 glioma cases within elastic net and neural network models. The integration of glioma PRS with radiomics increased the area under the receiver operating characteristic curve (AUC) for distinguishing IDH-wildtype vs. IDH-mutant glioma from 0.83 to 0.88 (P_ΔAUC = 6.9 × 10^-5) in the elastic net model and from 0.91 to 0.92 (P_ΔAUC = 0.32) in the neural network model. Incorporating age at diagnosis and sex further improved the classifiers (elastic net: AUC = 0.93, neural network: AUC = 0.93). Patients predicted to have IDH-mutant vs. IDH-wildtype tumors had significantly lower mortality risk (hazard ratio (HR) = 0.18, 95% CI: 0.08-0.40, P = 2.1 × 10^-5), comparable to prognostic trajectories for biopsy-confirmed IDH status. The augmentation of imaging-based classifiers with genetic risk profiles may help delineate molecular subtypes and improve the timely, non-invasive clinical assessment of glioma patients.

Whole-lesion-aware network based on freehand ultrasound video for breast cancer assessment: a prospective multicenter study.

Han J, Gao Y, Huo L, Wang D, Xie X, Zhang R, Xiao M, Zhang N, Lei M, Wu Q, Ma L, Sun C, Wang X, Liu L, Cheng S, Tang B, Wang L, Zhu Q, Wang Y

PubMed paper, Jun 16 2025
The clinical application of artificial intelligence (AI) models based on breast ultrasound static images has been hindered in real-world workflows by the operator dependence of standardized image acquisition and the incomplete view of breast lesions on static images. To better exploit the real-time advantages of ultrasound and to be more conducive to clinical application, we proposed a whole-lesion-aware network based on freehand ultrasound video (WAUVE) scanning in an arbitrary direction for predicting an overall breast cancer risk score. The WAUVE was developed using 2912 videos (2912 lesions) of 2771 patients retrospectively collected from May 2020 to August 2022 in two hospitals. We compared the diagnostic performance of WAUVE with static 2D-ResNet50 and dynamic TimeSformer models in the internal validation set. Subsequently, a dataset comprising 190 videos (190 lesions) from 175 patients, prospectively collected from December 2022 to April 2023 in two other hospitals, was used as an independent external validation set. A reader study was conducted by four experienced radiologists on the external validation set. We compared the diagnostic performance of WAUVE with that of the four experienced radiologists and evaluated the auxiliary value of the model for radiologists. The WAUVE demonstrated superior performance compared to the 2D-ResNet50 model and performance similar to that of the TimeSformer model. In the external validation set, WAUVE achieved an area under the receiver operating characteristic curve (AUC) of 0.8998 (95% CI = 0.8529-0.9439), and showed diagnostic performance comparable to that of the four experienced radiologists in terms of sensitivity (97.39% vs. 98.48%, p = 0.36), specificity (49.33% vs. 50.00%, p = 0.92), and accuracy (78.42% vs. 79.34%, p = 0.60). With the assistance of the WAUVE model, the average specificity of the four experienced radiologists improved by 6.67%, and higher consistency was achieved (from 0.807 to 0.838).
The WAUVE, based on non-standardized ultrasound scanning, demonstrated excellent performance in breast cancer assessment and yielded outcomes similar to those of experienced radiologists, indicating that the WAUVE model is promising for clinical application.

Imaging-Based AI for Predicting Lymphovascular Space Invasion in Cervical Cancer: Systematic Review and Meta-Analysis.

She L, Li Y, Wang H, Zhang J, Zhao Y, Cui J, Qiu L

PubMed paper, Jun 16 2025
The role of artificial intelligence (AI) in enhancing the accuracy of lymphovascular space invasion (LVSI) detection in cervical cancer remains debated. This meta-analysis aimed to evaluate the diagnostic accuracy of imaging-based AI for predicting LVSI in cervical cancer. We conducted a comprehensive literature search across multiple databases, including PubMed, Embase, and Web of Science, identifying studies published up to November 9, 2024. Studies were included if they evaluated the diagnostic performance of imaging-based AI models in detecting LVSI in cervical cancer. We used a bivariate random-effects model to calculate pooled sensitivity and specificity with corresponding 95% confidence intervals. Study heterogeneity was assessed using the I² statistic. Of 403 studies identified, 16 studies (2514 patients) were included. For the internal validation set, the pooled sensitivity, specificity, and area under the curve (AUC) for detecting LVSI were 0.84 (95% CI 0.79-0.87), 0.78 (95% CI 0.75-0.81), and 0.87 (95% CI 0.84-0.90). For the external validation set, the pooled sensitivity, specificity, and AUC for detecting LVSI were 0.79 (95% CI 0.70-0.86), 0.76 (95% CI 0.67-0.83), and 0.84 (95% CI 0.81-0.87). Using the likelihood ratio test for subgroup analysis, deep learning demonstrated significantly higher sensitivity compared to machine learning (P=.01). Moreover, AI models based on positron emission tomography/computed tomography exhibited superior sensitivity relative to those based on magnetic resonance imaging (P=.01). Imaging-based AI, particularly deep learning algorithms, demonstrates promising diagnostic performance in predicting LVSI in cervical cancer. However, the limited external validation datasets and the retrospective nature of the research may introduce potential biases. These findings underscore AI's potential as an auxiliary diagnostic tool, necessitating further large-scale prospective validation.
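The heterogeneity measure named in this abstract can be sketched in a few lines. The code computes the I² statistic from Cochran's Q using inverse-variance weights, a common textbook formulation. The per-study effect estimates and variances below are hypothetical and are not taken from the meta-analysis, which pooled sensitivity and specificity with a bivariate random-effects model that this sketch does not reproduce.

```python
def i_squared(effects, variances):
    """I² (in percent) from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0  # no observed variation between studies
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical per-study estimates (e.g. logit sensitivity) and their variances.
effects = [1.2, 1.8, 1.5, 2.4]
variances = [0.04, 0.05, 0.03, 0.06]

i2 = i_squared(effects, variances)
print(f"I² = {i2:.1f}%")
```

I² expresses the share of total variation across studies that is attributed to heterogeneity rather than chance, so identical study estimates give 0%.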

Real-time cardiac cine MRI: A comparison of a diffusion probabilistic model with alternative state-of-the-art image reconstruction techniques for undersampled spiral acquisitions.

Schad O, Heidenreich JF, Petri N, Kleineisel J, Sauer S, Bley TA, Nordbeck P, Petritsch B, Wech T

PubMed paper, Jun 16 2025
Electrocardiogram (ECG)-gated cine imaging in breath-hold enables high-quality diagnostics in most patients but can be compromised by arrhythmia and inability to hold breath. Real-time cardiac MRI offers faster and more robust exams without these limitations. To achieve sufficient acceleration, advanced reconstruction methods, which transform the data into high-quality images, are required. In this study, undersampled spiral balanced SSFP (bSSFP) real-time data in free-breathing were acquired at 1.5T in 16 healthy volunteers and five arrhythmic patients, with ECG-gated Cartesian cine in breath-hold serving as clinical reference. Image reconstructions were performed using a tailored and specifically trained score-based diffusion model, compared to a variational network and different compressed sensing approaches. The techniques were assessed using an expert reader study, scalar metric calculations, difference images against a segmented reference, and Bland-Altman analysis of cardiac functional parameters. In participants with irregular RR-cycles, spiral real-time acquisitions showed superior image quality compared to the clinical reference. Quantitative and qualitative metrics indicate enhanced image quality of the diffusion model in comparison to the alternative reconstruction methods, although improvements over the variational network were minor. The real-time diffusion reconstructions yielded slightly higher ejection fractions than the clinical references, with a bias of 1.1 ± 5.7% for healthy subjects. The proposed real-time technique enables free-breathing acquisitions of spatio-temporal images with high quality, covering the entire heart in less than 1 min. Evaluation of ejection fraction using the ECG-gated reference can be vulnerable to arrhythmia and averaging effects, highlighting the need for real-time approaches.
Prolonged inference times and stochastic variability of the diffusion reconstruction represent obstacles to overcome for clinical translation.
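The Bland-Altman analysis mentioned in this abstract can be sketched as follows. The code computes the bias (mean difference) between two measurement methods and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The ejection fraction values are hypothetical, and the abstract does not state whether its reported "1.1 ± 5.7%" refers to a standard deviation or to limits of agreement, so this is an illustration of the general technique only.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement between two methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical ejection fractions (%) from real-time and reference cine imaging.
ef_realtime = [60.0, 58.0, 65.0, 55.0, 62.0]
ef_reference = [59.0, 57.0, 63.0, 56.0, 60.0]

bias, lower, upper = bland_altman(ef_realtime, ef_reference)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A positive bias here means the first method reads higher on average than the second, which is the direction the abstract reports for the real-time reconstructions.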