Page 8 of 878 results

Shortcut learning leads to sex bias in deep learning models for photoacoustic tomography.

Knopp M, Bender CJ, Holzwarth N, Li Y, Kempf J, Caranovic M, Knieling F, Lang W, Rother U, Seitel A, Maier-Hein L, Dreher KK

pubmed papers May 9 2025
Shortcut learning has been identified as a source of algorithmic unfairness in medical imaging artificial intelligence (AI), but its impact on photoacoustic tomography (PAT), particularly concerning sex bias, remains underexplored. This study investigates this issue using peripheral artery disease (PAD) diagnosis as a specific clinical application. To examine the potential for sex bias due to shortcut learning in convolutional neural networks (CNNs) and assess how such biases might affect diagnostic predictions, we created training and test datasets with varying PAD prevalence between sexes. Using these datasets, we explored (1) whether CNNs can classify the sex from imaging data, (2) how sex-specific prevalence shifts impact PAD diagnosis performance and underdiagnosis disparity between sexes, and (3) how similarly CNNs encode sex and PAD features. Our study with 147 individuals demonstrates that CNNs can classify the sex from calf muscle PAT images, achieving an AUROC of 0.75. For PAD diagnosis, models trained on data with imbalanced sex-specific disease prevalence experienced significant performance drops (up to 0.21 AUROC) when applied to balanced test sets. Additionally, greater imbalances in sex-specific prevalence within the training data exacerbated underdiagnosis disparities between sexes. Finally, we identify evidence of shortcut learning by demonstrating the effective reuse of learned feature representations between PAD diagnosis and sex classification tasks. CNN-based models trained on PAT data may engage in shortcut learning by leveraging sex-related features, leading to biased and unreliable diagnostic predictions. Addressing demographic-specific prevalence imbalances and preventing shortcut learning is critical for developing models in the medical field that are both accurate and equitable across diverse patient populations.
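
As an illustration of the underdiagnosis disparity mentioned above, the following is a minimal sketch that assumes underdiagnosis is measured as the false negative rate (FNR) per sex and that the disparity is the difference between the two rates. The abstract does not state the exact definition used, and the function names and toy data here are hypothetical.

```python
from collections import defaultdict


def false_negative_rates(labels, preds, groups):
    """Return the false negative rate per group: missed positives / all positives."""
    positives = defaultdict(int)
    missed = defaultdict(int)
    for y, p, g in zip(labels, preds, groups):
        if y == 1:
            positives[g] += 1
            if p == 0:
                missed[g] += 1
    return {g: missed[g] / positives[g] for g in positives}


def underdiagnosis_disparity(labels, preds, groups, group_a, group_b):
    """Difference in FNR between two groups (assumed definition, not the paper's)."""
    fnr = false_negative_rates(labels, preds, groups)
    return fnr[group_a] - fnr[group_b]


# Hypothetical toy data: 1 = disease present / predicted, 0 = absent.
labels = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
groups = ["F"] * 5 + ["M"] * 5

print(false_negative_rates(labels, preds, groups))
print(underdiagnosis_disparity(labels, preds, groups, "F", "M"))
```

A positive value means that diseased individuals in the first group are missed more often than those in the second.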

Medical machine learning operations: a framework to facilitate clinical AI development and deployment in radiology.

de Almeida JG, Messiou C, Withey SJ, Matos C, Koh DM, Papanikolaou N

pubmed papers May 8 2025
The integration of machine-learning technologies into radiology practice has the potential to significantly enhance diagnostic workflows and patient care. However, the successful deployment and maintenance of medical machine-learning (MedML) systems in radiology requires robust operational frameworks. Medical machine-learning operations (MedMLOps) offer a structured approach ensuring persistent MedML reliability, safety, and clinical relevance. MedML systems are increasingly employed to analyse sensitive clinical and radiological data, which continuously changes due to advancements in data acquisition and model development. These systems can alleviate the workload of radiologists by streamlining diagnostic tasks, such as image interpretation and triage. MedMLOps ensures that such systems stay accurate and dependable by facilitating continuous performance monitoring, systematic validation, and simplified model maintenance, all critical to maintaining trust in machine-learning-driven diagnostics. Furthermore, MedMLOps aligns with established principles of patient data protection and regulatory compliance, including recent developments in the European Union, emphasising transparency, documentation, and safe model retraining. This enables radiologists to implement modern machine-learning tools with control and oversight at the forefront, ensuring reliable model performance within the dynamic context of clinical practice. MedMLOps empowers radiologists to deliver consistent, high-quality care with confidence, ensuring that MedML systems stay aligned with evolving medical standards and patient needs. MedMLOps can assist multiple stakeholders in radiology by ensuring models are available, continuously monitored and easy to use and maintain while preserving patient privacy. MedMLOps can better serve patients by facilitating the clinical implementation of cutting-edge MedML and clinicians by ensuring that MedML models are only utilised when they are performing as expected.
KEY POINTS: Question: MedML applications are becoming increasingly adopted in clinics, but the necessary infrastructure to sustain these applications is currently not well-defined. Findings: Adapting machine learning operations concepts enhances MedML ecosystems by improving interoperability, automating monitoring/validation, and reducing deployment burdens on clinicians and medical informaticians. Clinical relevance: Implementing these solutions eases the faster and safer adoption of advanced MedML models, ensuring consistent performance while reducing workload for clinicians, benefiting patient care through streamlined diagnostic workflows.

False Promises in Medical Imaging AI? Assessing Validity of Outperformance Claims

Evangelia Christodoulou, Annika Reinke, Pascaline Andrè, Patrick Godau, Piotr Kalinowski, Rola Houhou, Selen Erkan, Carole H. Sudre, Ninon Burgos, Sofiène Boutaj, Sophie Loizillon, Maëlys Solal, Veronika Cheplygina, Charles Heitz, Michal Kozubek, Michela Antonelli, Nicola Rieke, Antoine Gilson, Leon D. Mayer, Minu D. Tizabi, M. Jorge Cardoso, Amber Simpson, Annette Kopp-Schneider, Gaël Varoquaux, Olivier Colliot, Lena Maier-Hein

arxiv preprint May 7 2025
Performance comparisons are fundamental in medical imaging Artificial Intelligence (AI) research, often driving claims of superiority based on relative improvements in common performance metrics. However, such claims frequently rely solely on empirical mean performance. In this paper, we investigate whether newly proposed methods genuinely outperform the state of the art by analyzing a representative cohort of medical imaging papers. We quantify the probability of false claims based on a Bayesian approach that leverages reported results alongside empirically estimated model congruence to estimate whether the relative ranking of methods is likely to have occurred by chance. According to our results, the majority (>80%) of papers claim outperformance when introducing a new method. Our analysis further revealed a high probability (>5%) of false outperformance claims in 86% of classification papers and 53% of segmentation papers. These findings highlight a critical flaw in current benchmarking practices: claims of outperformance in medical imaging AI are frequently unsubstantiated, posing a risk of misdirecting future research efforts.
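
To give a rough sense of how a reported ranking can arise by chance, the sketch below estimates the probability that a new classifier is in truth no better than a baseline, given only their reported accuracies on a test set of known size. It is a simplified stand-in, not the paper's method: it assumes independent uniform Beta(1, 1) priors on each accuracy and ignores the model congruence that the abstract says the authors estimate. The function name and the example counts are hypothetical.

```python
import random


def prob_no_true_improvement(correct_new, correct_base, n_test, draws=20000, seed=0):
    """Monte Carlo estimate of P(true accuracy of new <= true accuracy of baseline).

    Each accuracy gets an independent Beta posterior under a uniform prior.
    This independence is a simplifying assumption and tends to overstate the
    uncertainty when both models are evaluated on the same test cases.
    """
    rng = random.Random(seed)
    not_better = 0
    for _ in range(draws):
        p_new = rng.betavariate(1 + correct_new, 1 + n_test - correct_new)
        p_base = rng.betavariate(1 + correct_base, 1 + n_test - correct_base)
        if p_new <= p_base:
            not_better += 1
    return not_better / draws


# Hypothetical example: 86.0% versus 85.0% accuracy on 1,000 test cases.
print(prob_no_true_improvement(860, 850, 1000))
```

Under these assumptions, a one-point accuracy gain on 1,000 cases leaves a substantial probability that the claimed improvement is not real, which is the kind of risk the abstract describes.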

Opinions and preferences regarding artificial intelligence use in healthcare delivery: results from a national multi-site survey of breast imaging patients.

Dontchos BN, Dodelzon K, Bhole S, Edmonds CE, Mullen LA, Parikh JR, Daly CP, Epling JA, Christensen S, Grimm LJ

pubmed papers May 6 2025
Artificial intelligence (AI) utilization is growing, but patient perceptions of AI are unclear. Our objective was to understand patient perceptions of AI through a multi-site survey of breast imaging patients. A 36-question survey was distributed to eight US practices (6 academic, 2 non-academic) from October 2023 through October 2024. This manuscript analyzes a subset of questions from the survey addressing digital health literacy and attitudes towards AI in medicine and breast imaging specifically. Multivariable analysis compared responses by respondent demographics. A total of 3,532 surveys were collected (response rate: 69.9%, 3,532/5,053). Median respondent age was 55 years (IQR 20). Most respondents were White (73.0%, 2,579/3,532) and had completed college (77.3%, 2,732/3,532). Overall, respondents were undecided (range: 43.2%-50.8%) regarding questions about general perceptions of AI in healthcare. Respondents with higher electronic health literacy, more education, and younger age were significantly more likely to consider it useful to utilize AI for aiding medical tasks (all p<0.001). In contrast, respondents with lower electronic health literacy and less education were significantly more likely to indicate it was a bad idea for AI to perform medical tasks (p<0.001). Non-White patients were more likely to express concerns that AI will not work as well for some groups compared to others (p<0.05). Overall, favorable opinions of AI use for medical tasks were associated with younger age, more education, and higher electronic health literacy. As AI is increasingly implemented into clinical workflows, it is important to educate patients and provide transparency to build patient understanding and trust.

Patients', clinicians' and developers' perspectives and experiences of artificial intelligence in cardiac healthcare: A qualitative study.

Baillie L, Stewart-Lord A, Thomas N, Frings D

pubmed papers Jan 1 2025
This study investigated perspectives and experiences of artificial intelligence (AI) developers, clinicians and patients about the use of AI-based software in cardiac healthcare. A qualitative study took place at two hospitals in England that had trialled AI-based software use in stress echocardiography, a scan that uses ultrasound to assess heart function. Semi-structured interviews were conducted with: patients (n = 9), clinicians (n = 16) and AI software developers (n = 5). Data were analysed using thematic analysis. Potential benefits identified were increasing consistency and reliability through reducing human error, and greater efficiency. Concerns included over-reliance on the AI technology, and data security. Participants discussed the need for human input and empathy within healthcare, transparency about AI use, and issues around trusting AI. Participants considered AI's role as assisting diagnosis but not replacing clinician involvement. Clinicians and patients emphasised holistic diagnosis that involves more than the scan. Clinicians considered their diagnostic ability as superior and discrepancies were managed in line with clinicians' diagnoses rather than AI reports. The practicalities of using the AI software concerned image acquisition to meet AI processing requirements and workflow integration. There was positivity towards AI use, but the AI software was considered an adjunct to clinicians rather than replacing their input. Clinicians' experiences were that their diagnostic ability remained superior to the AI, and acquiring images acceptable to AI was sometimes problematic. Despite hopes for increased efficiency through AI use, clinicians struggled to identify fit with clinical workflow to bring benefit.

YOLOv8 framework for COVID-19 and pneumonia detection using synthetic image augmentation.

A Hasib U, Md Abu R, Yang J, Bhatti UA, Ku CS, Por LY

pubmed papers Jan 1 2025
Early and accurate detection of COVID-19 and pneumonia through medical imaging is critical for effective patient management. This study aims to develop a robust framework that integrates synthetic image augmentation with advanced deep learning (DL) models to address dataset imbalance, improve diagnostic accuracy, and enhance trust in artificial intelligence (AI)-driven diagnoses through Explainable AI (XAI) techniques. The proposed framework benchmarks state-of-the-art models (InceptionV3, DenseNet, ResNet) for initial performance evaluation. Synthetic images are generated using Feature Interpolation through Linear Mapping and principal component analysis to enrich dataset diversity and balance class distribution. YOLOv8 and InceptionV3 models, fine-tuned via transfer learning, are trained on the augmented dataset. Grad-CAM is used for model explainability, while large language models (LLMs) support visualization analysis to enhance interpretability. YOLOv8 achieved superior performance with 97% accuracy, precision, recall, and F1-score, outperforming benchmark models. Synthetic data generation effectively reduced class imbalance and improved recall for underrepresented classes. Comparative analysis demonstrated significant advancements over existing methodologies. XAI visualizations (Grad-CAM heatmaps) highlighted anatomically plausible focus areas aligned with clinical markers of COVID-19 and pneumonia, thereby validating the model's decision-making process. The integration of synthetic data generation, advanced DL, and XAI significantly enhances the detection of COVID-19 and pneumonia while fostering trust in AI systems. YOLOv8's high accuracy, coupled with interpretable Grad-CAM visualizations and LLM-driven analysis, promotes transparency crucial for clinical adoption. 
Future research will focus on developing a clinically viable, human-in-the-loop diagnostic workflow, further optimizing performance through the integration of transformer-based language models to improve interpretability and decision-making.

Ensuring Fairness in Detecting Mild Cognitive Impairment with MRI.

Tong B, Edwards T, Yang S, Hou B, Tarzanagh DA, Urbanowicz RJ, Moore JH, Ritchie MD, Davatzikos C, Shen L

pubmed papers Jan 1 2024
Machine learning (ML) algorithms play a crucial role in the early and accurate diagnosis of Alzheimer's Disease (AD), which is essential for effective treatment planning. However, existing methods are not well-suited for identifying Mild Cognitive Impairment (MCI), a critical transitional stage between normal aging and AD. This inadequacy is primarily due to label imbalance and bias from different sensitive attributes in MCI classification. To overcome these challenges, we have designed an end-to-end fairness-aware approach for label-imbalanced classification, tailored specifically for neuroimaging data. This method, built on the recently developed FACIMS framework, integrates into STREAMLINE, an automated ML environment. We evaluated our approach against nine other ML algorithms and found that it achieves comparable balanced accuracy to other methods while prioritizing fairness in classifications with five different sensitive attributes. This analysis contributes to the development of equitable and reliable ML diagnostics for MCI detection.

Enhancement of Fairness in AI for Chest X-ray Classification.

Jackson NJ, Yan C, Malin BA

pubmed papers Jan 1 2024
The use of artificial intelligence (AI) in medicine has shown promise to improve the quality of healthcare decisions. However, AI can be biased in a manner that produces unfair predictions for certain demographic subgroups. In MIMIC-CXR, a publicly available dataset of over 300,000 chest X-ray images, diagnostic AI has been shown to have a higher false negative rate for racial minorities. We evaluated the capacity of synthetic data augmentation, oversampling, and demographic-based corrections to enhance the fairness of AI predictions. We show that adjusting unfair predictions for demographic attributes, such as race, is ineffective at improving fairness or predictive performance. However, using oversampling and synthetic data augmentation to modify disease prevalence reduced such disparities by 74.7% and 10.6%, respectively. Moreover, such fairness gains were accomplished without reduction in performance (95% CI AUC: [0.816, 0.820] versus [0.810, 0.819] versus [0.817, 0.821] for baseline, oversampling, and augmentation, respectively).
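
The oversampling idea described above can be sketched as follows: within each demographic group, positive cases are duplicated at random until the group reaches a common target disease prevalence. This is a minimal illustration under assumed details, since the abstract does not specify the sampling scheme. The record layout `(group, label)` and the names `oversample_to_prevalence` and `prevalence` are hypothetical.

```python
import math
import random


def prevalence(records, group):
    """Fraction of records in `group` whose label is 1."""
    labels = [label for g, label in records if g == group]
    return sum(labels) / len(labels)


def oversample_to_prevalence(records, target, seed=0):
    """Duplicate positives per group until each group's prevalence reaches `target`.

    `records` is a list of (group, label) tuples and `target` lies strictly
    between 0 and 1. Groups already at or above the target are left unchanged.
    """
    rng = random.Random(seed)
    by_group = {}
    for rec in records:
        by_group.setdefault(rec[0], []).append(rec)

    out = []
    for recs in by_group.values():
        pos = [r for r in recs if r[1] == 1]
        neg = [r for r in recs if r[1] == 0]
        out.extend(recs)
        if not pos or not neg:
            continue  # nothing to duplicate, or prevalence is already 1
        # Smallest positive count with pos / (pos + neg) >= target.
        needed = math.ceil(target * len(neg) / (1 - target))
        extra = max(0, needed - len(pos))
        out.extend(rng.choices(pos, k=extra))
    return out


# Hypothetical toy data: group A has 20% prevalence, group B has 50%.
data = [("A", 1)] * 2 + [("A", 0)] * 8 + [("B", 1)] * 5 + [("B", 0)] * 5
balanced = oversample_to_prevalence(data, target=0.5)
print(prevalence(balanced, "A"), prevalence(balanced, "B"))
```

Oversampling should be applied to the training split only, so that duplicated cases never appear in the evaluation data.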