Page 3 of 22220 results

FetalMLOps: operationalizing machine learning models for standard fetal ultrasound plane classification.

Testi M, Fiorentino MC, Ballabio M, Visani G, Ciccozzi M, Frontoni E, Moccia S, Vessio G

pubmed · paper · Sep 8 2025
Fetal standard plane detection is essential in prenatal care, enabling accurate assessment of fetal development and early identification of potential anomalies. Despite significant advancements in machine learning (ML) in this domain, its integration into clinical workflows remains limited, primarily due to the lack of standardized, end-to-end operational frameworks. To address this gap, we introduce FetalMLOps, the first comprehensive MLOps framework specifically designed for fetal ultrasound imaging. Our approach adopts a ten-step MLOps methodology that covers the entire ML lifecycle, with each phase meticulously adapted to clinical needs. From defining the clinical objective to curating and annotating fetal US datasets, every step ensures alignment with real-world medical practice. ETL (extract, transform, load) processes are developed to standardize, anonymize, and harmonize inputs, enhancing data quality. Model development prioritizes architectures that balance accuracy and efficiency, using clinically relevant evaluation metrics to guide selection. The best-performing model is deployed via a RESTful API, following MLOps best practices for continuous integration, delivery, and performance monitoring. Crucially, the framework embeds principles of explainability and environmental sustainability, promoting ethical, transparent, and responsible AI. By operationalizing ML models within a clinically meaningful pipeline, FetalMLOps bridges the gap between algorithmic innovation and real-world application, setting a precedent for trustworthy and scalable AI adoption in prenatal care.
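The ETL stage described in the abstract can be sketched roughly as below; the field names (`patient_id`, `pixels`, `plane`) and the label map are illustrative assumptions, not the authors' actual schema.

```python
from typing import Any

PHI_FIELDS = {"patient_id", "name", "birth_date"}  # identifiers to drop (anonymize)

def etl_record(record: dict[str, Any]) -> dict[str, Any]:
    # Extract: copy only non-identifying fields
    clean = {k: v for k, v in record.items() if k not in PHI_FIELDS}
    # Transform: min-max normalize pixel intensities to [0, 1] (standardize)
    px = clean["pixels"]
    lo, hi = min(px), max(px)
    clean["pixels"] = [(p - lo) / (hi - lo) for p in px] if hi > lo else [0.0] * len(px)
    # Harmonize: map vendor-specific plane labels onto one shared vocabulary
    label_map = {"abd": "abdomen", "head": "brain", "fem": "femur"}
    clean["plane"] = label_map.get(clean["plane"], clean["plane"])
    return clean

rec = {"patient_id": "P001", "pixels": [0, 128, 255], "plane": "abd"}
out = etl_record(rec)
print(out["plane"], out["pixels"])
```

The same three concerns (drop identifiers, normalize intensities, unify label vocabularies) apply whatever the real schema looks like.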

Adherence to the Checklist for Artificial Intelligence in Medical Imaging (CLAIM): an umbrella review with a comprehensive two-level analysis.

Koçak B, Köse F, Keleş A, Şendur A, Meşe İ, Karagülle M

pubmed · paper · Sep 8 2025
To comprehensively assess Checklist for Artificial Intelligence in Medical Imaging (CLAIM) adherence in medical imaging artificial intelligence (AI) literature by aggregating data from previous systematic and non-systematic reviews. A systematic search of PubMed, Scopus, and Google Scholar identified reviews using the CLAIM to evaluate medical imaging AI studies. Reviews were analyzed at two levels: review level (33 reviews; 1,458 studies) and study level (421 unique studies from 15 reviews). The CLAIM adherence metrics (scores and compliance rates), baseline characteristics, factors influencing adherence, and critiques of the CLAIM were analyzed. A review-level analysis of 26 reviews (874 studies) found a weighted mean CLAIM score of 25 [standard deviation (SD): 4] and a median of 26 [interquartile range (IQR): 8; 25<sup>th</sup>-75<sup>th</sup> percentiles: 20-28]. In a separate review-level analysis involving 18 reviews (993 studies), the weighted mean CLAIM compliance was 63% (SD: 11%), with a median of 66% (IQR: 4%; 25<sup>th</sup>-75<sup>th</sup> percentiles: 63%-67%). A study-level analysis of 421 unique studies published between 1997 and 2024 found a median CLAIM score of 26 (IQR: 6; 25<sup>th</sup>-75<sup>th</sup> percentiles: 23-29) and a median compliance of 68% (IQR: 16%; 25<sup>th</sup>-75<sup>th</sup> percentiles: 59%-75%). Adherence was independently associated with the journal impact factor quartile, publication year, and specific radiology subfields. After guideline publication, CLAIM compliance improved (<i>P</i> = 0.004). Multiple readers provided an evaluation in 85% (28/33) of reviews, but only 11% (3/28) included a reliability analysis. An item-wise evaluation identified 11 underreported items (missing in ≥50% of studies). Among the 10 identified critiques, the most common were item inapplicability to diverse study types and subjective interpretations of fulfillment. 
Our two-level analysis revealed considerable reporting gaps, underreported items, factors related to adherence, and common critiques of the CLAIM. By combining data from systematic and non-systematic reviews, these findings provide actionable targets to help researchers and journals improve transparency, reproducibility, and reporting quality in AI studies.
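The pooled figures above are study-count-weighted means across reviews; a minimal sketch of that pooling, with made-up numbers rather than the reviews' actual data:

```python
# Weighted mean of per-review CLAIM scores, weighted by the number of
# studies each review covered (toy values, for illustration only).
def weighted_mean(scores, weights):
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

review_scores = [24.0, 26.5, 22.0]  # mean CLAIM score reported by each review
review_sizes = [120, 60, 20]        # studies covered by each review
print(round(weighted_mean(review_scores, review_sizes), 2))  # → 24.55
```

Weighting by study count keeps a small review of 20 studies from counting as much as one covering 120.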

RetinaGuard: Obfuscating Retinal Age in Fundus Images for Biometric Privacy Preserving

Zhengquan Luo, Chi Liu, Dongfu Xiao, Zhen Yu, Yueye Wang, Tianqing Zhu

arxiv · preprint · Sep 7 2025
The integration of AI with medical images enables the extraction of implicit image-derived biomarkers for precise health assessment. Retinal age, a biomarker predicted from fundus images, has recently proven to be a predictor of systemic disease risks, behavioral patterns, aging trajectory, and even mortality. However, the capability to infer such sensitive biometric data raises significant privacy risks: unauthorized use of fundus images could lead to bioinformation leakage, breaching individual privacy. In response, we formulate a new research problem of biometric privacy associated with medical images and propose RetinaGuard, a novel privacy-enhancing framework that employs a feature-level generative adversarial masking mechanism to obscure retinal age while preserving image visual quality and disease diagnostic utility. The framework further utilizes a novel multiple-to-one knowledge distillation strategy, incorporating a retinal foundation model and diverse surrogate age encoders, to enable a universal defense against black-box age prediction models. Comprehensive evaluations confirm that RetinaGuard successfully obfuscates retinal age prediction with minimal impact on image quality and pathological feature representation. RetinaGuard is also flexible for extension to other medical image-derived biomarkers.
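The masking mechanism described above balances two competing terms: keep the masked image close to the original (utility) while pushing surrogate age predictions away from the true retinal age (privacy). A pure-Python toy of such an objective, with scalar "features" standing in for deep features and `lam` a hypothetical trade-off weight, not the authors' actual loss:

```python
def masking_objective(original, masked, surrogate_age_preds, true_age, lam=1.0):
    # Utility term: distortion introduced by the mask (want small)
    fidelity = sum((o - m) ** 2 for o, m in zip(original, masked)) / len(original)
    # Privacy term: averaged over several surrogate encoders (multiple-to-one),
    # rewarding large age-prediction error
    age_err = sum(abs(p - true_age) for p in surrogate_age_preds) / len(surrogate_age_preds)
    return fidelity - lam * age_err  # minimize: low distortion, high age error

# An unmasked image has zero distortion; the objective then just rewards
# whatever age error the surrogates already make.
print(masking_objective([1.0, 2.0, 3.0], [1.1, 2.0, 2.9], [50.0, 60.0], 40.0))
```

Averaging the privacy term over several surrogate encoders is what makes the defense plausible against black-box predictors the defender never sees.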

Interpretable Semi-federated Learning for Multimodal Cardiac Imaging and Risk Stratification: A Privacy-Preserving Framework.

Liu X, Li S, Zhu Q, Xu S, Jin Q

pubmed · paper · Sep 5 2025
The growing heterogeneity of cardiac patient data from hospitals and wearables necessitates predictive models that are tailored, comprehensible, and privacy-preserving. This study introduces PerFed-Cardio, a lightweight and interpretable semi-federated learning (Semi-FL) system for real-time cardiovascular risk stratification using multimodal data, including cardiac imaging, physiological signals, and electronic health records (EHR). In contrast to conventional federated learning, where all clients participate uniformly, our methodology employs a personalized Semi-FL approach in which high-capacity nodes (e.g., hospitals) conduct comprehensive training, while edge devices (e.g., wearables) refine shared models via modality-specific subnetworks. Cardiac MRI and echocardiography images are analyzed by lightweight convolutional neural networks enhanced with local attention modules that highlight diagnostically significant regions. Physiological signals (e.g., ECG, activity) and EHR data are fused through attention-based fusion layers. Model transparency is achieved using Local Interpretable Model-agnostic Explanations (LIME) and Grad-CAM, which provide spatial and feature-level explanations for each prediction. Evaluations on real multimodal datasets from 123 patients across five simulated institutions indicate that PerFed-Cardio attains an AUC-ROC of 0.972 with an inference latency of 130 ms. The personalized model calibration and targeted training reduce communication load by 28%, while maintaining an F1-score above 92% under noisy conditions. These findings position PerFed-Cardio as a privacy-conscious, adaptive, and interpretable system for scalable cardiac risk assessment.
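The attention-based fusion layer can be sketched as a softmax-weighted sum over modality embeddings; the scores below are fixed toy numbers standing in for a learned scoring network, and the two-dimensional embeddings are illustrative.

```python
import math

def attention_fuse(modality_vecs, scores):
    # Softmax the per-modality scores into fusion weights
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Fused representation = weight-averaged modality embeddings
    dim = len(modality_vecs[0])
    fused = [sum(w * v[i] for w, v in zip(weights, modality_vecs))
             for i in range(dim)]
    return fused, weights

vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # imaging, ECG, EHR embeddings (toy)
fused, w = attention_fuse(vecs, [2.0, 1.0, 0.0])
print([round(x, 3) for x in w])
```

In a trained model the scores come from a small network over each modality's features, so the weighting adapts per patient (e.g., leaning on imaging when the ECG is noisy).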

Detecting, Characterizing, and Mitigating Implicit and Explicit Racial Biases in Health Care Datasets With Subgroup Learnability: Algorithm Development and Validation Study.

Gulamali F, Sawant AS, Liharska L, Horowitz C, Chan L, Hofer I, Singh K, Richardson L, Mensah E, Charney A, Reich D, Hu J, Nadkarni G

pubmed · paper · Sep 4 2025
The growing adoption of diagnostic and prognostic algorithms in health care has led to concerns about the perpetuation of algorithmic bias against disadvantaged groups of individuals. Deep learning methods to detect and mitigate bias have revolved around modifying models, optimization strategies, and threshold calibration with varying levels of success and tradeoffs. However, there have been limited substantive efforts to address bias at the level of the data used to generate algorithms in health care datasets. The aim of this study is to create a simple metric (AEquity) that uses a learning curve approximation to distinguish and mitigate bias via guided dataset collection or relabeling. We demonstrate this metric in 2 well-known examples, chest X-rays and health care cost utilization, and detect novel biases in the National Health and Nutrition Examination Survey. We demonstrated that using AEquity to guide data-centric collection for each diagnostic finding in the chest radiograph dataset decreased bias by between 29% and 96.5% when measured by differences in area under the curve. Next, we wanted to examine (1) whether AEquity worked on intersectional populations and (2) if AEquity is invariant to different types of fairness metrics, not just area under the curve. Subsequently, we examined the effect of AEquity on mitigating bias when measured by false negative rate, precision, and false discovery rate for Black patients on Medicaid. 
When we examined Black patients on Medicaid, at the intersection of race and socioeconomic status, we found that AEquity-based interventions reduced bias across a number of different fairness metrics: overall false negative rate by 33.3% (absolute bias reduction = 1.88×10<sup>-1</sup>, 95% CI 1.4×10<sup>-1</sup> to 2.5×10<sup>-1</sup>; relative bias reduction of 33.3%, 95% CI 26.6%-40%); precision bias by 7.50×10<sup>-2</sup> (95% CI 7.48×10<sup>-2</sup> to 7.51×10<sup>-2</sup>; relative bias reduction of 94.6%, 95% CI 94.5%-94.7%); and false discovery rate by 94.5% (absolute bias reduction = 3.50×10<sup>-2</sup>, 95% CI 3.49×10<sup>-2</sup> to 3.50×10<sup>-2</sup>). Similarly, AEquity-guided data collection demonstrated bias reduction of up to 80% on mortality prediction with the National Health and Nutrition Examination Survey (absolute bias reduction = 0.08, 95% CI 0.07-0.09). We then benchmarked AEquity against state-of-the-art data-guided debiasing measures, balanced empirical risk minimization and calibration, and showed that AEquity-guided data collection outperforms both standard approaches. Moreover, we demonstrated that AEquity works on fully connected networks; convolutional neural networks such as ResNet-50; transformer architectures such as ViT-B/16, a vision transformer with 86 million parameters; and nonparametric methods such as Light Gradient-Boosting Machine. In short, AEquity is a robust tool: we applied it to different datasets, algorithms, and intersectional analyses and measured its effectiveness against a range of traditional fairness metrics.
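A learning-curve approximation in the spirit of AEquity (not the authors' exact formula) can be sketched by fitting err(n) ≈ a + b/n to each subgroup's error at increasing training sizes and comparing the asymptotes a; a persistent gap flags the subgroup for targeted data collection or relabeling.

```python
def fit_inverse_curve(ns, errs):
    # Least squares for err = a + b * (1/n), closed form for two parameters
    xs = [1 / n for n in ns]
    mx, my = sum(xs) / len(xs), sum(errs) / len(errs)
    b = sum((x - mx) * (e - my) for x, e in zip(xs, errs)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b  # a = asymptotic error, b = how data-hungry the subgroup is

# Toy learning curves: group B plateaus at a higher error than group A,
# so more data alone will not close the gap for B.
ns = [100, 200, 400, 800]
err_a = [0.30, 0.25, 0.225, 0.2125]   # consistent with a = 0.20, b = 10
err_b = [0.40, 0.35, 0.325, 0.3125]   # consistent with a = 0.30, b = 10
a_a, _ = fit_inverse_curve(ns, err_a)
a_b, _ = fit_inverse_curve(ns, err_b)
print(round(a_b - a_a, 3))  # learnability gap between subgroups
```

The point of working at the data level, as the study argues, is that this gap is visible before any model-side debiasing is attempted.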

Resting-State Functional MRI: Current State, Controversies, Limitations, and Future Directions-<i>AJR</i> Expert Panel Narrative Review.

Vachha BA, Kumar VA, Pillai JJ, Shimony JS, Tanabe J, Sair HI

pubmed · paper · Sep 3 2025
Resting-state functional MRI (rs-fMRI), a promising method for interrogating different brain functional networks from a single MRI acquisition, is increasingly used in clinical presurgical and other pretherapeutic brain mapping. However, challenges in standardization of acquisition, preprocessing, and analysis methods across centers and variability in results interpretation complicate its clinical use. Additionally, inherent problems regarding reliability of language lateralization, interpatient variability of cognitive network representation, dynamic aspects of intranetwork and internetwork connectivity, and effects of neurovascular uncoupling on network detection still must be overcome. Although deep learning solutions and further methodologic standardization will help address these issues, rs-fMRI remains generally considered an adjunct to task-based fMRI (tb-fMRI) for clinical presurgical mapping. Nonetheless, in many clinical instances, rs-fMRI may offer valuable additional information that supplements tb-fMRI, especially if tb-fMRI is inadequate due to patient performance or other limitations. Future growth in clinical applications of rs-fMRI is anticipated as challenges are increasingly addressed. This <i>AJR</i> Expert Panel Narrative Review summarizes the current state and emerging clinical utility of rs-fMRI, focusing on its role in presurgical mapping. Ongoing controversies and limitations in clinical applicability are presented and future directions are discussed, including the developing role of rs-fMRI in neuromodulation treatment of various neurologic disorders.

Utilisation of artificial intelligence to enhance the detection rates of renal cancer on cross-sectional imaging: protocol for a systematic review and meta-analysis.

Ofagbor O, Bhardwaj G, Zhao Y, Baana M, Arkwazi M, Lami M, Bolton E, Heer R

pubmed · paper · Aug 31 2025
The incidence of renal cell carcinoma has been rising steadily, driven by the increased use of imaging that identifies incidental masses. Although survival has also improved because of early detection, overdiagnosis and overtreatment of benign renal masses are associated with significant morbidity, as patients with a suspected renal malignancy on imaging undergo invasive and risky procedures to obtain a definitive diagnosis. Accurately characterising a renal mass as benign or malignant on imaging is therefore paramount to improving patient outcomes. Artificial intelligence (AI) offers an exciting solution, augmenting traditional radiological diagnosis to increase detection accuracy. This review aims to investigate and summarise the current evidence on the diagnostic accuracy of AI in characterising renal masses on imaging. We will systematically search the PubMed, MEDLINE, Embase, Web of Science, Scopus and Cochrane databases. Publications that have evaluated the use of automated AI, fully or in part, in cross-sectional imaging for diagnosing or characterising malignant renal tumours will be included if published between July 2016 and June 2025 and in English. The protocol adheres to the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols 2015 checklist. The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool will be used to evaluate the quality and risk of bias across included studies. Furthermore, in line with Checklist for Artificial Intelligence in Medical Imaging recommendations, studies will be evaluated for inclusion of the minimum necessary information on AI research reporting. Ethical clearance will not be necessary for this systematic review, and results will be disseminated through peer-reviewed publications and presentations at national and international conferences. PROSPERO registration number: CRD42024529929.

A hybrid computer vision model to predict lung cancer in diverse populations

Zakkar, A., Perwaiz, N., Harikrishnan, V., Zhong, W., Narra, V., Krule, A., Yousef, F., Kim, D., Burrage-Burton, M., Lawal, A. A., Gadi, V., Korpics, M. C., Kim, S. J., Chen, Z., Khan, A. A., Molina, Y., Dai, Y., Marai, E., Meidani, H., Nguyen, R., Salahudeen, A. A.

medrxiv · preprint · Aug 29 2025
PURPOSE: Disparities in lung cancer incidence exist in Black populations, and screening criteria underserve Black populations owing to disparately elevated risk in the screening-eligible population. Prediction models that integrate clinical and imaging-based features to individualize lung cancer risk are a potential means to mitigate these disparities.
PATIENTS AND METHODS: This multicenter (NLST) and catchment-population-based (UIH, urban and suburban Cook County) study utilized participants at risk of lung cancer with available lung CT imaging and follow-up between 2015 and 2024. 53,452 participants in NLST and 11,654 in UIH were included based on age and tobacco-use risk factors for lung cancer. The cohorts were used for training and testing of deep and machine learning models using clinical features alone or combined with CT image features (hybrid computer vision).
RESULTS: An optimized 7-clinical-feature model achieved ROC-AUC values of 0.64-0.67 in NLST and 0.60-0.65 in UIH cohorts across multiple years. Incorporating imaging features to form a hybrid computer vision model significantly improved ROC-AUC values to 0.78-0.91 in NLST, but performance deteriorated in UIH, with ROC-AUC values of 0.68-0.80, attributable to Black participants, for whom ROC-AUC values ranged from 0.63-0.72 across multiple years. Retraining the hybrid computer vision model by incorporating Black and other participants from the UIH cohort improved performance, with ROC-AUC values of 0.70-0.87 in a held-out UIH test set.
CONCLUSION: Hybrid computer vision predicted risk with improved accuracy compared to clinical risk models alone. However, potential biases in image training data reduced model generalizability in Black participants. Performance improved upon retraining with a subset of the UIH cohort, suggesting that inclusive training and validation datasets can minimize racial disparities. Future studies incorporating vision models trained on representative datasets may demonstrate improved health equity upon clinical use.
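The hybrid idea, combining a clinical risk score with an image-derived score, can be sketched as a logistic combination; the coefficients below are illustrative assumptions, not the study's fitted model.

```python
import math

def hybrid_risk(clinical_score, image_score, w_clin=1.0, w_img=2.0, bias=-3.0):
    # Logistic regression-style combination of the two score streams;
    # the heavier image weight mirrors the reported AUC gain when CT
    # features are added to the clinical features (weights are made up).
    logit = bias + w_clin * clinical_score + w_img * image_score
    return 1 / (1 + math.exp(-logit))  # predicted lung-cancer probability

print(round(hybrid_risk(clinical_score=1.0, image_score=1.5), 3))
```

The study's generalizability finding is a caution about exactly this kind of combination: if the image-score model was trained on an unrepresentative cohort, the heavier image weight imports that bias into the final risk.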

Advancements in biomedical rendering: A survey on AI-based denoising techniques.

Denisova E, Francia P, Nardi C, Bocchi L

pubmed · paper · Aug 28 2025
A recent investigation into deep learning-based denoising for early Monte Carlo (MC) path tracing in computed tomography (CT) volume visualization yielded promising quantitative outcomes but inconsistent qualitative assessments. This research probes the underlying causes of this incongruity by deploying a web-based SurveyMonkey questionnaire distributed among healthcare professionals. The survey targeted radiologists, residents, orthopedic surgeons, and veterinarians, leveraging the authors' professional networks for dissemination. To evaluate perceptions, the questionnaire featured randomized sections gauging attitudes towards AI-enhanced image and video quality, confidence in reference images, and clinical applicability. Seventy-four participants took part, encompassing a spectrum of experience levels: <1 year (n=11), 1-3 years (n=27), 3-5 years (n=12), and >5 years (n=24). A substantial majority (77%) expressed a preference for AI-enhanced images over traditional MC estimates, a preference influenced by participant experience (adjusted OR 0.81, 95% CI 0.67-0.98, p = 0.033). Experience correlates with confidence in AI-generated images (adjusted OR 0.98, 95% CI 0.95-1, p = 0.018-0.047) and satisfaction with video previews, both with and without AI (adjusted OR 0.96-0.98, 95% CI 0.92-1, p = 0.033-0.048). Significant monotonic relationships emerged between experience, confidence (σ = 0.25-0.26, p = 0.025-0.029), and satisfaction (σ = 0.23-0.24, p = 0.037-0.046). The findings underscore the potential of AI post-processing to improve the rendering of biomedical volumes, noting enhanced confidence and satisfaction among experienced participants. The study reveals that participants' preferences may not align perfectly with quality metrics such as peak signal-to-noise ratio and structural similarity index, highlighting nuances in evaluating AI's qualitative impact on CT image denoising.
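The monotonic-relationship analysis mentioned above is typically a Spearman rank correlation; a self-contained sketch on toy ordinal data (the survey's raw responses are not public, so these values are made up):

```python
def rank(xs):
    # Average ranks, with ties sharing the mean of their positions
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    # Pearson correlation of the rank vectors
    rx, ry = rank(x), rank(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

experience = [1, 1, 2, 2, 3, 3, 4, 4]    # years-of-experience bands (toy)
satisfaction = [2, 3, 3, 3, 4, 3, 5, 4]  # Likert satisfaction scores (toy)
print(round(spearman(experience, satisfaction), 2))
```

Rank correlation only asks whether more experienced respondents tend to report higher satisfaction, without assuming the Likert steps are equally spaced.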
