Safeguarding Generative AI Applications in Preclinical Imaging through Hybrid Anomaly Detection

Jakub Binda, Valentina Paneta, Vasileios Eleftheriadis, Hongkyou Chung, Panagiotis Papadimitroulas, Neo Christopher Chung

arXiv preprint, Aug 11 2025
Generative AI holds great potential to automate and enhance data synthesis in nuclear medicine. However, the high-stakes nature of biomedical imaging necessitates robust mechanisms to detect and manage unexpected or erroneous model behavior. We present the development and implementation of a hybrid anomaly detection framework to safeguard GenAI models in BIOEMTECH's eyes(TM) systems. Two applications are demonstrated: Pose2Xray, which generates synthetic X-rays from photographic mouse images, and DosimetrEYE, which estimates 3D radiation dose maps from 2D SPECT/CT scans. In both cases, our outlier detection (OD) enhances reliability, reduces manual oversight, and supports real-time quality control. This approach strengthens the industrial viability of GenAI in preclinical settings by increasing robustness, scalability, and regulatory compliance.

Post-deployment Monitoring of AI Performance in Intracranial Hemorrhage Detection by ChatGPT.

Rohren E, Ahmadzade M, Colella S, Kottler N, Krishnan S, Poff J, Rastogi N, Wiggins W, Yee J, Zuluaga C, Ramis P, Ghasemi-Rad M

PubMed paper, Aug 11 2025
To evaluate the post-deployment performance of an artificial intelligence (AI) system (Aidoc) for intracranial hemorrhage (ICH) detection and assess the utility of ChatGPT-4 Turbo for automated AI monitoring. This retrospective study evaluated 332,809 head CT examinations from 37 radiology practices across the United States (December 2023-May 2024). Of these, 13,569 cases were flagged as positive for ICH by the Aidoc AI system. A HIPAA (Health Insurance Portability and Accountability Act)-compliant version of ChatGPT-4 Turbo was used to extract data from radiology reports. Ground truth was established through radiologists' review of 200 randomly selected cases. Performance metrics were calculated for ChatGPT, Aidoc and radiologists. ChatGPT-4 Turbo demonstrated high diagnostic accuracy in identifying ICH from radiology reports, with a positive predictive value of 1 and a negative predictive value of 0.988 (AUC: 0.996). Aidoc's false positive classifications were influenced by scanner manufacturer, midline shift, mass effect, artifacts, and neurologic symptoms. Multivariate analysis identified Philips scanners (OR: 6.97, p=0.003) and artifacts (OR: 3.79, p=0.029) as significant contributors to false positives, while midline shift (OR: 0.08, p=0.021) and mass effect (OR: 0.18, p=0.021) were associated with a reduced false positive rate. Aidoc-assisted radiologists achieved a sensitivity of 0.936 and a specificity of 1. This study underscores the importance of continuous performance monitoring for AI systems in clinical practice. The integration of LLMs offers a scalable solution for evaluating AI performance, ensuring reliable deployment and enhancing diagnostic workflows.
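
The predictive values, sensitivity, and specificity quoted above all derive from a 2x2 confusion matrix. A minimal sketch of those definitions follows. The counts are invented for illustration and are not taken from the study, and the helper name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return PPV, NPV, sensitivity and specificity from 2x2 counts."""
    return {
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts for 200 reviewed cases, NOT the study's data.
metrics = confusion_metrics(tp=90, fp=0, fn=2, tn=108)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With zero false positives, PPV and specificity are both exactly 1, which is the pattern the abstract reports for PPV; the other values depend entirely on the made-up counts.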

Spinal-QDCNN: advanced feature extraction for brain tumor detection using MRI images.

T L, J JJ, Rani VV, Saini ML

PubMed paper, Aug 9 2025
Brain tumors arise from the abnormal development of cells in the brain. They adversely affect human health, and early diagnosis is required to improve patient survival. Various brain tumor detection models have therefore been developed. However, existing methods often suffer from limited accuracy and inefficient learning architectures, and traditional approaches cannot effectively detect small and subtle changes in brain cells. To overcome these limitations, a SpinalNet-Quantum Dilated Convolutional Neural Network (Spinal-QDCNN) model, which combines QDCNN and SpinalNet, is proposed for detecting brain tumors using MRI images. First, the input brain image is pre-processed using RoI extraction. Image enhancement is then performed using the thresholding transformation, followed by segmentation using Projective Adversarial Networks (PAN). Next, random erasing, flipping, and resizing are applied in the image augmentation phase. This is followed by feature extraction, where statistical features (average contrast, kurtosis, skewness, and mean), Gabor wavelet features, and Discrete Wavelet Transform (DWT) with Gradient Binary Pattern (GBP) features are extracted. Finally, detection is performed using Spinal-QDCNN. The proposed method attained a maximum accuracy of 86.356%, sensitivity of 87.37%, and specificity of 88.357%.

Deep learning in rib fracture imaging: study quality assessment using the Must AI Criteria-10 (MAIC-10) checklist for artificial intelligence in medical imaging.

Getzmann JM, Nulle K, Mennini C, Viglino U, Serpi F, Albano D, Messina C, Fusco S, Gitto S, Sconfienza LM

PubMed paper, Aug 9 2025
To analyze the methodological quality of studies on deep learning (DL) in rib fracture imaging with the Must AI Criteria-10 (MAIC-10) checklist, and to report insights and experiences regarding the applicability of the MAIC-10 checklist. An electronic literature search was conducted on the PubMed database. After selection of articles, three radiologists independently rated the articles according to MAIC-10. Differences of the MAIC-10 score for each checklist item were assessed using the Fleiss' kappa coefficient. A total of 25 original articles discussing DL applications in rib fracture imaging were identified. Most studies focused on fracture detection (n = 21, 84%). In most of the research papers, internal cross-validation of the dataset was performed (n = 16, 64%), while only six studies (24%) conducted external validation. The mean MAIC-10 score of the 25 studies was 5.63 (SD, 1.84; range 1-8), with the item "clinical need" being reported most consistently (100%) and the item "study design" being most frequently reported incompletely (94.8%). The average inter-rater agreement for the MAIC-10 score was 0.771. The MAIC-10 checklist is a valid tool for assessing the quality of AI research in medical imaging with good inter-rater agreement. With regard to rib fracture imaging, items such as "study design", "explainability", and "transparency" were often not comprehensively addressed. AI in medical imaging has become increasingly common. Therefore, quality control systems of published literature such as the MAIC-10 checklist are needed to ensure high quality research output. Quality control systems are needed for research on AI in medical imaging. The MAIC-10 checklist is a valid tool to assess AI in medical imaging research quality. Checklist items such as "study design", "explainability", and "transparency" are frequently addressed incomprehensively.
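
Agreement among the three radiologists is summarized with Fleiss' kappa. A minimal sketch of that statistic follows, assuming each checklist item is rated into one of a fixed set of categories by the same number of raters. The ratings table is invented (three raters, two categories) and is not the study's data; `fleiss_kappa` is an illustrative helper name.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned subject i to category j.

    Every row must sum to the same number of raters (at least 2).
    The statistic is undefined when all ratings fall in one category.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")
    n_categories = len(counts[0])
    total = n_subjects * n_raters

    # Proportion of all ratings that fall in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Observed agreement for each subject.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_subj) / n_subjects   # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical ratings: 5 items, 3 raters, columns = [reported, not reported].
ratings = [[3, 0], [2, 1], [0, 3], [3, 0], [1, 2]]
kappa = fleiss_kappa(ratings)
print(f"Fleiss' kappa: {kappa:.3f}")
```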

Medical application driven content based medical image retrieval system for enhanced analysis of X-ray images.

Saranya E, Chinnadurai M

PubMed paper, Aug 8 2025
By carefully analyzing latent image properties, content-based image retrieval (CBIR) systems are able to recover pertinent images without relying on text descriptions, natural language tags, or keywords related to the image. This search procedure makes it quite easy to automatically retrieve images in huge, well-balanced datasets. However, in the medical field, such datasets are usually not available. This study proposed an advanced DL technique to enhance the accuracy of image retrieval in complex medical datasets. The proposed model comprises five stages: pre-processing, image decomposition, feature extraction, dimensionality reduction, and classification with an image retrieval mechanism. The hybridized Wavelet-Hadamard Transform (HWHT) was utilized to obtain both low and high frequency detail for analysis. To extract the main characteristics, the Gray Level Co-occurrence Matrix (GLCM) was employed. Furthermore, to minimize feature complexity, Sine chaos based artificial rabbit optimization (SCARO) was utilized. By employing the Bhattacharyya Coefficient for improved similarity matching, the Bhattacharya Context performance aware global attention-based Transformer (BCGAT) improves classification accuracy. On the COVID-19 Chest X-ray image dataset, the model attained accuracy, precision, recall, and F1-score of 99.5%, 97.1%, 97.1%, and 97.1%, respectively. On the chest X-ray image (pneumonia) dataset, it attained accuracy, precision, recall, and F1-score of 98.60%, 98.49%, 97.40%, and 98.50%, respectively. For the NIH chest X-ray dataset, the accuracy was 99.67%.

Artificial Intelligence for the Detection of Fetal Ultrasound Findings Concerning for Major Congenital Heart Defects.

Zelop CM, Lam-Rachlin J, Arunamata A, Punn R, Behera SK, Lachaud M, David N, DeVore GR, Rebarber A, Fox NS, Gayanilo M, Garmel S, Boukobza P, Uzan P, Joly H, Girardot R, Cohen L, Stos B, De Boisredon M, Askinazi E, Thorey V, Gardella C, Levy M, Geiger M

PubMed paper, Aug 7 2025
To evaluate the performance of an artificial intelligence (AI)-based software to identify second-trimester fetal ultrasound examinations suspicious for congenital heart defects. The software analyzes all grayscale two-dimensional ultrasound cine clips of an examination to evaluate eight morphologic findings associated with severe congenital heart defects. A data set of 877 examinations was retrospectively collected from 11 centers. The presence of suspicious findings was determined by a panel of expert pediatric cardiologists, who determined that 311 examinations had at least one of the eight suspicious findings. The AI software processed each examination, labeling each finding as present, absent, or inconclusive. Of the 280 examinations with known severe congenital heart defects, 278 (sensitivity 0.993, 95% CI, 0.974-0.998) had at least one of the eight suspicious findings present as determined by the fetal cardiologists, highlighting the relevance of these eight findings. We then evaluated the performance of the AI software, which identified at least one finding as present in 271 examinations, that all eight findings were absent in five examinations, and was inconclusive in four of the 280 examinations with severe congenital heart defects, yielding a sensitivity of 0.968 (95% CI, 0.940-0.983) for severe congenital heart defects. When comparing the AI to the determination of findings by fetal cardiologists, the detection of any finding by the AI had a sensitivity of 0.987 (95% CI, 0.967-0.995) and a specificity of 0.977 (95% CI, 0.961-0.986) after exclusion of inconclusive examinations. The AI rendered a decision for any finding (either present or absent) in 98.7% of examinations. The AI-based software demonstrated high accuracy in identification of suspicious findings associated with severe congenital heart defects, yielding a high sensitivity for detecting severe congenital heart defects. 
These results show that AI has potential to improve antenatal congenital heart defect detection.
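
The two sensitivities for severe congenital heart defects follow directly from the counts given in the abstract (278/280 and 271/280). The abstract does not state how the confidence intervals were computed; as an assumption, the sketch below uses the Wilson score interval, which for these counts lands within about 0.001 of the reported bounds. `wilson_interval` is an illustrative helper name.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 280 examinations with severe defects.
for label, detected in [("cardiologist findings", 278), ("AI software", 271)]:
    low, high = wilson_interval(detected, 280)
    print(f"{label}: {detected / 280:.3f} (95% CI {low:.3f}-{high:.3f})")
```

The Wilson interval is a common choice when the proportion is close to 0 or 1, because the simpler normal-approximation interval can extend past 1 in that regime.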

Automated detection of wrist ganglia in MRI using convolutional neural networks.

Hämäläinen M, Sormaala M, Kaseva T, Salli E, Savolainen S, Kangasniemi M

PubMed paper, Aug 7 2025
To investigate feasibility of a method which combines segmenting convolutional neural networks (CNN) for the automated detection of ganglion cysts in 2D MRI of the wrist. The study serves as proof-of-concept, demonstrating a method to decrease false positives and offering an efficient solution for ganglia detection. We retrospectively analyzed 58 MRI studies with wrist ganglia, each including 2D axial, sagittal, and coronal series. Manual segmentations were performed by a radiologist and used to train CNNs for automatic segmentation of each orthogonal series. Predictions were fused into a single 3D volume using a proposed prediction fusion method. Performance was evaluated over all studies using six-fold cross-validation, comparing method variations with metrics including true positive rate, number of false positives, and F-score metrics. The proposed method reached mean TPR of 0.57, mean FP of 0.4 and mean F-score of 0.53. Fusion of series predictions decreased the number of false positives significantly but also decreased TPR values. CNNs can detect ganglion cysts in wrist MRI. The number of false positives can be decreased by a method of prediction fusion from multiple CNNs.

Automated detection of zygomatic fractures on spiral computed tomography using a deep learning model.

Yari A, Fasih P, Kamali Hakim L, Asadi A

PubMed paper, Aug 6 2025
The aim of this study was to evaluate the performance of the YOLOv8 deep learning model for detecting zygomatic fractures. Computed tomography scans with zygomatic fractures were collected, with all slices annotated to identify fracture lines across seven categories: zygomaticomaxillary suture, zygomatic arch, zygomaticofrontal suture, sphenozygomatic suture, orbital floor, zygomatic body, and maxillary sinus wall. The images were divided into training, validation, and test datasets in a 6:2:2 ratio. Performance metrics were calculated for each category. A total of 13,988 axial and 14,107 coronal slices were retrieved. The trained algorithm achieved accuracy of 94.2-97.9%. Recall exceeded 90% across all categories, with sphenozygomatic suture fractures having the highest value (96.6%). Average precision was highest for zygomatic arch fractures (0.827) and lowest for zygomatic body fractures (0.692). The highest F1 score was 96.7% for zygomaticomaxillary suture fractures, and the lowest was 82.1% for zygomatic body fractures. Area under the curve (AUC) values were also highest for zygomaticomaxillary suture (0.943) and lowest for zygomatic body fractures (0.876). The YOLOv8 model demonstrated promising results in the automated detection of zygomatic fractures, achieving the highest performance in identifying fractures of the zygomaticomaxillary suture and zygomatic arch.

Equivariant Spatiotemporal Transformers with MDL-Guided Feature Selection for Malignancy Detection in Dynamic PET

Dadashkarimi, M.

medRxiv preprint, Aug 6 2025
Dynamic Positron Emission Tomography (PET) scans offer rich spatiotemporal data for detecting malignancies, but their high dimensionality and noise pose significant challenges. We introduce a novel framework, the Equivariant Spatiotemporal Transformer with MDL-Guided Feature Selection (EST-MDL), which integrates group-theoretic symmetries, Kolmogorov complexity, and Minimum Description Length (MDL) principles. By enforcing spatial and temporal symmetries (e.g., translations and rotations) and leveraging MDL for robust feature selection, our model achieves improved generalization and interpretability. Evaluated on three real-world PET datasets (LUNG-PET, BRAIN-PET, and BREAST-PET), our approach achieves AUCs of 0.94, 0.92, and 0.95, respectively, outperforming CNNs, Vision Transformers (ViTs), and Graph Neural Networks (GNNs) in AUC, sensitivity, specificity, and computational efficiency. This framework offers a robust, interpretable solution for malignancy detection in clinical settings.

Automated vertebral bone quality score measurement on lumbar MRI using deep learning: Development and validation of an AI algorithm.

Jayasuriya NM, Feng E, Nathani KR, Delawan M, Katsos K, Bhagra O, Freedman BA, Bydon M

PubMed paper, Aug 5 2025
Bone health is a critical determinant of spine surgery outcomes, yet many patients undergo procedures without adequate preoperative assessment due to limitations in current bone quality assessment methods. This study aimed to develop and validate an artificial intelligence-based algorithm that predicts Vertebral Bone Quality (VBQ) scores from routine MRI scans, enabling improved preoperative identification of patients at risk for poor surgical outcomes. This study utilized 257 lumbar spine T1-weighted MRI scans from the SPIDER challenge dataset. VBQ scores were calculated through a three-step process: selecting the mid-sagittal slice, measuring vertebral body signal intensity from L1-L4, and normalizing by cerebrospinal fluid signal intensity. A YOLOv8 model was developed to automate region of interest placement and VBQ score calculation. The system was validated against manual annotations from 47 lumbar spine surgery patients, with performance evaluated using precision, recall, mean average precision, intraclass correlation coefficient, Pearson correlation, RMSE, and mean error. The YOLOv8 model demonstrated high accuracy in vertebral body detection (precision: 0.9429, recall: 0.9076, mAP@0.5: 0.9403, mAP@[0.5:0.95]: 0.8288). Strong interrater reliability was observed with ICC values of 0.95 (human-human), 0.88 and 0.93 (human-AI). Pearson correlations for VBQ scores between human and AI measurements were 0.86 and 0.9, with RMSE values of 0.58 and 0.42 respectively. The AI-based algorithm accurately predicts VBQ scores from routine lumbar MRIs. This approach has potential to enhance early identification and intervention for patients with poor bone health, leading to improved surgical outcomes. Further external validation is recommended to ensure generalizability and clinical applicability.
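
The final normalization step of the VBQ calculation can be sketched as follows. The abstract only says that L1-L4 vertebral body signal intensity is normalized by cerebrospinal fluid (CSF) signal intensity; taking the median across the four levels is an assumption here, and the intensity values and the helper name `vbq_score` are made up for illustration.

```python
from statistics import median


def vbq_score(vertebral_si, csf_si):
    """Ratio of aggregated L1-L4 signal intensity to CSF signal intensity.

    vertebral_si: mapping of level name to mean ROI signal intensity.
    csf_si: mean signal intensity of the CSF region of interest.
    Aggregating with the median is an assumption, not stated in the abstract.
    """
    if csf_si <= 0:
        raise ValueError("CSF signal intensity must be positive")
    return median(vertebral_si.values()) / csf_si


# Hypothetical ROI intensities from a mid-sagittal T1-weighted slice.
roi_intensity = {"L1": 310.0, "L2": 325.0, "L3": 298.0, "L4": 341.0}
score = vbq_score(roi_intensity, csf_si=120.0)
print(f"VBQ score: {score:.2f}")
```

In an automated pipeline of the kind described, the detector would supply the regions of interest and this ratio would be computed from their mean intensities.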