Latest Papers on Radiology AI. Sources: medrxiv, Tags: Detection, Order: Best Match, Limit: 10.

Equivariant Spatiotemporal Transformers with MDL-Guided Feature Selection for Malignancy Detection in Dynamic PET

Dadashkarimi, M.

•preprint•Aug 6 2025

Dynamic Positron Emission Tomography (PET) scans offer rich spatiotemporal data for detecting malignancies, but their high-dimensionality and noise pose significant challenges. We introduce a novel framework, the Equivariant Spatiotemporal Transformer with MDL-Guided Feature Selection (EST-MDL), which integrates group-theoretic symmetries, Kolmogorov complexity, and Minimum Description Length (MDL) principles. By enforcing spatial and temporal symmetries (e.g., translations and rotations) and leveraging MDL for robust feature selection, our model achieves improved generalization and interpretability. Evaluated on three realworld PET datasets--LUNG-PET, BRAIN-PET, and BREAST-PET--our approach achieves AUCs of 0.94, 0.92, and 0.95, respectively, outperforming CNNs, Vision Transformers (ViTs), and Graph Neural Networks (GNNs) in AUC, sensitivity, specificity, and computational efficiency. This framework offers a robust, interpretable solution for malignancy detection in clinical settings.

PET Detection Retrospective Clinical In Silico Benchmark SOTA

BrainSignsNET: A Deep Learning Model for 3D Anatomical Landmark Detection in the Human Brain Imaging

shirzadeh barough, s., Ventura, C., Bilgel, M., Albert, M., Miller, M. I., Moghekar, A.

•preprint•Aug 5 2025

Accurate detection of anatomical landmarks in brain Magnetic Resonance Imaging (MRI) scans is essential for reliable spatial normalization, image alignment, and quantitative neuroimaging analyses. In this study, we introduce BrainSignsNET, a deep learning framework designed for robust three-dimensional (3D) landmark detection. Our approach leverages a multi-task 3D convolutional neural network that integrates an attention decoder branch with a multi-class decoder branch to generate precise 3D heatmaps, from which landmark coordinates are extracted. The model was trained and internally validated on T1-weighted Magnetization-Prepared Rapid Gradient-Echo (MPRAGE) scans from the Alzheimers Disease Neuroimaging Initiative (ADNI), the Baltimore Longitudinal Study of Aging (BLSA), and the Biomarkers of Cognitive Decline in Adults at Risk for AD (BIOCARD) datasets and externally validated on a clinical dataset from the Johns Hopkins Hydrocephalus Clinic. The study encompassed 14,472 scans from 6,299 participants, representing a diverse demographic profile with a significant proportion of older adult participants, particularly those over 70 years of age. Extensive preprocessing and data augmentation strategies, including traditional MRI corrections and tailored 3D transformations, ensured data consistency and improved model generalizability. Performance metrics demonstrated that on internal validation BrainSignsNET achieved an overall mean Euclidean distance of 2.32 {+/-} 0.41 mm and 94.8% of landmarks localized within their anatomically defined 3D volumes in the external validation dataset. This improvement in accurate anatomical landmark detection on brain MRI scans should benefit many imaging tasks, including registration, alignment, and quantitative analyses.

MRI Detection Neurological Methodology In Silico

Detecting Fifth Metatarsal Fractures on Radiographs through the Lens of Smartphones: A FIXUS AI Algorithm

Taseh, A., Shah, A., Eftekhari, M., Flaherty, A., Ebrahimi, A., Jones, S., Nukala, V., Nazarian, A., Waryasz, G., Ashkani-Esfahani, S.

•preprint•Jul 18 2025

BackgroundFifth metatarsal (5MT) fractures are common but challenging to diagnose, particularly with limited expertise or subtle fractures. Deep learning shows promise but faces limitations due to image quality requirements. This study develops a deep learning model to detect 5MT fractures from smartphone-captured radiograph images, enhancing accessibility of diagnostic tools. MethodsA retrospective study included patients aged >18 with 5MT fractures (n=1240) and controls (n=1224). Radiographs (AP, oblique, lateral) from Electronic Health Records (EHR) were obtained and photographed using a smartphone, creating a new dataset (SP). Models using ResNet 152V2 were trained on EHR, SP, and combined datasets, then evaluated on a separate smartphone test dataset (SP-test). ResultsOn validation, the SP model achieved optimal performance (AUROC: 0.99). On the SP-test dataset, the EHR models performance decreased (AUROC: 0.83), whereas SP and combined models maintained high performance (AUROC: 0.99). ConclusionsSmartphone-specific deep learning models effectively detect 5MT fractures, suggesting their practical utility in resource-limited settings.

X-Ray Detection Musculoskeletal Retrospective Clinical In Silico

Automated Detection of Lacunes in Brain MR Images Using SAM with Robust Prompts via Self-Distillation and Anatomy-Informed Priors

Deepika, P., Shanker, G., Narayanan, R., Sundaresan, V.

•preprint•Jul 10 2025

Lacunes, which are small fluid-filled cavities in the brain, are signs of cerebral small vessel disease and have been clinically associated with various neurodegenerative and cerebrovascular diseases. Hence, accurate detection of lacunes is crucial and is one of the initial steps for the precise diagnosis of these diseases. However, developing a robust and consistently reliable method for detecting lacunes is challenging because of the heterogeneity in their appearance, contrast, shape, and size. To address the above challenges, in this study, we propose a lacune detection method using the Segment Anything Model (SAM), guided by point prompts from a candidate prompt generator. The prompt generator initially detects potential lacunes with a high sensitivity using a composite loss function. The SAM model selects true lacunes by delineating their characteristics from mimics such as the sulcus and enlarged perivascular spaces, imitating the clinicians strategy of examining the potential lacunes along all three axes. False positives were further reduced by adaptive thresholds based on the region-wise prevalence of lacunes. We evaluated our method on two diverse, multi-centric MRI datasets, VALDO and ISLES, comprising only FLAIR sequences. Despite diverse imaging conditions and significant variations in slice thickness (0.5-6 mm), our method achieved sensitivities of 84% and 92%, with average false positive rates of 0.05 and 0.06 per slice in ISLES and VALDO datasets respectively. The proposed method outperformed the state-of-the-art methods, demonstrating its effectiveness in lacune detection and quantification.

MRI Detection Neurological Retrospective Clinical In Silico Benchmark SOTA

Artificial Intelligence in Prenatal Ultrasound: A Systematic Review of Diagnostic Tools for Detecting Congenital Anomalies

Dunne, J., Kumarasamy, C., Belay, D. G., Betran, A. P., Gebremedhin, A. T., Mengistu, S., Nyadanu, S. D., Roy, A., Tessema, G., Tigest, T., Pereira, G.

•preprint•Jul 5 2025

BackgroundArtificial intelligence (AI) has potentially shown promise in interpreting ultrasound imaging through flexible pattern recognition and algorithmic learning, but implementation in clinical practice remains limited. This study aimed to investigate the current application of AI in prenatal ultrasounds to identify congenital anomalies, and to synthesise challenges and opportunities for the advancement of AI-assisted ultrasound diagnosis. This comprehensive analysis addresses the clinical translation gap between AI performance metrics and practical implementation in prenatal care. MethodsSystematic searches were conducted in eight electronic databases (CINAHL Plus, Ovid/EMBASE, Ovid/MEDLINE, ProQuest, PubMed, Scopus, Web of Science and Cochrane Library) and Google Scholar from inception to May 2025. Studies were included if they applied an AI-assisted ultrasound diagnostic tool to identify a congenital anomaly during pregnancy. This review adhered to PRISMA guidelines for systematic reviews. We evaluated study quality using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) guidelines. FindingsOf 9,918 records, 224 were identified for full-text review and 20 met the inclusion criteria. The majority of studies (11/20, 55%) were conducted in China, with most published after 2020 (16/20, 80%). All AI models were developed as an assistive tool for anomaly detection or classification. Most models (85%) focused on single-organ systems: heart (35%), brain/cranial (30%), or facial features (20%), while three studies (15%) attempted multi-organ anomaly detection. Fifty percent of the included studies reported exceptionally high model performance, with both sensitivity and specificity exceeding 0.95, with AUC-ROC values ranging from 0.91 to 0.97. Most studies (75%) lacked external validation, with internal validation often limited to small training and testing datasets. InterpretationWhile AI applications in prenatal ultrasound showed potential, current evidence indicates significant limitations in their practical implementation. Much work is required to optimise their application, including the external validation of diagnostic models with clinical utility to have real-world implications. Future research should prioritise larger-scale multi-centre studies, developing multi-organ anomaly detection capabilities rather than the current single-organ focus, and robust evaluation of AI tools in real-world clinical settings.

Ultrasound Detection Review In Silico Academic Lab Benchmark SOTA

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Sung, J., Kitonsa, P. J., Nalutaaya, A., Isooba, D., Birabwa, S., Ndyabayunga, K., Okura, R., Magezi, J., Nantale, D., Mugabi, I., Nakiiza, V., Dowdy, D. W., Katamba, A., Kendall, E. A.

•preprint•Jun 24 2025

BackgroundComputer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics. MethodsWe screened for TB in Ugandan individuals aged [≥]15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores above a threshold of 0.1 (range, 0 - 1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches. FindingsOf 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores [≥]0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of [≥]0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046). InterpretationThe accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening. FundingNational Institutes of Health Research in contextO_ST_ABSEvidence before this studyC_ST_ABSThe World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and population and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported area under the curves (AUC) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained with adjusting CAD thresholds between individuals based on personal characteristics. Added value of this studyIn this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were [≥]15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of [≥]90% sensitivity and [≥]70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity. Implications of all the available evidenceTo obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.

X-Ray Detection Chest Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

Radiologist-AI workflow can be modified to reduce the risk of medical malpractice claims

Bernstein, M., Sheppard, B., Bruno, M. A., Lay, P. S., Baird, G. L.

•preprint•Jun 16 2025

BackgroundArtificial Intelligence (AI) is rapidly changing the legal landscape of radiology. Results from a previous experiment suggested that providing AI error rates can reduce perceived radiologist culpability, as judged by mock jury members (4). The current study advances this work by examining whether the radiologists behavior also impacts perceptions of liability. Methods. Participants (n=282) read about a hypothetical malpractice case where a 50-year-old who visited the Emergency Department with acute neurological symptoms received a brain CT scan to determine if bleeding was present. An AI system was used by the radiologist who interpreted imaging. The AI system correctly flagged the case as abnormal. Nonetheless, the radiologist concluded no evidence of bleeding, and the blood-thinner t-PA was administered. Participants were randomly assigned to either a 1.) single-read condition, where the radiologist interpreted the CT once after seeing AI feedback, or 2.) a double-read condition, where the radiologist interpreted the CT twice, first without AI and then with AI feedback. Participants were then told the patient suffered irreversible brain damage due to the missed brain bleed, resulting in the patient (plaintiff) suing the radiologist (defendant). Participants indicated whether the radiologist met their duty of care to the patient (yes/no). Results. Hypothetical jurors were more likely to side with the plaintiff in the single-read condition (106/142, 74.7%) than in the double-read condition (74/140, 52.9%), p=0.0002. Conclusion. This suggests that the penalty for disagreeing with correct AI can be mitigated when images are interpreted twice, or at least if a radiologist gives an interpretation before AI is used.

CT Detection Neurological Retrospective Clinical Post Market Academic Lab Ethics Policy

Detecting neurodegenerative changes in glaucoma using deep mean kurtosis-curve-corrected tractometry

Kasa, L. W., Schierding, W., Kwon, E., Holdsworth, S., Danesh-Meyer, H. V.

•preprint•Jun 6 2025

Glaucoma is increasingly recognized as a neurodegenerative condition involving both retinal and central nervous system structures. Here, we present an integrated framework that combines MK-Curve-corrected diffusion kurtosis imaging (DKI), tractometry, and deep autoencoder-based normative modeling to detect localized white matter abnormalities associated with glaucoma. Using UK Biobank diffusion MRI data, we show that MK-Curve approach corrects anatomically implausible values and improves the reliability of DKI metrics - particularly mean (MK), radial (RK), and axial kurtosis (AK) - in regions of complex fiber architecture. Tractometry revealed reduced MK in glaucoma patients along the optic radiation, inferior longitudinal fasciculus, and inferior fronto-occipital fasciculus, but not in a non-visual control tract, supporting disease specificity. These abnormalities were spatially localized, with significant changes observed at multiple points along the tracts. MK demonstrated greater sensitivity than MD and exhibited altered distributional features, reflecting microstructural heterogeneity not captured by standard metrics. Node-wise MK values in the right optic radiation showed weak but significant correlations with retinal OCT measures (ganglion cell layer and retinal nerve fiber layer thickness), reinforcing the biological relevance of these findings. Deep autoencoder-based modeling further enabled subject-level anomaly detection that aligned spatially with group-level changes and outperformed traditional approaches. Together, our results highlight the potential of advanced diffusion modeling and deep learning for sensitive, individualized detection of glaucomatous neurodegeneration and support their integration into future multimodal imaging pipelines in neuro-ophthalmology.

MRI Detection Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Rad-Path Correlation of Deep Learning Models for Prostate Cancer Detection on MRI

Verde, A. S. C., de Almeida, J. G., Mendes, F., Pereira, M., Lopes, R., Brito, M. J., Urbano, M., Correia, P. S., Gaivao, A. M., Firpo-Betancourt, A., Fonseca, J., Matos, C., Regge, D., Marias, K., Tsiknakis, M., ProCAncer-I Consortium,, Conceicao, R. C., Papanikolaou, N.

•preprint•Jun 4 2025

While Deep Learning (DL) models trained on Magnetic Resonance Imaging (MRI) have shown promise for prostate cancer detection, their lack of direct biological validation often undermines radiologists trust and hinders clinical adoption. Radiologic-histopathologic (rad-path) correlation has the potential to validate MRI-based lesion detection using digital histopathology. This study uses automated and manually annotated digital histopathology slides as a standard of reference to evaluate the spatial extent of lesion annotations derived from both radiologist interpretations and DL models previously trained on prostate bi-parametric MRI (bp-MRI). 117 histopathology slides were used as reference. Prospective patients with clinically significant prostate cancer performed a bp-MRI examination before undergoing a robotic radical prostatectomy, and each prostate specimen was sliced using a 3D-printed patient-specific mold to ensure a direct comparison between pre-operative imaging and histopathology slides. The histopathology slides and their corresponding T2-weighted MRI images were co-registered. We trained DL models for cancer detection on large retrospective datasets of T2-w MRI only, bp-MRI and histopathology images and did inference in a prospective patient cohort. We evaluated the spatial extent between detected lesions and between detected lesions and the histopathological and radiological ground-truth, using the Dice similarity coefficient (DSC). The DL models trained on digital histopathology tiles and MRI images demonstrated promising capabilities in lesion detection. A low overlap was observed between the lesion detection masks generated by the histopathology and bp-MRI models, with a DSC = 0.10. However, the overlap was equivalent (DSC = 0.08) between radiologist annotations and histopathology ground truth. A rad-path correlation pipeline was established in a prospective patient cohort with prostate cancer undergoing surgery. The correlation between rad-path DL models was low but comparable to the overlap between annotations. While DL models show promise in prostate cancer detection, challenges remain in integrating MRI-based predictions with histopathological findings.

MRI Detection Abdominal Prospective In Silico Consortium

Artificial Intelligence-Driven Innovations in Diabetes Care and Monitoring

Abdul Rahman, S., Mahadi, M., Yuliana, D., Budi Susilo, Y. K., Ariffin, A. E., Amgain, K.

•preprint•Jun 2 2025

This study explores Artificial Intelligence (AI)s transformative role in diabetes care and monitoring, focusing on innovations that optimize patient outcomes. AI, particularly machine learning and deep learning, significantly enhances early detection of complications like diabetic retinopathy and improves screening efficacy. The methodology employs a bibliometric analysis using Scopus, VOSviewer, and Publish or Perish, analyzing 235 articles from 2023-2025. Results indicate a strong interdisciplinary focus, with Computer Science and Medicine being dominant subject areas (36.9% and 12.9% respectively). Bibliographic coupling reveals robust international collaborations led by the U.S. (1558.52 link strength), UK, and China, with key influential documents by Zhu (2023c) and Annuzzi (2023). This research highlights AIs impact on enhancing monitoring, personalized treatment, and proactive care, while acknowledging challenges in data privacy and ethical deployment. Future work should bridge technological advancements with real-world implementation to create equitable and efficient diabetes care systems.

OCT Detection Review Concept Academic Lab Ethics