
High-performance Open-source AI for Breast Cancer Detection and Localization in MRI.

Hirsch L, Sutton EJ, Huang Y, Kayis B, Hughes M, Martinez D, Makse HA, Parra LC

PubMed | Jun 25, 2025
<i>"Just Accepted" papers have undergone full peer review and have been accepted for publication in <i>Radiology: Artificial Intelligence</i>. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.</i> Purpose To develop and evaluate an open-source deep learning model for detection and localization of breast cancer on MRI. Materials and Methods In this retrospective study, a deep learning model for breast cancer detection and localization was trained on the largest breast MRI dataset to date. Data included all breast MRIs conducted at a tertiary cancer center in the United States between 2002 and 2019. The model was validated on sagittal MRIs from the primary site (<i>n</i> = 6,615 breasts). Generalizability was assessed by evaluating model performance on axial data from the primary site (<i>n</i> = 7,058 breasts) and a second clinical site (<i>n</i> = 1,840 breasts). Results The primary site dataset included 30,672 sagittal MRI examinations (52,598 breasts) from 9,986 female patients (mean [SD] age, 53 [11] years). The model achieved an area under the receiver operating characteristic curve (AUC) of 0.95 for detecting cancer in the primary site. At 90% specificity (5717/6353), model sensitivity was 83% (217/262), which was comparable to historical performance data for radiologists. The model generalized well to axial examinations, achieving an AUC of 0.92 on data from the same clinical site and 0.92 on data from a secondary site. The model accurately located the tumor in 88.5% (232/262) of sagittal images, 92.8% (272/293) of axial images from the primary site, and 87.7% (807/920) of secondary site axial images. Conclusion The model demonstrated state-of-the-art performance on breast cancer detection. Code and weights are openly available to stimulate further development and validation. ©RSNA, 2025.

Generalizable medical image enhancement using structure-preserved diffusion models.

Chen L, Yu X, Li H, Lin H, Niu K, Li H

PubMed | Jun 25, 2025
Clinical medical images often suffer from compromised quality, which negatively impacts the diagnostic process for both clinicians and AI algorithms. While GAN-based enhancement methods have been commonly developed in recent years, delicate model training is necessary due to issues with artifacts, mode collapse, and instability. Diffusion models have shown promise in generating high-quality images superior to GANs, but challenges in training data collection and domain gaps hinder their application to medical image enhancement. Additionally, preserving fine structures when enhancing medical images with diffusion models remains underexplored. To overcome these challenges, we propose structure-preserved diffusion models for generalizable medical image enhancement (GEDM). GEDM leverages joint supervision from enhancement and segmentation to boost structure preservation and generalizability. Specifically, synthetic data is used to collect paired high- and low-quality training data with structure masks, and the Laplace transform is employed to reduce domain gaps and introduce multi-scale conditions. GEDM conducts medical image enhancement and segmentation jointly, supervised by high-quality references and structure masks from the training data. Four datasets spanning two medical imaging modalities were collected for the experiments, in which GEDM outperformed state-of-the-art methods in image enhancement as well as in downstream medical analysis tasks.
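
A minimal sketch of the kind of joint enhancement-plus-segmentation supervision the abstract describes is shown below; it is not the authors' GEDM objective, and the L1/BCE terms and the 0.5 weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def joint_loss(enhanced, reference, seg_logits, structure_mask, seg_weight=0.5):
    """L1 reconstruction against the high-quality reference plus a
    segmentation term against the structure mask (illustrative only)."""
    recon = F.l1_loss(enhanced, reference)
    seg = F.binary_cross_entropy_with_logits(seg_logits, structure_mask)
    return recon + seg_weight * seg

# Toy tensors standing in for an enhanced image, its reference, seg logits, and mask
img = torch.rand(1, 1, 64, 64)
print(joint_loss(img, torch.rand_like(img), torch.randn_like(img),
                 torch.randint(0, 2, img.shape).float()))
```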

Computed tomography-derived quantitative imaging biomarkers enable the prediction of disease manifestations and survival in patients with systemic sclerosis.

Sieren MM, Grasshoff H, Riemekasten G, Berkel L, Nensa F, Hosch R, Barkhausen J, Kloeckner R, Wegner F

PubMed | Jun 25, 2025
Systemic sclerosis (SSc) is a complex inflammatory vasculopathy with diverse symptoms and variable disease progression. Despite its known impact on body composition (BC), clinical decision-making has yet to incorporate these biomarkers. This study aims to extract quantitative BC imaging biomarkers from CT scans to assess disease severity, define BC phenotypes, track changes over time and predict survival. CT exams were extracted from a prospectively maintained cohort of 452 SSc patients; 128 patients with at least one CT exam were included. An artificial intelligence-based 3D body composition analysis (BCA) algorithm assessed muscle volume, different adipose tissue compartments, and bone mineral density. These parameters were analysed with regard to clinical, laboratory, and functional parameters as well as survival. Phenotypes were identified by K-means cluster analysis. Longitudinal evaluation of BCA changes employed regression analyses. A regression model using BCA parameters outperformed models based on body mass index and clinical parameters in predicting survival (area under the curve [AUC] = 0.75). Longitudinal change in the cardiac marker enabled prediction of survival with an AUC of 0.82. Patients with altered BCA parameters had increased ORs for various complications, including interstitial lung disease (p<0.05). Two distinct BCA phenotypes were identified, showing significant differences in gastrointestinal disease manifestations (p<0.01). This study highlights several parameters with the potential to reshape clinical pathways for SSc patients. Quantitative BCA biomarkers offer a means to predict survival and individual disease manifestations, in part outperforming established parameters. These insights open new avenues for research into the mechanisms driving body composition changes in SSc and for developing enhanced disease management tools, ultimately leading to more personalised and effective patient care.
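
The phenotype step (K-means clustering of body composition parameters into two groups) can be sketched as follows; the feature names and toy values are placeholders, not the study's BCA variables.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Toy stand-ins for muscle volume, adipose compartments, and bone density
bca = pd.DataFrame(rng.normal(size=(128, 5)),
                   columns=["muscle", "visceral_fat", "subcut_fat",
                            "intermuscular_fat", "bone_density"])

# Standardize, then cluster patients into two BCA phenotypes
X = StandardScaler().fit_transform(bca)
bca["phenotype"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(bca["phenotype"].value_counts())
```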

Machine Learning-Based Risk Assessment of Myasthenia Gravis Onset in Thymoma Patients and Analysis of Their Correlations and Causal Relationships.

Liu W, Wang W, Zhang H, Guo M

PubMed | Jun 25, 2025
This study aims to use interpretable machine learning models to predict the risk of myasthenia gravis onset in thymoma patients and to investigate the intrinsic correlations and causal relationships between them. A comprehensive retrospective analysis was conducted on 172 thymoma patients diagnosed at two medical centers between 2018 and 2024. The cohort was bifurcated into a training set (n = 134) and a test set (n = 38) to develop and validate risk predictive models. Radiomic and deep features were extracted from tumor regions across three CT phases: non-enhanced, arterial, and venous. Through rigorous feature selection employing Spearman's rank correlation coefficient and LASSO (Least Absolute Shrinkage and Selection Operator) regularization, 12 optimal imaging features were identified. These were integrated with 11 clinical parameters and one pathological subtype variable to form a multi-dimensional feature matrix. Six machine learning algorithms were subsequently implemented for model construction and comparative analysis. We utilized SHAP (SHapley Additive exPlanation) to interpret the model and employed a doubly robust learner to perform a potential causal analysis between thymoma and myasthenia gravis (MG). All six models demonstrated satisfactory predictive capabilities, with the support vector machine (SVM) model exhibiting superior performance on the test cohort. It achieved an area under the curve (AUC) of 0.904 (95% confidence interval [CI] 0.798-1.000), outperforming other models such as logistic regression and the multilayer perceptron (MLP). The model's predictive result substantiates the strong correlation between thymoma and MG. Additionally, our analysis revealed a significant causal relationship between them, with high-risk tumors elevating the risk of MG by an average treatment effect (ATE) of 9.2%. This implies that thymoma patients with types B2 and B3 face a considerably higher risk of developing MG compared to those with types A, AB, and B1. The model provides a novel and effective tool for evaluating the risk of MG development in patients with thymoma. Furthermore, correlation and causal analysis have unveiled pathways that connect the tumor to the risk of MG, with a notably higher incidence of MG observed in high-risk pathological subtypes. These insights contribute to a deeper understanding of MG and drive a paradigm shift in medical practice from passive treatment to proactive intervention.
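
A hedged sketch of the feature-selection-plus-classifier pipeline outlined above is given below: an L1-penalised selector stands in for the Spearman/LASSO step (keeping the 12 highest-weighted features, echoing the 12 imaging features in the abstract), followed by an SVM. The data are synthetic and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(172, 24))       # toy stand-ins for radiomic/deep + clinical features
y = rng.integers(0, 2, size=172)     # 1 = developed myasthenia gravis (toy labels)

pipe = Pipeline([
    ("scale", StandardScaler()),
    # L1-penalised selector as a stand-in for the LASSO step: keep top 12 features
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
        max_features=12, threshold=-np.inf)),
    ("svm", SVC(kernel="rbf", probability=True)),
])
print("CV AUC:", cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean())
```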

Few-Shot Learning for Prostate Cancer Detection on MRI: Comparative Analysis with Radiologists' Performance.

Yamagishi Y, Baba Y, Suzuki J, Okada Y, Kanao K, Oyama M

PubMed | Jun 25, 2025
Deep-learning models for prostate cancer detection typically require large datasets, limiting clinical applicability across institutions due to domain shift issues. This study aimed to develop a few-shot learning deep-learning model for prostate cancer detection on multiparametric MRI that requires minimal training data and to compare its diagnostic performance with that of experienced radiologists. In this retrospective study, we used 99 biopsy-confirmed cases (80 positive, 19 negative) from 2017-2022, with 20 cases for training, 5 for validation, and 74 for testing. A 2D transformer model was trained on T2-weighted, diffusion-weighted, and apparent diffusion coefficient map images. Model predictions were compared with two radiologists using the Matthews correlation coefficient (MCC) and F1 score, with 95% confidence intervals (CIs) calculated via the bootstrap method. The model achieved an MCC of 0.297 (95% CI: 0.095-0.474) and F1 score of 0.707 (95% CI: 0.598-0.847). Radiologist 1 had an MCC of 0.276 (95% CI: 0.054-0.484) and F1 score of 0.741; Radiologist 2 had an MCC of 0.504 (95% CI: 0.289-0.703) and F1 score of 0.871, showing that model performance was comparable to that of Radiologist 1. External validation on the Prostate158 dataset revealed that ImageNet pretraining substantially improved model performance, increasing study-level ROC-AUC from 0.464 to 0.636 and study-level PR-AUC from 0.637 to 0.773 across all architectures. Our findings demonstrate that few-shot deep-learning models can achieve clinically relevant performance when using pretrained transformer architectures, offering a promising approach to address domain shift challenges across institutions.
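
The bootstrap confidence intervals mentioned above can be reproduced with a short case-resampling sketch; the labels and predictions below are toy values, not the study's.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef, f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=74)
y_pred = np.where(rng.random(74) < 0.8, y_true, 1 - y_true)  # ~80% agreement (toy)

def bootstrap_ci(metric, y_true, y_pred, n_boot=2000, alpha=0.05):
    """Point estimate plus percentile bootstrap CI from case resampling."""
    stats = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample cases with replacement
        stats.append(metric(y_true[idx], y_pred[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return metric(y_true, y_pred), lo, hi

print("MCC:", bootstrap_ci(matthews_corrcoef, y_true, y_pred))
print("F1 :", bootstrap_ci(f1_score, y_true, y_pred))
```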

The evaluation of artificial intelligence in mammography-based breast cancer screening: Is breast-level analysis enough?

Taib AG, Partridge GJW, Yao L, Darker I, Chen Y

PubMed | Jun 25, 2025
To assess whether the diagnostic performance of a commercial artificial intelligence (AI) algorithm for mammography differs between breast-level and lesion-level interpretations and to compare its performance with that of a large population of specialised human readers. We retrospectively analysed 1200 mammograms from the NHS breast cancer screening programme using a commercial AI algorithm and assessments from 1258 trained human readers from the Personal Performance in Mammographic Screening (PERFORMS) external quality assurance programme. For breasts containing pathologically confirmed malignancies, breast-level and lesion-level analyses were performed. The latter considered the locations of marked regions of interest for AI and humans, and the highest score per lesion was recorded. For non-malignant breasts, a breast-level analysis recorded the highest score per breast. Area under the curve (AUC), sensitivity and specificity were calculated at the developer's recommended threshold for recall. The study was designed to detect a medium-sized effect (odds ratio 3.5 or 0.29) for sensitivity. The test set contained 882 non-malignant (73%) and 318 malignant breasts (27%), with 328 cancer lesions. The AI AUC was 0.942 at breast level and 0.929 at lesion level (difference -0.013, p < 0.01). The mean human AUC was 0.878 at breast level and 0.851 at lesion level (difference -0.027, p < 0.01). AI outperformed human readers by AUC at both the breast and lesion level (both p < 0.01). AI's diagnostic performance significantly decreased at the lesion level, indicating reduced accuracy in localising malignancies; however, its overall performance exceeded that of human readers. Question: AI often recalls mammography cases not recalled by humans; to understand why, we as humans must consider the regions of interest it has marked as cancerous. Findings: Evaluations of AI typically occur at the breast level, but performance decreases when AI is evaluated at the lesion level. This also occurs for humans. Clinical relevance: To improve human-AI collaboration, AI should be assessed at the lesion level; poor accuracy here may lead to automation bias and unnecessary patient procedures.
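
The scoring convention described above (the highest lesion score becomes the breast-level score before computing AUC) can be sketched as follows, with illustrative data and column names rather than the study's.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
# One row per marked region of interest, each belonging to a breast
marks = pd.DataFrame({
    "breast_id": rng.integers(0, 300, size=1000),
    "score": rng.random(1000),
})
truth = pd.Series(rng.integers(0, 2, size=300), name="malignant")  # breast-level ground truth

breast_scores = marks.groupby("breast_id")["score"].max()        # highest score per breast
breast_scores = breast_scores.reindex(range(300), fill_value=0)  # unmarked breasts score 0
print("breast-level AUC:", roc_auc_score(truth, breast_scores))
```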

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Sung, J., Kitonsa, P. J., Nalutaaya, A., Isooba, D., Birabwa, S., Ndyabayunga, K., Okura, R., Magezi, J., Nantale, D., Mugabi, I., Nakiiza, V., Dowdy, D. W., Katamba, A., Kendall, E. A.

medRxiv preprint | Jun 24, 2025
Background: Computer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics.

Methods: We screened for TB in Ugandan individuals aged ≥15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores above a threshold of 0.1 (range, 0-1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches.

Findings: Of 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores ≥0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of ≥0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046).

Interpretation: The accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening.

Funding: National Institutes of Health.

Research in context. Evidence before this study: The World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and populations and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported areas under the curve (AUC) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained by adjusting CAD thresholds between individuals based on personal characteristics.

Added value of this study: In this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were ≥15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of ≥90% sensitivity and ≥70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity.

Implications of all the available evidence: To obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.
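
The stratified-threshold idea above can be illustrated with a short sketch (toy data; the column names and the 96.1% specificity target are assumptions drawn from the abstract, not the authors' code): within each age/sex stratum, choose the CAD score cutoff that keeps the desired share of non-TB participants below it.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "cad_score": rng.random(5000),                       # toy CAD abnormality scores
    "xpert_positive": rng.random(5000) < 0.05,           # toy confirmatory results
    "sex": rng.choice(["F", "M"], size=5000),
    "age_group": rng.choice(["15-34", "35-54", "55+"], size=5000),
})

def threshold_for_specificity(neg_scores, target=0.961):
    """Score cutoff such that ~`target` of non-TB participants fall below it."""
    return np.quantile(neg_scores, target)

# One threshold per age/sex stratum, matched to the same specificity target
thresholds = (df[~df["xpert_positive"]]
              .groupby(["sex", "age_group"])["cad_score"]
              .apply(threshold_for_specificity))
print(thresholds)
```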

General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

Jakob Ambsdorf, Asbjørn Munk, Sebastian Llambias, Anders Nymark Christensen, Kamil Mikolaj, Randall Balestriero, Martin Tolsgaard, Aasa Feragen, Mads Nielsen

arXiv preprint | Jun 24, 2025
With access to large-scale, unlabeled medical datasets, researchers are confronted with two questions: Should they attempt to pretrain a custom foundation model on this medical data, or use transfer-learning from an existing generalist model? And, if a custom model is pretrained, are novel methods required? In this paper we explore these questions by conducting a case-study, in which we train a foundation model on a large regional fetal ultrasound dataset of 2M images. By selecting the well-established DINOv2 method for pretraining, we achieve state-of-the-art results on three fetal ultrasound datasets, covering data from different countries, classification, segmentation, and few-shot tasks. We compare against a series of models pretrained on natural images, ultrasound images, and supervised baselines. Our results demonstrate two key insights: (i) Pretraining on custom data is worth it, even if smaller models are trained on less data, as scaling in natural image pretraining does not translate to ultrasound performance. (ii) Well-tuned methods from computer vision are making it feasible to train custom foundation models for a given medical domain, requiring no hyperparameter tuning and little methodological adaptation. Given these findings, we argue that a bias towards methodological innovation should be avoided when developing domain specific foundation models under common computational resource constraints.
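
As a rough sketch of reusing a DINOv2 backbone for a downstream task (here the public ViT-S/14 weights from torch.hub, not the authors' regional fetal-ultrasound model), one can freeze the encoder and train a small classification head on top of its embeddings.

```python
import torch
import torch.nn as nn

# Downloads the public DINOv2 ViT-S/14 weights on first use (network access required)
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
for p in backbone.parameters():      # freeze the pretrained encoder
    p.requires_grad = False

head = nn.Linear(384, 2)             # ViT-S/14 embedding dim -> two toy classes

x = torch.randn(4, 3, 224, 224)      # batch of (toy) ultrasound frames
with torch.no_grad():
    feats = backbone(x)              # (4, 384) CLS-token embeddings
logits = head(feats)
print(logits.shape)                  # torch.Size([4, 2])
```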

SAM2-SGP: Enhancing SAM2 for Medical Image Segmentation via Support-Set Guided Prompting

Yang Xing, Jiong Wu, Yuheng Bu, Kuang Gong

arXiv preprint | Jun 24, 2025
Although new vision foundation models such as Segment Anything Model 2 (SAM2) have significantly enhanced zero-shot image segmentation capabilities, reliance on human-provided prompts poses significant challenges in adapting SAM2 to medical image segmentation tasks. Moreover, SAM2's performance in medical image segmentation was limited by the domain shift issue, since it was originally trained on natural images and videos. To address these challenges, we proposed SAM2 with support-set guided prompting (SAM2-SGP), a framework that eliminated the need for manual prompts. The proposed model leveraged the memory mechanism of SAM2 to generate pseudo-masks using image-mask pairs from a support set via a Pseudo-mask Generation (PMG) module. We further introduced a novel Pseudo-mask Attention (PMA) module, which used these pseudo-masks to automatically generate bounding boxes and enhance localized feature extraction by guiding attention to relevant areas. Furthermore, a low-rank adaptation (LoRA) strategy was adopted to mitigate the domain shift issue. The proposed framework was evaluated on both 2D and 3D datasets across multiple medical imaging modalities, including fundus photography, X-ray, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and ultrasound. The results demonstrated a significant performance improvement over state-of-the-art models, such as nnUNet and SwinUNet, as well as foundation models, such as SAM2 and MedSAM2, underscoring the effectiveness of the proposed approach. Our code is publicly available at https://github.com/astlian9/SAM_Support.
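
The low-rank adaptation (LoRA) strategy referenced above can be illustrated generically: a frozen linear layer augmented with a trainable low-rank update. This is a textbook sketch, not the SAM2-SGP implementation; the rank and scaling values are assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # keep pretrained weights frozen
            p.requires_grad = False
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a zero (identity) update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

layer = LoRALinear(nn.Linear(256, 256))
print(layer(torch.randn(2, 256)).shape)    # torch.Size([2, 256])
```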

Non-invasive prediction of NSCLC immunotherapy efficacy and tumor microenvironment through unsupervised machine learning-driven CT Radiomic subtypes: a multi-cohort study.

Guo Y, Gong B, Li Y, Mo P, Chen Y, Fan Q, Sun Q, Miao L, Li Y, Liu Y, Tan W, Yang L, Zheng C

PubMed | Jun 24, 2025
Radiomics analyzes quantitative features from medical images to reveal tumor heterogeneity, offering new insights for diagnosis, prognosis, and treatment prediction. This study explored radiomics-based biomarkers to predict immunotherapy response and its association with the tumor microenvironment in non-small cell lung cancer (NSCLC) using unsupervised machine learning models derived from CT imaging. The study included 1539 NSCLC patients from seven independent cohorts. K-means unsupervised clustering was applied to 1834 radiomic features extracted from 869 NSCLC patients to identify radiomic subtypes. A random forest model extended subtype classification to external cohorts, and model accuracy, sensitivity, and specificity were evaluated. Bulk RNA sequencing (RNA-seq) and single-cell transcriptome sequencing (scRNA-seq) of tumors were used to characterize the immune microenvironment and to evaluate the association between radiomic subtypes and immunotherapy efficacy, immune scores, and immune cell infiltration. Unsupervised clustering stratified NSCLC patients into two subtypes (Cluster 1 and Cluster 2). Principal component analysis confirmed significant distinctions between subtypes across all cohorts. Cluster 2 exhibited significantly longer median overall survival (35 vs. 30 months, P = 0.006) and progression-free survival (19 vs. 16 months, P = 0.020) compared to Cluster 1. Multivariate Cox regression identified radiomic subtype as an independent predictor of overall survival (HR: 0.738, 95% CI 0.583-0.935, P = 0.012), validated in two external cohorts. Bulk RNA-seq showed elevated interaction signaling and immune scores in Cluster 2, and scRNA-seq demonstrated higher proportions of T cells, B cells, and NK cells in Cluster 2. This study establishes radiomic subtypes associated with NSCLC immunotherapy efficacy and the tumor immune microenvironment. The findings provide a non-invasive tool for personalized treatment, enabling early identification of immunotherapy-responsive patients and optimized therapeutic strategies.
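
The two-step design described above (K-means discovers radiomic subtypes in the discovery cohort, then a random forest reproduces the labels so they can be assigned in external cohorts) can be sketched as follows with toy features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X_discovery = rng.normal(size=(869, 1834))   # toy radiomic features, discovery cohort
X_external = rng.normal(size=(300, 1834))    # toy external cohort

scaler = StandardScaler().fit(X_discovery)
subtype = KMeans(n_clusters=2, n_init=10, random_state=0) \
    .fit_predict(scaler.transform(X_discovery))

# Random forest learns to reproduce the subtype labels for new patients
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(scaler.transform(X_discovery), subtype)
external_subtype = rf.predict(scaler.transform(X_external))
print(np.bincount(external_subtype))
```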