
Cross-channel feature transfer 3D U-Net for automatic segmentation of the perilymph and endolymph fluid spaces in hydrops MRI.

Yoo TW, Yeo CD, Lee EJ, Oh IS

PubMed · Sep 1, 2025
The identification of endolymphatic hydrops (EH) using magnetic resonance imaging (MRI) is crucial for understanding inner ear disorders such as Meniere's disease and sudden low-frequency hearing loss. The EH ratio is calculated as the ratio of the endolymphatic fluid space to the perilymphatic fluid space. We propose a novel cross-channel feature transfer (CCFT) 3D U-Net for fully automated segmentation of the perilymphatic and endolymphatic fluid spaces in hydrops MRI. The model exhibits state-of-the-art performance in segmenting the endolymphatic fluid space by transferring magnetic resonance cisternography (MRC) features to HYDROPS-Mi2 (HYbriD of Reversed image Of Positive endolymph signal and native image of positive perilymph Signal multiplied with the heavily T2-weighted MR cisternography). Experimental results using the CCFT module showed that the segmentation performance of the perilymphatic space was 0.9459 for the Dice similarity coefficient (DSC) and 0.8975 for the intersection over union (IOU), and that of the endolymphatic space was 0.8053 for the DSC and 0.6778 for the IOU.
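
As a quick reference for the metrics and ratio reported above, here is a minimal sketch (not the authors' code) of how the DSC, IoU, and EH ratio can be computed from binary 3D masks; the array names and shapes are illustrative assumptions.

```python
# Minimal sketch: DSC, IoU, and the EH ratio from binary 3D masks.
# Assumes non-empty masks; names are hypothetical, not from the paper.
import numpy as np

def dice_iou(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    """Dice similarity coefficient and intersection over union for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    dsc = 2.0 * inter / (pred.sum() + truth.sum())
    iou = inter / np.logical_or(pred, truth).sum()
    return float(dsc), float(iou)

def eh_ratio(endolymph: np.ndarray, perilymph: np.ndarray) -> float:
    """EH ratio as defined above: endolymphatic over perilymphatic fluid volume."""
    return float(endolymph.astype(bool).sum() / perilymph.astype(bool).sum())

# Toy usage on random volumes.
rng = np.random.default_rng(0)
endo, peri = rng.random((32, 32, 32)) > 0.7, rng.random((32, 32, 32)) > 0.4
print(dice_iou(endo, peri), eh_ratio(endo, peri))
```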

Prior-Guided Residual Diffusion: Calibrated and Efficient Medical Image Segmentation

Fuyou Mao, Beining Wu, Yanfeng Jiang, Han Xue, Yan Tang, Hao Zhang

arXiv preprint · Sep 1, 2025
Ambiguity in medical image segmentation calls for models that capture full conditional distributions rather than a single point estimate. We present Prior-Guided Residual Diffusion (PGRD), a diffusion-based framework that learns voxel-wise distributions while maintaining strong calibration and practical sampling efficiency. PGRD embeds discrete labels as one-hot targets in a continuous space to align segmentation with diffusion modeling. A coarse prior predictor provides step-wise guidance; the diffusion network then learns the residual to the prior, accelerating convergence and improving calibration. A deep diffusion supervision scheme further stabilizes training by supervising intermediate time steps. Evaluated on representative MRI and CT datasets, PGRD achieves higher Dice scores and lower NLL/ECE values than Bayesian, ensemble, Probabilistic U-Net, and vanilla diffusion baselines, while requiring fewer sampling steps to reach strong performance.
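
To make the residual idea concrete, the sketch below shows one plausible way to embed discrete labels as one-hot targets and form the residual to a coarse prior; the function, tensor shapes, and `prior_logits` input are assumptions for illustration, not the authors' implementation.

```python
# Sketch of a prior-guided residual target: the diffusion network would
# regress the difference between the one-hot label embedding and a coarse
# prior prediction. Shapes and names are hypothetical.
import torch
import torch.nn.functional as F

def residual_target(labels: torch.Tensor, prior_logits: torch.Tensor) -> torch.Tensor:
    """labels: (B, H, W) integer class labels; prior_logits: (B, C, H, W)."""
    num_classes = prior_logits.shape[1]
    # Embed discrete labels as one-hot targets in continuous space.
    x0 = F.one_hot(labels, num_classes).permute(0, 3, 1, 2).float()
    prior = torch.softmax(prior_logits, dim=1)  # coarse prior prediction
    return x0 - prior  # residual to the prior

# Toy usage.
labels = torch.randint(0, 3, (1, 8, 8))
logits = torch.randn(1, 3, 8, 8)
print(residual_target(labels, logits).shape)  # torch.Size([1, 3, 8, 8])
```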

Prediction of lymphovascular invasion in invasive breast cancer via intratumoral and peritumoral multiparametric magnetic resonance imaging machine learning-based radiomics with Shapley additive explanations interpretability analysis.

Chen S, Zhong Z, Chen Y, Tang W, Fan Y, Sui Y, Hu W, Pan L, Liu S, Kong Q, Guo Y, Liu W

PubMed · Sep 1, 2025
The use of multiparametric magnetic resonance imaging (MRI) to predict lymphovascular invasion (LVI) in breast cancer is well documented in the literature. However, most related studies have focused primarily on intratumoral characteristics, overlooking the potential contribution of peritumoral features. The aim of this study was to evaluate the effectiveness of multiparametric MRI in predicting LVI by analyzing both intratumoral and peritumoral radiomics features and to assess the added value of incorporating both regions in LVI prediction. A total of 366 patients underwent preoperative breast MRI at two centers and were divided into training (n=208), validation (n=70), and test (n=88) sets. Imaging features were extracted from intratumoral and peritumoral regions on T2-weighted imaging, diffusion-weighted imaging, and dynamic contrast-enhanced MRI. Five logistic regression models were developed for predicting LVI status: the tumor area (TA) model, peritumoral area (PA) model, tumor-plus-peritumoral area (TPA) model, clinical model, and combined model. The combined model incorporated the best-performing radiomics score and clinical factors. Predictive efficacy was evaluated via the receiver operating characteristic (ROC) curve and area under the curve (AUC). The Shapley additive explanations (SHAP) method was used to rank the features and explain the final model. The performance of the TPA model was superior to that of the TA and PA models. The combined model was therefore developed via multivariable logistic regression, incorporating the TPA radiomics score (radscore), MRI-assessed axillary lymph node (ALN) status, and peritumoral edema (PE). The combined model demonstrated good calibration and discrimination across the training, validation, and test datasets, with AUCs of 0.888 [95% confidence interval (CI): 0.841-0.934], 0.856 (95% CI: 0.769-0.943), and 0.853 (95% CI: 0.760-0.946), respectively. Furthermore, SHAP analysis was conducted to evaluate the contributions of the TPA radscore, MRI-ALN status, and PE to LVI status prediction. The combined model, incorporating clinical factors and the intratumoral-plus-peritumoral radscore, effectively predicts LVI and may aid in tailored treatment planning.
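
As a schematic of how such a combined model is typically built (a sketch with synthetic data, not the study's code or cohort), logistic regression can fuse a radiomics score with the two clinical factors:

```python
# Sketch: logistic-regression "combined model" over a radiomics score plus
# MRI-assessed ALN status and peritumoral edema. All data below are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 300
radscore = rng.normal(size=n)          # hypothetical TPA radiomics score
aln = rng.integers(0, 2, size=n)       # MRI-assessed ALN status (0/1)
pe = rng.integers(0, 2, size=n)        # peritumoral edema (0/1)
# Synthetic LVI labels loosely driven by the three predictors.
logit = 1.5 * radscore + 0.8 * aln + 0.6 * pe - 0.5
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([radscore, aln, pe])
model = LogisticRegression().fit(X[:200], y[:200])
print("hold-out AUC:", roc_auc_score(y[200:], model.predict_proba(X[200:])[:, 1]))
```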

Ultrasound-based detection and malignancy prediction of breast lesions eligible for biopsy: A multi-center clinical-scenario study using nomograms, large language models, and radiologist evaluation

Ali Abbasian Ardakani, Afshin Mohammadi, Taha Yusuf Kuzan, Beyza Nur Kuzan, Hamid Khorshidi, Ashkan Ghorbani, Alisa Mohebbi, Fariborz Faeghi, Sepideh Hatamikia, U Rajendra Acharya

arXiv preprint · Aug 31, 2025
To develop and externally validate integrated ultrasound nomograms combining BI-RADS features and quantitative morphometric characteristics, and to compare their performance with expert radiologists and state-of-the-art large language models in biopsy recommendation and malignancy prediction for breast lesions. In this retrospective multicenter, multinational study, 1747 women with pathologically confirmed breast lesions underwent ultrasound across three centers in Iran and Turkey. A total of 10 BI-RADS and 26 morphological features were extracted from each lesion. A BI-RADS nomogram, a morphometric nomogram, and a fused nomogram integrating both feature sets were constructed via logistic regression. Three radiologists (one senior, two general) and two ChatGPT variants independently interpreted deidentified breast lesion images. Diagnostic performance for biopsy recommendation (BI-RADS 4 or 5) and malignancy prediction was assessed in internal and two external validation cohorts. In pooled analysis, the fused nomogram achieved the highest accuracy for biopsy recommendation (83.0%) and malignancy prediction (83.8%), outperforming the morphometric nomogram, the three radiologists, and both ChatGPT models. Its AUCs were 0.901 and 0.853 for the two tasks, respectively. In addition, the BI-RADS nomogram performed significantly better than the morphometric nomogram, the three radiologists, and both ChatGPT models for both biopsy recommendation and malignancy prediction. External validation confirmed robust generalizability across different ultrasound platforms and populations. An integrated BI-RADS and morphometric nomogram consistently outperforms standalone models, LLMs, and radiologists in guiding biopsy decisions and predicting malignancy. These interpretable, externally validated tools have the potential to reduce unnecessary biopsies and enhance personalized decision-making in breast imaging.

Can General-Purpose Omnimodels Compete with Specialists? A Case Study in Medical Image Segmentation

Yizhe Zhang, Qiang Chen, Tao Zhou

arXiv preprint · Aug 31, 2025
The emergence of powerful, general-purpose omnimodels capable of processing diverse data modalities has raised a critical question: can these "jack-of-all-trades" systems perform on par with highly specialized models in knowledge-intensive domains? This work investigates this question within the high-stakes field of medical image segmentation. We conduct a comparative study analyzing the zero-shot performance of a state-of-the-art omnimodel (Gemini 2.5 Pro, the "Nano Banana" model) against domain-specific deep learning models on three distinct tasks: polyp (endoscopy), retinal vessel (fundus), and breast tumor (ultrasound) segmentation. Our study focuses on performance at the extremes by curating subsets of the "easiest" and "hardest" cases based on the specialist models' accuracy. Our findings reveal a nuanced and task-dependent landscape. For polyp and breast tumor segmentation, specialist models excel on easy samples, but the omnimodel demonstrates greater robustness on hard samples where specialists fail catastrophically. Conversely, for the fine-grained task of retinal vessel segmentation, the specialist model maintains superior performance across both easy and hard cases. Intriguingly, qualitative analysis suggests omnimodels may possess higher sensitivity, identifying subtle anatomical features missed by human annotators. Our results indicate that while current omnimodels are not yet a universal replacement for specialists, their unique strengths suggest a potential complementary role, particularly in enhancing robustness on challenging edge cases.
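
The case-curation step can be sketched as follows (an illustrative assumption about the procedure, with hypothetical names and subset size): rank cases by the specialist model's per-case Dice score and keep the two extremes.

```python
# Sketch: select "hardest" and "easiest" cases from per-case specialist Dice.
import numpy as np

def extreme_subsets(dice_scores: np.ndarray, k: int):
    """Return indices of the k hardest (lowest-Dice) and k easiest cases."""
    order = np.argsort(dice_scores)
    return order[:k], order[-k:]

scores = np.random.default_rng(1).random(100)  # hypothetical per-case Dice
hardest, easiest = extreme_subsets(scores, k=10)
print(scores[hardest].mean(), scores[easiest].mean())
```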

Impact of pre-test probability on AI-LVO detection: a systematic review of LVO prevalence across clinical contexts.

Olivé-Gadea M, Mayol J, Requena M, Rodrigo-Gisbert M, Rizzo F, Garcia-Tornel A, Simonetti R, Diana F, Muchada M, Pagola J, Rodriguez-Luna D, Rodriguez-Villatoro N, Rubiera M, Molina CA, Tomasello A, Hernandez D, de Dios Lascuevas M, Ribo M

PubMed · Aug 31, 2025
Rapid identification of large vessel occlusion (LVO) in acute ischemic stroke (AIS) is essential for reperfusion therapy. Screening tools, including Artificial Intelligence (AI)-based algorithms, have been developed to accelerate detection, but their performance depends heavily on pre-test LVO prevalence. This study aimed to review LVO prevalence across clinical contexts and analyze its impact on AI-algorithm performance. We systematically reviewed studies reporting consecutive suspected AIS cohorts. Cohorts were grouped into four clinical scenarios based on patient selection criteria: (a) high suspicion of LVO by stroke specialists (direct-to-angiosuite candidates); (b) high suspicion of LVO according to pre-hospital scales; (c) and (d) any suspected AIS without a severity cut-off, in a hospital or pre-hospital setting, respectively. We analyzed LVO prevalence in each scenario and assessed the false discovery rate (FDR), the proportion of positive results that are false positives, when applying eight commercially available LVO-detection algorithms. We included 87 cohorts from 80 studies. Median LVO prevalence was (a) 84% (77-87%), (b) 35% (26-42%), (c) 19% (14-25%), and (d) 14% (8-22%). In the high-prevalence scenario (a), FDR ranged between 0.007 (1 false positive in 142 positives) and 0.023 (1 in 43), whereas in the low-prevalence scenarios (c and d), FDR ranged between 0.168 (1 in 6) and 0.543 (over 1 in 2). To ensure meaningful clinical impact, AI algorithms must be evaluated within the specific populations and care pathways where they are applied.
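
To make the prevalence dependence concrete, the FDR follows directly from prevalence, sensitivity, and specificity; in the sketch below, the sensitivity and specificity of 0.90 are illustrative placeholders, not values from any of the eight algorithms.

```python
# FDR = FP / (FP + TP) as a function of pre-test prevalence,
# for a fixed (illustrative) algorithm sensitivity and specificity.
def fdr(prevalence: float, sensitivity: float, specificity: float) -> float:
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return fp / (fp + tp)

for prev in (0.84, 0.35, 0.19, 0.14):  # median prevalences, scenarios (a)-(d)
    print(f"prevalence {prev:.0%}: FDR = {fdr(prev, 0.90, 0.90):.3f}")
# At 84% prevalence the FDR stays in the low-percent range; at 14% it
# approaches the "1 false positive in every 2-3 positives" regime noted above.
```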

CXR-MultiTaskNet: a unified deep learning framework for joint disease localization and classification in chest radiographs.

Reddy KD, Patil A

PubMed · Aug 31, 2025
Chest X-ray (CXR) analysis is a challenging problem in automated medical diagnosis: complex visual patterns of thoracic diseases must be precisely identified through multi-label classification and lesion localization. Current approaches typically treat classification and localization in isolation, resulting in piecemeal systems that do not exploit shared representations, offer limited clinical interpretability, and handle multi-label disease poorly. Although multi-task learning frameworks such as DeepChest and CLN appear to meet this goal, they suffer from task interference and poor explainability, which limits their practical application in real-world clinical workflows. To address these limitations, we present a unified multi-task deep learning framework, CXR-MultiTaskNet, for simultaneously classifying thoracic diseases and localizing lesions in chest X-rays. Our framework comprises a standard ResNet50 feature extractor, two task-specific heads for multi-task learning, and a Grad-CAM-based explainability module that provides accurate predictions and enhances clinical explainability. We formulate a joint loss that balances the two objectives, giving greater weight to the detection term to counter extreme class imbalance and the varying detectability of different disease manifestation types. Deep learning methods have made promising advances in disease identification on chest X-rays, but their performance for complete analysis is limited by a lack of interpretability, inherent weaknesses of convolutional neural networks (CNNs), and the practice of learning image-level classification before disease localization. We therefore adopt a dual-attention-based hierarchical feature extraction approach: visual attention maps make the detection steps traceable, so the entire process is more interpretable than with a traditional CNN-embedding model. The framework produces both disease-level and pixel-level predictions, enabling explainable, comprehensive analysis of each image and aiding localization of each detected abnormality. The training objective was further tuned for X-ray images so that smaller lesions receive greater weight. Experimental evaluations on a benchmark chest X-ray dataset demonstrate the potential of the proposed approach, achieving a macro F1-score of 0.965 (micro F1-score of 0.968) for disease classification and a mean IoU of 0.851 for disease localization, consistently outperforming state-of-the-art single-task and multi-task baselines. The presented framework provides an integrated approach to chest X-ray analysis that is clinically useful, interpretable, and scalable for automation, allowing for efficient diagnostic pathways and enhanced clinical decision-making, and can serve as a basis for next-generation explainable AI in radiology.
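
The joint objective described above might look schematically like the weighted sum below; the loss choices and weights are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch: weighted joint loss for a shared-backbone model with a multi-label
# classification head and a localization head. Weights are illustrative.
import torch
import torch.nn as nn

cls_loss_fn = nn.BCEWithLogitsLoss()  # multi-label disease classification
loc_loss_fn = nn.SmoothL1Loss()       # stand-in for a localization loss

def joint_loss(cls_logits, cls_targets, loc_preds, loc_targets,
               w_cls: float = 1.0, w_loc: float = 2.0) -> torch.Tensor:
    # Upweighting the localization term emphasizes harder, smaller lesions.
    return (w_cls * cls_loss_fn(cls_logits, cls_targets)
            + w_loc * loc_loss_fn(loc_preds, loc_targets))

# Toy usage: 4 images, 14 disease labels, 4 box coordinates.
loss = joint_loss(torch.randn(4, 14), torch.randint(0, 2, (4, 14)).float(),
                  torch.randn(4, 4), torch.randn(4, 4))
print(loss.item())
```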

A Modality-agnostic Multi-task Foundation Model for Human Brain Imaging

Peirong Liu, Oula Puonti, Xiaoling Hu, Karthik Gopinath, Annabel Sorby-Adams, Daniel C. Alexander, W. Taylor Kimberly, Juan E. Iglesias

arXiv preprint · Aug 30, 2025
Recent learning-based approaches have made astonishing advances in calibrated medical imaging like computerized tomography (CT), yet they struggle to generalize in uncalibrated modalities -- notably magnetic resonance (MR) imaging, where performance is highly sensitive to the differences in MR contrast, resolution, and orientation. This prevents broad applicability to diverse real-world clinical protocols. Here we introduce BrainFM, a modality-agnostic, multi-task vision foundation model for human brain imaging. With the proposed "mild-to-severe" intra-subject generation and "real-synth" mix-up training strategy, BrainFM is resilient to the appearance of acquired images (e.g., modality, contrast, deformation, resolution, artifacts), and can be directly applied to five fundamental brain imaging tasks, including image synthesis for CT and T1w/T2w/FLAIR MRI, anatomy segmentation, scalp-to-cortical distance, bias field estimation, and registration. We evaluate the efficacy of BrainFM on eleven public datasets, and demonstrate its robustness and effectiveness across all tasks and input modalities. Code is available at https://github.com/jhuldr/BrainFM.

A Multimodal and Multi-centric Head and Neck Cancer Dataset for Tumor Segmentation and Outcome Prediction

Numan Saeed, Salma Hassan, Shahad Hardan, Ahmed Aly, Darya Taratynova, Umair Nawaz, Ufaq Khan, Muhammad Ridzuan, Vincent Andrearczyk, Adrien Depeursinge, Mathieu Hatt, Thomas Eugene, Raphaël Metz, Mélanie Dore, Gregory Delpon, Vijay Ram Kumar Papineni, Kareem Wahid, Cem Dede, Alaa Mohamed Shawky Ali, Carlos Sjogreen, Mohamed Naser, Clifton D. Fuller, Valentin Oreiller, Mario Jreige, John O. Prior, Catherine Cheze Le Rest, Olena Tankyevych, Pierre Decazes, Su Ruan, Stephanie Tanadini-Lang, Martin Vallières, Hesham Elhalawani, Ronan Abgral, Romain Floch, Kevin Kerleguer, Ulrike Schick, Maelle Mauguen, Arman Rahmim, Mohammad Yaqub

arXiv preprint · Aug 30, 2025
We describe a publicly available multimodal dataset of annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies for head and neck cancer research. The dataset includes 1123 FDG-PET/CT studies from patients with histologically confirmed head and neck cancer, acquired from 10 international medical centers. All examinations consisted of co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity across institutions. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following standardized guidelines and quality control measures. We provide anonymized NIfTI files of all studies, along with expert-annotated segmentation masks, radiotherapy dose distributions for a subset of patients, and comprehensive clinical metadata. This metadata includes TNM staging, HPV status, demographics (age and gender), long-term follow-up outcomes, survival times, censoring indicators, and treatment information. We demonstrate how this dataset can be used for three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, providing benchmark results using state-of-the-art deep learning models, including UNet, SegResNet, and multimodal prognostic frameworks.
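
For readers planning to work with the release, a minimal nibabel sketch shows how a co-registered PET/CT study and its segmentation mask could be loaded; the file names are hypothetical placeholders, not the dataset's actual naming scheme.

```python
# Sketch: load a co-registered PET/CT study and its GTV mask with nibabel.
# File names are hypothetical placeholders.
import nibabel as nib
import numpy as np

ct = nib.load("patient_001_ct.nii.gz").get_fdata()
pet = nib.load("patient_001_pet.nii.gz").get_fdata()
gtv = nib.load("patient_001_gtv.nii.gz").get_fdata().astype(bool)

print(ct.shape, pet.shape)  # should match after co-registration
print("mean PET intensity inside GTV:", float(np.mean(pet[gtv])))
```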

Diagnostic Performance of CT-Based Artificial Intelligence for Early Recurrence of Cholangiocarcinoma: A Systematic Review and Meta-Analysis.

Chen J, Xi J, Chen T, Yang L, Liu K, Ding X

PubMed · Aug 30, 2025
Despite AI models demonstrating high predictive accuracy for early cholangiocarcinoma (CCA) recurrence, their clinical application faces challenges such as reproducibility, generalizability, hidden biases, and uncertain performance across diverse datasets and populations, raising concerns about their practical applicability. This meta-analysis aims to systematically assess the diagnostic performance of artificial intelligence (AI) models utilizing computed tomography (CT) imaging to predict early recurrence of CCA. A systematic search was conducted in PubMed, Embase, and Web of Science for studies published up to May 2025. Studies were selected based on the PIRTOS framework. Participants (P): patients diagnosed with CCA (including intrahepatic and extrahepatic locations). Index test (I): AI techniques applied to CT imaging for early recurrence prediction (defined as within 1 year). Reference standard (R): pathological diagnosis or imaging follow-up confirming recurrence. Target condition (T): early recurrence of CCA (positive group: recurrence; negative group: no recurrence). Outcomes (O): sensitivity, specificity, diagnostic odds ratio (DOR), and area under the receiver operating characteristic curve (AUC), assessed in both internal and external validation cohorts. Setting (S): retrospective or prospective studies using hospital datasets. Methodological quality was assessed using an optimized version of the revised QUADAS-2 tool. Heterogeneity was assessed using the I² statistic. Pooled sensitivity, specificity, DOR, and AUC were calculated using a bivariate random-effects model. Nine studies with 30 datasets involving 1,537 patients were included. In internal validation cohorts, CT-based AI models showed a pooled sensitivity of 0.87 (95% CI: 0.81-0.92), specificity of 0.85 (95% CI: 0.79-0.89), DOR of 37.71 (95% CI: 18.35-77.51), and AUC of 0.93 (95% CI: 0.90-0.94). In external validation cohorts, pooled sensitivity was 0.87 (95% CI: 0.81-0.91), specificity was 0.82 (95% CI: 0.77-0.86), DOR was 30.81 (95% CI: 18.79-50.52), and AUC was 0.85 (95% CI: 0.82-0.88). The AUC was significantly lower in external validation cohorts than in internal validation cohorts (P < .001). Our results show that CT-based AI models predict early CCA recurrence with high performance in internal validation sets and moderate performance in external validation sets. However, the high heterogeneity observed may impact the robustness of these results. Future research should focus on prospective studies and establishing standardized reference standards to further validate the clinical applicability and generalizability of AI models.
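
As a sanity check on the pooled numbers, the DOR can be recomputed from pooled sensitivity and specificity; the sketch below reproduces the reported figures to within rounding (the bivariate model's pooled estimates differ slightly).

```python
# DOR = (sens / (1 - sens)) / ((1 - spec) / spec)
def diagnostic_odds_ratio(sens: float, spec: float) -> float:
    return (sens / (1.0 - sens)) / ((1.0 - spec) / spec)

print(diagnostic_odds_ratio(0.87, 0.85))  # ~37.9 vs. the pooled 37.71 (internal)
print(diagnostic_odds_ratio(0.87, 0.82))  # ~30.5 vs. the pooled 30.81 (external)
```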