
HiPerformer: A High-Performance Global-Local Segmentation Model with Modular Hierarchical Fusion Strategy

Dayu Tan, Zhenpeng Xu, Yansen Su, Xin Peng, Chunhou Zheng, Weimin Zhong

arXiv preprint · Sep 24, 2025
Both local details and global context are crucial in medical image segmentation, and effectively integrating them is essential for achieving high accuracy. However, existing mainstream methods based on CNN-Transformer hybrid architectures typically employ simple feature fusion techniques such as serial stacking, endpoint concatenation, or pointwise addition, which struggle to address the inconsistencies between features and are prone to information conflict and loss. To address these challenges, we propose HiPerformer. The encoder of HiPerformer employs a novel modular hierarchical architecture that dynamically fuses multi-source features in parallel, enabling layer-wise deep integration of heterogeneous information. The modular hierarchical design not only retains the independent modeling capability of each branch in the encoder, but also ensures sufficient information transfer between layers, effectively avoiding the feature degradation and information loss that come with traditional stacking methods. Furthermore, we design a Local-Global Feature Fusion (LGFF) module to achieve precise and efficient integration of local details and global semantic information, effectively alleviating the feature inconsistency problem and yielding a more comprehensive feature representation. To further enhance multi-scale feature representation and suppress noise interference, we also propose a Progressive Pyramid Aggregation (PPA) module to replace traditional skip connections. Experiments on eleven public datasets show that the proposed method outperforms existing segmentation techniques, demonstrating higher segmentation accuracy and robustness. The code is available at https://github.com/xzphappy/HiPerformer.
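
The abstract does not give the internals of the LGFF module, so the following is only a minimal sketch of one plausible gated fusion of a local (CNN) branch and a global (Transformer) branch, assuming both have already been projected to matching feature-map shapes; the gating scheme is an assumption, not the paper's design.

```python
import torch
import torch.nn as nn

class LocalGlobalFusion(nn.Module):
    """Hypothetical gated fusion of local and global feature maps of
    identical shape (B, C, H, W); illustrative only."""
    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-pixel, per-channel blending weight from both branches.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([local_feat, global_feat], dim=1))
        # Convex per-pixel blend: g favors local detail, 1 - g global context.
        return g * local_feat + (1 - g) * global_feat
```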

Photon-counting detector computed tomography in thoracic oncology: revolutionizing tumor imaging through precision and detail.

Yanagawa M, Ueno M, Ito R, Ueda D, Saida T, Kurokawa R, Takumi K, Nishioka K, Sugawara S, Ide S, Honda M, Iima M, Kawamura M, Sakata A, Sofue K, Oda S, Watabe T, Hirata K, Naganawa S

PubMed · Sep 24, 2025
Photon-counting detector computed tomography (PCD-CT) is an emerging imaging technology that promises to overcome the limitations of conventional energy-integrating detector (EID)-CT, particularly in thoracic oncology. This narrative review summarizes technical advances and clinical applications of PCD-CT in the thorax with emphasis on spatial resolution, dose-image-quality balance, and intrinsic spectral imaging, and it outlines practical implications relevant to thoracic oncology. A literature review of PubMed through May 31, 2025, was conducted using combinations of "photon counting," "computed tomography," "thoracic oncology," and "artificial intelligence." We screened the retrieved records and included studies with direct relevance to lung and mediastinal tumors, image quality, radiation dose, spectral/iodine imaging, or artificial intelligence-based reconstruction; case reports, editorials, and animal-only or purely methodological reports were excluded. PCD-CT demonstrated superior spatial resolution compared with EID-CT, enabling clearer visualization of fine pulmonary structures, such as bronchioles and subsolid nodules; slice thicknesses of approximately 0.4 mm and ex vivo resolvable structures approaching 0.11 mm have been reported. Across intraindividual clinical comparisons, radiation-dose reductions of 16%-43% have been achieved while maintaining or improving diagnostic image quality. Intrinsic spectral imaging enables accurate iodine mapping and low-keV virtual monoenergetic images and has shown quantitative advantages versus dual-energy CT in phantoms and early clinical work. Artificial intelligence-based deep-learning reconstruction and super-resolution can complement detector capabilities to reduce noise and stabilize fine-structure depiction without increasing dose. Potential reductions in contrast volume are biologically plausible given improved low-keV contrast-to-noise ratio, although clinical dose-finding data remain limited, and routine K-edge imaging has not yet translated to clinical thoracic practice. In conclusion, PCD-CT provides higher spatial and spectral fidelity at lower or comparable doses, supporting earlier and more precise tumor detection and characterization; future work should prioritize outcome-oriented trials, protocol harmonization, and implementation studies aligned with "Green Radiology".

A Versatile Foundation Model for AI-enabled Mammogram Interpretation

Fuxiang Huang, Jiayi Zhu, Yunfang Yu, Yu Xie, Yuan Guo, Qingcong Kong, Mingxiang Wu, Xinrui Jiang, Shu Yang, Jiabo Ma, Ziyi Liu, Zhe Xu, Zhixuan Chen, Yujie Tan, Zifan He, Luhui Mao, Xi Wang, Junlin Hou, Lei Zhang, Qiong Luo, Zhenhui Li, Herui Yao, Hao Chen

arXiv preprint · Sep 24, 2025
Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related mortality in women globally. Mammography is essential for the early detection and diagnosis of breast lesions. Despite recent progress in foundation models (FMs) for mammogram analysis, their clinical translation remains constrained by several fundamental limitations, including insufficient diversity in training data, limited model generalizability, and a lack of comprehensive evaluation across clinically relevant tasks. Here, we introduce VersaMammo, a versatile foundation model for mammograms, designed to overcome these limitations. We curated the largest multi-institutional mammogram dataset to date, comprising 706,239 images from 21 sources. To improve generalization, we propose a two-stage pre-training strategy. First, a teacher model is trained via self-supervised learning to extract transferable features from unlabeled mammograms. Then, supervised learning combined with knowledge distillation transfers both features and clinical knowledge into VersaMammo. To ensure a comprehensive evaluation, we established a benchmark comprising 92 specific tasks, including 68 internal tasks and 24 external validation tasks, spanning 5 major clinical task categories: lesion detection, segmentation, classification, image retrieval, and visual question answering. VersaMammo achieves state-of-the-art performance, ranking first in 50 out of 68 specific internal tasks and 20 out of 24 external validation tasks, with average ranks of 1.5 and 1.2, respectively. These results demonstrate its superior generalization and clinical utility, offering a substantial advancement toward reliable and scalable breast cancer screening and diagnosis.
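
As a rough illustration of the second stage, knowledge distillation is often implemented as supervised cross-entropy plus a feature-matching term against the frozen teacher; the loss below is a sketch under that assumption (the paper's exact objective and weighting are not stated in the abstract).

```python
import torch
import torch.nn.functional as F

def stage2_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor,
                logits: torch.Tensor, labels: torch.Tensor,
                alpha: float = 0.5) -> torch.Tensor:
    """Sketch of a distillation objective: cross-entropy on labels plus
    MSE between student and (frozen) teacher features. alpha is a
    hypothetical weighting, not a value from the paper."""
    ce = F.cross_entropy(logits, labels)
    kd = F.mse_loss(student_feat, teacher_feat.detach())
    return ce + alpha * kd
```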

SHMoAReg: Spark Deformable Image Registration via Spatial Heterogeneous Mixture of Experts and Attention Heads

Yuxi Zheng, Jianhui Feng, Tianran Li, Marius Staring, Yuchuan Qiao

arXiv preprint · Sep 24, 2025
Encoder-Decoder architectures are widely used in deep learning-based Deformable Image Registration (DIR), where the encoder extracts multi-scale features and the decoder predicts deformation fields by recovering spatial locations. However, current methods lack specialized extraction of features useful for registration and predict deformation jointly and homogeneously in all three directions. In this paper, we propose a novel expert-guided DIR network with a Mixture of Experts (MoE) mechanism applied in both the encoder and decoder, named SHMoAReg. Specifically, we incorporate a Mixture of Attention heads (MoA) into the encoder layers and a Spatial Heterogeneous Mixture of Experts (SHMoE) into the decoder layers. The MoA enhances the specialization of feature extraction by dynamically selecting the optimal combination of attention heads for each image token. Meanwhile, the SHMoE predicts deformation fields heterogeneously in the three directions for each voxel using experts with varying kernel sizes. Extensive experiments on two publicly available datasets show consistent improvements over various methods, with a notable increase from 60.58% to 65.58% in Dice score on the abdominal CT dataset. Furthermore, SHMoAReg enhances model interpretability by differentiating experts' utilities across and within different resolution layers. To the best of our knowledge, we are the first to introduce the MoE mechanism into DIR tasks. The code will be released soon.
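
To make the decoder idea concrete, here is a minimal sketch of one axis of such a head: conv experts with different kernel sizes each propose a displacement, and a per-voxel gate mixes them. The kernel sizes and softmax gating are assumptions, since the abstract does not give the configuration; three such heads, one per direction, would assemble the full deformation field.

```python
import torch
import torch.nn as nn

class SHMoEHead(nn.Module):
    """Sketch of a spatially heterogeneous MoE head for ONE displacement
    axis: conv experts with varying receptive fields, mixed per voxel."""
    def __init__(self, channels: int, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Conv3d(channels, 1, k, padding=k // 2) for k in kernel_sizes
        )
        self.gate = nn.Conv3d(channels, len(kernel_sizes), kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(feat), dim=1)                # (B, E, D, H, W)
        proposals = torch.cat([e(feat) for e in self.experts], dim=1)  # (B, E, D, H, W)
        return (weights * proposals).sum(dim=1, keepdim=True)          # (B, 1, D, H, W)
```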

Revisiting Performance Claims for Chest X-Ray Models Using Clinical Context

Andrew Wang, Jiashuo Zhang, Michael Oberst

arXiv preprint · Sep 24, 2025
Public healthcare datasets of Chest X-Rays (CXRs) have long been a popular benchmark for developing computer vision models in healthcare. However, strong average-case performance of machine learning (ML) models on these datasets is insufficient to certify their clinical utility. In this paper, we use clinical context, as captured by prior discharge summaries, to provide a more holistic evaluation of current "state-of-the-art" models for the task of CXR diagnosis. Using discharge summaries recorded prior to each CXR, we derive a "prior" or "pre-test" probability of each CXR label, as a proxy for existing contextual knowledge available to clinicians when interpreting CXRs. Using this measure, we demonstrate two key findings: First, for several diagnostic labels, CXR models tend to perform best on cases where the pre-test probability is very low, and substantially worse on cases where the pre-test probability is higher. Second, we use pre-test probability to assess whether strong average-case performance reflects true diagnostic signal, rather than an ability to infer the pre-test probability as a shortcut. We find that performance drops sharply on a balanced test set where this shortcut does not exist, which may indicate that much of the apparent diagnostic power derives from inferring this clinical context. We argue that this style of analysis, using context derived from clinical notes, is a promising direction for more rigorous and fine-grained evaluation of clinical vision models.
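
The stratified analysis the authors describe can be pictured as computing ROC-AUC within pre-test-probability bins; the helper below is an illustrative reconstruction (the bin edges are arbitrary, and the paper's exact procedure may differ).

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_by_pretest_bin(y_true: np.ndarray, y_score: np.ndarray,
                       pretest: np.ndarray, edges=(0.0, 0.1, 0.5, 1.01)):
    """ROC-AUC per pre-test probability bin; illustrative bin edges."""
    results = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (pretest >= lo) & (pretest < hi)
        # AUC is only defined when both classes appear in the bin.
        if mask.any() and len(np.unique(y_true[mask])) == 2:
            results[(lo, hi)] = roc_auc_score(y_true[mask], y_score[mask])
    return results
```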

Region-of-Interest Augmentation for Mammography Classification under Patient-Level Cross-Validation

Farbod Bigdeli, Mohsen Mohammadagha, Ali Bigdeli

arXiv preprint · Sep 24, 2025
Breast cancer screening with mammography remains central to early detection and mortality reduction. Deep learning has shown strong potential for automating mammogram interpretation, yet limited-resolution datasets and small sample sizes continue to restrict performance. We revisit the Mini-DDSM dataset (9,684 images; 2,414 patients) and introduce a lightweight region-of-interest (ROI) augmentation strategy. During training, full images are probabilistically replaced with random ROI crops sampled from a precomputed, label-free bounding-box bank, with optional jitter to increase variability. We evaluate under strict patient-level cross-validation and report ROC-AUC, PR-AUC, and training-time efficiency metrics (throughput and GPU memory). Because ROI augmentation is training-only, inference-time cost remains unchanged. On Mini-DDSM, ROI augmentation (best: p_roi = 0.10, alpha = 0.10) yields modest average ROC-AUC gains, with performance varying across folds; PR-AUC is flat to slightly lower. These results demonstrate that simple, data-centric ROI strategies can enhance mammography classification in constrained settings without requiring additional labels or architectural modifications.
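
The recipe is explicit enough to sketch: with probability p_roi a training image is swapped for a crop from the precomputed box bank, jittered by a fraction alpha of the box size. The helper below follows that description with the reported best settings; the exact jitter scheme is an assumption.

```python
import random
from PIL import Image

def roi_augment(img: Image.Image, box_bank: list,
                p_roi: float = 0.10, alpha: float = 0.10) -> Image.Image:
    """Training-only ROI augmentation sketch: probabilistically replace the
    full mammogram with a jittered random crop from a label-free box bank."""
    if not box_bank or random.random() >= p_roi:
        return img  # keep the full image
    x0, y0, x1, y1 = random.choice(box_bank)
    w, h = x1 - x0, y1 - y0
    jx = random.uniform(-alpha, alpha) * w  # horizontal jitter
    jy = random.uniform(-alpha, alpha) * h  # vertical jitter
    crop = (int(max(0, x0 + jx)), int(max(0, y0 + jy)),
            int(min(img.width, x1 + jx)), int(min(img.height, y1 + jy)))
    return img.crop(crop)
```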

A Contrastive Learning Framework for Breast Cancer Detection

Samia Saeed, Khuram Naveed

arXiv preprint · Sep 24, 2025
Breast cancer, the second leading cause of cancer-related deaths globally, accounts for a quarter of all cancer cases [1]. To lower this death rate, it is crucial to detect tumors early, as early-stage detection significantly improves treatment outcomes. Advances in non-invasive imaging techniques have made early detection possible through computer-aided detection (CAD) systems, which rely on traditional image analysis to identify malignancies. However, there is a growing shift toward deep learning methods due to their superior effectiveness. Despite their potential, deep learning methods often struggle with accuracy due to the limited availability of large labeled datasets for training. To address this issue, our study introduces a Contrastive Learning (CL) framework, which excels with smaller labeled datasets. Specifically, we train a ResNet-50 in a semi-supervised CL approach using a similarity index on a large amount of unlabeled mammogram data, applying various augmentations and transformations that improve the performance of our approach. Finally, we fine-tune our model on a small set of labeled data, outperforming the existing state of the art. In particular, we observed 96.7% accuracy in detecting breast cancer on the benchmark datasets INbreast and MIAS.
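
The abstract names a similarity index but not a specific loss; a standard stand-in for this kind of semi-supervised pre-training is the SimCLR-style NT-Xent objective over two augmented views, sketched below (the paper's actual objective may differ).

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent contrastive loss: embeddings of two views of the same image
    (row i of z1 and of z2) are pulled together, all others pushed apart."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / temperature                        # scaled cosine similarity
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device),
                     float("-inf"))                      # exclude self-pairs
    # The positive for row i is its counterpart view at index i + N (mod 2N).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```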

Deep Learning-based Automated Detection of Pulmonary Embolism: Is It Reliable?

Babacan Ö, Karkaş AY, Durak G, Uysal E, Durak Ü, Shrestha R, Bingöl Z, Okumuş G, Medetalibeyoğlu A, Ertürk ŞM

PubMed · Sep 24, 2025
To assess the diagnostic accuracy and clinical applicability of the artificial intelligence (AI) program "Canon Automation Platform" for the automated detection and localization of pulmonary emboli (PEs) in chest computed tomography pulmonary angiograms (CTPAs). A total of 1474 CTPAs performed for suspected PE were retrospectively evaluated by 2 senior radiology residents with 5 years of experience. The final diagnosis was verified through radiology reports by 2 thoracic radiologists with 20 and 25 years of experience, along with the patients' clinical records and histories. The images were transferred to the Canon Automation Platform, which integrates with the picture archiving and communication system (PACS), and the diagnostic performance of the platform was evaluated. This study examined all anatomic levels of the pulmonary arteries, including the left pulmonary artery, right pulmonary artery, and interlobar, segmental, and subsegmental branches. The confusion matrix results across all anatomic levels were as follows: AUC-ROC of 0.945 to 0.996, accuracy of 95.4% to 99.7%, sensitivity of 81.4% to 99.1%, specificity of 98.7% to 100%, PPV of 89.1% to 100%, NPV of 95.6% to 99.9%, F1 score of 0.868 to 0.987, and Cohen's kappa of 0.842 to 0.986. Notably, sensitivity in the subsegmental branches was lower (81.4% to 84.7%) compared with more central locations, whereas specificity remained consistent (98.7% to 98.9%). The results showed that the chest pain package of the Canon Automation Platform accurately provides rapid automatic PE detection in chest CTPAs by leveraging deep learning algorithms to facilitate the clinical workflow. This study demonstrates that AI can provide physicians with robust diagnostic support for acute PE, particularly in hospitals without 24/7 access to radiology specialists.
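
All of the reported per-level statistics follow directly from a 2x2 confusion matrix; the helper below restates those standard definitions (it is not the study's code).

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic metrics from a binary confusion matrix."""
    n = tp + fp + tn + fn
    sens = tp / (tp + fn)                 # sensitivity (recall)
    spec = tn / (tn + fp)                 # specificity
    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    f1 = 2 * ppv * sens / (ppv + sens)
    po = (tp + tn) / n                    # observed agreement (accuracy)
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (po - pe) / (1 - pe)          # Cohen's kappa, chance-corrected
    return {"accuracy": po, "sensitivity": sens, "specificity": spec,
            "ppv": ppv, "npv": npv, "f1": f1, "kappa": kappa}
```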

Performance Comparison of Cutting-Edge Large Language Models on the ACR In-Training Examination: An Update for 2025.

Young A, Paloka R, Islam A, Prasanna P, Hill V, Payne D

PubMed · Sep 24, 2025
This study represents a continuation of prior work by Payne et al. evaluating large language model (LLM) performance on radiology board-style assessments, specifically the ACR diagnostic radiology in-training examination (DXIT). Building upon earlier findings with GPT-4, we assess the performance of newer, cutting-edge models, such as GPT-4o, GPT-o1, GPT-o3, Claude, Gemini, and Grok, on standardized DXIT questions. In addition to overall performance, we compare model accuracy on text-based versus image-based questions to assess multi-modal reasoning capabilities. As a secondary aim, we investigate the potential impact of data contamination by comparing model performance on original versus revised image-based questions. Seven LLMs (GPT-4, GPT-4o, GPT-o1, GPT-o3, Claude 3.5 Sonnet, Gemini 1.5 Pro, and Grok 2.0) were evaluated using 106 publicly available DXIT questions. Each model was prompted using a standardized instruction set to simulate a radiology resident answering board-style questions. For each question, the model's selected answer, rationale, and confidence score were recorded. Unadjusted accuracy (based on correct answer selection) and logic-adjusted accuracy (based on clinical reasoning pathways) were calculated. Subgroup analysis compared model performance on text-based versus image-based questions. Additionally, 63 image-based questions were revised to test novel reasoning while preserving the original diagnostic image, to assess the impact of potential training data contamination. Across 106 DXIT questions, GPT-o1 demonstrated the highest unadjusted accuracy (71.7%), followed closely by GPT-4o (69.8%) and GPT-o3 (68.9%). GPT-4 and Grok 2.0 scored lower (59.4% and 52.8%, respectively), and Claude 3.5 Sonnet had the lowest unadjusted accuracy (34.9%). Similar trends were observed for logic-adjusted accuracy, with GPT-o1 (60.4%), GPT-4o (59.4%), and GPT-o3 (59.4%) again outperforming the other models, while Grok 2.0 and Claude 3.5 Sonnet lagged behind (34.0% and 30.2%, respectively). GPT-4o's performance was significantly higher on text-based questions than on image-based ones. Unadjusted accuracy on the revised DXIT questions was 49.2%, compared with 56.1% on the matched original questions; logic-adjusted accuracy was 40.0% versus 44.4%. No significant difference in performance was observed between original and revised questions. Modern LLMs, especially those from OpenAI, demonstrate strong and improved performance on board-style radiology assessments. Comparable performance on revised prompts suggests that data contamination may have played a limited role. As LLMs improve, they hold strong potential to support radiology resident learning through personalized feedback and board-style question review.

Automated Coronary Artery Calcium Scoring Using Deep Learning: Validation Across Diverse Chest CT Protocols.

Mineo E, Assuncao-Jr AN, Grego da Silva CF, Liberato G, Dantas-Jr RN, Graves CV, Gutierrez MA, Nomura CH

PubMed · Sep 24, 2025
Coronary artery calcium (CAC) scoring refines atherosclerotic cardiovascular disease (ASCVD) risk but is not frequently reported on routine non-gated chest CT (NCCT), whose use expanded in the COVID-19 era. We sought to develop and validate a workflow-ready deep learning model for fully automated, protocol-agnostic CAC quantification. In this retrospective study, a deep learning (DL) model was trained and validated using 2132 chest CT scans (routine, CT-CAC, and CT-COVID) from patients without established ASCVD, collected at a single university hospital (2013-2023). The index test was a DL-based CAC segmentation model; the reference standard was manual annotation by experienced observers. Agreement was evaluated using intra-class correlation coefficients (ICC) for Agatston scores and Cohen's kappa for CAC risk categories. Sensitivity, specificity, positive and negative predictive values, and F1 scores were calculated to measure diagnostic performance. The DL model demonstrated high reliability for Agatston scores (ICC=0.987) and strong agreement in CAC categories (Cohen's κ=0.86-0.95). Diagnostic performance for CAC >100 (F1=0.956) and CAC >300 (F1=0.967) was very high. External validation in the Mashhad COVID Study showed good agreement (κ=0.8). In the SBU COVID study, the F1 score for detecting moderate-to-severe CAC was 0.928. The proposed DL model delivers accurate, workflow-ready CAC quantification across routine, dedicated, and pandemic-era chest CT scans, supporting opportunistic, cost-effective cardiovascular risk stratification in contemporary clinical practice.
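
For context, the Agatston score that the ICC is computed over is the sum, across calcified lesions, of lesion area times a density factor keyed to peak attenuation; the helper below restates that conventional definition (it is not the study's implementation).

```python
def agatston_weight(max_hu: float) -> int:
    """Standard Agatston density factor for a lesion's peak attenuation (HU)."""
    if max_hu < 130:
        return 0  # below the calcium threshold
    if max_hu < 200:
        return 1
    if max_hu < 300:
        return 2
    if max_hu < 400:
        return 3
    return 4

def agatston_score(lesions) -> float:
    """Sum of lesion area (mm^2) x density factor over all calcified
    lesions; `lesions` is an iterable of (area_mm2, max_hu) pairs."""
    return sum(area * agatston_weight(max_hu) for area, max_hu in lesions)
```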