Latest Papers on Radiology AI.

Incidental Cardiovascular Findings in Lung Cancer Screening and Noncontrast Chest Computed Tomography.

Cham MD, Shemesh J

•papers•Sep 24 2025

While the primary goal of lung cancer screening CT is to detect early-stage lung cancer in high-risk populations, it often reveals asymptomatic cardiovascular abnormalities that can be clinically significant. These findings include coronary artery calcifications (CACs), myocardial pathologies, cardiac chamber enlargement, valvular lesions, and vascular disease. CAC, a marker of subclinical atherosclerosis, is particularly emphasized due to its strong predictive value for cardiovascular events and mortality. Guidelines recommend qualitative or quantitative CAC scoring on all noncontrast chest CTs. Other actionable findings include aortic aneurysms, pericardial disease, and myocardial pathology, some of which may indicate past or impending cardiac events. This article explores the wide range of incidental cardiovascular findings detectable during low-dose CT (LDCT) scans for lung cancer screening, as well as noncontrast chest CT scans. Distinguishing which findings warrant further evaluation is essential to avoid overdiagnosis, unnecessary anxiety, and resource misuse. The article advocates for a structured approach to follow-up based on the clinical significance of each finding and the patient's overall risk profile. It also notes the rising role of artificial intelligence in automatically detecting and quantifying these abnormalities, potentiating early behavioral modification or medical and surgical interventions. Ultimately, this piece highlights the opportunity to reframe LDCT as a comprehensive cardiothoracic screening tool.

CT Detection Chest Review Concept GenAI

HiPerformer: A High-Performance Global-Local Segmentation Model with Modular Hierarchical Fusion Strategy

Dayu Tan, Zhenpeng Xu, Yansen Su, Xin Peng, Chunhou Zheng, Weimin Zhong

•preprint•Sep 24 2025

Both local details and global context are crucial in medical image segmentation, and effectively integrating them is essential for achieving high accuracy. However, existing mainstream methods based on CNN-Transformer hybrid architectures typically employ simple feature fusion techniques such as serial stacking, endpoint concatenation, or pointwise addition, which struggle to address the inconsistencies between features and are prone to information conflict and loss. To address the aforementioned challenges, we innovatively propose HiPerformer. The encoder of HiPerformer employs a novel modular hierarchical architecture that dynamically fuses multi-source features in parallel, enabling layer-wise deep integration of heterogeneous information. The modular hierarchical design not only retains the independent modeling capability of each branch in the encoder, but also ensures sufficient information transfer between layers, effectively avoiding the degradation of features and information loss that come with traditional stacking methods. Furthermore, we design a Local-Global Feature Fusion (LGFF) module to achieve precise and efficient integration of local details and global semantic information, effectively alleviating the feature inconsistency problem and resulting in a more comprehensive feature representation. To further enhance multi-scale feature representation capabilities and suppress noise interference, we also propose a Progressive Pyramid Aggregation (PPA) module to replace traditional skip connections. Experiments on eleven public datasets demonstrate that the proposed method outperforms existing segmentation techniques, demonstrating higher segmentation accuracy and robustness. The code is available at https://github.com/xzphappy/HiPerformer.

Segmentation Methodology In Silico Open Code Benchmark SOTA

Photon-counting detector computed tomography in thoracic oncology: revolutionizing tumor imaging through precision and detail.

Yanagawa M, Ueno M, Ito R, Ueda D, Saida T, Kurokawa R, Takumi K, Nishioka K, Sugawara S, Ide S, Honda M, Iima M, Kawamura M, Sakata A, Sofue K, Oda S, Watabe T, Hirata K, Naganawa S

•papers•Sep 24 2025

Photon-counting detector computed tomography (PCD-CT) is an emerging imaging technology that promises to overcome the limitations of conventional energy-integrating detector (EID)-CT, particularly in thoracic oncology. This narrative review summarizes technical advances and clinical applications of PCD-CT in the thorax with emphasis on spatial resolution, dose-image-quality balance, and intrinsic spectral imaging, and it outlines practical implications relevant to thoracic oncology. A literature review of PubMed through May 31, 2025, was conducted using combinations of "photon counting," "computed tomography," "thoracic oncology," and "artificial intelligence." We screened the retrieved records and included studies with direct relevance to lung and mediastinal tumors, image quality, radiation dose, spectral/iodine imaging, or artificial intelligence-based reconstruction; case reports, editorials, and animal-only or purely methodological reports were excluded. PCD-CT demonstrated superior spatial resolution compared with EID-CT, enabling clearer visualization of fine pulmonary structures, such as bronchioles and subsolid nodules; slice thicknesses of approximately 0.4 mm and <i>ex vivo</i> resolvable structures approaching 0.11 mm have been reported. Across intraindividual clinical comparisons, radiation-dose reductions of 16%-43% have been achieved while maintaining or improving diagnostic image quality. Intrinsic spectral imaging enables accurate iodine mapping and low-keV virtual monoenergetic images and has shown quantitative advantages versus dual-energy CT in phantoms and early clinical work. Artificial intelligence-based deep-learning reconstruction and super-resolution can complement detector capabilities to reduce noise and stabilize fine-structure depiction without increasing dose. Potential reductions in contrast volume are biologically plausible given improved low-keV contrast-to-noise ratio, although clinical dose-finding data remain limited, and routine K-edge imaging has not yet translated to clinical thoracic practice. In conclusion, PCD-CT provides higher spatial and spectral fidelity at lower or comparable doses, supporting earlier and more precise tumor detection and characterization; future work should prioritize outcome-oriented trials, protocol harmonization, and implementation studies aligned with "Green Radiology".

CT Reconstruction Chest Review In Silico Academic Lab Benchmark SOTA

A Versatile Foundation Model for AI-enabled Mammogram Interpretation

Fuxiang Huang, Jiayi Zhu, Yunfang Yu, Yu Xie, Yuan Guo, Qingcong Kong, Mingxiang Wu, Xinrui Jiang, Shu Yang, Jiabo Ma, Ziyi Liu, Zhe Xu, Zhixuan Chen, Yujie Tan, Zifan He, Luhui Mao, Xi Wang, Junlin Hou, Lei Zhang, Qiong Luo, Zhenhui Li, Herui Yao, Hao Chen

•preprint•Sep 24 2025

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related mortality in women globally. Mammography is essential for the early detection and diagnosis of breast lesions. Despite recent progress in foundation models (FMs) for mammogram analysis, their clinical translation remains constrained by several fundamental limitations, including insufficient diversity in training data, limited model generalizability, and a lack of comprehensive evaluation across clinically relevant tasks. Here, we introduce VersaMammo, a versatile foundation model for mammograms, designed to overcome these limitations. We curated the largest multi-institutional mammogram dataset to date, comprising 706,239 images from 21 sources. To improve generalization, we propose a two-stage pre-training strategy to develop VersaMammo, a mammogram foundation model. First, a teacher model is trained via self-supervised learning to extract transferable features from unlabeled mammograms. Then, supervised learning combined with knowledge distillation transfers both features and clinical knowledge into VersaMammo. To ensure a comprehensive evaluation, we established a benchmark comprising 92 specific tasks, including 68 internal tasks and 24 external validation tasks, spanning 5 major clinical task categories: lesion detection, segmentation, classification, image retrieval, and visual question answering. VersaMammo achieves state-of-the-art performance, ranking first in 50 out of 68 specific internal tasks and 20 out of 24 external validation tasks, with average ranks of 1.5 and 1.2, respectively. These results demonstrate its superior generalization and clinical utility, offering a substantial advancement toward reliable and scalable breast cancer screening and diagnosis.

Mammography Classification Breast Methodology In Silico Academic Lab Benchmark SOTA Open Dataset

An Anisotropic Cross-View Texture Transfer with Multi-Reference Non-Local Attention for CT Slice Interpolation

Kwang-Hyun Uhm, Hyunjun Cho, Sung-Hoo Hong, Seung-Won Jung

•preprint•Sep 24 2025

Computed tomography (CT) is one of the most widely used non-invasive imaging modalities for medical diagnosis. In clinical practice, CT images are usually acquired with large slice thicknesses due to the high cost of memory storage and operation time, resulting in an anisotropic CT volume with much lower inter-slice resolution than in-plane resolution. Since such inconsistent resolution may lead to difficulties in disease diagnosis, deep learning-based volumetric super-resolution methods have been developed to improve inter-slice resolution. Most existing methods conduct single-image super-resolution on the through-plane or synthesize intermediate slices from adjacent slices; however, the anisotropic characteristic of 3D CT volume has not been well explored. In this paper, we propose a novel cross-view texture transfer approach for CT slice interpolation by fully utilizing the anisotropic nature of 3D CT volume. Specifically, we design a unique framework that takes high-resolution in-plane texture details as a reference and transfers them to low-resolution through-plane images. To this end, we introduce a multi-reference non-local attention module that extracts meaningful features for reconstructing through-plane high-frequency details from multiple in-plane images. Through extensive experiments, we demonstrate that our method performs significantly better in CT slice interpolation than existing competing methods on public CT datasets including a real-paired benchmark, verifying the effectiveness of the proposed framework. The source code of this work is available at https://github.com/khuhm/ACVTT.

CT Reconstruction Methodology In Silico Academic Lab Open Code

ImageNet-trained CNNs are not biased towards texture: Revisiting feature reliance through controlled suppression

Tom Burgert, Oliver Stoll, Paolo Rota, Begüm Demir

•preprint•Sep 24 2025

The hypothesis that Convolutional Neural Networks (CNNs) are inherently texture-biased has shaped much of the discourse on feature use in deep learning. We revisit this hypothesis by examining limitations in the cue-conflict experiment by Geirhos et al. To address these limitations, we propose a domain-agnostic framework that quantifies feature reliance through systematic suppression of shape, texture, and color cues, avoiding the confounds of forced-choice conflicts. By evaluating humans and neural networks under controlled suppression conditions, we find that CNNs are not inherently texture-biased but predominantly rely on local shape features. Nonetheless, this reliance can be substantially mitigated through modern training strategies or architectures (ConvNeXt, ViTs). We further extend the analysis across computer vision, medical imaging, and remote sensing, revealing that reliance patterns differ systematically: computer vision models prioritize shape, medical imaging models emphasize color, and remote sensing models exhibit a stronger reliance towards texture. Code is available at https://github.com/tomburgert/feature-reliance.

Methodology In Silico Open Code

Scan-do Attitude: Towards Autonomous CT Protocol Management using a Large Language Model Agent

Xingjian Kang, Linda Vorberg, Andreas Maier, Alexander Katzmann, Oliver Taubmann

•preprint•Sep 24 2025

Managing scan protocols in Computed Tomography (CT), which includes adjusting acquisition parameters or configuring reconstructions, as well as selecting postprocessing tools in a patient-specific manner, is time-consuming and requires clinical as well as technical expertise. At the same time, we observe an increasing shortage of skilled workforce in radiology. To address this issue, a Large Language Model (LLM)-based agent framework is proposed to assist with the interpretation and execution of protocol configuration requests given in natural language or a structured, device-independent format, aiming to improve the workflow efficiency and reduce technologists' workload. The agent combines in-context-learning, instruction-following, and structured toolcalling abilities to identify relevant protocol elements and apply accurate modifications. In a systematic evaluation, experimental results indicate that the agent can effectively retrieve protocol components, generate device compatible protocol definition files, and faithfully implement user requests. Despite demonstrating feasibility in principle, the approach faces limitations regarding syntactic and semantic validity due to lack of a unified device API, and challenges with ambiguous or complex requests. In summary, the findings show a clear path towards LLM-based agents for supporting scan protocol management in CT imaging.

CT LLM Radiology Report Methodology Prototype Academic Lab GenAI

SHMoAReg: Spark Deformable Image Registration via Spatial Heterogeneous Mixture of Experts and Attention Heads

Yuxi Zheng, Jianhui Feng, Tianran Li, Marius Staring, Yuchuan Qiao

•preprint•Sep 24 2025

Encoder-Decoder architectures are widely used in deep learning-based Deformable Image Registration (DIR), where the encoder extracts multi-scale features and the decoder predicts deformation fields by recovering spatial locations. However, current methods lack specialized extraction of features (that are useful for registration) and predict deformation jointly and homogeneously in all three directions. In this paper, we propose a novel expert-guided DIR network with Mixture of Experts (MoE) mechanism applied in both encoder and decoder, named SHMoAReg. Specifically, we incorporate Mixture of Attention heads (MoA) into encoder layers, while Spatial Heterogeneous Mixture of Experts (SHMoE) into the decoder layers. The MoA enhances the specialization of feature extraction by dynamically selecting the optimal combination of attention heads for each image token. Meanwhile, the SHMoE predicts deformation fields heterogeneously in three directions for each voxel using experts with varying kernel sizes. Extensive experiments conducted on two publicly available datasets show consistent improvements over various methods, with a notable increase from 60.58% to 65.58% in Dice score for the abdominal CT dataset. Furthermore, SHMoAReg enhances model interpretability by differentiating experts' utilities across/within different resolution layers. To the best of our knowledge, we are the first to introduce MoE mechanism into DIR tasks. The code will be released soon.

CT Registration Abdominal Methodology In Silico Open Code Benchmark SOTA

Revisiting Performance Claims for Chest X-Ray Models Using Clinical Context

Andrew Wang, Jiashuo Zhang, Michael Oberst

•preprint•Sep 24 2025

Public healthcare datasets of Chest X-Rays (CXRs) have long been a popular benchmark for developing computer vision models in healthcare. However, strong average-case performance of machine learning (ML) models on these datasets is insufficient to certify their clinical utility. In this paper, we use clinical context, as captured by prior discharge summaries, to provide a more holistic evaluation of current ``state-of-the-art'' models for the task of CXR diagnosis. Using discharge summaries recorded prior to each CXR, we derive a ``prior'' or ``pre-test'' probability of each CXR label, as a proxy for existing contextual knowledge available to clinicians when interpreting CXRs. Using this measure, we demonstrate two key findings: First, for several diagnostic labels, CXR models tend to perform best on cases where the pre-test probability is very low, and substantially worse on cases where the pre-test probability is higher. Second, we use pre-test probability to assess whether strong average-case performance reflects true diagnostic signal, rather than an ability to infer the pre-test probability as a shortcut. We find that performance drops sharply on a balanced test set where this shortcut does not exist, which may indicate that much of the apparent diagnostic power derives from inferring this clinical context. We argue that this style of analysis, using context derived from clinical notes, is a promising direction for more rigorous and fine-grained evaluation of clinical vision models.

X-Ray Classification Chest Retrospective Clinical In Silico Benchmark SOTA Ethics

Anomaly Detection by Clustering DINO Embeddings using a Dirichlet Process Mixture

Nico Schulthess, Ender Konukoglu

•preprint•Sep 24 2025

In this work, we leverage informative embeddings from foundational models for unsupervised anomaly detection in medical imaging. For small datasets, a memory-bank of normative features can directly be used for anomaly detection which has been demonstrated recently. However, this is unsuitable for large medical datasets as the computational burden increases substantially. Therefore, we propose to model the distribution of normative DINOv2 embeddings with a Dirichlet Process Mixture model (DPMM), a non-parametric mixture model that automatically adjusts the number of mixture components to the data at hand. Rather than using a memory bank, we use the similarity between the component centers and the embeddings as anomaly score function to create a coarse anomaly segmentation mask. Our experiments show that through DPMM embeddings of DINOv2, despite being trained on natural images, achieve very competitive anomaly detection performance on medical imaging benchmarks and can do this while at least halving the computation time at inference. Our analysis further indicates that normalized DINOv2 embeddings are generally more aligned with anatomical structures than unnormalized features, even in the presence of anomalies, making them great representations for anomaly detection. The code is available at https://github.com/NicoSchulthess/anomalydino-dpmm.

Mixed Modality Detection Methodology In Silico Academic Lab Open Code

Filter Papers

Tags

Incidental Cardiovascular Findings in Lung Cancer Screening and Noncontrast Chest Computed Tomography.

HiPerformer: A High-Performance Global-Local Segmentation Model with Modular Hierarchical Fusion Strategy

Photon-counting detector computed tomography in thoracic oncology: revolutionizing tumor imaging through precision and detail.

A Versatile Foundation Model for AI-enabled Mammogram Interpretation

An Anisotropic Cross-View Texture Transfer with Multi-Reference Non-Local Attention for CT Slice Interpolation

ImageNet-trained CNNs are not biased towards texture: Revisiting feature reliance through controlled suppression

Scan-do Attitude: Towards Autonomous CT Protocol Management using a Large Language Model Agent

SHMoAReg: Spark Deformable Image Registration via Spatial Heterogeneous Mixture of Experts and Attention Heads

Revisiting Performance Claims for Chest X-Ray Models Using Clinical Context

Anomaly Detection by Clustering DINO Embeddings using a Dirichlet Process Mixture

Ready to Sharpen Your Edge?