
MedFormer: Hierarchical Medical Vision Transformer with Content-Aware Dual Sparse Selection Attention

Zunhui Xia, Hongxing Li, Libin Lan

arXiv preprint · Jul 3, 2025
Medical image recognition serves as a key way to aid in clinical diagnosis, enabling more accurate and timely identification of diseases and abnormalities. Vision transformer-based approaches have proven effective in handling various medical recognition tasks. However, these methods encounter two primary challenges. First, they are often task-specific and architecture-tailored, limiting their general applicability. Second, they usually either adopt full attention to model long-range dependencies, resulting in high computational costs, or rely on handcrafted sparse attention, potentially leading to suboptimal performance. To tackle these issues, we present MedFormer, an efficient medical vision transformer with two key ideas. First, it employs a pyramid scaling structure as a versatile backbone for various medical image recognition tasks, including image classification and dense prediction tasks such as semantic segmentation and lesion detection. This structure facilitates hierarchical feature representation while reducing the computation load of feature maps, highly beneficial for boosting performance. Second, it introduces a novel Dual Sparse Selection Attention (DSSA) with content awareness to improve computational efficiency and robustness against noise while maintaining high performance. As the core building technique of MedFormer, DSSA is explicitly designed to attend to the most relevant content. In addition, a detailed theoretical analysis has been conducted, demonstrating that MedFormer has superior generality and efficiency in comparison to existing medical vision transformers. Extensive experiments on a variety of imaging modality datasets consistently show that MedFormer is highly effective in enhancing performance across all three above-mentioned medical image recognition tasks. The code is available at https://github.com/XiaZunhui/MedFormer.
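
The dual sparse selection idea can be illustrated with a minimal sketch: route each query region to its top-k most relevant regions, then attend only to the top-k tokens within them. Everything below is a simplified assumption-based illustration, not the authors' DSSA; their implementation is in the linked repository.

```python
# Minimal sketch of dual (region-then-token) sparse selection attention.
# Illustrative only; see https://github.com/XiaZunhui/MedFormer for DSSA.
import torch
import torch.nn.functional as F

def dual_sparse_attention(x, n_regions=16, k_regions=4, k_tokens=32):
    # x: (B, N, C) tokens; N must be divisible by n_regions
    B, N, C = x.shape
    r = N // n_regions                                  # tokens per region
    q = k = v = x                                       # identity projections for brevity
    # Stage 1: content-aware region selection via pooled descriptors
    q_reg = q.view(B, n_regions, r, C).mean(dim=2)      # (B, R, C)
    k_reg = k.view(B, n_regions, r, C).mean(dim=2)
    affinity = q_reg @ k_reg.transpose(-2, -1)          # (B, R, R)
    top_reg = affinity.topk(k_regions, dim=-1).indices  # (B, R, k_regions)
    # Gather the tokens of the selected regions for each query region
    k_tok = k.view(B, n_regions, r, C)
    v_tok = v.view(B, n_regions, r, C)
    idx = top_reg[..., None, None].expand(-1, -1, -1, r, C)
    k_sel = torch.gather(k_tok[:, None].expand(-1, n_regions, -1, -1, -1), 2, idx).flatten(2, 3)
    v_sel = torch.gather(v_tok[:, None].expand(-1, n_regions, -1, -1, -1), 2, idx).flatten(2, 3)
    # Stage 2: token-level selection inside the routed regions
    q_t = q.view(B, n_regions, r, C)
    scores = q_t @ k_sel.transpose(-2, -1) / C ** 0.5   # (B, R, r, k_regions*r)
    kt = min(k_tokens, scores.shape[-1])
    top_tok, tok_idx = scores.topk(kt, dim=-1)
    attn = F.softmax(top_tok, dim=-1)                   # attend only to selected tokens
    v_pick = torch.gather(v_sel[:, :, None].expand(-1, -1, r, -1, -1), 3,
                          tok_idx[..., None].expand(-1, -1, -1, -1, C))
    out = (attn[..., None] * v_pick).sum(dim=3)         # (B, R, r, C)
    return out.reshape(B, N, C)

y = dual_sparse_attention(torch.randn(2, 256, 64))      # -> (2, 256, 64)
```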

TABNet: A Triplet Augmentation Self-Recovery Framework with Boundary-Aware Pseudo-Labels for Medical Image Segmentation

Peilin Zhang, Shaouxan Wua, Jun Feng, Zhuo Jin, Zhizezhang Gao, Jingkun Chen, Yaqiong Xing, Xiao Zhang

arXiv preprint · Jul 3, 2025
Background and objective: Medical image segmentation is a core task in various clinical applications. However, acquiring large-scale, fully annotated medical image datasets is both time-consuming and costly. Scribble annotations, as a form of sparse labeling, provide an efficient and cost-effective alternative. Yet their sparsity limits feature learning of the target region and provides insufficient boundary supervision, posing significant challenges for training segmentation networks. Methods: We propose TABNet, a novel weakly-supervised medical image segmentation framework consisting of two key components: the triplet augmentation self-recovery (TAS) module and the boundary-aware pseudo-label supervision (BAP) module. The TAS module enhances feature learning through three complementary augmentation strategies: intensity transformation improves the model's sensitivity to texture and contrast variations, cutout forces the network to capture local anatomical structures by masking key regions, and jigsaw augmentation strengthens the modeling of global anatomical layout by disrupting spatial continuity. By guiding the network to recover complete masks from diverse augmented inputs, TAS promotes a deeper semantic understanding of medical images under sparse supervision. The BAP module enhances pseudo-supervision accuracy and boundary modeling by fusing dual-branch predictions into a loss-weighted pseudo-label and introducing a boundary-aware loss for fine-grained contour refinement. Results: Experimental evaluations on two public datasets, ACDC and MSCMRseg, demonstrate that TABNet significantly outperforms state-of-the-art methods for scribble-based weakly supervised segmentation. Moreover, it achieves performance comparable to that of fully supervised methods.
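
As a rough illustration of the three TAS augmentations, the sketch below implements intensity transformation, cutout, and jigsaw on a (C, H, W) image tensor. All parameter values are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of the three TAS-style augmentations described above.
import torch

def intensity_transform(img, gamma_range=(0.7, 1.4)):
    # Random gamma adjustment: perturbs texture/contrast; img in [0, 1]
    gamma = torch.empty(1).uniform_(*gamma_range).item()
    return img.clamp(min=0).pow(gamma)

def cutout(img, size=32):
    # Mask a random square so the network must rely on surrounding anatomy
    _, H, W = img.shape
    y = torch.randint(0, H - size, (1,)).item()
    x = torch.randint(0, W - size, (1,)).item()
    out = img.clone()
    out[:, y:y + size, x:x + size] = 0
    return out

def jigsaw(img, grid=4):
    # Shuffle a grid of patches to disrupt spatial continuity
    C, H, W = img.shape
    ph, pw = H // grid, W // grid
    patches = (img[:, :ph * grid, :pw * grid]
               .unfold(1, ph, ph).unfold(2, pw, pw)   # (C, grid, grid, ph, pw)
               .reshape(C, grid * grid, ph, pw))
    perm = torch.randperm(grid * grid)
    shuffled = patches[:, perm].reshape(C, grid, grid, ph, pw)
    return shuffled.permute(0, 1, 3, 2, 4).reshape(C, ph * grid, pw * grid)
```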

CineMyoPS: Segmenting Myocardial Pathologies from Cine Cardiac MR

Wangbin Ding, Lei Li, Junyi Qiu, Bogen Lin, Mingjing Yang, Liqin Huang, Lianming Wu, Sihan Wang, Xiahai Zhuang

arXiv preprint · Jul 3, 2025
Myocardial infarction (MI) is a leading cause of death worldwide. Late gadolinium enhancement (LGE) and T2-weighted cardiac magnetic resonance (CMR) imaging can respectively identify scarring and edema areas, both of which are essential for MI risk stratification and prognosis assessment. Although combining complementary information from multi-sequence CMR is useful, acquiring these sequences can be time-consuming and prohibitive, e.g., due to the administration of contrast agents. Cine CMR is a rapid and contrast-free imaging technique that can visualize both motion and structural abnormalities of the myocardium induced by acute MI. Therefore, we present a new end-to-end deep neural network, referred to as CineMyoPS, to segment myocardial pathologies, i.e., scars and edema, solely from cine CMR images. Specifically, CineMyoPS extracts both motion and anatomy features associated with MI. Given the interdependence between these features, we design a consistency loss (resembling the co-training strategy) to facilitate their joint learning. Furthermore, we propose a time-series aggregation strategy to integrate MI-related features across the cardiac cycle, thereby enhancing segmentation accuracy for myocardial pathologies. Experimental results on a multi-center dataset demonstrate that CineMyoPS achieves promising performance in myocardial pathology segmentation, motion estimation, and anatomy segmentation.
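
The consistency loss is described as resembling co-training; one plausible instantiation (an assumption, not the authors' code) penalizes disagreement between the two branches' per-pixel class distributions:

```python
# Minimal sketch of a co-training-style consistency loss between the motion
# and anatomy branches (hypothetical tensors; not the authors' code).
import torch
import torch.nn.functional as F

def consistency_loss(motion_logits, anatomy_logits):
    # Encourage agreement via the symmetric KL divergence between the two
    # branches' per-pixel class distributions.
    p = F.log_softmax(motion_logits, dim=1)
    q = F.log_softmax(anatomy_logits, dim=1)
    kl_pq = F.kl_div(p, q.exp(), reduction="batchmean")
    kl_qp = F.kl_div(q, p.exp(), reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)

# e.g. two branches predicting 4 classes over a 64x64 slice:
loss = consistency_loss(torch.randn(2, 4, 64, 64), torch.randn(2, 4, 64, 64))
```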

Multi-task machine learning reveals the functional neuroanatomy fingerprint of mental processing

Wang, Z., Chen, Y., Pan, Y., Yan, J., Mao, W., Xiao, Z., Cao, G., Toussaint, P.-J., Guo, W., Zhao, B., Sun, H., Zhang, T., Evans, A. C., Jiang, X.

bioRxiv preprint · Jul 3, 2025
Mental processing delineates the functions of the human mind, encompassing a wide range of motor, sensory, emotional, and cognitive processes, each of which is underlain by neuroanatomical substrates. Identifying an accurate representation of the functional neuroanatomy substrates of mental processing could inform understanding of its neural mechanisms. The challenge is that it is unclear whether a specific mental process possesses a 'functional neuroanatomy fingerprint', i.e., a unique and reliable pattern of functional neuroanatomy that underlies the mental process. To address this question, we utilized a multi-task deep learning model to disentangle the functional neuroanatomy fingerprints of seven different and representative mental processes: Emotion, Gambling, Language, Motor, Relational, Social, and Working Memory. Results based on functional magnetic resonance imaging data from two independent cohorts of 1235 subjects from the US and China consistently show that each of the seven mental processes possesses a functional neuroanatomy fingerprint, represented by a unique set of functional activity weights over whole-brain regions characterizing the degree to which each region is involved in the mental process. The functional neuroanatomy fingerprint of a specific mental process discriminates well from those of the other mental processes (93% classification accuracy, AUC of 0.99), and is robust across different datasets and brain atlases. This study provides a solid functional neuroanatomy foundation for investigating the neural mechanisms of mental processing.
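
As a loose illustration of how per-region activity weights could serve as a task fingerprint in a multi-task model, here is a hypothetical sketch; FingerprintNet and all dimensions are invented for illustration and do not reproduce the study's architecture:

```python
# Rough sketch: a multi-task model whose per-task weights over brain regions
# can be read out as a "fingerprint" (illustrative assumptions throughout).
import torch
import torch.nn as nn

N_REGIONS, N_TASKS = 360, 7   # e.g., atlas regions and the seven processes

class FingerprintNet(nn.Module):
    def __init__(self):
        super().__init__()
        # One learnable weight per region and task: the candidate fingerprint
        self.region_weights = nn.Parameter(torch.randn(N_TASKS, N_REGIONS))
        self.shared = nn.Sequential(nn.Linear(N_REGIONS, 128), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(128, 1) for _ in range(N_TASKS))

    def forward(self, x):
        # x: (B, N_REGIONS) regional activity; returns one logit per task
        logits = [head(self.shared(x * self.region_weights[t]))
                  for t, head in enumerate(self.heads)]
        return torch.cat(logits, dim=1)   # (B, N_TASKS)

model = FingerprintNet()
out = model(torch.randn(8, N_REGIONS))    # eight subjects, seven task logits
```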

PiCME: Pipeline for Contrastive Modality Evaluation and Encoding in the MIMIC Dataset

Michal Golovanevsky, Pranav Mahableshwarkar, Carsten Eickhoff, Ritambhara Singh

arXiv preprint · Jul 3, 2025
Multimodal deep learning holds promise for improving clinical prediction by integrating diverse patient data, including text, imaging, time-series, and structured demographics. Contrastive learning facilitates this integration by producing a unified representation that can be reused across tasks, reducing the need for separate models or encoders. Although contrastive learning has seen success in vision-language domains, its use in clinical settings remains largely limited to image and text pairs. We propose the Pipeline for Contrastive Modality Evaluation and Encoding (PiCME), which systematically assesses five clinical data types from MIMIC: discharge summaries, radiology reports, chest X-rays, demographics, and time-series. We pre-train contrastive models on all 26 combinations of two to five modalities and evaluate their utility on in-hospital mortality and phenotype prediction. To address performance plateaus with more modalities, we introduce a Modality-Gated LSTM that weights each modality according to its contrastively learned importance. Our results show that contrastive models remain competitive with supervised baselines, particularly in three-modality settings. Performance declines beyond three modalities, which supervised models fail to recover. The Modality-Gated LSTM mitigates this drop, improving AUROC from 73.19% to 76.93% and AUPRC from 51.27% to 62.26% in the five-modality setting. We also compare contrastively learned modality importance scores with attribution scores and evaluate generalization across demographic subgroups, highlighting strengths in interpretability and fairness. PiCME is the first to scale contrastive learning across all modality combinations in MIMIC, offering guidance for modality selection, training strategies, and equitable clinical prediction.
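
Under one plausible reading, a Modality-Gated LSTM scales each modality's embedding by a learned importance gate before sequential fusion. The sketch below is an assumption-laden illustration, not the released PiCME code:

```python
# Minimal sketch of a Modality-Gated LSTM (assumed reading of the paper's
# description; names and dimensions are illustrative).
import torch
import torch.nn as nn

class ModalityGatedLSTM(nn.Module):
    def __init__(self, n_modalities, dim, hidden=256, n_classes=2):
        super().__init__()
        # One learnable scalar gate per modality (e.g., reflecting its
        # contrastively learned importance)
        self.gates = nn.Parameter(torch.ones(n_modalities))
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, embeddings):
        # embeddings: (B, n_modalities, dim), one vector per modality
        gated = embeddings * torch.sigmoid(self.gates)[None, :, None]
        _, (h, _) = self.lstm(gated)      # treat modalities as a sequence
        return self.head(h[-1])

model = ModalityGatedLSTM(n_modalities=5, dim=128)
logits = model(torch.randn(4, 5, 128))    # four patients, five modalities
```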

Outcome prediction and individualized treatment effect estimation in patients with large vessel occlusion stroke

Lisa Herzog, Pascal Bühler, Ezequiel de la Rosa, Beate Sick, Susanne Wegener

arXiv preprint · Jul 3, 2025
Mechanical thrombectomy has become the standard of care in patients with stroke due to large vessel occlusion (LVO). However, only 50% of successfully treated patients show a favorable outcome. We developed and evaluated interpretable deep learning models to predict functional outcomes in terms of the modified Rankin Scale score, alongside individualized treatment effects (ITEs), using data from 449 LVO stroke patients in a randomized clinical trial. Besides clinical variables, we considered non-contrast CT (NCCT) and CT angiography (CTA) scans, which were integrated using novel foundation models to make use of advanced imaging information. Clinical variables had good predictive power for binary functional outcome prediction (AUC of 0.719 [0.666, 0.774]), which could be slightly improved by adding CTA imaging (AUC of 0.737 [0.687, 0.795]). Adding NCCT scans, or a combination of NCCT and CTA scans, to clinical features yielded no improvement. The most important clinical predictor of functional outcome was pre-stroke disability. While estimated ITEs were well calibrated to the average treatment effect, discriminatory ability was limited, as indicated by a C-for-Benefit statistic of around 0.55 in all models. In summary, the models allowed us to jointly integrate CT imaging and clinical features while achieving state-of-the-art prediction performance and ITE estimates. Yet, further research is needed, particularly to improve ITE estimation.
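
For readers unfamiliar with ITE estimation, a generic T-learner illustrates the idea: fit separate outcome models per trial arm and take the difference of predicted outcome probabilities. The paper's actual models are deep and multimodal; the data below are simulated:

```python
# Generic T-learner sketch for individualized treatment effects (ITEs);
# simulated data, not the study's models or cohort.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(449, 10))            # clinical features (simulated)
treated = rng.integers(0, 2, size=449)    # randomized treatment assignment
y = rng.integers(0, 2, size=449)          # favorable outcome (simulated)

m1 = LogisticRegression().fit(X[treated == 1], y[treated == 1])
m0 = LogisticRegression().fit(X[treated == 0], y[treated == 0])
# ITE = predicted outcome probability under treatment minus under control
ite = m1.predict_proba(X)[:, 1] - m0.predict_proba(X)[:, 1]
print(ite.mean())   # should roughly match the average treatment effect
```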

Group-derived and individual disconnection in stroke: recovery prediction and deep graph learning

Bey, P., Dhindsa, K., Rackoll, T., Feldheim, J., Bönstrup, M., Thomalla, G., Schulz, R., Cheng, B., Gerloff, C., Endres, M., Nave, A. H., Ritter, P.

medRxiv preprint · Jul 3, 2025
Recent advances in the treatment of acute ischemic stroke have contributed to improved patient outcomes, yet the mechanisms driving long-term disease trajectory are not well understood. Current trends in the literature emphasize the distributed, disruptive impact of stroke lesions on brain network organization. While most studies use population-derived data to investigate lesion interference on healthy tissue, the potential for individualized treatment strategies remains underexplored due to a lack of availability and effective utilization of the necessary clinical imaging data. To validate the potential for individualized patient evaluation, we explored and compared the differential information in network models based on normative and individual data. We further present a novel deep learning approach providing usable and accurate estimates of individual stroke impact from minimal imaging data, thus bridging the data gap hindering individualized treatment planning. We created normative and individual disconnectomes for each of 78 patients (mean age 65.1 years, 32 female) from two independent cohort studies. MRI data and the Barthel Index, as a measure of activities of daily living, were collected in the acute and early sub-acute phase after stroke (baseline) and at three months post-stroke. Disconnectomes were subsequently described using 12 network metrics, including clustering coefficient and transitivity. Metrics were first compared between disconnectomes and then used as features in a classifier to predict a patient's disease trajectory, as defined by the three-month Barthel Index. We then developed a deep learning architecture based on graph convolution and trained it to predict properties of the individual disconnectomes from the normative disconnectomes. The two disconnectome types showed statistically significant differences in topology and predictive power. Normative disconnectomes included a significantly larger number of connections (N=604 normative versus N=210 individual), and agreement between network properties ranged from r²=0.01 for clustering coefficient to r²=0.8 for assortativity, highlighting the impact of disconnectome choice on subsequent analysis. In predicting patient deficit severity, individual data achieved an AUC of 0.94 compared to 0.85 for normative-based features. Our deep learning estimates showed high correlation with individual features (mean r²=0.94) and comparable performance, with an AUC of 0.93. We showed that normative-data-based analysis of stroke disconnections provides limited information regarding patient recovery. In contrast, individual data provided higher prognostic precision. We presented a novel approach that curbs the need for individual data while retaining most of the differential information encoding an individual patient's disease trajectory.
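
Several of the reported metrics (number of connections, clustering coefficient, transitivity, assortativity) can be computed from a disconnectome's edge list with networkx; a minimal sketch:

```python
# Minimal sketch of graph metrics of the kind compared between normative and
# individual disconnectomes (the study used 12 metrics in total).
import networkx as nx

def disconnectome_metrics(edges):
    # edges: iterable of (region_i, region_j) disconnected pairs
    G = nx.Graph()
    G.add_edges_from(edges)
    return {
        "n_connections": G.number_of_edges(),
        "clustering": nx.average_clustering(G),
        "transitivity": nx.transitivity(G),
        "assortativity": nx.degree_assortativity_coefficient(G),
    }

print(disconnectome_metrics([(0, 1), (1, 2), (2, 0), (2, 3)]))
```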

Quantification of Optical Coherence Tomography Features in >3500 Patients with Inherited Retinal Disease Reveals Novel Genotype-Phenotype Associations

Woof, W. A., de Guimaraes, T. A. C., Al-Khuzaei, S., Daich Varela, M., Shah, M., Naik, G., Sen, S., Bagga, P., Chan, Y. W., Mendes, B. S., Lin, S., Ghoshal, B., Liefers, B., Fu, D. J., Georgiou, M., da Silva, A. S., Nguyen, Q., Liu, Y., Fujinami-Yokokawa, Y., Sumodhee, D., Furman, J., Patel, P. J., Moghul, I., Moosajee, M., Sallum, J., De Silva, S. R., Lorenz, B., Herrmann, P., Holz, F. G., Fujinami, K., Webster, A. R., Mahroo, O. A., Downes, S. M., Madhusudhan, S., Balaskas, K., Michaelides, M., Pontikos, N.

medRxiv preprint · Jul 3, 2025
Purpose: To quantify spectral-domain optical coherence tomography (SD-OCT) images cross-sectionally and longitudinally in a large cohort of molecularly characterized patients with inherited retinal disease (IRD) from the UK. Design: Retrospective study of imaging data. Participants: Patients with a clinical and molecularly confirmed diagnosis of IRD who underwent macular SD-OCT imaging at Moorfields Eye Hospital (MEH) between 2011 and 2019. We retrospectively identified 4,240 IRD patients from the MEH database (198 distinct IRD genes), including 69,664 SD-OCT macular volumes. Methods: Eight features of interest were defined: retina, fovea, intraretinal cystic spaces (ICS), subretinal fluid (SRF), subretinal hyper-reflective material (SHRM), pigment epithelium detachment (PED), ellipsoid zone loss (EZ-loss) and retinal pigment epithelium loss (RPE-loss). Manual annotation of five b-scans per SD-OCT volume was performed for the retinal features by four graders following a defined grading protocol. A total of 1,749 b-scans from 360 SD-OCT volumes across 275 patients were annotated for the eight retinal features to train and test a neural-network-based segmentation model, AIRDetect-OCT, which was then applied to the entire imaging dataset. Main Outcome Measures: Performance of AIRDetect-OCT, compared to inter-grader agreement, was evaluated using the Dice score on a held-out dataset. Feature prevalence, volume and area were analysed cross-sectionally and longitudinally. Results: The inter-grader Dice score for manual segmentation was ≥90% for retina, ICS, SRF, SHRM and PED, and >77% for both EZ-loss and RPE-loss. Model-grader agreement was >80% for segmentation of retina, ICS, SRF, SHRM, and PED, and >68% for both EZ-loss and RPE-loss. Automatic segmentation was applied to 272,168 b-scans across 7,405 SD-OCT volumes from 3,534 patients encompassing 176 unique genes. Accounting for age, male patients exhibited significantly more EZ-loss (19.6 mm² vs 17.9 mm², p<2.8×10⁻⁴) and RPE-loss (7.79 mm² vs 6.15 mm², p<3.2×10⁻⁶) than females. RPE-loss was significantly higher in Asian patients than in other ethnicities (9.37 mm² vs 7.29 mm², p<0.03). ICS average total volume was largest in RS1 (0.47 mm³) and NR2E3 (0.25 mm³), SRF in BEST1 (0.21 mm³), and PED in EFEMP1 (0.34 mm³). BEST1 and PROM1 showed significantly different patterns of EZ-loss (p<10⁻⁴) and RPE-loss (p<0.02) comparing the dominant to the recessive forms. Sectoral analysis revealed significantly increased EZ-loss in the inferior quadrant compared to the superior quadrant for RHO (Δ=-0.414 mm², p=0.036) and EYS (Δ=-0.908 mm², p=1.5×10⁻⁴). In ABCA4 retinopathy, more severe genotypes (group A) were associated with faster progression of EZ-loss (2.80±0.62 mm²/yr), whilst the p.(Gly1961Glu) variant (group D) was associated with slower progression (0.56±0.18 mm²/yr). There were also sex differences within groups, with males in group A experiencing significantly faster rates of RPE-loss progression (2.48±1.40 mm²/yr vs 0.87±0.62 mm²/yr, p=0.047), but lower rates in groups B, C, and D. Conclusions: AIRDetect-OCT, a novel deep learning algorithm, enables large-scale OCT feature quantification in IRD patients, uncovering cross-sectional and longitudinal phenotype correlations with demographic and genotypic parameters.
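
The Dice score used for grader and model agreement is the standard overlap measure; a minimal sketch (the smoothing term and handling of empty masks are assumptions):

```python
# Standard Dice overlap between two binary segmentation masks.
import numpy as np

def dice(pred, truth, eps=1e-7):
    # pred, truth: boolean arrays of the same shape
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    inter = np.logical_and(pred, truth).sum()
    return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

a = np.zeros((4, 4), bool); a[1:3, 1:3] = True
b = np.zeros((4, 4), bool); b[1:3, 1:4] = True
print(round(float(dice(a, b)), 3))   # 0.8: overlap 4, mask sizes 4 and 6
```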

A Multi-Centric Anthropomorphic 3D CT Phantom-Based Benchmark Dataset for Harmonization

Mohammadreza Amirian, Michael Bach, Oscar Jimenez-del-Toro, Christoph Aberle, Roger Schaer, Vincent Andrearczyk, Jean-Félix Maestrati, Maria Martin Asiain, Kyriakos Flouris, Markus Obmann, Clarisse Dromain, Benoît Dufour, Pierre-Alexandre Alois Poletti, Hendrik von Tengg-Kobligk, Rolf Hügli, Martin Kretzschmar, Hatem Alkadhi, Ender Konukoglu, Henning Müller, Bram Stieltjes, Adrien Depeursinge

arXiv preprint · Jul 2, 2025
Artificial intelligence (AI) has introduced numerous opportunities for human assistance and task automation in medicine. However, it suffers from poor generalization in the presence of shifts in the data distribution. In the context of AI-based computed tomography (CT) analysis, significant data distribution shifts can be caused by changes in scanner manufacturer, reconstruction technique, or dose. AI harmonization techniques can address this problem by reducing distribution shifts caused by varying acquisition settings. This paper presents an open-source benchmark dataset containing CT scans of an anthropomorphic phantom acquired with various scanners and settings, whose purpose is to foster the development of AI harmonization techniques. Using a phantom removes variability attributable to inter- and intra-patient differences. The dataset includes 1378 image series acquired with 13 scanners from 4 manufacturers across 8 institutions using a harmonized protocol, as well as several acquisition doses. Additionally, we present a methodology, baseline results, and open-source code to assess image- and feature-level stability and liver tissue classification, promoting the development of AI harmonization strategies.
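
One simple way to quantify feature-level stability across scanners, consistent with the benchmark's aim, is the between-scanner coefficient of variation of a feature measured on the same phantom. The sketch below uses a hypothetical feature table, not the actual dataset's schema:

```python
# Illustrative between-scanner stability check on a hypothetical feature
# table (the benchmark's actual metrics and schema may differ; see its code).
import pandas as pd

# One row per image series: scanner id plus an example feature value
df = pd.DataFrame({
    "scanner": ["A", "A", "B", "B", "C", "C"],
    "liver_mean_hu": [52.1, 51.8, 57.3, 56.9, 49.5, 50.2],
})

per_scanner = df.groupby("scanner")["liver_mean_hu"].mean()
cv = per_scanner.std() / per_scanner.mean()   # between-scanner variability
print(f"between-scanner CV: {cv:.3f}")
```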

PanTS: The Pancreatic Tumor Segmentation Dataset

Wenxuan Li, Xinze Zhou, Qi Chen, Tianyu Lin, Pedro R. A. S. Bassi, Szymon Plotka, Jaroslaw B. Cwikla, Xiaoxi Chen, Chen Ye, Zheren Zhu, Kai Ding, Heng Li, Kang Wang, Yang Yang, Yucheng Tang, Daguang Xu, Alan L. Yuille, Zongwei Zhou

arXiv preprint · Jul 2, 2025
PanTS is a large-scale, multi-institutional dataset curated to advance research in pancreatic CT analysis. It contains 36,390 CT scans from 145 medical centers, with expert-validated, voxel-wise annotations of over 993,000 anatomical structures, covering pancreatic tumors, pancreas head, body, and tail, and 24 surrounding anatomical structures such as vascular/skeletal structures and abdominal/thoracic organs. Each scan includes metadata such as patient age, sex, diagnosis, contrast phase, in-plane spacing, slice thickness, etc. AI models trained on PanTS achieve significantly better performance in pancreatic tumor detection, localization, and segmentation compared to those trained on existing public datasets. Our analysis indicates that these gains are directly attributable to the 16x larger-scale tumor annotations and indirectly supported by the 24 additional surrounding anatomical structures. As the largest and most comprehensive resource of its kind, PanTS offers a new benchmark for developing and evaluating AI models in pancreatic CT analysis.