
Temporal Representation Learning for Real-Time Ultrasound Analysis

Yves Stebler, Thomas M. Sutter, Ece Ozkan, Julia E. Vogt

arXiv preprint · Sep 1, 2025
Ultrasound (US) imaging is a critical tool in medical diagnostics, offering real-time visualization of physiological processes. One of its major advantages is its ability to capture temporal dynamics, which is essential for assessing motion patterns in applications such as cardiac monitoring, fetal development, and vascular imaging. Despite its importance, current deep learning models often overlook the temporal continuity of ultrasound sequences, analyzing frames independently and missing key temporal dependencies. To address this gap, we propose a method for learning effective temporal representations from ultrasound videos, with a focus on echocardiography-based ejection fraction (EF) estimation. EF prediction serves as an ideal case study to demonstrate the necessity of temporal learning, as it requires capturing the rhythmic contraction and relaxation of the heart. Our approach leverages temporally consistent masking and contrastive learning to enforce temporal coherence across video frames, enhancing the model's ability to represent motion patterns. Evaluated on the EchoNet-Dynamic dataset, our method achieves a substantial improvement in EF prediction accuracy, highlighting the importance of temporally-aware representation learning for real-time ultrasound analysis.
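
The temporal-coherence idea described above can be illustrated with a short sketch: the same random spatial mask is applied to every frame of a clip, and an InfoNCE-style contrastive loss pulls two views of the same clip together. This is a minimal PyTorch sketch assuming a clip-level encoder; the patch size, masking ratio, and temperature are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def temporally_consistent_mask(clip: torch.Tensor, ratio: float = 0.5) -> torch.Tensor:
    """Zero out the same random spatial patches in every frame of a (B, T, C, H, W) clip.
    Assumes H and W are divisible by the patch size."""
    b, t, c, h, w = clip.shape
    patch = 16
    gh, gw = h // patch, w // patch
    keep = (torch.rand(b, 1, 1, gh, gw, device=clip.device) > ratio).float()
    keep = keep.repeat_interleave(patch, dim=-2).repeat_interleave(patch, dim=-1)
    return clip * keep  # identical mask along the T axis -> temporally consistent

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Contrastive loss pulling two embeddings of the same clip together."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                       # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```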

Prior-Guided Residual Diffusion: Calibrated and Efficient Medical Image Segmentation

Fuyou Mao, Beining Wu, Yanfeng Jiang, Han Xue, Yan Tang, Hao Zhang

arXiv preprint · Sep 1, 2025
Ambiguity in medical image segmentation calls for models that capture full conditional distributions rather than a single point estimate. We present Prior-Guided Residual Diffusion (PGRD), a diffusion-based framework that learns voxel-wise distributions while maintaining strong calibration and practical sampling efficiency. PGRD embeds discrete labels as one-hot targets in a continuous space to align segmentation with diffusion modeling. A coarse prior predictor provides step-wise guidance; the diffusion network then learns the residual to the prior, accelerating convergence and improving calibration. A deep diffusion supervision scheme further stabilizes training by supervising intermediate time steps. Evaluated on representative MRI and CT datasets, PGRD achieves higher Dice scores and lower NLL/ECE values than Bayesian, ensemble, Probabilistic U-Net, and vanilla diffusion baselines, while requiring fewer sampling steps to reach strong performance.
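
A rough sketch of the residual-to-prior objective described above, assuming a standard DDPM-style forward process and a user-supplied `denoiser` network; the function and variable names below are illustrative placeholders, not the authors' code.

```python
import torch
import torch.nn.functional as F

def pgrd_loss(denoiser, labels, prior_logits, alphas_bar):
    """labels: (B, H, W) integer class map; prior_logits: (B, K, H, W) coarse prior;
    alphas_bar: 1-D tensor of cumulative noise-schedule products."""
    k = prior_logits.shape[1]
    x0 = F.one_hot(labels, k).permute(0, 3, 1, 2).float()   # embed discrete labels in a continuous space
    prior = prior_logits.softmax(dim=1)
    residual = x0 - prior                                    # the network learns the residual to the prior
    t = torch.randint(0, alphas_bar.numel(), (x0.size(0),), device=x0.device)
    a = alphas_bar[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(residual)
    x_t = a.sqrt() * residual + (1 - a).sqrt() * noise       # diffuse the residual
    pred = denoiser(x_t, t, prior)                           # prior provides step-wise guidance
    return F.mse_loss(pred, residual)
```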

M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

Che Liu, Zheng Jiang, Chengyu Fang, Heng Guo, Yan-Jie Zhou, Jiaqi Qu, Le Lu, Minfeng Xu

arXiv preprint · Sep 1, 2025
Medical image retrieval is essential for clinical decision-making and translational research, relying on discriminative visual representations. Yet, current methods remain fragmented, relying on separate architectures and training strategies for 2D, 3D, and video-based medical data. This modality-specific design hampers scalability and inhibits the development of unified representations. To enable unified learning, we curate a large-scale hybrid-modality dataset comprising 867,653 medical imaging samples, including 2D X-rays and ultrasounds, RGB endoscopy videos, and 3D CT scans. Leveraging this dataset, we train M3Ret, a unified visual encoder without any modality-specific customization. It successfully learns transferable representations using both generative (MAE) and contrastive (SimDINO) self-supervised learning (SSL) paradigms. Our approach sets a new state-of-the-art in zero-shot image-to-image retrieval across all individual modalities, surpassing strong baselines such as DINOv3 and the text-supervised BMC-CLIP. More remarkably, strong cross-modal alignment emerges without paired data, and the model generalizes to unseen MRI tasks, despite never observing MRI during pretraining, demonstrating the generalizability of purely visual self-supervision to unseen modalities. Comprehensive analyses further validate the scalability of our framework across model and data sizes. These findings deliver a promising signal to the medical imaging community, positioning M3Ret as a step toward foundation models for visual SSL in multimodal medical image understanding.
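
Zero-shot image-to-image retrieval with a frozen visual encoder reduces to embedding both the query and the gallery, normalizing, and ranking by cosine similarity. A minimal sketch; the `encoder` stands for any self-supervised model (MAE- or SimDINO-style) and nothing here is specific to M3Ret's released code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def retrieve(encoder, query: torch.Tensor, gallery: torch.Tensor, top_k: int = 5):
    """query: (1, C, H, W); gallery: (N, C, H, W). Returns indices of the top-k gallery matches."""
    q = F.normalize(encoder(query), dim=-1)          # (1, D) query embedding
    g = F.normalize(encoder(gallery), dim=-1)        # (N, D) gallery embeddings
    sims = (q @ g.t()).squeeze(0)                    # cosine similarities
    return sims.topk(top_k).indices
```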

FocalTransNet: A Hybrid Focal-Enhanced Transformer Network for Medical Image Segmentation.

Liao M, Yang R, Zhao Y, Liang W, Yuan J

PubMed paper · Sep 1, 2025
CNNs have demonstrated superior performance in medical image segmentation. To overcome the limitation of relying only on local receptive fields, previous work has attempted to integrate Transformers into convolutional network components such as encoders, decoders, or skip connections. However, these methods can only establish long-distance dependencies for some specific patterns and usually neglect the loss of fine-grained details during downsampling in multi-scale feature extraction. To address these issues, we present a novel hybrid Transformer network called FocalTransNet. Specifically, we construct a focal-enhanced (FE) Transformer module by introducing dense cross-connections into a CNN-Transformer dual-path structure and deploy the FE Transformer throughout the entire encoder. Different from existing hybrid networks that employ embedding or stacking strategies, the proposed model allows for a comprehensive extraction and deep fusion of both local and global features at different scales. In addition, we propose a symmetric patch merging (SPM) module for downsampling, which can retain fine-grained details by establishing a specific information compensation mechanism. We evaluated the proposed method on four different medical image segmentation benchmarks. The proposed method outperforms previous state-of-the-art convolutional networks, Transformers, and hybrid networks. The code for FocalTransNet is publicly available at https://github.com/nemanjajoe/FocalTransNet.
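
A rough PyTorch sketch of the dual-path, cross-connected idea: a convolutional branch and a self-attention branch each process the input and additionally receive the other branch's output before fusion. This is an illustration in the spirit of the FE Transformer described above, not the published FocalTransNet code; layer sizes and the fusion rule are assumptions.

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    """One dual-path stage with additive cross-connections between a CNN and an attention branch."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.conv2 = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * dim, dim, 1)

    def _attend(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        t = self.norm(x.flatten(2).transpose(1, 2))        # (B, HW, C) token sequence
        out, _ = self.attn(t, t, t)
        return out.transpose(1, 2).reshape(b, c, h, w)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local_a = self.conv1(x)                 # local (CNN) features
        global_a = self._attend(x)              # global (attention) features
        local_b = self.conv2(x + global_a)      # cross-connection: CNN branch sees the global path
        global_b = self._attend(x + local_a)    # cross-connection: attention branch sees the local path
        return self.fuse(torch.cat([local_b, global_b], dim=1)) + x
```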

Automated coronary analysis in ultrahigh-spatial resolution photon-counting detector CT angiography: Clinical validation and intra-individual comparison with energy-integrating detector CT.

Kravchenko D, Hagar MT, Varga-Szemes A, Schoepf UJ, Schoebinger M, O'Doherty J, Gülsün MA, Laghi A, Laux GS, Vecsey-Nagy M, Emrich T, Tremamunno G

PubMed paper · Sep 1, 2025
To evaluate a deep-learning algorithm for automated coronary artery analysis on ultrahigh-resolution photon-counting detector coronary computed tomography (CT) angiography and to compare its performance with that of expert readers, using invasive coronary angiography as the reference. Thirty-two patients (mean age 68.6 years; 81% male) underwent both energy-integrating detector and ultrahigh-resolution photon-counting detector CT within 30 days. Expert readers scored each image using the Coronary Artery Disease-Reporting and Data System classification, and the scores were compared against invasive angiography. After a three-month wash-out, one reader reanalyzed the photon-counting detector CT images assisted by the algorithm. Sensitivity, specificity, accuracy, inter-reader agreement, and reading times were recorded for each method. On 401 arterial segments, inter-reader agreement improved from substantial (κ = 0.75) on energy-integrating detector CT to near-perfect (κ = 0.86) on photon-counting detector CT. The algorithm alone achieved 85% sensitivity, 91% specificity, and 90% accuracy on energy-integrating detector CT, and 85%, 96%, and 95% on photon-counting detector CT. Compared with invasive angiography on photon-counting detector CT, manual and automated reads had similar sensitivity (67%), but manual assessment slightly outperformed the algorithm in specificity (85% vs. 79%) and accuracy (84% vs. 78%). When the reader was assisted by the algorithm, specificity rose to 97% (p < 0.001), accuracy to 95%, and reading time decreased by 54% (p < 0.001). This deep-learning algorithm demonstrates high agreement with experts and improved diagnostic performance on photon-counting detector CT. Expert review augmented by the algorithm further increases specificity and markedly reduces interpretation time.
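
The per-segment performance figures above reduce to confusion-matrix arithmetic. A minimal sketch, assuming binary per-segment labels (1 = obstructive lesion on invasive angiography) and binary algorithm calls; no counts from the study are reproduced here.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, PPV, NPV, and accuracy from paired binary reads."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, ppv, npv, accuracy
```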

SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection

Yao Wang, Dong Yang, Zhi Qiao, Wenjian Huang, Liuzhi Yang, Zhen Qian

arXiv preprint · Sep 1, 2025
Abnormality detection in medical imaging is a critical task requiring both high efficiency and accuracy to support effective diagnosis. While convolutional neural networks (CNNs) and Transformer-based models are widely used, both face intrinsic challenges: CNNs have limited receptive fields, restricting their ability to capture broad contextual information, and Transformers encounter prohibitive computational costs when processing high-resolution medical images. Mamba, a recent innovation in natural language processing, has gained attention for its ability to process long sequences with linear complexity, offering a promising alternative. Building on this foundation, we present SpectMamba, the first Mamba-based architecture designed for medical image detection. A key component of SpectMamba is the Hybrid Spatial-Frequency Attention (HSFA) block, which separately learns high- and low-frequency features. This approach effectively mitigates the loss of high-frequency information caused by frequency bias and correlates frequency-domain features with spatial features, thereby enhancing the model's ability to capture global context. To further improve long-range dependencies, we propose the Visual State-Space Module (VSSM) and introduce a novel Hilbert Curve Scanning technique to strengthen spatial correlations and local dependencies, further optimizing the Mamba framework. Comprehensive experiments show that SpectMamba achieves state-of-the-art performance while being both effective and efficient across various medical image detection tasks.
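
Hilbert Curve Scanning, i.e., flattening a 2D feature map into a 1D token sequence along a locality-preserving curve before it enters the state-space model, can be sketched as follows. This uses the standard index-to-coordinate Hilbert conversion and assumes a square map whose side is a power of two; it illustrates the scanning order only, not the authors' implementation.

```python
import torch

def d2xy(n: int, d: int):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid (n a power of two)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                 # rotate/flip the quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_flatten(feat: torch.Tensor) -> torch.Tensor:
    """Reorder a (B, C, n, n) feature map into a (B, n*n, C) sequence along the Hilbert curve."""
    b, c, n, _ = feat.shape
    order = [d2xy(n, d) for d in range(n * n)]
    idx = torch.tensor([x * n + y for x, y in order])
    tokens = feat.flatten(2).transpose(1, 2)         # (B, n*n, C) in raster order
    return tokens[:, idx]                            # locality-preserving scan order
```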

Added prognostic value of histogram features from preoperative multi-modal diffusion MRI in predicting Ki-67 proliferation for adult-type diffuse gliomas.

Huang Y, He S, Hu H, Ma H, Huang Z, Zeng S, Mazu L, Zhou W, Zhao C, Zhu N, Wu J, Liu Q, Yang Z, Wang W, Shen G, Zhang N, Chu J

PubMed paper · Sep 1, 2025
Ki-67 labelling index (LI), a critical marker of tumor proliferation, is vital for grading adult-type diffuse gliomas and predicting patient survival. However, its accurate assessment currently relies on invasive biopsy or surgical resection, making it challenging to non-invasively predict Ki-67 LI and subsequent prognosis. Therefore, this study aimed to investigate whether histogram analysis of multi-parametric diffusion model metrics, specifically diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), and neurite orientation dispersion and density imaging (NODDI), could help predict Ki-67 LI in adult-type diffuse gliomas and further predict patient survival. A total of 123 patients with diffuse gliomas who underwent preoperative bipolar spin-echo diffusion magnetic resonance imaging (MRI) were included. Diffusion metrics (DTI, DKI, and NODDI) and their histogram features were extracted and used to develop a nomogram model in the training set (n=86), and the performance was verified in the test set (n=37). The area under the receiver operating characteristic curve (AUC) of the nomogram model was calculated. The outcome cohort, comprising all 123 patients, was used to evaluate the predictive value of the diffusion nomogram model for overall survival (OS). Cox proportional hazards regression was performed to predict OS. Among the 123 patients, 87 exhibited high Ki-67 LI (Ki-67 LI >5%). The patients had a mean age of 46.08±13.24 years, and 39 were female. Tumor grading showed 46 cases of grade 2, 21 cases of grade 3, and 56 cases of grade 4. The nomogram model included eight histogram features from diffusion MRI and showed good performance for predicting Ki-67 LI, with AUCs of 0.92 [95% confidence interval (CI): 0.85-0.98, sensitivity =0.85, specificity =0.84] and 0.84 (95% CI: 0.64-0.98, sensitivity =0.77, specificity =0.73) in the training and test sets, respectively. The nomogram incorporating these variables also showed good discrimination for Ki-67 LI prediction and glioma grading. A low nomogram model score relative to the median value in the outcome cohort was independently associated with OS (P<0.01). Accurate prediction of Ki-67 LI in adult-type diffuse glioma patients was achieved using a multi-modal diffusion MRI histogram radiomics model, which also reliably predicted survival. ClinicalTrials.gov Identifier: NCT06572592.
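
First-order histogram features of the kind used here can be computed directly from a diffusion metric map restricted to the tumor mask. A minimal NumPy/SciPy sketch; the feature list is illustrative and is not the exact set of eight features selected for the published nomogram.

```python
import numpy as np
from scipy import stats

def histogram_features(metric_map: np.ndarray, mask: np.ndarray) -> dict:
    """metric_map: e.g. an FA, MK, or NDI volume; mask: boolean tumor ROI of the same shape."""
    vals = metric_map[mask.astype(bool)]
    counts, _ = np.histogram(vals, bins=64)
    p = counts / counts.sum()
    p = p[p > 0]
    return {
        "mean": float(vals.mean()),
        "variance": float(vals.var()),
        "skewness": float(stats.skew(vals)),
        "kurtosis": float(stats.kurtosis(vals)),
        "p10": float(np.percentile(vals, 10)),
        "p90": float(np.percentile(vals, 90)),
        "entropy": float(-(p * np.log2(p)).sum()),   # histogram entropy over 64 bins
    }
```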

Prediction of lymphovascular invasion in invasive breast cancer via intratumoral and peritumoral multiparametric magnetic resonance imaging machine learning-based radiomics with Shapley additive explanations interpretability analysis.

Chen S, Zhong Z, Chen Y, Tang W, Fan Y, Sui Y, Hu W, Pan L, Liu S, Kong Q, Guo Y, Liu W

PubMed paper · Sep 1, 2025
The use of multiparametric magnetic resonance imaging (MRI) in predicting lymphovascular invasion (LVI) in breast cancer has been well-documented in the literature. However, the majority of the related studies have focused primarily on intratumoral characteristics, overlooking the potential contribution of peritumoral features. The aim of this study was to evaluate the effectiveness of multiparametric MRI in predicting LVI by analyzing both intratumoral and peritumoral radiomics features and to assess the added value of incorporating both regions in LVI prediction. A total of 366 patients underwent preoperative breast MRI at two centers and were divided into training (n=208), validation (n=70), and test (n=88) sets. Imaging features were extracted from intratumoral and peritumoral T2-weighted imaging, diffusion-weighted imaging, and dynamic contrast-enhanced MRI. Five logistic regression models were developed for predicting LVI status: the tumor area (TA) model, peritumoral area (PA) model, tumor-plus-peritumoral area (TPA) model, clinical model, and combined model. The combined model was created by incorporating the highest radiomics score and clinical factors. Predictive efficacy was evaluated via the receiver operating characteristic (ROC) curve and area under the curve (AUC). The Shapley additive explanations (SHAP) method was used to rank the features and explain the final model. The performance of the TPA model was superior to that of the TA and PA models. A combined model was further developed via multivariable logistic regression, incorporating the TPA radiomics score (radscore), MRI-assessed axillary lymph node (ALN) status, and peritumoral edema (PE). The combined model demonstrated good calibration and discrimination across the training, validation, and test datasets, with AUCs of 0.888 [95% confidence interval (CI): 0.841-0.934], 0.856 (95% CI: 0.769-0.943), and 0.853 (95% CI: 0.760-0.946), respectively. Furthermore, we conducted SHAP analysis to evaluate the contributions of the TPA radscore, MRI-ALN status, and PE to LVI status prediction. The combined model, incorporating clinical factors and the intratumoral and peritumoral radscore, effectively predicts LVI and may aid in tailored treatment planning.
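
A minimal sketch of the combined-model pipeline described above: a logistic regression over the TPA radscore, MRI-assessed ALN status, and peritumoral edema, evaluated by AUC and explained with SHAP. The feature values below are synthetic placeholders, and `shap.LinearExplainer` is the assumed entry point into the shap package, not code from the study.

```python
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=200),             # TPA radscore (synthetic)
    rng.integers(0, 2, size=200),     # MRI-assessed ALN status (synthetic)
    rng.integers(0, 2, size=200),     # peritumoral edema (synthetic)
])
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)  # synthetic LVI labels

model = LogisticRegression().fit(X[:150], y[:150])
print("test AUC:", roc_auc_score(y[150:], model.predict_proba(X[150:])[:, 1]))

explainer = shap.LinearExplainer(model, X[:150])      # per-feature contribution to the logit
shap_values = explainer.shap_values(X[150:])
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0))
```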

Improved image quality and diagnostic performance of coronary computed tomography angiography-derived fractional flow reserve with super-resolution deep learning reconstruction.

Zou LM, Xu C, Xu M, Xu KT, Wang M, Wang Y, Wang YN

PubMed paper · Sep 1, 2025
The super-resolution deep learning reconstruction (SR-DLR) algorithm has emerged as a promising image reconstruction technique for improving the image quality of coronary computed tomography angiography (CCTA) and ensuring accurate CCTA-derived fractional flow reserve (CT-FFR) assessment even in problematic scenarios (e.g., the presence of heavily calcified plaque and stent implantation). Therefore, the purposes of this study were to evaluate the image quality of CCTA obtained with SR-DLR in comparison with conventional reconstruction methods and to investigate the diagnostic performance of different reconstruction approaches based on CT-FFR. Fifty patients who underwent CCTA and subsequent invasive coronary angiography (ICA) were retrospectively included. All images were reconstructed with hybrid iterative reconstruction (HIR), model-based iterative reconstruction (MBIR), conventional deep learning reconstruction (C-DLR), and SR-DLR algorithms. Objective parameters and subjective scores were compared. Among the patients, 22 (comprising 45 lesions) had invasive FFR results as a reference, and the diagnostic performance of the different reconstruction approaches based on CT-FFR was compared. SR-DLR achieved the lowest image noise, highest signal-to-noise ratio (SNR), and best edge sharpness (all P values <0.05), as well as the best subjective scores from both reviewers (all P values <0.001). With FFR serving as the reference, specificity and positive predictive value (PPV) were improved compared with HIR and C-DLR (72% vs. 36-44% and 73% vs. 53-58%, respectively); moreover, SR-DLR improved sensitivity and negative predictive value (NPV) compared with MBIR (95% vs. 70% and 95% vs. 68%, respectively; all P values <0.05). The overall diagnostic accuracy and area under the curve (AUC) for SR-DLR were significantly higher than those of the HIR, MBIR, and C-DLR algorithms (82% vs. 60-67% and 0.84 vs. 0.61-0.70, respectively; all P values <0.05). SR-DLR had the best image quality on both objective and subjective evaluation. The diagnostic performance of CT-FFR was improved by SR-DLR, enabling more accurate assessment of flow-limiting lesions.
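
Objective image-quality metrics of the kind reported above are typically measured on regions of interest: noise as the standard deviation of attenuation in a homogeneous background ROI, SNR as mean vessel attenuation over noise, and CNR as the vessel-to-background contrast over noise. A minimal NumPy sketch with assumed ROI masks.

```python
import numpy as np

def snr_cnr(image: np.ndarray, vessel_roi: np.ndarray, background_roi: np.ndarray):
    """ROIs are boolean masks, e.g. the coronary/aortic lumen and adjacent homogeneous tissue."""
    vessel = image[vessel_roi.astype(bool)]
    background = image[background_roi.astype(bool)]
    noise = background.std()                            # image noise
    snr = vessel.mean() / noise                         # signal-to-noise ratio
    cnr = (vessel.mean() - background.mean()) / noise   # contrast-to-noise ratio
    return noise, snr, cnr
```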

Cross-channel feature transfer 3D U-Net for automatic segmentation of the perilymph and endolymph fluid spaces in hydrops MRI.

Yoo TW, Yeo CD, Lee EJ, Oh IS

PubMed paper · Sep 1, 2025
The identification of endolymphatic hydrops (EH) using magnetic resonance imaging (MRI) is crucial for understanding inner ear disorders such as Meniere's disease and sudden low-frequency hearing loss. The EH ratio is calculated as the ratio of the endolymphatic fluid space to the perilymphatic fluid space. We propose a novel cross-channel feature transfer (CCFT) 3D U-Net for fully automated segmentation of the perilymphatic and endolymphatic fluid spaces in hydrops MRI. The model exhibits state-of-the-art performance in segmenting the endolymphatic fluid space by transferring magnetic resonance cisternography (MRC) features to HYDROPS-Mi2 (HYbriD of Reversed image Of Positive endolymph signal and native image of positive perilymph Signal multiplied with the heavily T2-weighted MR cisternography). Experimental results using the CCFT module showed that the segmentation performance of the perilymphatic space was 0.9459 for the Dice similarity coefficient (DSC) and 0.8975 for the intersection over union (IOU), and that of the endolymphatic space was 0.8053 for the DSC and 0.6778 for the IOU.
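
The reported DSC/IOU values and the EH ratio all reduce to voxel counting on binary masks. A minimal NumPy sketch; following the definition above, the EH ratio is taken as endolymphatic over perilymphatic fluid volume.

```python
import numpy as np

def dice_iou(pred: np.ndarray, gt: np.ndarray):
    """Dice similarity coefficient and intersection over union for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dsc = 2 * inter / (pred.sum() + gt.sum())
    iou = inter / np.logical_or(pred, gt).sum()
    return dsc, iou

def eh_ratio(endolymph_mask: np.ndarray, perilymph_mask: np.ndarray) -> float:
    """Ratio of the endolymphatic to the perilymphatic fluid space, by voxel count."""
    endo = endolymph_mask.astype(bool).sum()
    peri = perilymph_mask.astype(bool).sum()
    return endo / peri
```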
