
Yao Wang, Dong Yang, Zhi Qiao, Wenjian Huang, Liuzhi Yang, Zhen Qian

arXiv preprint · Sep 1, 2025
Abnormality detection in medical imaging is a critical task requiring both high efficiency and accuracy to support effective diagnosis. While convolutional neural networks (CNNs) and Transformer-based models are widely used, both face intrinsic challenges: CNNs have limited receptive fields, restricting their ability to capture broad contextual information, and Transformers encounter prohibitive computational costs when processing high-resolution medical images. Mamba, a recent innovation in natural language processing, has gained attention for its ability to process long sequences with linear complexity, offering a promising alternative. Building on this foundation, we present SpectMamba, the first Mamba-based architecture designed for medical image detection. A key component of SpectMamba is the Hybrid Spatial-Frequency Attention (HSFA) block, which separately learns high- and low-frequency features. This approach effectively mitigates the loss of high-frequency information caused by frequency bias and correlates frequency-domain features with spatial features, thereby enhancing the model's ability to capture global context. To further improve long-range dependencies, we propose the Visual State-Space Module (VSSM) and introduce a novel Hilbert Curve Scanning technique to strengthen spatial correlations and local dependencies, further optimizing the Mamba framework. Comprehensive experiments show that SpectMamba achieves state-of-the-art performance while being both effective and efficient across various medical image detection tasks.
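For intuition, here is a minimal sketch of the kind of spatial-frequency split the HSFA block builds on: a feature map is separated into low- and high-frequency components with an FFT mask so each band can be treated separately. This is an illustration of the idea, not the paper's implementation; the function name and cutoff radius are assumptions.

```python
# Separate a feature map into low- and high-frequency parts via an FFT mask.
# Illustrative sketch only; radius and names are assumptions, not SpectMamba's code.
import torch

def split_frequencies(x: torch.Tensor, radius: float = 0.25):
    """x: (B, C, H, W) feature map -> (low_freq, high_freq) spatial maps."""
    B, C, H, W = x.shape
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    # Build a centered circular low-pass mask in the frequency plane.
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij"
    )
    mask = ((xx**2 + yy**2).sqrt() <= radius).to(x.dtype)
    low = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real
    high = x - low  # the residual keeps the high-frequency detail
    return low, high

low, high = split_frequencies(torch.randn(2, 8, 64, 64))
```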

Yves Stebler, Thomas M. Sutter, Ece Ozkan, Julia E. Vogt

arXiv preprint · Sep 1, 2025
Ultrasound (US) imaging is a critical tool in medical diagnostics, offering real-time visualization of physiological processes. One of its major advantages is its ability to capture temporal dynamics, which is essential for assessing motion patterns in applications such as cardiac monitoring, fetal development, and vascular imaging. Despite its importance, current deep learning models often overlook the temporal continuity of ultrasound sequences, analyzing frames independently and missing key temporal dependencies. To address this gap, we propose a method for learning effective temporal representations from ultrasound videos, with a focus on echocardiography-based ejection fraction (EF) estimation. EF prediction serves as an ideal case study to demonstrate the necessity of temporal learning, as it requires capturing the rhythmic contraction and relaxation of the heart. Our approach leverages temporally consistent masking and contrastive learning to enforce temporal coherence across video frames, enhancing the model's ability to represent motion patterns. Evaluated on the EchoNet-Dynamic dataset, our method achieves a substantial improvement in EF prediction accuracy, highlighting the importance of temporally-aware representation learning for real-time ultrasound analysis.
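A minimal sketch of the temporally consistent masking idea, under the assumption that "consistent" means the same spatial patches are hidden in every frame of a clip, forcing the model to rely on motion cues. Patch size and mask ratio are illustrative, not the paper's settings.

```python
# Mask the same spatial patches across all frames of an ultrasound clip.
import torch

def temporally_consistent_mask(video: torch.Tensor, patch: int = 16,
                               mask_ratio: float = 0.5) -> torch.Tensor:
    """video: (T, C, H, W). Returns the clip with one shared patch mask."""
    T, C, H, W = video.shape
    gh, gw = H // patch, W // patch
    keep = torch.rand(gh * gw) >= mask_ratio           # one mask for all frames
    mask = keep.reshape(gh, gw)
    mask = mask.repeat_interleave(patch, 0).repeat_interleave(patch, 1)
    return video * mask.to(video.dtype)                # broadcasts over T and C

masked = temporally_consistent_mask(torch.randn(32, 1, 112, 112))
```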

Emmanouil Nikolakakis, Amine Ouasfi, Julie Digne, Razvan Marinescu

arXiv preprint · Sep 1, 2025
We present RibPull, a methodology that utilizes implicit occupancy fields to bridge computational geometry and medical imaging. Implicit 3D representations use continuous functions that handle sparse and noisy data more effectively than discrete methods. While voxel grids are standard for medical imaging, they suffer from resolution limitations, topological information loss, and inefficient handling of sparsity. Coordinate functions preserve complex geometrical information, represent a better solution for sparse data, and allow further morphological operations. Implicit scene representations enable neural networks to encode entire 3D scenes within their weights. The result is a continuous function that can implicitly compensate for sparse signals and infer further information about the 3D scene when any combination of 3D coordinates is passed as input to the model. In this work, we use neural occupancy fields that predict whether a 3D point lies inside or outside an object to represent CT-scanned ribcages. We also apply a Laplacian-based contraction to extract the medial axis of the ribcage, thus demonstrating a geometrical operation that benefits greatly from continuous coordinate-based 3D scene representations versus voxel-based representations. We evaluate our methodology on 20 medical scans from the RibSeg dataset, which is itself an extension of the RibFrac dataset. We will release our code upon publication.
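A minimal sketch of a neural occupancy field as described above: an MLP maps a 3D coordinate to the probability of lying inside the object. Layer sizes and training details are illustrative assumptions, not RibPull's exact setup.

```python
# An occupancy network: coordinates in, inside/outside probability out.
import torch
import torch.nn as nn

class OccupancyField(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # logit: inside (>0) vs outside (<0)
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        return self.net(xyz)

model = OccupancyField()
points = torch.rand(4096, 3) * 2 - 1           # query points in [-1, 1]^3
inside_prob = torch.sigmoid(model(points))     # continuous occupancy estimate
```

Because the function is continuous, any coordinate can be queried, which is what makes downstream geometric operations like the Laplacian-based medial-axis contraction convenient.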

Che Liu, Zheng Jiang, Chengyu Fang, Heng Guo, Yan-Jie Zhou, Jiaqi Qu, Le Lu, Minfeng Xu

arXiv preprint · Sep 1, 2025
Medical image retrieval is essential for clinical decision-making and translational research, relying on discriminative visual representations. Yet, current methods remain fragmented, relying on separate architectures and training strategies for 2D, 3D, and video-based medical data. This modality-specific design hampers scalability and inhibits the development of unified representations. To enable unified learning, we curate a large-scale hybrid-modality dataset comprising 867,653 medical imaging samples, including 2D X-rays and ultrasounds, RGB endoscopy videos, and 3D CT scans. Leveraging this dataset, we train M3Ret, a unified visual encoder without any modality-specific customization. It successfully learns transferable representations using both generative (MAE) and contrastive (SimDINO) self-supervised learning (SSL) paradigms. Our approach sets a new state-of-the-art in zero-shot image-to-image retrieval across all individual modalities, surpassing strong baselines such as DINOv3 and the text-supervised BMC-CLIP. More remarkably, strong cross-modal alignment emerges without paired data, and the model generalizes to unseen MRI tasks, despite never observing MRI during pretraining, demonstrating the generalizability of purely visual self-supervision to unseen modalities. Comprehensive analyses further validate the scalability of our framework across model and data sizes. These findings deliver a promising signal to the medical imaging community, positioning M3Ret as a step toward foundation models for visual SSL in multimodal medical image understanding.
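A minimal sketch of zero-shot image-to-image retrieval with a frozen encoder: embed query and gallery, L2-normalize, and rank by cosine similarity. The `encoder` here is a stand-in for a pretrained model such as M3Ret (an assumption, not the released API).

```python
# Zero-shot retrieval: nearest neighbors in a frozen embedding space.
import torch

@torch.no_grad()
def retrieve(encoder, query: torch.Tensor, gallery: torch.Tensor, k: int = 5):
    q = torch.nn.functional.normalize(encoder(query), dim=-1)    # (1, D)
    g = torch.nn.functional.normalize(encoder(gallery), dim=-1)  # (N, D)
    sims = q @ g.T                                               # cosine similarities
    return sims.topk(k, dim=-1).indices                          # top-k gallery ids

# Toy stand-in encoder, just to make the sketch runnable.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.LazyLinear(128))
ids = retrieve(encoder, torch.randn(1, 1, 64, 64), torch.randn(100, 1, 64, 64))
```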

Cheng Cheng, Zeping Chen, Rui Xie, Peiyao Zheng, Xavier Wang

arXiv preprint · Sep 1, 2025
Accurately predicting early recurrence in brain tumor patients following surgical resection remains a clinical challenge. This study proposes a multi-modal machine learning framework that integrates structural MRI features with clinical biomarkers to improve postoperative recurrence prediction. We employ four machine learning algorithms -- Gradient Boosting Machine (GBM), Random Survival Forest (RSF), CoxBoost, and XGBoost -- and validate model performance using concordance index (C-index), time-dependent AUC, calibration curves, and decision curve analysis. Our model demonstrates promising performance, offering a potential tool for risk stratification and personalized follow-up planning.
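For reference, the concordance index used to validate these survival models is the fraction of comparable patient pairs whose predicted risks are ordered consistently with their observed times. A minimal sketch (O(n²) reference implementation, not the study's code):

```python
# C-index: agreement between predicted risk ordering and observed event times.
import numpy as np

def concordance_index(time: np.ndarray, event: np.ndarray, risk: np.ndarray) -> float:
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue                      # a pair is comparable only if the
        for j in range(n):                # earlier time is an observed event
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Example: higher predicted risk pairs with earlier recurrence.
t = np.array([5.0, 8.0, 12.0, 20.0])
e = np.array([1, 1, 0, 1])
r = np.array([0.9, 0.7, 0.4, 0.1])
print(concordance_index(t, e, r))  # 1.0 for perfectly ordered risks
```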

Cheong S, Gupta R, Kadaba Sridhar S, Hall AJ, Frankel M, Wright DW, Sham YY, Samadani U

PubMed paper · Sep 1, 2025
This post hoc study of the Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment (ProTECT) III trial investigates whether improving traumatic brain injury (TBI) classification, using serum biomarkers (glial fibrillary acidic protein [GFAP] and ubiquitin carboxyl-terminal esterase L1 [UCH-L1]) and algorithmically assessed total lesion volume, could identify a subset of responders to progesterone treatment beyond broad measures such as the Glasgow Coma Scale (GCS) and Glasgow Outcome Scale-Extended (GOS-E), which may fail to capture subtle changes in TBI recovery. Brain lesion volumes on CT scans were quantified using the Brain Lesion Analysis and Segmentation Tool for CT. Patients were classified into true-positive and true-negative groups based on an optimization scheme that selects the threshold maximizing agreement between radiological assessment and objectively measured lesion volume. True-positives were further categorized into low (> 0.2-10 mL), medium (> 10-50 mL), and high (> 50 mL) lesion volumes for analysis with protein biomarkers and injury severity. Correlation analyses linked Rotterdam scores (RSs) with biomarker levels and lesion volumes, whereas Welch's t-test evaluated biomarker differences between groups and progesterone's effects. Setting: 49 level 1 trauma centers in the United States. Patients: patients with moderate-to-severe TBI. Intervention: progesterone. GFAP and UCH-L1 levels were significantly higher in true-positive cases with low to medium lesion volume. Only UCH-L1 differed between progesterone and placebo groups at 48 hours. Both biomarkers and lesion volume in the true-positive group correlated with the RS. No sex-specific or treatment differences were found. This study reaffirms elevated GFAP and UCH-L1 as biomarkers for detecting TBI in patients with brain lesions and for predicting clinical outcomes. Despite improved classification using CT-imaging segmentation and serum biomarkers, we did not identify a subset of progesterone responders within 24 or 48 hours of treatment. More rigorous and quantifiable measures of injury may be needed to enable development of therapeutics, as neither serum markers nor algorithmic CT analysis performed better than the older Rotterdam or GCS metrics.
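The volume stratification above reduces to simple threshold binning. A minimal sketch (function name and labels are illustrative):

```python
# Bin an algorithmically measured lesion volume (mL) per the study's cutoffs.
def volume_group(lesion_ml: float) -> str:
    if lesion_ml > 50:
        return "high"         # > 50 mL
    if lesion_ml > 10:
        return "medium"       # > 10-50 mL
    if lesion_ml > 0.2:
        return "low"          # > 0.2-10 mL
    return "below_threshold"  # not counted as a true-positive lesion

print([volume_group(v) for v in (0.1, 5.0, 25.0, 80.0)])
# ['below_threshold', 'low', 'medium', 'high']
```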

Cao Z, Zhang J, Lin C, Li T, Wu H, Zhang Y

PubMed paper · Sep 1, 2025
This study explored a generative image synthesis method based on diffusion models, potentially providing a low-cost and high-efficiency training data augmentation strategy for medical artificial intelligence (AI) applications. The MedMNIST v2 dataset was utilized as a small-volume training dataset under low-performance computing conditions. Based on the characteristics of existing samples, new medical images were synthesized using the proposed annotated diffusion model. In addition to observational assessment, quantitative evaluation was performed based on the gradient descent of the loss function during the generation process and the Fréchet Inception Distance (FID), using various loss functions and feature vector dimensions. Compared to the original data, the proposed diffusion model successfully generated medical images of similar styles but with dramatically varied anatomic details. The model trained with the Huber loss function achieved a higher FID of 15.2 at a feature vector dimension of 2048, compared with the model trained with the L2 loss function, which achieved the best FID of 0.85 at a feature vector dimension of 64. The use of the Huber loss enhanced model robustness, while FID values indicated acceptable similarity between generated and real images. Future work should explore the application of these models to more complex datasets and clinical scenarios. This study demonstrated that diffusion model-based medical image synthesis is potentially applicable as an augmentation strategy for AI, particularly in situations where access to real clinical data is limited. Optimal training parameters were also proposed by evaluating the dimensionality of feature vectors in FID calculations and the complexity of loss functions.
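The loss comparison at the heart of the study can be sketched as a single denoising training step trained with either Huber (smooth L1) or L2 loss on the noise prediction. The model and noising schedule below are toy placeholders, not the paper's setup.

```python
# One diffusion-style training step with a switchable noise-prediction loss.
import torch
import torch.nn.functional as F

def diffusion_loss(model, x0: torch.Tensor, loss_type: str = "huber"):
    noise = torch.randn_like(x0)
    t = torch.rand(x0.shape[0], 1, 1, 1)       # toy continuous timestep
    x_t = (1 - t) * x0 + t * noise             # simplistic noising schedule
    pred = model(x_t)                          # predict the added noise
    if loss_type == "huber":
        return F.smooth_l1_loss(pred, noise)   # more robust to outliers
    return F.mse_loss(pred, noise)             # plain L2

model = torch.nn.Conv2d(1, 1, 3, padding=1)    # stand-in denoiser
loss = diffusion_loss(model, torch.randn(8, 1, 28, 28))
loss.backward()
```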

Li Y, Ding J, Du F, Wang Z, Liu Z, Liu Y, Zhou Y, Zhang Q

PubMed paper · Sep 1, 2025
Locally advanced rectal cancer (LARC) exhibits significant heterogeneity in response to neoadjuvant chemotherapy (NAC), with poor responders facing delayed treatment and unnecessary toxicity. Although MRI provides spatial pathophysiological information and proteomics reveals molecular mechanisms, current single-modal approaches cannot integrate these complementary perspectives, resulting in limited predictive accuracy and biological insight. This retrospective study developed a multimodal deep learning framework using a cohort of 274 LARC patients treated with NAC (2012-2021). Graph neural networks analyzed proteomic profiles from FFPE tissues, incorporating KEGG/GO pathways and PPI networks, while a spatially enhanced 3D ResNet152 processed T2WI. A LightGBM classifier integrated both modalities with clinical features, using zero-imputation for missing data. Model performance was assessed through AUC-ROC, decision curve analysis, and interpretability techniques (SHAP and Grad-CAM). The integrated model achieved superior NAC response prediction (test AUC 0.828, sensitivity 0.875, specificity 0.750), significantly outperforming single-modal approaches (MRI ΔAUC +0.109; proteomics ΔAUC +0.125). SHAP analysis revealed that MRI-derived features contributed 57.7% of the predictive power, primarily through quantification of peritumoral stromal heterogeneity. Proteomics identified 10 key chemoresistance proteins: CYBA, GUSB, ATP6AP2, DYNC1I2, DAD1, ACOX1, COPG1, FBP1, DHRS7, and SSR3. Decision curve analysis confirmed clinical utility across threshold probabilities (0-0.75). Our study established a novel MRI-proteomics integration framework for NAC response prediction, with MRI defining spatial resistance patterns and proteomics deciphering molecular drivers, enabling early organ-preservation strategies. The zero-imputation design ensures deployability in diverse clinical settings.
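A minimal sketch of the late-fusion step: per-modality embeddings are concatenated with clinical features, with zeros imputed when a modality is missing, then fed to a LightGBM classifier. Feature dimensions and names are assumptions, not the study's configuration.

```python
# Late fusion with zero-imputation for missing modalities + LightGBM.
import numpy as np
from lightgbm import LGBMClassifier

def fuse(mri_emb, prot_emb, clinical, d_mri=128, d_prot=64):
    mri = mri_emb if mri_emb is not None else np.zeros(d_mri)    # zero-imputation
    prot = prot_emb if prot_emb is not None else np.zeros(d_prot)
    return np.concatenate([mri, prot, clinical])

rng = np.random.default_rng(0)
X = np.stack([
    fuse(rng.normal(size=128),
         rng.normal(size=64) if i % 3 else None,  # every third patient lacks proteomics
         rng.normal(size=5))
    for i in range(120)
])
y = rng.integers(0, 2, size=120)                  # responder vs non-responder
clf = LGBMClassifier(n_estimators=200).fit(X, y)
```

Zero-imputation keeps the feature layout fixed, so the same trained model can score patients even when one modality was never acquired.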

He G, Cheng W, Zhu H, Yu G

PubMed paper · Sep 1, 2025
The hippocampus is an important brain structure involved in various psychiatric disorders, and its automatic and accurate segmentation is vital for studying these diseases. Recently, deep learning-based methods have made significant progress in hippocampus segmentation. However, training deep neural network models requires substantial computational resources, time, and a large amount of labeled training data, which is frequently scarce in medical image segmentation. To address these issues, we propose LoRA-PT, a novel parameter-efficient fine-tuning (PEFT) method that transfers the pre-trained UNETR model from the BraTS2021 dataset to the hippocampus segmentation task. Specifically, LoRA-PT divides the parameter matrix of the transformer structure into three distinct sizes, yielding three third-order tensors. These tensors are decomposed using tensor singular value decomposition to generate low-rank tensors consisting of the principal singular values and vectors, with the remaining singular values and vectors forming the residual tensor. During fine-tuning, only the low-rank tensors (i.e., the principal tensor singular values and vectors) are updated, while the residual tensors remain unchanged. We validated the proposed method on three public hippocampus datasets, and the experimental results show that LoRA-PT outperformed state-of-the-art PEFT methods in segmentation accuracy while significantly reducing the number of parameter updates. Our source code is available at https://github.com/WangangCheng/LoRA-PT/tree/LoRA-PT.
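The principal/residual split behind LoRA-PT can be illustrated with an ordinary matrix SVD in place of the paper's tensor SVD: keep the top-r singular components as the trainable low-rank part and freeze the remainder. A simplified sketch of the idea, not the t-SVD implementation:

```python
# Split a pretrained weight into a trainable principal part and a frozen residual.
import torch

def principal_residual_split(W: torch.Tensor, r: int):
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    principal = U[:, :r] @ torch.diag(S[:r]) @ Vh[:r]   # fine-tuned
    residual = W - principal                            # kept frozen
    return principal, residual

W = torch.randn(768, 768)                  # stand-in pretrained transformer weight
principal, residual = principal_residual_split(W, r=16)
principal.requires_grad_(True)             # only the low-rank part is updated
effective_W = principal + residual.detach()
```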

Reddy KD, Patil A

PubMed paper · Aug 31, 2025
Chest X-ray (CXR) analysis is a challenging problem in automated medical diagnosis, where complex visual patterns of thoracic diseases must be precisely identified through multi-label classification and lesion localization. Current approaches typically treat classification and localization in isolation, resulting in piecemeal systems that do not exploit shared representations, are often not clinically interpretable, and handle multi-label disease poorly. Although multi-task learning frameworks such as DeepChest and CLN appear to meet this goal, they suffer from task interference and poor explainability, which limits their practical application in real-world clinical workflows. To address these limitations, we present a unified multi-task deep learning framework, CXR-MultiTaskNet, for simultaneously classifying thoracic diseases and localizing lesions in chest X-rays. Our framework comprises a shared ResNet50 feature extractor, two task-specific heads for multi-task learning, and a Grad-CAM-based explainability module that provides accurate predictions while enhancing clinical explainability. We formulate a joint loss that balances the classification and localization objectives, weighting them to counter extreme class imbalance and the varying detectability of different disease manifestations. Deep learning has made promising advances in identifying disease in chest X-ray images, but its performance for complete analysis is limited by a lack of interpretability, inherent weaknesses of convolutional neural networks (CNNs), and pipelines that learn image-level classification before localizing disease. We therefore adopt a dual-attention-based hierarchical feature extraction approach that addresses these challenges. Visual attention maps make the detection steps traceable, so the entire process is more interpretable than a traditional CNN-embedding model. The framework also yields both disease-level and pixel-level predictions, enabling explainable, comprehensive analysis of each image and aiding localization of each detected abnormality. The training objective was further optimized for X-ray images by weighting losses to give higher significance to smaller lesions. Experimental evaluations on a benchmark chest X-ray dataset demonstrate the potential of the proposed approach, which achieves a macro F1-score of 0.965 (micro F1-score 0.968) for disease classification, a mean IoU of 0.851 for disease localization, and a lesion-localization mAP@0.5 of 0.927. These results are consistently better than state-of-the-art single-task and multi-task baselines. The presented framework provides an integrated approach to chest X-ray analysis that is clinically useful, interpretable, and scalable for automation, enabling efficient diagnostic pathways and enhanced clinical decision-making, and can serve as a foundation for next-generation explainable AI in radiology. Keywords: model interpretability; chest X-ray disease detection; detection region localization; weakly supervised transfer learning; lesion localization.
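A minimal sketch of the shared-backbone, two-head design described above: a ResNet50 trunk feeds a multi-label classification head and a coarse per-class heatmap head, trained with a weighted joint loss. Head shapes, the placeholder localization loss, and the loss weights are illustrative assumptions.

```python
# Shared ResNet50 trunk with classification and localization heads.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class CXRMultiTaskNet(nn.Module):
    def __init__(self, n_diseases: int = 14):
        super().__init__()
        trunk = resnet50(weights=None)
        self.backbone = nn.Sequential(*list(trunk.children())[:-2])  # (B, 2048, h, w)
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(2048, n_diseases))
        self.loc_head = nn.Conv2d(2048, n_diseases, kernel_size=1)   # per-class heatmaps

    def forward(self, x):
        feats = self.backbone(x)
        return self.cls_head(feats), self.loc_head(feats)

model = CXRMultiTaskNet()
logits, heatmaps = model(torch.randn(2, 3, 224, 224))
cls_loss = nn.functional.binary_cross_entropy_with_logits(
    logits, torch.zeros_like(logits))
loc_loss = heatmaps.pow(2).mean()          # placeholder localization term
joint = 1.0 * cls_loss + 0.5 * loc_loss    # weighted joint objective
```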