Sort by:
Page 348 of 6646636 results

Nand Kumar Yadav, Rodrigue Rizk, Willium WC Chen, KC

arxiv logopreprintJul 22 2025
Accurate and efficient medical image segmentation is crucial but challenging due to anatomical variability and high computational demands on volumetric data. Recent hybrid CNN-Transformer architectures achieve state-of-the-art results but add significant complexity. In this paper, we propose MLRU++, a Multiscale Lightweight Residual UNETR++ architecture designed to balance segmentation accuracy and computational efficiency. It introduces two key innovations: a Lightweight Channel and Bottleneck Attention Module (LCBAM) that enhances contextual feature encoding with minimal overhead, and a Multiscale Bottleneck Block (M2B) in the decoder that captures fine-grained details via multi-resolution feature aggregation. Experiments on four publicly available benchmark datasets (Synapse, BTCV, ACDC, and Decathlon Lung) demonstrate that MLRU++ achieves state-of-the-art performance, with average Dice scores of 87.57% (Synapse), 93.00% (ACDC), and 81.12% (Lung). Compared to existing leading models, MLRU++ improves Dice scores by 5.38% and 2.12% on Synapse and ACDC, respectively, while significantly reducing parameter count and computational cost. Ablation studies evaluating LCBAM and M2B further confirm the effectiveness of the proposed architectural components. Results suggest that MLRU++ offers a practical and high-performing solution for 3D medical image segmentation tasks. Source code is available at: https://github.com/1027865/MLRUPP

Xiaojiao Xiao, Qinmin Vivian Hu, Guanghui Wang

arxiv logopreprintJul 22 2025
Medical image synthesis plays a crucial role in clinical workflows, addressing the common issue of missing imaging modalities due to factors such as extended scan times, scan corruption, artifacts, patient motion, and intolerance to contrast agents. The paper presents a novel image synthesis network, the Pyramid Hierarchical Masked Diffusion Model (PHMDiff), which employs a multi-scale hierarchical approach for more detailed control over synthesizing high-quality images across different resolutions and layers. Specifically, this model utilizes randomly multi-scale high-proportion masks to speed up diffusion model training, and balances detail fidelity and overall structure. The integration of a Transformer-based Diffusion model process incorporates cross-granularity regularization, modeling the mutual information consistency across each granularity's latent spaces, thereby enhancing pixel-level perceptual accuracy. Comprehensive experiments on two challenging datasets demonstrate that PHMDiff achieves superior performance in both the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), highlighting its capability to produce high-quality synthesized images with excellent structural integrity. Ablation studies further confirm the contributions of each component. Furthermore, the PHMDiff model, a multi-scale image synthesis framework across and within medical imaging modalities, shows significant advantages over other methods. The source code is available at https://github.com/xiaojiao929/PHMDiff

Cedric Zöllner, Simon Reiß, Alexander Jaus, Amroalalaa Sholi, Ralf Sodian, Rainer Stiefelhagen

arxiv logopreprintJul 22 2025
When preoperative planning for surgeries is conducted on the basis of medical images, artificial intelligence methods can support medical doctors during assessment. In this work, we consider medical guidelines for preoperative planning of the transcatheter aortic valve replacement (TAVR) and identify tasks, that may be supported via semantic segmentation models by making relevant anatomical structures measurable in computed tomography scans. We first derive fine-grained TAVR-relevant pseudo-labels from coarse-grained anatomical information, in order to train segmentation models and quantify how well they are able to find these structures in the scans. Furthermore, we propose an adaptation to the loss function in training these segmentation models and through this achieve a +1.27% Dice increase in performance. Our fine-grained TAVR-relevant pseudo-labels and the computed tomography scans we build upon are available at https://doi.org/10.5281/zenodo.16274176.

Lin Xi, Yingliang Ma, Cheng Wang, Sandra Howell, Aldo Rinaldi, Kawal S. Rhode

arxiv logopreprintJul 22 2025
Obtaining pixel-level annotations in the medical domain is both expensive and time-consuming, often requiring close collaboration between clinical experts and developers. Semi-supervised medical image segmentation aims to leverage limited annotated data alongside abundant unlabeled data to achieve accurate segmentation. However, existing semi-supervised methods often struggle to structure semantic distributions in the latent space due to noise introduced by pseudo-labels. In this paper, we propose a novel diffusion-based framework for semi-supervised medical image segmentation. Our method introduces a constraint into the latent structure of semantic labels during the denoising diffusion process by enforcing prototype-based contrastive consistency. Rather than explicitly delineating semantic boundaries, the model leverages class prototypes centralized semantic representations in the latent space as anchors. This strategy improves the robustness of dense predictions, particularly in the presence of noisy pseudo-labels. We also introduce a new publicly available benchmark: Multi-Object Segmentation in X-ray Angiography Videos (MOSXAV), which provides detailed, manually annotated segmentation ground truth for multiple anatomical structures in X-ray angiography videos. Extensive experiments on the EndoScapes2023 and MOSXAV datasets demonstrate that our method outperforms state-of-the-art medical image segmentation approaches under the semi-supervised learning setting. This work presents a robust and data-efficient diffusion model that offers enhanced flexibility and strong potential for a wide range of clinical applications.

Xinyue Yang, Meiliang Liu, Yunfang Xu, Xiaoxiao Yang, Zhengye Si, Zijin Li, Zhiwen Zhao

arxiv logopreprintJul 22 2025
Alzheimer's disease (AD) is a progressive neurodegenerative disorder that predominantly affects the elderly population and currently has no cure. Magnetic Resonance Imaging (MRI), as a non-invasive imaging technique, is essential for the early diagnosis of AD. MRI inherently contains both spatial and frequency information, as raw signals are acquired in the frequency domain and reconstructed into spatial images via the Fourier transform. However, most existing AD diagnostic models extract features from a single domain, limiting their capacity to fully capture the complex neuroimaging characteristics of the disease. While some studies have combined spatial and frequency information, they are mostly confined to 2D MRI, leaving the potential of dual-domain analysis in 3D MRI unexplored. To overcome this limitation, we propose Spatio-Frequency Network (SFNet), the first end-to-end deep learning framework that simultaneously leverages spatial and frequency domain information to enhance 3D MRI-based AD diagnosis. SFNet integrates an enhanced dense convolutional network to extract local spatial features and a global frequency module to capture global frequency-domain representations. Additionally, a novel multi-scale attention module is proposed to further refine spatial feature extraction. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ANDI) dataset demonstrate that SFNet outperforms existing baselines and reduces computational overhead in classifying cognitively normal (CN) and AD, achieving an accuracy of 95.1%.

Sajua GA, Akhib M, Chang Y

pubmed logopapersJul 22 2025
Artificial intelligence (AI)-driven autonomous agents are transforming multiple domains by integrating reasoning, decision-making, and task execution into a unified framework. In medical imaging, such agents have the potential to change workflows by reducing human intervention and optimizing image quality. In this paper, we introduce the AgentMRI. It is an AI-driven system that leverages vision language models (VLMs) for fully autonomous magnetic resonance imaging (MRI) reconstruction in the presence of multiple degradations. Unlike traditional MRI correction or reconstruction methods, AgentMRI does not rely on manual intervention for post-processing or does not rely on fixed correction models. Instead, it dynamically detects MRI corruption and then automatically selects the best correction model for image reconstruction. The framework uses a multi-query VLM strategy to ensure robust corruption detection through consensus-based decision-making and confidence-weighted inference. AgentMRI automatically chooses deep learning models that include MRI reconstruction, motion correction, and denoising models. We evaluated AgentMRI in both zero-shot and fine-tuned settings. Experimental results on a comprehensive brain MRI dataset demonstrate that AgentMRI achieves an average of 73.6% accuracy in zero-shot and 95.1% accuracy for fine-tuned settings. Experiments show that it accurately executes the reconstruction process without human intervention. AgentMRI eliminates manual intervention and introduces a scalable and multimodal AI framework for autonomous MRI processing. This work may build a significant step toward fully autonomous and intelligent MR image reconstruction systems.

Kim JK, Wang MX, Park D, Chang MC

pubmed logopapersJul 22 2025
Accurate assessments of axial vertebral rotation (AVR) is essential for managing idiopathic scoliosis. The Nash-Moe classification method has been extensively used for AVR assessment; however, its subjective nature can lead to measurement variability. Therefore, herein, we propose an automated deep learning (DL) model for AVR assessment based on posteroanterior spinal radiographs. We develop a two-stage DL framework using the MMRotate toolbox and analyze 1080 posteroanterior spinal radiographs of patients aged 4-18 years. The framework comprises a vertebra detection model (864 training and 216 validation images) and a pedicle detection model (14,608 training and 3652 validation images). We improved the Nash-Moe classification method by implementing a 12-segment division system and width ratio metric for precise pedicle assessment. The vertebra and pedicle detection models achieved mean average precision values of 0.909 and 0.905, respectively. The overall classification accuracy was 0.74, with grade-specific performance between 0.70 and 1.00 for precision and 0.33 and 0.93 for recall across Grades 0-3. The proposed DL framework processed complete posteroanterior radiographs in < 5 s per case compared with conventional manual measurements (114 s per radiograph). The best performance was observed in mild to moderate rotation cases, with performance in severe rotation cases limited by insufficient data. The implementation of DL framework for the automated Nash-Moe classification method exhibited satisfactory accuracy and exceptional efficiency. However, this study is limited by low recall (0.33) for Grade 3 and the inability to classify Grade 4 towing to dataset constraints. Further validation using augmented datasets that include severe rotation cases is necessary.

Nebbia G, Kumar S, McNamara SM, Bridge C, Campbell JP, Chiang MF, Mandava N, Singh P, Kalpathy-Cramer J

pubmed logopapersJul 22 2025
Foundation models for medical imaging are a prominent research topic, but risks associated with the imaging features they can capture have not been explored. We aimed to assess whether imaging features from foundation models enable patient re-identification and to relate re-identification to demographic features prediction. Our data included Colour Fundus Photos (CFP), Optical Coherence Tomography (OCT) b-scans, and chest x-rays and we reported re-identification rates of 40.3%, 46.3%, and 25.9%, respectively. We reported varying performance on demographic features prediction depending on re-identification status (e.g., AUC-ROC for gender from CFP is 82.1% for re-identified images vs. 76.8% for non-re-identified ones). When training a deep learning model on the re-identification task, we reported performance of 82.3%, 93.9%, and 63.7% at image level on our internal CFP, OCT, and chest x-ray data. We showed that imaging features extracted from foundation models in ophthalmology and radiology include information that can lead to patient re-identification.

Zhao W, Chen W, Fan L, Shang Y, Wang Y, Situ W, Li W, Liu T, Yuan Y, Liu J

pubmed logopapersJul 22 2025
Liver multiphase enhanced computed tomography (MPECT) is vital in clinical practice, but its utility is limited by various factors. We aimed to develop a deep learning network capable of automatically generating MPECT images from standard non-contrast CT scans. Dataset 1 included 374 patients and was divided into three parts: a training set, a validation set and a test set. Dataset 2 included 144 patients with one specific liver disease and was used as an internal test dataset. We further collected another dataset comprising 83 patients for external validation. Then, we propose a Mask-Adaptive Normalization-based Generative Adversarial Network with Cycle-Consistency Loss (MAN-GAN) to achieve non-contrast CT to MPECT translation. To assess the efficiency of MAN-GAN, we conducted a comparative analysis with state-of-the-art methods commonly employed in diverse medical image synthesis tasks. Moreover, two subjective radiologist evaluation studies were performed to verify the clinical usefulness of the generated images. MAN-GAN outperformed the baseline network and other state-of-the-art methods in all generations of the three phases. These results were verified in internal and external datasets. According to radiological evaluation, the image quality of generated three phase images are all above average. Moreover, the similarities between real images and generated images in all three phases are satisfactory. MAN-GAN demonstrates the feasibility of liver MPECT image translation based on non-contrast images and achieves state-of-the-art performance via the subtraction strategy. It has great potential for solving the dilemma of liver CT contrast canning and aiding further liver interaction clinical scenarios.

Masayoshi K, Hashimoto M, Toda N, Mori H, Kobayashi G, Haque H, So M, Jinzaki M

pubmed logopapersJul 22 2025
Ultrasound examinations, while valuable, are time-consuming and often limited in availability. Consequently, many hospitals implement reservation systems; however, these systems typically lack prioritization for examination purposes. Hence, our hospital uses a waitlist system that prioritizes examination requests based on their clinical value when slots become available due to cancellations. This system, however, requires a manual review of examination purposes, which are recorded in free-form text. We hypothesized that artificial intelligence language models could preliminarily estimate the priority of requests before manual reviews. This study aimed to investigate potential challenges associated with using language models for estimating the priority of medical examination requests and to evaluate the performance of language models in processing Japanese medical texts. We retrospectively collected ultrasound examination requests from the waitlist system at Keio University Hospital, spanning January 2020 to March 2023. Each request comprised an examination purpose documented by the requesting physician and a 6-tier priority level assigned by a radiologist during the clinical workflow. We fine-tuned JMedRoBERTa, Luke, OpenCalm, and LLaMA2 under two conditions: (1) tuning only the final layer and (2) tuning all layers using either standard backpropagation or low-rank adaptation. We had 2335 and 204 requests in the training and test datasets post cleaning. When only the final layers were tuned, JMedRoBERTa outperformed the other models (Kendall coefficient=0.225). With full fine-tuning, JMedRoBERTa continued to perform best (Kendall coefficient=0.254), though with reduced margins compared with the other models. The radiologist's retrospective re-evaluation yielded a Kendall coefficient of 0.221. Language models can estimate the priority of examination requests with accuracy comparable with that of human radiologists. The fine-tuning results indicate that general-purpose language models can be adapted to domain-specific texts (ie, Japanese medical texts) with sufficient fine-tuning. Further research is required to address priority rank ambiguity, expand the dataset across multiple institutions, and explore more recent language models with potentially higher performance or better suitability for this task.
Page 348 of 6646636 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.