Page 29 of 81808 results

HNOSeg-XS: Extremely Small Hartley Neural Operator for Efficient and Resolution-Robust 3D Image Segmentation.

Wong KCL, Wang H, Syeda-Mahmood T

PubMed papers, Jul 11 2025
In medical image segmentation, convolutional neural networks (CNNs) and transformers are dominant. For CNNs, given the local receptive fields of convolutional layers, long-range spatial correlations are captured through consecutive convolutions and pooling. However, as the computational cost and memory footprint can be prohibitively large, 3D models can only afford fewer layers than 2D models, with reduced receptive fields and abstraction levels. For transformers, although long-range correlations can be captured by multi-head attention, its quadratic complexity with respect to input size is computationally demanding. Therefore, either model may require input size reduction to allow more filters and layers for better segmentation. Nevertheless, given their discrete nature, models trained with patch-wise training or image downsampling may produce suboptimal results when applied to higher resolutions. To address this issue, here we propose the resolution-robust HNOSeg-XS architecture. We model image segmentation by learnable partial differential equations through the Fourier neural operator, which has the zero-shot super-resolution property. By replacing the Fourier transform with the Hartley transform and reformulating the problem in the frequency domain, we created the HNOSeg-XS model, which is resolution robust, fast, memory efficient, and extremely parameter efficient. When tested on the BraTS'23, KiTS'23, and MVSeg'23 datasets with a Tesla V100 GPU, HNOSeg-XS showed its superior resolution robustness with fewer than 34.7k model parameters. It also achieved the overall best inference time (< 0.24 s) and memory efficiency (< 1.8 GiB) compared to the tested CNN and transformer models.
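The Hartley transform named in this abstract is a real-valued relative of the Fourier transform. The following is a minimal sketch, assuming the standard 1D discrete Hartley transform with the cas kernel (cos + sin). It is not the authors' code, and the names dht and idht are illustrative. It only shows two properties that make the transform attractive here: real input gives real output, and the transform inverts itself up to a factor of 1/N.

```python
import math


def dht(x):
    """Naive O(N^2) discrete Hartley transform of a real sequence."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for j, value in enumerate(x):
            angle = 2.0 * math.pi * k * j / n
            # cas(angle) = cos(angle) + sin(angle) is the Hartley kernel.
            acc += value * (math.cos(angle) + math.sin(angle))
        out.append(acc)
    return out


def idht(h):
    """Inverse transform: the same sum as dht, scaled by 1/N."""
    n = len(h)
    return [value / n for value in dht(h)]


signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0]
spectrum = dht(signal)      # real-valued coefficients, no complex numbers
recovered = idht(spectrum)  # matches signal up to floating-point error
```

A practical implementation would use a fast algorithm and work on 3D volumes; the quadratic loop above is kept only for readability.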

[MP-MRI in the evaluation of non-operative treatment response, for residual and recurrent tumor detection in head and neck cancer].

Gődény M

PubMed papers, Jul 11 2025
As non-surgical therapies gain acceptance in head and neck tumors, the importance of imaging has increased. New therapeutic methods (in radiation therapy, targeted biological therapy, immunotherapy) need better tumor characterization and prognostic information along with accurate anatomy. Magnetic resonance imaging (MRI) has become the gold standard in head and neck cancer evaluation, not only for staging but also for assessing tumor response, posttreatment status and complications, as well as for finding residual or recurrent tumor. Multiparametric anatomical and functional MRI (MP-MRI) is a true cancer imaging biomarker: in addition to high-resolution tumor anatomy, it provides molecular and functional, qualitative and quantitative data using diffusion-weighted MRI (DW-MRI) and perfusion-dynamic contrast enhanced MRI (P-DCE-MRI), which can improve the assessment of biological target volume and help determine treatment response. DW-MRI provides information at the cellular level about the cell density and the integrity of the plasma membrane, based on water movement. P-DCE-MRI provides useful hemodynamic information about tissue vascularity and vascular permeability. Recent studies have shown promising results using radiomics features. MP-MRI has opened new perspectives in oncologic imaging, making better use of the latest technological advances with the help of artificial intelligence.

Oriented tooth detection: a CBCT image processing method integrated with RoI transformer.

Zhao Z, Wu B, Su S, Liu D, Wu Z, Gao R, Zhang N

PubMed papers, Jul 11 2025
Cone beam computed tomography (CBCT) has revolutionized dental imaging due to its high spatial resolution and ability to provide detailed three-dimensional reconstructions of dental structures. This study introduces an innovative CBCT image processing method using an oriented object detection approach integrated with a Region of Interest (RoI) Transformer. This study addresses the challenge of accurate tooth detection and classification in PAN derived from CBCT, introducing an innovative oriented object detection approach, which has not been previously applied in dental imaging. This method better aligns with the natural growth patterns of teeth, allowing for more accurate detection and classification of molars, premolars, canines, and incisors. By integrating RoI transformer, the model demonstrates relatively acceptable performance metrics compared to conventional horizontal detection methods, while also offering enhanced visualization capabilities. Furthermore, post-processing techniques, including distance and grayscale value constraints, are employed to correct classification errors and reduce false positives, especially in areas with missing teeth. The experimental results indicate that the proposed method achieves an accuracy of 98.48%, a recall of 97.21%, an F1 score of 97.21%, and an mAP of 98.12% in tooth detection. The proposed method enhances the accuracy of tooth detection in CBCT-derived PAN by reducing background interference and improving the visualization of tooth orientation.

Implementing Large Language Models in Health Care: Clinician-Focused Review With Interactive Guideline.

Li H, Fu JF, Python A

PubMed papers, Jul 11 2025
Large language models (LLMs) can generate outputs understandable by humans, such as answers to medical questions and radiology reports. With the rapid development of LLMs, clinicians face a growing challenge in determining the most suitable algorithms to support their work. We aimed to provide clinicians and other health care practitioners with systematic guidance in selecting an LLM that is relevant and appropriate to their needs and facilitate the integration process of LLMs in health care. We conducted a literature search of full-text publications in English on clinical applications of LLMs published between January 1, 2022, and March 31, 2025, on PubMed, ScienceDirect, Scopus, and IEEE Xplore. We excluded papers from journals below a set citation threshold, as well as papers that did not focus on LLMs, were not research based, or did not involve clinical applications. We also conducted a literature search on arXiv within the same investigated period and included papers on the clinical applications of innovative multimodal LLMs. This led to a total of 270 studies. We collected 330 LLMs and recorded their application frequency in clinical tasks and frequency of best performance in their context. On the basis of a 5-stage clinical workflow, we found that stages 2, 3, and 4 are key stages in the clinical workflow, involving numerous clinical subtasks and LLMs. However, the diversity of LLMs that may perform optimally in each context remains limited. GPT-3.5 and GPT-4 were the most versatile models in the 5-stage clinical workflow, applied to 52% (29/56) and 71% (40/56) of the clinical subtasks, respectively, and they performed best in 29% (16/56) and 54% (30/56) of the clinical subtasks, respectively. General-purpose LLMs may not perform well in specialized areas as they often require lightweight prompt engineering methods or fine-tuning techniques based on specific datasets to improve model performance. 
Most LLMs with multimodal abilities are closed-source models and therefore lack transparency, model customization, and fine-tuning options for specific clinical tasks; they may also pose challenges regarding data protection and privacy, which are common requirements in clinical settings. In this review, we found that LLMs may help clinicians in a variety of clinical tasks. However, we did not find evidence of generalist clinical LLMs successfully applicable to a wide range of clinical tasks. Therefore, their clinical deployment remains challenging. On the basis of this review, we propose an interactive online guideline for clinicians to select suitable LLMs by clinical task. With a clinical perspective and free of unnecessary technical jargon, this guideline may be used as a reference to successfully apply LLMs in clinical settings.

Raptor: Scalable Train-Free Embeddings for 3D Medical Volumes Leveraging Pretrained 2D Foundation Models

Ulzee An, Moonseong Jeong, Simon A. Lee, Aditya Gorla, Yuzhe Yang, Sriram Sankararaman

arXiv preprint, Jul 11 2025
Current challenges in developing foundational models for volumetric imaging data, such as magnetic resonance imaging (MRI), stem from the computational complexity of training state-of-the-art architectures in high dimensions and curating sufficiently large datasets of volumes. To address these challenges, we introduce Raptor (Random Planar Tensor Reduction), a train-free method for generating semantically rich embeddings for volumetric data. Raptor leverages a frozen 2D foundation model, pretrained on natural images, to extract visual tokens from individual cross-sections of medical volumes. These tokens are then spatially compressed using random projections, significantly reducing computational complexity while retaining semantic information. Extensive experiments on ten diverse medical volume tasks verify the superior performance of Raptor over state-of-the-art methods, including those pretrained exclusively on medical volumes (+3% SuPreM, +6% MISFM, +10% Merlin, +13% VoCo, and +14% SLIViT), while entirely bypassing the need for costly training. Our results highlight the effectiveness and versatility of Raptor as a foundation for advancing deep learning-based methods for medical volumes.
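The random projection step mentioned in this abstract can be illustrated with a small sketch. It assumes a plain Gaussian random projection applied to each token vector. The abstract does not specify which axis Raptor compresses or how slices are aggregated, so none of that is modeled here. All names (random_projection_matrix, project, the toy token sizes) are hypothetical, and the random tokens merely stand in for the output of a frozen 2D model.

```python
import math
import random


def random_projection_matrix(in_dim, out_dim, seed=0):
    """Fixed Gaussian matrix, scaled so projected norms stay comparable."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(tokens, matrix):
    """Map each token vector to a lower dimension with one matrix product."""
    return [[sum(w * t for w, t in zip(row, token)) for row in matrix]
            for token in tokens]


# Toy stand-in for tokens from a frozen 2D model: 5 tokens of dimension 64.
rng = random.Random(1)
tokens = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(5)]

matrix = random_projection_matrix(in_dim=64, out_dim=8)
compressed = project(tokens, matrix)  # 5 tokens of dimension 8
```

The matrix is drawn once and never trained, which is the sense in which such a compression step is train-free.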

Cycle Context Verification for In-Context Medical Image Segmentation

Shishuai Hu, Zehui Liao, Liangli Zhen, Huazhu Fu, Yong Xia

arXiv preprint, Jul 11 2025
In-context learning (ICL) is emerging as a promising technique for achieving universal medical image segmentation, where a variety of objects of interest across imaging modalities can be segmented using a single model. Nevertheless, its performance is highly sensitive to the alignment between the query image and in-context image-mask pairs. In a clinical scenario, the scarcity of annotated medical images makes it challenging to select optimal in-context pairs, and fine-tuning foundation ICL models on contextual data is infeasible due to computational costs and the risk of catastrophic forgetting. To address this challenge, we propose Cycle Context Verification (CCV), a novel framework that enhances ICL-based medical image segmentation by enabling self-verification of predictions and accordingly enhancing contextual alignment. Specifically, CCV employs a cyclic pipeline in which the model initially generates a segmentation mask for the query image. Subsequently, the roles of the query and an in-context pair are swapped, allowing the model to validate its prediction by predicting the mask of the original in-context image. The accuracy of this secondary prediction serves as an implicit measure of the initial query segmentation. A query-specific prompt is introduced to alter the query image and updated to improve the measure, thereby enhancing the alignment between the query and in-context pairs. We evaluated CCV on seven medical image segmentation datasets using two ICL foundation models, demonstrating its superiority over existing methods. Our results highlight CCV's ability to enhance ICL-based segmentation, making it a robust solution for universal medical image segmentation. The code will be available at https://github.com/ShishuaiHu/CCV.

RadiomicsRetrieval: A Customizable Framework for Medical Image Retrieval Using Radiomics Features

Inye Na, Nejung Rue, Jiwon Chung, Hyunjin Park

arXiv preprint, Jul 11 2025
Medical image retrieval is a valuable field for supporting clinical decision-making, yet current methods primarily support 2D images and require fully annotated queries, limiting clinical flexibility. To address this, we propose RadiomicsRetrieval, a 3D content-based retrieval framework bridging handcrafted radiomics descriptors with deep learning-based embeddings at the tumor level. Unlike existing 2D approaches, RadiomicsRetrieval fully exploits volumetric data to leverage richer spatial context in medical images. We employ a promptable segmentation model (e.g., SAM) to derive tumor-specific image embeddings, which are aligned with radiomics features extracted from the same tumor via contrastive learning. These representations are further enriched by anatomical positional embedding (APE). As a result, RadiomicsRetrieval enables flexible querying based on shape, location, or partial feature sets. Extensive experiments on both lung CT and brain MRI public datasets demonstrate that radiomics features significantly enhance retrieval specificity, while APE provides global anatomical context essential for location-based searches. Notably, our framework requires only minimal user prompts (e.g., a single point), minimizing segmentation overhead and supporting diverse clinical scenarios. The capability to query using either image embeddings or selected radiomics attributes highlights its adaptability, potentially benefiting diagnosis, treatment planning, and research on large-scale medical imaging repositories. Our code is available at https://github.com/nainye/RadiomicsRetrieval.

Objective assessment of diagnostic image quality in CT scans: what radiologists and researchers need to know.

Hoeijmakers EJI, Martens B, Wildberger JE, Flohr TG, Jeukens CRLPN

PubMed papers, Jul 10 2025
Quantifying diagnostic image quality (IQ) is not straightforward but essential for optimizing the balance between IQ and radiation dose, and for ensuring consistent high-quality images in CT imaging. This review provides a comprehensive overview of advanced objective reference-free IQ assessment methods for CT scans, beyond standard approaches. A literature search was performed in PubMed and Web of Science up to June 2024 to identify studies using advanced objective image quality methods on clinical CT scans. Only reference-free methods, which do not require a predefined reference image, were included. Traditional methods relying on the standard deviation of the Hounsfield units, the signal-to-noise ratio or contrast-to-noise ratio, all within a manually selected region-of-interest, were excluded. Eligible results were categorized by IQ metric (i.e., noise, contrast, spatial resolution and other) and assessment method (manual, automated, and artificial intelligence (AI)-based). Thirty-five studies were included that proposed or employed reference-free IQ methods, identifying 12 noise assessment methods, 4 contrast assessment methods, 14 spatial resolution assessment methods and 7 others, based on manual, automated or AI-based approaches. This review emphasizes the transition from manual to fully automated approaches for IQ assessment, including the potential of AI-based methods, and it provides a reference tool for researchers and radiologists who need to make a well-considered choice in how to evaluate IQ in CT imaging. This review examines the challenge of quantifying diagnostic CT image quality, essential for optimization studies and ensuring consistent high-quality images, by providing an overview of objective reference-free diagnostic image quality assessment methods beyond standard methods. Quantifying diagnostic CT image quality remains a key challenge. This review summarizes objective diagnostic image quality assessment techniques beyond standard metrics. 
A decision tree is provided to help select optimal image quality assessment techniques.

Hierarchical deep learning system for orbital fracture detection and trap-door classification on CT images.

Oku H, Nakamura Y, Kanematsu Y, Akagi A, Kinoshita S, Sotozono C, Koizumi N, Watanabe A, Okumura N

PubMed papers, Jul 10 2025
To develop and evaluate a hierarchical deep learning system that detects orbital fractures on computed tomography (CT) images and classifies them as depressed or trap-door types. A retrospective diagnostic accuracy study analyzing CT images from patients with confirmed orbital fractures. We collected CT images from 686 patients with orbital fractures treated at a single institution (2010-2025), resulting in 46,013 orbital CT slices. After preprocessing, 7809 slices were selected as regions of interest and partitioned into training (6508 slices) and test (1301 slices) datasets. Our hierarchical approach consisted of a first-stage classifier (YOLOv8) for fracture detection and a second-stage classifier (Vision Transformer) for distinguishing depressed from trap-door fractures. Performance was evaluated at both slice and patient levels, focusing on accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC-ROC). For fracture detection, YOLOv8 achieved a slice-level sensitivity of 80.4% and specificity of 79.2%, with patient-level performance improving to 94.7% sensitivity and 90.0% specificity. For fracture classification, Vision Transformer demonstrated a slice-level sensitivity of 91.5% and specificity of 83.5% for trap-door and depressed fractures, with patient-level metrics of 100% sensitivity and 88.9% specificity. The complete system correctly identified 18/20 no-fracture cases, 35/40 depressed fracture cases, and 15/17 trap-door fracture cases. Our hierarchical deep learning system effectively detects orbital fractures and distinguishes between depressed and trap-door types with high accuracy. This approach could aid in the timely identification of trap-door fractures requiring urgent surgical intervention, particularly in settings lacking specialized expertise.
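The per-class counts reported for the complete system can be turned into rates with simple arithmetic. The sketch below uses only the counts and slice numbers stated in the abstract. The per-class rates, the overall accuracy, and the test fraction are derived here and are not figures reported by the authors; the variable names are illustrative.

```python
# Per-class results of the complete system as reported: (correct, total).
reported = {
    "no fracture": (18, 20),
    "depressed": (35, 40),
    "trap-door": (15, 17),
}

# Fraction of correctly identified cases in each class.
per_class = {name: correct / total
             for name, (correct, total) in reported.items()}

# Overall accuracy across all 77 patient-level cases (derived, not reported).
n_correct = sum(correct for correct, _ in reported.values())
n_total = sum(total for _, total in reported.values())
overall_accuracy = n_correct / n_total

# Consistency check on the slice split: training plus test slices
# should equal the number of selected region-of-interest slices.
train_slices, test_slices, roi_slices = 6508, 1301, 7809
split_is_consistent = train_slices + test_slices == roi_slices
test_fraction = test_slices / roi_slices

for name, rate in per_class.items():
    print(f"{name}: {rate:.1%}")
print(f"overall: {n_correct}/{n_total} = {overall_accuracy:.1%}")
print(f"test fraction: {test_fraction:.1%}")
```

The 18/20 figure for no-fracture cases agrees with the 90.0% patient-level specificity quoted for the detection stage.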

Compressive Imaging Reconstruction via Tensor Decomposed Multi-Resolution Grid Encoding

Zhenyu Jin, Yisi Luo, Xile Zhao, Deyu Meng

arXiv preprint, Jul 10 2025
Compressive imaging (CI) reconstruction, such as snapshot compressive imaging (SCI) and compressive sensing magnetic resonance imaging (MRI), aims to recover high-dimensional images from low-dimensional compressed measurements. This process critically relies on learning an accurate representation of the underlying high-dimensional image. However, existing unsupervised representations may struggle to achieve a desired balance between representation ability and efficiency. To overcome this limitation, we propose Tensor Decomposed multi-resolution Grid encoding (GridTD), an unsupervised continuous representation framework for CI reconstruction. GridTD optimizes a lightweight neural network and the input tensor decomposition model whose parameters are learned via multi-resolution hash grid encoding. It inherently enjoys the hierarchical modeling ability of multi-resolution grid encoding and the compactness of tensor decomposition, enabling effective and efficient reconstruction of high-dimensional images. Theoretical analyses for the algorithm's Lipschitz property, generalization error bound, and fixed-point convergence reveal the intrinsic superiority of GridTD as compared with existing continuous representation models. Extensive experiments across diverse CI tasks, including video SCI, spectral SCI, and compressive dynamic MRI reconstruction, consistently demonstrate the superiority of GridTD over existing methods, positioning GridTD as a versatile and state-of-the-art CI reconstruction method.