
Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification

Xing Shen, Justin Szeto, Mingyang Li, Hengguan Huang, Tal Arbel

arXiv preprint, Jun 29 2025
Multimodal large language models (MLLMs) have enormous potential to perform few-shot in-context learning in the context of medical image analysis. However, safe deployment of these models into real-world clinical practice requires an in-depth analysis of the accuracies of their predictions, and their associated calibration errors, particularly across different demographic subgroups. In this work, we present the first investigation into the calibration biases and demographic unfairness of MLLMs' predictions and confidence scores in few-shot in-context learning for medical image classification. We introduce CALIN, an inference-time calibration method designed to mitigate the associated biases. Specifically, CALIN estimates the amount of calibration needed, represented by calibration matrices, using a bi-level procedure: progressing from the population level to the subgroup level prior to inference. It then applies this estimation to calibrate the predicted confidence scores during inference. Experimental results on three medical imaging datasets: PAPILA for fundus image classification, HAM10000 for skin cancer classification, and MIMIC-CXR for chest X-ray classification demonstrate CALIN's effectiveness at ensuring fair confidence calibration in its prediction, while improving its overall prediction accuracies and exhibiting minimum fairness-utility trade-off.
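As a rough illustration of what measuring calibration error per demographic subgroup can look like, the sketch below computes a binned expected calibration error (ECE) for each group. ECE is a common calibration metric, but the abstract does not say which metric the authors use, and this is not CALIN itself; the function names, the 10-bin default, and the toy inputs are assumptions.

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index in 0..n_bins-1; bins are (lower, upper], so 1.0 lands in the last bin.
    idx = np.clip(np.digitize(conf, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

def ece_by_subgroup(conf, correct, group, n_bins=10):
    """ECE computed separately for each value of a (hypothetical) group label."""
    conf, correct, group = map(np.asarray, (conf, correct, group))
    return {
        g.item(): expected_calibration_error(conf[group == g], correct[group == g], n_bins)
        for g in np.unique(group)
    }

if __name__ == "__main__":
    conf = [0.9, 0.9, 0.9, 0.9]
    correct = [1, 1, 1, 0]
    group = ["A", "A", "B", "B"]
    print(ece_by_subgroup(conf, correct, group))
```

A large gap between the per-group values is one way a calibration bias across subgroups could show up, even when the pooled ECE looks acceptable.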

Hierarchical Corpus-View-Category Refinement for Carotid Plaque Risk Grading in Ultrasound

Zhiyuan Zhu, Jian Wang, Yong Jiang, Tong Han, Yuhao Huang, Ang Zhang, Kaiwen Yang, Mingyuan Luo, Zhe Liu, Yaofei Duan, Dong Ni, Tianhong Tang, Xin Yang

arXiv preprint, Jun 29 2025
Accurate carotid plaque grading (CPG) is vital to assess the risk of cardiovascular and cerebrovascular diseases. Due to the small size and high intra-class variability of plaque, CPG is commonly evaluated using a combination of transverse and longitudinal ultrasound views in clinical practice. However, most existing deep learning-based multi-view classification methods focus on feature fusion across different views, neglecting the importance of representation learning and the difference in class features. To address these issues, we propose a novel Corpus-View-Category Refinement Framework (CVC-RF) that processes information at the Corpus, View, and Category levels, enhancing model performance. Our contribution is four-fold. First, to the best of our knowledge, ours is the first deep learning-based method for CPG that follows the latest Carotid Plaque-RADS guidelines. Second, we propose a novel center-memory contrastive loss, which enhances the network's global modeling capability by comparing with representative cluster centers and diverse negative samples at the Corpus level. Third, we design a cascaded down-sampling attention module to fuse multi-scale information and achieve implicit feature interaction at the View level. Finally, a parameter-free mixture-of-experts weighting strategy is introduced to leverage class clustering knowledge to weight different experts, enabling feature decoupling at the Category level. Experimental results indicate that CVC-RF effectively models global features via multi-level refinement, achieving state-of-the-art performance in the challenging CPG task.

MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation

Sunggu Kyung, Jinyoung Seo, Hyunseok Lim, Dongyeong Kim, Hyungbin Park, Jimin Sung, Jihyun Kim, Wooyoung Jo, Yoojin Nam, Namkug Kim

arXiv preprint, Jun 29 2025
The recent release of RadGenome-Chest CT has significantly advanced CT-based report generation. However, existing methods primarily focus on global features, making it challenging to capture region-specific details, which may cause certain abnormalities to go unnoticed. To address this, we propose MedRegion-CT, a region-focused Multi-Modal Large Language Model (MLLM) framework, featuring three key innovations. First, we introduce Region Representative ($R^2$) Token Pooling, which utilizes a 2D pretrained vision model to efficiently extract 3D CT features. This approach generates global tokens representing overall slice features and region tokens highlighting target areas, enabling the MLLM to process comprehensive information effectively. Second, a universal segmentation model generates pseudo-masks, which are then processed by a mask encoder to extract region-centric features. This allows the MLLM to focus on clinically relevant regions, using six predefined region masks. Third, we leverage segmentation results to extract patient-specific attributes, including organ size, diameter, and location. These are converted into text prompts, enriching the MLLM's understanding of patient-specific contexts. To ensure rigorous evaluation, we conducted benchmark experiments on report generation using RadGenome-Chest CT. MedRegion-CT achieved state-of-the-art performance, outperforming existing methods in natural language generation quality and clinical relevance while maintaining interpretability. The code for our framework is publicly available.
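The third step, turning segmentation results into text, can be pictured with a small sketch. It derives volume, axis-aligned extent, and centroid from a binary 3D mask and formats them as a sentence. The function names, the choice of attributes, and the prompt wording are all hypothetical; the abstract does not specify how MedRegion-CT computes or phrases these attributes.

```python
import numpy as np

def region_attributes(mask, spacing_mm):
    """Volume, axis-aligned extent and centroid of a binary 3D mask.

    spacing_mm is the voxel size along each array axis, in millimetres.
    """
    mask = np.asarray(mask, dtype=bool)
    spacing = np.asarray(spacing_mm, dtype=float)
    coords = np.argwhere(mask)  # (n_voxels, 3) voxel indices
    if coords.size == 0:
        return None  # empty mask: nothing to describe
    volume_ml = mask.sum() * spacing.prod() / 1000.0  # mm^3 -> mL
    extent_mm = (coords.max(axis=0) - coords.min(axis=0) + 1) * spacing
    centroid_mm = coords.mean(axis=0) * spacing
    return {
        "volume_ml": float(volume_ml),
        "extent_mm": extent_mm,
        "centroid_mm": centroid_mm,
    }

def to_prompt(name, attrs):
    """Format the attributes as text (hypothetical wording)."""
    ext = " x ".join(f"{e:.0f}" for e in attrs["extent_mm"])
    return f"{name}: volume {attrs['volume_ml']:.1f} mL, extent {ext} mm"

if __name__ == "__main__":
    mask = np.zeros((20, 20, 20), dtype=bool)
    mask[5:15, 5:15, 5:15] = True
    print(to_prompt("liver", region_attributes(mask, (1.0, 1.0, 2.0))))
```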

Frequency-enhanced Multi-granularity Context Network for Efficient Vertebrae Segmentation

Jian Shi, Tianqi You, Pingping Zhang, Hongli Zhang, Rui Xu, Haojie Li

arXiv preprint, Jun 29 2025
Automated and accurate segmentation of individual vertebrae in 3D CT and MRI images is essential for various clinical applications. Due to the limitations of current imaging techniques and the complexity of spinal structures, existing methods still struggle with reducing the impact of image blurring and distinguishing similar vertebrae. To alleviate these issues, we introduce a Frequency-enhanced Multi-granularity Context Network (FMC-Net) to improve the accuracy of vertebrae segmentation. Specifically, we first apply wavelet transform for lossless downsampling to reduce the feature distortion in blurred images. The decomposed high and low-frequency components are then processed separately. For the high-frequency components, we apply a High-frequency Feature Refinement (HFR) to amplify the prominence of key features and filter out noise, restoring fine-grained details in blurred images. For the low-frequency components, we use a Multi-granularity State Space Model (MG-SSM) to aggregate feature representations with different receptive fields, extracting spatially-varying contexts while capturing long-range dependencies with linear complexity. The utilization of multi-granularity contexts is essential for distinguishing similar vertebrae and improving segmentation accuracy. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches on both CT and MRI vertebrae segmentation datasets. The source code is publicly available at https://github.com/anaanaa/FMCNet.
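To make "lossless downsampling" with a wavelet transform concrete, here is a minimal sketch using a one-level 2D Haar transform: the input is split into one low-frequency band and three high-frequency bands at half resolution, and the inverse transform recovers the input exactly. The Haar wavelet and the 2D setting are assumptions chosen for brevity; the abstract does not state which wavelet FMC-Net uses, and the linked repository is the authoritative source.

```python
import numpy as np

def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (even height and width)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    low = (a + b + c + d) / 2.0          # low-frequency (average) band
    d_cols = (a - b + c - d) / 2.0       # detail across columns
    d_rows = (a + b - c - d) / 2.0       # detail across rows
    d_diag = (a - b - c + d) / 2.0       # diagonal detail
    return low, d_cols, d_rows, d_diag

def haar_idwt2(low, d_cols, d_rows, d_diag):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (low + d_cols + d_rows + d_diag) / 2.0
    x[0::2, 1::2] = (low - d_cols + d_rows - d_diag) / 2.0
    x[1::2, 0::2] = (low + d_cols - d_rows - d_diag) / 2.0
    x[1::2, 1::2] = (low - d_cols - d_rows + d_diag) / 2.0
    return x

if __name__ == "__main__":
    img = np.random.default_rng(0).random((8, 8))
    bands = haar_dwt2(img)
    print(bands[0].shape, np.allclose(haar_idwt2(*bands), img))
```

Unlike strided convolution or pooling, nothing is discarded here: the four half-resolution bands together hold the same information as the input, which is what allows the high- and low-frequency parts to be processed separately.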

CA-Diff: Collaborative Anatomy Diffusion for Brain Tissue Segmentation

Qilong Xing, Zikai Song, Yuteng Ye, Yuke Chen, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang

arXiv preprint, Jun 28 2025
Segmentation of brain structures from MRI is crucial for evaluating brain morphology, yet existing CNN and transformer-based methods struggle to delineate complex structures accurately. While current diffusion models have shown promise in image segmentation, they are inadequate when applied directly to brain MRI because they neglect anatomical information. To address this, we propose Collaborative Anatomy Diffusion (CA-Diff), a framework integrating spatial anatomical features to enhance segmentation accuracy of the diffusion model. Specifically, we introduce a distance field as an auxiliary anatomical condition to provide global spatial context, alongside a collaborative diffusion process to model its joint distribution with anatomical structures, enabling effective utilization of anatomical features for segmentation. Furthermore, we introduce a consistency loss to refine relationships between the distance field and anatomical structures and design a time-adapted channel attention module to enhance the U-Net feature fusion procedure. Extensive experiments show that CA-Diff outperforms state-of-the-art (SOTA) methods.

Hierarchical Characterization of Brain Dynamics via State Space-based Vector Quantization

Yanwu Yang, Thomas Wolfers

arXiv preprint, Jun 28 2025
Understanding brain dynamics through functional Magnetic Resonance Imaging (fMRI) remains a fundamental challenge in neuroscience, particularly in capturing how the brain transitions between various functional states. Recently, metastability, which refers to temporarily stable brain states, has offered a promising paradigm to quantify complex brain signals into interpretable, discretized representations. In particular, compared to cluster-based machine learning approaches, tokenization approaches leveraging vector quantization have shown promise in representation learning with powerful reconstruction and predictive capabilities. However, most existing methods ignore brain transition dependencies and lack a quantification of brain dynamics into representative and stable embeddings. In this study, we propose a Hierarchical State space-based Tokenization network, termed HST, which quantizes brain states and transitions in a hierarchical structure based on a state space-based model. We introduce a refined clustered Vector-Quantization Variational AutoEncoder (VQ-VAE) that incorporates quantization error feedback and clustering to improve quantization performance while facilitating metastability with representative and stable token representations. We validate our HST on two public fMRI datasets, demonstrating its effectiveness in quantifying the hierarchical dynamics of the brain and its potential in disease diagnosis and reconstruction performance. Our method offers a promising framework for the characterization of brain dynamics, facilitating the analysis of metastability.
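The core operation behind vector-quantization tokenizers can be sketched in a few lines: each embedding is replaced by its nearest codebook entry, and the index of that entry becomes the discrete token. This shows only the generic nearest-neighbour lookup, not HST's hierarchical structure, state space model, or error feedback; the function name and toy codebook are assumptions.

```python
import numpy as np

def quantize(z, codebook):
    """Map each row of z (n, d) to its nearest codebook row (k, d).

    Returns the token indices (n,) and the quantized vectors (n, d).
    """
    z = np.asarray(z, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance between every embedding and every code.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    tokens = d2.argmin(axis=1)
    return tokens, codebook[tokens]

if __name__ == "__main__":
    codebook = np.array([[0.0, 0.0], [1.0, 1.0]])     # two toy "brain states"
    z = np.array([[0.1, -0.1], [0.9, 1.2], [0.4, 0.4]])
    tokens, z_q = quantize(z, codebook)
    print(tokens, np.mean((z - z_q) ** 2))            # tokens and quantization error
```

Applied to a time series of fMRI embeddings, the resulting token sequence is a discretized description of which state the signal occupies at each time point.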

Inpainting is All You Need: A Diffusion-based Augmentation Method for Semi-supervised Medical Image Segmentation

Xinrong Hu, Yiyu Shi

arXiv preprint, Jun 28 2025
Collecting pixel-level labels for medical datasets can be a laborious and expensive process, and enhancing segmentation performance with a scarcity of labeled data is a crucial challenge. This work introduces AugPaint, a data augmentation framework that utilizes inpainting to generate image-label pairs from limited labeled data. AugPaint leverages latent diffusion models, known for their ability to generate high-quality in-domain images with low overhead, and adapts the sampling process for the inpainting task without the need for retraining. Specifically, given an image and its label mask, we crop the area labeled as foreground and condition on it during the reverse denoising process at every noise level. The masked background area is gradually filled in, and all generated images are paired with the label mask. This approach ensures an accurate match between synthetic images and label masks, setting it apart from existing dataset generation methods. The generated images serve as valuable supervision for training downstream segmentation models, effectively addressing the challenge of limited annotations. We conducted extensive evaluations of our data augmentation method on four public medical image segmentation datasets, including CT, MRI, and skin imaging. Results across all datasets demonstrate that AugPaint outperforms state-of-the-art label-efficient methodologies, significantly improving segmentation performance.
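The idea of conditioning on the labeled foreground at every noise level can be sketched as a masked sampling loop: before each denoising step, the foreground region of the current sample is overwritten with a noised copy of the known image, so only the background is actually generated. This is a simplified pixel-space illustration under stated assumptions, not AugPaint's latent-diffusion sampler; `denoise_step` stands in for a trained model, and the noise schedule and the final paste of the clean foreground are hypothetical choices.

```python
import numpy as np

def inpaint_with_fixed_foreground(image, fg_mask, denoise_step, noise_levels, rng):
    """Generate a new background around a fixed, labeled foreground.

    image        : known image (the foreground pixels are trusted)
    fg_mask      : 1.0 where the label mask marks foreground, else 0.0
    denoise_step : callable (x, sigma) -> x, placeholder for a trained denoiser
    noise_levels : noise standard deviations, ordered from high to low
    """
    x = rng.standard_normal(image.shape)  # start from pure noise
    for sigma in noise_levels:
        # Noised copy of the known image at the current noise level.
        known = image + sigma * rng.standard_normal(image.shape)
        # Keep the foreground tied to the known image; background stays free.
        x = fg_mask * known + (1.0 - fg_mask) * x
        x = denoise_step(x, sigma)
    # Assumption: paste the clean foreground so it matches the label mask exactly.
    return fg_mask * image + (1.0 - fg_mask) * x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    fg_mask = np.zeros((8, 8))
    fg_mask[2:6, 2:6] = 1.0
    toy_denoiser = lambda x, sigma: x / (1.0 + sigma)  # stand-in, not a real model
    out = inpaint_with_fixed_foreground(image, fg_mask, toy_denoiser, [1.0, 0.5, 0.1], rng)
    print(np.allclose(out[2:6, 2:6], image[2:6, 2:6]))
```

Because the foreground never changes, the original label mask remains a valid annotation for every generated image.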

Causality-Adjusted Data Augmentation for Domain Continual Medical Image Segmentation.

Zhu Z, Dong Q, Luo G, Wang W, Dong S, Wang K, Tian Y, Wang G, Li S

PubMed paper, Jun 27 2025
In domain continual medical image segmentation, distillation-based methods mitigate catastrophic forgetting by continuously reviewing old knowledge. However, these approaches often exhibit biases towards both new and old knowledge simultaneously due to confounding factors, which can undermine segmentation performance. To address these biases, we propose the Causality-Adjusted Data Augmentation (CauAug) framework, introducing a novel causal intervention strategy called the Texture-Domain Adjustment Hybrid-Scheme (TDAHS) alongside two causality-targeted data augmentation approaches: the Cross Kernel Network (CKNet) and the Fourier Transformer Generator (FTGen). (1) TDAHS establishes a domain-continual causal model that accounts for two types of knowledge biases by identifying irrelevant local textures (L) and domain-specific features (D) as confounders. It introduces a hybrid causal intervention that combines traditional confounder elimination with a proposed replacement approach to better adapt to domain shifts, thereby promoting causal segmentation. (2) CKNet eliminates confounder L to reduce biases in new knowledge absorption. It decreases reliance on local textures in input images, forcing the model to focus on relevant anatomical structures and thus improving generalization. (3) FTGen causally intervenes on confounder D by selectively replacing it to alleviate biases that impact old knowledge retention. It restores domain-specific features in images, aiding in the comprehensive distillation of old knowledge. Our experiments show that CauAug significantly mitigates catastrophic forgetting and surpasses existing methods in various medical image segmentation tasks. The implementation code is publicly available at: https://github.com/PerceptionComputingLab/CauAug_DCMIS.

Quantifying Sagittal Craniosynostosis Severity: A Machine Learning Approach With CranioRate.

Tao W, Somorin TJ, Kueper J, Dixon A, Kass N, Khan N, Iyer K, Wagoner J, Rogers A, Whitaker R, Elhabian S, Goldstein JA

PubMed paper, Jun 27 2025
Objective: To develop and validate machine learning (ML) models for objective and comprehensive quantification of sagittal craniosynostosis (SCS) severity, enhancing clinical assessment, management, and research. Design: A cross-sectional study that combined the analysis of computed tomography (CT) scans and expert ratings. Setting: The study was conducted at a children's hospital and a major computer imaging institution. Our survey collected expert ratings from participating surgeons. Participants: The study included 195 patients with nonsyndromic SCS, 221 patients with nonsyndromic metopic craniosynostosis (CS), and 178 age-matched controls. Fifty-four craniofacial surgeons participated in rating 20 patients' head CT scans. Interventions: Computed tomography scans for cranial morphology assessment and a radiographic diagnosis of nonsyndromic SCS. Main Outcomes: Accuracy of the proposed Sagittal Severity Score (SSS) in predicting expert ratings compared to cephalic index (CI). Secondary outcomes compared Likert ratings with SCS status, the predictive power of skull-based versus skin-based landmarks, and assessments of an unsupervised ML model, the Cranial Morphology Deviation (CMD), as an alternative without ratings. Results: The SSS achieved significantly higher accuracy in predicting expert responses than CI (P < .05). Likert ratings outperformed SCS status in supervising ML models to quantify within-group variations. Skin-based landmarks demonstrated predictive power equivalent to that of skull landmarks (P < .05, threshold 0.02). The CMD demonstrated a strong correlation with the SSS (Pearson coefficient: 0.92, Spearman coefficient: 0.90, P < .01). Conclusions: The SSS and CMD can provide accurate, consistent, and comprehensive quantification of SCS severity. Implementing these data-driven ML models can significantly advance CS care through standardized assessments, enhanced precision, and informed surgical planning.
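For readers unfamiliar with the baseline and the statistics quoted above, the sketch below computes the cephalic index (maximum head width divided by maximum head length, times 100) and the Pearson and Spearman coefficients between two score vectors. The function names and toy numbers are illustrative only and are not taken from the study; the Spearman helper assumes no tied values.

```python
import numpy as np

def cephalic_index(max_width_mm, max_length_mm):
    """Cephalic index: 100 * maximum head width / maximum head length."""
    return 100.0 * max_width_mm / max_length_mm

def pearson(x, y):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rank correlation (assumes no ties in x or y)."""
    rank = lambda v: np.argsort(np.argsort(v))
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))

if __name__ == "__main__":
    print(cephalic_index(140.0, 200.0))
    # Toy scores: monotonic but non-linear relationship.
    score_a = [1.0, 2.0, 3.0, 4.0]
    score_b = [1.0, 4.0, 9.0, 16.0]
    print(pearson(score_a, score_b), spearman(score_a, score_b))
```

Reporting both coefficients, as the abstract does, distinguishes a linear relationship (Pearson) from a merely monotonic one (Spearman).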

Automation in tibial implant loosening detection using deep-learning segmentation.

Magg C, Ter Wee MA, Buijs GS, Kievit AJ, Schafroth MU, Dobbe JGG, Streekstra GJ, Sánchez CI, Blankevoort L

PubMed paper, Jun 27 2025
Patients with recurrent complaints after total knee arthroplasty may suffer from aseptic implant loosening. Current imaging modalities do not quantify looseness of knee arthroplasty components. A recently developed and validated workflow quantifies the tibial component displacement relative to the bone from CT scans acquired under valgus and varus load. The 3D analysis approach includes segmentation and registration of the tibial component and bone. In the current approach, the semi-automatic segmentation requires user interaction, adding complexity to the analysis. The research question is whether the segmentation step can be fully automated while leaving the outcomes unchanged. In this study, different deep-learning (DL) models for fully automatic segmentation are proposed and evaluated. For this, we employ three different datasets for model development (20 cadaveric CT pairs and 10 cadaveric CT scans) and evaluation (72 patient CT pairs). Based on the performance on the development dataset, the final model was selected, and its predictions replaced the semi-automatic segmentation in the current approach. Implant displacement was quantified by the rotation about the screw-axis, maximum total point motion, and mean target registration error. The displacement parameters of the proposed approach showed a statistically significant difference between fixed and loose samples in a cadaver dataset, as well as between asymptomatic and loose samples in a patient dataset, similar to the outcomes of the current approach. The methodological error calculated on a reproducibility dataset showed values that were not statistically significantly different between the two approaches. The results of the proposed and current approaches showed excellent reliability for one and three operators on two datasets. The conclusion is that full automation of knee implant displacement assessment is feasible by utilizing a DL-based segmentation model while maintaining the capability of distinguishing between fixed and loose implants.
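The three displacement parameters named above can be illustrated from a rigid transform (rotation matrix R and translation t) that maps the implant from one loading condition to the other. The sketch assumes the usual definitions: rotation angle about the screw axis from the trace of R, maximum total point motion as the largest displacement over implant surface points, and mean target registration error as the mean displacement over chosen target points. The abstract does not give the exact definitions used in the workflow, and the helper names are hypothetical.

```python
import numpy as np

def screw_axis_rotation_deg(R):
    """Rotation angle (degrees) about the screw axis of a 3x3 rotation matrix."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))

def point_displacements(points, R, t):
    """Distance each point (n, 3) moves under x -> R @ x + t."""
    moved = points @ R.T + t
    return np.linalg.norm(moved - points, axis=1)

def max_total_point_motion(points, R, t):
    """Largest displacement over the given (e.g. implant surface) points."""
    return float(point_displacements(points, R, t).max())

def mean_target_registration_error(targets, R, t):
    """Mean displacement over the given target points."""
    return float(point_displacements(targets, R, t).mean())

def rot_z(deg):
    """Helper: rotation about the z-axis by the given angle in degrees."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

if __name__ == "__main__":
    R, t = rot_z(5.0), np.zeros(3)
    pts = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # toy points, arbitrary units
    print(screw_axis_rotation_deg(R))
    print(max_total_point_motion(pts, R, t), mean_target_registration_error(pts, R, t))
```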