Back to all papers

ViConEx-Med: Visual Concept Explainability via Multi-Concept Token Transformer for Medical Image Analysis

October 11, 2025arxiv logopreprint

Authors

Cristiano Patrício,Luís F. Teixeira,João C. Neves

Abstract

Concept-based models aim to explain model decisions with human-understandable concepts. However, most existing approaches treat concepts as numerical attributes, without providing complementary visual explanations that could localize the predicted concepts. This limits their utility in real-world applications and particularly in high-stakes scenarios, such as medical use-cases. This paper proposes ViConEx-Med, a novel transformer-based framework for visual concept explainability, which introduces multi-concept learnable tokens to jointly predict and localize visual concepts. By leveraging specialized attention layers for processing visual and text-based concept tokens, our method produces concept-level localization maps while maintaining high predictive accuracy. Experiments on both synthetic and real-world medical datasets demonstrate that ViConEx-Med outperforms prior concept-based models and achieves competitive performance with black-box models in terms of both concept detection and localization precision. Our results suggest a promising direction for building inherently interpretable models grounded in visual concepts. Code is publicly available at https://github.com/CristianoPatricio/viconex-med.

Topics

cs.CV

Ready to Sharpen Your Edge?

Subscribe to join 7,100+ peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.