RadGazeGen: radiomics and gaze-guided chest X-ray generation using diffusion models.

June 17, 2026

Authors

Bhattacharya M, Singh G, Jain S, Prasanna P

Affiliations (2)

  • Stony Brook University, Stony Brook, New York, United States.
  • Columbia University, New York, United States.

Abstract

We present RadGazeGen, a framework for integrating experts' eye gaze patterns and radiomic feature maps as controls within text-to-image diffusion models to enable high-fidelity medical image generation. Although recent text-to-image diffusion models have achieved impressive success, textual descriptions alone often fail to capture the disease-specific details necessary for generating clinically accurate and anatomically faithful images. To address these limitations, RadGazeGen leverages radiologists' eye gaze trajectories and radiomics feature descriptors as spatial and semantic controls in the diffusion process. Eye gaze patterns encode visuo-cognitive attention and spatial localization of subtle disease cues, whereas radiomics features capture subvisual phenotype characteristics such as texture, intensity, and shape. By combining these multimodal cues, the proposed framework guides the generative model toward anatomically consistent and disease-aware image synthesis. The quality of the generated images was evaluated by a board-certified radiologist. RadGazeGen was evaluated on the REFLACX dataset for image generation quality and diversity. Furthermore, to assess its downstream clinical utility, the generated images were used for disease classification tasks on the CheXpert test set (n = 500) and for long-tailed learning evaluation on the MIMIC-CXR-LT test set (n = 23,550), demonstrating high fidelity and diagnostic relevance of the synthesized images. By jointly conditioning on gaze and radiomic representations, RadGazeGen bridges the gap between human visual cognition and machine perception, improving both realism and clinical validity in medical image generation. This framework underscores the importance of incorporating anatomically grounded and disease-aware controls in diffusion-based medical image synthesis.
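
To make the idea of a gaze-derived spatial control concrete, the sketch below turns a list of fixations into a normalized heatmap that could in principle be supplied to a conditional image generator. It is an illustration only: the abstract does not describe how RadGazeGen encodes gaze, and the function name `gaze_heatmap`, the `(x, y, duration)` fixation format, the Gaussian width `sigma`, and the toy fixation values are all assumptions introduced here.

```python
import numpy as np


def gaze_heatmap(fixations, height, width, sigma=2.0):
    """Build a duration-weighted Gaussian heatmap from gaze fixations.

    Hypothetical helper, not taken from the paper.
    fixations: iterable of (x, y, duration) in pixel coordinates and seconds.
    Returns a (height, width) float array scaled to [0, 1].
    """
    # Pixel coordinate grids: ys holds row indices, xs holds column indices.
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y, duration in fixations:
        # Each fixation adds a Gaussian bump weighted by its dwell time.
        squared_dist = (xs - x) ** 2 + (ys - y) ** 2
        heat += duration * np.exp(-squared_dist / (2.0 * sigma ** 2))
    peak = heat.max()
    # Normalize so the most-attended pixel is 1; leave an empty map as zeros.
    return heat / peak if peak > 0 else heat


# Toy example with made-up fixations on a 16x16 grid.
fixations = [(4, 4, 0.3), (11, 9, 0.6)]
heatmap = gaze_heatmap(fixations, height=16, width=16)
```

In this toy case the longer fixation at x = 11, y = 9 dominates the map, which is the kind of spatial emphasis a gaze control is meant to convey.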

Topics

Journal Article
