Quality over quantity: biopsy-anchored CT radiogenomics models outperform all-lesion training in a multi-tumour cohort despite a smaller sample size.

May 16, 2026

Authors

Rodríguez Sánchez DI, Middelkoop J, Vanneste T, Maxouri O, Ursprung S, Rostami S, Bogveradze N, Chupetlovska K, Castagnoli F, Landolfi F, Hong EK, Delli Pizzi A, Gennaro N, Jutidamrongphan W, Petrychenko L, Snaebjornsson P, Bodalal Z, Beets-Tan R

Affiliations (19)

  • Department of Radiology, The Netherlands Cancer Institute, Amsterdam, The Netherlands.
  • GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands.
  • Department of Radiology, University Hospital Tuebingen, Tuebingen, Germany.
  • Department of Radiology, American Hospital Tbilisi, Tbilisi, Georgia.
  • Department of Radiology, Royal Marsden Hospital, London, UK.
  • Division of Radiotherapy and Imaging, The Institute of Cancer Research, London, UK.
  • Radiology Unit, Sant'Andrea Hospital, Sapienza University of Rome, Rome, Italy.
  • Department of Radiology, Stanford University, Palo Alto, CA, USA.
  • Department of Innovative Technologies in Medicine & Dentistry, G. d'Annunzio University of Chieti-Pescara, Chieti, Italy.
  • Institute for Advanced Biomedical Technologies, G. d'Annunzio University of Chieti-Pescara, Chieti, Italy.
  • Feinberg School of Medicine, Northwestern University, NMH/Arkes Family Pavilion Suite 800, 676 N Saint Clair, Chicago, IL, 60611, USA.
  • Clinic of Radiology, Imaging Institute of Southern Switzerland (IIMSI), Ente Ospedaliero Cantonale (EOC), 6900, Lugano, Switzerland.
  • Department of Nuclear Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland.
  • Department of Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands.
  • Faculty of Medicine, University of Iceland, Reykjavik, Iceland.
  • GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands.
  • The Netherlands Cancer Institute, Amsterdam, The Netherlands.
  • Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark.
  • Maastricht Radiation Oncology, Maastricht, The Netherlands.

Abstract

Radiogenomics aims to non-invasively predict tumour genotypes from imaging, but most studies assume molecular homogeneity by assigning a single biopsy-derived label to all lesions within a patient. This approach risks substantial label noise given well-documented interlesional heterogeneity. We investigated whether anchoring training to biopsy-confirmed lesions improves radiogenomic model performance and generalisability. We retrospectively analysed 1646 patients (11473 segmented lesions) with contrast-enhanced CT and EGFR mutation status from next-generation sequencing at the Netherlands Cancer Institute, alongside an external NSCLC radiogenomics cohort (n = 158). All visible lesions were segmented, and the exact biopsy site was matched to its segmentation. Radiomic features were extracted, and machine learning models were trained with three lesion selection strategies: all lesions, non-biopsied lesions only, and biopsy-confirmed lesions only. To disentangle label quality from sample size, we created size-matched variants (one lesion per patient) for all-lesion and non-biopsied strategies. All models achieved significant discrimination of EGFR status on internal validation (AUC = 0.62-0.68). However, performance of the all-lesion and non-biopsied models declined on external validation (AUC = 0.55-0.63), while the biopsy-anchored model maintained stable performance (AUC = 0.62), despite having only 1/10th of the training sample size. When training sets were size-matched, the biopsy-anchored approach significantly outperformed a model trained on all available lesions on external validation (p = 0.037). Radiogenomic models trained on biopsy-confirmed lesions outperform conventional all-lesion strategies in external validation, despite using an order of magnitude fewer samples. Prioritising lesion-level label fidelity can mitigate heterogeneity-driven noise, enhancing robustness and clinical translation of imaging-based genomic prediction. 
Question: Does assigning biopsy-derived molecular labels to all lesions introduce heterogeneity-driven label noise that reduces the generalisability of radiogenomic models?

Findings: Models trained exclusively on biopsy-confirmed lesions demonstrated superior external generalisability compared with all-lesion approaches, despite being trained on substantially fewer samples.

Clinical relevance: Biopsy-anchored radiogenomics improves the reliability of non-invasive mutation prediction by accounting for tumour heterogeneity, potentially supporting clinical decision-making when tissue sampling is limited or molecular results are discordant across lesions.
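
The three lesion selection strategies and the size-matched variants described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the record fields (`patient_id`, `lesion_id`, `is_biopsied`, `egfr_mutant`), the function name `select_lesions`, and the toy data are hypothetical, and picking the single lesion per patient at random is an assumption, since the abstract does not say how that lesion was chosen.

```python
import random
from collections import defaultdict

STRATEGIES = ("all", "non_biopsied", "biopsy_only")


def select_lesions(lesions, strategy, size_matched=False, seed=0):
    """Return the training lesions for one selection strategy.

    `lesions` is a list of dicts with hypothetical keys `patient_id`,
    `lesion_id`, `is_biopsied` and `egfr_mutant` (the patient-level label
    that every lesion of that patient inherits).
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")

    if strategy == "all":
        pool = list(lesions)
    elif strategy == "non_biopsied":
        pool = [l for l in lesions if not l["is_biopsied"]]
    else:  # "biopsy_only": the label was measured on this exact lesion
        pool = [l for l in lesions if l["is_biopsied"]]

    if not size_matched:
        return pool

    # Size-matched variant: keep one lesion per patient.
    # Assumption: the lesion is drawn at random with a fixed seed.
    by_patient = defaultdict(list)
    for lesion in pool:
        by_patient[lesion["patient_id"]].append(lesion)
    rng = random.Random(seed)
    return [rng.choice(by_patient[pid]) for pid in sorted(by_patient)]


# Toy data: two patients, one biopsied lesion each.
toy = [
    {"patient_id": "P1", "lesion_id": "a", "is_biopsied": True, "egfr_mutant": 1},
    {"patient_id": "P1", "lesion_id": "b", "is_biopsied": False, "egfr_mutant": 1},
    {"patient_id": "P1", "lesion_id": "c", "is_biopsied": False, "egfr_mutant": 1},
    {"patient_id": "P2", "lesion_id": "d", "is_biopsied": True, "egfr_mutant": 0},
    {"patient_id": "P2", "lesion_id": "e", "is_biopsied": False, "egfr_mutant": 0},
]

biopsy_anchored = select_lesions(toy, "biopsy_only")
all_matched = select_lesions(toy, "all", size_matched=True)
```

In this sketch, `biopsy_anchored` and `all_matched` both contain one lesion per patient, so any difference between models trained on them would reflect which lesion carries the label rather than how many lesions were used.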

Topics

Journal Article
