How many samples to label for an application given a foundation model? Chest X-ray classification study

October 13, 2025

Authors

Nikolay Nechaev,Evgenia Przhezdzetskaya,Viktor Gombolevskiy,Dmitry Umerenkov,Dmitry Dylov

Abstract

Chest X-ray classification is vital yet resource-intensive, typically demanding extensive annotated data for accurate diagnosis. Foundation models mitigate this reliance, but how many labeled samples are required remains unclear. We systematically evaluate the use of power-law fits to predict the training size necessary for specific ROC-AUC thresholds. Testing multiple pathologies and foundation models, we find XrayCLIP and XraySigLIP achieve strong performance with significantly fewer labeled examples than a ResNet-50 baseline. Importantly, learning curve slopes from just 50 labeled cases accurately forecast final performance plateaus. Our results enable practitioners to minimize annotation costs by labeling only the essential samples for targeted performance.

View Source Full Text PDF

Topics

cs.CV

How many samples to label for an application given a foundation model? Chest X-ray classification study

Authors

Abstract

Tags

Topics

Ready to Sharpen Your Edge?