
Independent evaluation of deep learning models for detecting focal cortical dysplasia.

March 18, 2026

Authors

Kaas H, Prener M, Ganz M, Knudsen GM, Pinborg LH, Beliveau V

Affiliations (6)

  • Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark.
  • Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark.
  • Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark; Department of Computer Science, University of Copenhagen, Copenhagen, Denmark.
  • Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark; Department of Clinical Medicine, Neurology, University of Copenhagen, Copenhagen, Denmark.
  • Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark; Department of Clinical Medicine, Neurology, University of Copenhagen, Copenhagen, Denmark; Epilepsy Clinic, Department of Neurology, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark.
  • Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark; Institute for Human Genetics, Medical University of Innsbruck, Innsbruck, Austria.

Abstract

The purpose of this study is to perform an independent assessment of three state-of-the-art tools for the detection of focal cortical dysplasia (FCD) from magnetic resonance imaging (MRI): DeepFCD, the Multi-center Epilepsy Lesion Detection (MELD) Classifier, and MELDGraph. T1-weighted and fluid-attenuated inversion recovery MRI scans from 101 epilepsy patients with FCD and 101 epilepsy patients without FCD were retrospectively included. Classifiers were evaluated at the patient level by their ability to correctly identify the presence of any FCD lesion, and at the lesion level by their capacity to identify lesions within regions delineated by neuroradiologists in MRI reports. A calibrated threshold for DeepFCD prediction probabilities was empirically determined to improve classifier specificity. Classifier test-retest consistency was measured using the Dice coefficient on repeated MRI scans of 21 individuals. At the patient level, MELDClassifier achieved 52% accuracy (sensitivity=91%, specificity=14%), MELDGraph reached 61% accuracy (sensitivity=76%, specificity=47%), and DeepFCD reached 56% accuracy (sensitivity=62%, specificity=50%) at an empirically determined threshold of 0.90. At the lesion level, MELDClassifier had a sensitivity of 70% and a positive predictive value (PPV) of 13%, MELDGraph had a sensitivity of 53% and a PPV of 36%, and DeepFCD had a sensitivity of 30% and a PPV of 19%. Test-retest reliability was low, with an average [min, max] Dice coefficient of 0.28 [0.0, 1.0] for MELDClassifier, 0.38 [0.0, 1.0] for MELDGraph, and 0.35 [0.05, 0.54] for DeepFCD. This study highlights the current limitations of using deep learning models in FCD diagnosis and emphasizes the need to enhance the tools' accuracy, reliability, and interpretability to improve clinical utility.
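
The abstract reports patient-level sensitivity, specificity, and accuracy, plus a Dice coefficient for test-retest consistency. As a minimal sketch of how such quantities are conventionally computed (not the authors' code; the function names `patient_level_metrics` and `dice_coefficient` are hypothetical, and the handling of two empty masks is an assumption the abstract does not specify):

```python
import numpy as np


def patient_level_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary patient labels.

    y_true: 1 if the patient has FCD, 0 otherwise.
    y_pred: 1 if the classifier flagged any lesion, 0 otherwise.
    Assumes both classes are present in y_true (otherwise a ratio is undefined).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = int(a.sum()) + int(b.sum())
    if total == 0:
        # Assumption: two empty predictions count as perfect agreement.
        return 1.0
    return 2.0 * int(np.sum(a & b)) / total


if __name__ == "__main__":
    # Toy labels for illustration only, not data from the study.
    print(patient_level_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                                [1, 1, 1, 0, 1, 0, 0, 0]))
    print(dice_coefficient([1, 1, 0, 0], [0, 1, 1, 0]))
```

Because the two groups are the same size (101 patients each), accuracy is the mean of sensitivity and specificity. For MELDGraph, (76% + 47%) / 2 = 61.5%, which matches the reported 61% once rounding of the two inputs is taken into account.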

Topics

Journal Article
