Lack of Methodological Rigor and Limited Coverage of Generative AI in Existing AI Reporting Guidelines: A Scoping Review.
Authors
Affiliations (7)
- Evidence-Based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China; Research Unit of Evidence-Based Evaluation and Guidelines, Chinese Academy of Medical Sciences (2021RU017), School of Basic Medical Sciences, Lanzhou University, Lanzhou, China; World Health Organization Collaboration Center for Guideline Implementation and Knowledge Translation, Lanzhou, China; Institute of Health Data Science, Lanzhou University, Lanzhou, China; Key Laboratory of Evidence Based Medicine of Gansu Province, Lanzhou University, Lanzhou, China.
- The First School of Clinical Medicine, Lanzhou University, Lanzhou, Gansu, 730000, China.
- Department of Health Policy and Health Management, School of Public Health, Lanzhou University, Lanzhou, China; Evidence-Based Social Science Research Center, School of Public Health, Lanzhou University, Lanzhou, China.
- School of Information Science & Engineering, Lanzhou University, Lanzhou, China.
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
- Vincent V.C. Woo Chinese Medicine Clinical Research Institute, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China; Chinese EQUATOR Centre, Hong Kong, China.
- Evidence-Based Medicine Center, School of Basic Medical Sciences, Lanzhou University, Lanzhou, China; Research Unit of Evidence-Based Evaluation and Guidelines, Chinese Academy of Medical Sciences (2021RU017), School of Basic Medical Sciences, Lanzhou University, Lanzhou, China; World Health Organization Collaboration Center for Guideline Implementation and Knowledge Translation, Lanzhou, China; Institute of Health Data Science, Lanzhou University, Lanzhou, China; Key Laboratory of Evidence Based Medicine of Gansu Province, Lanzhou University, Lanzhou, China. Electronic address: [email protected].
Abstract
This study aimed to systematically map the development methods, scope, and limitations of existing artificial intelligence (AI) reporting guidelines in medicine and to explore their applicability to generative AI (GAI) tools such as large language models (LLMs). We conducted a scoping review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR). Five information sources were searched from inception to December 31, 2024: MEDLINE (via PubMed), the EQUATOR Network, CNKI, FAIRsharing, and Google Scholar. Two reviewers independently screened records and extracted data using a predefined Excel template; discrepancies were resolved by a third reviewer. Extracted data included guideline characteristics (e.g., development methods, target audience, AI domain), adherence to EQUATOR Network recommendations, and consensus methodologies. A total of 68 AI reporting guidelines were included. Of these, 48.5% focused on general AI, whereas only 7.4% addressed GAI/LLMs. Methodological rigor was limited: only 39.7% described their development process, 42.6% involved multidisciplinary experts, and 33.8% followed EQUATOR recommendations. Substantial overlap existed across guidelines, particularly in medical imaging (20.6% of guidelines). GAI-specific guidelines (14.7%) lacked comprehensive coverage and methodological transparency. In summary, existing AI reporting guidelines in medicine show suboptimal methodological rigor, redundancy, and insufficient coverage of GAI applications. Future and updated guidelines should prioritize standardized development processes, multidisciplinary collaboration, and an expanded focus on emerging AI technologies such as LLMs.