End-to-End PET/CT Interpretation and Quantification with an LLM-Orchestrated AI Agent: A Real-World Pilot Study.
Authors
Affiliations (4)
- Department of Nuclear Medicine, Seoul National University Hospital, Seoul, Republic of Korea.
- Department of Nuclear Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea.
- Portrai, Seoul, Republic of Korea.
- Department of Thoracic and Cardiovascular Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea.
Abstract
Although deep learning models have improved individual PET image processing, analysis, and quantification tasks, end-to-end automation from raw DICOM data to quantitative clinical reporting remains limited, particularly in heterogeneous real-world settings. <b>Methods:</b> As a proof of concept, an autonomous large language model (LLM)-orchestrated multitool agent for end-to-end PET/CT interpretation was developed. A text-only reasoning LLM selected the appropriate series from raw DICOM data, coordinated registration and SUV conversion, invoked segmentation and detection tools, generated maximum-intensity projections, called a vision-enabled LLM for interpretation, and synthesized structured draft reports. The system was retrospectively evaluated in 170 patients undergoing baseline [<sup>18</sup>F]FDG PET/CT for lung cancer staging, with clinical reports serving as the reference standard. <b>Results:</b> The agent completed the full end-to-end workflow, from raw DICOM series selection to structured draft report generation, without human intervention in all 170 examinations. Primary tumor detection achieved 100% sensitivity. For nodal involvement, sensitivity was 84.8% and specificity was 39.4%, whereas for distant metastasis, sensitivity was 70.2% and specificity was 65.0%. Discrepancy analysis of 58 nodal and 57 metastatic mismatch cases revealed systematic false-positive findings related to reactive or physiologic uptake and false-negative findings involving small-volume or anatomically atypical metastases. <b>Conclusion:</b> An LLM-orchestrated PET/CT agent enabled workflow-level automation from raw DICOM data to quantification and structured draft reporting under real-world conditions. Although primary tumor detection was highly reliable, nodal and metastatic assessment revealed systematic limitations, supporting a collaborative role for the agent with continued expert oversight in complex clinical scenarios.
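The abstract names SUV conversion as one of the steps the agent coordinates but does not describe how it was implemented. As an illustration only, the following minimal sketch shows the standard body-weight SUV calculation. It assumes that voxel values have already been rescaled to activity concentration in Bq/mL and decay-corrected to the scan start time; the function names, parameters, and the example numbers are hypothetical and are not taken from the study.

```python
import math

# Physical half-life of fluorine-18 in seconds (about 109.77 minutes).
F18_HALF_LIFE_S = 6586.2


def decay_corrected_dose(injected_dose_bq, elapsed_s, half_life_s=F18_HALF_LIFE_S):
    """Return the injected activity remaining after elapsed_s seconds.

    elapsed_s is the time between tracer injection and the reference
    time of the image (assumed here to be the scan start).
    """
    if elapsed_s < 0:
        raise ValueError("elapsed_s must be non-negative")
    return injected_dose_bq * 2.0 ** (-elapsed_s / half_life_s)


def suv_bw(activity_bq_per_ml, weight_kg, injected_dose_bq, elapsed_s,
           half_life_s=F18_HALF_LIFE_S):
    """Body-weight SUV: tissue activity / (decay-corrected dose / body weight).

    Body weight is converted to grams, and tissue density is taken as
    1 g/mL, so the result is conventionally treated as dimensionless.
    """
    if weight_kg <= 0 or injected_dose_bq <= 0:
        raise ValueError("weight_kg and injected_dose_bq must be positive")
    dose_bq = decay_corrected_dose(injected_dose_bq, elapsed_s, half_life_s)
    return activity_bq_per_ml * (weight_kg * 1000.0) / dose_bq


if __name__ == "__main__":
    # Hypothetical values: 70 kg patient, 350 MBq injected, 60 min uptake.
    value = suv_bw(activity_bq_per_ml=12000.0, weight_kg=70.0,
                   injected_dose_bq=350e6, elapsed_s=3600.0)
    print(f"SUVbw = {value:.2f}")
```

In a real pipeline the weight, injected dose, injection time, and half-life would be read from the PET DICOM header rather than passed by hand, and the decay-correction reference time would need to be checked per scanner, which is one reason this step is nontrivial on heterogeneous real-world data.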