What Does GPT-5 Mean for Medical Imaging?

August 8, 2025

Image from OpenAI

OpenAI released the long-awaited GPT-5 on August 7, 2025, marking a significant development in the field of artificial intelligence. To capture the scale of this new release, OpenAI CEO Sam Altman offered a clear analogy during a recent press briefing:

"GPT-3 sort of felt like talking to a high school student. GPT-4 felt like you’re talking to a college student. GPT-5 is the first time that it really feels like talking to a PhD-level expert."

This launch marks a significant strategic shift. While previous models have demonstrated medical knowledge, this is the first time OpenAI has officially made healthcare a central pillar of a model's release, signaling a deliberate focus on the domain. This article will break down what this new "expert-level" AI means for the field of radiology.

Performance on a Key Medical Benchmark

A centerpiece of the GPT-5 launch is its performance on HealthBench, a benchmark created with input from hundreds of physicians to test an AI's clinical reasoning. The most critical detail for our field is that this benchmark was exclusively text-based. The AI was not evaluated on the direct analysis of medical images like X-rays or CT scans.

The performance data shows two major advancements:

Capability on Difficult Cases: On the "HealthBench Hard" dataset, a set of questions on which the prior model GPT-4o scored 0%, GPT-5 achieved a score of 46.2%. This points to a substantially improved ability to reason through complex problems.
Figure: Performance on HealthBench Hard
Drastically Reduced Error Rate: On these same challenging cases, the rate of "hallucinations" (inaccurate or fabricated responses) dropped significantly. GPT-5's error rate was just 1.6%, compared to 15.8% for the older GPT-4o.
Figure: HealthBench Hard Hallucinations (inaccuracy rate)
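
To put the two hallucination rates quoted above in perspective, the short Python sketch below works out the absolute and relative change between them. The function and variable names are illustrative only and do not come from OpenAI; the only inputs are the 15.8% and 1.6% figures reported above.

```python
def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Return the fraction by which new_rate is lower than old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate


# Error rates on HealthBench Hard as quoted in this article (percent).
GPT4O_ERROR_PCT = 15.8
GPT5_ERROR_PCT = 1.6

absolute_drop = GPT4O_ERROR_PCT - GPT5_ERROR_PCT            # percentage points
reduction = relative_reduction(GPT4O_ERROR_PCT, GPT5_ERROR_PCT)
fold = GPT4O_ERROR_PCT / GPT5_ERROR_PCT                     # how many times lower

print(f"Absolute drop: {absolute_drop:.1f} percentage points")
print(f"Relative reduction: {reduction:.1%}")
print(f"Fold difference: {fold:.1f}x")
```

In other words, the reported error rate is roughly one tenth of the earlier figure. That is a large improvement, but 1.6% on a text-only benchmark is still not zero, which is why radiologist review remains part of every workflow discussed below.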

Potential Applications in the Radiology Workflow

While direct diagnostic use is not yet a reality, GPT-5's benchmarked strength in text-based clinical reasoning points to several potential applications that could support a radiology practice:

Reporting Assistance: It could assist by drafting structured reports from dictated findings, ready for a radiologist's final review and sign-off.
Clinical Context Summarization: The model could quickly process a patient's electronic medical record (EMR) to provide a concise clinical history, offering relevant context prior to interpretation.
Patient-Friendly Communication: It could translate the technical language of a final report into a simplified summary for patients or referring physicians.

The Question of Image Analysis

To gauge GPT-5's future potential for image analysis, we can look at its performance on general multimodal benchmarks. A key example is MMMU (Massive Multi-discipline Multimodal Understanding), a benchmark that tests an AI’s ability to solve college- and graduate-level problems using a combination of text, charts, diagrams, and images.

The results show a strong aptitude for general visual reasoning:

On the standard college-level MMMU benchmark, GPT-5 scores up to 84.2%.
Figure: Performance on MMMU Benchmark
On the more challenging graduate-level MMMU-Pro, it scores up to 78.4%.
Figure: Performance on MMMU-Pro Benchmark

These scores, which exceed those of prior models, demonstrate a high-level capability for understanding complex visual information. However, they need to be read in context: strong general visual reasoning is not the same as the highly specialized, nuanced skill of medical image interpretation.

How to Access and Test GPT-5

You can begin experimenting with GPT-5's capabilities directly through the standard ChatGPT interface. OpenAI has made the new model accessible to all users, with different tiers of availability.

For those using the free version of ChatGPT, there may be a daily usage limit for GPT-5. This limit is generally sufficient for performing simple tests or getting a feel for its improved reasoning. For physicians and researchers who wish to conduct more extensive testing, upgrading to a ChatGPT Plus subscription offers significantly higher usage limits and priority access.

To get started, you can log in to the ChatGPT website and begin testing with text prompts or by uploading an anonymized image. For a more detailed, step-by-step guide on this experimental process, you can refer to this resource: A Guide on Using ChatGPT to Interpret X-Rays.

Conclusion

GPT-5 represents a significant advance in AI, particularly in its ability to handle complex medical information in text, backed by a newly explicit focus on healthcare from OpenAI.