CoSyn Boosts Open-Source AI's Medical and Scientific Image Understanding

July 21, 2025

The CoSyn tool leverages synthetic data to help open-source AI models excel at understanding complex, text-rich images such as medical diagrams.

Key Details

  • Penn Engineering and the Allen Institute for AI developed CoSyn to generate scientific charts and diagrams as training data for open-source vision-language models.
  • CoSyn-400K includes over 400,000 synthetic images and 2.7 million sets of instructions, spanning scientific charts, chemical structures, and more.
  • CoSyn-trained models outperformed proprietary systems, including GPT-4V and Gemini 1.5 Flash, on seven benchmarks.
  • With only 7,000 synthetic images, the team's model beat models trained on millions of real images on the NutritionQA benchmark.
  • The approach eliminates copyright risks and supports wide, open-source access.

Why It Matters

Synthetic data generation tools like CoSyn can democratize advanced image understanding for medical and radiology AI, improving model accuracy while addressing data scarcity and copyright barriers. For radiology professionals, this supports innovation in clinical decision support and scientific research.
