
APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Bowen Liu, Weiyi Zhang, Peranut Chotcomwongse, Xiaolan Chen, Ruoyu Chen, Pawin Pakaymaskul, Niracha Arjkongharn, Nattaporn Vongsa, Xuelian Cheng, Zongyuan Ge, Kun Huang, Xiaohui Li, Yiru Duan, Zhenbang Wang, BaoYe Xie, Qiang Chen, Huazhu Fu, Michael A. Mahr, Jiaqi Qu, Wangyiyang Chen, Shiye Wang, Yubo Tan, Yongjie Li, Mingguang He, Danli Shi, Paisan Ruamviboonsuk

arXiv preprint · Jun 9, 2025
Optical Coherence Tomography (OCT) provides high-resolution, 3D, and non-invasive visualization of retinal layers in vivo, serving as a critical tool for lesion localization and disease diagnosis. However, its widespread adoption is limited by equipment costs and the need for specialized operators. In comparison, 2D color fundus photography offers faster acquisition and greater accessibility with less dependence on expensive devices. Although generative artificial intelligence has demonstrated promising results in medical image synthesis, translating 2D fundus images into 3D OCT images presents unique challenges due to inherent differences in data dimensionality and biological information between modalities. To advance generative models in the fundus-to-3D-OCT setting, the Asia Pacific Tele-Ophthalmology Society (APTOS-2024) organized a challenge titled Artificial Intelligence-based OCT Generation from Fundus Images. This paper details the challenge framework (referred to as the APTOS-2024 Challenge), including the benchmark dataset, an evaluation methodology featuring two fidelity metrics: image-based distance (pixel-level OCT B-scan similarity) and video-based distance (semantic-level volumetric consistency), and an analysis of top-performing solutions. The challenge attracted 342 participating teams, with 42 preliminary submissions and 9 finalists. Leading methodologies incorporated innovations in hybrid data preprocessing or augmentation (cross-modality collaborative paradigms), pre-training on external ophthalmic imaging datasets, integration of vision foundation models, and model architecture improvements. The APTOS-2024 Challenge is the first benchmark demonstrating the feasibility of fundus-to-3D-OCT synthesis as a potential solution for improving ophthalmic care accessibility in under-resourced healthcare settings, while helping to expedite medical research and clinical applications.
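The abstract names the two fidelity metrics only at a high level. As an illustration of how such metrics are commonly computed, the minimal sketch below measures a pixel-level distance as per-B-scan mean absolute error and a semantic-level distance as cosine distance between volume embeddings; the exact distances and feature extractor used by the challenge may differ, and the embedding inputs here are assumed to come from any pretrained video/3D encoder.

```python
import numpy as np

def image_based_distance(gen_vol: np.ndarray, ref_vol: np.ndarray) -> float:
    """Pixel-level distance: mean absolute error per B-scan, averaged over the volume.

    gen_vol, ref_vol: arrays of shape (num_bscans, H, W) with intensities in [0, 1].
    """
    assert gen_vol.shape == ref_vol.shape
    per_bscan_mae = np.abs(gen_vol - ref_vol).mean(axis=(1, 2))
    return float(per_bscan_mae.mean())

def video_based_distance(gen_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    """Semantic-level distance: cosine distance between volume-level embeddings
    produced by a pretrained video/3D encoder (encoder choice is an assumption)."""
    cos = np.dot(gen_emb, ref_emb) / (np.linalg.norm(gen_emb) * np.linalg.norm(ref_emb) + 1e-8)
    return float(1.0 - cos)
```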

Snap-and-tune: combining deep learning and test-time optimization for high-fidelity cardiovascular volumetric meshing

Daniel H. Pak, Shubh Thaker, Kyle Baylous, Xiaoran Zhang, Danny Bluestein, James S. Duncan

arXiv preprint · Jun 9, 2025
High-quality volumetric meshing from medical images is a key bottleneck for physics-based simulations in personalized medicine. For volumetric meshing of complex medical structures, recent studies have often utilized deep learning (DL)-based template deformation approaches to enable fast test-time generation with high spatial accuracy. However, these approaches still exhibit limitations, such as limited flexibility at high-curvature areas and unrealistic inter-part distances. In this study, we introduce a simple yet effective snap-and-tune strategy that sequentially applies DL and test-time optimization, which combines fast initial shape fitting with more detailed sample-specific mesh corrections. Our method provides significant improvements in both spatial accuracy and mesh quality, while being fully automated and requiring no additional training labels. Finally, we demonstrate the versatility and usefulness of our newly generated meshes via solid mechanics simulations in two different software platforms. Our code is available at https://github.com/danpak94/Deep-Cardiac-Volumetric-Mesh.

A Narrative Review on Large AI Models in Lung Cancer Screening, Diagnosis, and Treatment Planning

Jiachen Zhong, Yiting Wang, Di Zhu, Ziwei Wang

arXiv preprint · Jun 8, 2025
Lung cancer remains one of the most prevalent and fatal diseases worldwide, demanding accurate and timely diagnosis and treatment. Recent advancements in large AI models have significantly enhanced medical image understanding and clinical decision-making. This review systematically surveys the state-of-the-art in applying large AI models to lung cancer screening, diagnosis, prognosis, and treatment. We categorize existing models into modality-specific encoders, encoder-decoder frameworks, and joint encoder architectures, highlighting key examples such as CLIP, BLIP, Flamingo, BioViL-T, and GLoRIA. We further examine their performance in multimodal learning tasks using benchmark datasets like LIDC-IDRI, NLST, and MIMIC-CXR. Applications span pulmonary nodule detection, gene mutation prediction, multi-omics integration, and personalized treatment planning, with emerging evidence of clinical deployment and validation. Finally, we discuss current limitations in generalizability, interpretability, and regulatory compliance, proposing future directions for building scalable, explainable, and clinically integrated AI systems. Our review underscores the transformative potential of large AI models to personalize and optimize lung cancer care.

Simultaneous Segmentation of Ventricles and Normal/Abnormal White Matter Hyperintensities in Clinical MRI using Deep Learning

Mahdi Bashiri Bawil, Mousa Shamsi, Abolhassan Shakeri Bavil

arXiv preprint · Jun 8, 2025
Multiple sclerosis (MS) diagnosis and monitoring rely heavily on accurate assessment of brain MRI biomarkers, particularly white matter hyperintensities (WMHs) and ventricular changes. Current segmentation approaches suffer from several limitations: they typically segment these structures independently despite their pathophysiological relationship, struggle to differentiate between normal and pathological hyperintensities, and are poorly optimized for anisotropic clinical MRI data. We propose a novel 2D pix2pix-based deep learning framework for simultaneous segmentation of ventricles and WMHs with the unique capability to distinguish between normal periventricular hyperintensities and pathological MS lesions. Our method was developed and validated on FLAIR MRI scans from 300 MS patients. Compared to established methods (SynthSeg, Atlas Matching, BIANCA, LST-LPA, LST-LGA, and WMH-SynthSeg), our approach achieved superior performance for both ventricle segmentation (Dice: 0.801+/-0.025, HD95: 18.46+/-7.1mm) and WMH segmentation (Dice: 0.624+/-0.061, precision: 0.755+/-0.161). Furthermore, our method successfully differentiated between normal and abnormal hyperintensities with a Dice coefficient of 0.647. Notably, our approach demonstrated exceptional computational efficiency, completing end-to-end processing in approximately 4 seconds per case, up to 36 times faster than baseline methods, while maintaining minimal resource requirements. This combination of improved accuracy, clinically relevant differentiation capability, and computational efficiency addresses critical limitations in current neuroimaging analysis, potentially enabling integration into routine clinical workflows and enhancing MS diagnosis and monitoring.
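For reference, the Dice coefficient and precision reported above are standard overlap metrics for binary segmentation masks; a minimal sketch (not the authors' evaluation code) is:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|) for boolean segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps))

def precision(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Precision = TP / (TP + FP) for boolean segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    return float((tp + eps) / (pred.sum() + eps))
```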

Transfer Learning and Explainable AI for Brain Tumor Classification: A Study Using MRI Data from Bangladesh

Shuvashis Sarker

arXiv preprint · Jun 8, 2025
Brain tumors, whether benign or malignant, pose considerable health risks, with malignant tumors being more perilous due to their swift and uncontrolled proliferation. Timely identification is crucial for improving patient outcomes, particularly in nations such as Bangladesh, where healthcare infrastructure is constrained. Manual MRI analysis is arduous and susceptible to inaccuracies, rendering it inefficient for prompt diagnosis. This research sought to tackle these problems by creating an automated brain tumor classification system using MRI data obtained from multiple hospitals in Bangladesh. Advanced deep learning models, including VGG16, VGG19, and ResNet50, were used to classify glioma, meningioma, and other brain tumor types. Explainable AI (XAI) methods such as Grad-CAM and Grad-CAM++ were employed to improve model interpretability by highlighting the critical areas in MRI scans that influenced the classification. VGG16 achieved the highest accuracy, attaining 99.17%. The integration of XAI enhanced the system's transparency and stability, rendering it more appropriate for clinical application in resource-limited environments such as Bangladesh. This study highlights the capability of deep learning models, in conjunction with explainable AI, to enhance brain tumor detection and identification in areas with restricted access to advanced medical technologies.
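Grad-CAM, used here for interpretability, weights a convolutional layer's feature maps by the spatially averaged gradients of the target class score. The sketch below is a generic PyTorch implementation assuming a torchvision VGG16 and its last convolutional layer as the target; it is not the authors' exact setup, and in practice the fine-tuned model and layer of interest would be substituted.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def grad_cam(model, image, target_layer, class_idx):
    """Compute a Grad-CAM heatmap for `class_idx` on one image tensor of shape (1, 3, H, W)."""
    activations, gradients = {}, {}

    def fwd_hook(_, __, output):
        activations["value"] = output.detach()

    def bwd_hook(_, grad_in, grad_out):
        gradients["value"] = grad_out[0].detach()

    h1 = target_layer.register_forward_hook(fwd_hook)
    h2 = target_layer.register_full_backward_hook(bwd_hook)
    try:
        scores = model(image)                       # forward pass captures activations
        model.zero_grad()
        scores[0, class_idx].backward()             # backward pass captures gradients
    finally:
        h1.remove()
        h2.remove()

    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)   # GAP over spatial dims
    cam = F.relu((weights * activations["value"]).sum(dim=1))     # weighted sum of feature maps
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                        mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)      # normalize to [0, 1]
    return cam.squeeze()

# Toy usage: weights=None keeps the sketch self-contained; a trained model is needed for
# meaningful heatmaps.
model = models.vgg16(weights=None).eval()
heatmap = grad_cam(model, torch.rand(1, 3, 224, 224), model.features[28], class_idx=0)
```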

RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints

Tan-Hanh Pham, Chris Ngo

arXiv preprint · Jun 7, 2025
The growing integration of vision-language models (VLMs) in medical applications offers promising support for diagnostic reasoning. However, current medical VLMs often face limitations in generalization, transparency, and computational efficiency, barriers that hinder deployment in real-world, resource-constrained settings. To address these challenges, we propose a Reasoning-Aware Reinforcement Learning framework, RARL, that enhances the reasoning capabilities of medical VLMs while remaining efficient and adaptable to low-resource environments. Our approach fine-tunes a lightweight base model, Qwen2-VL-2B-Instruct, using Low-Rank Adaptation and custom reward functions that jointly consider diagnostic accuracy and reasoning quality. Training is performed on a single NVIDIA A100-PCIE-40GB GPU, demonstrating the feasibility of deploying such models in constrained environments. We evaluate the model using an LLM-as-judge framework that scores both correctness and explanation quality. Experimental results show that RARL significantly improves VLM performance in medical image analysis and clinical reasoning, outperforming supervised fine-tuning on reasoning-focused tasks by approximately 7.78%, while requiring fewer computational resources. Additionally, we demonstrate the generalization capabilities of our approach on unseen datasets, achieving around 27% improved performance compared to supervised fine-tuning and about 4% over traditional RL fine-tuning. Our experiments also illustrate that diversity prompting during training and reasoning prompting during inference are crucial for enhancing VLM performance. Our findings highlight the potential of reasoning-guided learning and reasoning prompting to steer medical VLMs toward more transparent, accurate, and resource-efficient clinical decision-making. Code and data are publicly available.
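The abstract describes LoRA fine-tuning of Qwen2-VL-2B-Instruct with a reward combining diagnostic accuracy and reasoning quality. A minimal sketch of such a setup is shown below; the LoRA rank, target modules, reward weighting, and the use of an LLM-judge score in [0, 1] are illustrative assumptions rather than the authors' configuration, and loading Qwen2-VL requires a recent transformers release.

```python
from transformers import Qwen2VLForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Wrap the base VLM with LoRA adapters; rank/alpha/target modules are illustrative choices.
base = Qwen2VLForConditionalGeneration.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable

def reward(answer_correct: bool, reasoning_score: float, w_acc: float = 0.7) -> float:
    """Combine diagnostic accuracy with a reasoning-quality score (e.g., from an LLM judge)."""
    return w_acc * float(answer_correct) + (1.0 - w_acc) * reasoning_score
```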

Lack of children in public medical imaging data points to growing age bias in biomedical AI

Hua, S. B. Z., Heller, N., He, P., Towbin, A. J., Chen, I., Lu, A., Erdman, L.

medRxiv preprint · Jun 7, 2025
Artificial intelligence (AI) is rapidly transforming healthcare, but its benefits are not reaching all patients equally. Children remain overlooked, with only 17% of FDA-approved medical AI devices labeled for pediatric use. In this work, we demonstrate that this exclusion may stem from a fundamental data gap. Our systematic review of 181 public medical imaging datasets reveals that children represent just under 1% of available data, while the majority of machine learning imaging conference papers we surveyed used publicly available data for methods development. As with other systematic biases in model development, past studies have shown that adequate pediatric representation in the data used to build models intended for pediatric patients is essential for model performance in that population. We add to these findings, showing that adult-trained chest radiograph models exhibit significant age bias when applied to pediatric populations, with higher false positive rates in younger children. This work underscores the urgent need for increased pediatric representation in publicly accessible medical datasets. We provide actionable recommendations for researchers, policymakers, and data curators to address this age equity gap and ensure AI benefits patients of all ages.
1-2 sentence summary: Our analysis reveals a critical healthcare age disparity: children represent less than 1% of public medical imaging datasets. This gap in representation leads to biased predictions across medical image foundation models, with the youngest patients facing the highest risk of misdiagnosis.

Dual-stage AI system for Pathologist-Free Tumor Detection and Subtyping in Oral Squamous Cell Carcinoma

Chaudhary, N., Muddemanavar, P., Singh, D. K., Rai, A., Mishra, D., SV, S., Augustine, J., Chandra, A., Chaurasia, A., Ahmad, T.

medRxiv preprint · Jun 6, 2025
Background: Accurate histological grading of oral squamous cell carcinoma (OSCC) is critical for prognosis and treatment planning. Current methods lack automation for OSCC detection, subtyping, and differentiation from high-risk pre-malignant conditions like oral submucous fibrosis (OSMF). Further, whole-slide image (WSI) analysis is time-consuming and variable, limiting consistency. We present a clinically relevant deep learning framework that leverages weakly supervised learning and attention-based multiple instance learning (MIL) to enable automated OSCC grading and early prediction of malignant transformation from OSMF.
Methods: We conducted a multi-institutional retrospective cohort study using a curated dataset of 1,925 whole-slide images (WSIs), including 1,586 OSCC cases stratified into well-, moderately-, and poorly-differentiated subtypes (WD, MD, and PD), 128 normal controls, and 211 OSMF and OSMF-with-OSCC cases. We developed a two-stage deep learning pipeline named OralPatho. In stage one, an attention-based multiple instance learning (MIL) model was trained to perform binary classification (normal vs OSCC). In stage two, a gated attention mechanism with top-K patch selection was employed to classify the OSCC subtypes. Model performance was assessed using stratified 3-fold cross-validation and external validation on an independent dataset.
Findings: The binary classifier demonstrated robust performance with a mean F1-score exceeding 0.93 across all validation folds. The multiclass model achieved consistent macro-F1 scores of 0.72, 0.70, and 0.68, along with AUCs of 0.79 for WD, 0.71 for MD, and 0.61 for PD OSCC subtypes. Model generalizability was validated using an independent external dataset. Attention maps reliably highlighted clinically relevant histological features, supporting the system's interpretability and diagnostic alignment with expert pathological assessment.
Interpretation: This study demonstrates the feasibility of attention-based, weakly supervised learning for accurate OSCC grading from whole-slide images. OralPatho combines high diagnostic performance with real-time interpretability, making it a scalable solution for both advanced pathology labs and resource-limited settings.
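Gated attention-based MIL pooling with top-K patch selection, as described in the Methods above, can be sketched in PyTorch roughly as follows (in the style of standard gated-attention MIL); the feature dimension, hidden size, and K are illustrative assumptions, and this is not the OralPatho implementation.

```python
import torch
import torch.nn as nn

class GatedAttentionMIL(nn.Module):
    """Aggregate patch embeddings from one WSI into a slide-level prediction."""
    def __init__(self, feat_dim=512, hidden_dim=128, num_classes=3, top_k=64):
        super().__init__()
        self.attn_v = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.Tanh())
        self.attn_u = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.Sigmoid())
        self.attn_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(feat_dim, num_classes)
        self.top_k = top_k

    def forward(self, patches):                       # patches: (num_patches, feat_dim)
        # Gated attention score per patch, then keep only the K most attended patches.
        scores = self.attn_w(self.attn_v(patches) * self.attn_u(patches)).squeeze(-1)
        k = min(self.top_k, patches.shape[0])
        top_scores, idx = scores.topk(k)
        weights = torch.softmax(top_scores, dim=0)
        slide_embedding = (weights.unsqueeze(-1) * patches[idx]).sum(dim=0)
        return self.classifier(slide_embedding), weights

bag = torch.randn(1000, 512)                          # e.g., 1000 patch features from one WSI
logits, attn = GatedAttentionMIL()(bag)
```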

Can Transfer Learning Improve Supervised Segmentation of White Matter Bundles in Glioma Patients?

Riccardi, C., Ghezzi, S., Amorosino, G., Zigiotto, L., Sarubbo, S., Jovicich, J., Avesani, P.

bioRxiv preprint · Jun 6, 2025
In clinical neuroscience, the segmentation of the main white matter bundles is a prerequisite for many tasks such as pre-operative neurosurgical planning and the monitoring of neurological diseases. Automating bundle segmentation with data-driven approaches and deep learning models has shown promising accuracy in the context of healthy individuals. The lack of large clinical datasets is preventing the translation of these results to patients. Inference on patient data with models trained on a healthy population is not effective because of domain shift. This study carries out an empirical analysis to investigate how transfer learning might be beneficial in overcoming these limitations. For our analysis, we consider a public dataset with hundreds of individuals and a clinical dataset of glioma patients. We focus our preliminary investigation on the corticospinal tract. The results show that transfer learning might be effective in partially overcoming the domain shift.

Full Conformal Adaptation of Medical Vision-Language Models

Julio Silva-Rodríguez, Leo Fillioux, Paul-Henry Cournède, Maria Vakalopoulou, Stergios Christodoulidis, Ismail Ben Ayed, Jose Dolz

arXiv preprint · Jun 6, 2025
Vision-language models (VLMs) pre-trained at large scale have shown unprecedented transferability capabilities and are being progressively integrated into medical image analysis. Although their discriminative potential has been widely explored, their reliability remains overlooked. This work investigates their behavior under the increasingly popular split conformal prediction (SCP) framework, which theoretically guarantees a given error level on output sets by leveraging a labeled calibration set. However, the zero-shot performance of VLMs is inherently limited, and common practice involves few-shot transfer learning pipelines, which cannot satisfy the rigid exchangeability assumptions of SCP. To alleviate this issue, we propose full conformal adaptation, a novel setting for jointly adapting and conformalizing pre-trained foundation models, which operates transductively over each test data point using a few-shot adaptation set. Moreover, we complement this framework with SS-Text, a novel training-free linear probe solver for VLMs that alleviates the computational cost of such a transductive approach. We provide comprehensive experiments using 3 different modality-specialized medical VLMs and 9 adaptation tasks. Our framework requires exactly the same data as SCP, and provides consistent relative improvements of up to 27% on set efficiency while maintaining the same coverage guarantees.
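For context, split conformal prediction builds prediction sets by thresholding nonconformity scores at a finite-sample-corrected quantile computed on the calibration set. The minimal sketch below uses 1 minus the softmax probability of the true class as the score, which is one common choice and not necessarily the one used in this paper.

```python
import numpy as np

def scp_threshold(cal_probs: np.ndarray, cal_labels: np.ndarray, alpha: float = 0.1) -> float:
    """cal_probs: (n, num_classes) softmax outputs on the labeled calibration set."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]        # nonconformity scores
    q_level = np.ceil((n + 1) * (1 - alpha)) / n              # finite-sample correction
    return float(np.quantile(scores, min(q_level, 1.0), method="higher"))

def prediction_set(test_probs: np.ndarray, qhat: float) -> np.ndarray:
    """Include every class whose nonconformity score 1 - p falls below the threshold."""
    return np.where(1.0 - test_probs <= qhat)[0]
```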
