Page 45 of 58575 results

Artificial Intelligence for Teaching Case Curation: Evaluating Model Performance on Imaging Report Discrepancies.

Bartley M, Huemann Z, Hu J, Tie X, Ross AB, Kennedy T, Warner JD, Bradshaw T, Lawrence EM

PubMed paper, Jun 1 2025
Assess the feasibility of using a large language model (LLM) to identify valuable radiology teaching cases through report discrepancy detection. Retrospective study included after-hours head CT and musculoskeletal radiograph exams from January 2017 to December 2021. Discrepancy level between trainee's preliminary interpretation and final attending report was annotated on a 5-point scale. RadBERT, an LLM pretrained on a vast corpus of radiology text, was fine-tuned for discrepancy detection. For comparison and to ensure the robustness of the approach, Mixtral 8×7B, Mistral 7B, and Llama2 were also evaluated. The model's performance in detecting discrepancies was evaluated using a randomly selected hold-out test set. A subset of discrepant cases identified by the LLM was compared to a random case set by recording clinical parameters, discrepant pathology, and evaluating possible educational value. The F1 statistic was used for model comparison. Pearson's chi-squared test was employed to assess discrepancy prevalence and score between groups (significance set at p<0.05). The fine-tuned LLM achieved an overall accuracy of 90.5% with a specificity of 95.5% and a sensitivity of 66.3% for discrepancy detection. The model sensitivity significantly improved with higher discrepancy scores, 49% (34/70) for score 2 versus 67% (47/62) for score 3, and 81% (35/43) for score 4/5 (p<0.05 compared to score 2). The LLM-curated set showed a significant increase in the prevalence of all discrepancies and major discrepancies (scores 4 or 5) compared to a random case set (p<0.05 for both). Evaluation of the clinical characteristics from both the random and discrepant case sets demonstrated a broad mix of pathologies and discrepancy types. An LLM can detect trainee report discrepancies, including both higher and lower-scoring discrepancies, and may improve case set curation for resident education as well as serve as a trainee oversight tool.
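The accuracy, sensitivity, specificity, and F1 figures quoted in this abstract are all derived from a binary confusion matrix. The following is a minimal sketch of those definitions; the function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # share of true discrepancies that were flagged
    specificity = tn / (tn + fp)   # share of concordant reports left unflagged
    precision = tp / (tp + fp)     # share of flagged reports that were true discrepancies
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = binary_metrics(tp=80, fp=10, tn=190, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With an imbalanced test set, accuracy is dominated by the majority (non-discrepant) class, which is one reason a high accuracy and specificity can coexist with a much lower sensitivity.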

Expanded AI learning: AI as a Tool for Human Learning.

Faghani S, Tiegs-Heiden CA, Moassefi M, Powell GM, Ringler MD, Erickson BJ, Rhodes NG

PubMed paper, Jun 1 2025
To demonstrate that a deep learning (DL) model can be employed as a teaching tool to improve radiologists' ability to perform a subsequent imaging task without additional artificial intelligence (AI) assistance at time of image interpretation. Three human readers were tasked to categorize 50 frontal knee radiographs by male and female sex before and after reviewing data derived from our DL model. The model's high accuracy in performing this task was revealed to the human subjects, who were also supplied the DL model's resultant occlusion interpretation maps ("heat maps") to serve as a teaching tool for study before final testing. Two weeks later, the three human readers performed the same task with a new set of 50 radiographs. The average accuracy of the three human readers was initially 0.59 (95%CI: 0.59-0.65), not statistically different than guessing given our sample skew. The DL model categorized sex with 0.96 accuracy. After study of AI-derived "heat maps" and associated radiographs, the average accuracy of the human readers, without the direct help of AI, on the new set of radiographs increased to 0.80 (95%CI: 0.73-0.86), a significant improvement (p=0.0270). AI-derived data can be used as a teaching tool to improve radiologists' own ability to perform an imaging task. This is an idea that we have not before seen advanced in the radiology literature. AI can be used as a teaching tool to improve the intrinsic accuracy of radiologists, even without the concurrent use of AI.

Development and interpretation of a pathomics-based model for the prediction of immune therapy response in colorectal cancer.

Luo Y, Tian Q, Xu L, Zeng D, Zhang H, Zeng T, Tang H, Wang C, Chen Y

PubMed paper, May 31 2025
Colorectal cancer (CRC) is the third most common malignancy and the second leading cause of cancer-related deaths worldwide, with a 5-year survival rate below 20%. Immunotherapy, particularly immune checkpoint blockade (ICB)-based therapies, has become an important approach for CRC treatment. However, only specific patient subsets demonstrate significant clinical benefits. Although the TIDE algorithm can predict immunotherapy responses, the reliance on transcriptome sequencing data limits its clinical applicability. Recent advances in artificial intelligence and computational pathology provide new avenues for medical image analysis. In this study, we classified TCGA-CRC samples into immunotherapy responder and non-responder groups using the TIDE algorithm. Further, a pathomics model based on convolutional neural networks was constructed to directly predict immunotherapy responses from histopathological images. Single-cell analysis revealed that fibroblasts may induce immunotherapy resistance in CRC through collagen-CD44 and ITGA1 + ITGB1 signaling axes. The developed pathomics model demonstrated excellent classification performance in the test set, with an AUC of 0.88 at the patch level and 0.85 at the patient level. Moreover, key pathomics features were identified through SHAP analysis. This innovative predictive tool provides a novel method for clinical decision-making in CRC immunotherapy, with potential to optimize treatment strategies and advance precision medicine.

ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Ruiming Min, Minghao Liu

arXiv preprint, May 31 2025
With the advancement of modern medicine and the development of technologies such as MRI, CT, and cellular analysis, it has become increasingly critical for clinicians to accurately interpret various diagnostic images. However, modern medical education often faces challenges due to limited access to high-quality teaching materials, stemming from privacy concerns and a shortage of educational resources (Balogh et al., 2015). In this context, image data generated by machine learning models, particularly generative models, presents a promising solution. These models can create diverse and comparable imaging datasets without compromising patient privacy, thereby supporting modern medical education. In this study, we explore the use of convolutional neural networks (CNNs) and CycleGAN (Zhu et al., 2017) for generating synthetic medical images. The source code is available at https://github.com/mliuby/COMP4211-Project.

Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining

Daniele Molino, Camillo Maria Caruso, Filippo Ruffini, Paolo Soda, Valerio Guarrasi

arXiv preprint, May 31 2025
Objective: While recent advances in text-conditioned generative models have enabled the synthesis of realistic medical images, progress has been largely confined to 2D modalities such as chest X-rays. Extending text-to-image generation to volumetric Computed Tomography (CT) remains a significant challenge, due to its high dimensionality, anatomical complexity, and the absence of robust frameworks that align vision-language data in 3D medical imaging. Methods: We introduce a novel architecture for Text-to-CT generation that combines a latent diffusion model with a 3D contrastive vision-language pretraining scheme. Our approach leverages a dual-encoder CLIP-style model trained on paired CT volumes and radiology reports to establish a shared embedding space, which serves as the conditioning input for generation. CT volumes are compressed into a low-dimensional latent space via a pretrained volumetric VAE, enabling efficient 3D denoising diffusion without requiring external super-resolution stages. Results: We evaluate our method on the CT-RATE dataset and conduct a comprehensive assessment of image fidelity, clinical relevance, and semantic alignment. Our model achieves competitive performance across all tasks, significantly outperforming prior baselines for text-to-CT generation. Moreover, we demonstrate that CT scans synthesized by our framework can effectively augment real data, improving downstream diagnostic performance. Conclusion: Our results show that modality-specific vision-language alignment is a key component for high-quality 3D medical image generation. By integrating contrastive pretraining and volumetric diffusion, our method offers a scalable and controllable solution for synthesizing clinically meaningful CT volumes from text, paving the way for new applications in data augmentation, medical education, and automated clinical simulation.

CineMA: A Foundation Model for Cine Cardiac MRI

Yunguan Fu, Weixi Yi, Charlotte Manisty, Anish N Bhuva, Thomas A Treibel, James C Moon, Matthew J Clarkson, Rhodri Huw Davies, Yipeng Hu

arXiv preprint, May 31 2025
Cardiac magnetic resonance (CMR) is a key investigation in clinical cardiovascular medicine and has been used extensively in population research. However, extracting clinically important measurements such as ejection fraction for diagnosing cardiovascular diseases remains time-consuming and subjective. We developed CineMA, a foundation AI model automating these tasks with limited labels. CineMA is a self-supervised autoencoder model trained on 74,916 cine CMR studies to reconstruct images from masked inputs. After fine-tuning, it was evaluated across eight datasets on 23 tasks from four categories: ventricle and myocardium segmentation, left and right ventricle ejection fraction calculation, disease detection and classification, and landmark localisation. CineMA is the first foundation model for cine CMR to match or outperform convolutional neural networks (CNNs). CineMA demonstrated greater label efficiency than CNNs, achieving comparable or better performance with fewer annotations. This reduces the burden of clinician labelling and supports replacing task-specific training with fine-tuning foundation models in future cardiac imaging applications. Models and code for pre-training and fine-tuning are available at https://github.com/mathpluscode/CineMA, democratising access to high-performance models that otherwise require substantial computational resources, promoting reproducibility and accelerating clinical translation.
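Reconstruction from masked inputs generally starts by hiding a random subset of image patches and feeding only the visible ones to the encoder. The sketch below illustrates that selection step alone. The function name `mask_patches`, the patch count, and the 0.75 mask ratio are assumptions made for illustration; the abstract does not state CineMA's actual masking settings.

```python
import random


def mask_patches(num_patches, mask_ratio, seed=0):
    """Randomly split patch indices into visible and masked sets."""
    rng = random.Random(seed)                      # seeded for reproducibility
    num_masked = int(num_patches * mask_ratio)     # number of patches to hide
    masked = set(rng.sample(range(num_patches), num_masked))
    visible = [i for i in range(num_patches) if i not in masked]
    return visible, sorted(masked)


# Hypothetical setting: 16 patches, 75% of them masked.
visible, masked = mask_patches(num_patches=16, mask_ratio=0.75)
print("visible patch indices:", visible)
print("masked patch indices:", masked)
```

In a masked-autoencoder setup, the reconstruction loss is typically computed on the masked patches, so the model has to infer hidden content from the visible context.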

Mammogram mastery: Breast cancer image classification using an ensemble of deep learning with explainable artificial intelligence.

Kumar Mondal P, Jahan MK, Byeon H

PubMed paper, May 30 2025
Breast cancer is a serious public health problem and is one of the leading causes of cancer-related deaths in women worldwide. Early detection of the disease can significantly increase the chances of survival. However, manual analysis of mammogram mastery images is complex and time-consuming, which can lead to disagreements among experts. For this reason, automated diagnostic systems can play a significant role in increasing the accuracy and efficiency of diagnosis. In this study, we present an effective deep learning (DL) method, which classifies mammogram mastery images into cancer and noncancer categories using a collected dataset. Our model is pretrained based on the Inception V3 architecture. First, we run 5-fold cross-validation tests on the fully trained and fine-tuned Inception V3 model. Next, we apply a combined method based on likelihood and mean, where the fine-tuned Inception V3 model demonstrated superior performance in classification. Our DL model achieved 99% accuracy and 99% F1 score. In addition, interpretable AI techniques were used to enhance the transparency of the classification process. The finely tuned Inception V3 model demonstrated the highest performance in classification, confirming its effectiveness in automatic breast cancer detection. The experimental results clearly indicate that our proposed DL-based method for breast cancer image classification is highly effective, especially its application in image-based diagnostic methods. This study brings to the fore the huge potential of AI-based solutions, which can play a significant role in increasing the accuracy and reliability of breast cancer diagnosis.

Federated Foundation Model for GI Endoscopy Images

Alina Devkota, Annahita Amireskandari, Joel Palko, Shyam Thakkar, Donald Adjeroh, Xiajun Jiang, Binod Bhattarai, Prashnna K. Gyawali

arXiv preprint, May 30 2025
Gastrointestinal (GI) endoscopy is essential in identifying GI tract abnormalities in order to detect diseases in their early stages and improve patient outcomes. Although deep learning has shown success in supporting GI diagnostics and decision-making, these models require curated datasets with labels that are expensive to acquire. Foundation models offer a promising solution by learning general-purpose representations, which can be finetuned for specific tasks, overcoming data scarcity. Developing foundation models for medical imaging holds significant potential, but the sensitive and protected nature of medical data presents unique challenges. Foundation model training typically requires extensive datasets, and while hospitals generate large volumes of data, privacy restrictions prevent direct data sharing, making foundation model training infeasible in most scenarios. In this work, we propose a federated learning (FL) framework for training foundation models for gastroendoscopy imaging, enabling data to remain within local hospital environments while contributing to a shared model. We explore several established FL algorithms, assessing their suitability for training foundation models without relying on task-specific labels, conducting experiments in both homogeneous and heterogeneous settings. We evaluate the trained foundation model on three critical downstream tasks--classification, detection, and segmentation--and demonstrate that it achieves improved performance across all tasks, highlighting the effectiveness of our approach in a federated, privacy-preserving setting.
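The abstract does not name the FL algorithms it compares. As an illustration of how a shared model can be built without moving data, the sketch below shows the aggregation step of federated averaging (FedAvg), assumed here as one representative established algorithm. The function name `fedavg`, the parameter vectors, and the dataset sizes are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(num_params)
    ]


# Hypothetical example: three hospitals, each holding a two-parameter local model.
hospital_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
hospital_sizes = [100, 100, 200]   # number of local training images per hospital

global_weights = fedavg(hospital_weights, hospital_sizes)
print(global_weights)
```

Only model parameters leave each site in this scheme; the images themselves stay within the local hospital environment, which is the property the abstract emphasizes.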

Standardizing Heterogeneous MRI Series Description Metadata Using Large Language Models.

Kamel PI, Doo FX, Savani D, Kanhere A, Yi PH, Parekh VS

PubMed paper, May 29 2025
MRI metadata, particularly free-text series descriptions (SDs) used to identify sequences, are highly heterogeneous due to variable inputs by manufacturers and technologists. This variability poses challenges in correctly identifying series for hanging protocols and dataset curation. The purpose of this study was to evaluate the ability of large language models (LLMs) to automatically classify MRI SDs. We analyzed non-contrast brain MRIs performed between 2016 and 2022 at our institution, identifying all unique SDs in the metadata. A practicing neuroradiologist manually classified the SD text into: "T1," "T2," "T2/FLAIR," "SWI," "DWI," "ADC," or "Other." Then, various LLMs, including GPT 3.5 Turbo, GPT-4, GPT-4o, Llama 3 8b, and Llama 3 70b, were asked to classify each SD into one of the sequence categories. Model performances were compared to ground truth classification using area under the curve (AUC) as the primary metric. Additionally, GPT-4o was tasked with generating regular expression templates to match each category. In 2510 MRI brain examinations, there were 1395 unique SDs, with 727/1395 (52.1%) appearing only once, indicating high variability. GPT-4o demonstrated the highest performance, achieving an average AUC of 0.983 ± 0.020 for all series with detailed prompting. GPT models significantly outperformed Llama models, with smaller differences within the GPT family. Regular expression generation was inconsistent, demonstrating an average AUC of 0.774 ± 0.161 for all sequences. Our findings suggest that LLMs are effective for interpreting and standardizing heterogeneous MRI SDs.
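To make the regular-expression approach concrete, here is a minimal sketch of rule-based SD classification into the seven categories listed above. The patterns, their ordering, and the sample series descriptions are all hypothetical; they are not the templates generated by GPT-4o in the study.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
# More specific categories come first (e.g. ADC before DWI, FLAIR before T2).
PATTERNS = [
    ("ADC", re.compile(r"\badc\b|apparent diffusion")),
    ("DWI", re.compile(r"\bdwi\b|diffusion")),
    ("SWI", re.compile(r"\bswi\b|susceptibility")),
    ("T2/FLAIR", re.compile(r"\bflair\b")),
    ("T2", re.compile(r"\bt2\b")),
    ("T1", re.compile(r"\bt1\b|\bmprage\b")),
]


def classify_sd(series_description):
    """Map a free-text MRI series description to a sequence category."""
    # Lowercase and turn separators such as "_" or "-" into spaces so \b works.
    normalized = re.sub(r"[^a-z0-9]+", " ", series_description.lower())
    for label, pattern in PATTERNS:
        if pattern.search(normalized):
            return label
    return "Other"


# Hypothetical series descriptions for illustration.
for sd in ["AX_T2_FLAIR", "Ax DWI b1000", "SAG T1 MPRAGE", "Localizer"]:
    print(sd, "->", classify_sd(sd))
```

Hand-written rules like these are brittle when more than half of the descriptions are unique, which is consistent with the inconsistent regular-expression results the abstract reports.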

Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning

Jinquan Guan, Qi Chen, Lizhou Liang, Yuhang Liu, Vu Minh Hieu Phan, Minh-Son To, Jian Chen, Yutong Xie

arXiv preprint, May 29 2025
Artificial intelligence (AI)-based chest X-ray (CXR) interpretation assistants have demonstrated significant progress and are increasingly being applied in clinical settings. However, contemporary medical AI models often adhere to a simplistic input-to-output paradigm, directly processing an image and an instruction to generate a result, where the instructions may be integral to the model's architecture. This approach overlooks the modeling of the inherent diagnostic reasoning in chest X-ray interpretation. Such reasoning is typically sequential, where each interpretive stage considers the images, the current task, and the contextual information from previous stages. This oversight leads to several shortcomings, including misalignment with clinical scenarios, contextless reasoning, and untraceable errors. To fill this gap, we construct CXRTrek, a new multi-stage visual question answering (VQA) dataset for CXR interpretation. The dataset is designed to explicitly simulate the diagnostic reasoning process employed by radiologists in real-world clinical settings for the first time. CXRTrek covers 8 sequential diagnostic stages, comprising 428,966 samples and over 11 million question-answer (Q&A) pairs, with an average of 26.29 Q&A pairs per sample. Building on the CXRTrek dataset, we propose a new vision-language large model (VLLM), CXRTrekNet, specifically designed to incorporate the clinical reasoning flow into the VLLM framework. CXRTrekNet effectively models the dependencies between diagnostic stages and captures reasoning patterns within the radiological context. Trained on our dataset, the model consistently outperforms existing medical VLLMs on the CXRTrek benchmarks and demonstrates superior generalization across multiple tasks on five diverse external datasets. The dataset and model can be found in our repository (https://github.com/guanjinquan/CXRTrek).