
AIMR-MediTell: Attention-Infused Mask RNN for Medical Image Interpretation and Report Generation.

Chen L, Yang L, Bedir O

PubMed · Aug 7, 2025
Medical diagnostics often rely on the interpretation of complex medical images. However, manual analysis and report generation by medical practitioners are time-consuming, and the inherent ambiguity in chest X-rays makes it difficult for automated systems to produce interpretable results. To address this, we propose the Attention-Infused Mask Recurrent Neural Network (AIMR-MediTell), a deep learning framework that integrates instance segmentation using Mask R-CNN with attention-based feature extraction to identify and highlight abnormal regions in chest X-rays. The framework also incorporates an encoder-decoder structure with pretrained BioWordVec embeddings to generate explanatory reports from the augmented images. We evaluated AIMR-MediTell on the Open-I dataset, achieving a BLEU-4 score of 0.415 and outperforming existing models. Our results demonstrate the effectiveness of the proposed model, showing that incorporating masked regions enhances report accuracy and interpretability. By identifying abnormal regions and automating report generation for X-ray images, our approach has the potential to significantly improve the efficiency and accuracy of medical image analysis.
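Not the authors' implementation, but a minimal PyTorch sketch of the masked-attention fusion idea the abstract describes; shapes, module sizes, and the GRU seeding are invented for illustration:

```python
import torch
import torch.nn as nn

class MaskAttentionFusion(nn.Module):
    """Minimal sketch: weight CNN feature maps by an abnormality mask,
    then attend over spatial positions to build a report-decoder context."""
    def __init__(self, feat_dim=256, hid_dim=512):
        super().__init__()
        self.attn = nn.Linear(feat_dim, 1)
        self.rnn = nn.GRU(feat_dim, hid_dim, batch_first=True)

    def forward(self, feats, mask):
        # feats: (B, C, H, W) backbone features; mask: (B, 1, H, W) in [0, 1]
        masked = feats * mask                          # highlight abnormal regions
        tokens = masked.flatten(2).transpose(1, 2)     # (B, H*W, C)
        weights = torch.softmax(self.attn(tokens), 1)  # spatial attention
        context = (weights * tokens).sum(1, keepdim=True)
        out, _ = self.rnn(context)                     # seed for the report decoder
        return out

fusion = MaskAttentionFusion()
ctx = fusion(torch.randn(2, 256, 16, 16), torch.rand(2, 1, 16, 16))
print(ctx.shape)  # torch.Size([2, 1, 512])
```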

Enhancing image retrieval through optimal barcode representation.

Khosrowshahli R, Kheiri F, Asilian Bidgoli A, Tizhoosh HR, Makrehchi M, Rahnamayan S

PubMed · Aug 7, 2025
Data binary encoding has proven to be a versatile tool for optimizing data processing and memory efficiency in various machine learning applications. This includes deep barcoding, generating barcodes from deep learning feature extraction for image retrieval of similar cases among millions of indexed images. Despite the recent advancement in barcode generation methods, converting high-dimensional feature vectors (e.g., deep features) to compact and discriminative binary barcodes is still an urgent necessity and remains an unresolved problem. Difference-based binarization of features is one of the most efficient binarization methods, transforming continuous feature vectors into binary sequences and capturing trend information. However, the performance of this method is highly dependent on the ordering of the input features, leading to a significant combinatorial challenge. This research addresses this problem by optimizing feature sequences based on retrieval performance metrics. Our approach identifies optimal feature orderings, leading to substantial improvements in retrieval effectiveness compared to arbitrary or default orderings. We assess the performance of the proposed approach in various medical and non-medical image retrieval tasks. This evaluation includes medical images from The Cancer Genome Atlas (TCGA), a comprehensive publicly available dataset, as well as COVID-19 Chest X-rays dataset. In addition, we evaluate the proposed approach on non-medical benchmark image datasets, such as CIFAR-10, CIFAR-100, and Fashion-MNIST. Our findings demonstrate the importance of optimizing binary barcode representation to significantly enhance accuracy for fast image retrieval across a wide range of applications, highlighting the applicability and potential of barcodes in various domains.
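For concreteness, a minimal sketch of difference-based binarization, the method the abstract builds on. The ordering search itself (e.g., an evolutionary optimizer scored by a retrieval metric such as mAP on a validation set) is omitted:

```python
import numpy as np

def difference_barcode(features: np.ndarray, order: np.ndarray) -> np.ndarray:
    """Difference-based binarization: permute the feature vector by `order`,
    then emit 1 where the sequence increases and 0 where it does not."""
    f = features[order]
    return (np.diff(f) > 0).astype(np.uint8)

rng = np.random.default_rng(0)
feat = rng.normal(size=8)
print(difference_barcode(feat, np.arange(8)))       # default ordering
print(difference_barcode(feat, rng.permutation(8))) # another ordering, another barcode
# The paper's contribution is choosing `order` to maximize retrieval
# performance rather than accepting an arbitrary or default ordering.
```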

Multi-Modal and Multi-View Fusion Classifier for Craniosynostosis Diagnosis.

Kim DY, Kim JW, Kim SK, Kim YG

PubMed · Aug 7, 2025
The diagnosis of craniosynostosis, a condition involving the premature fusion of cranial sutures in infants, is essential for ensuring timely treatment and optimal surgical outcomes. Current diagnostic approaches often require CT scans, which expose children to significant radiation risks. To address this, we present a novel deep learning model that uses multi-view X-ray images for craniosynostosis detection. The proposed model integrates advanced multi-view fusion (MVF) and cross-attention mechanisms, effectively combining features from three X-ray views (AP, lateral right, lateral left) with patient metadata (age, sex). By leveraging these techniques, the model captures comprehensive semantic and structural information for high diagnostic accuracy from plain radiographs, avoiding the higher radiation exposure of CT. Tested on a dataset of 882 X-ray images from 294 pediatric patients, the model achieved an AUROC of 0.975, an F1-score of 0.882, a sensitivity of 0.878, and a specificity of 0.937. Grad-CAM visualizations further validated its ability to localize disease-relevant regions using only classification annotations. The model demonstrates the potential to improve pediatric care by providing a safer, cost-effective alternative to CT scans.
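A rough sketch of the fusion pattern described, under assumed dimensions (per-view embeddings and a two-field metadata vector; none of these sizes come from the paper):

```python
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    """Sketch: fuse AP / lateral-right / lateral-left embeddings with
    cross-attention, then concatenate patient metadata for classification."""
    def __init__(self, dim=256, meta_dim=2):
        super().__init__()
        self.xattn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim + meta_dim, 1)

    def forward(self, views, meta):
        # views: (B, 3, dim), one embedding per X-ray view; meta: (B, 2) = age, sex
        fused, _ = self.xattn(views, views, views)  # each view attends to the others
        pooled = fused.mean(dim=1)                  # (B, dim)
        return torch.sigmoid(self.head(torch.cat([pooled, meta], dim=-1)))

model = MultiViewFusion()
prob = model(torch.randn(4, 3, 256), torch.randn(4, 2))
print(prob.shape)  # torch.Size([4, 1])
```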

Benchmarking Uncertainty and its Disentanglement in multi-label Chest X-Ray Classification

Simon Baur, Wojciech Samek, Jackie Ma

arXiv preprint · Aug 6, 2025
Reliable uncertainty quantification is crucial for trustworthy decision-making and the deployment of AI models in medical imaging. While prior work has explored the ability of neural networks to quantify predictive, epistemic, and aleatoric uncertainty using an information-theoretic approach in synthetic or well-defined data settings such as natural image classification, its applicability to real-life medical diagnosis tasks remains underexplored. In this study, we provide an extensive uncertainty quantification benchmark for multi-label chest X-ray classification using the MIMIC-CXR-JPG dataset. We evaluate 13 uncertainty quantification methods for convolutional (ResNet) and transformer-based (Vision Transformer) architectures across a wide range of tasks. Additionally, we extend Evidential Deep Learning, HetClass NNs, and Deep Deterministic Uncertainty to the multi-label setting. Our analysis provides insights into the effectiveness of uncertainty estimation and the ability to disentangle epistemic and aleatoric uncertainty, revealing method- and architecture-specific strengths and limitations.
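The predictive/epistemic/aleatoric split referenced above is commonly computed from stochastic forward passes (e.g., MC dropout or deep ensembles). A sketch for the multi-label, per-label-sigmoid case, assuming such samples are available; this is the standard information-theoretic decomposition, not the paper's exact code:

```python
import torch

def uncertainty_decomposition(probs: torch.Tensor, eps: float = 1e-12):
    """Per label: predictive entropy = aleatoric (expected entropy)
    + epistemic (mutual information).
    probs: (S, B, L) sigmoid outputs from S stochastic forward passes."""
    p_mean = probs.mean(0)  # (B, L)
    h = lambda p: -(p * (p + eps).log() + (1 - p) * (1 - p + eps).log())
    predictive = h(p_mean)                 # H[E[p]]
    aleatoric = h(probs).mean(0)           # E[H[p]]
    epistemic = predictive - aleatoric     # mutual information, >= 0 by Jensen
    return predictive, aleatoric, epistemic

samples = torch.rand(20, 4, 14)  # e.g., 20 MC-dropout passes, 14 CXR labels
pred, alea, epis = uncertainty_decomposition(samples)
print(pred.shape, epis.min().item() >= -1e-6)  # torch.Size([4, 14]) True
```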

R2GenKG: Hierarchical Multi-modal Knowledge Graph for LLM-based Radiology Report Generation

Futian Wang, Yuhan Qiao, Xiao Wang, Fuling Wang, Yuxiang Zhang, Dengdi Sun

arXiv preprint · Aug 5, 2025
X-ray medical report generation is one of the important applications of artificial intelligence in healthcare. With the support of large foundation models, the quality of medical report generation has improved significantly. However, challenges such as hallucination and weak disease-diagnosis capability still persist. In this paper, we first construct a large-scale multi-modal medical knowledge graph (termed M3KG) from the ground-truth medical reports using GPT-4o. It contains 2,477 entities, 3 kinds of relations, 37,424 triples, and 6,943 disease-aware vision tokens for the CheXpert Plus dataset. We then sample it to obtain multi-granularity semantic graphs and use an R-GCN encoder for feature extraction. For the input X-ray image, we adopt a Swin Transformer to extract vision features, which interact with the knowledge via cross-attention. The vision tokens are fed into a Q-Former, and the disease-aware vision tokens are retrieved using another cross-attention. Finally, we use a large language model to map the semantic knowledge graph, the input X-ray image, and the disease-aware vision tokens into language descriptions. Extensive experiments on multiple datasets validate the effectiveness of our proposed knowledge graph and X-ray report generation framework. The source code of this paper will be released at https://github.com/Event-AHU/Medical_Image_Analysis.
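A sketch of the two-stage retrieval step described above: learnable queries (Q-Former style) first cross-attend to image patch features, then to knowledge-graph-derived disease tokens. Every dimension here is illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn

class DiseaseTokenRetriever(nn.Module):
    """Learnable queries ground themselves in the image, then retrieve
    disease-aware tokens from knowledge-graph embeddings via cross-attention."""
    def __init__(self, dim=768, n_queries=32):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, n_queries, dim) * 0.02)
        self.to_vision = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.to_disease = nn.MultiheadAttention(dim, 8, batch_first=True)

    def forward(self, patch_feats, disease_tokens):
        q = self.queries.expand(patch_feats.size(0), -1, -1)
        q, _ = self.to_vision(q, patch_feats, patch_feats)         # image grounding
        q, _ = self.to_disease(q, disease_tokens, disease_tokens)  # KG grounding
        return q  # tokens handed to the LLM alongside the report prompt

ret = DiseaseTokenRetriever()
out = ret(torch.randn(2, 196, 768), torch.randn(2, 50, 768))
print(out.shape)  # torch.Size([2, 32, 768])
```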

GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images

Yifei Sun, Zhanghao Chen, Hao Zheng, Yuqing Lu, Lixin Duan, Fenglei Fan, Ahmed Elazab, Xiang Wan, Changmiao Wang, Ruiquan Ge

arXiv preprint · Aug 5, 2025
Chest X-ray (CXR) imaging for pulmonary diagnosis poses significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly diffusion models, offer significant promise for minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance complete suppression of bones with preservation of local texture details. Additionally, their high computational demand and extended processing time hinder practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast, high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset (SZCH-X-Rays) and the public JSRT dataset reveal that GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.
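A toy sketch of the global-local fusion step in spirit: inside the segmented lung region, favor the detail-preserving local path; outside it, keep the global path. The blend weight `alpha` and the simple linear blend are assumptions for illustration, not the paper's formulation:

```python
import torch

def global_local_fusion(global_out, local_out, lung_mask, alpha=0.8):
    """Blend local-path output into the global-path output within the lungs."""
    w = alpha * lung_mask                  # (B, 1, H, W), mask in {0, 1}
    return w * local_out + (1 - w) * global_out

g = torch.rand(1, 1, 512, 512)             # global-path bone-suppressed image
l = torch.rand(1, 1, 512, 512)             # local-path (lung-region) result
m = (torch.rand(1, 1, 512, 512) > 0.5).float()
print(global_local_fusion(g, l, m).shape)  # torch.Size([1, 1, 512, 512])
```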

A Novel Dual-Output Deep Learning Model Based on InceptionV3 for Radiographic Bone Age and Gender Assessment.

Rayed B, Amasya H, Sezdi M

PubMed · Aug 4, 2025
Hand-wrist radiographs are used in bone age prediction. Computer-assisted clinical decision support systems offer solutions to the limitations of radiographic bone age assessment methods. In this study, a multi-output prediction model was designed to predict bone age and gender from digital hand-wrist radiographs. The InceptionV3 architecture was used as the backbone, and the model was trained and tested on the open-access dataset of the 2017 RSNA Pediatric Bone Age Challenge. A total of 14,048 samples were divided into training, validation, and testing subsets at a ratio of 7:2:1, and additional specialized convolutional neural network layers, such as a Squeeze-and-Excitation block, were implemented for robust feature management. The proposed model achieved a mean squared error of approximately 25 and a mean absolute error of 3.1 for predicting bone age. In gender classification, an accuracy of 95% and an area under the curve of 97% were achieved. The intra-class correlation coefficient for the continuous bone age predictions was 0.997, and Cohen's κ coefficient for the gender predictions was 0.898 (p < 0.001). The proposed model aims to increase efficiency by identifying common and distinct features. Based on the results, the proposed algorithm is promising; however, its mid-to-high-end hardware requirement may limit its use on local machines in the clinic. Future studies may consider expanding the dataset and simplifying the algorithms.
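A sketch of a dual-output design of the kind described: a shared InceptionV3 backbone with a regression head for bone age and a classification head for gender. Head sizes are illustrative, and the paper's extra layers (e.g., the Squeeze-and-Excitation block) are omitted:

```python
import torch
import torch.nn as nn
from torchvision import models

class DualOutputNet(nn.Module):
    """Shared InceptionV3 trunk feeding two task-specific heads."""
    def __init__(self):
        super().__init__()
        backbone = models.inception_v3(weights=None, aux_logits=False)
        backbone.fc = nn.Identity()              # expose 2048-d pooled features
        self.backbone = backbone
        self.age_head = nn.Linear(2048, 1)       # bone age (regression)
        self.gender_head = nn.Linear(2048, 1)    # gender (binary classification)

    def forward(self, x):
        feats = self.backbone(x)
        return self.age_head(feats), torch.sigmoid(self.gender_head(feats))

model = DualOutputNet().eval()
with torch.no_grad():
    age, gender = model(torch.randn(1, 3, 299, 299))  # InceptionV3 expects 299x299
print(age.shape, gender.shape)  # torch.Size([1, 1]) torch.Size([1, 1])
```

Training would combine the two losses, e.g., MSE on age plus binary cross-entropy on gender, which is how such multi-output models typically exploit shared features.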

S-RRG-Bench: Structured Radiology Report Generation with Fine-Grained Evaluation Framework

Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lingqiao Liu, Lei Wang, Luping Zhou

arXiv preprint · Aug 4, 2025
Radiology report generation (RRG) for diagnostic images, such as chest X-rays, plays a pivotal role in both clinical practice and AI. Traditional free-text reports suffer from redundancy and inconsistent language, complicating the extraction of critical clinical details. Structured radiology report generation (S-RRG) offers a promising solution by organizing information into standardized, concise formats. However, existing approaches often rely on classification or visual question answering (VQA) pipelines that require predefined label sets and produce only fragmented outputs. Template-based approaches, which generate reports by replacing keywords within fixed sentence patterns, further compromise expressiveness and often omit clinically important details. In this work, we present a novel approach to S-RRG that includes dataset construction, model training, and a new evaluation framework. We first create a robust chest X-ray dataset (MIMIC-STRUC) that includes disease names, severity levels, probabilities, and anatomical locations, ensuring that the dataset is both clinically relevant and well-structured. We then train an LLM-based model to generate standardized, high-quality reports. To assess the generated reports, we propose a specialized evaluation metric (S-Score) that measures not only disease prediction accuracy but also the precision of disease-specific details, offering a clinically meaningful measure of report quality that focuses on elements critical to decision-making and aligns more closely with human assessments. Our approach highlights the effectiveness of structured reports and the importance of a tailored evaluation metric for S-RRG.
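To make the structured format concrete, here is an illustrative entry of the kind MIMIC-STRUC is described as containing (disease, severity, probability, location). The field names and vocabulary are assumptions, not the dataset's actual schema:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Finding:
    """One structured-report entry; all field choices are hypothetical."""
    disease: str
    severity: str     # e.g., "mild" | "moderate" | "severe"
    probability: str  # hedging level, e.g., "possible" | "likely" | "definite"
    location: str     # anatomical region

report = [
    Finding("pleural effusion", "moderate", "likely", "right lower lung zone"),
    Finding("cardiomegaly", "mild", "definite", "cardiac silhouette"),
]
print(json.dumps([asdict(f) for f in report], indent=2))
```

A metric like the proposed S-Score can then compare such entries field by field, rather than relying on n-gram overlap over free text.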

A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering

Ziruo Yi, Jinyu Liu, Ting Xiao, Mark V. Albert

arXiv preprint · Aug 4, 2025
Radiology visual question answering (RVQA) provides precise answers to questions about chest X-ray images, alleviating radiologists' workload. While recent methods based on multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have shown promising progress in RVQA, they still face challenges in factual accuracy, hallucinations, and cross-modal misalignment. We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. We evaluate our system on a challenging RVQA set curated via model disagreement filtering, comprising consistently hard cases across multiple MLLMs. Extensive experiments demonstrate the superiority and effectiveness of our system over strong MLLM baselines, with a case study illustrating its reliability and interpretability. This work highlights the potential of multi-agent approaches to support explainable and trustworthy clinical AI applications that require complex reasoning.
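A minimal sketch of the agent roles described (context understanding, multimodal reasoning, answer validation). `ask_llm` is a placeholder for any MLLM call, not a real API, and the prompts are invented:

```python
from dataclasses import dataclass

def ask_llm(prompt: str) -> str:
    """Stand-in for an MLLM call; replace with a real client in practice."""
    return f"<answer to: {prompt[:40]}...>"

@dataclass
class RVQACase:
    image_desc: str  # stand-in for image features or captions
    question: str

def context_agent(case):  # summarize clinically relevant context
    return ask_llm(f"Summarize findings relevant to: {case.question}\n{case.image_desc}")

def reasoning_agent(case, context):  # reason over the distilled context
    return ask_llm(f"Context: {context}\nQuestion: {case.question}\nAnswer step by step.")

def validation_agent(case, answer):  # check the answer before it is returned
    return ask_llm(f"Validate '{answer}' for question '{case.question}'. Revise if wrong.")

case = RVQACase("frontal CXR, blunted left costophrenic angle",
                "Is there a pleural effusion?")
print(validation_agent(case, reasoning_agent(case, context_agent(case))))
```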

Temporal consistency-aware network for renal artery segmentation in X-ray angiography.

Yang B, Li C, Fezzi S, Fan Z, Wei R, Chen Y, Tavella D, Ribichini FL, Zhang S, Sharif F, Tu S

PubMed · Aug 2, 2025
Accurate segmentation of renal arteries from X-ray angiography videos is crucial for evaluating renal sympathetic denervation (RDN) procedures but remains challenging due to dynamic changes in contrast concentration and vessel morphology across frames. The purpose of this study is to propose TCA-Net, a deep learning model that improves segmentation consistency by leveraging local and global contextual information in angiography videos. Our approach utilizes a novel deep learning framework that incorporates two key modules: a local temporal window vessel enhancement module and a global vessel refinement module (GVR). The local module fuses multi-scale temporal-spatial features to improve the semantic representation of vessels in the current frame, while the GVR module integrates decoupled attention strategies (video-level and object-level attention) and gating mechanisms to refine global vessel information and eliminate redundancy. To further improve segmentation consistency, a temporal perception consistency loss function is introduced during training. We evaluated our model using 195 renal artery angiography sequences for development and tested it on an external dataset from 44 patients. The results demonstrate that TCA-Net achieves an F1-score of 0.8678 for segmenting renal arteries, outperforming existing state-of-the-art segmentation methods. We present TCA-Net, a deep learning-based model that significantly improves segmentation consistency for renal artery angiography videos. By effectively leveraging both local and global temporal contextual information, TCA-Net outperforms current methods and provides a reliable tool for assessing RDN procedures.
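In the spirit of the temporal perception consistency loss mentioned above, a sketch of a frame-to-frame consistency penalty; the paper's exact formulation may differ:

```python
import torch
import torch.nn.functional as F

def temporal_consistency_loss(probs: torch.Tensor) -> torch.Tensor:
    """Penalize abrupt frame-to-frame changes in predicted vessel masks.
    probs: (B, T, H, W) per-frame segmentation probabilities."""
    return F.l1_loss(probs[:, 1:], probs[:, :-1])

logits = torch.randn(2, 8, 128, 128, requires_grad=True)  # 8-frame sequences
loss = temporal_consistency_loss(torch.sigmoid(logits))
loss.backward()  # added to the usual per-frame segmentation loss during training
print(loss.item())
```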