Page 15 of 46453 results

GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images

Yifei Sun, Zhanghao Chen, Hao Zheng, Yuqing Lu, Lixin Duan, Fenglei Fan, Ahmed Elazab, Xiang Wan, Changmiao Wang, Ruiquan Ge

arXiv preprint · Aug 5 2025
Chest X-Ray (CXR) imaging for pulmonary diagnosis poses significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly diffusion models, offer considerable promise for minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance complete suppression of bones with preservation of local texture details. Additionally, their high computational demands and extended processing times hinder practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset, SZCH-X-Rays, and the public dataset JSRT reveal that our GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.
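The abstract does not spell out how the global and local sampling paths are merged; a minimal sketch, assuming the lung segmentation yields a soft mask in [0, 1], is a mask-weighted blend in which the local path dominates inside the lungs and the global path is kept elsewhere (the paper's actual fusion module is presumably more involved):

```python
import numpy as np

def fuse_global_local(global_out: np.ndarray,
                      local_out: np.ndarray,
                      lung_mask: np.ndarray) -> np.ndarray:
    """Blend a local (lung-region) result into a global result.

    Inside the lung mask the local path dominates; outside it the
    global path is kept unchanged. A softened mask edge gives a smooth
    transition, which helps avoid hard boundary artifacts.
    """
    return lung_mask * local_out + (1.0 - lung_mask) * global_out
```

With a binary mask the blend reduces to a simple paste of the local result into the lung region.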

S-RRG-Bench: Structured Radiology Report Generation with Fine-Grained Evaluation Framework

Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lingqiao Liu, Lei Wang, Luping Zhou

arXiv preprint · Aug 4 2025
Radiology report generation (RRG) for diagnostic images, such as chest X-rays, plays a pivotal role in both clinical practice and AI. Traditional free-text reports suffer from redundancy and inconsistent language, complicating the extraction of critical clinical details. Structured radiology report generation (S-RRG) offers a promising solution by organizing information into standardized, concise formats. However, existing approaches often rely on classification or visual question answering (VQA) pipelines that require predefined label sets and produce only fragmented outputs. Template-based approaches, which generate reports by replacing keywords within fixed sentence patterns, further compromise expressiveness and often omit clinically important details. In this work, we present a novel approach to S-RRG that includes dataset construction, model training, and a new evaluation framework. We first create a robust chest X-ray dataset (MIMIC-STRUC) that includes disease names, severity levels, probabilities, and anatomical locations, ensuring that the dataset is both clinically relevant and well-structured. We train an LLM-based model to generate standardized, high-quality reports. To assess the generated reports, we propose a specialized evaluation metric (S-Score) that measures not only disease prediction accuracy but also the precision of disease-specific details, offering a clinically meaningful measure of report quality that focuses on elements critical to clinical decision-making and aligns more closely with human assessments. Our approach highlights the effectiveness of structured reports and the importance of a tailored evaluation metric for S-RRG, providing a more clinically relevant measure of report quality.
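The published S-Score is defined in the paper, not here; purely as an illustration of the idea (disease-level agreement plus per-disease detail precision), a toy scorer over structured findings might look like this, with the weights and field names being assumptions:

```python
def s_score(pred: dict, ref: dict,
            w_disease: float = 0.5, w_detail: float = 0.5) -> float:
    """Toy structured-report score (illustrative only).

    pred/ref map a disease name to a dict of details, e.g.
    {"pneumonia": {"severity": "mild", "location": "RLL"}}.
    Combines disease-level F1 with the fraction of reference detail
    fields the prediction reproduces for each matched disease.
    """
    if not pred and not ref:
        return 1.0  # two empty reports trivially agree
    hits = set(pred) & set(ref)
    disease_f1 = 2 * len(hits) / (len(pred) + len(ref))
    if hits:
        detail = sum(
            sum(pred[d].get(k) == v for k, v in ref[d].items())
            / max(len(ref[d]), 1)
            for d in hits
        ) / len(hits)
    else:
        detail = 0.0
    return w_disease * disease_f1 + w_detail * detail
```

A prediction that names the right disease but gets one of two details wrong scores 0.75 under these weights.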

A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering

Ziruo Yi, Jinyu Liu, Ting Xiao, Mark V. Albert

arXiv preprint · Aug 4 2025
Radiology visual question answering (RVQA) provides precise answers to questions about chest X-ray images, alleviating radiologists' workload. While recent methods based on multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have shown promising progress in RVQA, they still face challenges in factual accuracy, hallucinations, and cross-modal misalignment. We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. We evaluate our system on a challenging RVQA set curated via model disagreement filtering, comprising consistently hard cases across multiple MLLMs. Extensive experiments demonstrate the superiority and effectiveness of our system over strong MLLM baselines, with a case study illustrating its reliability and interpretability. This work highlights the potential of multi-agent approaches to support explainable and trustworthy clinical AI applications that require complex reasoning.

A Novel Dual-Output Deep Learning Model Based on InceptionV3 for Radiographic Bone Age and Gender Assessment.

Rayed B, Amasya H, Sezdi M

PubMed paper · Aug 4 2025
Hand-wrist radiographs are used in bone age prediction. Computer-assisted clinical decision support systems offer solutions to the limitations of radiographic bone age assessment methods. In this study, a multi-output prediction model was designed to predict bone age and gender from digital hand-wrist radiographs. The InceptionV3 architecture was used as the backbone, and the model was trained and tested on the open-access dataset of the 2017 RSNA Pediatric Bone Age Challenge. A total of 14,048 samples were divided into training, validation, and testing subsets in a 7:2:1 ratio, and additional specialized convolutional neural network layers, such as a Squeeze-and-Excitation block, were implemented for robust feature management. The proposed model achieved a mean squared error of approximately 25 and a mean absolute error of 3.1 for predicting bone age. In gender classification, an accuracy of 95% and an area under the curve of 97% were achieved. The intra-class correlation coefficient for the continuous bone age predictions was 0.997, while Cohen's κ coefficient for the gender predictions was 0.898 (p < 0.001). The proposed model aims to increase model efficiency by identifying common and discrete features. Based on the results, the proposed algorithm is promising; however, the mid-to-high-end hardware requirement may limit its use on local machines in the clinic. Future studies may consider expanding the dataset and simplifying the algorithms.
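The Squeeze-and-Excitation block mentioned above recalibrates channels by pooling each feature map to a scalar, passing the vector through a small bottleneck, and gating the channels with a sigmoid. A framework-free numpy sketch of the forward pass (weights `w1`, `w2` are placeholders for the learned bottleneck):

```python
import numpy as np

def se_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-Excitation forward pass (inference-only sketch).

    x:  (H, W, C) feature map
    w1: (C, C//r) reduction weights; w2: (C//r, C) expansion weights
    """
    z = x.mean(axis=(0, 1))                  # squeeze: global average pool -> (C,)
    s = np.maximum(z @ w1, 0.0)              # excitation: FC + ReLU -> (C//r,)
    s = 1.0 / (1.0 + np.exp(-(s @ w2)))      # FC + sigmoid gates -> (C,)
    return x * s                             # channel-wise reweighting
```

With zero weights the gates sit at sigmoid(0) = 0.5, so every channel is halved, which makes the behavior easy to check.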

Evaluating the Efficacy of Various Deep Learning Architectures for Automated Preprocessing and Identification of Impacted Maxillary Canines in Panoramic Radiographs.

Alenezi O, Bhattacharjee T, Alseed HA, Tosun YI, Chaudhry J, Prasad S

PubMed paper · Aug 2 2025
Previously, automated cropping and reasonable classification accuracy for distinguishing impacted and non-impacted canines were demonstrated. This study evaluates multiple convolutional neural network (CNN) architectures to improve accuracy as a step toward fully automated software for identification of impacted maxillary canines (IMCs) in panoramic radiographs (PRs). Eight CNNs (SqueezeNet, GoogLeNet, NASNet-Mobile, ShuffleNet, VGG-16, ResNet 50, DenseNet 201, and Inception V3) were compared on their ability to classify two groups of PRs (impacted maxillary canines: n = 91; non-impacted: n = 91), both before preprocessing and after automated cropping. GoogLeNet achieved the highest classification performance among the tested architectures on both uncropped and cropped PRs: area under the curve (AUC) values from receiver operating characteristic (ROC) analysis were 0.90 without preprocessing and 0.99 with preprocessing, compared with 0.84 and 0.96, respectively, for SqueezeNet.
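The AUC values reported above summarize how often a classifier ranks an impacted case above a non-impacted one. For reference, AUC can be computed directly from scores via that pairwise ranking definition (a standard formula, not code from the study):

```python
def roc_auc(scores, labels):
    """AUC via pairwise ranking: the probability that a random positive
    receives a higher score than a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Perfectly separated scores give 1.0; one mis-ranked positive/negative pair out of four gives 0.75.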

Temporal consistency-aware network for renal artery segmentation in X-ray angiography.

Yang B, Li C, Fezzi S, Fan Z, Wei R, Chen Y, Tavella D, Ribichini FL, Zhang S, Sharif F, Tu S

PubMed paper · Aug 2 2025
Accurate segmentation of renal arteries from X-ray angiography videos is crucial for evaluating renal sympathetic denervation (RDN) procedures but remains challenging due to dynamic changes in contrast concentration and vessel morphology across frames. The purpose of this study is to propose TCA-Net, a deep learning model that improves segmentation consistency by leveraging local and global contextual information in angiography videos. Our approach utilizes a novel deep learning framework that incorporates two key modules: a local temporal window vessel enhancement module and a global vessel refinement module (GVR). The local module fuses multi-scale temporal-spatial features to improve the semantic representation of vessels in the current frame, while the GVR module integrates decoupled attention strategies (video-level and object-level attention) and gating mechanisms to refine global vessel information and eliminate redundancy. To further improve segmentation consistency, a temporal perception consistency loss function is introduced during training. We evaluated our model using 195 renal artery angiography sequences for development and tested it on an external dataset from 44 patients. The results demonstrate that TCA-Net achieves an F1-score of 0.8678 for segmenting renal arteries, outperforming existing state-of-the-art segmentation methods. We present TCA-Net, a deep learning-based model that significantly improves segmentation consistency for renal artery angiography videos. By effectively leveraging both local and global temporal contextual information, TCA-Net outperforms current methods and provides a reliable tool for assessing RDN procedures.
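The temporal perception consistency loss is not specified in the abstract; the simplest loss in that spirit penalizes frame-to-frame change in the predicted vessel probability maps. A minimal sketch under that assumption:

```python
import numpy as np

def temporal_consistency_loss(probs: np.ndarray) -> float:
    """Mean absolute difference between adjacent-frame predictions.

    probs: (T, H, W) per-frame vessel probability maps. Penalizing
    |S_t - S_{t-1}| discourages flicker across the angiography
    sequence; the paper's actual loss is likely more sophisticated
    (e.g. motion-aware rather than a raw frame difference).
    """
    return float(np.mean(np.abs(np.diff(probs, axis=0))))
```

In practice such a term is added to the segmentation loss with a small weight so it regularizes rather than dominates.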

M4CXR: Exploring Multitask Potentials of Multimodal Large Language Models for Chest X-Ray Interpretation.

Park J, Kim S, Yoon B, Hyun J, Choi K

PubMed paper · Aug 1 2025
The rapid evolution of artificial intelligence, especially in large language models (LLMs), has significantly impacted various domains, including healthcare. In chest X-ray (CXR) analysis, previous studies have employed LLMs, but with limitations: either underutilizing the LLMs' capability for multitask learning or lacking clinical accuracy. This article presents M4CXR, a multimodal LLM designed to enhance CXR interpretation. The model is trained on a visual instruction-following dataset that integrates various task-specific datasets in a conversational format. As a result, the model supports multiple tasks such as medical report generation (MRG), visual grounding, and visual question answering (VQA). M4CXR achieves state-of-the-art clinical accuracy in MRG by employing a chain-of-thought (CoT) prompting strategy, in which it identifies findings in CXR images and subsequently generates corresponding reports. The model is adaptable to various MRG scenarios depending on the available inputs, such as single-image, multi-image, and multi-study contexts. In addition to MRG, M4CXR performs visual grounding at a level comparable to specialized models and demonstrates outstanding performance in VQA. Both quantitative and qualitative assessments reveal M4CXR's versatility in MRG, visual grounding, and VQA, while consistently maintaining clinical accuracy.

Acute lymphoblastic leukemia diagnosis using machine learning techniques based on selected features.

El Houby EMF

PubMed paper · Aug 1 2025
Cancer is considered one of the deadliest diseases worldwide. Early detection of cancer can significantly improve patient survival rates. In recent years, computer-aided diagnosis (CAD) systems have been increasingly employed in cancer diagnosis across various medical image modalities. These systems play a critical role in enhancing diagnostic accuracy, reducing physician workload, providing consistent second opinions, and contributing to the efficiency of the medical industry. Acute lymphoblastic leukemia (ALL) is a fast-progressing blood cancer that primarily affects children but can also occur in adults. Early and accurate diagnosis of ALL is crucial for effective treatment and improved outcomes, making it a vital area for CAD system development. In this research, a CAD system for ALL diagnosis has been developed. It comprises four phases: preprocessing; segmentation; feature extraction and selection; and classification of suspicious regions as normal or abnormal. The proposed system was applied to microscopic blood images to classify each case as ALL or normal. Three classifiers, Naïve Bayes (NB), Support Vector Machine (SVM), and K-Nearest Neighbor (K-NN), were used to classify the images based on selected features. Ant Colony Optimization (ACO) was combined with the classifiers as a feature selection method to identify, among the features extracted from segmented cell parts, the optimal subset yielding the highest classification accuracy. The NB classifier achieved the best performance, with accuracy, sensitivity, and specificity of 96.15%, 97.56%, and 94.59%, respectively.
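ACO-based wrapper feature selection of the kind described treats each feature as an item ants probabilistically include, with pheromone reinforcing features that appear in high-scoring subsets. A heavily simplified sketch (not the paper's algorithm; the sampling rule, deposit scheme, and parameters are assumptions, and `fitness` stands in for cross-validated classifier accuracy):

```python
import random

def aco_select(n_features, fitness, n_ants=10, n_iters=20,
               evaporation=0.1, seed=0):
    """ACO-style wrapper feature selection (simplified sketch).

    Each ant samples a feature subset with inclusion probability
    rising in that feature's pheromone; features in well-scoring
    subsets receive pheromone deposits, and all trails evaporate
    each iteration. fitness(subset) should return a score in [0, 1].
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_features
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            subset = frozenset(
                i for i in range(n_features)
                if rng.random() < pheromone[i] / (pheromone[i] + 1.0)
            )
            if not subset:
                continue
            score = fitness(subset)
            if score > best_score:
                best_subset, best_score = subset, score
            for i in subset:
                pheromone[i] += score  # deposit proportional to quality
        pheromone = [max(p * (1.0 - evaporation), 1e-6) for p in pheromone]
    return best_subset, best_score
```

In the wrapper setting, `fitness` would retrain and evaluate the NB/SVM/K-NN classifier on the candidate subset, which is why feature selection dominates the runtime.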

Evaluation of calcaneal inclusion angle in the diagnosis of pes planus with pretrained deep learning networks: An observational study.

Aktas E, Ceylan N, Yaltirik Bilgin E, Bilgin E, Ince L

PubMed paper · Aug 1 2025
Pes planus is a common postural deformity related to the medial longitudinal arch of the foot. Radiographic examinations are important for reproducibility and objectivity; the most commonly used methods are the calcaneal inclusion angle and the Meary angle. However, radiographic measurements may vary because of human error and inexperience. In this study, a deep learning (DL)-based solution is proposed to address this problem. Lateral radiographs of the right and left feet of 289 patients were taken and saved. The study population is a homogeneous group in terms of age and gender and does not provide sufficient heterogeneity to represent the general population. These radiography (X-ray) images were measured by two different experts and the measurements were recorded. According to these measurements, each X-ray image was labeled as pes planus or non-pes planus. The images were then filtered and resized using Gaussian blurring and median filtering, producing two separate datasets. Widely used DL models (AlexNet, GoogleNet, SqueezeNet) were reconstructed to classify these images. The two-category (pes planus/no pes planus) data in the two preprocessed and resized datasets were classified by fine-tuning these transfer learning networks. The GoogleNet and SqueezeNet models achieved 100% accuracy, while AlexNet achieved 92.98% accuracy. These results show that the models' predictions overlap to a large extent with the measurements of expert radiologists. DL-based diagnostic methods can be used as a decision support system in the diagnosis of pes planus. DL algorithms enhance the consistency of the diagnostic process by reducing measurement variation between observers, and they accelerate diagnosis by automatically performing angle measurements from X-ray images, which is particularly beneficial in busy clinical settings. DL models integrated with smartphone cameras could facilitate the diagnosis of pes planus and serve as a screening tool, especially in regions with limited access to healthcare.
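Once a model (or an expert) places two landmark points along the inferior calcaneal border, the angle against the horizontal support line reduces to basic trigonometry. A sketch under that assumption (landmark names and the horizontal-surface convention are illustrative, not from the study):

```python
import math

def inclination_angle(p1, p2):
    """Angle in degrees between the line p1 -> p2 and the horizontal.

    p1, p2: (x, y) coordinates of two calcaneal landmarks on a lateral
    radiograph, with the supporting surface assumed horizontal.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    return abs(math.degrees(math.atan2(dy, dx)))
```

Automating the landmark detection, rather than the arithmetic, is where the DL models above add value.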

Machine learning and machine learned prediction in chest X-ray images

Shereiff Garrett, Abhinav Adhikari, Sarina Gautam, DaShawn Marquis Morris, Chandra Mani Adhikari

arXiv preprint · Jul 31 2025
Machine learning and artificial intelligence are fast-growing fields of research in which data are used to train algorithms, learn patterns, and make predictions. This approach helps solve seemingly intricate problems with significant accuracy, without explicit programming, by recognizing complex relationships in data. Using 5824 chest X-ray images as an example, we implement two machine learning algorithms, a baseline convolutional neural network (CNN) and a DenseNet-121, and analyze their machine-learned predictions for identifying patients with ailments. Both the baseline CNN and DenseNet-121 perform very well on the binary classification problem presented in this work. Gradient-weighted class activation mapping shows that DenseNet-121 focuses on the essential parts of the input chest X-ray images in its decision-making more consistently than the baseline CNN.
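Gradient-weighted class activation mapping (Grad-CAM), used above to inspect the models, weights each convolutional feature map by the spatial mean of the class-score gradient and keeps the positive part of the weighted sum. Given activations and gradients already extracted from a framework, the map itself is a few lines of numpy:

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap from precomputed tensors.

    activations: (H, W, K) feature maps of the last conv layer
    gradients:   (H, W, K) gradients of the class score w.r.t. them
    """
    weights = gradients.mean(axis=(0, 1))                 # per-channel weight (K,)
    cam = np.maximum((activations * weights).sum(-1), 0)  # ReLU of weighted sum
    return cam / cam.max() if cam.max() > 0 else cam      # normalize to [0, 1]
```

The normalized map is then upsampled to the input resolution and overlaid on the chest X-ray to show which regions drove the prediction.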
