
Evaluating the Efficacy of Various Deep Learning Architectures for Automated Preprocessing and Identification of Impacted Maxillary Canines in Panoramic Radiographs.

Alenezi O, Bhattacharjee T, Alseed HA, Tosun YI, Chaudhry J, Prasad S

PubMed · Aug 2 2025
Previously, automated cropping and reasonable classification accuracy for distinguishing impacted from non-impacted canines were demonstrated. This study evaluates multiple convolutional neural network (CNN) architectures to improve accuracy, as a step toward fully automated software for identifying impacted maxillary canines (IMCs) in panoramic radiographs (PRs). Eight CNNs (SqueezeNet, GoogLeNet, NASNet-Mobile, ShuffleNet, VGG-16, ResNet-50, DenseNet-201, and Inception-V3) were compared on their ability to classify 2 groups of PRs (impacted: n = 91; non-impacted: n = 91 maxillary canines) before preprocessing and after applying automated cropping. GoogLeNet achieved the highest classification performance among the tested architectures: its areas under the curve (AUC) from receiver operating characteristic (ROC) analysis were 0.90 without preprocessing and 0.99 with preprocessing, compared with 0.84 and 0.96, respectively, for SqueezeNet. On this dataset, GoogLeNet was therefore the strongest of the tested architectures for automated identification of impacted maxillary canines on both cropped and uncropped PRs.
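
As a rough illustration of the comparison protocol described above, the sketch below fine-tunes ImageNet-pretrained backbones on a two-class image folder and scores each with ROC AUC. The folder layout, image size, and the two example architectures shown are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of comparing pretrained CNN architectures on a
# two-class (impacted / non-impacted) radiograph dataset.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from sklearn.metrics import roc_auc_score

device = "cuda" if torch.cuda.is_available() else "cpu"

def make_model(name):
    # Swap the classifier head of an ImageNet-pretrained backbone for 2 classes.
    if name == "googlenet":
        m = models.googlenet(weights="IMAGENET1K_V1")
        m.fc = nn.Linear(m.fc.in_features, 2)
    elif name == "squeezenet":
        m = models.squeezenet1_0(weights="IMAGENET1K_V1")
        m.classifier[1] = nn.Conv2d(512, 2, kernel_size=1)
    else:
        raise ValueError(name)
    return m.to(device)

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
# "pano_crops/" with one subfolder per class is a placeholder layout.
val_ds = datasets.ImageFolder("pano_crops/val", tfm)

def evaluate_auc(model, loader):
    model.eval()
    scores, labels = [], []
    with torch.no_grad():
        for x, y in loader:
            p = torch.softmax(model(x.to(device)), dim=1)[:, 1]
            scores += p.cpu().tolist()
            labels += y.tolist()
    return roc_auc_score(labels, scores)

for name in ["googlenet", "squeezenet"]:
    model = make_model(name)
    # ... fine-tune `model` on the training split here (omitted) ...
    print(name, "AUC =", round(evaluate_auc(model, DataLoader(val_ds, 16)), 2))
```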

M4CXR: Exploring Multitask Potentials of Multimodal Large Language Models for Chest X-Ray Interpretation.

Park J, Kim S, Yoon B, Hyun J, Choi K

PubMed · Aug 1 2025
The rapid evolution of artificial intelligence, especially in large language models (LLMs), has significantly impacted various domains, including healthcare. In chest X-ray (CXR) analysis, previous studies have employed LLMs, but with limitations: they either underutilize the LLMs' capability for multitask learning or lack clinical accuracy. This article presents M4CXR, a multimodal LLM designed to enhance CXR interpretation. The model is trained on a visual instruction-following dataset that integrates various task-specific datasets in a conversational format. As a result, the model supports multiple tasks such as medical report generation (MRG), visual grounding, and visual question answering (VQA). M4CXR achieves state-of-the-art clinical accuracy in MRG by employing a chain-of-thought (CoT) prompting strategy, in which it identifies findings in CXR images and subsequently generates the corresponding report. The model adapts to various MRG scenarios depending on the available inputs, such as single-image, multi-image, and multi-study contexts. In addition to MRG, M4CXR performs visual grounding at a level comparable to specialized models and demonstrates outstanding performance in VQA. Both quantitative and qualitative assessments reveal M4CXR's versatility in MRG, visual grounding, and VQA, while consistently maintaining clinical accuracy.
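
The chain-of-thought strategy described here, first eliciting findings and then conditioning the report on them, can be sketched as two chained prompts. `mllm_generate` below is a hypothetical placeholder for any multimodal LLM call, not M4CXR's actual API.

```python
# Hedged illustration of two-step "findings, then report" CoT prompting.
def mllm_generate(image, prompt: str) -> str:
    # Stand-in for a real multimodal LLM call; returns canned text so the
    # sketch runs end-to-end.
    return "[model output for: " + prompt.splitlines()[0] + "]"

def generate_report(image):
    # Step 1: enumerate findings (the chain of thought).
    findings = mllm_generate(
        image, "List the radiographic findings visible in this chest X-ray."
    )
    # Step 2: write the report grounded in the listed findings.
    report = mllm_generate(
        image,
        "Findings:\n" + findings
        + "\n\nWrite a structured radiology report consistent with these findings.",
    )
    return findings, report

findings, report = generate_report(image=None)  # image loading omitted
```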

Evaluation of calcaneal inclination angle in the diagnosis of pes planus with pretrained deep learning networks: An observational study.

Aktas E, Ceylan N, Yaltirik Bilgin E, Bilgin E, Ince L

PubMed · Aug 1 2025
Pes planus is a common postural deformity involving the medial longitudinal arch of the foot. Radiographic examination is important for reproducibility and objectivity; the most commonly used measurements are the calcaneal inclination angle and the Meary angle. However, radiographic measurements can vary because of human error and inexperience. In this study, a deep learning (DL)-based solution is proposed to address this problem. Lateral radiographs of the right and left feet of 289 patients were acquired and saved. The study population is homogeneous in age and gender and is not heterogeneous enough to represent the general population. These radiographs (X-ray images) were measured by 2 different experts and the measurements recorded. Based on these measurements, each X-ray image was labeled as pes planus or non-pes planus. The images were then filtered and resized using Gaussian blurring and median filtering, producing 2 separate datasets. Widely used DL models (AlexNet, GoogleNet, SqueezeNet) were reconstructed to classify these images, and the 2-category (pes planus/non-pes planus) data in the 2 preprocessed, resized datasets were classified by fine-tuning these transfer learning networks. The GoogleNet and SqueezeNet models achieved 100% accuracy, while AlexNet achieved 92.98% accuracy. These results show that the models' predictions overlap to a large extent with the measurements of expert radiologists. DL-based diagnostic methods can serve as a decision support system in the diagnosis of pes planus. DL algorithms improve the consistency of diagnosis by reducing measurement variation between observers, and they accelerate diagnosis by automatically measuring angles from X-ray images, which saves time in busy clinical settings. Integrated with smartphone cameras, DL models could facilitate the diagnosis of pes planus and serve as a screening tool, especially in regions with limited access to healthcare.
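
A minimal sketch of the two filtering pipelines named above (Gaussian blurring vs. median filtering) using OpenCV; the kernel sizes and 224x224 target size are illustrative assumptions, not the authors' settings.

```python
# Hedged sketch: build the two preprocessed datasets from one lateral view.
import cv2

def preprocess(path, method, size=(224, 224)):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if method == "gaussian":
        img = cv2.GaussianBlur(img, (5, 5), 0)   # Gaussian smoothing
    elif method == "median":
        img = cv2.medianBlur(img, 5)             # median (salt-and-pepper) filter
    img = cv2.resize(img, size)
    # Pretrained networks expect 3-channel input.
    return cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)

gaussian_version = preprocess("foot_lateral.png", "gaussian")  # placeholder path
median_version = preprocess("foot_lateral.png", "median")
```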

Acute lymphoblastic leukemia diagnosis using machine learning techniques based on selected features.

El Houby EMF

PubMed · Aug 1 2025
Cancer is considered one of the deadliest diseases worldwide. Early detection of cancer can significantly improve patient survival rates. In recent years, computer-aided diagnosis (CAD) systems have been increasingly employed in cancer diagnosis across various medical imaging modalities. These systems play a critical role in enhancing diagnostic accuracy, reducing physician workload, providing consistent second opinions, and contributing to the efficiency of the medical industry. Acute lymphoblastic leukemia (ALL) is a fast-progressing blood cancer that primarily affects children but can also occur in adults. Early and accurate diagnosis of ALL is crucial for effective treatment and improved outcomes, making it a vital area for CAD system development. In this research, a CAD system for ALL diagnosis has been developed. It comprises four phases: preprocessing, segmentation, feature extraction and selection, and classification of suspicious regions as normal or abnormal. The proposed system was applied to microscopic blood images to classify each case as ALL or normal. Three classifiers, Naïve Bayes (NB), Support Vector Machine (SVM), and K-nearest Neighbor (K-NN), were used to classify the images based on selected features. Ant Colony Optimization (ACO) was combined with the classifiers as a feature selection method to identify, among the features extracted from segmented cell parts, the optimal subset yielding the highest classification accuracy. The NB classifier achieved the best performance, with accuracy, sensitivity, and specificity of 96.15%, 97.56%, and 94.59%, respectively.
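
A simplified, hedged sketch of wrapper-style ACO feature selection around a Naïve Bayes classifier, in the spirit of the approach described above; the pheromone update rule, subset size, and hyperparameters are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch: ant colony optimization as a wrapper feature selector.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def aco_select(X, y, n_ants=20, n_iter=30, subset_size=10, rho=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    pheromone = np.ones(n_features)       # desirability of each feature
    best_subset, best_score = None, -np.inf
    for _ in range(n_iter):
        for _ant in range(n_ants):
            # Sample a subset with probability proportional to pheromone.
            p = pheromone / pheromone.sum()
            subset = rng.choice(n_features, size=subset_size, replace=False, p=p)
            score = cross_val_score(GaussianNB(), X[:, subset], y, cv=5).mean()
            if score > best_score:
                best_subset, best_score = subset, score
            # Deposit pheromone on used features, scaled by accuracy.
            pheromone[subset] += score
        pheromone *= (1 - rho)            # evaporation
    return best_subset, best_score
```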

A Modified VGG19-Based Framework for Accurate and Interpretable Real-Time Bone Fracture Detection

Md. Ehsanul Haque, Abrar Fahim, Shamik Dey, Syoda Anamika Jahan, S. M. Jahidul Islam, Sakib Rokoni, Md Sakib Morshed

arXiv preprint · Jul 31 2025
Early and accurate detection of bone fractures is paramount to initiating treatment promptly and avoiding delays in patient care and outcomes. Interpreting X-ray images is a time-consuming and error-prone task, especially where radiology expertise is scarce. Additionally, current deep learning approaches typically suffer from misclassifications and lack the interpretable explanations needed for clinical use. To overcome these challenges, we propose an automated bone fracture detection framework based on a VGG-19 model modified to our needs. It incorporates preprocessing techniques including Contrast Limited Adaptive Histogram Equalization (CLAHE), Otsu's thresholding, and Canny edge detection to enhance image clarity and facilitate feature extraction. For interpretability, we use Grad-CAM, an explainable AI method that generates visual heatmaps of the model's decision-making process, so clinicians can understand its predictions; this encourages trust and supports further clinical validation. The framework is deployed in a real-time web application where healthcare professionals can upload X-ray images and receive diagnostic feedback within 0.5 seconds. Our modified VGG-19 model attains 99.78% classification accuracy and an AUC score of 1.00. The framework provides a reliable, fast, and interpretable solution for bone fracture detection, supporting more efficient diagnosis and better patient care.
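
The preprocessing chain named above can be sketched with OpenCV as follows; the clip limit, tile grid, and Canny thresholds are assumptions, not the paper's reported parameters.

```python
# Hedged sketch: CLAHE contrast enhancement, Otsu thresholding, Canny edges.
import cv2

img = cv2.imread("xray.png", cv2.IMREAD_GRAYSCALE)  # placeholder path

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)                          # contrast-limited equalization

_, mask = cv2.threshold(enhanced, 0, 255,
                        cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu foreground mask
edges = cv2.Canny(enhanced, 50, 150)                 # edge map aiding feature extraction
```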

Machine learning and machine learned prediction in chest X-ray images

Shereiff Garrett, Abhinav Adhikari, Sarina Gautam, DaShawn Marquis Morris, Chandra Mani Adhikari

arXiv preprint · Jul 31 2025
Machine learning and artificial intelligence are fast-growing fields of research in which data are used to train algorithms, learn patterns, and make predictions. This approach helps solve seemingly intricate problems with significant accuracy, without explicit programming, by recognizing complex relationships in data. Using 5824 chest X-ray images, we implement two machine learning algorithms, a baseline convolutional neural network (CNN) and a DenseNet-121, and analyze their machine-learned predictions for identifying patients with ailments. Both the baseline CNN and DenseNet-121 perform very well on the binary classification problem presented in this work. Gradient-weighted class activation mapping shows that DenseNet-121 focuses on essential parts of the input chest X-ray images in its decision-making more accurately than the baseline CNN.
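
Gradient-weighted class activation mapping (Grad-CAM) of the kind used above can be sketched for DenseNet-121 with forward and backward hooks; the target layer, dummy input, and normalization below are common illustrative choices, not necessarily the authors' configuration.

```python
# Hedged Grad-CAM sketch for DenseNet-121.
import torch
from torchvision import models

model = models.densenet121(weights="IMAGENET1K_V1").eval()
acts, grads = {}, {}
layer = model.features[-1]  # output of the final feature block

layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed chest X-ray
score = model(x)[0].max()         # score of the top class
score.backward()

weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # global-average-pooled gradients
cam = torch.relu((weights * acts["a"]).sum(dim=1))   # weighted activation map
cam = cam / cam.max()                                # normalize to [0, 1] for overlay
```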

CX-Mind: A Pioneering Multimodal Large Language Model for Interleaved Reasoning in Chest X-ray via Curriculum-Guided Reinforcement Learning

Wenjie Li, Yujie Zhang, Haoran Sun, Yueqi Li, Fanrui Zhang, Mengzhe Xu, Victoria Borja Clausich, Sade Mellin, Renhao Yang, Chenrun Wang, Jethro Zih-Shuo Wang, Shiyi Yao, Gen Li, Yidong Xu, Hanyu Wang, Yilin Huang, Angela Lin Wang, Chen Shi, Yin Zhang, Jianan Guo, Luqi Yang, Renxuan Li, Yang Xu, Jiawei Liu, Yao Zhang, Lei Liu, Carlos Gutiérrez SanRomán, Lei Wang

arXiv preprint · Jul 31 2025
Chest X-ray (CXR) imaging is one of the most widely used diagnostic modalities in clinical practice, encompassing a broad spectrum of diagnostic tasks. Recent advancements have seen the extensive application of reasoning-based multimodal large language models (MLLMs) in medical imaging to enhance diagnostic efficiency and interpretability. However, existing multimodal models predominantly rely on "one-time" diagnostic approaches, lacking verifiable supervision of the reasoning process. This leads to challenges in multi-task CXR diagnosis, including lengthy reasoning, sparse rewards, and frequent hallucinations. To address these issues, we propose CX-Mind, the first generative model to achieve interleaved "think-answer" reasoning for CXR tasks, driven by curriculum-based reinforcement learning and verifiable process rewards (CuRL-VPR). Specifically, we constructed an instruction-tuning dataset, CX-Set, comprising 708,473 images and 2,619,148 samples, and generated 42,828 high-quality interleaved reasoning data points supervised by clinical reports. Optimization was conducted in two stages under the Group Relative Policy Optimization framework: basic reasoning was first stabilized on closed-domain tasks, followed by transfer to open-domain diagnostics, with rule-based conditional process rewards bypassing the need for pretrained reward models. Extensive experiments demonstrate that CX-Mind significantly outperforms existing medical and general-domain MLLMs in visual understanding, text generation, and spatiotemporal alignment, achieving an average performance improvement of 25.1% over comparable CXR-specific models. On a real-world clinical dataset (Rui-CXR), CX-Mind achieves mean recall@1 across 14 diseases that substantially surpasses the second-best result, and multi-center expert evaluations further confirm its clinical utility across multiple dimensions.
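
A rule-based, verifiable process reward of the general kind described above can be sketched as follows: intermediate "think" steps earn reward only when the findings they assert can be checked against report-derived labels. The rule set and reward values are illustrative assumptions, not CX-Mind's actual CuRL-VPR rules.

```python
# Hedged sketch of a rule-based verifiable process reward.
def process_reward(think_steps, answer, label_findings):
    reward = 0.0
    for step in think_steps:
        # Reward a reasoning step only if it mentions a verifiably present finding.
        if any(f in step.lower() for f in label_findings):
            reward += 0.1
    # Outcome reward for the final answer.
    reward += 1.0 if answer.lower() in label_findings else -0.5
    return reward

r = process_reward(
    ["Step 1: consolidation in the right lower lobe.",
     "Step 2: no pleural effusion."],
    answer="consolidation",
    label_findings={"consolidation", "cardiomegaly"},  # from the clinical report
)
```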

Deep learning for tooth detection and segmentation in panoramic radiographs: a systematic review and meta-analysis.

Bonfanti-Gris M, Herrera A, Salido Rodríguez-Manzaneque MP, Martínez-Rus F, Pradíes G

PubMed · Jul 30 2025
This systematic review and meta-analysis aimed to summarize and evaluate the available evidence on the performance of deep learning methods for tooth detection and segmentation in orthopantomographies. Electronic databases (Medline, Embase, and Cochrane) were searched up to September 2023 for relevant observational studies and randomized and controlled clinical trials. Two reviewers independently conducted study selection, data extraction, and quality assessment. GRADE (Grading of Recommendations, Assessment, Development, and Evaluation) was adopted for collective grading of the overall body of evidence. Of the 2,207 records identified, 20 studies were included in the analysis. Meta-analysis was conducted for the comparison of mesiodens detection and segmentation (n = 6), using sensitivity and specificity as the two main diagnostic parameters. Quantitative analysis of the included studies showed pooled sensitivity, specificity, positive likelihood ratio (LR), negative LR, and diagnostic odds ratio of 0.92 (95% confidence interval [CI], 0.84-0.96), 0.94 (95% CI, 0.89-0.97), 15.7 (95% CI, 7.6-32.2), 0.08 (95% CI, 0.04-0.18), and 186 (95% CI, 44-793), respectively. A graphical summary of the meta-analysis was plotted based on sensitivity and specificity, illustrating a Hierarchical Summary Receiver Operating Characteristic (HSROC) curve, prediction region, summary point, and confidence region; the HSROC curve showed a positive correlation between logit-transformed sensitivity and specificity (r = 0.886). Based on the results of the meta-analysis and GRADE assessment, a moderate recommendation is advised to dental operators relying on AI-based tools for tooth detection and segmentation in panoramic radiographs.
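
For readers checking the internal consistency of these pooled estimates, the likelihood ratios and diagnostic odds ratio relate to sensitivity and specificity as below. Plugging in the pooled point estimates gives values close to, but not identical with, the reported 15.7, 0.08, and 186, since meta-analytic pooling is hierarchical rather than a naive plug-in.

```latex
\mathrm{LR}^{+} = \frac{\text{sens}}{1-\text{spec}} = \frac{0.92}{1-0.94} \approx 15.3,
\qquad
\mathrm{LR}^{-} = \frac{1-\text{sens}}{\text{spec}} = \frac{1-0.92}{0.94} \approx 0.085,
\qquad
\mathrm{DOR} = \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}} \approx 180.
```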

Label-free estimation of clinically relevant performance metrics under distribution shifts

Tim Flühmann, Alceu Bissoto, Trung-Dung Hoang, Lisa M. Koch

arXiv preprint · Jul 30 2025
Performance monitoring is essential for the safe clinical deployment of image classification models. However, because ground-truth labels are typically unavailable in the target dataset, direct assessment of real-world model performance is infeasible. State-of-the-art performance estimation methods address this by leveraging confidence scores to estimate the target accuracy. Despite being a promising direction, the established methods mainly estimate the model's accuracy and are rarely evaluated in a clinical domain, where strong class imbalances and dataset shifts are common. Our contributions are twofold: first, we introduce generalisations of existing performance prediction methods that directly estimate the full confusion matrix. Then, we benchmark their performance on chest X-ray data under real-world distribution shifts as well as simulated covariate and prevalence shifts. The proposed confusion matrix estimation methods reliably predicted clinically relevant counting metrics on medical images under distribution shifts. However, our simulated shift scenarios exposed important failure modes of current performance estimation techniques, calling for a better understanding of real-world deployment contexts when implementing these performance monitoring techniques for post-market surveillance of medical AI models.
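
One natural way to generalise confidence-based accuracy estimation to a full confusion matrix, in the spirit of the paper's proposal, is to soft-assign each unlabeled prediction's probability mass to the column of its predicted class; the sketch below is an illustrative average-confidence-style estimator, not necessarily the authors' exact method.

```python
# Hedged sketch: label-free confusion matrix estimation from softmax scores.
import numpy as np

def estimate_confusion_matrix(probs):
    # probs: (n_samples, n_classes) softmax outputs on unlabeled target data.
    n_classes = probs.shape[1]
    cm = np.zeros((n_classes, n_classes))
    preds = probs.argmax(axis=1)
    for p, pred in zip(probs, preds):
        # Rows: estimated true-class distribution (taken from the softmax);
        # column: the class actually predicted.
        cm[:, pred] += p
    return cm

probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]])
cm_hat = estimate_confusion_matrix(probs)
# From cm_hat one can read off estimated TP/FP/FN/TN and derive counting
# metrics such as sensitivity and specificity without target labels.
```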

Distribution-Based Masked Medical Vision-Language Model Using Structured Reports

Shreyank N Gowda, Ruichi Zhang, Xiao Gu, Ying Weng, Lu Yang

arXiv preprint · Jul 29 2025
Medical image-language pre-training aims to align medical images with clinically relevant text to improve model performance on various downstream tasks. However, existing models often struggle with the variability and ambiguity inherent in medical data, limiting their ability to capture nuanced clinical information and uncertainty. This work introduces an uncertainty-aware medical image-text pre-training model that enhances generalization capabilities in medical image analysis. Building on previous methods and focusing on Chest X-Rays, our approach utilizes structured text reports generated by a large language model (LLM) to augment image data with clinically relevant context. These reports begin with a definition of the disease, followed by the "appearance" section to highlight critical regions of interest, and finally "observations" and "verdicts" that ground model predictions in clinical semantics. By modeling both inter- and intra-modal uncertainty, our framework captures the inherent ambiguity in medical images and text, yielding improved representations and performance on downstream tasks. Our model demonstrates significant advances in medical image-text pre-training, obtaining state-of-the-art performance on multiple downstream tasks.
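
The structured-report format described above (definition, then appearance, then observations and verdicts) might be templated as follows; the section names come from the abstract, while the helper function and example content are illustrative assumptions.

```python
# Hedged sketch of the structured report layout used to augment image data.
def build_structured_report(disease, definition, appearance, observations, verdict):
    return (
        f"Disease: {disease}\n"
        f"Definition: {definition}\n"
        f"Appearance: {appearance}\n"
        f"Observations: {observations}\n"
        f"Verdict: {verdict}\n"
    )

report = build_structured_report(
    disease="Pneumothorax",
    definition="Air in the pleural space causing partial or complete lung collapse.",
    appearance="Look for a visceral pleural line with absent peripheral lung markings.",
    observations="Thin pleural line at the right apex; no mediastinal shift.",
    verdict="Findings consistent with a small right apical pneumothorax.",
)
```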