Page 12 of 35341 results

Understanding Dataset Bias in Medical Imaging: A Case Study on Chest X-rays

Ethan Dack, Chengliang Dai

arXiv preprint, Jul 10 2025
Recent works have revisited the infamous task "Name That Dataset", demonstrating that non-medical datasets contain underlying biases and that the dataset origin task can be solved with high accuracy. In this work, we revisit the same task applied to popular open-source chest X-ray datasets. Medical images are naturally more difficult to release as open source due to their sensitive nature, which has led to certain open-source datasets being extremely popular for research purposes. By performing the same task, we wish to explore whether dataset bias also exists in these datasets. To extend our work, we apply simple transformations to the datasets, repeat the same task, and perform an analysis to identify and explain any detected biases. Given the importance of AI applications in medical imaging, it's vital to establish whether modern methods are taking shortcuts or are focused on the relevant pathology. We implement a range of different network architectures on the datasets: NIH, CheXpert, MIMIC-CXR and PadChest. We hope this work will encourage more explainable research in medical imaging and the creation of more open-source datasets in the medical domain. Our code can be found here: https://github.com/eedack01/x_ray_ds_bias.

A novel segmentation-based deep learning model for enhanced scaphoid fracture detection.

Bützow A, Anttila TT, Haapamäki V, Ryhänen J

PubMed paper, Jul 9 2025
To develop a deep learning model to detect apparent and occult scaphoid fractures from plain wrist radiographs and to compare the model's diagnostic performance with that of a group of experts. A dataset comprising 408 patients, 410 wrists, and 1011 radiographs was collected. 718 of these radiographs contained a scaphoid fracture, verified by magnetic resonance imaging or computed tomography scans. 58 of these fractures were occult. The images were divided into training, test, and occult fracture test sets. The images were annotated by marking the scaphoid bone and the possible fracture area. The performance of the developed DL model was compared with the ground truth and the assessments of three clinical experts. The DL model achieved a sensitivity of 0.86 (95 % CI: 0.75-0.93) and a specificity of 0.83 (0.64-0.94). The model's accuracy was 0.85 (0.76-0.92), and the area under the receiver operating characteristics curve was 0.92 (0.86-0.97). The clinical experts' sensitivity ranged from 0.77 to 0.89, and specificity from 0.83 to 0.97. The DL model detected 24 of 58 (41 %) occult fractures, compared to 10.3 %, 13.7 %, and 6.8 % by the clinical experts. Detecting scaphoid fractures using a segmentation-based DL model is feasible and comparable to previously developed DL models. The model performed similarly to a group of experts in identifying apparent scaphoid fractures and demonstrated higher diagnostic accuracy in detecting occult fractures. The improvement in occult fracture detection could enhance patient care.
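The occult-fracture figure above is a sensitivity computed on the occult test set: 24 detected out of 58, which leaves 34 missed. A minimal sketch of that arithmetic follows; the function names are illustrative and are not taken from the paper.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual positives that were detected."""
    total = true_positives + false_negatives
    if total <= 0:
        raise ValueError("need at least one positive case")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of actual negatives that were correctly ruled out."""
    total = true_negatives + false_positives
    if total <= 0:
        raise ValueError("need at least one negative case")
    return true_negatives / total


# Counts reported in the abstract: 24 of 58 occult fractures detected.
occult_detected = 24
occult_total = 58
occult_sensitivity = sensitivity(occult_detected, occult_total - occult_detected)
print(f"Occult fracture sensitivity: {occult_sensitivity:.1%}")
```

The abstract does not give the confusion counts behind the headline sensitivity and specificity, so only the occult-set figure is reproduced here.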

MCA-RG: Enhancing LLMs with Medical Concept Alignment for Radiology Report Generation

Qilong Xing, Zikai Song, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang

arXiv preprint, Jul 9 2025
Despite significant advancements in adapting Large Language Models (LLMs) for radiology report generation (RRG), clinical adoption remains challenging due to difficulties in accurately mapping pathological and anatomical features to their corresponding text descriptions. Additionally, semantic-agnostic feature extraction further hampers the generation of accurate diagnostic reports. To address these challenges, we introduce Medical Concept Aligned Radiology Report Generation (MCA-RG), a knowledge-driven framework that explicitly aligns visual features with distinct medical concepts to enhance the report generation process. MCA-RG utilizes two curated concept banks: a pathology bank containing lesion-related knowledge, and an anatomy bank with anatomical descriptions. The visual features are aligned with these medical concepts and undergo tailored enhancement. We further propose an anatomy-based contrastive learning procedure to improve the generalization of anatomical features, coupled with a matching loss for pathological features to prioritize clinically relevant regions. Additionally, a feature gating mechanism is employed to filter out low-quality concept features. Finally, the visual features correspond to individual medical concepts and are leveraged to guide the report generation process. Experiments on two public benchmarks (MIMIC-CXR and CheXpert Plus) demonstrate that MCA-RG achieves superior performance, highlighting its effectiveness in radiology report generation.

Impact of polymer source variations on hydrogel structure and product performance in dexamethasone-loaded ophthalmic inserts.

VandenBerg MA, Zaman RU, Plavchak CL, Smith WC, Nejad HB, Beringhs AO, Wang Y, Xu X

PubMed paper, Jul 9 2025
Localized drug delivery can enhance therapeutic efficacy while minimizing systemic side effects, making sustained-release ophthalmic inserts an attractive alternative to traditional eye drops. Such inserts offer improved patient compliance through prolonged therapeutic effects and a reduced need for frequent administration. This study focuses on dexamethasone-containing ophthalmic inserts. These inserts utilize a key excipient, polyethylene glycol (PEG), which forms a hydrogel upon contact with tear fluid. Developing generic equivalents of PEG-based inserts is challenging due to difficulties in characterizing inactive ingredients and the absence of standardized physicochemical characterization methods to demonstrate similarity. To address this gap, a suite of analytical approaches was applied to both PEG precursor materials sourced from different vendors and manufactured inserts. ¹H NMR, FTIR, MALDI, and SEC revealed variations in end-group functionalization, impurity content, and molecular weight distribution of the excipient. These differences led to changes in the finished insert network properties such as porosity, pore size and structure, gel mechanical strength, and crystallinity, which were corroborated by X-ray microscopy, AI-based image analysis, thermal, mechanical, and density measurements. In vitro release testing revealed distinct drug release profiles across formulations, with swelling rate correlated to release rate (i.e., faster release with rapid swelling). The use of non-micronized and micronized dexamethasone also contributed to release profile differences. Through comprehensive characterization of these PEG-based dexamethasone inserts, correlations between polymer quality, hydrogel microstructure, and release kinetics were established. The study highlights how excipient differences can alter product performance, emphasizing the importance of thorough analysis in developing generic equivalents of complex drug products.

Applicability and performance of convolutional neural networks for the identification of periodontal bone loss in periapical radiographs: a scoping review.

Putra RH, Astuti ER, Nurrachman AS, Savitri Y, Vadya AV, Khairunisa ST, Iikubo M

PubMed paper, Jul 9 2025
The study aimed to review the applicability and performance of various Convolutional Neural Network (CNN) models for the identification of periodontal bone loss (PBL) in digital periapical radiographs achieved through classification, detection, and segmentation approaches. We searched the PubMed, IEEE Xplore, and SCOPUS databases for articles published up to June 2024. After the selection process, a total of 11 studies were included in this review. The reviewed studies demonstrated that CNNs have significant potential for automatic identification of PBL on periapical radiographs through classification and segmentation approaches. CNN architectures can be utilized to classify the presence or absence of PBL, to grade the severity or degree of PBL, and to segment the PBL area. CNNs showed promising performance for PBL identification on periapical radiographs. Future research should focus on dataset preparation, proper selection of CNN architecture, and robust performance evaluation to improve the model. Utilizing an optimized CNN architecture is expected to assist dentists by providing accurate and efficient identification of PBL.

Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation

Heet Nitinkumar Dalsania

arXiv preprint, Jul 9 2025
Modern deep learning implementations for medical imaging usually rely on large labeled datasets. These datasets are often difficult to obtain due to privacy concerns, high costs, and even scarcity of cases. In this paper, a label-efficient strategy is proposed for chest X-ray diagnosis that seeks to reflect real-world hospital scenarios. The experiments use the NIH Chest X-ray14 dataset and a pre-trained CLIP ViT-B/32 model. The model is adapted via partial fine-tuning of its visual encoder and then evaluated using zero-shot and few-shot learning with 1-16 labeled examples per disease class. The tests demonstrate that CLIP's pre-trained vision-language features can be effectively adapted to few-shot medical imaging tasks, achieving over 20% improvement in mean AUC score as compared to the zero-shot baseline. The key aspect of this work is to attempt to simulate internal hospital workflows, where image archives exist but annotations are sparse. This work evaluates a practical and scalable solution for both common and rare disease diagnosis. Additionally, this research is intended for academic and experimental purposes only and has not been peer reviewed yet. All code is found at https://github.com/heet007-code/CLIP-disease-xray.
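The few-shot setting described above (1-16 labeled examples per disease class) amounts to drawing a small, fixed number of labeled images per class. The sketch below shows one way such a k-shot subset could be drawn. It is not the author's code: the function name, the toy labels, the single-label simplification, and the shot grid (1, 2, 4, 8, 16) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_k_shot(labels, k, seed=0):
    """Return sorted indices holding at most k examples per class.

    Assumes one label per example; a multi-label dataset would need
    a different sampling rule.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    subset = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Rare classes may have fewer than k examples; take what exists.
        subset.extend(rng.sample(indices, min(k, len(indices))))
    return sorted(subset)


# Hypothetical toy labels: two common classes and one rare class.
toy_labels = ["Atelectasis"] * 20 + ["Effusion"] * 20 + ["Hernia"] * 3

# Assumed shot grid; the abstract only states the range 1-16.
for k in (1, 2, 4, 8, 16):
    chosen = sample_k_shot(toy_labels, k)
    print(k, len(chosen))
```

Fixing the seed keeps the subset reproducible across runs, which matters when comparing shot counts against a zero-shot baseline.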

Dataset and Benchmark for Enhancing Critical Retained Foreign Object Detection

Yuli Wang, Victoria R. Shi, Liwei Zhou, Richard Chin, Yuwei Dai, Yuanyun Hu, Cheng-Yi Li, Haoyue Guan, Jiashu Cheng, Yu Sun, Cheng Ting Lin, Ihab Kamel, Premal Trivedi, Pamela Johnson, John Eng, Harrison Bai

arXiv preprint, Jul 9 2025
Critical retained foreign objects (RFOs), including surgical instruments like sponges and needles, pose serious patient safety risks and carry significant financial and legal implications for healthcare institutions. Detecting critical RFOs using artificial intelligence remains challenging due to their rarity and the limited availability of chest X-ray datasets that specifically feature critical RFO cases. Existing datasets only contain non-critical RFOs, such as necklaces or zippers, further limiting their utility for developing clinically impactful detection algorithms. To address these limitations, we introduce "Hopkins RFOs Bench", the first and largest dataset of its kind, containing 144 chest X-ray images of critical RFO cases collected over 18 years from the Johns Hopkins Health System. Using this dataset, we benchmark several state-of-the-art object detection models, highlighting the need for enhanced detection methodologies for critical RFO cases. Recognizing data scarcity challenges, we further explore image synthesis methods to bridge this gap. We evaluate two advanced image synthesis methods, DeepDRR-RFO, a physics-based method, and RoentGen-RFO, a diffusion-based method, for creating realistic radiographs featuring critical RFOs. Our comprehensive analysis identifies the strengths and limitations of each synthesis method, providing insights into effectively utilizing synthetic data to enhance model training. The Hopkins RFOs Bench and our findings significantly advance the development of reliable, generalizable AI-driven solutions for detecting critical RFOs in clinical chest X-rays.

Development of Artificial Intelligence-Assisted Lumbar and Femoral BMD Estimation System Using Anteroposterior Lumbar X-Ray Images.

Moro T, Yoshimura N, Saito T, Oka H, Muraki S, Iidaka T, Tanaka T, Ono K, Ishikura H, Wada N, Watanabe K, Kyomoto M, Tanaka S

PubMed paper, Jul 9 2025
The early detection and treatment of osteoporosis and prevention of fragility fractures are urgent societal issues. We developed an artificial intelligence-assisted diagnostic system that estimated not only lumbar bone mineral density but also femoral bone mineral density from anteroposterior lumbar X-ray images. We evaluated the performance of lumbar and femoral bone mineral density estimations and the osteoporosis classification accuracy of an artificial intelligence-assisted diagnostic system using lumbar X-ray images from a population-based cohort. The artificial neural network consisted of a deep neural network for estimating lumbar and femoral bone mineral density values and classifying lumbar X-ray images into osteoporosis categories. The deep neural network was built by training dual-energy X-ray absorptiometry-derived lumbar and femoral bone mineral density values as the ground truth of the training data and preprocessed X-ray images. Five-fold cross-validation was performed to evaluate the accuracy of the estimated BMD. A total of 1454 X-ray images from 1454 participants were analyzed using the artificial neural network. For the bone mineral density estimation performance, the mean absolute errors were 0.076 g/cm² for the lumbar and 0.071 g/cm² for the femur between dual-energy X-ray absorptiometry-derived and artificial intelligence-estimated bone mineral density values. The classification performances for the lumbar and femur of patients with osteopenia, in terms of sensitivity, were 86.4% and 80.4%, respectively, and the respective specificities were 84.1% and 76.3%. CLINICAL SIGNIFICANCE: The system was able to estimate the bone mineral density and classify the osteoporosis category of not only patients in clinics or hospitals but also of general inhabitants.
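The mean absolute error quoted above compares DXA-derived BMD values with the AI estimates, value by value. A minimal sketch of that metric follows. The four BMD pairs are hypothetical values chosen for illustration, not study data, and the function name is an assumption.

```python
def mean_absolute_error(reference, estimated):
    """Average absolute difference between paired measurements."""
    if len(reference) != len(estimated) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - e) for r, e in zip(reference, estimated)) / len(reference)


# Hypothetical BMD values in g/cm^2 (not taken from the study).
dxa_bmd = [0.95, 1.10, 0.80, 1.02]  # reference: DXA-derived
ai_bmd = [1.00, 1.05, 0.90, 1.02]   # estimate: model output

mae = mean_absolute_error(dxa_bmd, ai_bmd)
print(f"MAE: {mae:.3f} g/cm^2")
```

Because the metric keeps the units of the measurement, the study's 0.076 and 0.071 g/cm² figures can be read directly against typical BMD ranges.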

Population-scale cross-sectional observational study for AI-powered TB screening on one million CXRs.

Munjal P, Mahrooqi AA, Rajan R, Jeremijenko A, Ahmad I, Akhtar MI, Pimentel MAF, Khan S

PubMed paper, Jul 9 2025
Traditional tuberculosis (TB) screening involves radiologists manually reviewing chest X-rays (CXR), which is time-consuming, error-prone, and limited by workforce shortages. Our AI model, AIRIS-TB (AI Radiology In Screening TB), aims to address these challenges by automating the reporting of all X-rays without any findings. AIRIS-TB was evaluated on over one million CXRs, achieving an AUC of 98.51% and overall false negative rate (FNR) of 1.57%, outperforming radiologists (1.85%) while maintaining a 0% TB-FNR. By selectively deferring only cases with findings to radiologists, the model has the potential to automate up to 80% of routine CXR reporting. Subgroup analysis revealed insignificant performance disparities across age, sex, HIV status, and region of origin, with sputum tests for suspected TB showing a strong correlation with model predictions. This large-scale validation demonstrates AIRIS-TB's safety and efficiency in high-volume TB screening programs, reducing radiologist workload without compromising diagnostic accuracy.

Applying deep learning techniques to identify tonsilloliths in panoramic radiography.

Katı E, Baybars SC, Danacı Ç, Tuncer SA

PubMed paper, Jul 9 2025
Tonsilloliths can be seen on panoramic radiographs (PRs) as deposits located on the middle portion of the ramus of the mandible. Although tonsilloliths are clinically harmless, the high risk of misdiagnosis leads to unnecessary advanced examinations and interventions, thus jeopardizing patient safety and increasing unnecessary resource use in the healthcare system. Therefore, this study aims to meet an important clinical need by providing accurate and rapid diagnostic support. The dataset consisted of a total of 275 PRs, with 125 PRs lacking tonsillolith and 150 PRs having tonsillolith. ResNet and EfficientNet CNN models were assessed during the model selection process. An evaluation was conducted to analyze the learning capacity, intricacy, and compatibility of each model with the problem at hand. The effectiveness of the models was evaluated using accuracy, recall, precision, and F1 score measures following the training phase. Both the ResNet18 and EfficientNetB0 models were able to differentiate between tonsillolith-present and tonsillolith-absent conditions with an average accuracy of 89%. ResNet101 demonstrated underperformance when contrasted with other models. EfficientNetB1 exhibits satisfactory accuracy in both categories. The EfficientNetB0 model exhibits a 93% precision, 87% recall, 90% F1 score, and 89% accuracy. This study indicates that implementing AI-powered deep learning techniques would significantly improve the clinical diagnosis of tonsilloliths.
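The EfficientNetB0 figures above are internally consistent: the F1 score is the harmonic mean of precision and recall, and 93% precision with 87% recall gives roughly 90%. A short check, with an illustrative function name:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for EfficientNetB0.
precision = 0.93
recall = 0.87

f1 = f1_score(precision, recall)
print(f"F1: {f1:.3f}")
```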