Page 7 of 45449 results

Chest X-ray Pneumothorax Segmentation Using EfficientNet-B4 Transfer Learning in a U-Net Architecture

Alvaro Aranibar Roque, Helga Sebastian

arXiv preprint · Sep 4 2025
Pneumothorax, the abnormal accumulation of air in the pleural space, can be life-threatening if undetected. Chest X-rays are the first-line diagnostic tool, but small pneumothoraces may be subtle. We propose an automated deep-learning pipeline using a U-Net with an EfficientNet-B4 encoder to segment pneumothorax regions. Trained on the SIIM-ACR dataset with data augmentation and a combined binary cross-entropy plus Dice loss, the model achieved an IoU of 0.7008 and a Dice score of 0.8241 on the independent PTX-498 dataset. These results demonstrate that the model can accurately localize pneumothoraces and support radiologists.
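The combined binary cross-entropy plus Dice loss used above is a common pairing for class-imbalanced segmentation targets such as small pneumothoraces. A minimal NumPy sketch of such a loss; the equal weighting (`bce_weight=0.5`) is an illustrative assumption, not the paper's reported configuration:

```python
import numpy as np

def bce_dice_loss(pred, target, eps=1e-7, bce_weight=0.5):
    """Combined binary cross-entropy + soft Dice loss for binary segmentation.

    pred:   predicted foreground probabilities in (0, 1), any shape
    target: binary ground-truth mask, same shape
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # Per-pixel binary cross-entropy, averaged over all pixels
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Soft Dice coefficient: 2|A ∩ B| / (|A| + |B|)
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
    # BCE pushes per-pixel calibration; (1 - Dice) pushes region overlap
    return bce_weight * bce + (1 - bce_weight) * (1.0 - dice)
```

The BCE term supplies dense per-pixel gradients while the Dice term directly optimizes the overlap metric being reported, which is why the pair works well when the foreground region is tiny.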

A Generative Foundation Model for Chest Radiography

Yuanfeng Ji, Dan Lin, Xiyue Wang, Lu Zhang, Wenhui Zhou, Chongjian Ge, Ruihang Chu, Xiaoli Yang, Junhan Zhao, Junsong Chen, Xiangde Luo, Sen Yang, Jin Fang, Ping Luo, Ruijiang Li

arXiv preprint · Sep 4 2025
The scarcity of well-annotated diverse medical images is a major hurdle for developing reliable AI models in healthcare. Substantial technical advances have been made in generative foundation models for natural images. Here we develop ChexGen, a generative vision-language foundation model that introduces a unified framework for text-, mask-, and bounding box-guided synthesis of chest radiographs. Built upon the latent diffusion transformer architecture, ChexGen was pretrained on the largest curated chest X-ray dataset to date, consisting of 960,000 radiograph-report pairs. Expert evaluations and quantitative metrics show that ChexGen achieves accurate synthesis of radiographs. We demonstrate the utility of ChexGen for training data augmentation and supervised pretraining, which led to performance improvements across disease classification, detection, and segmentation tasks using a small fraction of training data. Further, our model enables the creation of diverse patient cohorts that enhance model fairness by detecting and mitigating demographic biases. Our study supports the transformative role of generative foundation models in building more accurate, data-efficient, and equitable medical AI systems.

From Lines to Shapes: Geometric-Constrained Segmentation of X-Ray Collimators via Hough Transform

Benjamin El-Zein, Dominik Eckert, Andreas Fieselmann, Christopher Syben, Ludwig Ritschl, Steffen Kappler, Sebastian Stober

arXiv preprint · Sep 4 2025
Collimation in X-ray imaging restricts exposure to the region of interest (ROI) and minimizes the radiation dose applied to the patient. Detecting collimator shadows is an essential image-based preprocessing step in digital radiography, and it becomes challenging when edges are obscured by scattered X-ray radiation. Nevertheless, the prior knowledge that collimation forms polygonal shadows can be exploited. We therefore introduce a deep learning-based segmentation that is inherently constrained to this geometry. We achieve this by incorporating a differentiable Hough transform-based network to detect the collimation borders and enhance its capability to extract information about the ROI center. During inference, we combine the information from both tasks to generate refined, line-constrained segmentation masks. We demonstrate robust reconstruction of collimated regions, achieving median Hausdorff distances of 4.3-5.0 mm on diverse test sets of real X-ray images. While this application involves at most four shadow borders, our method is not fundamentally limited to a specific number of edges.
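The idea builds on the classical Hough transform, in which each edge pixel votes for all lines passing through it in a (rho, theta) accumulator, and straight borders show up as accumulator peaks. A minimal sketch of that classical voting scheme (not the paper's differentiable network variant):

```python
import numpy as np

def hough_lines(edges, n_theta=180, peak_count=1):
    """Classical Hough transform: vote each edge pixel into (rho, theta) bins.

    edges: 2D binary array of edge pixels
    Returns the (rho, theta) pairs with the most votes.
    """
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))          # max possible |rho|
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int32)
    ys, xs = np.nonzero(edges)
    for y, x in zip(ys, xs):
        # rho = x*cos(theta) + y*sin(theta), shifted so indices are non-negative
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[rhos, np.arange(n_theta)] += 1
    # Highest-vote cells correspond to the most supported lines
    flat = np.argsort(acc, axis=None)[::-1][:peak_count]
    return [(int(i // n_theta) - diag, thetas[i % n_theta]) for i in flat]
```

Because votes accumulate globally along each candidate line, the transform remains robust when parts of a border are obscured, which is precisely the scattered-radiation failure mode the paper targets.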

A Foundation Model for Chest X-ray Interpretation with Grounded Reasoning via Online Reinforcement Learning

Qika Lin, Yifan Zhu, Bin Pu, Ling Huang, Haoran Luo, Jingying Ma, Zhen Peng, Tianzhe Zhao, Fangzhi Xu, Jian Zhang, Kai He, Zhonghong Ou, Swapnil Mishra, Mengling Feng

arXiv preprint · Sep 4 2025
Medical foundation models (FMs) have shown tremendous promise amid the rapid advancements in artificial intelligence (AI) technologies. However, current medical FMs typically generate answers in a black-box manner, lacking transparent reasoning processes and locally grounded interpretability, which hinders their practical clinical deployment. To this end, we introduce DeepMedix-R1, a holistic medical FM for chest X-ray (CXR) interpretation. It leverages a sequential training pipeline: initially fine-tuned on curated CXR instruction data to equip it with fundamental CXR interpretation capabilities, then exposed to high-quality synthetic reasoning samples to enable cold-start reasoning, and finally refined via online reinforcement learning to enhance both grounded reasoning quality and generation performance. Thus, the model produces both an answer and reasoning steps tied to the image's local regions for each query. Quantitative evaluation demonstrates substantial improvements in report generation (e.g., 14.54% and 31.32% over LLaVA-Rad and MedGemma) and visual question answering (e.g., 57.75% and 23.06% over MedGemma and CheXagent). To facilitate robust assessment, we propose Report Arena, a benchmarking framework using advanced language models to evaluate answer quality, further highlighting the superiority of DeepMedix-R1. Expert review of generated reasoning steps reveals greater interpretability and clinical plausibility compared to the established Qwen2.5-VL-7B model (0.7416 vs. 0.2584 overall preference). Collectively, our work advances medical FM development toward holistic, transparent, and clinically actionable modeling for CXR interpretation.

Deep Self-knowledge Distillation: A hierarchical supervised learning for coronary artery segmentation

Mingfeng Lin

arXiv preprint · Sep 3 2025
Coronary artery disease is a leading cause of mortality, underscoring the critical importance of precise diagnosis through X-ray angiography. Manual coronary artery segmentation from these images is time-consuming and inefficient, prompting the development of automated models. However, existing methods, whether rule-based or deep learning models, struggle with issues like poor performance and limited generalizability. Moreover, current knowledge distillation methods applied in this field have not fully exploited the hierarchical knowledge of the model, leading to certain information waste and insufficient enhancement of the model's performance capabilities for segmentation tasks. To address these issues, this paper introduces Deep Self-knowledge Distillation, a novel approach for coronary artery segmentation that leverages hierarchical outputs for supervision. By combining Deep Distribution Loss and Pixel-wise Self-knowledge Distillation Loss, our method enhances the student model's segmentation performance through a hierarchical learning strategy, effectively transferring knowledge from the teacher model. Our method combines a loosely constrained probabilistic distribution vector with tightly constrained pixel-wise supervision, providing dual regularization for the segmentation model while also enhancing its generalization and robustness. Extensive experiments on the XCAD and DCA1 datasets demonstrate that our approach outperforms other models in Dice coefficient, accuracy, sensitivity, and IoU in comparative evaluations.
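The abstract does not define the Deep Distribution Loss or the Pixel-wise Self-knowledge Distillation Loss, but the general recipe they follow — a loosely constrained distribution-level term paired with a tightly constrained pixel-level term — can be illustrated with a standard distillation objective. The sketch below is a generic example (temperature `T`, weight `alpha`, and the KL + squared-error combination are assumptions, not the authors' formulation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0, alpha=0.5):
    """Illustrative dual-term distillation loss.

    Term 1: KL(teacher || student) on temperature-softened distributions
            (loose, distribution-level constraint), scaled by T^2 as in
            standard knowledge distillation.
    Term 2: squared error between raw logits (tight, pixel-wise constraint).
    """
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    kl = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1))
    mse = np.mean((student_logits - teacher_logits) ** 2)
    return alpha * (T * T) * kl + (1 - alpha) * mse
```

The soft KL term tolerates logit shifts as long as relative class preferences match, while the MSE term penalizes any pointwise deviation — the "dual regularization" idea in miniature.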

Optimizing Paths for Adaptive Fly-Scan Microscopy: An Extended Version

Yu Lu, Thomas F. Lynn, Ming Du, Zichao Di, Sven Leyffer

arXiv preprint · Sep 2 2025
In x-ray microscopy, traditional raster-scanning techniques are used to acquire a microscopic image in a series of step-scans. Alternatively, scanning the x-ray probe along a continuous path, called a fly-scan, reduces scan time and increases scan efficiency. However, not all regions of an image are equally important. Currently used fly-scan methods do not adapt to the characteristics of the sample during the scan, often wasting time in uniform, uninteresting regions. One approach to avoid unnecessary scanning in uniform regions for raster step-scans is to use deep learning techniques to select a shorter optimal scan path instead of a traditional raster scan path, followed by reconstructing the entire image from the partially scanned data. However, this approach heavily depends on the quality of the initial sampling, requires a large dataset for training, and incurs high computational costs. We propose leveraging the fly-scan method along an optimal scanning path, focusing on regions of interest (ROIs) and using image completion techniques to reconstruct details in non-scanned areas. This approach further shortens the scanning process and potentially decreases x-ray exposure dose while maintaining high-quality and detailed information in critical regions. To achieve this, we introduce a multi-iteration fly-scan framework that adapts to the scanned image. Specifically, in each iteration, we define two key functions: (1) a score function to generate initial anchor points and identify potential ROIs, and (2) an objective function to optimize the anchor points for convergence to an optimal set. Using these anchor points, we compute the shortest scanning path between optimized anchor points, perform the fly-scan, and subsequently apply image completion based on the acquired information in preparation for the next scan iteration.
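Computing "the shortest scanning path between optimized anchor points" is a travelling-salesman-style problem. The abstract does not specify the solver, so as an illustration, a simple nearest-neighbour heuristic produces a short (though not provably optimal) visiting order:

```python
import numpy as np

def greedy_scan_path(points, start=0):
    """Order anchor points with a nearest-neighbour heuristic.

    points: list of (x, y) anchor coordinates
    start:  index of the anchor to begin from
    Returns a visiting order (list of indices) forming a short scan path.
    """
    pts = np.asarray(points, dtype=float)
    unvisited = set(range(len(pts)))
    order = [start]
    unvisited.remove(start)
    while unvisited:
        last = pts[order[-1]]
        # Always hop to the closest remaining anchor
        nxt = min(unvisited, key=lambda i: np.hypot(*(pts[i] - last)))
        order.append(nxt)
        unvisited.remove(nxt)
    return order
```

In a multi-iteration framework like the one described, such an ordering would be recomputed each iteration as the score and objective functions refine the anchor set.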

Application of deep learning for detection of nasal bone fracture on X-ray nasal bone lateral view.

Mortezaei T, Dalili Kajan Z, Mirroshandel SA, Mehrpour M, Shahidzadeh S

PubMed · Sep 1 2025
This study aimed to assess the efficacy of deep learning applications for the detection of nasal bone fracture on X-ray nasal bone lateral views. In this retrospective observational study, 2968 X-ray nasal bone lateral views of trauma patients were collected from a radiology centre and randomly divided into training, validation, and test sets. Preprocessing included noise reduction using a Gaussian filter and image resizing. Edge detection was performed using the Canny edge detector. Feature extraction was conducted using the gray-level co-occurrence matrix (GLCM), histogram of oriented gradients (HOG), and local binary pattern (LBP) techniques. Several deep learning models, namely a CNN, VGG16, VGG19, MobileNet, Xception, ResNet50V2, InceptionV3, and a Swin Transformer, were employed to classify images into 2 classes, normal and fracture. Accuracy was highest for VGG16 and the Swin Transformer (0.79), followed by ResNet50V2 and InceptionV3 (0.74), Xception (0.72), and MobileNet (0.71). The AUC was highest for VGG16 (0.86), followed by VGG19 (0.84), MobileNet and Xception (0.83), and the Swin Transformer (0.79). The tested deep learning models were capable of detecting nasal bone fractures on X-ray nasal bone lateral views with high accuracy. VGG16 was the best-performing model.
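Of the hand-crafted features named above, the local binary pattern is the simplest to show concretely: each pixel is replaced by an 8-bit code recording which of its 8 neighbours are at least as bright as it. A basic sketch (the study's exact LBP variant, radius, and binning are not stated):

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour local binary pattern.

    gray: 2D grayscale image
    Returns an (h-2, w-2) array of 8-bit LBP codes for the interior pixels.
    """
    g = np.asarray(gray, dtype=float)
    c = g[1:-1, 1:-1]                       # centre pixels
    # Neighbour offsets, one per bit, walking around the centre
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

Histograms of these codes give a compact, illumination-robust texture descriptor, which is why LBP is a common companion to GLCM and HOG in radiograph pipelines.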

Can super resolution via deep learning improve classification accuracy in dental radiography?

Çelik B, Mikaeili M, Genç MZ, Çelik ME

PubMed · Sep 1 2025
Deep learning-driven super resolution (SR) aims to enhance the quality and resolution of images, offering potential benefits in dental imaging. Although extensive research has focused on deep learning-based dental classification tasks, the impact of applying SR techniques on classification remains underexplored. This study seeks to address this gap by evaluating and comparing the performance of deep learning classification models on dental images with and without SR enhancement. An open-source dental image dataset was utilized to investigate the impact of SR on image classification performance. SR was applied by 2 models with scaling ratios of 2 and 4, while classification was performed by 4 deep learning models. Performance was evaluated with well-accepted metrics: structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), accuracy, recall, precision, and F1 score. The effect of SR on classification performance was interpreted through 2 different approaches. The 2 SR models yielded average SSIM and PSNR values of 0.904 and 36.71 across both scaling ratios. Average accuracy and F1 score for classifiers trained and tested on the SR-generated images were 0.859 and 0.873. In the first comparison approach, accuracy increased in at least half of the cases (8 out of 16) across different models and scaling ratios, while in the second approach, SR showed significantly higher performance in almost all cases (12 out of 16). This study demonstrates that classification with SR-generated images significantly improved outcomes. For the first time, the classification performance of dental radiographs with resolution improved by SR has been investigated, and significant performance improvement was observed compared to the case without SR.
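Of the image-quality metrics used above, PSNR has a simple closed form: 10·log10(MAX² / MSE), where MAX is the largest possible pixel value. A minimal implementation (the default `max_val=255.0` assumes 8-bit images):

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    reconstructed (e.g. super-resolved) image."""
    ref = np.asarray(reference, dtype=float)
    rec = np.asarray(test, dtype=float)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float('inf')          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

A reported PSNR of 36.71 dB thus corresponds to a root-mean-square pixel error of roughly 3.7 gray levels on an 8-bit scale, which is helpful context for interpreting the study's averages.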

Synthetic Orthopantomography Image Generation Using Generative Adversarial Networks for Data Augmentation.

Waqas M, Hasan S, Ghori AF, Alfaraj A, Faheemuddin M, Khurshid Z

PubMed · Sep 1 2025
To overcome the scarcity of annotated dental X-ray datasets, this study presents a novel pipeline for generating high-resolution synthetic orthopantomography (OPG) images using customized generative adversarial networks (GANs). A total of 4777 real OPG images were collected from clinical centres in Pakistan, Thailand, and the U.S., covering diverse anatomical features. Twelve GAN models were initially trained, with four top-performing variants selected for further training on both combined and region-specific datasets. Synthetic images were generated at 2048 × 1024 pixels, maintaining fine anatomical detail. The evaluation was conducted using (1) a YOLO-based object detection model trained on real OPGs to assess feature representation via mean average precision, and (2) expert dentist scoring for anatomical and diagnostic realism. All selected models produced realistic synthetic OPGs. The YOLO detector achieved strong performance on these images, indicating accurate structural representation. Expert evaluations confirmed high anatomical plausibility, with models M1 and M3 achieving over 50% of the reference scores assigned to real OPGs. The developed GAN-based pipeline enables the ethical and scalable creation of synthetic OPG images, suitable for augmenting datasets used in artificial intelligence-driven dental diagnostics. This method provides a practical solution to data limitations in dental artificial intelligence, supporting model development in privacy-sensitive or low-resource environments.

TransForSeg: A Multitask Stereo ViT for Joint Stereo Segmentation and 3D Force Estimation in Catheterization

Pedram Fekri, Mehrdad Zadeh, Javad Dargahi

arXiv preprint · Sep 1 2025
Recently, the emergence of multitask deep learning models has enhanced catheterization procedures by providing tactile and visual perception data through an end-to-end architecture. This information is derived from a segmentation and force estimation head, which localizes the catheter in X-ray images and estimates the applied pressure based on its deflection within the image. These stereo vision architectures incorporate a CNN-based encoder-decoder that captures the dependencies between X-ray images from two viewpoints, enabling simultaneous 3D force estimation and stereo segmentation of the catheter. With these tasks in mind, this work approaches the problem from a new perspective. We propose a novel encoder-decoder Vision Transformer model that processes two input X-ray images as separate sequences. Given sequences of X-ray patches from two perspectives, the transformer captures long-range dependencies without the need to gradually expand the receptive field for either image. The embeddings generated by both the encoder and decoder are fed into two shared segmentation heads, while a regression head employs the fused information from the decoder for 3D force estimation. The proposed model is a stereo Vision Transformer capable of simultaneously segmenting the catheter from two angles while estimating the generated forces at its tip in 3D. This model has undergone extensive experiments on synthetic X-ray images with various noise levels and has been compared against state-of-the-art pure segmentation models, vision-based catheter force estimation methods, and a multitask catheter segmentation and force estimation approach. It outperforms existing models, setting a new state-of-the-art in both catheter segmentation and force estimation.
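Feeding two X-ray views to a Vision Transformer as separate sequences starts with the standard patchify step: each image is split into non-overlapping patches that are flattened into vectors before embedding. A minimal sketch of that step (the patch size of 16 is illustrative, not taken from the paper):

```python
import numpy as np

def image_to_patch_sequence(img, patch=16):
    """Split a 2D image into a sequence of flattened non-overlapping patches,
    as done before linear embedding in a Vision Transformer encoder."""
    h, w = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    seq = (img.reshape(h // patch, patch, w // patch, patch)
              .transpose(0, 2, 1, 3)           # group patch rows/cols together
              .reshape(-1, patch * patch))      # one row vector per patch
    return seq  # shape: (num_patches, patch*patch)
```

Running this on each of the two viewpoint images would yield the two token sequences the model attends over, letting self-attention relate distant patches immediately instead of growing a convolutional receptive field layer by layer.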
