
Deep compressed multichannel adaptive optics scanning light ophthalmoscope.

Park J, Hagan K, DuBose TB, Maldonado RS, McNabb RP, Dubra A, Izatt JA, Farsiu S

PubMed · May 9, 2025
Adaptive optics scanning light ophthalmoscopy (AOSLO) reveals individual retinal cells and their function, microvasculature, and micropathologies in vivo. Compared with the single-channel offset pinhole and two-channel split-detector nonconfocal AOSLO designs, a recent generation of multidetector and (multi-)offset aperture AOSLO modalities provides multidirectional imaging capabilities and has been demonstrated to yield critical information about retinal microstructures. However, increasing the number of detection channels requires expensive optical components and/or critically increases imaging time. To address this issue, we present an innovative combination of machine learning and optics, integrated as a single technology, to compressively capture 12 nonconfocal-channel AOSLO images simultaneously. Imaging of healthy participants and diseased subjects using the proposed deep compressed multichannel AOSLO showed enhanced visualization of rods, cones, and mural cells, with over an order-of-magnitude improvement in imaging speed compared with conventional offset aperture imaging. To facilitate adaptation and integration with other in vivo microscopy systems, we have made the optical design, acquisition, and computational reconstruction code open source.
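The abstract describes a learned reconstruction that recovers 12 nonconfocal channels from a compressed simultaneous acquisition. Below is a minimal, hypothetical PyTorch sketch of that idea; the channel counts, layer sizes, and class name are illustrative assumptions, not the authors' released architecture.

```python
# Hedged sketch: reconstruct 12 nonconfocal AOSLO channels from a smaller number of
# compressively multiplexed detector frames with a small CNN. All names and sizes
# (n_compressed, CompressedReconstructionNet) are illustrative assumptions.
import torch
import torch.nn as nn

class CompressedReconstructionNet(nn.Module):
    def __init__(self, n_compressed: int = 3, n_channels_out: int = 12):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_compressed, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, n_channels_out, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_compressed, H, W) compressed detector frames
        return self.net(x)

if __name__ == "__main__":
    model = CompressedReconstructionNet()
    frames = torch.randn(1, 3, 256, 256)   # simulated compressed acquisition
    channels = model(frames)                # (1, 12, 256, 256) reconstructed channels
    print(channels.shape)
```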

Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation

Kunpeng Qiu, Zhiqiang Gao, Zhiying Zhou, Mingjie Sun, Yongxin Guo

arXiv preprint · May 9, 2025
Deep learning has revolutionized medical image segmentation, yet its full potential remains constrained by the paucity of annotated datasets. While diffusion models have emerged as a promising approach for generating synthetic image-mask pairs to augment these datasets, they paradoxically suffer from the same data scarcity challenges they aim to mitigate. Traditional mask-only models frequently yield low-fidelity images because they cannot adequately capture morphological intricacies, which can critically compromise the robustness and reliability of segmentation models. To alleviate this limitation, we introduce Siamese-Diffusion, a novel dual-component model comprising Mask-Diffusion and Image-Diffusion. During training, a Noise Consistency Loss is imposed between these components to enhance the morphological fidelity of Mask-Diffusion in the parameter space. During sampling, only Mask-Diffusion is used, ensuring diversity and scalability. Comprehensive experiments demonstrate the superiority of our method: Siamese-Diffusion boosts SANet's mDice and mIoU by 3.6% and 4.4% on the Polyps dataset, while UNet improves by 1.52% and 1.64% on ISIC2018. Code is available on GitHub.
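As an illustration of a noise-consistency objective between the Mask-Diffusion and Image-Diffusion branches, here is a hedged PyTorch sketch; the exact loss form, weighting `lam`, and stop-gradient placement are assumptions and may differ from the paper.

```python
# Hedged sketch of a noise-consistency term between a mask-conditioned and an
# image-conditioned denoiser; not the authors' exact formulation.
import torch
import torch.nn.functional as F

def noise_consistency_loss(eps_mask_branch: torch.Tensor,
                           eps_image_branch: torch.Tensor) -> torch.Tensor:
    # Pull the mask-only branch toward the (detached) image-conditioned branch,
    # nudging its predictions toward higher-fidelity morphology.
    return F.mse_loss(eps_mask_branch, eps_image_branch.detach())

def total_loss(eps_true, eps_mask_branch, eps_image_branch, lam: float = 0.1):
    # Standard denoising losses for both branches plus the consistency term.
    l_mask = F.mse_loss(eps_mask_branch, eps_true)
    l_image = F.mse_loss(eps_image_branch, eps_true)
    return l_mask + l_image + lam * noise_consistency_loss(eps_mask_branch, eps_image_branch)
```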

DFEN: Dual Feature Equalization Network for Medical Image Segmentation

Jianjian Yin, Yi Chen, Chengyu Li, Zhichao Zheng, Yanhui Gu, Junsheng Zhou

arXiv preprint · May 9, 2025
Current methods for medical image segmentation primarily focus on extracting contextual feature information from the perspective of the whole image. While these methods have shown effective performance, none of them account for the fact that pixels at class boundaries and in regions with few class pixels absorb more contextual feature information from other classes, and this unequal contextual information leads to pixel misclassification. In this paper, we propose a dual feature equalization network based on a hybrid Swin Transformer and Convolutional Neural Network architecture, aiming to augment pixel feature representations with image-level and class-level equalization feature information. First, an image-level feature equalization module is designed to equalize the contextual information of pixels within the image. Second, we aggregate regions of the same class to equalize the pixel feature representations of the corresponding class via a class-level feature equalization module. Finally, the pixel feature representations are enhanced by learning weights for the image-level and class-level equalization feature information. In addition, Swin Transformer is utilized as both the encoder and decoder, thereby bolstering the model's ability to capture long-range dependencies and spatial correlations. We conducted extensive experiments on the Breast Ultrasound Images (BUSI), International Skin Imaging Collaboration (ISIC2017), Automated Cardiac Diagnosis Challenge (ACDC), and PH$^2$ datasets. The experimental results demonstrate that our method achieves state-of-the-art performance. Our code is publicly available at https://github.com/JianJianYin/DFEN.
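A hedged sketch of what class-level feature equalization could look like: pool a prototype feature per class from a soft class assignment, then hand each pixel its class prototype to augment its representation. This is one plausible reading of the module, not the authors' implementation (their repository holds the real code).

```python
# Hedged sketch of class-level feature equalization; an illustrative interpretation only.
import torch

def class_level_equalization(features: torch.Tensor, class_probs: torch.Tensor) -> torch.Tensor:
    # features:    (B, C, H, W) pixel features
    # class_probs: (B, K, H, W) softmax class probabilities
    B, C, H, W = features.shape
    f = features.flatten(2)                       # (B, C, HW)
    p = class_probs.flatten(2)                    # (B, K, HW)
    # Class prototypes: probability-weighted mean feature per class.
    prototypes = torch.einsum("bkn,bcn->bkc", p, f) / (p.sum(-1, keepdim=True) + 1e-6)
    # Redistribute prototypes to pixels according to their class probabilities.
    equalized = torch.einsum("bkn,bkc->bcn", p, prototypes).view(B, C, H, W)
    return equalized
```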

Towards Better Cephalometric Landmark Detection with Diffusion Data Generation

Dongqian Guo, Wencheng Han, Pang Lyu, Yuxi Zhou, Jianbing Shen

arXiv preprint · May 9, 2025
Cephalometric landmark detection is essential for orthodontic diagnostics and treatment planning. Nevertheless, the scarcity of samples in data collection and the extensive effort required for manual annotation have significantly impeded the availability of diverse datasets. This limitation has restricted the effectiveness of deep learning-based detection methods, particularly those based on large-scale vision models. To address these challenges, we have developed an innovative data generation method capable of producing diverse cephalometric X-ray images along with corresponding annotations without human intervention. Our approach begins by constructing new cephalometric landmark annotations using anatomical priors. We then employ a diffusion-based generator to create realistic X-ray images that correspond closely to these annotations. To achieve precise control over samples with different attributes, we introduce a novel prompt cephalometric X-ray image dataset, which includes real cephalometric X-ray images and detailed medical text prompts describing them. By leveraging these detailed prompts, our method improves the generation process to control different styles and attributes. Aided by the large, diverse generated data, we introduce large-scale vision detection models into the cephalometric landmark detection task to improve accuracy. Experimental results demonstrate that training with the generated data substantially enhances performance: compared to methods without the generated data, our approach improves the Success Detection Rate (SDR) by 6.5%, attaining a notable 82.2%. All code and data are available at: https://um-lab.github.io/cepha-generation
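The Success Detection Rate (SDR) quoted above is typically computed as the fraction of predicted landmarks falling within a fixed radius of the ground truth. A small illustrative sketch, with an assumed 2 mm radius and pixel spacing:

```python
# Hedged sketch of the SDR metric; threshold, spacing, and the simulated data are assumptions.
import numpy as np

def success_detection_rate(pred: np.ndarray, gt: np.ndarray,
                           spacing_mm: float = 0.1, radius_mm: float = 2.0) -> float:
    # pred, gt: (N, 2) landmark coordinates in pixels
    dist_mm = np.linalg.norm(pred - gt, axis=1) * spacing_mm
    return float((dist_mm <= radius_mm).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(0, 2000, size=(19, 2))        # 19 landmarks, a common cephalometric set
    pred = gt + rng.normal(0, 10, size=gt.shape)   # simulated predictions
    print(f"SDR@2mm: {success_detection_rate(pred, gt):.3f}")
```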

Comparative analysis of open-source against commercial AI-based segmentation models for online adaptive MR-guided radiotherapy.

Langner D, Nachbar M, Russo ML, Boeke S, Gani C, Niyazi M, Thorwarth D

PubMed · May 8, 2025
Online adaptive magnetic resonance-guided radiotherapy (MRgRT) has emerged as a state-of-the-art treatment option for multiple tumour entities, accounting for daily anatomical and tumour volume changes and thus allowing sparing of relevant organs at risk (OARs). However, the annotation of treatment-relevant anatomical structures in the context of online plan adaptation remains challenging, often relying on commercial segmentation solutions due to the limited availability of clinically validated alternatives. The aim of this study was to investigate whether an open-source artificial intelligence (AI) segmentation network can compete with the annotation accuracy of a commercial solution, both trained on the identical dataset, questioning the need for commercial models in clinical practice. For 47 pelvic patients, T2w MR imaging data acquired on a 1.5 T MR-Linac were manually contoured, identifying prostate, seminal vesicles, rectum, anal canal, bladder, penile bulb, and bony structures. These training data were used to generate an in-house AI segmentation model, an nnU-Net with residual encoder architecture featuring a streamlined single-image inference pipeline, and to re-train a commercial solution. For quantitative evaluation, 20 MR images were contoured by a radiation oncologist, considered as ground truth contours (GTC), and compared with the in-house/commercial AI-based contours (iAIC/cAIC) using the Dice Similarity Coefficient (DSC), 95% Hausdorff distance (HD95), and surface DSC (sDSC). For qualitative evaluation, four radiation oncologists assessed the usability of OAR/target iAIC within an online adaptive workflow using a four-point Likert scale: (1) acceptable without modification, (2) requiring minor adjustments, (3) requiring major adjustments, and (4) not usable. Patient-individual annotations were generated in a median [range] time of 23 [16-34] s for iAIC and 152 [121-198] s for cAIC, respectively. OARs showed a maximum median DSC of 0.97/0.97 (iAIC/cAIC) for the bladder and a minimum median DSC of 0.78/0.79 (iAIC/cAIC) for the anal canal/penile bulb. The maximal and minimal median HD95 were found for the rectum, with 17.3/20.6 mm (iAIC/cAIC), and for the bladder, with 5.6/6.0 mm (iAIC/cAIC), respectively. Overall, the average median DSC/HD95 values were 0.87/11.8 mm (iAIC) and 0.83/10.2 mm (cAIC) for OARs/targets, and 0.90/11.9 mm (iAIC) and 0.91/16.5 mm (cAIC) for bony structures. For a tolerance of 3 mm, the highest sDSC was determined for the bladder (iAIC: 1.00, cAIC: 0.99) and the lowest for the prostate in iAIC (0.89) and the anal canal in cAIC (0.80). Qualitatively, 84.8% of analysed contours were considered clinically acceptable for iAIC, while 12.9% required minor and 2.3% major adjustments or were classified as unusable. Contour-specific analysis showed that iAIC achieved the best mean score (1.00) for the anal canal and the worst (1.61) for the prostate. This study demonstrates that an open-source segmentation framework can achieve annotation accuracy comparable to commercial solutions for pelvic anatomy in online adaptive MRgRT. The adapted framework not only maintained high segmentation performance, with 84.8% of contours accepted by physicians and a further 12.9% requiring only minor corrections, but also enhanced the clinical workflow efficiency of online adaptive MRgRT through reduced inference times. These findings establish open-source frameworks as viable alternatives to commercial systems in supervised clinical workflows.
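For reference, the reported DSC and HD95 values can be computed from binary masks roughly as follows; this sketch assumes isotropic voxel spacing and is not the evaluation code used in the study.

```python
# Hedged sketch of Dice similarity coefficient and 95th-percentile Hausdorff distance
# for binary masks, assuming isotropic spacing.
import numpy as np
from scipy import ndimage

def dice(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def hd95(a: np.ndarray, b: np.ndarray, spacing_mm: float = 1.0) -> float:
    a, b = a.astype(bool), b.astype(bool)
    surf_a = a ^ ndimage.binary_erosion(a)        # boundary voxels of each mask
    surf_b = b ^ ndimage.binary_erosion(b)
    # Distance from every voxel to the nearest surface voxel of the other mask.
    dt_a = ndimage.distance_transform_edt(~surf_a) * spacing_mm
    dt_b = ndimage.distance_transform_edt(~surf_b) * spacing_mm
    dists = np.concatenate([dt_b[surf_a], dt_a[surf_b]])
    return float(np.percentile(dists, 95))
```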

Weakly supervised language models for automated extraction of critical findings from radiology reports.

Das A, Talati IA, Chaves JMZ, Rubin D, Banerjee I

PubMed · May 8, 2025
Critical findings in radiology reports are life-threatening conditions that need to be communicated promptly to physicians for timely patient management. Although challenging, advancements in natural language processing (NLP), particularly large language models (LLMs), now enable the automated identification of key findings from verbose reports. Given the scarcity of labeled critical-findings data, we implemented a two-phase, weakly supervised fine-tuning approach on 15,000 unlabeled Mayo Clinic reports. The fine-tuned model then automatically extracted critical terms on internal (Mayo Clinic, n = 80) and external (MIMIC-III, n = 123) test datasets, validated against expert annotations. Model performance was further assessed on 5000 MIMIC-IV reports using the LLM-aided metrics G-eval and Prometheus. Both manual and LLM-based evaluations showed improved task alignment with weak supervision. The pipeline and model, publicly available under an academic license, can aid critical-finding extraction for research and clinical use ( https://github.com/dasavisha/CriticalFindings_Extract ).
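A hedged sketch of how such a fine-tuned extractor might be applied to a report with the Hugging Face `transformers` pipeline; the model path and prompt format are placeholders, not the authors' released checkpoint (see their GitHub for the actual model).

```python
# Hedged sketch: run a fine-tuned critical-findings extractor over one report.
# The model identifier below is a placeholder, not a real released checkpoint.
from transformers import pipeline

extractor = pipeline(
    "text-generation",
    model="path/to/critical-findings-model",   # placeholder: substitute the released weights
)

report = (
    "CHEST CT: Large right-sided pneumothorax with mediastinal shift. "
    "No acute osseous abnormality."
)
prompt = (
    "Extract the critical findings from the following radiology report:\n"
    f"{report}\nCritical findings:"
)
print(extractor(prompt, max_new_tokens=64)[0]["generated_text"])
```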

Automated Thoracolumbar Stump Rib Detection and Analysis in a Large CT Cohort

Hendrik Möller, Hanna Schön, Alina Dima, Benjamin Keinert-Weth, Robert Graf, Matan Atad, Johannes Paetzold, Friederike Jungmann, Rickmer Braren, Florian Kofler, Bjoern Menze, Daniel Rueckert, Jan S. Kirschke

arXiv preprint · May 8, 2025
Thoracolumbar stump ribs are one of the essential indicators of thoracolumbar transitional vertebrae or enumeration anomalies. While some studies manually assess these anomalies and describe the ribs qualitatively, this study aims to automate thoracolumbar stump rib detection and analyze their morphology quantitatively. To this end, we train a high-resolution deep-learning model for rib segmentation and show significant improvements compared to existing models (Dice score 0.997 vs. 0.779, p-value < 0.01). In addition, we use an iterative algorithm and piece-wise linear interpolation to assess the length of the ribs, showing a success rate of 98.2%. When analyzing morphological features, we show that stump ribs articulate more posteriorly at the vertebrae (-19.2 ± 3.8 vs. -13.8 ± 2.5, p-value < 0.01), are thinner (260.6 ± 103.4 vs. 563.6 ± 127.1, p-value < 0.01), and are oriented more downwards and sideways within the first centimeters, in contrast to full-length ribs. We show that with partially visible ribs, these features can achieve an F1-score of 0.84 in differentiating stump ribs from regular ones. We publish the model weights and masks for public use.
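The rib-length measurement relies on piece-wise linear interpolation along the rib; conceptually, once an ordered centerline is available, the length is the sum of segment lengths. A minimal sketch under that assumption (the iterative centerline extraction itself is omitted):

```python
# Hedged sketch: approximate rib length as the sum of piece-wise linear centerline segments.
import numpy as np

def polyline_length_mm(points: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    # points: (N, 3) ordered centerline coordinates in voxel indices
    deltas = np.diff(points * np.asarray(spacing), axis=0)
    return float(np.linalg.norm(deltas, axis=1).sum())

if __name__ == "__main__":
    t = np.linspace(0, np.pi / 2, 50)
    centerline = np.stack([100 * np.cos(t), 100 * np.sin(t), np.zeros_like(t)], axis=1)
    # Quarter circle of radius 100 mm, true length ~157.1 mm
    print(f"approximate length: {polyline_length_mm(centerline):.1f} mm")
```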

Interpretable MRI-Based Deep Learning for Alzheimer's Risk and Progression

Lu, B., Chen, Y.-R., Li, R.-X., Zhang, M.-K., Yan, S.-Z., Chen, G.-Q., Castellanos, F. X., Thompson, P. M., Lu, J., Han, Y., Yan, C.-G.

medRxiv preprint · May 7, 2025
Timely intervention for Alzheimer's disease (AD) requires early detection. The development of immunotherapies targeting amyloid-beta and tau underscores the need for accessible, time-efficient biomarkers for early diagnosis. Here, we directly applied our previously developed MRI-based deep learning model for AD to the large Chinese SILCODE cohort (722 participants, 1,105 brain MRI scans). The model, initially trained on North American data, demonstrated robust cross-ethnic generalization without any retraining or fine-tuning, achieving an AUC of 91.3% in AD classification with a sensitivity of 95.2%. It successfully identified 86.7% of individuals at risk of AD progression more than 5 years in advance. Individuals identified as high-risk exhibited significantly shorter median progression times. By integrating an interpretable deep learning brain risk map approach, we identified AD brain subtypes, including an MCI subtype associated with rapid cognitive decline. The model's risk scores showed significant correlations with cognitive measures and plasma biomarkers, such as tau proteins and neurofilament light chain (NfL). These findings underscore the exceptional generalizability and clinical utility of MRI-based deep learning models, especially in large and diverse populations, offering valuable tools for early therapeutic intervention. The model has been made open source and deployed to a free online website for AD risk prediction, to assist in early screening and intervention.
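The reported AUC and sensitivity can be derived from per-scan risk scores and diagnostic labels; the sketch below uses simulated data and an arbitrary operating point purely to illustrate the computation.

```python
# Hedged sketch: compute AUC and sensitivity at a chosen operating point from risk scores.
# Labels and scores are simulated; the threshold choice is an illustrative assumption.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=500)                                # 1 = AD, 0 = control (simulated)
scores = labels * rng.normal(0.8, 0.3, 500) + (1 - labels) * rng.normal(0.2, 0.3, 500)

auc = roc_auc_score(labels, scores)
fpr, tpr, thresholds = roc_curve(labels, scores)
# Pick the first threshold whose sensitivity (TPR) reaches 0.95, as one possible operating point.
idx = int(np.argmax(tpr >= 0.95))
print(f"AUC={auc:.3f}, sensitivity={tpr[idx]:.2f} at threshold {thresholds[idx]:.2f}")
```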

Enhancing efficient deep learning models with multimodal, multi-teacher insights for medical image segmentation.

Hossain KF, Kamran SA, Ong J, Tavakkoli A

PubMed · May 7, 2025
The rapid evolution of deep learning has dramatically enhanced the field of medical image segmentation, leading to models with unprecedented accuracy in analyzing complex medical images. Deep learning-based segmentation holds significant promise for advancing clinical care and enhancing the precision of medical interventions. However, these models' high computational demand and complexity present significant barriers to their application in resource-constrained clinical settings. To address this challenge, we introduce Teach-Former, a novel knowledge distillation (KD) framework that leverages a Transformer backbone to condense the knowledge of multiple teacher models into a single, streamlined student model. It also excels in the contextual and spatial interpretation of relationships across multimodal images for more accurate and precise segmentation. Teach-Former stands out by harnessing multimodal inputs (CT, PET, MRI) and distilling both the final predictions and the intermediate attention maps, ensuring a richer transfer of spatial and contextual knowledge. Through this technique, the student model inherits the capacity for fine segmentation while operating with a significantly reduced parameter set and computational footprint. Additionally, a novel training strategy optimizes knowledge transfer, ensuring the student model captures the intricate mapping of features essential for high-fidelity segmentation. The efficacy of Teach-Former has been tested on two extensive multimodal datasets, HECKTOR21 and PI-CAI22, encompassing various image types. The results demonstrate that our KD strategy reduces model complexity and surpasses existing state-of-the-art methods in performance. The findings of this study indicate that the proposed methodology could facilitate efficient segmentation of complex multimodal medical images, supporting clinicians in achieving more precise diagnoses and comprehensive monitoring of pathological conditions ( https://github.com/FarihaHossain/TeachFormer ).
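A hedged sketch of a multi-teacher distillation objective that matches both soft predictions and attention maps, in the spirit of the description above; the temperature, weighting, and teacher-averaging scheme are assumptions rather than the paper's exact formulation.

```python
# Hedged sketch of a multi-teacher KD loss combining soft-prediction and attention-map matching.
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list,
                          student_attn, teacher_attn_list,
                          T: float = 2.0, alpha: float = 0.5):
    # Average the teachers' softened predictions and attention maps.
    soft_teachers = torch.stack([F.softmax(t / T, dim=1) for t in teacher_logits_list]).mean(0)
    mean_attn = torch.stack(teacher_attn_list).mean(0)
    # KL divergence between softened student and averaged teacher predictions.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1), soft_teachers,
                  reduction="batchmean") * (T * T)
    # Match the student's intermediate attention maps to the averaged teacher maps.
    attn = F.mse_loss(student_attn, mean_attn)
    return alpha * kd + (1 - alpha) * attn
```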

ChatOCT: Embedded Clinical Decision Support Systems for Optical Coherence Tomography in Offline and Resource-Limited Settings.

Liu C, Zhang H, Zheng Z, Liu W, Gu C, Lan Q, Zhang W, Yang J

PubMed · May 7, 2025
Optical Coherence Tomography (OCT) is a critical imaging modality for diagnosing ocular and systemic conditions, yet its accessibility is hindered by the need for specialized expertise and high computational demands. To address these challenges, we introduce ChatOCT, an offline-capable, domain-adaptive clinical decision support system (CDSS) that integrates structured expert Q&A generation, OCT-specific knowledge injection, and activation-aware model compression. Unlike existing systems, ChatOCT functions without internet access, making it suitable for low-resource environments. ChatOCT is built upon LLaMA-2-7B, incorporating domain-specific knowledge from PubMed and OCT News through a two-stage training process: (1) knowledge injection for OCT-specific expertise and (2) Q&A instruction tuning for structured, interactive diagnostic reasoning. To ensure feasibility in offline environments, we apply activation-aware weight quantization, reducing GPU memory usage to ~4.74 GB and enabling deployment on standard OCT hardware. A novel expert answer generation framework mitigates hallucinations by structuring responses in a multi-step process, ensuring accuracy and interpretability. ChatOCT outperforms state-of-the-art baselines such as LLaMA-2, PMC-LLaMA-13B, and ChatDoctor by 10-15 points in coherence, relevance, and clinical utility, while reducing GPU memory requirements by 79% and maintaining real-time responsiveness (~20 ms inference time). Expert ophthalmologists rated ChatOCT's outputs as clinically actionable and aligned with real-world decision-making needs, confirming its potential to assist frontline healthcare providers. ChatOCT represents an innovative offline clinical decision support system for OCT that runs entirely on local embedded hardware, enabling real-time analysis in resource-limited settings without internet connectivity. By offering a scalable, generalizable pipeline that integrates knowledge injection, instruction tuning, and model compression, ChatOCT provides a blueprint for next-generation, resource-efficient clinical AI solutions across multiple medical domains.
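A hedged sketch of offline inference with a locally stored, quantized LLaMA-2-7B derivative and a structured multi-step prompt; the checkpoint path and prompt template are placeholders, not the actual ChatOCT release.

```python
# Hedged sketch: load a local, quantized LLaMA-2-7B-style checkpoint and run a structured
# OCT question offline. The path and prompt wording are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./chatoct-llama2-7b-quantized"   # placeholder for a local, AWQ-quantized checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

prompt = (
    "You are an OCT clinical assistant. Answer in steps: "
    "(1) describe the findings, (2) list differential diagnoses, (3) suggest next steps.\n"
    "Findings: subretinal fluid with a dome-shaped elevation of the neurosensory retina.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```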