Latest Papers on Radiology AI. Tags: OCT, Order: Best Match, Limit: 10.

MitoStructSeg: mitochondrial structural complexity resolution via adaptive learning for cross-sample morphometric profiling

Wang, X., Wan, X., Cai, B., Jia, Z., Chen, Y., Guo, S., Liu, Z., Zhang, F., Hu, B.

•preprint•Jul 30 2025

Mitochondrial morphology and structural changes are closely associated with metabolic dysfunction and disease progression. However, the structural complexity of mitochondria presents a major challenge for accurate segmentation and analysis. Most existing methods focus on delineating entire mitochondria but lack the capability to resolve fine internal features, particularly cristae. In this study, we introduce MitoStructSeg, a deep learning-based framework for mitochondrial structure segmentation and quantitative analysis. The core of MitoStructSeg is AMM-Seg, a novel model that integrates domain adaptation to improve cross-sample generalization, dual-channel feature fusion to enhance structural detail extraction, and continuity learning to preserve spatial coherence. This architecture enables accurate segmentation of both mitochondrial membranes and intricately folded cristae. MitoStructSeg further incorporates a quantitative analysis module that extracts key morphological metrics, including surface area, volume, and cristae density, allowing comprehensive and scalable assessment of mitochondrial morphology. The effectiveness of our approach has been validated on both human myocardial tissue and mouse kidney tissue, demonstrating its robustness in accurately segmenting mitochondria with diverse morphologies. In addition, we provide an open source, user-friendly tool to ensure practical usability.

OCT Segmentation Methodology In Silico Academic Lab Open Code

Efficacy of image similarity as a metric for augmenting small dataset retinal image segmentation.

Wallace T, Heng IS, Subasic S, Messenger C

•papers•Jul 30 2025

Synthetic images are an option for augmenting limited medical imaging datasets to improve the performance of various machine learning models. A common metric for evaluating synthetic image quality is the Fréchet Inception Distance (FID), which measures the similarity of two image datasets. In this study we evaluate the relationship between this metric and the improvement which synthetic images, generated by a Progressively Growing Generative Adversarial Network (PGGAN), grant when augmenting Diabetes-related Macular Edema (DME) intraretinal fluid segmentation performed by a U-Net model with limited amounts of training data. We find that the behaviour of augmenting with standard and synthetic images agrees with previously conducted experiments. Additionally, we show that dissimilar (high FID) datasets do not improve segmentation significantly. As FID between the training and augmenting datasets decreases, the augmentation datasets are shown to contribute to significant and robust improvements in image segmentation. Finally, we find that there is significant evidence to suggest that synthetic and standard augmentations follow separate log-normal trends between FID and improvements in model performance, with synthetic data proving more effective than standard augmentation techniques. Our findings show that more similar datasets (lower FID) will be more effective at improving U-Net performance; however, the results also suggest that this improvement may only occur when images are sufficiently dissimilar.

OCT Segmentation Methodology In Silico Open Dataset
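
For reference, the abstract above relies on the Fréchet Inception Distance without stating it. The standard definition is reproduced below; the symbols are notation chosen here, not taken from the paper: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception features for the two image sets being compared.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Identical feature distributions give an FID of zero, and larger values indicate less similar datasets, which is the sense in which the abstract speaks of "high FID" and "lower FID".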

Deep learning aging marker from retinal images unveils sex-specific clinical and genetic signatures

Trofimova, O., Böttger, L., Bors, S., Pan, Y., Liefers, B., Beyeler, M. J., Presby, D. M., Bontempi, D., Hastings, J., Klaver, C. C. W., Bergmann, S.

•preprint•Jul 29 2025

Retinal fundus images offer a non-invasive window into systemic aging. Here, we fine-tuned a foundation model (RETFound) to predict chronological age from color fundus images in 71,343 participants from the UK Biobank, achieving a mean absolute error of 2.85 years. The resulting retinal age gap (RAG), i.e., the difference between predicted and chronological age, was associated with cardiometabolic traits, inflammation, cognitive performance, mortality, dementia, cancer, and incident cardiovascular disease. Genome-wide analyses identified genes related to longevity, metabolism, neurodegeneration, and age-related eye diseases. Sex-stratified models revealed consistent performance but divergent biological signatures: males had younger-appearing retinas and stronger links to metabolic syndrome, while in females, both model attention and genetic associations pointed to a greater involvement of retinal vasculature. Our study positions retinal aging as a biologically meaningful and sex-sensitive biomarker that can support more personalized approaches to risk assessment and aging-related healthcare.

OCT Registration Retrospective Clinical In Silico Academic Lab Benchmark SOTA
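
The retinal age gap described in the abstract above can be illustrated with a few lines of Python. This is a minimal sketch, not the authors' pipeline: the function names and the ages are made up for illustration, and the sign convention (predicted minus chronological) is an assumption, since the abstract only says "the difference between predicted and chronological age".

```python
from statistics import mean


def retinal_age_gap(predicted, chronological):
    """Per-participant gap in years (assumed sign: predicted minus chronological)."""
    return [p - c for p, c in zip(predicted, chronological)]


def mean_absolute_error(predicted, chronological):
    """Mean of the absolute gaps, the error metric quoted in the abstract."""
    return mean(abs(gap) for gap in retinal_age_gap(predicted, chronological))


# Hypothetical ages in years, for illustration only.
predicted = [62.0, 55.5, 70.0, 48.0]
chronological = [60.0, 58.0, 70.0, 45.0]

gaps = retinal_age_gap(predicted, chronological)
mae = mean_absolute_error(predicted, chronological)
```

Under this convention a positive gap means the retina appears older than the participant's chronological age.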

SwinECAT: A Transformer-based fundus disease classification model with Shifted Window Attention and Efficient Channel Attention

Peiran Gu, Teng Yao, Mengshen He, Fuhao Duan, Feiyan Liu, RenYuan Peng, Bao Ge

•preprint•Jul 29 2025

In recent years, artificial intelligence has been increasingly applied in the field of medical imaging. Among these applications, fundus image analysis presents special challenges, including small lesion areas in certain fundus diseases and subtle inter-disease differences, which can lead to reduced prediction accuracy and overfitting in the models. To address these challenges, this paper proposes the Transformer-based model SwinECAT, which combines Shifted Window (Swin) Attention with Efficient Channel Attention (ECA). SwinECAT leverages the Swin Attention mechanism in the Swin Transformer backbone to effectively capture local spatial structures and long-range dependencies within fundus images. The lightweight ECA mechanism is incorporated to guide SwinECAT's attention toward critical feature channels, enabling more discriminative feature representation. In contrast to previous studies that typically classify fundus images into 4 to 6 categories, this work expands fundus disease classification to 9 distinct types, thereby enhancing the granularity of diagnosis. We evaluate our method on the Eye Disease Image Dataset (EDID) containing 16,140 fundus images for 9-category classification. Experimental results demonstrate that SwinECAT achieves 88.29% accuracy, with weighted F1-score of 0.88 and macro F1-score of 0.90. The classification results of our proposed model SwinECAT significantly outperform the baseline Swin Transformer and multiple compared baseline models. To our knowledge, this represents the highest reported performance for 9-category classification on this public dataset.

OCT Classification Methodology In Silico Academic Lab Benchmark SOTA
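
The abstract above reports both a weighted and a macro F1-score. The sketch below shows how the two averages are computed and why they can differ on imbalanced classes. It is a plain-Python illustration with made-up labels; the function names are hypothetical and it is not the paper's evaluation code.

```python
def per_class_f1(y_true, y_pred, labels):
    """Return ({label: F1}, {label: support}) from paired label lists."""
    scores, supports = {}, {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
        supports[label] = tp + fn  # number of true samples of this class
    return scores, supports


def macro_f1(scores):
    """Unweighted mean: every class counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_f1(scores, supports):
    """Mean weighted by class support: frequent classes count more."""
    total = sum(supports.values())
    return sum(scores[k] * supports[k] for k in scores) / total


# Made-up, imbalanced two-class example (8 samples of "a", 2 of "b").
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 7 + ["b"] + ["b", "a"]

scores, supports = per_class_f1(y_true, y_pred, ["a", "b"])
macro = macro_f1(scores)
weighted = weighted_f1(scores, supports)
```

In this toy case the rare class scores worse, so the macro average falls below the weighted one; the ordering reverses when rare classes are the better-classified ones.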

Harnessing infrared thermography and multi-convolutional neural networks for early breast cancer detection.

Attallah O

•papers•Jul 28 2025

Breast cancer is a relatively common carcinoma among women worldwide and remains a considerable public health concern. Consequently, the prompt identification of cancer is crucial, as research indicates that 96% of cancers are treatable if diagnosed prior to metastasis. Despite being considered the gold standard for breast cancer evaluation, conventional mammography possesses inherent drawbacks, including accessibility issues, especially in rural regions, and discomfort associated with the procedure. Therefore, there has been a surge in interest in non-invasive, radiation-free alternative diagnostic techniques, such as thermal imaging (thermography). Thermography employs infrared thermal sensors to capture and assess temperature maps of human breasts for the identification of potential tumours based on areas of thermal irregularity. This study proposes an advanced computer-aided diagnosis (CAD) system called Thermo-CAD to assess early breast cancer detection using thermal imaging, aimed at assisting radiologists. The CAD system employs a variety of deep learning techniques, specifically incorporating multiple convolutional neural networks (CNNs) to enhance diagnostic accuracy and reliability. To effectively integrate multiple deep features and diminish the dimensionality of features derived from each CNN, feature transformation and selection methods, including non-negative matrix factorization and Relief-F, are used, leading to a reduction in classification complexity. The Thermo-CAD system is assessed utilising two datasets: the DMR-IR (Database for Mastology Research Infrared Images), for distinguishing between normal and abnormal breast tissues, and a novel thermography dataset to distinguish abnormal instances as benign or malignant.
Thermo-CAD has proven to be an outstanding CAD system for thermographic breast cancer detection, attaining 100% accuracy on the DMR-IR dataset (normal versus abnormal breast cancer) using CSVM and MGSVM classifiers, and lower accuracy using LSVM and QSVM classifiers. However, it showed a lower ability to distinguish benign from malignant cases (second dataset), achieving an accuracy of 79.3% using CSVM. Yet, it remains a promising tool for early-stage cancer detection, especially in resource-constrained environments.

OCT Classification Breast Methodology In Silico Academic Lab Benchmark SOTA

Information Entropy-Based Framework for Quantifying Tortuosity in Meibomian Gland Uneven Atrophy

Kesheng Wang, Xiaoyu Chen, Chunlei He, Fenfen Li, Xinxin Yu, Dexing Kong, Shoujun Huang, Qi Dai

•preprint•Jul 24 2025

In the medical image analysis field, precise quantification of curve tortuosity plays a critical role in the auxiliary diagnosis and pathological assessment of various diseases. In this study, we propose a novel framework for tortuosity quantification and demonstrate its effectiveness through the evaluation of meibomian gland atrophy uniformity, serving as a representative application scenario. We introduce an information entropy-based tortuosity quantification framework that integrates probability modeling with entropy theory and incorporates domain transformation of curve data. Unlike traditional methods such as curvature or arc-chord ratio, this approach evaluates the tortuosity of a target curve by comparing it to a designated reference curve. Consequently, it is more suitable for tortuosity assessment tasks in medical data where biologically plausible reference curves are available, providing a more robust and objective evaluation metric without relying on idealized straight-line comparisons. First, we conducted numerical simulation experiments to preliminarily assess the stability and validity of the method. Subsequently, the framework was applied to quantify the spatial uniformity of meibomian gland atrophy and to analyze the difference in this uniformity between Demodex-negative and Demodex-positive patient groups. The results demonstrated a significant difference in tortuosity-based uniformity between the two groups, with an area under the curve of 0.8768, sensitivity of 0.75, and specificity of 0.93. These findings highlight the clinical utility of the proposed framework in curve tortuosity analysis and its potential as a generalizable tool for quantitative morphological evaluation in medical diagnostics.

OCT Segmentation Methodology In Silico Academic Lab
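
For context, the arc-chord ratio that the abstract above names as a traditional baseline can be sketched as follows. This illustrates only that baseline, not the paper's entropy-based framework; the function name and the sample curves are hypothetical.

```python
import math


def arc_chord_ratio(points):
    """Tortuosity as polyline length divided by the straight end-to-end distance."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    arc = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("closed curve: chord length is zero")
    return arc / chord


# A straight segment has ratio 1; any detour pushes the ratio above 1.
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]

straight_ratio = arc_chord_ratio(straight)
zigzag_ratio = arc_chord_ratio(zigzag)
```

The ratio always compares a curve against a straight line between its endpoints, which is the "idealized straight-line comparison" the abstract contrasts with its reference-curve approach.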

MaskedCLIP: Bridging the Masked and CLIP Space for Semi-Supervised Medical Vision-Language Pre-training

Lei Zhu, Jun Zhou, Rick Siow Mong Goh, Yong Liu

•preprint•Jul 23 2025

Foundation models have recently gained tremendous popularity in medical image analysis. State-of-the-art methods leverage either paired image-text data via vision-language pre-training or unpaired image data via self-supervised pre-training to learn foundation models with generalizable image features to boost downstream task performance. However, learning foundation models exclusively on either paired or unpaired image data limits their ability to learn richer and more comprehensive image features. In this paper, we investigate a novel task termed semi-supervised vision-language pre-training, aiming to fully harness the potential of both paired and unpaired image data for foundation model learning. To this end, we propose MaskedCLIP, a synergistic masked image modeling and contrastive language-image pre-training framework for semi-supervised vision-language pre-training. The key challenge in combining paired and unpaired image data for learning a foundation model lies in the incompatible feature spaces derived from these two types of data. To address this issue, we propose to connect the masked feature space with the CLIP feature space with a bridge transformer. In this way, the more semantic specific CLIP features can benefit from the more general masked features for semantic feature extraction. We further propose a masked knowledge distillation loss to distill semantic knowledge of original image features in CLIP feature space back to the predicted masked image features in masked feature space. With this mutually interactive design, our framework effectively leverages both paired and unpaired image data to learn more generalizable image features for downstream tasks. Extensive experiments on retinal image analysis demonstrate the effectiveness and data efficiency of our method.

OCT Classification Methodology In Silico Academic Lab GenAI

SLOTMFound: Foundation-Based Diagnosis of Multiple Sclerosis Using Retinal SLO Imaging and OCT Thickness-maps

Esmailizadeh, R., Aghababaei, A., Mirzaei, S., Arian, R., Kafieh, R.

•preprint•Jul 15 2025

Multiple Sclerosis (MS) is a chronic autoimmune disorder of the central nervous system that can lead to significant neurological disability. Retinal imaging--particularly Scanning Laser Ophthalmoscopy (SLO) and Optical Coherence Tomography (OCT)--provides valuable biomarkers for early MS diagnosis through non-invasive visualization of neurodegenerative changes. This study proposes a foundation-based bi-modal classification framework that integrates SLO images and OCT-derived retinal thickness maps for MS diagnosis. To facilitate this, we introduce two modality-specific foundation models--SLOFound and TMFound--fine-tuned from the RETFound-Fundus backbone using an independent dataset of 203 healthy eyes, acquired at Noor Ophthalmology Hospital with the Heidelberg Spectralis HRA+OCT system. This dataset, which contains only normal cases, was used exclusively for encoder adaptation and is entirely disjoint from the classification dataset. For the classification stage, we use a separate dataset comprising IR-SLO images from 32 MS patients and 70 healthy controls, collected at the Kashani Comprehensive MS Center in Isfahan, Iran. We first assess OCT-derived maps layer-wise and identify the Ganglion Cell-Inner Plexiform Layer (GCIPL) as the most informative for MS detection. All subsequent analyses utilize GCIPL thickness maps in conjunction with SLO images. Experimental evaluations on the MS classification dataset demonstrate that our foundation-based bi-modal model outperforms unimodal variants and a prior ResNet-based state-of-the-art model, achieving a classification accuracy of 97.37%, with perfect sensitivity (100%). These results highlight the effectiveness of leveraging pre-trained foundation models, even when fine-tuned on limited data, to build robust, efficient, and generalizable diagnostic tools for MS in medical imaging contexts where labeled datasets are often scarce.

OCT Classification Methodology In Silico Academic Lab Benchmark SOTA

A Lightweight and Robust Framework for Real-Time Colorectal Polyp Detection Using LOF-Based Preprocessing and YOLO-v11n

Saadat Behzadi, Danial Sharifrazi, Bita Mesbahzadeh, Javad Hassannataj Joloudarid, Roohallah Alizadehsani

•preprint•Jul 14 2025

Objectives: Timely and accurate detection of colorectal polyps plays a crucial role in diagnosing and preventing colorectal cancer, a major cause of mortality worldwide. This study introduces a new, lightweight, and efficient framework for polyp detection that combines the Local Outlier Factor (LOF) algorithm for filtering noisy data with the YOLO-v11n deep learning model. Study design: An experimental study leveraging deep learning and outlier removal techniques across multiple public datasets. Methods: The proposed approach was tested on five diverse and publicly available datasets: CVC-ColonDB, CVC-ClinicDB, Kvasir-SEG, ETIS, and EndoScene. Since these datasets originally lacked bounding box annotations, we converted their segmentation masks into suitable detection labels. To enhance the robustness and generalizability of our model, we apply 5-fold cross-validation and remove anomalous samples using the LOF method configured with 30 neighbors and a contamination ratio of 5%. Cleaned data are then fed into YOLO-v11n, a fast and resource-efficient object detection architecture optimized for real-time applications. We train the model using a combination of modern augmentation strategies to improve detection accuracy under diverse conditions. Results: Our approach significantly improves polyp localization performance, achieving a precision of 95.83%, recall of 91.85%, F1-score of 93.48%, mAP@0.5 of 96.48%, and mAP@0.5:0.95 of 77.75%. Compared to previous YOLO-based methods, our model demonstrates enhanced accuracy and efficiency. Conclusions: These results suggest that the proposed method is well-suited for real-time colonoscopy support in clinical settings. Overall, the study underscores how crucial data preprocessing and model efficiency are when designing effective AI systems for medical imaging.

OCT Detection Abdominal Methodology In Silico Academic Lab Benchmark SOTA
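
The abstract above mentions converting segmentation masks into detection labels without describing the procedure. One plausible way to do this is sketched below, under stated assumptions: the mask is a binary 2D list, all foreground pixels belong to a single polyp, and the target is the normalized `class x_center y_center width height` layout commonly used for YOLO labels. The function name is hypothetical and this is not the authors' code.

```python
def mask_to_yolo_box(mask, class_id=0):
    """Return (class_id, x_center, y_center, width, height), normalized to [0, 1].

    Assumes one object per mask; returns None when the mask has no foreground.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None

    height, width = len(mask), len(mask[0])
    top, bottom = min(rows), max(rows)
    left, right = min(cols), max(cols)

    # Pixel indices are inclusive, so the box spans (right - left + 1) pixels.
    box_w = (right - left + 1) / width
    box_h = (bottom - top + 1) / height
    x_center = (left + right + 1) / 2 / width
    y_center = (top + bottom + 1) / 2 / height
    return (class_id, x_center, y_center, box_w, box_h)


# Toy 4x4 mask with a 2x2 foreground square in the middle.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
label = mask_to_yolo_box(toy_mask)
empty = mask_to_yolo_box([[0, 0], [0, 0]])
```

Masks containing several separate polyps would need a connected-component pass first, so that each component gets its own box.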

Disentanglement and Assessment of Shortcuts in Ophthalmological Retinal Imaging Exams

Leonor Fernandes, Tiago Gonçalves, João Matos, Luis Filipe Nakayama, Jaime S. Cardoso

•preprint•Jul 13 2025

Diabetic retinopathy (DR) is a leading cause of vision loss in working-age adults. While screening reduces the risk of blindness, traditional imaging is often costly and inaccessible. Artificial intelligence (AI) algorithms present a scalable diagnostic solution, but concerns regarding fairness and generalization persist. This work evaluates the fairness and performance of image-trained models in DR prediction, as well as the impact of disentanglement as a bias mitigation technique, using the diverse mBRSET fundus dataset. Three models, ConvNeXt V2, DINOv2, and Swin V2, were trained on macula images to predict DR and sensitive attributes (SAs) (e.g., age and gender/sex). Fairness was assessed between subgroups of SAs, and disentanglement was applied to reduce bias. All models achieved high DR prediction performance in diagnosing (up to 94% AUROC) and could reasonably predict age and gender/sex (91% and 77% AUROC, respectively). Fairness assessment suggests disparities, such as a 10% AUROC gap between age groups in DINOv2. Disentangling SAs from DR prediction had varying results, depending on the model selected. Disentanglement improved DINOv2 performance (2% AUROC gain), but led to performance drops in ConvNeXt V2 and Swin V2 (7% and 3%, respectively). These findings highlight the complexity of disentangling fine-grained features in fundus imaging and emphasize the importance of fairness in medical imaging AI to ensure equitable and reliable healthcare solutions.

OCT Classification Methodology In Silico Ethics