Page 9 of 81804 results

OctreeNCA: Single-Pass 184 MP Segmentation on Consumer Hardware

Nick Lemke, John Kalkhof, Niklas Babendererde, Anirban Mukhopadhyay

arXiv preprint · Aug 9 2025
Medical applications demand segmentation of large inputs, like prostate MRIs, pathology slices, or videos of surgery. These inputs should ideally be inferred at once to provide the model with proper spatial or temporal context. When segmenting large inputs, the VRAM consumption of the GPU becomes the bottleneck. Architectures like UNets or Vision Transformers scale very poorly in VRAM consumption, resulting in patch- or frame-wise approaches that compromise global consistency and inference speed. The lightweight Neural Cellular Automaton (NCA) is a bio-inspired model that is by construction size-invariant. However, due to its local-only communication rules, it lacks global knowledge. We propose OctreeNCA by generalizing the neighborhood definition using an octree data structure. Our generalized neighborhood definition enables the efficient traversal of global knowledge. Since deep learning frameworks are mainly developed for large multi-layer networks, their implementation does not fully leverage the advantages of NCAs. We implement an NCA inference function in CUDA that further reduces VRAM demands and increases inference speed. Our OctreeNCA segments high-resolution images and videos quickly while occupying 90% less VRAM than a UNet during evaluation. This allows us to segment 184 Megapixel pathology slices or 1-minute surgical videos at once.
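The size-invariance the authors exploit comes from the NCA's shared local update rule: the same tiny network is applied to every cell, so the model has no fixed input size. As a rough illustration only (not the paper's OctreeNCA neighborhood or its CUDA kernel), a minimal NCA step in NumPy perceives each cell's 3x3 neighborhood and updates the cell state with one shared MLP:

```python
import numpy as np

def nca_step(state, w1, b1, w2, b2):
    """One Neural Cellular Automaton update: every cell perceives its
    3x3 neighborhood (local-only communication) and updates its state
    with a small shared MLP. The same weights apply at any grid size."""
    h, w, c = state.shape
    # Gather the 3x3 neighborhood of every cell (zero-padded borders).
    padded = np.pad(state, ((1, 1), (1, 1), (0, 0)))
    perception = np.concatenate(
        [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)],
        axis=-1)                                    # (h, w, 9*c)
    hidden = np.maximum(perception @ w1 + b1, 0.0)  # shared MLP, ReLU
    return state + hidden @ w2 + b2                 # residual state update

rng = np.random.default_rng(0)
c, hid = 4, 16
w1 = rng.normal(0, 0.1, (9 * c, hid)); b1 = np.zeros(hid)
w2 = rng.normal(0, 0.1, (hid, c));     b2 = np.zeros(c)
s = rng.normal(size=(8, 8, c))
out = nca_step(s, w1, b1, w2, b2)  # identical rule works on any resolution
```

Iterating this step propagates information only one cell per step, which is exactly the local-only limitation the octree neighborhood in the paper is designed to overcome.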

Self-supervised disc and cup segmentation via non-local deformable convolution and adaptive transformer.

Zhao W, Wang Y

PubMed · Aug 9 2025
Optic disc and cup segmentation is a crucial subfield of computer vision, playing a pivotal role in automated pathological image analysis. It enables precise, efficient, and automated diagnosis of ocular conditions, significantly aiding clinicians in real-world medical applications. However, due to the scarcity of medical segmentation data and the insufficient integration of global contextual information, segmentation accuracy remains suboptimal. This issue becomes particularly pronounced in optic disc and cup cases with complex anatomical structures and ambiguous boundaries. To address these limitations, this paper introduces a self-supervised training strategy integrated with a newly designed network architecture to improve segmentation accuracy. Specifically, we first propose a non-local dual deformable convolutional block, which aims to capture irregular image patterns (i.e., boundaries). Second, we modify the traditional vision transformer and design an adaptive K-Nearest Neighbors (KNN) transformation block to extract global semantic context from images. Finally, an initialization strategy based on self-supervised training is proposed to reduce the network's reliance on labeled data. Comprehensive experimental evaluations demonstrate the effectiveness of the proposed method, which outperforms previous networks and achieves state-of-the-art performance, with IoU scores of 0.9577 for the optic disc and 0.8399 for the optic cup on the REFUGE dataset.
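The abstract does not specify the internals of the adaptive KNN transformation block, so the following is a hypothetical sketch of the general idea only: restricting self-attention so each token attends just to its k nearest neighbors in feature space, rather than to all tokens:

```python
import numpy as np

def knn_attention(x, k=3):
    """Toy KNN-restricted self-attention: each token attends only to its
    k nearest neighbors in feature space instead of all tokens, a common
    way to sparsify attention. Illustrative sketch, not the paper's
    exact block design."""
    n, d = x.shape
    # Pairwise squared Euclidean distances between tokens.
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    nbrs = np.argsort(d2, axis=1)[:, :k]        # indices of k nearest tokens
    scores = (x @ x.T) / np.sqrt(d)             # scaled dot-product scores
    out = np.empty_like(x)
    for i in range(n):
        s = scores[i, nbrs[i]]
        w = np.exp(s - s.max()); w /= w.sum()   # softmax over the k neighbors
        out[i] = w @ x[nbrs[i]]                 # weighted neighbor average
    return out

x = np.random.default_rng(1).normal(size=(6, 8))
y = knn_attention(x, k=3)
```

Because the softmax weights are nonnegative and sum to one, every output token is a convex combination of its neighbors, which keeps the operation stable while cutting attention cost from O(n²) toward O(nk).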

Collaborative and privacy-preserving cross-vendor united diagnostic imaging via server-rotating federated machine learning.

Wang H, Zhang X, Ren X, Zhang Z, Yang S, Lian C, Ma J, Zeng D

PubMed · Aug 9 2025
Federated Learning (FL) is a distributed framework that enables collaborative training of a server model across medical data vendors while preserving data privacy. However, conventional FL faces two key challenges: substantial data heterogeneity among vendors and limited flexibility from a fixed server, leading to suboptimal performance in diagnostic-imaging tasks. To address these, we propose a server-rotating federated learning method (SRFLM). Unlike traditional FL, SRFLM designates one vendor as a provisional server for federated fine-tuning, with others acting as clients. It uses a rotational server-communication mechanism and a dynamic server-election strategy, allowing each vendor to sequentially assume the server role over time. Additionally, the communication protocol of SRFLM provides strong privacy guarantees using differential privacy. We extensively evaluate SRFLM across multiple cross-vendor diagnostic imaging tasks. We envision SRFLM as paving the way to facilitate collaborative model training across medical data vendors, thereby achieving the goal of cross-vendor united diagnostic imaging.
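As a toy sketch of the rotational-server idea only (a simple round-robin stands in for SRFLM's dynamic election strategy, local training is omitted, and the differential-privacy machinery is left out entirely), each round a different vendor takes the aggregator role, averages everyone's parameters, and broadcasts the result:

```python
import numpy as np

def fedavg(weights):
    """Average a list of parameter vectors (plain FedAvg aggregation)."""
    return np.mean(weights, axis=0)

def rotating_rounds(client_models, n_rounds):
    """Each round, a different vendor acts as the provisional server that
    aggregates all vendors' parameters and broadcasts the global model.
    Round-robin rotation is a simplified stand-in for a server election."""
    models = [m.copy() for m in client_models]
    n = len(models)
    for r in range(n_rounds):
        server = r % n                                  # rotating server role
        others = [m for i, m in enumerate(models) if i != server]
        aggregated = fedavg(others + [models[server]])  # server-side FedAvg
        models = [aggregated.copy() for _ in models]    # broadcast to vendors
    return models

init = [np.full(4, float(i)) for i in range(3)]  # three vendors' parameters
final = rotating_rounds(init, n_rounds=3)
```

After the first aggregation all vendors hold the same global model; in a real pipeline each round would interleave local training on private data before the rotating server aggregates.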

GAN-MRI enhanced multi-organ MRI segmentation: a deep learning perspective.

Channarayapatna Srinivasa A, Bhat SS, Baduwal D, Sim ZTJ, Patil SS, Amarapur A, Prakash KNB

PubMed · Aug 8 2025
Clinical magnetic resonance imaging (MRI) is a high-resolution tool widely used for detailed anatomical imaging. However, prolonged scan times often lead to motion artefacts and patient discomfort. Fast acquisition techniques can reduce scan times but often produce noisy, low-contrast images, compromising the segmentation accuracy essential for diagnosis and treatment planning. To address these limitations, we developed an end-to-end framework that incorporates a BIDS-based data organiser and anonymiser, a GAN-based MR image enhancement model (GAN-MRI), AssemblyNet for brain region segmentation, and an attention-residual U-Net with Guided loss for abdominal and thigh segmentation. Thirty brain scans (5,400 slices), 32 abdominal scans (1,920 slices), and 55 thigh scans (2,200 slices) acquired from multiple MRI scanners (GE, Siemens, Toshiba) underwent evaluation. Image quality improved significantly, with SNR and CNR for brain scans increasing from 28.44 to 42.92 (p < 0.001) and 11.88 to 18.03 (p < 0.001), respectively. Abdominal scans exhibited SNR increases from 35.30 to 50.24 (p < 0.001) and CNR from 10,290.93 to 93,767.22 (p < 0.001). Double-blind evaluations highlighted improved visualisations of anatomical structures and bias field correction. Segmentation performance improved substantially in the thigh (muscle: +21%, IMAT: +9%) and abdominal regions (SSAT: +1%, DSAT: +2%, VAT: +12%), while brain segmentation metrics remained largely stable, reflecting the robustness of the baseline model. The proposed framework is designed to handle data from multiple anatomies, with variations across MRI scanners and centres, by enhancing MRI scans and improving segmentation accuracy, diagnostic precision, and treatment planning while reducing scan times and maintaining patient comfort.

GPT-4 for automated sequence-level determination of MRI protocols based on radiology request forms from clinical routine.

Terzis R, Kaya K, Schömig T, Janssen JP, Iuga AI, Kottlors J, Lennartz S, Gietzen C, Gözdas C, Müller L, Hahnfeldt R, Maintz D, Dratsch T, Pennig L

PubMed · Aug 8 2025
This study evaluated GPT-4's accuracy in MRI sequence selection based on radiology request forms (RRFs), comparing its performance to radiology residents. This retrospective study included 100 RRFs across four subspecialties (cardiac imaging, neuroradiology, musculoskeletal, and oncology). GPT-4 and two radiology residents (R1: 2 years, R2: 5 years MRI experience) selected sequences based on each patient's medical history and clinical questions. Considering imaging society guidelines, five board-certified specialized radiologists assessed protocols based on completeness, quality, and utility in consensus, using 5-point Likert scales. Clinical applicability was rated binarily by the institution's lead radiographer. GPT-4 achieved median scores of 3 (1-5) for completeness, 4 (1-5) for quality, and 4 (1-5) for utility, comparable to R1 (3 (1-5), 4 (1-5), 4 (1-5); each p > 0.05) but inferior to R2 (4 (1-5), 5 (1-5); p < 0.01, respectively, and 5 (1-5); p < 0.001). Subspecialty protocol quality varied: GPT-4 matched R1 (4 (2-4) vs. 4 (2-5), p = 0.20) and R2 (4 (2-5); p = 0.47) in cardiac imaging; showed no differences in neuroradiology (all 5 (1-5), p > 0.05); scored lower than R1 and R2 in musculoskeletal imaging (3 (2-5) vs. 4 (3-5); p < 0.01, and 5 (3-5); p < 0.001); and matched R1 (4 (1-5) vs. 2 (1-4), p = 0.12) as well as R2 (5 (2-5); p = 0.20) in oncology. GPT-4-based protocols were clinically applicable in 95% of cases, comparable to R1 (95%) and R2 (96%). GPT-4 generated MRI protocols with notable completeness, quality, utility, and clinical applicability, excelling in standardized subspecialties like cardiac and neuroradiology imaging while yielding lower accuracy in musculoskeletal examinations.
Question: Long MRI acquisition times limit patient access, making accurate protocol selection crucial for efficient diagnostics, though it is time-consuming and error-prone, especially for inexperienced residents.
Findings: GPT-4 generated MRI protocols of remarkable yet inconsistent quality, performing on par with an experienced resident in standardized fields, but moderately in musculoskeletal examinations.
Clinical relevance: The large language model can assist less experienced radiologists in determining detailed MRI protocols and counteract increasing workloads. The model could function as a semi-automatic tool, generating MRI protocols for radiologists' confirmation, optimizing resource allocation, and improving diagnostics and cost-effectiveness.

Variational volume reconstruction with the Deep Ritz Method

Conor Rowan, Sumedh Soman, John A. Evans

arXiv preprint · Aug 8 2025
We present a novel approach to variational volume reconstruction from sparse, noisy slice data using the Deep Ritz method. Motivated by biomedical imaging applications such as MRI-based slice-to-volume reconstruction (SVR), our approach addresses three key challenges: (i) the reliance on image segmentation to extract boundaries from noisy grayscale slice images, (ii) the need to reconstruct volumes from a limited number of slice planes, and (iii) the computational expense of traditional mesh-based methods. We formulate a variational objective that combines a regression loss, which avoids image segmentation by operating directly on the noisy slice data, with a modified Cahn-Hilliard energy incorporating anisotropic diffusion to regularize the reconstructed geometry. We discretize the phase field with a neural network, approximate the objective at each optimization step with Monte Carlo integration, and use Adam to find the minimum of the approximated variational objective. While the stochastic integration may not yield the true solution to the variational problem, we demonstrate that our method reliably produces high-quality reconstructed volumes in a matter of seconds, even when the slice data is sparse and noisy.
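A minimal 1-D stand-in for this optimization loop, under loudly stated assumptions: a toy data term and smoothness penalty replace the paper's regression loss and Cahn-Hilliard energy, a tiny network replaces the phase-field network, the objective is estimated by Monte Carlo sampling, and Adam is hand-rolled with finite-difference gradients purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(theta, x):
    """Tiny one-hidden-layer network standing in for the phase field."""
    w1, b1, w2 = theta[:8], theta[8:16], theta[16:24]
    return np.tanh(np.outer(x, w1) + b1) @ w2

def mc_objective(theta, x):
    """Monte Carlo estimate of a Deep Ritz-style objective on samples x:
    a data term on a sharp target plus a gradient (smoothness) penalty."""
    u = model(theta, x)
    du = (model(theta, x + 1e-3) - u) / 1e-3      # finite-difference gradient
    return np.mean((u - np.sign(x)) ** 2) + 1e-4 * np.mean(du ** 2)

def adam_minimize(theta, steps=200, lr=0.05):
    """Hand-rolled Adam on the stochastic objective; parameter gradients
    via central finite differences with a shared sample batch per step
    (common random numbers keep the estimate stable)."""
    m = np.zeros_like(theta); v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        x = rng.uniform(-1, 1, 256)               # fresh Monte Carlo batch
        g = np.empty_like(theta)
        for i in range(len(theta)):
            e = np.zeros_like(theta); e[i] = 1e-4
            g[i] = (mc_objective(theta + e, x) - mc_objective(theta - e, x)) / 2e-4
        m = 0.9 * m + 0.1 * g
        v = 0.999 * v + 0.001 * g ** 2
        theta = theta - lr * (m / (1 - 0.9 ** t)) / (np.sqrt(v / (1 - 0.999 ** t)) + 1e-8)
    return theta

theta0 = rng.normal(0, 0.5, 24)
x_eval = rng.uniform(-1, 1, 1024)                 # fixed evaluation batch
before = mc_objective(theta0, x_eval)
theta = adam_minimize(theta0.copy())
after = mc_objective(theta, x_eval)
```

Each Adam step minimizes a slightly different sampled objective, which mirrors the abstract's caveat that stochastic integration may not yield the exact variational minimizer while still driving the energy down in practice.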

Three-dimensional pulp chamber volume quantification in first molars using CBCT: Implications for machine learning-assisted age estimation

Ding, Y., Zhong, T., He, Y., Wang, W., Zhang, S., Zhang, X., Shi, W., Jin, B.

medRxiv preprint · Aug 8 2025
Accurate adult age estimation represents a critical component of forensic individual identification. However, traditional methods relying on skeletal developmental characteristics are susceptible to preservation status and developmental variation. Teeth, owing to their exceptional taphonomic resistance and minimal postmortem alteration, emerge as premier biological samples. Utilizing the high-resolution capabilities of Cone Beam Computed Tomography (CBCT), this study retrospectively analyzed 1,857 right first molars obtained from Han Chinese adults in Sichuan Province (883 males, 974 females; aged 18-65 years). Pulp chamber volume (PCV) was measured using semi-automatic segmentation in Mimics software (v21.0). Statistically significant differences in PCV were observed based on sex and tooth position (maxillary vs. mandibular). Significant negative correlations existed between PCV and age (r = -0.86 to -0.81). The strongest correlation (r = -0.88) was identified in female maxillary first molars. Eleven curvilinear regression models and six machine learning models (Linear Regression, Lasso Regression, Neural Network, Random Forest, Gradient Boosting, and XGBoost) were developed. Among the curvilinear regression models, the cubic model demonstrated the best performance, with the female maxillary-specific model achieving a mean absolute error (MAE) of 4.95 years. Machine learning models demonstrated superior accuracy. Specifically, the sex- and tooth position-specific XGBoost model for female maxillary first molars achieved an MAE of 3.14 years (R² = 0.87). This represents a significant 36.5% reduction in error compared to the optimal cubic regression model. These findings demonstrate that PCV measurements in first molars, combined with machine learning algorithms (specifically XGBoost), effectively overcome the limitations of traditional methods, providing a highly precise and reproducible approach for forensic age estimation.
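On synthetic data standing in for the cohort (the real PCV measurements are not reproduced here, and the XGBoost model is omitted), the study's two baseline analyses, a Pearson correlation between PCV and age and a cubic regression scored by MAE, can be sketched with NumPy alone:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the cohort: pulp chamber volume (PCV, in mm^3
# here purely for illustration) shrinking roughly linearly with age.
age = rng.uniform(18, 65, 500)
pcv = 40.0 - 0.45 * age + rng.normal(0, 2.0, 500)

# Pearson correlation between PCV and age (the paper reports r in the
# range -0.88 to -0.81 on real data).
r = np.corrcoef(pcv, age)[0, 1]

# Cubic regression of age on PCV, the best curvilinear model in the
# paper, evaluated with the paper's metric (MAE in years).
coeffs = np.polyfit(pcv, age, deg=3)
pred = np.polyval(coeffs, pcv)
mae = np.mean(np.abs(pred - age))
```

Swapping the cubic fit for a gradient-boosted regressor (as the paper does with XGBoost) is what yielded the reported 36.5% MAE reduction over the curvilinear baseline.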

Fourier Optics and Deep Learning Methods for Fast 3D Reconstruction in Digital Holography

Justin London

arXiv preprint · Aug 8 2025
Computer-generated holography (CGH) is a promising method that modulates user-defined waveforms with digital holograms. An efficient and fast pipeline framework is proposed to synthesize CGH from initial point cloud and MRI data. This input data is reconstructed into volumetric objects that are then fed into non-convex Fourier optics optimization algorithms for phase-only hologram (POH) and complex hologram (CH) generation using alternating projection, SGD, and quasi-Newton methods. The reconstruction performance of these algorithms, measured by MSE, RMSE, and PSNR, is analyzed and compared with that of the HoloNet deep-learning CGH approach. The performance metrics are shown to improve when 2D median filtering is applied during optimization to remove artifacts and speckle noise.
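A minimal single-plane sketch of the alternating-projection approach for a phase-only hologram, in the spirit of the classic Gerchberg-Saxton iteration (the paper's exact propagation model, 3D volumes, and filtering steps are not reproduced here): alternate between enforcing unit amplitude in the hologram plane and the target amplitude in the image plane, with FFTs as the propagator.

```python
import numpy as np

def gerchberg_saxton(target_amp, n_iter=50):
    """Alternating projections for a phase-only hologram (POH):
    constrain unit amplitude in the hologram plane and the target
    amplitude in the image plane, iterating FFTs between the planes."""
    phase = np.random.default_rng(0).uniform(0, 2 * np.pi, target_amp.shape)
    field = np.exp(1j * phase)                       # unit-amplitude hologram
    for _ in range(n_iter):
        img = np.fft.fft2(field)                     # propagate to image plane
        img = target_amp * np.exp(1j * np.angle(img))  # enforce target amplitude
        field = np.fft.ifft2(img)                    # propagate back
        field = np.exp(1j * np.angle(field))         # enforce phase-only constraint
    recon = np.abs(np.fft.fft2(field))
    return field, recon

# Simple square target, normalized to unit energy.
target = np.zeros((32, 32)); target[12:20, 12:20] = 1.0
target /= np.linalg.norm(target)
holo, recon = gerchberg_saxton(target)
mse = np.mean((recon / np.linalg.norm(recon) - target) ** 2)
```

SGD and quasi-Newton variants instead treat the hologram phase as a free parameter and minimize the same image-plane error directly, which is what makes the problem non-convex.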

Clinical insights to improve medical deep learning design: A comprehensive review of methods and benefits.

Thornblad TAE, Ewals LJS, Nederend J, Luyer MDP, De With PHN, van der Sommen F

PubMed · Aug 8 2025
The success of deep learning and computer vision on natural images has led to increased interest in medical image deep learning applications. However, introducing black-box deep learning models leaves little room for domain-specific knowledge when making the final diagnosis. For medical computer vision applications, not only accuracy but also robustness, interpretability, and explainability are essential to ensure clinicians' trust. Medical deep learning applications can therefore benefit from insights into the application at hand, gained by involving clinical staff and considering the clinical diagnostic process. In this review, different clinically inspired methods are surveyed, including clinical insights used at different stages of deep learning design for three-dimensional (3D) computed tomography (CT) image data. The review is conducted by investigating 400 research articles covering deep learning-based approaches for the diagnosis of different diseases, in terms of how clinical insights are included in the published work. Based on this, a further detailed review is conducted of the 47 scientific articles that use clinical inspiration. The clinically inspired methods concern preparation for training, 3D medical image data processing, integration of clinical data, and model architecture selection and development. This highlights different ways in which domain-specific knowledge can be used in the design of deep learning systems.

Non-invasive prediction of the secondary enucleation risk in uveal melanoma based on pretreatment CT and MRI prior to stereotactic radiotherapy.

Yedekci Y, Arimura H, Jin Y, Yilmaz MT, Kodama T, Ozyigit G, Yazici G

PubMed · Aug 8 2025
The aim of this study was to develop a radiomic model to non-invasively predict the risk of secondary enucleation (SE) in patients with uveal melanoma (UM) prior to stereotactic radiotherapy using pretreatment computed tomography (CT) and magnetic resonance (MR) images. This retrospective study encompasses a cohort of 308 patients diagnosed with UM who underwent stereotactic radiosurgery (SRS) or fractionated stereotactic radiotherapy (FSRT) using the CyberKnife system (Accuray, Sunnyvale, CA, USA) between 2007 and 2018. Each patient received comprehensive ophthalmologic evaluations, including assessment of visual acuity, anterior segment examination, fundus examination, and ultrasonography. All patients were followed up for a minimum of 5 years. The cohort was composed of 65 patients who underwent SE (SE+) and 243 who did not (SE-). Radiomic features were extracted from pretreatment CT and MR images. To develop a robust predictive model, four different machine learning algorithms were evaluated using these features. The stacking model utilizing CT + MR radiomic features achieved the highest predictive performance, with an area under the curve (AUC) of 0.90, accuracy of 0.86, sensitivity of 0.81, and specificity of 0.90. The feature of robust mean absolute deviation derived from the Laplacian-of-Gaussian-filtered MR images was identified as the most significant predictor, demonstrating a statistically significant difference between SE+ and SE- cases (p = 0.005). Radiomic analysis of pretreatment CT and MR images can non-invasively predict the risk of SE in UM patients undergoing SRS/FSRT. The combined CT + MR radiomic model may inform more personalized therapeutic decisions, thereby reducing unnecessary radiation exposure and potentially improving patient outcomes.