Reasoning in machine vision: learning to think fast and slow

Shaheer U. Saeed, Yipei Wang, Veeru Kasivisvanathan, Brian R. Davidson, Matthew J. Clarkson, Yipeng Hu, Daniel C. Alexander

arXiv preprint · Jun 27, 2025
Reasoning is a hallmark of human intelligence, enabling adaptive decision-making in complex and unfamiliar scenarios. In contrast, machine intelligence remains bound to training data, lacking the ability to dynamically refine solutions at inference time. While some recent advances have explored reasoning in machines, these efforts are largely limited to verbal domains such as mathematical problem-solving, where explicit rules govern step-by-step reasoning. Other critical real-world tasks - including visual perception, spatial reasoning, and radiological diagnosis - require non-verbal reasoning, which remains an open challenge. Here we present a novel learning paradigm that enables machine reasoning in vision by allowing performance improvement with increasing thinking time (inference-time compute), even under conditions where labelled data is very limited. Inspired by dual-process theories of human cognition in psychology, our approach integrates a fast-thinking System I module for familiar tasks with a slow-thinking System II module that iteratively refines solutions using self-play reinforcement learning. This paradigm mimics human reasoning by proposing, competing over, and refining solutions in data-scarce scenarios. We demonstrate superior performance through extended thinking time, compared not only to large-scale supervised learning but also to foundation models and even human experts, in real-world vision tasks. These tasks include computer-vision benchmarks and cancer localisation on medical images across five organs, showcasing transformative potential for non-verbal machine reasoning.
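
A minimal sketch of the fast/slow idea described above, written as a generic propose-then-refine loop rather than the authors' method: a System I stand-in produces an initial guess, and a System II stand-in spends extra inference-time compute searching for a better-scoring solution. Both `fast_model` and `score` are invented placeholders for the trained proposal network and whatever scoring signal the self-play training provides.

```python
# Hypothetical propose-then-refine loop; fast_model and score are placeholders.
import random

def fast_model(image):
    # System I stand-in: one-shot proposal (here, a dummy bounding box).
    return {"x": 10, "y": 10, "w": 5, "h": 5}

def score(image, proposal):
    # Stand-in for a learned value function; higher is better.
    return -abs(proposal["w"] * proposal["h"] - 30)

def slow_refine(image, proposal, steps=200):
    # System II stand-in: spend thinking time perturbing and keeping improvements.
    best, best_score = proposal, score(image, proposal)
    for _ in range(steps):
        candidate = {k: v + random.choice([-1, 0, 1]) for k, v in best.items()}
        candidate_score = score(image, candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best

image = None  # placeholder input
print(slow_refine(image, fast_model(image)))
```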

Noise-Inspired Diffusion Model for Generalizable Low-Dose CT Reconstruction

Qi Gao, Zhihao Chen, Dong Zeng, Junping Zhang, Jianhua Ma, Hongming Shan

arXiv preprint · Jun 27, 2025
The generalization of deep learning-based low-dose computed tomography (CT) reconstruction models to doses unseen in the training data is important and remains challenging. Previous efforts heavily rely on paired data to improve the generalization performance and robustness through collecting either diverse CT data for re-training or a few test data for fine-tuning. Recently, diffusion models have shown promising and generalizable performance in low-dose CT (LDCT) reconstruction; however, they may produce unrealistic structures due to the CT image noise deviating from a Gaussian distribution and imprecise prior information from the guidance of noisy LDCT images. In this paper, we propose a noise-inspired diffusion model for generalizable LDCT reconstruction, termed NEED, which tailors diffusion models to the noise characteristics of each domain. First, we propose a novel shifted Poisson diffusion model to denoise projection data, which aligns the diffusion process with the noise model in pre-log LDCT projections. Second, we devise a doubly guided diffusion model to refine reconstructed images, which leverages LDCT images and initial reconstructions to more accurately locate prior information and enhance reconstruction fidelity. By cascading these two diffusion models for dual-domain reconstruction, our NEED requires only normal-dose data for training and can be effectively extended to various unseen dose levels during testing via a time step matching strategy. Extensive qualitative, quantitative, and segmentation-based evaluations on two datasets demonstrate that our NEED consistently outperforms state-of-the-art methods in reconstruction and generalization performance. Source code is made available at https://github.com/qgao21/NEED.
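
For readers unfamiliar with the pre-log noise model mentioned above, the short sketch below simulates shifted-Poisson measurements from noiseless line integrals. It only illustrates the noise model that NEED aligns its projection-domain diffusion with; the photon count and electronic-noise values are arbitrary and not taken from the paper.

```python
# Illustrative shifted-Poisson noise model for pre-log CT projections
# (parameter values are arbitrary, not from the paper).
import numpy as np

def shifted_poisson_measurement(line_integrals, i0=1e5, sigma_e=10.0, rng=None):
    """Simulate noisy low-dose measurements from noiseless line integrals.

    i0: incident photon count per detector element (lower -> lower dose)
    sigma_e: std of additive electronic noise, folded into the shifted Poisson
    """
    rng = np.random.default_rng() if rng is None else rng
    expected_counts = i0 * np.exp(-line_integrals)
    # Shifted-Poisson approximation: Poisson(lambda + sigma_e^2) - sigma_e^2
    noisy_counts = rng.poisson(expected_counts + sigma_e**2) - sigma_e**2
    noisy_counts = np.clip(noisy_counts, 1.0, None)  # avoid log of non-positive counts
    return -np.log(noisy_counts / i0)  # back to (noisy) line integrals

projections = np.abs(np.random.default_rng(0).standard_normal((4, 8)))
print(shifted_poisson_measurement(projections).shape)
```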

Deep Learning Model for Automated Segmentation of Orbital Structures in MRI Images.

Bakhshaliyeva E, Reiner LN, Chelbi M, Nawabi J, Tietze A, Scheel M, Wattjes M, Dell'Orco A, Meddeb A

PubMed paper · Jun 26, 2025
Magnetic resonance imaging (MRI) is a crucial tool for visualizing orbital structures and detecting eye pathologies. However, manual segmentation of orbital anatomy is challenging due to the complexity and variability of the structures. Recent advancements in deep learning (DL), particularly convolutional neural networks (CNNs), offer promising solutions for automated segmentation in medical imaging. This study aimed to train and evaluate a U-Net-based model for the automated segmentation of key orbital structures. This retrospective study included 117 patients with various orbital pathologies who underwent orbital MRI. Manual segmentation was performed on four structures: the ocular bulb, ocular tumors, retinal detachment, and the optic nerve. Following the U-Net auto-configuration provided by nnU-Net, we conducted five-fold cross-validation and evaluated the model's performance using the Dice Similarity Coefficient (DSC) and Relative Absolute Volume Difference (RAVD) as metrics. nnU-Net achieved high segmentation performance for the ocular bulb (mean DSC: 0.931) and the optic nerve (mean DSC: 0.820). Segmentation of ocular tumors (mean DSC: 0.788) and retinal detachment (mean DSC: 0.550) showed greater variability, with performance declining in more challenging cases. Despite these challenges, the model achieved high detection rates, with ROC AUCs of 0.90 for ocular tumors and 0.78 for retinal detachment. This study demonstrates nnU-Net's capability for accurate segmentation of orbital structures, particularly the ocular bulb and optic nerve. However, challenges remain in the segmentation of tumors and retinal detachment due to variability and artifacts. Future improvements in deep learning models and broader, more diverse datasets may enhance segmentation performance, ultimately aiding in the diagnosis and treatment of orbital pathologies.
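
The two reported metrics are standard; the sketch below computes them from their usual definitions on toy binary masks. This is not the nnU-Net evaluation code used in the study, just an illustration of what DSC and RAVD measure.

```python
# DSC and RAVD from their standard definitions, on toy binary masks.
import numpy as np

def dice(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * intersection / denom if denom else 1.0

def ravd(pred, gt):
    # Relative Absolute Volume Difference: |V_pred - V_gt| / V_gt
    v_pred, v_gt = pred.astype(bool).sum(), gt.astype(bool).sum()
    return abs(v_pred - v_gt) / v_gt if v_gt else float("inf")

gt = np.zeros((32, 32), dtype=np.uint8); gt[8:24, 8:24] = 1
pred = np.zeros_like(gt); pred[10:24, 8:24] = 1
print(round(dice(pred, gt), 3), round(ravd(pred, gt), 3))
```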

Lightweight Physics-Informed Zero-Shot Ultrasound Plane Wave Denoising

Hojat Asgariandehkordi, Mostafa Sharifzadeh, Hassan Rivaz

arXiv preprint · Jun 26, 2025
Ultrasound Coherent Plane Wave Compounding (CPWC) enhances image contrast by combining echoes from multiple steered transmissions. While increasing the number of angles generally improves image quality, it drastically reduces the frame rate and can introduce blurring artifacts in fast-moving targets. Moreover, compounded images remain susceptible to noise, particularly when acquired with a limited number of transmissions. We propose a zero-shot denoising framework tailored for low-angle CPWC acquisitions, which enhances contrast without relying on a separate training dataset. The method divides the available transmission angles into two disjoint subsets, each used to form compound images that include higher noise levels. The new compounded images are then used to train a deep model via a self-supervised residual learning scheme, enabling it to suppress incoherent noise while preserving anatomical structures. Because angle-dependent artifacts vary between the subsets while the underlying tissue response is similar, this physics-informed pairing allows the network to learn to disentangle the inconsistent artifacts from the consistent tissue signal. Unlike supervised methods, our model requires no domain-specific fine-tuning or paired data, making it adaptable across anatomical regions and acquisition setups. The entire pipeline supports efficient training with low computational cost due to the use of a lightweight architecture, which comprises only two convolutional layers. Evaluations on simulation, phantom, and in vivo data demonstrate superior contrast enhancement and structure preservation compared to both classical and deep learning-based denoising methods.
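
A rough PyTorch sketch of how the angle-subset pairing and the two-layer residual denoiser described above could look; the tensor shapes, channel count, and the use of an MSE objective between the two noisy compounds are assumptions made for illustration, not the authors' implementation.

```python
# Hedged sketch: split angles into two disjoint subsets, compound each, and
# train a tiny residual denoiser on the noisy pair (Noise2Noise-style pairing).
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x):
        return x - self.net(x)  # residual learning: predict noise, subtract it

per_angle = torch.randn(8, 1, 64, 64)              # toy per-angle images (8 angles)
subset_a = per_angle[0::2].mean(0, keepdim=True)   # compound of even-indexed angles
subset_b = per_angle[1::2].mean(0, keepdim=True)   # compound of odd-indexed angles

model = TinyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(10):                                # a few illustrative steps
    loss = nn.functional.mse_loss(model(subset_a), subset_b)
    opt.zero_grad(); loss.backward(); opt.step()
```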

Generalizable Neural Electromagnetic Inverse Scattering

Yizhe Cheng, Chunxun Tian, Haoru Wang, Wentao Zhu, Xiaoxuan Ma, Yizhou Wang

arXiv preprint · Jun 26, 2025
Solving Electromagnetic Inverse Scattering Problems (EISP) is fundamental in applications such as medical imaging, where the goal is to reconstruct the relative permittivity from the scattered electromagnetic field. This inverse process is inherently ill-posed and highly nonlinear, making it particularly challenging. A recent machine learning-based approach, Img-Interiors, shows promising results by leveraging continuous implicit functions. However, it requires case-specific optimization, lacks generalization to unseen data, and fails under sparse transmitter setups (e.g., with only one transmitter). To address these limitations, we revisit EISP from a physics-informed perspective, reformulating it as a two-stage inverse transmission-scattering process. This formulation reveals the induced current as a generalizable intermediate representation, effectively decoupling the nonlinear scattering process from the ill-posed inverse problem. Built on this insight, we propose the first generalizable physics-driven framework for EISP, comprising a current estimator and a permittivity solver, working in an end-to-end manner. The current estimator explicitly learns the induced current as a physical bridge between the incident and scattered fields, while the permittivity solver computes the relative permittivity directly from the estimated induced current. This design enables data-driven training and generalizable feed-forward prediction of relative permittivity on unseen data while maintaining strong robustness to transmitter sparsity. Extensive experiments show that our method outperforms state-of-the-art approaches in reconstruction accuracy, generalization, and robustness. This work offers a fundamentally new perspective on electromagnetic inverse scattering and represents a major step toward cost-effective practical solutions for electromagnetic imaging.
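
As a concrete reading of the second stage only, the snippet below recovers relative permittivity from an induced (contrast) current via the standard contrast-source relations: the total field follows from the state equation, and the contrast from the ratio of current to total field. The discretized Green's operator and field values are random stand-ins; in the paper's pipeline the current would come from the learned estimator.

```python
# Toy "permittivity solver" step using the standard contrast-source relations;
# all arrays are random stand-ins for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 16                                      # pixels in the imaging domain
G_d = 0.01 * rng.standard_normal((n, n))    # stand-in discretized Green's operator
E_inc = 1.0 + 0.1 * rng.standard_normal(n)  # incident field on the domain
J = 0.1 * rng.standard_normal(n)            # induced current (network output)

E_tot = E_inc + G_d @ J    # state equation: total field induced by the current
chi = J / E_tot            # contrast chi = eps_r - 1, since J = chi * E_tot
eps_r = 1.0 + chi
print(np.round(eps_r[:4], 3))
```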

Dose-aware denoising diffusion model for low-dose CT.

Kim S, Kim BJ, Baek J

PubMed paper · Jun 26, 2025
Objective: Low-dose computed tomography (LDCT) denoising plays an important role in medical imaging for reducing the radiation dose to patients. Recently, various data-driven and diffusion-based deep learning (DL) methods have been developed and shown promising results in LDCT denoising. However, challenges remain in ensuring generalizability to different datasets and mitigating uncertainty from stochastic sampling. In this paper, we introduce a novel dose-aware diffusion model that effectively reduces CT image noise while maintaining structural fidelity and being generalizable to different dose levels.
Approach: Our approach employs a physics-based forward process with continuous timesteps, enabling flexible representation of diverse noise levels. We incorporate a computationally efficient noise calibration module in our diffusion framework that resolves misalignment between intermediate results and their corresponding timesteps. Furthermore, we present a simple yet effective method for estimating appropriate timesteps for unseen LDCT images, allowing generalization to unknown, arbitrary dose levels.
Main Results: Both qualitative and quantitative evaluation results on Mayo Clinic datasets show that the proposed method outperforms existing denoising methods in preserving the noise texture and restoring anatomical structures. The proposed method also shows consistent results on different dose levels and an unseen dataset.
Significance: We propose a novel dose-aware diffusion model for LDCT denoising, aiming to address the generalization and uncertainty issues of existing diffusion-based DL methods. Our experimental results demonstrate the effectiveness of the proposed method across different dose levels. We expect that our approach can provide a clinically practical solution for LDCT denoising with its high structural fidelity and computational efficiency.
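
One way to picture the "time step matching strategy" mentioned in the Approach is sketched below: estimate the noise level of the unseen LDCT image and pick the diffusion timestep whose marginal noise scale is closest. The linear beta schedule and the MAD-based noise estimator are assumptions for illustration, not details taken from the paper.

```python
# Illustrative timestep matching: map an estimated noise level to the closest
# diffusion timestep (schedule and noise estimator are assumed, not from the paper).
import numpy as np

T = 1000
betas = np.linspace(1e-4, 2e-2, T)
alphas_bar = np.cumprod(1.0 - betas)
sigmas = np.sqrt(1.0 - alphas_bar)        # marginal noise scale at each step

def estimate_noise_std(img):
    # Crude estimator: robust MAD of a high-pass residual of the image.
    hp = img - 0.25 * (np.roll(img, 1, 0) + np.roll(img, -1, 0)
                       + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return 1.4826 * np.median(np.abs(hp - np.median(hp)))

def match_timestep(ldct_img):
    return int(np.argmin(np.abs(sigmas - estimate_noise_std(ldct_img))))

toy = np.random.rand(64, 64) + 0.1 * np.random.randn(64, 64)
print(match_timestep(toy))
```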

Enhancing cancer diagnostics through a novel deep learning-based semantic segmentation algorithm: A low-cost, high-speed, and accurate approach.

Benabbou T, Sahel A, Badri A, Mourabit IE

PubMed paper · Jun 26, 2025
Deep learning-based semantic segmentation approaches provide an efficient and automated means for cancer diagnosis and monitoring, which is important in clinical applications. However, implementing these approaches outside the experimental environment and using them in real-world applications requires powerful and adequate hardware resources, which are not available in most hospitals, especially in low- and middle-income countries. Consequently, most of these algorithms will never be used in clinical settings, or at best their adoption will be relatively limited. Some approaches that reduce computational cost have been proposed to address these issues, but they performed poorly and failed to produce satisfactory results. Finding a method that overcomes these limitations without sacrificing performance therefore remains highly challenging. To face this challenge, our study proposes a novel, optimized convolutional neural network-based approach for medical image segmentation that consists of multiple synthesis and analysis paths connected through a series of long skip connections. The design leverages multi-scale convolution, multi-scale feature extraction, downsampling strategies, and feature map fusion methods, all of which have proven effective in enhancing performance. This framework was extensively evaluated against current state-of-the-art architectures on various medical image segmentation tasks, including lung tumors, the spleen, and pancreatic tumors. The results of these experiments demonstrate that the proposed approach outperforms existing state-of-the-art methods across multiple evaluation metrics, while also reducing computational complexity and the number of parameters required, resulting in accurate segmentation, faster processing, and better implementation efficiency.
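
A minimal PyTorch sketch of one multi-scale convolution block with a long skip connection by concatenation, in the spirit of the design described above; the kernel sizes, channel counts, and 1x1 fusion convolution are assumptions, and the full analysis/synthesis paths with downsampling are omitted.

```python
# Hedged sketch of a multi-scale block plus a long skip connection.
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch = out_ch // 3
        # Parallel convolutions at several receptive-field sizes.
        self.b3 = nn.Conv2d(in_ch, branch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch, 5, padding=2)
        self.b7 = nn.Conv2d(in_ch, out_ch - 2 * branch, 7, padding=3)
        self.fuse = nn.Conv2d(out_ch, out_ch, 1)  # feature map fusion

    def forward(self, x):
        feats = torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1)
        return torch.relu(self.fuse(feats))

x = torch.randn(1, 1, 96, 96)
enc = MultiScaleBlock(1, 24)(x)          # analysis-path feature map
dec = MultiScaleBlock(24, 24)(enc)       # synthesis-path feature map
fused = torch.cat([enc, dec], dim=1)     # long skip connection by concatenation
print(fused.shape)
```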

[AI-enabled clinical decision support systems: challenges and opportunities].

Tschochohei M, Adams LC, Bressem KK, Lammert J

PubMed paper · Jun 25, 2025
Clinical decision-making is inherently complex, time-sensitive, and prone to error. AI-enabled clinical decision support systems (CDSS) offer promising solutions by leveraging large datasets to provide evidence-based recommendations. These systems range from rule-based and knowledge-based to increasingly AI-driven approaches. However, key challenges persist, particularly concerning data quality, seamless integration into clinical workflows, and clinician trust and acceptance. Ethical and legal considerations, especially data privacy, are also paramount. AI-CDSS have demonstrated success in fields like radiology (e.g., pulmonary nodule detection, mammography interpretation) and cardiology, where they enhance diagnostic accuracy and improve patient outcomes. Looking ahead, chat and voice interfaces powered by large language models (LLMs) could support shared decision-making (SDM) by fostering better patient engagement and understanding. To fully realize the potential of AI-CDSS in advancing efficient, patient-centered care, it is essential to ensure their responsible development. This includes grounding AI models in domain-specific data, anonymizing user inputs, and implementing rigorous validation of AI-generated outputs before presentation. Thoughtful design and ethical oversight will be critical to integrating AI safely and effectively into clinical practice.

Accuracy and Efficiency of Artificial Intelligence and Manual Virtual Segmentation for Generation of 3D Printed Tooth Replicas.

Pedrinaci I, Nasseri A, Calatrava J, Couso-Queiruga E, Giannobile WV, Gallucci GO, Sanz M

PubMed paper · Jun 25, 2025
The primary aim of this in vitro study was to compare methods for generating 3D-printed replicas through virtual segmentation, utilizing artificial intelligence (AI) or manual processes, by assessing accuracy in terms of volumetric and linear discrepancies. The secondary aims were to assess the time efficiency of both segmentation methods and the effect of post-processing on the 3D-printed replicas. Thirty teeth were scanned with cone beam computed tomography (CBCT), capturing the region of interest in human subjects. DICOM files underwent virtual segmentation through both AI and manual methods. Replicas were fabricated with a stereolithography 3D printer. After surface scanning of pre-processed replicas and extracted teeth, STL files were superimposed to compare linear and volumetric differences using the extracted teeth as the reference. Post-processed replicas were scanned to assess the effect of post-processing on linear and volumetric changes. AI-driven segmentation resulted in statistically significant mean linear and volumetric differences of -0.709 mm (SD 0.491, p < 0.001) and -4.70%, respectively. Manual segmentation showed no statistically significant differences in mean linear (-0.463 mm, SD 0.335, p < 0.001) and volumetric (-1.20%) measures. Comparing manual and AI-driven segmentations, AI-driven segmentation displayed mean linear and volumetric differences of -0.329 mm (SD 0.566, p = 0.003) and -2.23%, respectively. Additionally, AI segmentation reduced the mean segmentation time by 21.8 minutes. When comparing post-processed to pre-processed replicas, there was a volumetric reduction of 4.53% and a mean linear difference of -0.151 mm (SD 0.564, p = 0.042). Both segmentation methods achieved acceptable accuracy, with manual segmentation slightly more accurate but AI-driven segmentation more time-efficient. Continuous improvement in AI offers the potential for increased accuracy, efficiency, and broader application in the future.
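
Reading the reported volumetric discrepancies as signed percentage differences relative to the extracted-tooth reference, the sketch below shows that formula with made-up volumes; both the interpretation and the numbers are assumptions, not study data.

```python
# Signed percentage volumetric difference between a replica and its reference
# (volumes here are invented numbers purely to show the formula).
def volumetric_difference_pct(v_replica_mm3, v_reference_mm3):
    return (v_replica_mm3 - v_reference_mm3) / v_reference_mm3 * 100.0

print(round(volumetric_difference_pct(477.0, 500.0), 2))  # -> -4.6
```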

Comparative Analysis of Automated vs. Expert-Designed Machine Learning Models in Age-Related Macular Degeneration Detection and Classification.

Durmaz Engin C, Beşenk U, Özizmirliler D, Selver MA

PubMed paper · Jun 25, 2025
To compare the effectiveness of expert-designed machine learning models and code-free automated machine learning (AutoML) models in classifying optical coherence tomography (OCT) images for detecting age-related macular degeneration (AMD) and distinguishing between its dry and wet forms. Custom models were developed by an artificial intelligence expert using the EfficientNet V2 architecture, while AutoML models were created by an ophthalmologist utilizing LobeAI with transfer learning via ResNet-50 V2. Both models were designed to differentiate normal OCT images from AMD and to also distinguish between dry and wet AMD. The models were trained and tested using an 80:20 split, with each diagnostic group containing 500 OCT images. Performance metrics, including sensitivity, specificity, accuracy, and F1 scores, were calculated and compared. The expert-designed model achieved an overall accuracy of 99.67% for classifying all images, with F1 scores of 0.99 or higher across all binary class comparisons. In contrast, the AutoML model achieved an overall accuracy of 89.00%, with F1 scores ranging from 0.86 to 0.90 in binary comparisons. Notably lower recall was observed for dry AMD vs. normal (0.85) in the AutoML model, indicating challenges in correctly identifying dry AMD. While the AutoML models demonstrated acceptable performance in identifying and classifying AMD cases, the expert-designed models significantly outperformed them. The use of advanced neural network architectures and rigorous optimization in the expert-developed models underscores the continued necessity of expert involvement in the development of high-precision diagnostic tools for medical image classification.
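
The performance metrics named above follow from a binary confusion matrix; the sketch below shows those standard formulas with invented counts, not results from this study.

```python
# Sensitivity, specificity, accuracy, and F1 from a binary confusion matrix
# (counts are invented solely to demonstrate the formulas).
def metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)            # recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, accuracy, f1

print([round(m, 3) for m in metrics(tp=85, fp=10, fn=15, tn=90)])
```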