
Noise-Inspired Diffusion Model for Generalizable Low-Dose CT Reconstruction

Qi Gao, Zhihao Chen, Dong Zeng, Junping Zhang, Jianhua Ma, Hongming Shan

arXiv preprint · Jun 27 2025
The generalization of deep learning-based low-dose computed tomography (CT) reconstruction models to doses unseen in the training data is important and remains challenging. Previous efforts rely heavily on paired data to improve generalization performance and robustness, either by collecting diverse CT data for re-training or a few test samples for fine-tuning. Recently, diffusion models have shown promising and generalizable performance in low-dose CT (LDCT) reconstruction; however, they may produce unrealistic structures because CT image noise deviates from a Gaussian distribution and because guidance from noisy LDCT images provides imprecise prior information. In this paper, we propose a noise-inspired diffusion model for generalizable LDCT reconstruction, termed NEED, which tailors diffusion models to the noise characteristics of each domain. First, we propose a novel shifted Poisson diffusion model to denoise projection data, aligning the diffusion process with the noise model of pre-log LDCT projections. Second, we devise a doubly guided diffusion model to refine reconstructed images, leveraging LDCT images and initial reconstructions to more accurately locate prior information and enhance reconstruction fidelity. By cascading these two diffusion models for dual-domain reconstruction, NEED requires only normal-dose data for training and can be effectively extended to various unseen dose levels at test time via a time step matching strategy. Extensive qualitative, quantitative, and segmentation-based evaluations on two datasets demonstrate that NEED consistently outperforms state-of-the-art methods in reconstruction and generalization performance. Source code is available at https://github.com/qgao21/NEED.
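To make the projection-domain noise model concrete, here is a minimal numpy sketch of shifted Poisson noise on pre-log projections, the noise model NEED's first stage is aligned with. The photon count I0 and electronic-noise level sigma_e are illustrative values, not the paper's settings; the actual diffusion formulation is in the linked repository.

```python
import numpy as np

rng = np.random.default_rng(0)

def shifted_poisson_projection(p, I0=1e5, sigma_e=10.0):
    """Simulate noisy pre-log CT measurements under the shifted Poisson model.

    p       : clean line integrals (projection data)
    I0      : incident photon count per detector bin (illustrative value)
    sigma_e : std of the additive Gaussian electronic noise
    """
    lam = I0 * np.exp(-p)                      # mean transmitted photon count
    # Shifted Poisson: fold the electronic-noise variance into the Poisson
    # rate, then subtract it back so the mean matches the true count.
    y = rng.poisson(lam + sigma_e**2) - sigma_e**2
    return np.maximum(y, 1.0)                  # clip to keep the log well defined

p_clean = np.linspace(0.1, 4.0, 8)             # toy line integrals
y_noisy = shifted_poisson_projection(p_clean)
p_noisy = -np.log(y_noisy / 1e5)               # post-log, using the same I0
print(np.round(p_noisy - p_clean, 4))          # noise grows with attenuation
```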

High Resolution Isotropic 3D Cine imaging with Automated Segmentation using Concatenated 2D Real-time Imaging and Deep Learning

Mark Wrobel, Michele Pascale, Tina Yao, Ruaraidh Campbell, Elena Milano, Michael Quail, Jennifer Steeden, Vivek Muthurangu

arXiv preprint · Jun 27 2025
Background: Conventional cardiovascular magnetic resonance (CMR) in paediatric and congenital heart disease uses 2D, breath-hold, balanced steady-state free precession (bSSFP) cine imaging for assessment of function and cardiac-gated, respiratory-navigated, static 3D bSSFP whole-heart imaging for anatomical assessment. Our aim is to concatenate a stack of 2D free-breathing real-time cines and use deep learning (DL) to create an isotropic, fully segmented 3D cine dataset from these images. Methods: Four DL models were trained on open-source data to perform: a) inter-slice contrast correction; b) inter-slice respiratory motion correction; c) super-resolution (slice direction); and d) segmentation of the right and left atria and ventricles (RA, LA, RV, and LV), thoracic aorta (Ao), and pulmonary arteries (PA). The method was validated on prospectively acquired sagittal stacks of real-time cine images from 10 patients undergoing routine cardiovascular examination. Quantitative metrics (ventricular volumes and vessel diameters) and image quality of the 3D cines were compared to conventional breath-hold cine and whole-heart imaging. Results: All real-time data were successfully transformed into 3D cines, with a total post-processing time of <1 min in all cases. There were no significant biases in any LV or RV metrics, with reasonable limits of agreement and correlation. There was also reasonable agreement for all vessel diameters, although there was a small but significant overestimation of RPA diameter. Conclusion: We have demonstrated the potential of creating 3D cine data from concatenated 2D real-time cine images using a series of DL models. Our method has short acquisition and reconstruction times, with fully segmented data available within 2 minutes. The good agreement with conventional imaging suggests that our method could help to significantly speed up CMR in clinical practice.
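A rough sketch of the four-stage cascade described above, with trivial stand-ins for each trained network; the real stages are DL models, and these stubs only illustrate the data flow and array shapes:

```python
import numpy as np

# Placeholder stages; each would be a trained network in the paper's pipeline.
def contrast_correct(stack):   # a) inter-slice contrast normalisation
    return (stack - stack.mean(axis=(1, 2), keepdims=True)) / (
        stack.std(axis=(1, 2), keepdims=True) + 1e-6)

def motion_correct(stack):     # b) inter-slice respiratory alignment (identity stub)
    return stack

def super_resolve(stack):      # c) upsample along the slice direction (nearest stub)
    return np.repeat(stack, 2, axis=0)

def segment(stack):            # d) toy threshold standing in for the segmentation net
    return (stack > 0.5).astype(np.uint8)

stack = np.random.rand(32, 128, 128).astype(np.float32)  # slices x H x W
volume = super_resolve(motion_correct(contrast_correct(stack)))
labels = segment(volume)
print(volume.shape, labels.shape)  # (64, 128, 128): doubled slice count
```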

BrainMT: A Hybrid Mamba-Transformer Architecture for Modeling Long-Range Dependencies in Functional MRI Data

Arunkumar Kannan, Martin A. Lindquist, Brian Caffo

arXiv preprint · Jun 27 2025
Recent advances in deep learning have made it possible to predict phenotypic measures directly from functional magnetic resonance imaging (fMRI) brain volumes, sparking significant interest in the neuroimaging community. However, existing approaches, primarily based on convolutional neural networks or transformer architectures, often struggle to model the complex relationships inherent in fMRI data because they cannot capture long-range spatial and temporal dependencies. To overcome these shortcomings, we introduce BrainMT, a novel hybrid framework designed to efficiently learn and integrate long-range spatiotemporal attributes in fMRI data. Our framework operates in two stages: (1) a bidirectional Mamba block with a temporal-first scanning mechanism captures global temporal interactions in a computationally efficient manner; and (2) a transformer block leverages self-attention to model global spatial relationships across the deep features processed by the Mamba block. Extensive experiments on two large-scale public datasets, UK Biobank and the Human Connectome Project, demonstrate that BrainMT achieves state-of-the-art performance on both classification (sex prediction) and regression (cognitive intelligence prediction) tasks, outperforming existing methods by a significant margin. Our code and implementation details will be made publicly available at https://github.com/arunkumar-kannan/BrainMT-fMRI
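As a rough illustration of the two-stage design, the PyTorch sketch below uses a bidirectional GRU as a stand-in for the Mamba block (the real block needs the mamba_ssm package) followed by a standard transformer encoder layer. The shapes and the scan-time-first-then-attend-over-voxels ordering are our reading of the abstract, not the authors' implementation:

```python
import torch
import torch.nn as nn

class BrainMTSketch(nn.Module):
    """Two-stage sketch: a recurrent block standing in for bidirectional
    Mamba, then a transformer encoder for global spatial self-attention."""
    def __init__(self, dim=64):
        super().__init__()
        self.temporal = nn.GRU(dim, dim // 2, bidirectional=True, batch_first=True)
        self.spatial = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                                  batch_first=True)
        self.head = nn.Linear(dim, 1)

    def forward(self, x):                              # x: (B, voxels, time, dim)
        B, V, T, C = x.shape
        h, _ = self.temporal(x.reshape(B * V, T, C))   # scan time first
        feat = h.mean(dim=1).reshape(B, V, C)          # pool over time
        feat = self.spatial(feat)                      # attend across voxel tokens
        return self.head(feat.mean(dim=1))             # phenotype prediction

model = BrainMTSketch()
print(model(torch.randn(2, 512, 16, 64)).shape)        # (2, 1)
```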

Towards automated multi-regional lung parcellation for 0.55-3T 3D T2w fetal MRI

Uus, A., Avena Zampieri, C., Downes, F., Egloff Collado, A., Hall, M., Davidson, J., Payette, K., Aviles Verdera, J., Grigorescu, I., Hajnal, J. V., Deprez, M., Aertsen, M., Hutter, J., Rutherford, M., Deprest, J., Story, L.

medRxiv preprint · Jun 26 2025
Fetal MRI is increasingly employed in the diagnosis of fetal lung anomalies, and segmentation-derived total fetal lung volumes are used as one of the parameters for prediction of neonatal outcomes. However, in clinical practice, segmentation is performed manually in 2D motion-corrupted stacks with thick slices, which is time-consuming and can lead to variations in estimated volumes. Furthermore, there is a known lack of consensus regarding a universal lung parcellation protocol and expected normal total lung volume formulas, and the lungs are segmented as one label without parcellation into lobes. In terms of automation, to the best of our knowledge, there have been no reported works on multi-lobe segmentation for fetal lung MRI. This work introduces the first automated deep learning segmentation pipeline for multi-regional lung segmentation in 3D motion-corrected T2w fetal body images, covering normal anatomy and congenital diaphragmatic hernia cases. The protocol for parcellation into 5 standard lobes was defined in a population-averaged 3D atlas and then used to generate a multi-label training dataset including 104 normal anatomy controls and 45 congenital diaphragmatic hernia cases from 0.55T, 1.5T, and 3T acquisition protocols. The performance of a 3D Attention UNet was evaluated on 18 cases and showed good results for normal lung anatomy, with expectedly lower Dice values for the ipsilateral lung. In addition, we produced normal lung volumetry growth charts from 290 controls acquired at 0.55T and 3T. This is the first step towards automated multi-regional fetal lung analysis for 3D fetal MRI.
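Since the evaluation hinges on per-lobe Dice scores, here is a generic numpy sketch of the metric for a multi-label parcellation; labels 1-5 stand in for the five standard lobes, and the perturbation is synthetic:

```python
import numpy as np

def dice(pred, gt, label):
    """Dice coefficient for one lobe label in a multi-label segmentation."""
    p, g = pred == label, gt == label
    denom = p.sum() + g.sum()
    return 2.0 * np.logical_and(p, g).sum() / denom if denom else 1.0

# Toy volumes with labels 0 (background) and 1-5 (the five lobes).
rng = np.random.default_rng(1)
gt = rng.integers(0, 6, size=(32, 32, 32))
pred = gt.copy()
pred[rng.random(gt.shape) < 0.05] = 0          # corrupt 5% of voxels

for lobe in range(1, 6):
    print(f"lobe {lobe}: Dice = {dice(pred, gt, lobe):.3f}")
```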

Clinician-Led Code-Free Deep Learning for Detecting Papilloedema and Pseudopapilloedema Using Optic Disc Imaging

Shenoy, R., Samra, G. S., Sekhri, R., Yoon, H.-J., Teli, S., DeSilva, I., Tu, Z., Maconachie, G. D., Thomas, M. G.

medRxiv preprint · Jun 26 2025
Importance: Differentiating pseudopapilloedema from papilloedema is challenging but critical for prompt diagnosis and for avoiding unnecessary invasive procedures. Following a diagnosis of papilloedema, objectively grading severity is important for determining the urgency of management and therapeutic response. Automated machine learning (AutoML) has emerged as a promising tool for diagnosis in medical imaging and may provide accessible opportunities for consistent and accurate diagnosis and severity grading of papilloedema. Objective: This study evaluates the feasibility of AutoML models for distinguishing the presence and severity of papilloedema using near-infrared reflectance (NIR) images obtained from standard optical coherence tomography (OCT), comparing the performance of different AutoML platforms. Design, Setting, and Participants: A retrospective cohort study was conducted using data from University Hospitals of Leicester NHS Trust. The study involved 289 adult and paediatric patients (813 images) who underwent optic nerve head-centred OCT imaging between 2021 and 2024. The dataset included patients with normal optic discs (69 patients, 185 images), papilloedema (135 patients, 372 images), and optic disc drusen (ODD) (85 patients, 256 images). Three AutoML platforms (Amazon Rekognition, Medic Mind (MM), and Google Vertex) were evaluated for their ability to classify and grade papilloedema severity. Main Outcomes and Measures: Two classification tasks were performed: (1) distinguishing papilloedema from normal discs and ODD; (2) grading papilloedema severity (mild/moderate vs. severe). Model performance was evaluated using area under the curve (AUC), precision, recall, F1 score, and confusion matrices for all six models. Results: Amazon Rekognition outperformed the other platforms, achieving the highest AUC (0.90) and F1 score (0.81) in distinguishing papilloedema from normal/ODD. For papilloedema severity grading, Amazon Rekognition also performed best, with an AUC of 0.90 and an F1 score of 0.79. Google Vertex and Medic Mind demonstrated good performance but had slightly lower accuracy and higher misclassification rates. Conclusions and Relevance: This evaluation of three widely available AutoML platforms using NIR images obtained from standard OCT shows promise in distinguishing and grading papilloedema. These models provide an accessible, scalable way for clinical teams without coding expertise to develop diagnostic systems that recognise and characterise papilloedema. Further external validation and prospective testing are needed to confirm their clinical utility and applicability in diverse settings. Key Points. Question: Can clinician-led, code-free deep learning models built with automated machine learning (AutoML) accurately differentiate papilloedema from pseudopapilloedema using optic disc imaging? Findings: Models developed on three widely available AutoML platforms successfully distinguished the presence and severity of papilloedema on optic disc imaging, with Amazon Rekognition demonstrating the highest performance. Meaning: AutoML may assist clinical teams, even those with limited coding expertise, in diagnosing papilloedema, potentially reducing the need for invasive investigations.
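The reported metrics (AUC, F1, confusion matrix) can be computed for any exported model with a few lines of scikit-learn; the labels and scores below are synthetic stand-ins, not outputs of the evaluated platforms:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score, confusion_matrix

# Toy binary task standing in for papilloedema vs. normal/ODD.
rng = np.random.default_rng(7)
y_true = rng.integers(0, 2, size=200)
scores = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, size=200), 0, 1)
y_pred = (scores >= 0.5).astype(int)          # threshold the model scores

print("AUC:", round(roc_auc_score(y_true, scores), 3))
print("F1 :", round(f1_score(y_true, y_pred), 3))
print(confusion_matrix(y_true, y_pred))
```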

Exploring the Design Space of 3D MLLMs for CT Report Generation

Mohammed Baharoon, Jun Ma, Congyu Fang, Augustin Toma, Bo Wang

arXiv preprint · Jun 26 2025
Multimodal Large Language Models (MLLMs) have emerged as a promising way to automate Radiology Report Generation (RRG). In this work, we systematically investigate the design space of 3D MLLMs, including visual input representation, projectors, Large Language Models (LLMs), and fine-tuning techniques for 3D CT report generation. We also introduce two knowledge-based report augmentation methods that improve performance on the GREEN score by up to 10%, achieving 2nd place in the MICCAI 2024 AMOS-MM challenge. Our results on 1,687 cases from the AMOS-MM dataset show that RRG is largely independent of the size of the LLM under the same training protocol. We also show that a larger volume size does not always improve performance if the original ViT was pre-trained on a smaller volume size. Lastly, we show that using a segmentation mask along with the CT volume improves performance. The code is publicly available at https://github.com/bowang-lab/AMOS-MM-Solution
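The abstract does not detail the two augmentation methods; purely as a hypothetical illustration of knowledge-based report augmentation, one could prepend retrieved organ-specific reporting templates to the generation prompt, as sketched below. The KNOWLEDGE table, function name, and prompt format are invented for illustration and are not the paper's method:

```python
# Hypothetical knowledge-based report augmentation: prepend retrieved,
# organ-specific findings templates to the generation prompt.
KNOWLEDGE = {
    "liver": "Report liver size, parenchymal attenuation, and focal lesions.",
    "kidney": "Report kidney size, symmetry, hydronephrosis, and stones.",
}

def augment_prompt(base_prompt: str, organs: list[str]) -> str:
    hints = " ".join(KNOWLEDGE[o] for o in organs if o in KNOWLEDGE)
    return f"{hints}\n{base_prompt}" if hints else base_prompt

print(augment_prompt("Generate the findings section for this CT volume.",
                     ["liver", "kidney"]))
```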

HyperSORT: Self-Organising Robust Training with hyper-networks

Samuel Joutard, Marijn Stollenga, Marc Balle Sanchez, Mohammad Farid Azampour, Raphael Prevost

arXiv preprint · Jun 26 2025
Medical imaging datasets often contain heterogeneous biases, ranging from erroneous labels to inconsistent labeling styles. Such biases can negatively impact the performance of deep segmentation networks, yet identifying and characterizing them is a particularly tedious and challenging task. In this paper, we introduce HyperSORT, a framework using a hyper-network that predicts UNet parameters from latent vectors representing both image and annotation variability. The hyper-network parameters and the latent vectors, one per data sample in the training set, are jointly learned. Hence, instead of optimizing a single neural network to fit a dataset, HyperSORT learns a complex distribution of UNet parameters in which low-density areas can capture noise-specific patterns while larger modes robustly segment organs in differentiated but meaningful manners. We validate our method on two 3D abdominal CT public datasets: a synthetically perturbed version of the AMOS dataset, and TotalSegmentator, a large-scale dataset containing real, unknown biases and errors. Our experiments show that HyperSORT creates a structured mapping of the dataset, allowing the identification of relevant systematic biases and erroneous samples. Latent space clusters yield UNet parameters that perform the segmentation task in accordance with the underlying learned systematic bias. The code and our analysis of the TotalSegmentator dataset are available at https://github.com/ImFusionGmbH/HyperSORT
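A minimal sketch of the core mechanism, assuming a single-conv-layer "UNet" for brevity: a small MLP hyper-network maps a per-sample latent vector to the weights and bias of the segmentation head, and the latents are ordinary learnable parameters optimized jointly with the hyper-network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyHyperSeg(nn.Module):
    """Sketch of the HyperSORT idea: a hyper-network maps a per-sample
    latent vector to the parameters of a (here, one-layer) segmentation head."""
    def __init__(self, latent_dim=8, in_ch=1, out_ch=2, k=3):
        super().__init__()
        self.k, self.in_ch, self.out_ch = k, in_ch, out_ch
        n_params = out_ch * in_ch * k * k + out_ch
        self.hyper = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                   nn.Linear(64, n_params))

    def forward(self, x, z):                   # x: image, z: per-sample latent
        theta = self.hyper(z)
        w_end = self.out_ch * self.in_ch * self.k * self.k
        w = theta[:w_end].view(self.out_ch, self.in_ch, self.k, self.k)
        b = theta[w_end:]
        return F.conv2d(x, w, b, padding=self.k // 2)

# One learnable latent per training sample, optimized jointly with the hyper-net.
z = nn.Parameter(torch.randn(8))
net = TinyHyperSeg()
logits = net(torch.randn(1, 1, 64, 64), z)
print(logits.shape)                            # (1, 2, 64, 64)
```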

MedPrompt: LLM-CNN Fusion with Weight Routing for Medical Image Segmentation and Classification

Shadman Sobhan, Kazi Abrar Mahmud, Abduz Zami

arXiv preprint · Jun 26 2025
Current medical image analysis systems are typically task-specific, requiring separate models for classification and segmentation, and lack the flexibility to support user-defined workflows. To address these challenges, we introduce MedPrompt, a unified framework that combines a few-shot-prompted Large Language Model (Llama-4-17B) for high-level task planning with a modular Convolutional Neural Network (DeepFusionLab) for low-level image processing. The LLM interprets user instructions and generates structured output that dynamically routes task-specific pretrained weights. This weight routing avoids retraining the entire framework when adding new tasks; only the task-specific weights are required, enhancing scalability and ease of deployment. We evaluated MedPrompt across 19 public datasets, covering 12 tasks spanning 5 imaging modalities. The system achieves 97% end-to-end correctness in interpreting and executing prompt-driven instructions, with an average inference latency of 2.5 seconds, making it suitable for near-real-time applications. DeepFusionLab achieves competitive segmentation accuracy (e.g., Dice 0.9856 on lungs) and strong classification performance (F1 0.9744 on tuberculosis). Overall, MedPrompt enables scalable, prompt-driven medical imaging by combining the interpretability of LLMs with the efficiency of modular CNNs.
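A minimal sketch of weight routing under assumed names (the ROUTES table, file paths, and task keys are illustrative, not MedPrompt's actual schema): the LLM's structured plan supplies a task key, and only that task's pretrained weights are loaded into the shared CNN:

```python
import torch
import torch.nn as nn

# Registry mapping task keys to pretrained checkpoints (paths are illustrative).
ROUTES = {
    "lung_segmentation": "weights/lung_seg.pt",
    "tb_classification": "weights/tb_cls.pt",
}

backbone = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 2, 1))

def route(task: str, model: nn.Module) -> nn.Module:
    """Load only the requested task's weights; no retraining of the framework."""
    path = ROUTES.get(task)
    if path is None:
        raise ValueError(f"no pretrained weights registered for task {task!r}")
    model.load_state_dict(torch.load(path, map_location="cpu"))
    return model

# Example: llm_plan = {"task": "lung_segmentation", ...} parsed from the LLM.
# model = route(llm_plan["task"], backbone)
```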

Lightweight Physics-Informed Zero-Shot Ultrasound Plane Wave Denoising

Hojat Asgariandehkordi, Mostafa Sharifzadeh, Hassan Rivaz

arXiv preprint · Jun 26 2025
Ultrasound Coherent Plane Wave Compounding (CPWC) enhances image contrast by combining echoes from multiple steered transmissions. While increasing the number of angles generally improves image quality, it drastically reduces the frame rate and can introduce blurring artifacts in fast-moving targets. Moreover, compounded images remain susceptible to noise, particularly when acquired with a limited number of transmissions. We propose a zero-shot denoising framework tailored for low-angle CPWC acquisitions that enhances contrast without relying on a separate training dataset. The method divides the available transmission angles into two disjoint subsets, each used to form compounded images with higher noise levels. These compounded images are then used to train a deep model via a self-supervised residual learning scheme, enabling it to suppress incoherent noise while preserving anatomical structures. Because angle-dependent artifacts vary between the subsets while the underlying tissue response is similar, this physics-informed pairing lets the network learn to disentangle the inconsistent artifacts from the consistent tissue signal. Unlike supervised methods, our model requires no domain-specific fine-tuning or paired data, making it adaptable across anatomical regions and acquisition setups. The entire pipeline supports efficient training at low computational cost thanks to a lightweight architecture comprising only two convolutional layers. Evaluations on simulation, phantom, and in vivo data demonstrate superior contrast enhancement and structure preservation compared to both classical and deep learning-based denoising methods.
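A toy numpy sketch of the angle-split pairing (a Noise2Noise-style reading of the self-supervised scheme, which is our interpretation of the abstract): the two disjoint compoundings share the tissue signal but not the angle-specific noise, so either can serve as a training target for the other:

```python
import numpy as np

rng = np.random.default_rng(3)
n_angles = 8
rf = np.stack([np.sin(np.linspace(0, 4, 256)) +        # shared tissue signal
               0.3 * rng.normal(size=256)               # angle-specific noise
               for _ in range(n_angles)])

# Split the transmissions into two disjoint subsets and compound each.
odd, even = rf[0::2].mean(axis=0), rf[1::2].mean(axis=0)

# A denoiser would be trained to map `odd` toward `even` (and vice versa);
# here we just check that the pair agrees on the coherent component.
print(np.corrcoef(odd, even)[0, 1])                     # close to 1
```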

Generalizable Neural Electromagnetic Inverse Scattering

Yizhe Cheng, Chunxun Tian, Haoru Wang, Wentao Zhu, Xiaoxuan Ma, Yizhou Wang

arXiv preprint · Jun 26 2025
Solving Electromagnetic Inverse Scattering Problems (EISP) is fundamental in applications such as medical imaging, where the goal is to reconstruct the relative permittivity from the scattered electromagnetic field. This inverse process is inherently ill-posed and highly nonlinear, making it particularly challenging. A recent machine learning-based approach, Img-Interiors, shows promising results by leveraging continuous implicit functions. However, it requires case-specific optimization, lacks generalization to unseen data, and fails under sparse transmitter setups (e.g., with only one transmitter). To address these limitations, we revisit EISP from a physics-informed perspective, reformulating it as a two-stage inverse transmission-scattering process. This formulation reveals the induced current as a generalizable intermediate representation, effectively decoupling the nonlinear scattering process from the ill-posed inverse problem. Built on this insight, we propose the first generalizable physics-driven framework for EISP, comprising a current estimator and a permittivity solver working end to end. The current estimator explicitly learns the induced current as a physical bridge between the incident and scattered fields, while the permittivity solver computes the relative permittivity directly from the estimated induced current. This design enables data-driven training and generalizable feed-forward prediction of relative permittivity on unseen data while maintaining strong robustness to transmitter sparsity. Extensive experiments show that our method outperforms state-of-the-art approaches in reconstruction accuracy, generalization, and robustness. This work offers a fundamentally new perspective on electromagnetic inverse scattering and represents a major step toward cost-effective, practical solutions for electromagnetic imaging.
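A toy numpy sketch of why the induced current decouples the problem, using a random placeholder in place of a physical Green's operator: once the current J is known (the learned stage), the permittivity contrast follows pointwise from the state equation, so only the current estimator needs to be data-driven:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 16                                        # pixels in the toy domain
E_inc = rng.normal(size=n) + 1j * rng.normal(size=n)
chi_true = rng.uniform(0.0, 1.0, size=n)      # permittivity contrast eps_r - 1
G = 0.1 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))  # placeholder

# Forward model: (I - G diag(chi)) E_tot = E_inc, then J = chi * E_tot.
E_tot = np.linalg.solve(np.eye(n) - G * chi_true[None, :], E_inc)
J = chi_true * E_tot

# "Permittivity solver" stage: recover chi pointwise from the estimated J,
# since E_tot = E_inc + G @ J and chi = J / E_tot.
chi_rec = (J / (E_inc + G @ J)).real
print(np.max(np.abs(chi_rec - chi_true)))     # ~0 up to numerical error
```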