
RAU: Reference-based Anatomical Understanding with Vision Language Models

Yiwei Li, Yikang Liu, Jiaqi Guo, Lin Zhao, Zheyuan Zhang, Xiao Chen, Boris Mailhe, Ankush Mukherjee, Terrence Chen, Shanhui Sun

arXiv preprint, Sep 26 2025
Anatomical understanding through deep learning is critical for automatic report generation, intra-operative navigation, and organ localization in medical imaging; however, its progress is constrained by the scarcity of expert-labeled data. A promising remedy is to leverage an annotated reference image to guide the interpretation of an unlabeled target. Although recent vision-language models (VLMs) exhibit non-trivial visual reasoning, their reference-based understanding and fine-grained localization remain limited. We introduce RAU, a framework for reference-based anatomical understanding with VLMs. We first show that, when trained on a moderately sized dataset, a VLM learns to identify anatomical regions through relative spatial reasoning between reference and target images. We validate this capability through visual question answering (VQA) and bounding box prediction. Next, we demonstrate that the VLM-derived spatial cues can be seamlessly integrated with the fine-grained segmentation capability of SAM2, enabling localization and pixel-level segmentation of small anatomical regions, such as vessel segments. Across two in-distribution and two out-of-distribution datasets, RAU consistently outperforms a SAM2 fine-tuning baseline using the same memory setup, yielding more accurate segmentations and more reliable localization. More importantly, its strong generalization ability makes it scalable to out-of-distribution datasets, a property crucial for medical imaging applications. To the best of our knowledge, RAU is the first to explore the capability of VLMs for reference-based identification, localization, and segmentation of anatomical structures in medical images. Its promising performance highlights the potential of VLM-driven approaches for anatomical understanding in automated clinical workflows.
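
The pipeline the abstract describes, in which the VLM proposes a region and SAM2 refines it to a pixel mask, can be pictured with a short sketch. The `vlm_predict_box` helper below is a hypothetical stand-in for the paper's VLM step, and the SAM2 calls assume the `SAM2ImagePredictor` interface from the `sam2` package rather than the authors' exact setup.

```python
import numpy as np
from sam2.sam2_image_predictor import SAM2ImagePredictor  # assumes the sam2 package

def vlm_predict_box(reference_image, reference_annotation, target_image, query):
    """Hypothetical stand-in for the VLM's reference-guided localization:
    returns an [x0, y0, x1, y1] pixel box for `query` in the target image."""
    raise NotImplementedError

def segment_region(reference_image, reference_annotation, target_image, query):
    # 1) The VLM localizes the queried anatomy in the target via the reference.
    box = np.array(vlm_predict_box(reference_image, reference_annotation,
                                   target_image, query))
    # 2) SAM2 turns the box prompt into a pixel-level mask.
    predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
    predictor.set_image(target_image)                      # H x W x 3 uint8 array
    masks, scores, _ = predictor.predict(box=box, multimask_output=False)
    return masks[0], scores[0]                             # binary mask and its score
```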

Ultra-low-field MRI: a David versus Goliath challenge in modern imaging.

Gagliardo C, Feraco P, Contrino E, D'Angelo C, Geraci L, Salvaggio G, Gagliardo A, La Grutta L, Midiri M, Marrale M

PubMed paper, Sep 26 2025
Ultra-low-field magnetic resonance imaging (ULF-MRI), operating below 0.2 Tesla, is gaining renewed interest as a re-emerging diagnostic modality in a field dominated by high- and ultra-high-field systems. Recent advances in magnet design, RF coils, pulse sequences, and AI-based reconstruction have significantly enhanced image quality, mitigating traditional limitations such as low signal- and contrast-to-noise ratios and reduced spatial resolution. ULF-MRI offers distinct advantages: reduced susceptibility artifacts, safer imaging in patients with metallic implants, low power consumption, and true portability for point-of-care use. This narrative review synthesizes the physical foundations, technological advances, and emerging clinical applications of ULF-MRI. A focused literature search across PubMed, Scopus, IEEE Xplore, and Google Scholar was conducted up to August 11, 2025, using combined keywords targeting hardware, software, and clinical domains. Inclusion emphasized scientific rigor and thematic relevance. A comparative analysis with other imaging modalities highlights the specific niche ULF-MRI occupies within the broader diagnostic landscape. Future directions and challenges for clinical translation are explored. In a world increasingly polarized between the push for ultra-high-field excellence and the need for accessible imaging, ULF-MRI embodies a modern "David versus Goliath" theme, offering a sustainable, democratizing force capable of expanding MRI access to anyone, anywhere.

Evaluating the Accuracy and Efficiency of AI-Generated Radiology Reports Based on Positive Findings-A Qualitative Assessment of AI in Radiology.

Rajmohamed RF, Chapala S, Shazahan MA, Wali P, Botchu R

PubMed paper, Sep 26 2025
With increasing imaging demands, radiologists face growing workload pressures, often resulting in delays and reduced diagnostic efficiency. Recent advances in artificial intelligence (AI) have introduced tools for automated report generation, particularly in simpler imaging modalities, such as X-rays. However, limited research has assessed AI performance in complex studies such as MRI and CT scans, where report accuracy and clinical interpretation are critical. To evaluate the performance of a semi-automated AI-based reporting platform in generating radiology reports for complex imaging studies, and to compare its accuracy, efficiency, and user confidence with the traditional dictation method. This study involved 100 imaging cases, including MRI knee (n=21), MRI lumbar spine (n=30), CT head (n=23), and CT abdomen and pelvis (n=26). Consultant musculoskeletal radiologists reported each case using both traditional dictation and the AI platform. The radiologist first identified and entered the key positive findings, based on which the AI system generated a full draft report. Reporting time was recorded, and both methods were evaluated on accuracy, user confidence, and overall reporting experience (rated on a scale of 1-5). Statistical analysis was conducted using two-tailed t-tests and 95% confidence intervals. AI-generated reports demonstrated significantly improved performance across all parameters. The mean reporting time decreased from 6.1 to 3.43 minutes (p<0.0001) with AI-assisted report generation. Accuracy ratings improved from 3.81 to 4.65 (p<0.0001), confidence ratings increased from 3.91 to 4.67 (p<0.0001), and the overall reporting experience favored the AI platform (mean 4.7 vs. 3.69, p<0.0001). Minor formatting errors and occasional anatomical misinterpretations were observed in AI-generated reports, but these could be easily corrected by the radiologist during review. The AI-assisted reporting platform significantly improved efficiency and radiologist confidence without compromising accuracy. Although the tool performs well when provided with key clinical findings, it still requires expert oversight, especially in anatomically complex reporting. These findings support the use of AI as a supportive tool in radiology practice, with a focus on data integrity, consistency, and human validation.
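
For the statistical comparison the abstract mentions, a paired two-tailed t-test with a 95% confidence interval is one natural fit for per-case timings collected under both reporting methods; the sketch below uses placeholder numbers, not the study's data, and the pairing assumption is ours.

```python
import numpy as np
from scipy import stats

# Placeholder per-case reporting times in minutes (not the study's data).
dictation = np.array([6.5, 5.8, 7.0, 5.9, 6.2])
ai_assisted = np.array([3.6, 3.2, 4.0, 3.1, 3.5])

# Paired two-tailed t-test on the per-case differences.
t_stat, p_value = stats.ttest_rel(dictation, ai_assisted)

# 95% confidence interval for the mean time saved per case.
diff = dictation - ai_assisted
ci_low, ci_high = stats.t.interval(0.95, df=len(diff) - 1,
                                   loc=diff.mean(), scale=stats.sem(diff))
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f}) min")
```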

EqDiff-CT: Equivariant Conditional Diffusion model for CT Image Synthesis from CBCT

Alzahra Altalib, Chunhui Li, Alessandro Perelli

arXiv preprint, Sep 26 2025
Cone-beam computed tomography (CBCT) is widely used for image-guided radiotherapy (IGRT), providing real-time visualization at low cost and dose. However, photon scattering and beam hardening cause artifacts in CBCT, including inaccurate Hounsfield units (HU), which reduce reliability for dose calculation and adaptive planning. By contrast, computed tomography (CT) offers better image quality and accurate HU calibration but is usually acquired offline and fails to capture intra-treatment anatomical changes. Accurate CBCT-to-CT synthesis is therefore needed to close the imaging-quality gap in adaptive radiotherapy workflows. To address this, we propose a novel diffusion-based conditional generative model, coined EqDiff-CT, to synthesize high-quality CT images from CBCT. EqDiff-CT employs a denoising diffusion probabilistic model (DDPM) to iteratively inject noise and learn latent representations that enable reconstruction of anatomically consistent CT images. A group-equivariant conditional U-Net backbone, implemented with e2cnn steerable layers, enforces rotational equivariance (cyclic C4 symmetry), helping preserve fine structural details while minimizing noise and artifacts. The system was trained and validated on the SynthRAD2025 dataset, comprising CBCT-CT scans across multiple head-and-neck anatomical sites, and compared with advanced methods such as CycleGAN and DDPM. EqDiff-CT provided substantial gains in structural fidelity, HU accuracy, and quantitative metrics. Visual findings further confirm improved recovery, sharper soft-tissue boundaries, and realistic bone reconstructions. These findings suggest that the diffusion model offers a robust and generalizable framework for CBCT enhancement, improving both image quality and clinical confidence in CBCT-guided treatment planning and dose calculation.
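
The rotational-equivariance ingredient can be illustrated with a small block built from the e2cnn steerable layers the abstract names; the layer widths, kernel size, and the equivariance check below are illustrative assumptions, not the paper's configuration.

```python
import torch
from e2cnn import gspaces
from e2cnn import nn as enn

# Cyclic group C4: equivariance to 90-degree rotations of the image plane.
gspace = gspaces.Rot2dOnR2(N=4)
in_type = enn.FieldType(gspace, [gspace.trivial_repr])          # one scalar channel
hid_type = enn.FieldType(gspace, 16 * [gspace.regular_repr])    # 16 regular fields

block = enn.SequentialModule(
    enn.R2Conv(in_type, hid_type, kernel_size=3, padding=1),
    enn.InnerBatchNorm(hid_type),
    enn.ReLU(hid_type, inplace=True),
)

x = enn.GeometricTensor(torch.randn(1, 1, 64, 64), in_type)
y = block(x)                    # features for the original slice
y_rot = block(x.transform(1))   # features for the slice rotated by 90 degrees
# Equivariance: rotating the input should match rotating the output (up to numerics).
print((y.transform(1).tensor - y_rot.tensor).abs().max())
```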

[Advances in the application of multimodal image fusion technique in stomatology].

Ma TY, Zhu N, Zhang Y

PubMed paper, Sep 26 2025
In modern stomatology, obtaining detailed preoperative information is key to accurate intraoperative planning, implementation, and prognostic judgment. Traditional single-modality images, however, have obvious shortcomings, such as limited information content and unstable measurement accuracy, and can hardly meet the diverse needs of oral patients. The multimodal medical image fusion (MMIF) technique was introduced into stomatology research in the 1990s; by combining the advantages of multiple imaging modalities through various fusion algorithms, it enables personalized analysis of patient data and lays a stable foundation for new treatment technologies. Recently, artificial intelligence (AI) has significantly increased the precision and efficiency of MMIF registration, and advanced algorithms and networks have confirmed the strong compatibility between AI and MMIF. This article systematically reviews the development of the multimodal image fusion technique and its current applications in stomatology, and analyzes technological progress in the field against the background of AI's rapid development, in order to provide new ideas for further advances in stomatology.

Deep Learning-Based Pneumonia Detection from Chest X-ray Images: A CNN Approach with Performance Analysis and Clinical Implications

P K Dutta, Anushri Chowdhury, Anouska Bhattacharyya, Shakya Chakraborty, Sujatra Dey

arXiv preprint, Sep 26 2025
Deep learning integration into medical imaging systems has transformed disease detection and diagnosis, with pneumonia identification as a key focus. The study introduces a deep learning system using convolutional neural networks (CNNs) for automated pneumonia detection from chest X-ray images, boosting diagnostic precision and speed. The proposed CNN architecture integrates separable convolutions along with batch normalization and dropout regularization to enhance feature extraction while reducing overfitting. Through the application of data augmentation techniques and adaptive learning rate strategies, the model was trained on an extensive collection of chest X-ray images to enhance its generalization capabilities. A comprehensive set of evaluation metrics, including accuracy, precision, recall, and F1 score, collectively verifies the model's performance, with a recorded accuracy of 91%. Beyond model performance, the study tackles critical clinical implementation obstacles such as data privacy protection, model interpretability, and integration with current healthcare systems. The approach introduces a critical advancement by integrating medical ontologies with semantic technology to improve diagnostic accuracy, and enhances AI diagnostic reliability by combining machine learning outputs with structured medical knowledge frameworks to boost interpretability. The findings demonstrate that AI-powered healthcare tools offer a scalable and efficient pneumonia detection solution. This study advances AI integration into clinical settings by developing more precise automated diagnostic methods that deliver consistent medical imaging results.
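
The building block the abstract sketches (separable convolution, batch normalization, dropout) looks roughly like the following; the channel counts, depth, and dropout rates are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SeparableConvBlock(nn.Module):
    """Depthwise-separable conv -> batch norm -> ReLU -> pool -> dropout."""
    def __init__(self, in_ch, out_ch, p_drop=0.25):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)
        self.pool = nn.MaxPool2d(2)
        self.drop = nn.Dropout2d(p_drop)

    def forward(self, x):
        x = self.pointwise(self.depthwise(x))
        return self.drop(self.pool(self.act(self.bn(x))))

class PneumoniaCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            SeparableConvBlock(1, 32),
            SeparableConvBlock(32, 64),
            SeparableConvBlock(64, 128),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Dropout(0.5), nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.head(self.features(x))

logits = PneumoniaCNN()(torch.randn(4, 1, 224, 224))   # -> (4, 2) class scores
```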

Integrating CT image reconstruction, segmentation, and large language models for enhanced diagnostic insight.

Abbasi AA, Farooqi AH

PubMed paper, Sep 25 2025
Deep learning has significantly advanced medical imaging, particularly computed tomography (CT), which is vital for diagnosing heart and cancer patients, evaluating treatments, and tracking disease progression. High-quality CT images enhance clinical decision-making, making image reconstruction a key research focus. This study develops a framework to improve CT image quality while minimizing reconstruction time. The proposed four-step medical image analysis framework includes reconstruction, preprocessing, segmentation, and image description. Initially, raw projection data undergoes reconstruction via a Radon transform to generate a sinogram, which is then used to construct a CT image of the pelvis. A convolutional neural network (CNN) ensures high-quality reconstruction. A bilateral filter reduces noise while preserving critical anatomical features. If required, a medical expert can review the image. The K-means clustering algorithm segments the preprocessed image, isolating the pelvis and removing irrelevant structures. Finally, the FuseCap model generates an automated textual description to assist radiologists. The framework's effectiveness is evaluated using peak signal-to-noise ratio (PSNR), normalized mean square error (NMSE), and structural similarity index measure (SSIM). The achieved values (PSNR 30.784, NMSE 0.032, and SSIM 0.877) demonstrate superior performance compared to existing methods. The proposed framework reconstructs high-quality CT images from raw projection data, integrating segmentation and automated descriptions to provide a decision-support tool for medical experts. By enhancing image clarity, segmenting outputs, and providing descriptive insights, this research aims to reduce the workload of frontline medical professionals and improve diagnostic efficiency.
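
The sinogram-and-metrics portion of such a pipeline can be sketched with scikit-image; the phantom, angle count, and plain filtered back-projection stand in for the paper's CNN reconstruction, and all settings are illustrative assumptions.

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, resize
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

image = resize(shepp_logan_phantom(), (256, 256))
theta = np.linspace(0.0, 180.0, 180, endpoint=False)

sinogram = radon(image, theta=theta)                         # forward projection
recon = iradon(sinogram, theta=theta, filter_name="ramp")    # filtered back-projection

data_range = image.max() - image.min()
psnr = peak_signal_noise_ratio(image, recon, data_range=data_range)
ssim = structural_similarity(image, recon, data_range=data_range)
nmse = np.mean((image - recon) ** 2) / np.mean(image ** 2)
print(f"PSNR={psnr:.3f}  SSIM={ssim:.3f}  NMSE={nmse:.4f}")
```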

Clinically Explainable Disease Diagnosis based on Biomarker Activation Map.

Zang P, Wang C, Hormel TT, Bailey ST, Hwang TS, Jia Y

PubMed paper, Sep 25 2025
Artificial intelligence (AI)-based disease classifiers have achieved specialist-level performances in several diagnostic tasks. However, real-world adoption of these classifiers remains challenging due to the black box issue. Here, we report a novel biomarker activation map (BAM) generation framework that can provide clinically meaningful explainability to current AI-based disease classifiers. We designed the framework based on the concept of residual counterfactual explanation by generating counterfactual outputs that could reverse the decision-making of the disease classifier. The BAM was generated as the difference map between the counterfactual output and original input with postprocessing. We evaluated the BAM on four different disease classifiers, including an age-related macular degeneration classifier based on fundus photography, a diabetic retinopathy classifier based on optical coherence tomography angiography, a brain tumor classifier based on magnetic resonance imaging (MRI), and a breast cancer classifier based on computed tomography (CT) scans. The highlighted regions in the BAM correlated highly with manually demarcated biomarkers of each disease. The BAM can improve the clinical applicability of an AI-based disease classifier by providing intuitive output clinicians can use to understand and verify the diagnostic decision.
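
The residual-counterfactual idea reduces to a difference map: generate an image that flips the classifier's decision, subtract the original, and post-process. The sketch below uses hypothetical stand-ins for the classifier and counterfactual generator, and the smoothing and thresholding steps are simplified assumptions rather than the paper's post-processing.

```python
import torch
import torch.nn.functional as F

def biomarker_activation_map(image, classifier, counterfactual_generator,
                             blur_kernel=7, threshold=0.1):
    """image: (1, C, H, W) tensor; returns an (H, W) map of decision-driving regions."""
    with torch.no_grad():
        counterfactual = counterfactual_generator(image)
        # The counterfactual is meant to reverse the classifier's decision.
        original_pred = classifier(image).argmax(dim=1)
        flipped_pred = classifier(counterfactual).argmax(dim=1)
        assert (original_pred != flipped_pred).all(), "decision was not reversed"
        # Difference map: where the image had to change to flip the decision.
        bam = (counterfactual - image).abs().mean(dim=1, keepdim=True)
        # Simplified post-processing: smooth, rescale to [0, 1], suppress weak responses.
        bam = F.avg_pool2d(bam, blur_kernel, stride=1, padding=blur_kernel // 2)
        bam = (bam - bam.min()) / (bam.max() - bam.min() + 1e-8)
        bam = torch.where(bam > threshold, bam, torch.zeros_like(bam))
    return bam[0, 0]
```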

Multimodal text guided network for chest CT pneumonia classification.

Feng Y, Huang G, Ju F, Cui H

PubMed paper, Sep 25 2025
Pneumonia is a prevalent and serious respiratory disease, responsible for a significant disease burden globally. With advancements in deep learning, the automatic diagnosis of pneumonia has attracted significant research attention in medical image classification. However, current methods still face several challenges. First, since lesions are often visible in only a few slices, slice-based classification algorithms may overlook critical spatial contextual information in CT sequences, and slice-level annotations are labor-intensive. Moreover, chest CT sequence-based pneumonia classification algorithms that rely solely on sequence-level coarse-grained labels remain limited, especially in integrating multi-modal information. To address these challenges, we propose a Multi-modal Text-Guided Network (MTGNet) for pneumonia classification using chest CT sequences. In this model, we design a sequential graph pooling network to encode the CT sequences by gradually selecting important slice features to obtain a sequence-level representation. Additionally, a CT description encoder is developed to learn representations from textual reports. To simulate the clinical diagnostic process, we employ multi-modal training and single-modal testing. A modal transfer module is proposed to generate simulated textual features from CT sequences. Cross-modal attention is then employed to fuse the sequence-level and simulated textual representations, thereby enhancing feature learning within the CT sequences by incorporating semantic information from textual descriptions. Furthermore, contrastive learning is applied to learn discriminative features by maximizing the similarity of positive sample pairs and minimizing the similarity of negative sample pairs. Extensive experiments on a self-constructed pneumonia CT sequence dataset demonstrate that the proposed model significantly improves classification performance.
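
The fusion step can be pictured as cross-attention from the pooled CT-sequence feature over simulated text features, trained alongside an InfoNCE-style contrastive loss; the dimensions and module choices below are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 256
cross_attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

ct_seq = torch.randn(8, 1, d)      # (batch, 1, d): pooled sequence-level CT feature
sim_text = torch.randn(8, 16, d)   # (batch, tokens, d): simulated textual features

# The CT feature queries the simulated text to produce a text-enhanced representation.
fused, _ = cross_attn(query=ct_seq, key=sim_text, value=sim_text)
fused = fused.squeeze(1)           # (batch, d)

def info_nce(a, b, temperature=0.07):
    """Matching (a_i, b_i) pairs are positives; all other pairs act as negatives."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)

loss = info_nce(fused, sim_text.mean(dim=1))
```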

Role of artificial intelligence in screening and medical imaging of precancerous gastric diseases.

Kotelevets SM

PubMed paper, Sep 24 2025
Serological screening, endoscopic imaging, and morphological visual verification of precancerous gastric diseases and changes in the gastric mucosa are the main stages of early detection, accurate diagnosis, and preventive treatment of gastric precancer. Serological, endoscopic, and histological diagnostics are carried out by medical laboratory technicians, endoscopists, and histologists, and these human factors introduce a large degree of subjectivity. Endoscopists and histologists rely on descriptive principles when formulating imaging conclusions, and diagnostic reports from different doctors often reach contradictory and mutually exclusive conclusions. Erroneous results from diagnosticians and clinicians have fatal consequences, such as late diagnosis of gastric cancer and high patient mortality. Effective population-level serological screening is only possible with machine processing of laboratory test results. Currently, the subjective and imprecise description of endoscopic and histological images by a diagnostician can be replaced with objective, highly sensitive, and highly specific visual recognition using convolutional neural networks trained with deep learning. Many machine learning models are available, and all have predictive capabilities; based on such predictive models, it is necessary to identify, with very high probability, each patient's gastric cancer risk level.