Page 23 of 24234 results

Improving AI models for rare thyroid cancer subtype by text guided diffusion models.

Dai F, Yao S, Wang M, Zhu Y, Qiu X, Sun P, Qiu C, Yin J, Shen G, Sun J, Wang M, Wang Y, Yang Z, Sang J, Wang X, Sun F, Cai W, Zhang X, Lu H

PubMed | Paper | May 13 2025
Artificial intelligence applications in oncology imaging often struggle with diagnosing rare tumors. We identify significant gaps in detecting uncommon thyroid cancer types with ultrasound, where scarce data leads to frequent misdiagnosis. Traditional augmentation strategies do not capture the unique disease variations, hindering model training and performance. To overcome this, we propose a text-driven generative method that fuses clinical insights with image generation, producing synthetic samples that realistically reflect rare subtypes. In rigorous evaluations, our approach achieves substantial gains in diagnostic metrics, surpasses existing methods in authenticity and diversity measures, and generalizes effectively to other private and public datasets with various rare cancers. In this work, we demonstrate that text-guided image augmentation substantially enhances model accuracy and robustness for rare tumor detection, offering a promising avenue for more reliable and widespread clinical adoption.

A survey of deep-learning-based radiology report generation using multimodal inputs.

Wang X, Figueredo G, Li R, Zhang WE, Chen W, Chen X

PubMed | Paper | May 13 2025
Automatic radiology report generation can alleviate the workload for physicians and minimize regional disparities in medical resources, therefore becoming an important topic in the medical image analysis field. It is a challenging task, as the computational model needs to mimic physicians to obtain information from multi-modal input data (i.e., medical images, clinical information, medical knowledge, etc.), and produce comprehensive and accurate reports. Recently, numerous works have emerged to address this issue using deep-learning-based methods, such as transformers, contrastive learning, and knowledge-base construction. This survey summarizes the key techniques developed in the most recent works and proposes a general workflow for deep-learning-based report generation with five main components, including multi-modality data acquisition, data preparation, feature learning, feature fusion and interaction, and report generation. The state-of-the-art methods for each of these components are highlighted. Additionally, we summarize the latest developments in large model-based methods and model explainability, along with public datasets, evaluation methods, current challenges, and future directions in this field. We have also conducted a quantitative comparison between different methods in the same experimental setting. This is the most up-to-date survey that focuses on multi-modality inputs and data fusion for radiology report generation. The aim is to provide comprehensive and rich information for researchers interested in automatic clinical report generation and medical image analysis, especially when using multimodal inputs, and to assist them in developing new algorithms to advance the field.

Automatic CTA analysis for blood vessels and aneurysm features extraction in EVAR planning.

Robbi E, Ravanelli D, Allievi S, Raunig I, Bonvini S, Passerini A, Trianni A

PubMed | Paper | May 12 2025
Endovascular Aneurysm Repair (EVAR) is a minimally invasive procedure crucial for treating abdominal aortic aneurysms (AAA), where precise pre-operative planning is essential. Current clinical methods rely on manual measurements, which are time-consuming and prone to errors. Although AI solutions are increasingly being developed to automate aspects of these processes, most existing approaches primarily focus on computing volumes and diameters, falling short of delivering a fully automated pre-operative analysis. This work presents BRAVE (Blood Vessels Recognition and Aneurysms Visualization Enhancement), the first comprehensive AI-driven solution for vascular segmentation and AAA analysis using pre-operative CTA scans. BRAVE offers exhaustive segmentation, identifying both the primary abdominal aorta and secondary vessels, often overlooked by existing methods, providing a complete view of the vascular structure. The pipeline performs advanced volumetric analysis of the aneurysm sac, quantifying thrombotic tissue and calcifications, and automatically identifies the proximal and distal sealing zones, critical for successful EVAR procedures. BRAVE enables fully automated processing, reducing manual intervention and improving clinical workflow efficiency. Trained on a multi-center open-access dataset, it demonstrates generalizability across different CTA protocols and patient populations, ensuring robustness in diverse clinical settings. This solution saves time, ensures precision, and standardizes the process, enhancing vascular surgeons' decision-making.

The March to Harmonized Imaging Standards for Retinal Imaging.

Gim N, Ferguson AN, Blazes M, Lee CS, Lee AY

PubMed | Paper | May 11 2025
The adoption of standardized imaging protocols in retinal imaging is critical to overcoming challenges posed by fragmented data formats across devices and manufacturers. The lack of standardization hinders clinical interoperability, collaborative research, and the development of artificial intelligence (AI) models that depend on large, high-quality datasets. The Digital Imaging and Communications in Medicine (DICOM) standard offers a robust solution for ensuring interoperability in medical imaging. Although DICOM is widely utilized in radiology and cardiology, its adoption in ophthalmology remains limited. Retinal imaging modalities such as optical coherence tomography (OCT), fundus photography, and OCT angiography (OCTA) have revolutionized retinal disease management but are constrained by proprietary and non-standardized formats. This review underscores the necessity for harmonized imaging standards in ophthalmology, detailing DICOM standards for retinal imaging including ophthalmic photography (OP), OCT, and OCTA, and their requisite metadata information. Additionally, the potential of DICOM standardization for advancing AI applications in ophthalmology is explored. A notable example is the Artificial Intelligence Ready and Equitable Atlas for Diabetes Insights (AI-READI) dataset, the first publicly available standards-compliant DICOM retinal imaging dataset. This dataset encompasses diverse retinal imaging modalities, including color fundus photography, infrared, autofluorescence, OCT, and OCTA. By leveraging multimodal retinal imaging, AI-READI provides a transformative resource for studying diabetes and its complications, setting a blueprint for future datasets aimed at harmonizing imaging formats and enabling AI-driven breakthroughs in ophthalmology. Our manuscript also addresses challenges in retinal imaging for diabetic patients, retinal imaging-based AI applications for studying diabetes, and potential advancements in retinal imaging standardization.

Creation of an Open-Access Lung Ultrasound Image Database For Deep Learning and Neural Network Applications

Kumar, A., Nandakishore, P., Gordon, A. J., Baum, E., Madhok, J., Duanmu, Y., Kugler, J.

medRxiv | Preprint | May 11 2025
Background: Lung ultrasound (LUS) offers advantages over traditional imaging for diagnosing pulmonary conditions, with superior accuracy compared to chest X-ray and similar performance to CT at lower cost. Despite these benefits, widespread adoption is limited by operator dependency, moderate interrater reliability, and training requirements. Deep learning (DL) could potentially address these challenges, but development of effective algorithms is hindered by the scarcity of comprehensive image repositories with proper metadata. Methods: We created an open-source dataset of LUS images derived from a multi-center study involving N=226 adult patients presenting with respiratory symptoms to emergency departments between March 2020 and April 2022. Images were acquired using a standardized scanning protocol (12-zone or modified 8-zone) with various point-of-care ultrasound devices. Three blinded researchers independently analyzed each image following consensus guidelines, with disagreements adjudicated to provide definitive interpretations. Videos were pre-processed to remove identifiers, and frames were extracted and resized to 128x128 pixels. Results: The dataset contains 1,874 video clips comprising 303,977 frames. Half of the participants (50%) had COVID-19 pneumonia. Among all clips, 66% contained no abnormalities, 18% contained B-lines, 4.5% contained consolidations, 6.4% contained both B-lines and consolidations, and 5.2% had indeterminate findings. Pathological findings varied significantly by lung zone, with anterior zones more frequently normal and less likely to show consolidations compared to lateral and posterior zones. Discussion: This dataset represents one of the largest annotated LUS repositories to date, including both COVID-19 and non-COVID-19 patients. The comprehensive metadata and expert interpretations enhance its utility for DL applications. Despite limitations including potential device-specific characteristics and COVID-19 predominance, this repository provides a valuable resource for developing AI tools to improve LUS acquisition and interpretation.
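
The composition figures in this abstract can be turned into rough per-class clip counts to see how imbalanced the labels are. The following is a minimal Python sketch that uses only the numbers reported above. The function names and the inverse-frequency weighting are illustrative assumptions, not part of the released dataset or the authors' tooling, and the counts are approximate because the published percentages are rounded (they sum to 100.1%).

```python
# Figures reported in the abstract.
TOTAL_CLIPS = 1874
TOTAL_FRAMES = 303_977
FINDING_SHARES = {  # percent of clips per finding, as published (rounded)
    "no abnormality": 66.0,
    "B-lines": 18.0,
    "consolidations": 4.5,
    "B-lines and consolidations": 6.4,
    "indeterminate": 5.2,
}


def approximate_clip_counts(total_clips, shares):
    """Estimate clips per finding from rounded percentages (approximate)."""
    return {label: round(total_clips * pct / 100) for label, pct in shares.items()}


def inverse_frequency_weights(shares):
    """Hypothetical class weights proportional to 1/share, scaled to mean 1.

    This is one common way to counter class imbalance during training;
    the abstract does not say which scheme, if any, the authors used.
    """
    raw = {label: 1.0 / pct for label, pct in shares.items()}
    mean = sum(raw.values()) / len(raw)
    return {label: w / mean for label, w in raw.items()}


counts = approximate_clip_counts(TOTAL_CLIPS, FINDING_SHARES)
weights = inverse_frequency_weights(FINDING_SHARES)
mean_frames_per_clip = TOTAL_FRAMES / TOTAL_CLIPS

for label, n in counts.items():
    print(f"{label}: ~{n} clips, weight {weights[label]:.2f}")
print(f"mean frames per clip: {mean_frames_per_clip:.1f}")
```

Because normal clips dominate, any model trained on this split would likely need some form of reweighting or resampling; the weighting shown here is only one option.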

Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification

Daniel Strick, Carlos Garcia, Anthony Huang

arXiv | Preprint | May 10 2025
Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become a standard practice in modern medicine. On the publicly available NIH ChestX-ray14 dataset, containing X-ray images that are classified by the presence or absence of 14 different diseases, we reproduced an algorithm known as CheXNet, as well as explored other algorithms that outperform CheXNet's baseline metrics. Model performance was primarily evaluated using the F1 score and AUC-ROC, both of which are critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC score of 0.85 and an average F1 score of 0.39 across all 14 disease classifications present in the dataset.
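
For readers unfamiliar with how the two metrics in this abstract are averaged across labels, here is a minimal pure-Python sketch of macro-averaged F1 and AUC-ROC for a multi-label task. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 for one binary label: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc_roc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_multilabel(y_true, scores, threshold=0.5):
    """Return (macro F1, macro AUC-ROC), averaging per-label metrics equally."""
    n_labels = len(y_true[0])
    f1s, aucs = [], []
    for j in range(n_labels):
        truth = [row[j] for row in y_true]
        label_scores = [row[j] for row in scores]
        preds = [1 if s >= threshold else 0 for s in label_scores]
        f1s.append(f1_score(truth, preds))
        aucs.append(auc_roc(truth, label_scores))
    return sum(f1s) / n_labels, sum(aucs) / n_labels


# Toy data: 4 images, 2 disease labels (hypothetical values).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.4, 0.7], [0.6, 0.4], [0.1, 0.3]]
macro_f1, macro_auc = evaluate_multilabel(y_true, scores)
print(f"macro F1 = {macro_f1:.3f}, macro AUC-ROC = {macro_auc:.3f}")
```

In the toy data the second label is ranked perfectly, yet one of its positives scores below the threshold, so F1 drops while AUC-ROC does not. This illustrates in general terms why the two metrics can diverge: F1 depends on the chosen threshold, whereas AUC-ROC depends only on the ranking.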

Towards Better Cephalometric Landmark Detection with Diffusion Data Generation

Dongqian Guo, Wencheng Han, Pang Lyu, Yuxi Zhou, Jianbing Shen

arXiv | Preprint | May 9 2025
Cephalometric landmark detection is essential for orthodontic diagnostics and treatment planning. Nevertheless, the scarcity of samples in data collection and the extensive effort required for manual annotation have significantly impeded the availability of diverse datasets. This limitation has restricted the effectiveness of deep learning-based detection methods, particularly those based on large-scale vision models. To address these challenges, we have developed an innovative data generation method capable of producing diverse cephalometric X-ray images along with corresponding annotations without human intervention. To achieve this, our approach begins by constructing new cephalometric landmark annotations using anatomical priors. Then, we employ a diffusion-based generator to create realistic X-ray images that correspond closely with these annotations. To achieve precise control in producing samples with different attributes, we introduce a novel prompt cephalometric X-ray image dataset. This dataset includes real cephalometric X-ray images and detailed medical text prompts describing the images. By leveraging these detailed prompts, our method improves the generation process to control different styles and attributes. Facilitated by the large, diverse generated data, we introduce large-scale vision detection models into the cephalometric landmark detection task to improve accuracy. Experimental results demonstrate that training with the generated data substantially enhances the performance. Compared to methods without using the generated data, our approach improves the Success Detection Rate (SDR) by 6.5%, attaining a notable 82.2%. All code and data are available at: https://um-lab.github.io/cepha-generation
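
The Success Detection Rate quoted in this abstract is commonly computed as the share of predicted landmarks that fall within a fixed radial distance of the ground truth. The sketch below shows that calculation under stated assumptions: the 2.0 mm threshold, the pixel spacing, and the coordinates are hypothetical, and the abstract does not say which threshold the reported 82.2% refers to.

```python
import math


def success_detection_rate(predicted, ground_truth, threshold_mm=2.0, pixel_spacing_mm=1.0):
    """Percentage of landmarks whose radial error is within threshold_mm.

    predicted / ground_truth: equal-length sequences of (x, y) pixel coordinates.
    pixel_spacing_mm: assumed isotropic size of one pixel in millimetres.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty landmark lists of equal length")
    hits = 0
    for (px, py), (gx, gy) in zip(predicted, ground_truth):
        radial_error_mm = math.hypot(px - gx, py - gy) * pixel_spacing_mm
        if radial_error_mm <= threshold_mm:
            hits += 1
    return 100.0 * hits / len(predicted)


# Hypothetical landmarks: radial errors are 1.0, 3.0, about 1.41, and 0.0 mm.
truth = [(10, 10), (20, 20), (30, 30), (40, 40)]
preds = [(10, 11), (20, 23), (31, 31), (40, 40)]
sdr_2mm = success_detection_rate(preds, truth, threshold_mm=2.0)
print(f"SDR@2mm = {sdr_2mm:.1f}%")
```

Reporting SDR at several thresholds gives a fuller picture than a single number, since a model can look strong at a loose threshold and weak at a strict one.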

KEVS: enhancing segmentation of visceral adipose tissue in pre-cystectomy CT with Gaussian kernel density estimation.

Boucher T, Tetlow N, Fung A, Dewar A, Arina P, Kerneis S, Whittle J, Mazomenos EB

PubMed | Paper | May 9 2025
The distribution of visceral adipose tissue (VAT) in cystectomy patients is indicative of the incidence of postoperative complications. Existing VAT segmentation methods for computed tomography (CT) employing intensity thresholding have limitations relating to inter-observer variability. Moreover, the difficulty in creating ground-truth masks limits the development of deep learning (DL) models for this task. This paper introduces a novel method for VAT prediction in pre-cystectomy CT, which is fully automated and does not require ground-truth VAT masks for training, overcoming aforementioned limitations. We introduce the kernel density-enhanced VAT segmentator (KEVS), combining a DL semantic segmentation model, for multi-body feature prediction, with Gaussian kernel density estimation analysis of predicted subcutaneous adipose tissue to achieve accurate scan-specific predictions of VAT in the abdominal cavity. Uniquely for a DL pipeline, KEVS does not require ground-truth VAT masks. We verify the ability of KEVS to accurately segment abdominal organs in unseen CT data and compare KEVS VAT segmentation predictions to existing state-of-the-art (SOTA) approaches in a dataset of 20 pre-cystectomy CT scans, collected from University College London Hospital (UCLH-Cyst), with expert ground-truth annotations. KEVS presents a 4.80% and 6.02% improvement in Dice coefficient over the second best DL and thresholding-based VAT segmentation techniques respectively when evaluated on UCLH-Cyst. This research introduces KEVS, an automated, SOTA method for the prediction of VAT in pre-cystectomy CT which eliminates inter-observer variability and is trained entirely on open-source CT datasets which do not contain ground-truth VAT masks.
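
Two building blocks named in this abstract, Gaussian kernel density estimation and the Dice coefficient, can be sketched in a few lines of pure Python. This is not the KEVS pipeline: the bandwidth rule, the intensity values, and the rule for turning a density into a fat/non-fat decision are all assumptions chosen only to show how a density fitted to one tissue's intensities might yield a scan-specific classification that is then scored with Dice.

```python
import math


def gaussian_kde(samples, bandwidth=None):
    """Return a 1D Gaussian kernel density estimate as a callable.

    If no bandwidth is given, a normal-reference rule of thumb
    (1.06 * std * n^(-1/5)) is used; this choice is an assumption.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
        bandwidth = 1.06 * std * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def dice_coefficient(pred, truth):
    """Dice = 2|A and B| / (|A| + |B|) for two binary masks given as flat sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * intersection / total if total else 1.0


# Hypothetical intensities (HU) sampled from predicted subcutaneous fat.
sat_hu = [-110, -105, -100, -98, -95, -90]
density = gaussian_kde(sat_hu)
peak = max(density(v) for v in sat_hu)

# Hypothetical rule: call a voxel "fat" if its density is at least 25% of the peak.
candidate_hu = [-102, -60, 40, -97]
predicted = [1 if density(v) >= 0.25 * peak else 0 for v in candidate_hu]
reference = [1, 1, 0, 1]  # made-up expert labels for the same four voxels
score = dice_coefficient(predicted, reference)
print(predicted, f"Dice = {score:.2f}")
```

The appeal of fitting the density per scan is that the intensity range treated as fat adapts to each patient and scanner rather than relying on one fixed threshold, which is the general motivation the abstract gives for its approach.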

Chest X-Ray Visual Saliency Modeling: Eye-Tracking Dataset and Saliency Prediction Model.

Lou J, Wang H, Wu X, Ng JCH, White R, Thakoor KA, Corcoran P, Chen Y, Liu H

PubMed | Paper | May 8 2025
Radiologists' eye movements during medical image interpretation reflect the perceptual-cognitive processes behind their diagnostic decisions. The eye movement data can be modeled to represent clinically relevant regions in a medical image and potentially integrated into an artificial intelligence (AI) system for automatic diagnosis in medical imaging. In this article, we first conduct a large-scale eye-tracking study involving 13 radiologists interpreting 191 chest X-ray (CXR) images, establishing a best-of-its-kind CXR visual saliency benchmark. We then perform analysis to quantify the reliability and clinical relevance of saliency maps (SMs) generated for CXR images. We develop a CXR image saliency prediction method (CXRSalNet), a novel model that leverages radiologists' gaze information to optimize the use of unlabeled CXR images, enhancing training and mitigating data scarcity. We also demonstrate the application of our CXR saliency model in enhancing the performance of AI-powered diagnostic imaging systems.

From Pixels to Polygons: A Survey of Deep Learning Approaches for Medical Image-to-Mesh Reconstruction

Fengming Lin, Arezoo Zakeri, Yidan Xue, Michael MacRaild, Haoran Dou, Zherui Zhou, Ziwei Zou, Ali Sarrami-Foroushani, Jinming Duan, Alejandro F. Frangi

arXiv | Preprint | May 6 2025
Deep learning-based medical image-to-mesh reconstruction has rapidly evolved, enabling the transformation of medical imaging data into three-dimensional mesh models that are critical in computational medicine and in silico trials for advancing our understanding of disease mechanisms, and diagnostic and therapeutic techniques in modern medicine. This survey systematically categorizes existing approaches into four main categories: template models, statistical models, generative models, and implicit models. Each category is analysed in detail, examining their methodological foundations, strengths, limitations, and applicability to different anatomical structures and imaging modalities. We provide an extensive evaluation of these methods across various anatomical applications, from cardiac imaging to neurological studies, supported by quantitative comparisons using standard metrics. Additionally, we compile and analyze major public datasets available for medical mesh reconstruction tasks and discuss commonly used evaluation metrics and loss functions. The survey identifies current challenges in the field, including requirements for topological correctness, geometric accuracy, and multi-modality integration. Finally, we present promising future research directions in this domain. This systematic review aims to serve as a comprehensive reference for researchers and practitioners in medical image analysis and computational medicine.