Automated mitral valve segmentation in PLAX-view transthoracic echocardiography for anatomical assessment and risk stratification.

Jansen GE, Molenaar MA, Schuuring MJ, Bouma BJ, Išgum I

PubMed | Aug 20, 2025
Accurate segmentation of the mitral valve in transthoracic echocardiography (TTE) enables the extraction of various anatomical parameters that are important for guiding clinical management. However, manual mitral valve segmentation is time-consuming and prone to interobserver variability. To support robust automatic analysis of mitral valve anatomy, we propose a novel AI-based method for mitral valve segmentation and anatomical measurement extraction. We retrospectively collected a set of echocardiographic exams from 1,756 consecutive patients with suspected coronary artery disease. For these patients, we retrieved expert-defined scores for mitral regurgitation (MR) severity and follow-up characteristics. PLAX-view videos were automatically identified, and the inside borders of the mitral valve leaflets were manually segmented in 182 patients. To automatically segment the mitral valve leaflets, we designed a deep neural network that takes a video frame and outputs a distance map and a classification map for each leaflet, supervised by the manual segmentations. From the resulting automatic segmentations, we extracted leaflet length, annulus diameter, tenting area, tenting height, and coaptation length. To demonstrate the clinical relevance of these automatically extracted measurements, we performed univariable and multivariable Cox regression survival analyses, with the clinical endpoint defined as heart-failure hospitalization or all-cause mortality. We trained the segmentation model on annotated frames of 111 patients and tested segmentation performance on a set of 71 patients. For the survival analysis, we included 1,117 patients (mean age 64.1 ± 12.4 years, 58% male, median follow-up 3.3 years). The trained model achieved an average surface distance of 0.89 mm, a Hausdorff distance of 3.34 mm, and a temporal consistency score of 97%. Additionally, leaflet coaptation was accurately detected in 93% of annotated frames. In univariable Cox regression, automated annulus diameter (>35 mm, hazard ratio (HR) = 2.38, p<0.001), tenting area (>2.4 cm², HR = 2.48, p<0.001), tenting height (>10 mm, HR = 1.91, p<0.001), and coaptation length (>3 mm, HR = 1.53, p = 0.007) were significantly associated with the defined clinical endpoint. For reference, significant MR by expert assessment resulted in an HR of 2.31 (p<0.001). In multivariable Cox regression analysis, automated annulus diameter and coaptation length predicted the defined endpoint as independent parameters (p = 0.03 and p = 0.05, respectively). Our method allows accurate segmentation of the mitral valve in TTE and enables fully automated quantification of key measurements describing mitral valve anatomy. This has the potential to improve risk stratification for cardiac patients.
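As a rough illustration of the survival analysis described above (this is not the authors' code; the column names, threshold indicator, and data are hypothetical), a univariable Cox model for one automated measurement could be fitted with the lifelines package:

```python
# Minimal sketch: univariable Cox regression for one automatically extracted
# measurement. Columns "annulus_gt_35mm", "time_years", "event" are hypothetical.
import pandas as pd
from lifelines import CoxPHFitter

# One row per patient: binary indicator for annulus diameter > 35 mm, follow-up
# time in years, and the composite endpoint (HF hospitalization or death).
df = pd.DataFrame({
    "annulus_gt_35mm": [1, 0, 0, 1, 0, 1, 0, 1],
    "time_years":      [1.2, 3.3, 4.0, 0.8, 3.1, 2.0, 3.9, 1.5],
    "event":           [1, 0, 0, 1, 0, 0, 1, 1],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time_years", event_col="event")
cph.print_summary()  # reports the hazard ratio (exp(coef)) and p-value
```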

Deep learning approach for screening neonatal cerebral lesions on ultrasound in China.

Lin Z, Zhang H, Duan X, Bai Y, Wang J, Liang Q, Zhou J, Xie F, Shentu Z, Huang R, Chen Y, Yu H, Weng Z, Ni D, Liu L, Zhou L

PubMed | Aug 20, 2025
Timely and accurate diagnosis of severe neonatal cerebral lesions is critical for preventing long-term neurological damage and addressing life-threatening conditions. Cranial ultrasound is the primary screening tool, but the process is time-consuming and reliant on the operator's proficiency. In this study, a deep-learning-powered neonatal cerebral lesion screening system, capable of automatically extracting standard views from cranial ultrasound videos and identifying cases with severe cerebral lesions, is developed based on 8,757 neonatal cranial ultrasound images. The system demonstrates areas under the curve of 0.982 and 0.944, with sensitivities of 0.875 and 0.962, on internal and external video datasets, respectively. Furthermore, the system outperforms junior radiologists and performs on par with mid-level radiologists, while improving examination efficiency by 55.11%. In conclusion, the developed system can automatically extract standard views from cranial ultrasound videos and make correct diagnoses efficiently, and may be useful for deployment in multiple application scenarios.
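For readers wanting to reproduce this kind of evaluation, a minimal sketch of computing an AUC and sensitivity for video-level screening labels is shown below (synthetic labels and scores; the 0.5 operating threshold is an assumption, not from the paper):

```python
# Minimal sketch: AUC and sensitivity for binary video-level screening results.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

y_true  = np.array([1, 0, 0, 1, 1, 0, 0, 1, 0, 1])    # 1 = severe cerebral lesion
y_score = np.array([0.9, 0.2, 0.1, 0.7, 0.8, 0.3, 0.4, 0.6, 0.2, 0.95])

auc = roc_auc_score(y_true, y_score)

y_pred = (y_score >= 0.5).astype(int)                  # assumed operating threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
print(f"AUC={auc:.3f}, sensitivity={sensitivity:.3f}")
```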

Machine learning-based method for the detection of dextrocardia in ultrasound video clips.

Hernandez-Cruz N, Patey O, Salovic B, Papageorghiou A, Noble JA

PubMed | Aug 20, 2025
Dextrocardia is a congenital anomaly arising during fetal development, characterised by abnormal positioning of the heart on the right side of the chest instead of its usual anatomical location on the left. This paper describes a machine learning-based method to automatically assess ultrasound (US) transverse videos for dextrocardia by analysing the Situs and four-chamber (4CH) views. The method processes user-captured ultrasound video sweeps that include the Situs and 4CH views. The automated analysis consists of three stages. First, four fetal anatomical structures (chest, spine, stomach and heart) are automatically segmented using SegFormer, a Transformer-based segmentation model. Second, a quality assessment (QA) module verifies that the video includes informative frames. Third, the orientation of the stomach and heart relative to the fetal chest (right or left side) is determined to assess dextrocardia. Segmentation performance was evaluated using the Dice coefficient, and fetal anatomy centroid estimation accuracy using root mean squared error (RMSE). Dextrocardia was classified based on a frame-based classification score (FBCS). The datasets consist of 142 pairs of Situs and 4CH US images (284 frames in total) for training, and 14 US videos (7 normal, 7 dextrocardia; 2,916 frames in total) for testing. The method achieved Dice scores of 0.968, 0.958, 0.953 and 0.949 for chest, spine, stomach and heart segmentation, respectively, and anatomy centroid RMSEs of 0.23 mm, 0.34 mm, 0.25 mm and 0.39 mm for the same structures. The QA module rejected 172 frames. The assessment of dextrocardia achieved an FBCS of 0.99, with a standard deviation of 0.01 for normal and 0.02 for dextrocardia videos. Our automated method demonstrates accurate segmentation and reliable detection of dextrocardia from US videos. Owing to the simple acquisition protocol and its robust analytical pipeline, the method is suitable for healthcare providers who are not cardiac experts. It has the potential to facilitate earlier and more consistent prenatal identification of dextrocardia during screening, particularly in settings with limited access to experts in fetal echocardiography.
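An illustrative sketch of the third stage (not the authors' exact rule): the side an organ lies on can be inferred from segmentation centroids by taking the sign of a cross product against the spine-to-chest axis. Which sign corresponds to fetal left depends on probe orientation, so the mapping here is a placeholder assumption.

```python
# Sketch: infer organ laterality from 2D centroids of segmented structures.
import numpy as np

def organ_side(chest_c, spine_c, organ_c):
    ap_axis = np.asarray(chest_c, float) - np.asarray(spine_c, float)   # anteroposterior axis
    to_organ = np.asarray(organ_c, float) - np.asarray(chest_c, float)
    cross = ap_axis[0] * to_organ[1] - ap_axis[1] * to_organ[0]
    return "left" if cross > 0 else "right"   # sign-to-side mapping is an assumption

# Hypothetical centroids in image coordinates (x, y), in pixels.
chest, spine, heart, stomach = (100, 100), (100, 140), (80, 95), (78, 102)
print(organ_side(chest, spine, heart), organ_side(chest, spine, stomach))
# A frame would suggest dextrocardia when the heart lies on the fetal right;
# per-frame decisions are then aggregated into the frame-based classification score.
```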

From Slices to Structures: Unsupervised 3D Reconstruction of Female Pelvic Anatomy from Freehand Transvaginal Ultrasound

Max Krähenmann, Sergio Tascon-Morales, Fabian Laumer, Julia E. Vogt, Ece Ozkan

arXiv preprint | Aug 20, 2025
Volumetric ultrasound has the potential to significantly improve diagnostic accuracy and clinical decision-making, yet its widespread adoption remains limited by dependence on specialized hardware and restrictive acquisition protocols. In this work, we present a novel unsupervised framework for reconstructing 3D anatomical structures from freehand 2D transvaginal ultrasound (TVS) sweeps, without requiring external tracking or learned pose estimators. Our method adapts the principles of Gaussian Splatting to the domain of ultrasound, introducing a slice-aware, differentiable rasterizer tailored to the unique physics and geometry of ultrasound imaging. We model anatomy as a collection of anisotropic 3D Gaussians and optimize their parameters directly from image-level supervision, leveraging sensorless probe motion estimation and domain-specific geometric priors. The result is a compact, flexible, and memory-efficient volumetric representation that captures anatomical detail with high spatial fidelity. This work demonstrates that accurate 3D reconstruction from 2D ultrasound images can be achieved through purely computational means, offering a scalable alternative to conventional 3D systems and enabling new opportunities for AI-assisted analysis and diagnosis.
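To make the representation concrete, here is a greatly simplified sketch of evaluating a set of anisotropic 3D Gaussians on a single imaging plane. The actual method uses a differentiable, slice-aware rasterizer and optimizes rotated covariances from image supervision; this toy version uses axis-aligned Gaussians and random parameters purely for illustration.

```python
# Toy sketch: render the z = 0 slice of a field of anisotropic 3D Gaussians.
import numpy as np

rng = np.random.default_rng(0)
means   = rng.uniform(-1, 1, size=(50, 3))       # Gaussian centers
scales  = rng.uniform(0.05, 0.2, size=(50, 3))   # per-axis std dev (anisotropy)
weights = rng.uniform(0.5, 1.0, size=50)         # per-Gaussian intensity

xs = np.linspace(-1, 1, 64)
X, Y = np.meshgrid(xs, xs)
pts = np.stack([X, Y, np.zeros_like(X)], axis=-1)   # points on the slice, (64, 64, 3)

image = np.zeros((64, 64))
for mu, s, w in zip(means, scales, weights):
    d = (pts - mu) / s                               # axis-aligned for simplicity
    image += w * np.exp(-0.5 * np.sum(d * d, axis=-1))
print(image.shape, image.max())
```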

Deep Learning Model for Breast Shear Wave Elastography to Improve Breast Cancer Diagnosis (INSPiRED 006): An International, Multicenter Analysis.

Cai L, Pfob A, Barr RG, Duda V, Alwafai Z, Balleyguier C, Clevert DA, Fastner S, Gomez C, Goncalo M, Gruber I, Hahn M, Kapetas P, Nees J, Ohlinger R, Riedel F, Rutten M, Stieber A, Togawa R, Sidey-Gibbons C, Tozaki M, Wojcinski S, Heil J, Golatta M

PubMed | Aug 20, 2025
Shear wave elastography (SWE) has been investigated as a complement to B-mode ultrasound for breast cancer diagnosis. Although multicenter trials suggest benefits for patients with Breast Imaging Reporting and Data System (BI-RADS) 4(a) breast masses, widespread adoption remains limited because of the absence of validated velocity thresholds. This study aims to develop and validate a deep learning (DL) model using SWE images (artificial intelligence [AI]-SWE) for BI-RADS 3 and 4 breast masses and compare its performance with human experts using B-mode ultrasound. We used data from an international, multicenter trial (ClinicalTrials.gov identifier: NCT02638935) evaluating SWE in women with BI-RADS 3 or 4 breast masses across 12 institutions in seven countries. Images from 11 sites were used to develop an EfficientNetB1-based DL model. An external validation was conducted using data from the 12th site. Another validation was performed using the latest SWE software on a separate institutional cohort. Performance metrics included sensitivity, specificity, false-positive reduction, and area under the receiver operating characteristic curve (AUROC). The development set included 924 patients (4,026 images); the external validation sets included 194 patients (562 images) and 176 patients (188 images, latest SWE software). AI-SWE achieved an AUROC of 0.94 (95% CI, 0.91 to 0.96) and 0.93 (95% CI, 0.88 to 0.98) in the two external validation sets. Compared with B-mode ultrasound, AI-SWE significantly reduced false-positive rates by 62.1% (20.4% [30/147] vs 53.8% [431/801]; P < .001) and 38.1% (33.3% [14/42] vs 53.8% [431/801]; P < .001), with comparable sensitivity (97.9% [46/47] and 97.8% [131/134] vs 98.1% [311/317]; P = .912 and P = .810). AI-SWE demonstrated accuracy comparable with human experts in malignancy detection while significantly reducing false-positive imaging findings (ie, unnecessary biopsies). Future studies should explore its integration into multimodal breast cancer diagnostics.
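A quick arithmetic check of the reported relative false-positive reduction, using the counts quoted in the abstract:

```python
# Counts taken from the abstract (first external validation set vs B-mode ultrasound).
fpr_bmode = 431 / 801          # B-mode ultrasound false-positive rate (~53.8%)
fpr_ai    = 30 / 147           # AI-SWE false-positive rate (~20.4%)
relative_reduction = (fpr_bmode - fpr_ai) / fpr_bmode
print(f"{fpr_bmode:.1%} -> {fpr_ai:.1%}, relative reduction {relative_reduction:.1%}")
# ~53.8% -> ~20.4%, i.e. roughly a 62% relative reduction, matching the reported 62.1%
```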

A Fully Transformer Based Multimodal Framework for Explainable Cancer Image Segmentation Using Radiology Reports

Enobong Adahada, Isabel Sassoon, Kate Hone, Yongmin Li

arXiv preprint | Aug 19, 2025
We introduce Med-CTX, a fully transformer-based multimodal framework for explainable breast cancer ultrasound segmentation. We integrate clinical radiology reports to boost both performance and interpretability. Med-CTX achieves precise lesion delineation by using a dual-branch visual encoder that combines ViT and Swin transformers, together with uncertainty-aware fusion. Clinical language structured with BI-RADS semantics is encoded by BioClinicalBERT and combined with visual features via cross-modal attention, allowing the model to provide clinically grounded, model-generated explanations. Our methodology generates segmentation masks, uncertainty maps, and diagnostic rationales all at once, increasing confidence and transparency in computer-assisted diagnosis. On the BUS-BRA dataset, Med-CTX achieves a Dice score of 99% and an IoU of 95%, beating the existing baselines U-Net, ViT, and Swin. Clinical text plays a key role in segmentation accuracy and explanation quality, as evidenced by ablation studies that show declines of 5.4% in Dice score and 31% in CIDEr when it is removed. Med-CTX achieves good multimodal alignment (CLIP score: 85%) and improved confidence calibration (ECE: 3.2%), setting a new bar for trustworthy multimodal medical architectures.
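A minimal sketch of the cross-modal attention idea described above (not the authors' implementation; the dimensions and token counts are assumptions): visual tokens query the encoded report tokens, yielding text-conditioned visual features.

```python
import torch
import torch.nn as nn

d_model = 256
visual_tokens = torch.randn(1, 196, d_model)   # e.g. patch tokens from a ViT/Swin branch
text_tokens   = torch.randn(1, 32,  d_model)   # e.g. projected BioClinicalBERT report embeddings

cross_attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=8, batch_first=True)
# Visual features attend to the clinical text; output is text-conditioned visual features.
fused, attn_weights = cross_attn(query=visual_tokens, key=text_tokens, value=text_tokens)
print(fused.shape, attn_weights.shape)   # (1, 196, 256), (1, 196, 32)
```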

Deep learning for detection and diagnosis of intrathoracic lymphadenopathy from endobronchial ultrasound multimodal videos: A multi-center study.

Chen J, Li J, Zhang C, Zhi X, Wang L, Zhang Q, Yu P, Tang F, Zha X, Wang L, Dai W, Xiong H, Sun J

PubMed | Aug 19, 2025
Convex probe endobronchial ultrasound (CP-EBUS) ultrasonographic features are important for diagnosing intrathoracic lymphadenopathy. Conventional methods for CP-EBUS imaging analysis rely heavily on physician expertise. To overcome this obstacle, we propose a deep learning-aided diagnostic system (AI-CEMA) to automatically select representative images, identify lymph nodes (LNs), and differentiate benign from malignant LNs based on CP-EBUS multimodal videos. AI-CEMA is first trained using 1,006 LNs from a single center and validated with a retrospective study and then demonstrated with a prospective multi-center study on 267 LNs. AI-CEMA achieves an area under the curve (AUC) of 0.8490 (95% confidence interval [CI], 0.8000-0.8980), which is comparable to experienced experts (AUC, 0.7847 [95% CI, 0.7320-0.8373]; p = 0.080). Additionally, AI-CEMA is successfully transferred to a pulmonary lesion diagnosis task and obtains a commendable AUC of 0.8192 (95% CI, 0.7676-0.8709). In conclusion, AI-CEMA shows great potential in clinical diagnosis of intrathoracic lymphadenopathy and pulmonary lesions by providing automated, noninvasive, and expert-level diagnosis.
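For context on the confidence intervals quoted above, a common way to obtain a 95% CI for an AUC is the percentile bootstrap; the sketch below uses synthetic labels and scores and is not the study's statistics code.

```python
# Sketch: percentile-bootstrap 95% CI for an AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
y_true  = rng.integers(0, 2, size=267)                          # 1 = malignant lymph node (synthetic)
y_score = np.clip(y_true * 0.35 + rng.normal(0.4, 0.2, 267), 0, 1)

boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), len(y_true))
    if len(np.unique(y_true[idx])) < 2:                         # need both classes in the resample
        continue
    boot.append(roc_auc_score(y_true[idx], y_score[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC={roc_auc_score(y_true, y_score):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```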

A Multimodal Large Language Model as an End-to-End Classifier of Thyroid Nodule Malignancy Risk: Usability Study.

Sng GGR, Xiang Y, Lim DYZ, Tung JYM, Tan JH, Chng CL

PubMed | Aug 19, 2025
Thyroid nodules are common, with ultrasound imaging as the primary modality for their assessment. Risk stratification systems like the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) have been developed but suffer from interobserver variability and low specificity. Artificial intelligence, particularly large language models (LLMs) with multimodal capabilities, presents opportunities for efficient end-to-end diagnostic processes. However, their clinical utility remains uncertain. This study evaluates the accuracy and consistency of multimodal LLMs for thyroid nodule risk stratification using the ACR TI-RADS system, examining the effects of model fine-tuning, image annotation, and prompt engineering, and comparing open-source versus commercial models. In total, 3 multimodal vision-language models were evaluated: Microsoft's open-source Large Language and Visual Assistant (LLaVA) model, its medically fine-tuned variant (Large Language and Vision Assistant for bioMedicine [LLaVA-Med]), and OpenAI's commercial o3 model. A total of 192 thyroid nodules from publicly available ultrasound image datasets were assessed. Each model was evaluated using 2 prompts (basic and modified) and 2 image scenarios (unlabeled vs radiologist-annotated), yielding 6,912 responses. Model outputs were compared with expert ratings for accuracy and consistency. Statistical comparisons included chi-square tests, Mann-Whitney U tests, and Fleiss' kappa for interrater reliability. Overall, 88.4% (6,110/6,912) of responses were valid, with the o3 model producing the highest validity rate (2,273/2,304, 98.6%), followed by LLaVA (2,108/2,304, 91.5%) and LLaVA-Med (1,729/2,304, 75%; P<.001). The o3 model demonstrated the highest accuracy overall, achieving up to 57.3% accuracy in overall TI-RADS classification, though still suboptimal. Labeled images improved accuracy marginally, and only in nodule margin assessment for the LLaVA models (407/768, 53% to 447/768, 58.2%; P=.04). Prompt engineering improved accuracy for composition (649/1,152, 56.3% vs 483/1,152, 41.9%; P<.001) but significantly reduced accuracy for shape, margins, and overall classification. Consistency was highest with the o3 model (up to 85.4%), was comparable for LLaVA, and improved significantly with image labeling and modified prompts across multiple TI-RADS categories (P<.001). Subgroup analysis for o3 alone showed that prompt engineering did not affect accuracy significantly but markedly improved consistency across all TI-RADS categories (up to 97.1% for shape, P<.001). Interrater reliability was consistently poor across all combinations (Fleiss' kappa < 0.60). The study demonstrates the comparative advantages and limitations of multimodal LLMs for thyroid nodule risk stratification. While the commercial model (o3) consistently outperformed the open-source models in accuracy and consistency, even the best-performing model's outputs remained suboptimal for direct clinical deployment. Prompt engineering significantly enhanced output consistency, particularly in the commercial model. These findings underline the importance of strategic model optimization techniques and highlight areas requiring further development before multimodal LLMs can be reliably used in clinical thyroid imaging workflows.
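As a sketch of the interrater-reliability computation mentioned above (assuming ratings are stored as one TI-RADS category per rater per nodule; the data here are made up), Fleiss' kappa can be computed with statsmodels:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = nodules, columns = raters (e.g. model configurations), values = TI-RADS level 1-5
ratings = np.array([
    [4, 4, 3],
    [2, 3, 2],
    [5, 5, 5],
    [3, 4, 4],
    [1, 1, 2],
])

table, _ = aggregate_raters(ratings)      # per-nodule counts for each category
print(f"Fleiss' kappa = {fleiss_kappa(table, method='fleiss'):.3f}")
```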

LGFFM: A Localized and Globalized Frequency Fusion Model for Ultrasound Image Segmentation.

Luo X, Wang Y, Ou-Yang L

PubMed | Aug 19, 2025
Accurate segmentation of ultrasound images plays a critical role in disease screening and diagnosis. Recently, neural network-based methods have garnered significant attention for their potential in improving ultrasound image segmentation. However, these methods still face significant challenges, primarily due to inherent issues in ultrasound images, such as low resolution, speckle noise, and artifacts. Additionally, ultrasound image segmentation encompasses a wide range of scenarios, including organ segmentation (e.g., cardiac and fetal head) and lesion segmentation (e.g., breast cancer and thyroid nodules), making the task highly diverse and complex. Existing methods are often designed for specific segmentation scenarios, which limits their flexibility and ability to meet the diverse needs across various scenarios. To address these challenges, we propose a novel Localized and Globalized Frequency Fusion Model (LGFFM) for ultrasound image segmentation. Specifically, we first design a Parallel Bi-Encoder (PBE) architecture that integrates Local Feature Blocks (LFB) and Global Feature Blocks (GLB) to enhance feature extraction. Additionally, we introduce a Frequency Domain Mapping Module (FDMM) to capture texture information, particularly high-frequency details such as edges. Finally, a Multi-Domain Fusion (MDF) method is developed to effectively integrate features across different domains. We conduct extensive experiments on eight representative public ultrasound datasets across four different types. The results demonstrate that LGFFM outperforms current state-of-the-art methods in both segmentation accuracy and generalization performance.
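To illustrate the frequency-domain idea behind an FDMM-style module (this is an illustration, not the paper's module; the cutoff and masking scheme are assumptions), high-frequency content such as edges can be isolated with a simple high-pass mask in the Fourier domain:

```python
import torch

def high_frequency_component(x, cutoff=0.1):
    """x: (B, C, H, W) feature map; cutoff: fraction of low frequencies removed."""
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    B, C, H, W = x.shape
    yy, xx = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    dist = torch.sqrt((yy - H / 2) ** 2 + (xx - W / 2) ** 2)
    mask = (dist > cutoff * min(H, W)).to(x.dtype)     # zero out the low-frequency center
    filtered = freq * mask
    return torch.fft.ifft2(torch.fft.ifftshift(filtered, dim=(-2, -1))).real

x = torch.randn(2, 16, 64, 64)
print(high_frequency_component(x).shape)   # (2, 16, 64, 64)
```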

Multi-View Echocardiographic Embedding for Accessible AI Development

Tohyama, T., Han, A., Yoon, D., Paik, K., Gow, B., Izath, N., Kpodonu, J., Celi, L. A.

medRxiv preprint | Aug 19, 2025
Background and Aims: Echocardiography serves as a cornerstone of cardiovascular diagnostics through multiple standardized imaging views. While recent AI foundation models demonstrate superior capabilities across cardiac imaging tasks, their massive computational requirements and reliance on large-scale datasets create accessibility barriers, limiting AI development to well-resourced institutions. Vector embedding approaches offer promising solutions by leveraging compact representations from original medical images for downstream applications. Furthermore, demographic fairness remains critical, as AI models may incorporate biases that confound clinically relevant features. We developed a multi-view encoder framework to address computational accessibility while investigating demographic fairness challenges.
Methods: We utilized the MIMIC-IV-ECHO dataset (7,169 echocardiographic studies) to develop a transformer-based multi-view encoder that aggregates view-level representations into study-level embeddings. The framework incorporated adversarial learning to suppress demographic information while maintaining clinical performance. We evaluated performance across 21 binary classification tasks encompassing echocardiographic measurements and clinical diagnoses, comparing against foundation model baselines with varying adversarial weights.
Results: The multi-view encoder achieved a mean improvement of 9.0 AUC points (12.0% relative improvement) across clinical tasks compared to foundation model embeddings. Performance remained robust with limited echocardiographic views compared to the conventional approach. However, adversarial learning showed limited effectiveness in reducing demographic shortcuts, with stronger weighting substantially compromising diagnostic performance.
Conclusions: Our framework democratizes advanced cardiac AI capabilities, enabling substantial diagnostic improvements without massive computational infrastructure. While algorithmic approaches to demographic fairness showed limitations, the multi-view encoder provides a practical pathway for broader AI adoption in cardiovascular medicine with enhanced efficiency in real-world clinical settings.
Key Question: Can multi-view encoder frameworks achieve superior diagnostic performance compared to foundation model embeddings while reducing computational requirements and maintaining robust performance with fewer echocardiographic views for cardiac AI applications?
Key Finding: The multi-view encoder achieved a 12.0% relative improvement (9.0 AUC points) across 21 cardiac tasks compared to foundation model baselines, with efficient 512-dimensional vector embeddings and robust performance using fewer echocardiographic views.
Take-home Message: Vector embedding approaches with attention-based multi-view integration significantly improve cardiac diagnostic performance while reducing computational requirements, offering a pathway toward more efficient AI implementation in clinical settings.
Translational Perspective: Our proposed multi-view encoder framework overcomes critical barriers to the widespread adoption of artificial intelligence in echocardiography.
By dramatically reducing computational requirements, the multi-view encoder approach allows smaller healthcare institutions to develop sophisticated AI models locally. The framework maintains robust performance with fewer echocardiographic examinations, which addresses real-world clinical constraints where comprehensive imaging is not feasible due to patient factors or time limitations. This technology provides a practical way to democratize advanced cardiac AI capabilities, which could improve access to cardiovascular care across diverse healthcare settings while reducing dependence on proprietary datasets and massive computational resources.
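A minimal sketch of attention-based aggregation of per-view embeddings into a single study-level embedding, in the spirit of the framework described above (the 512-dimensional embedding matches the abstract; the learnable study token, layer count, and head count are assumptions, not the authors' exact architecture):

```python
import torch
import torch.nn as nn

class StudyEncoder(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_layers=2):
        super().__init__()
        self.study_token = nn.Parameter(torch.zeros(1, 1, d_model))   # learnable study-level token
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, view_embeddings):          # (B, n_views, d_model)
        B = view_embeddings.size(0)
        tokens = torch.cat([self.study_token.expand(B, -1, -1), view_embeddings], dim=1)
        return self.encoder(tokens)[:, 0]        # study-level embedding, (B, d_model)

views = torch.randn(4, 6, 512)                   # 4 studies, 6 echocardiographic views each
print(StudyEncoder()(views).shape)               # torch.Size([4, 512])
```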