Page 6 of 768 results

Longitudinal Validation of a Deep Learning Index for Aortic Stenosis Progression

Park, J., Kim, J., Yoon, Y. E., Jeon, J., Lee, S.-A., Choi, H.-M., Hwang, I.-C., Cho, G.-Y., Chang, H.-J., Park, J.-H.

medRxiv preprint, May 19 2025
Aims: Aortic stenosis (AS) is a progressive disease requiring timely monitoring and intervention. While transthoracic echocardiography (TTE) remains the diagnostic standard, deep learning (DL)-based approaches offer potential for improved disease tracking. This study examined the longitudinal changes in a previously developed DL-derived index for AS continuum (DLi-ASc) and assessed its value in predicting progression to severe AS. Methods and Results: We retrospectively analysed 2,373 patients (7,371 TTEs) from two tertiary hospitals. DLi-ASc (scaled 0-100), derived from parasternal long- and/or short-axis views, was tracked longitudinally. DLi-ASc increased in parallel with worsening AS stages (p for trend <0.001) and showed strong correlations with AV maximal velocity (Vmax) (Pearson correlation coefficient [PCC] = 0.69, p<0.001) and mean pressure gradient (mPG) (PCC = 0.66, p<0.001). Higher baseline DLi-ASc was associated with a faster AS progression rate (p for trend <0.001). Additionally, the annualised change in DLi-ASc, estimated using linear mixed-effect models, correlated strongly with the annualised progression of AV Vmax (PCC = 0.71, p<0.001) and mPG (PCC = 0.68, p<0.001). In Fine-Gray competing risk models, baseline DLi-ASc independently predicted progression to severe AS, even after adjustment for AV Vmax or mPG (hazard ratio per 10-point increase = 2.38 and 2.80, respectively). Conclusion: DLi-ASc increased in parallel with AS progression and independently predicted severe AS progression. These findings support its role as a non-invasive imaging-based digital marker for longitudinal AS monitoring and risk stratification.

Harnessing Artificial Intelligence for Accurate Diagnosis and Radiomics Analysis of Combined Pulmonary Fibrosis and Emphysema: Insights from a Multicenter Cohort Study

Zhang, S., Wang, H., Tang, H., Li, X., Wu, N.-W., Lang, Q., Li, B., Zhu, H., Chen, X., Chen, K., Xie, B., Zhou, A., Mo, C.

medRxiv preprint, May 18 2025
Combined Pulmonary Fibrosis and Emphysema (CPFE), formally recognized as a distinct pulmonary syndrome in 2022, is characterized by unique clinical features and pathogenesis that may lead to respiratory failure and death. However, the diagnosis of CPFE presents significant challenges that hinder effective treatment. Here, we assembled three-dimensional (3D) reconstruction data of the chest High-Resolution Computed Tomography (HRCT) of patients from multiple hospitals across different provinces in China, including Xiangya Hospital, West China Hospital, and Fujian Provincial Hospital. Using this dataset, we developed CPFENet, a deep learning-based diagnostic model for CPFE. It accurately differentiates CPFE from COPD, with performance comparable to that of professional radiologists. Additionally, we developed a CPFE score based on radiomic analysis of 3D CT images to quantify disease characteristics. Notably, female patients demonstrated significantly higher CPFE scores than males, suggesting potential sex-specific differences in CPFE. Overall, our study establishes the first diagnostic framework for CPFE, providing a diagnostic model and clinical indicators that enable accurate classification and characterization of the syndrome.

The effect of medical explanations from large language models on diagnostic decisions in radiology

Spitzer, P., Hendriks, D., Rudolph, J., Schläger, S., Ricke, J., Kühl, N., Hoppe, B., Feuerriegel, S.

medRxiv preprint, May 18 2025
Large language models (LLMs) are increasingly used by physicians for diagnostic support. A key advantage of LLMs is the ability to generate explanations that can help physicians understand the reasoning behind a diagnosis. However, the best-suited format for LLM-generated explanations remains unclear. In this large-scale study, we examined the effect of different formats for LLM explanations on clinical decision-making. For this, we conducted a randomized experiment with radiologists reviewing patient cases with radiological images (N = 2020 assessments). Participants received either no LLM support (control group) or were supported by one of three LLM-generated explanations: (1) a standard output providing the diagnosis without explanation; (2) a differential diagnosis comparing multiple possible diagnoses; or (3) a chain-of-thought explanation offering a detailed reasoning process for the diagnosis. We find that the format of explanations significantly influences diagnostic accuracy. The chain-of-thought explanations yielded the best performance, improving the diagnostic accuracy by 12.2% compared to the control condition without LLM support (P = 0.001). The chain-of-thought explanations are also superior to the standard output without explanation (+7.2%; P = 0.040) and the differential diagnosis format (+9.7%; P = 0.004). We further assessed the robustness of these findings across case difficulty and different physician backgrounds such as general vs. specialized radiologists. Evidently, explaining the reasoning for a diagnosis helps physicians to identify and correct potential errors in LLM predictions and thus improve overall decisions. Altogether, the results highlight the importance of how explanations in medical LLMs are generated to maximize their utility in clinical practice. By designing explanations to support the reasoning processes of physicians, LLMs can improve diagnostic performance and, ultimately, patient outcomes.

Foundation versus Domain-Specific Models for Left Ventricular Segmentation on Cardiac Ultrasound

Chao, C.-J., Gu, Y., Kumar, W., Xiang, T., Appari, L., Wu, J., Farina, J. M., Wraith, R., Jeong, J., Arsanjani, R., Garvan, K. C., Oh, J. K., Langlotz, C. P., Banerjee, I., Li, F.-F., Adeli, E.

medRxiv preprint, May 17 2025
The Segment Anything Model (SAM) was fine-tuned on the EchoNet-Dynamic dataset and evaluated on external transthoracic echocardiography (TTE) and Point-of-Care Ultrasound (POCUS) datasets from CAMUS (University Hospital of St Etienne) and Mayo Clinic (99 patients: 58 TTE, 41 POCUS). Fine-tuned SAM was superior or comparable to MedSAM. The fine-tuned SAM also outperformed EchoNet and U-Net models, demonstrating strong generalization, especially on apical 2-chamber (A2C) images (fine-tuned SAM vs. EchoNet: CAMUS-A2C: DSC 0.891 ± 0.040 vs. 0.752 ± 0.196, p<0.0001) and POCUS (DSC 0.857 ± 0.047 vs. 0.667 ± 0.279, p<0.0001). Additionally, the SAM-enhanced workflow reduced annotation time by 50% (11.6 ± 4.5 sec vs. 5.7 ± 1.7 sec, p<0.0001) while maintaining segmentation quality. We demonstrated an effective strategy for fine-tuning a vision foundation model for enhancing clinical workflow efficiency and supporting human-AI collaboration.
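
For readers unfamiliar with the DSC values quoted above, the Dice similarity coefficient measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function name and the nested-list mask format are assumptions made for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient for two binary masks (nested lists of 0/1).

    DSC = 2 * |A intersect B| / (|A| + |B|). Hypothetical helper for illustration.
    """
    flat_pred = [bool(v) for row in pred_mask for v in row]
    flat_true = [bool(v) for row in true_mask for v in row]
    if len(flat_pred) != len(flat_true):
        raise ValueError("masks must have the same number of pixels")

    # Count pixels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_true))
    total = sum(flat_pred) + sum(flat_true)

    # Convention assumed here: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(round(dice_coefficient(predicted, reference), 3))
```

A DSC of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.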

Single View Echocardiographic Analysis for Left Ventricular Outflow Tract Obstruction Prediction in Hypertrophic Cardiomyopathy: A Deep Learning Approach

Kim, J., Park, J., Jeon, J., Yoon, Y. E., Jang, Y., Jeong, H., Lee, S.-A., Choi, H.-M., Hwang, I.-C., Cho, G.-Y., Chang, H.-J.

medRxiv preprint, May 14 2025
Background: Accurate left ventricular outflow tract obstruction (LVOTO) assessment is crucial for hypertrophic cardiomyopathy (HCM) management and prognosis. Traditional methods, requiring multiple views, Doppler, and provocation, are often infeasible, especially where resources are limited. This study aimed to develop and validate a deep learning (DL) model capable of predicting severe LVOTO in HCM patients using only the parasternal long-axis (PLAX) view from transthoracic echocardiography (TTE). Methods: A DL model was trained on PLAX videos extracted from TTE examinations (developmental dataset, n=1,007) to capture both morphological and dynamic motion features, generating a DL index for LVOTO (DLi-LVOTO, range 0-100). Performance was evaluated in an internal test dataset (ITDS, n=87) and externally validated in the distinct hospital dataset (DHDS, n=1,334) and the LVOTO reduction treatment dataset (n=156). Results: The model achieved high accuracy in detecting severe LVOTO (pressure gradient ≥ 50 mmHg), with an area under the receiver operating characteristic curve (AUROC) of 0.97 (95% confidence interval: 0.92-1.00) in ITDS and 0.93 (0.92-0.95) in DHDS. At a DLi-LVOTO threshold of 70, the model demonstrated a specificity of 97.3% and negative predictive value (NPV) of 96.1% in ITDS. In DHDS, a cutoff of 60 yielded a specificity of 94.6% and NPV of 95.5%. DLi-LVOTO also decreased significantly after surgical myectomy or Mavacamten treatment, correlating with reductions in peak pressure gradient (p<0.001 for all). Conclusions: Our DL-based approach predicts severe LVOTO using only the PLAX view from TTE, serving as a complementary tool, particularly in resource-limited settings or when Doppler is unavailable, and for monitoring treatment response.

Multi-Task Deep Learning for Predicting Metabolic Syndrome from Retinal Fundus Images in a Japanese Health Checkup Dataset

Itoh, T., Nishitsuka, K., Fukuma, Y., Wada, S.

medRxiv preprint, May 14 2025
Background: Retinal fundus images provide a noninvasive window into systemic health, offering opportunities for early detection of metabolic disorders such as metabolic syndrome (METS). Objective: This study aimed to develop a deep learning model to predict METS from fundus images obtained during routine health checkups, leveraging a multi-task learning approach. Methods: We retrospectively analyzed 5,000 fundus images from Japanese health checkup participants. Convolutional neural network (CNN) models were trained to classify METS status, incorporating fundus-specific data augmentation strategies and auxiliary regression tasks targeting clinical parameters such as abdominal circumference (AC). Model performance was evaluated using validation accuracy, test accuracy, and the area under the receiver operating characteristic curve (AUC). Results: Models employing fundus-specific augmentation demonstrated more stable convergence and superior validation accuracy compared to general-purpose augmentation. Incorporating AC as an auxiliary task further enhanced performance across architectures. The final ensemble model with test-time augmentation achieved a test accuracy of 0.696 and an AUC of 0.73178. Conclusion: Combining multi-task learning, fundus-specific data augmentation, and ensemble prediction substantially improves deep learning-based METS classification from fundus images. This approach may offer a practical, noninvasive screening tool for metabolic syndrome in general health checkup settings.

Enhancing Liver Fibrosis Measurement: Deep Learning and Uncertainty Analysis Across Multi-Centre Cohorts

Wojciechowska, M. K., Malacrino, S., Windell, D., Culver, E., Dyson, J., UK-AIH Consortium, Rittscher, J.

medRxiv preprint, May 13 2025
[Figure 1: Graphical abstract]
Highlights: A retrospective cohort of liver biopsies collected from over 20 healthcare centres has been assembled. The cohort is characterized on the basis of collagen staining used for liver fibrosis assessment. A computational pipeline for the quantification of collagen from liver histology slides has been developed and applied to the described cohorts. Uncertainty estimation is evaluated as a method to build trust in deep-learning based collagen predictions.
The introduction of digital pathology has revolutionised the way in which histology-based measurements can support large, multi-centre studies. However, pooling data from various centres often reveals significant differences in specimen quality, particularly regarding histological staining protocols. These variations present challenges in reliably quantifying features from stained tissue sections using image analysis. In this study, we investigate the statistical variation of measuring fibrosis across a liver cohort composed of four individual studies from 20 clinical sites across Europe and North America. In a first step, we apply colour consistency measurements to analyse staining variability across this diverse cohort. Subsequently, we use a learnt segmentation model to quantify the collagen proportionate area (CPA) and employ uncertainty mapping to evaluate the quality of the segmentations. Our analysis highlights a lack of standardisation in PicroSirius Red (PSR) staining practices, revealing significant variability in staining protocols across institutions. The deconvolution of the staining of the digitised slides identified the different numbers and types of counterstains used, leading to potentially incomparable results. Our analysis highlights the need for standardised staining protocols to ensure reliable collagen quantification in liver biopsies. The tools and methodologies presented here can be applied to perform slide colour quality control in digital pathology studies, thus enhancing the comparability and reproducibility of fibrosis assessment in the liver and other tissues.

LiteMIL: A Computationally Efficient Transformer-Based MIL for Cancer Subtyping on Whole Slide Images.

Kussaibi, H.

medRxiv preprint, May 12 2025
Purpose: Accurate cancer subtyping is crucial for effective treatment; however, it presents challenges due to overlapping morphology and variability among pathologists. Although deep learning (DL) methods have shown potential, their application to gigapixel whole slide images (WSIs) is often hindered by high computational demands and the need for efficient, context-aware feature aggregation. This study introduces LiteMIL, a computationally efficient transformer-based multiple instance learning (MIL) network combined with Phikon, a pathology-tuned self-supervised feature extractor, for robust and scalable cancer subtyping on WSIs. Methods: Initially, patches were extracted from the TCGA-THYM dataset (242 WSIs, six subtypes) and subsequently fed in real-time to Phikon for feature extraction. To train MILs, features were arranged into uniform bags using a chunking strategy that maintains tissue context while increasing training data. LiteMIL utilizes a learnable query vector within an optimized multi-head attention module for effective feature aggregation. The model's performance was evaluated against established MIL methods on the thymic dataset and three additional TCGA datasets (breast, lung, and kidney cancer). Results: LiteMIL achieved 0.89 ± 0.01 F1 score and 0.99 AUC on the thymic dataset, outperforming other MILs. LiteMIL demonstrated strong generalizability across the external datasets, scoring the best on breast and kidney cancer datasets. Compared to TransMIL, LiteMIL significantly reduces training time and GPU memory usage. Ablation studies confirmed the critical role of the learnable query and layer normalization in enhancing performance and stability. Conclusion: LiteMIL offers a resource-efficient, robust solution. Its streamlined architecture, combined with the compact Phikon features, makes it suitable for integrating into routine histopathological workflows, particularly in resource-limited settings.

Automated scout-image-based estimation of contrast agent dosing: a deep learning approach

Schirrmeister, R., Taleb, L., Friemel, P., Reisert, M., Bamberg, F., Weiss, J., Rau, A.

medRxiv preprint, May 12 2025
We developed and tested a deep-learning-based algorithm for the approximation of contrast agent dosage based on computed tomography (CT) scout images. We prospectively enrolled 817 patients undergoing clinically indicated CT imaging, predominantly of the thorax and/or abdomen. Patient weight was collected by study staff prior to the examination 1) with a weight scale and 2) as self-reported. Based on the scout images, we developed an EfficientNet convolutional neural network pipeline to estimate the optimal contrast agent dose based on patient weight and provide a browser-based user interface as a versatile open-source tool to account for different contrast agent compounds. We additionally analyzed the body-weight-informative CT features by synthesizing representative examples for different weights using in-context learning and dataset distillation. The cohort consisted of 533 thoracic, 70 abdominal and 229 thoracic-abdominal CT scout scans. Self-reported patient weight was statistically significantly lower than manual measurements (75.13 kg vs. 77.06 kg; p < 10^-5, Wilcoxon signed-rank test). Our pipeline predicted patient weight with a mean absolute error of 3.90 ± 0.20 kg (corresponding to a roughly 4.48 - 11.70 ml difference in contrast agent depending on the agent) in 5-fold cross-validation and is publicly available at https://tinyurl.com/ct-scout-weight. Interpretability analysis revealed that both larger anatomical shape and higher overall attenuation were predictive of body weight. Our open-source deep learning pipeline allows for the automatic estimation of accurate contrast agent dosing based on scout images in routine CT imaging studies. This approach has the potential to streamline contrast agent dosing workflows, improve efficiency, and enhance patient safety by providing quick and accurate weight estimates without additional measurements or reliance on potentially outdated records. The model's performance may vary depending on patient positioning and scout image quality, and the approach requires validation on larger patient cohorts and other clinical centers. Author Summary: Automation of medical workflows using AI has the potential to increase reproducibility while saving costs and time. Here, we investigated automating the estimation of the required contrast agent dosage for CT examinations. We trained a deep neural network to predict the body weight from the initial 2D CT scout images that are required prior to the actual CT examination. The predicted weight is then converted to a contrast agent dosage based on contrast-agent-specific conversion factors. To facilitate application in clinical routine, we developed a user-friendly browser-based user interface that allows clinicians to select a contrast agent or input a custom conversion factor to receive dosage suggestions, with local data processing in the browser. We also investigate what image characteristics predict body weight and find plausible relationships such as higher attenuation and larger anatomical shapes correlating with higher body weights. Our work goes beyond prior work by implementing a single model for a variety of anatomical regions, providing an accessible user interface and investigating the predictive characteristics of the images.
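
The weight-to-dose conversion described in this abstract can be illustrated with a short sketch. This is not the authors' pipeline: it assumes a simple linear conversion (dose = weight × factor), and the function names and the 3.0 ml/kg factor are hypothetical, chosen only because 3.90 kg × 3.0 ml/kg reproduces the upper end of the dose difference quoted in the abstract.

```python
def contrast_dose_ml(weight_kg, ml_per_kg):
    """Convert body weight to a contrast volume, assuming a linear ml/kg factor.

    Hypothetical helper; real dosing depends on the agent and on local protocol.
    """
    if weight_kg <= 0 or ml_per_kg <= 0:
        raise ValueError("weight and conversion factor must be positive")
    return weight_kg * ml_per_kg


def dose_error_ml(weight_error_kg, ml_per_kg):
    """Dose difference caused by a given weight estimation error."""
    return abs(weight_error_kg) * ml_per_kg


if __name__ == "__main__":
    factor = 3.0  # assumed ml/kg, illustrative only
    print(contrast_dose_ml(77.06, factor))  # dose for the mean measured weight
    print(dose_error_ml(3.90, factor))      # dose shift at the reported mean absolute error
```

Under this linear assumption, a weight error scales directly into a dose error, which is why the abstract reports the error as a range across agents.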

Automatic Quantification of Ki-67 Labeling Index in Pediatric Brain Tumors Using QuPath

Spyretos, C., Pardo Ladino, J. M., Blomstrand, H., Nyman, P., Snodahl, O., Shamikh, A., Elander, N. O., Haj-Hosseini, N.

medRxiv preprint, May 12 2025
Abstract: The quantification of the Ki-67 labeling index (LI) is critical for assessing tumor proliferation and prognosis in tumors, yet manual scoring remains a common practice. This study presents an automated workflow for Ki-67 scoring in whole slide images (WSIs) using an Apache Groovy script for QuPath, complemented by a Python-based post-processing script, providing cell density maps and summary tables. The tissue and cell segmentation are performed using StarDist, a deep learning model, and adaptive thresholding to classify Ki-67 positive and negative nuclei. The pipeline was applied to a cohort of 632 pediatric brain tumor cases with 734 Ki-67-stained WSIs from the Children's Brain Tumor Network. Medulloblastoma showed the highest Ki-67 LI (median: 19.84), followed by atypical teratoid rhabdoid tumor (median: 19.36). Moderate values were observed in brainstem glioma-diffuse intrinsic pontine glioma (median: 11.50), high-grade glioma (grades 3 & 4) (median: 9.50), and ependymoma (median: 5.88). Lower indices were found in meningioma (median: 1.84), while the lowest were seen in low-grade glioma (grades 1 & 2) (median: 0.85), dysembryoplastic neuroepithelial tumor (median: 0.63), and ganglioglioma (median: 0.50). The results aligned with the consensus in oncology, demonstrating a significant correlation in Ki-67 LI across most of the tumor families/types, with high malignancy tumors showing the highest proliferation indices and lower malignancy tumors exhibiting lower Ki-67 LI. The automated approach facilitates the assessment of large amounts of Ki-67 WSIs in research settings.
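
As a reminder of what the labeling index represents, the sketch below computes it from nucleus counts using the common definition (percentage of Ki-67-positive nuclei among all counted nuclei). It is an illustration under that assumption, not the QuPath or Python scripts from the study; the function name and the example counts are hypothetical.

```python
def ki67_labeling_index(positive_nuclei, negative_nuclei):
    """Ki-67 labeling index as the percentage of positive nuclei.

    Assumes LI = 100 * positive / (positive + negative). Hypothetical helper.
    """
    if positive_nuclei < 0 or negative_nuclei < 0:
        raise ValueError("nucleus counts must be non-negative")
    total = positive_nuclei + negative_nuclei
    if total == 0:
        raise ValueError("no nuclei were counted")
    return 100.0 * positive_nuclei / total


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    print(ki67_labeling_index(50, 200))
```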