
Predicting ADC map quality from T2-weighted MRI: A deep learning approach for early quality assessment to assist point-of-care.

Brender JR, Ota M, Nguyen N, Ford JW, Kishimoto S, Harmon SA, Wood BJ, Pinto PA, Krishna MC, Choyke PL, Turkbey B

PubMed | Jul 17, 2025
Poor-quality prostate MRI images compromise diagnostic accuracy, with diffusion-weighted imaging and the resulting apparent diffusion coefficient (ADC) maps being particularly vulnerable. These maps are critical for prostate cancer diagnosis, yet current methods relying on standardizing technical parameters fail to consistently ensure image quality. We propose a novel deep learning approach to predict low-quality ADC maps using T2-weighted (T2W) images, enabling real-time corrective interventions during imaging. A multi-site dataset of T2W images and ADC maps from 486 patients, spanning 62 external clinics and in-house imaging, was retrospectively analyzed. A neural network was trained to classify ADC map quality as "diagnostic" or "non-diagnostic" based solely on T2W images. Rectal cross-sectional area measurements were evaluated as an interpretable metric for susceptibility-induced distortions. Analysis revealed limited correlation between individual acquisition parameters and image quality, with horizontal phase encoding significant for T2 imaging (p < 0.001, AUC = 0.6735) and vertical resolution for ADC maps (p = 0.006, AUC = 0.6348). By contrast, the neural network achieved robust performance for ADC map quality prediction from T2 images, with 83% sensitivity and 90% negative predictive value in multicenter validation, comparable to single-site models using ADC maps directly. Remarkably, it generalized well to unseen in-house data (94 ± 2% accuracy). Rectal cross-sectional area correlated with ADC quality (AUC = 0.65), offering a simple, interpretable metric. The probability of low-quality, uninterpretable ADC maps can be inferred early in the imaging process by a neural network approach, allowing corrective action to be taken.
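The abstract does not specify the network; as a rough illustration, a standard 2D CNN classifier adapted to single-channel T2W input could look like the following PyTorch sketch (architecture, preprocessing, and class ordering are assumptions, not the authors' implementation):

```python
# Minimal sketch of a binary quality classifier on T2W slices (PyTorch).
import torch
import torch.nn as nn
from torchvision.models import resnet18

class ADCQualityNet(nn.Module):
    """Predicts 'diagnostic' vs 'non-diagnostic' ADC quality from a T2W slice."""
    def __init__(self):
        super().__init__()
        self.backbone = resnet18(weights=None)
        # Single-channel MRI input instead of RGB.
        self.backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                        padding=3, bias=False)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 2)

    def forward(self, x):
        return self.backbone(x)

model = ADCQualityNet()
t2w = torch.randn(4, 1, 256, 256)          # batch of normalized T2W slices
probs = torch.softmax(model(t2w), dim=1)   # [P(non-diagnostic), P(diagnostic)]
```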

Large Language Model-Based Entity Extraction Reliably Classifies Pancreatic Cysts and Reveals Predictors of Malignancy: A Cross-Sectional and Retrospective Cohort Study

Papale, A. J., Flattau, R., Vithlani, N., Mahajan, D., Ziemba, Y., Zavadsky, T., Carvino, A., King, D., Nadella, S.

medRxiv preprint | Jul 17, 2025
Pancreatic cystic lesions (PCLs) are often discovered incidentally on imaging and may progress to pancreatic ductal adenocarcinoma (PDAC). PCLs have a high incidence in the general population, and adherence to screening guidelines can be variable. With the advent of technologies that enable automated text classification, we sought to evaluate various natural language processing (NLP) tools, including large language models (LLMs), for identifying and classifying PCLs from radiology reports. We correlated our classification of PCLs with clinical features to identify risk factors for a positive PDAC biopsy. We contrasted a previously described NLP classifier with LLMs for prospective identification of PCLs in radiology. We evaluated various LLMs for PCL classification into low-risk or high-risk categories based on published guidelines. We compared prompt-based PCL classification to specific entity-guided PCL classification. To this end, we developed tools to deidentify radiology reports and track patients longitudinally based on those reports. Additionally, we used our newly developed tools to evaluate a retrospective database of patients who underwent pancreas biopsy to determine associated factors, including those in their radiology reports and clinical features, using multivariable logistic regression modelling. Of 14,574 prospective radiology reports, 665 (4.6%) described a pancreatic cyst, including 175 (1.2%) high-risk lesions. Our entity-extraction LLM tool achieved recall of 0.992 (95% confidence interval [CI], 0.985-0.998), precision of 0.988 (0.979-0.996), and F1-score of 0.990 (0.985-0.995) for detecting cysts; F1-scores were 0.993 (0.987-0.998) for low-risk and 0.977 (0.952-0.995) for high-risk classification. Among 4,285 biopsy patients, 330 had pancreatic cysts documented ≥6 months before biopsy. In the final multivariable model (AUC = 0.877), independent predictors of adenocarcinoma were change in duct caliber with upstream atrophy (adjusted odds ratio [AOR], 4.94; 95% CI, 1.30-18.79), mural nodules (AOR, 11.02; 1.81-67.26), older age (AOR, 1.10; 1.05-1.16), lower body mass index (AOR, 0.86; 0.76-0.96), and total bilirubin (AOR, 1.81; 1.18-2.77). Automated NLP-based analysis of radiology reports using LLM-driven entity extraction can accurately identify and risk-stratify PCLs and, when retrospectively applied, reveal factors predicting malignant progression. Widespread implementation may improve surveillance and enable earlier intervention.
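The reported adjusted odds ratios (AORs) are the exponentiated coefficients of a multivariable logistic regression. A minimal sketch with statsmodels, on synthetic data with illustrative variable names (not the study's actual cohort or columns):

```python
# Sketch of the multivariable logistic regression step: AOR = exp(beta).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 330
df = pd.DataFrame({
    "duct_change_atrophy": rng.integers(0, 2, n),
    "mural_nodule": rng.integers(0, 2, n),
    "age": rng.normal(68, 10, n),
    "bmi": rng.normal(27, 4, n),
    "total_bilirubin": rng.gamma(2.0, 0.5, n),
})
# Synthetic outcome with a plausible dependence structure.
logit_p = -8 + 1.6 * df["duct_change_atrophy"] + 2.4 * df["mural_nodule"] + 0.1 * df["age"]
df["adenocarcinoma"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

X = sm.add_constant(df.drop(columns="adenocarcinoma"))
fit = sm.Logit(df["adenocarcinoma"], X).fit(disp=0)
aor = np.exp(fit.params)        # adjusted odds ratios
ci = np.exp(fit.conf_int())     # 95% CIs on the odds-ratio scale
print(pd.concat([aor.rename("AOR"), ci], axis=1))
```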

Multi-DECT image-based radiomics with interpretable machine learning for preoperative prediction of tumor budding grade and prognosis in colorectal cancer: a dual-center study.

Lin G, Chen W, Chen Y, Cao J, Mao W, Xia S, Chen M, Xu M, Lu C, Ji J

PubMed | Jul 16, 2025
This study evaluates the predictive ability of multiparametric dual-energy computed tomography (multi-DECT) radiomics for tumor budding (TB) grade and prognosis in patients with colorectal cancer (CRC). The study comprised 510 CRC patients at two institutions. Radiomics features from multi-DECT images (including polyenergetic, virtual monoenergetic, iodine concentration [IC], and effective atomic number images) were screened to build radiomics models using nine machine learning (ML) algorithms. An ML-based fusion model combining clinical-radiological variables and radiomics features was developed. Model performance was assessed with the area under the receiver operating characteristic curve (AUC), and model interpretability with Shapley additive explanations (SHAP). The prognostic significance of the fusion model was determined via survival analysis. CT-reported lymph node status and normalized IC were used to develop a clinical-radiological model. Among the nine examined ML algorithms, extreme gradient boosting (XGB) performed best. The XGB-based fusion model containing multi-DECT radiomics features outperformed the clinical-radiological model in predicting TB grade, with superior AUCs of 0.969 in the training cohort, 0.934 in the internal validation cohort, and 0.897 in the external validation cohort. SHAP analysis identified the variables influencing model predictions. Patients with a model-predicted high TB grade had worse recurrence-free survival (RFS) in both the training (P < 0.001) and internal validation (P = 0.016) cohorts. The XGB-based fusion model using multi-DECT radiomics could serve as a non-invasive tool to preoperatively predict TB grade and RFS in patients with CRC.
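A hedged sketch of the core modeling step follows: an XGBoost classifier explained with SHAP. Features, hyperparameters, and data here are placeholders, not the study's multi-DECT radiomics pipeline:

```python
# Sketch of an XGBoost fusion model with SHAP interpretability.
import shap
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Stand-in for the screened radiomics + clinical-radiological feature matrix.
X, y = make_classification(n_samples=510, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

model = xgb.XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.05,
                          eval_metric="auc")
model.fit(X_train, y_train)

# SHAP values quantify each feature's contribution to each prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_val)
shap.summary_plot(shap_values, X_val, show=False)
```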

Multimodal Large Language Model With Knowledge Retrieval Using Flowchart Embedding for Forming Follow-Up Recommendations for Pancreatic Cystic Lesions.

Zhu Z, Liu J, Hong CW, Houshmand S, Wang K, Yang Y

PubMed | Jul 16, 2025
BACKGROUND. The American College of Radiology (ACR) Incidental Findings Committee (IFC) algorithm provides guidance for pancreatic cystic lesion (PCL) management. Its implementation using plain-text large language model (LLM) solutions is challenging given that key components include multimodal data (e.g., figures and tables). OBJECTIVE. The purpose of this study was to evaluate a multimodal LLM approach incorporating knowledge retrieval using flowchart embedding for forming follow-up recommendations for PCL management. METHODS. This retrospective study included patients who underwent abdominal CT or MRI from September 1, 2023, to September 1, 2024, and whose report mentioned a PCL. The reports' Findings sections were input to a multimodal LLM (GPT-4o). For task 1 (198 patients: mean age, 69.0 ± 13.0 [SD] years; 110 women, 88 men), the LLM assessed PCL features (presence of PCL, PCL size and location, presence of main pancreatic duct communication, presence of worrisome features or high-risk stigmata) and formed a follow-up recommendation using three knowledge retrieval methods (default knowledge; plain-text retrieval-augmented generation [RAG] from the ACR IFC algorithm PDF document; and flowchart embedding using the LLM's image-to-text conversion for in-context integration of the document's flowcharts and tables). For task 2 (85 patients: mean initial age, 69.2 ± 10.8 years; 48 women, 37 men), an additional relevant prior report was input; the LLM assessed interval PCL change and provided an adjusted follow-up schedule accounting for prior imaging using flowchart embedding. Three radiologists assessed LLM accuracy in task 1 for PCL findings in consensus and for follow-up recommendations independently; one radiologist assessed accuracy in task 2. RESULTS. For task 1, the LLM with flowchart embedding had an accuracy for PCL features of 98.0-99.0%. The accuracy of the LLM follow-up recommendations based on default knowledge, plain-text RAG, and flowchart embedding was 42.4%, 23.7%, and 89.9% (p < .001) for radiologist 1; 39.9%, 24.2%, and 91.9% (p < .001) for radiologist 2; and 40.9%, 25.3%, and 91.9% (p < .001) for radiologist 3. For task 2, the LLM using flowchart embedding showed an accuracy of 96.5% for interval PCL change and 81.2% for adjusted follow-up schedules. CONCLUSION. Multimodal flowchart embedding aided the LLM's automated provision of follow-up recommendations adherent to a clinical guidance document. CLINICAL IMPACT. The framework could be extended to other incidental findings through the use of other clinical guidance documents as model input.
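A minimal sketch of the flowchart-embedding idea using the OpenAI Python client: the report's Findings text and a rendered guideline flowchart image are passed together to GPT-4o. The prompt wording, file name, and report text below are illustrative assumptions, not the study's materials:

```python
# Sketch: send report Findings plus a guideline flowchart image to GPT-4o.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("acr_ifc_pcl_flowchart.png", "rb") as f:  # hypothetical file
    flowchart_b64 = base64.b64encode(f.read()).decode()

findings = "Pancreas: 14-mm cystic lesion in the pancreatic body ..."

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Using the attached management flowchart, state the "
                     "follow-up recommendation for the pancreatic cystic "
                     "lesion in this report:\n" + findings},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{flowchart_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```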

Deep learning for appendicitis: development of a three-dimensional localization model on CT.

Takaishi T, Kawai T, Kokubo Y, Fujinaga T, Ojio Y, Yamamoto T, Hayashi K, Owatari Y, Ito H, Hiwatashi A

PubMed | Jul 16, 2025
To develop and evaluate a deep learning model for detecting appendicitis on abdominal CT. This retrospective single-center study included 567 CTs of appendicitis patients (330 males, age range 20-96 years) obtained between 2011 and 2020, randomly split into training (n = 517) and validation (n = 50) sets. The validation set was supplemented with 50 control CTs performed for acute abdomen. For a test dataset, 100 appendicitis CTs and 100 control CTs were consecutively collected from a separate period after 2021. Exclusion criteria were age < 20 years, perforation, unclear appendix, and appendix tumors. Appendicitis CTs were annotated with three-dimensional bounding boxes encompassing the inflamed appendices. CT protocols were unenhanced, with a 5-mm slice thickness and a 512 × 512-pixel matrix. The deep learning algorithm was based on the Faster Region-based Convolutional Neural Network (Faster R-CNN). Two board-certified radiologists visually graded model predictions on the test dataset using a 5-point Likert scale (0: no detection, 1: false, 2: poor, 3: fair, 4: good), with scores ≥ 3 considered true positives. Inter-rater agreement was assessed using weighted kappa statistics. The effects of intra-abdominal fat, periappendiceal fat stranding, presence of an appendicolith, and appendix diameter on the model's recall were analyzed using binary logistic regression. The model showed a precision of 0.66 (87/132), a recall of 0.87 (87/100), and a false-positive rate per patient of 0.23 (45/200). The inter-rater agreement for Likert scores of 2-4 was κ = 0.76. The logistic regression analysis showed that only intra-abdominal fat had a significant impact on the model's precision (p = 0.02). We developed a model capable of detecting appendicitis on CT with a three-dimensional bounding box.
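As a rough 2D analogue of the detection setup (the study itself predicts three-dimensional boxes, which torchvision's stock Faster R-CNN does not), a fine-tuning sketch might look like this:

```python
# 2D Faster R-CNN sketch with a two-class head: background vs inflamed appendix.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = fasterrcnn_resnet50_fpn(weights=None)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

model.train()
# CT slice replicated to 3 channels to match the default input pipeline.
images = [torch.randn(3, 512, 512)]
targets = [{"boxes": torch.tensor([[120.0, 200.0, 180.0, 260.0]]),  # x1,y1,x2,y2
            "labels": torch.tensor([1])}]
loss_dict = model(images, targets)  # classification + box-regression losses
print({k: float(v) for k, v in loss_dict.items()})
```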

Evaluating Artificial Intelligence-Assisted Prostate Biparametric MRI Interpretation: An International Multireader Study.

Gelikman DG, Yilmaz EC, Harmon SA, Huang EP, An JY, Azamat S, Law YM, Margolis DJA, Marko J, Panebianco V, Esengur OT, Lin Y, Belue MJ, Gaur S, Bicchetti M, Xu Z, Tetreault J, Yang D, Xu D, Lay NS, Gurram S, Shih JH, Merino MJ, Lis R, Choyke PL, Wood BJ, Pinto PA, Turkbey B

PubMed | Jul 16, 2025
Background: Variability in prostate biparametric MRI (bpMRI) interpretation limits diagnostic reliability for prostate cancer (PCa). Artificial intelligence (AI) has the potential to reduce this variability and improve diagnostic accuracy. Objective: The objective of this study was to evaluate the impact of a deep learning AI model on lesion- and patient-level clinically significant PCa (csPCa) and PCa detection rates and on interreader agreement in bpMRI interpretations. Methods: This retrospective, multireader, multicenter study used a balanced incomplete block design for MRI randomization. Six radiologists of varying experience interpreted bpMRI scans with and without AI assistance in alternating sessions. The reference standard for lesion-level detection was whole-mount pathology after radical prostatectomy for cases and negative 12-core systematic biopsies for control patients. In all, 180 patients (120 in the case group, 60 in the control group) who underwent mpMRI and prostate biopsy or radical prostatectomy between January 2013 and December 2022 were included. Lesion-level sensitivity, PPV, patient-level AUC for csPCa and PCa detection, and interreader agreement in lesion-level PI-RADS scores and size measurements were assessed. Results: AI assistance improved lesion-level PPV (PI-RADS ≥ 3: 77.2% [95% CI, 71.0-83.1%] vs 67.2% [61.1-72.2%] for csPCa; 80.9% [75.2-85.7%] vs 69.4% [63.4-74.1%] for PCa; both p < .001), reduced lesion-level sensitivity (PI-RADS ≥ 3: 44.4% [38.6-50.5%] vs 48.0% [42.0-54.2%] for csPCa, p = .01; 41.7% [37.0-47.4%] vs 44.9% [40.5-50.2%] for PCa, p = .01), and made no difference in patient-level AUC (0.822 [95% CI, 0.768-0.866] vs 0.832 [0.787-0.868] for csPCa, p = .61; 0.833 [0.782-0.874] vs 0.835 [0.792-0.871] for PCa, p = .91). AI assistance improved interreader agreement for lesion-level PI-RADS scores (κ = 0.748 [95% CI, 0.701-0.796] vs 0.336 [0.288-0.381], p < .001), lesion size measurements (coverage probability of 0.397 [0.376-0.419] vs 0.367 [0.349-0.383], p < .001), and patient-level PI-RADS scores (κ = 0.704 [0.627-0.767] vs 0.507 [0.421-0.584], p < .001). Conclusion: AI improved lesion-level PPV and interreader agreement at the cost of slightly lower lesion-level sensitivity. Clinical Impact: AI may enhance consistency and reduce false positives in bpMRI interpretations. Further optimization is required to improve sensitivity without compromising specificity.
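Interreader agreement of the kind reported here is commonly computed as a weighted Cohen's kappa. A minimal sketch for one reader pair, with illustrative PI-RADS scores (the study used six readers and its own statistical design):

```python
# Quadratic-weighted Cohen's kappa on lesion-level PI-RADS scores.
from sklearn.metrics import cohen_kappa_score

reader1_pirads = [3, 4, 5, 3, 4, 2, 5, 4]
reader2_pirads = [3, 4, 4, 3, 5, 2, 5, 3]

kappa = cohen_kappa_score(reader1_pirads, reader2_pirads, weights="quadratic")
print(f"weighted kappa = {kappa:.3f}")
```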

Identifying Signatures of Image Phenotypes to Track Treatment Response in Liver Disease

Matthias Perkonigg, Nina Bastati, Ahmed Ba-Ssalamah, Peter Mesenbrink, Alexander Goehler, Miljen Martic, Xiaofei Zhou, Michael Trauner, Georg Langs

arXiv preprint | Jul 16, 2025
Quantifiable image patterns associated with disease progression and treatment response are critical tools for guiding individual treatment and for developing novel therapies. Here, we show that unsupervised machine learning can identify a pattern vocabulary of liver tissue in magnetic resonance images that quantifies treatment response in diffuse liver disease. Deep clustering networks simultaneously encode and cluster patches of medical images into a low-dimensional latent space to establish a tissue vocabulary. The resulting tissue types capture differential tissue change and its location in the liver associated with treatment response. We demonstrate the utility of the vocabulary on a randomized controlled trial cohort of non-alcoholic steatohepatitis patients. First, we use the vocabulary to compare longitudinal liver change in a placebo and a treatment cohort. Results show that the method identifies specific liver tissue change pathways associated with treatment and enables better separation between treatment groups than established non-imaging measures. Moreover, we show that the vocabulary can predict biopsy-derived features from non-invasive imaging data. Finally, we validate the method on a separate replication cohort to demonstrate its applicability.
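A toy sketch of the deep-clustering idea: encode image patches into a low-dimensional latent space and cluster them into a tissue "vocabulary". The encoder, training objective, and data below are stand-ins, not the paper's network:

```python
# Encode liver MRI patches to a latent space, then cluster into tissue types.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

encoder = nn.Sequential(                   # toy patch encoder
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 16),             # 16-D latent space
)

patches = torch.randn(1000, 1, 32, 32)     # stand-in for liver MRI patches
with torch.no_grad():
    z = encoder(patches).numpy()

vocab = KMeans(n_clusters=10, n_init=10, random_state=0).fit(z)
tissue_type = vocab.labels_                # one vocabulary entry per patch
```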

Image quality and radiation dose of reduced-dose abdominopelvic computed tomography (CT) with silver filter and deep learning reconstruction.

Otgonbaatar C, Jeon SH, Cha SJ, Shim H, Kim JW, Ahn JH

PubMed | Jul 16, 2025
To compare image quality and radiation dose between reduced-dose abdominopelvic CT with deep learning reconstruction (DLR) using a SilverBeam filter and standard-dose CT with iterative reconstruction (IR). In total, 182 patients (mean age ± standard deviation, 63 ± 14 years; 100 men) were included. Standard-dose scanning was performed with a tube voltage of 100 kVp, automatic tube current modulation, and IR reconstruction, whereas reduced-dose scanning was performed with a tube voltage of 120 kVp, a SilverBeam filter, and DLR. Additionally, a contrast-enhanced (CE)-boost image was obtained for the reduced-dose scan. Radiation dose and objective and subjective image analyses were performed for each body mass index (BMI) category. The radiation dose for SilverBeam with DLR was significantly lower than that for standard dose with IR, with an average effective-dose reduction of 59.0% (1.87 vs. 4.57 mSv). Standard dose with IR (10.59 ± 1.75) and SilverBeam with DLR (10.60 ± 1.08) showed no significant difference in image noise (p = 0.99). In the obese group (BMI > 25 kg/m²), there were no significant differences in the SNRs of the liver, pancreas, and spleen between standard dose with IR and SilverBeam with DLR. SilverBeam with DLR + CE-boost demonstrated significantly better SNRs and CNRs than standard dose with IR and SilverBeam with DLR alone. DLR combined with a silver filter is effective for routine abdominopelvic CT, achieving a clearly reduced radiation dose while providing image quality non-inferior to that of standard dose with IR.
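The objective analysis presumably rests on ROI-based SNR and CNR; a sketch using the common mean/SD definitions (the paper's exact formulas and ROI placement are not stated here):

```python
# SNR and CNR from ROI statistics; toy HU samples stand in for real ROIs.
import numpy as np

def snr(roi: np.ndarray) -> float:
    """Signal-to-noise ratio of a homogeneous ROI: mean HU / SD."""
    return roi.mean() / roi.std()

def cnr(roi_a: np.ndarray, roi_b: np.ndarray, noise_sd: float) -> float:
    """Contrast-to-noise ratio between two tissues."""
    return abs(roi_a.mean() - roi_b.mean()) / noise_sd

rng = np.random.default_rng(0)
liver = rng.normal(110, 12, size=500)   # toy HU samples
fat = rng.normal(-90, 12, size=500)
print(f"SNR(liver) = {snr(liver):.1f}, CNR(liver/fat) = {cnr(liver, fat, 12):.1f}")
```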

Automatic segmentation of liver structures in multi-phase MRI using variants of nnU-Net and Swin UNETR.

Raab F, Strotzer Q, Stroszczynski C, Fellner C, Einspieler I, Haimerl M, Lang EW

PubMed | Jul 16, 2025
Accurate segmentation of the liver parenchyma, portal veins, hepatic veins, and lesions from MRI is important for hepatic disease monitoring and treatment. Multi-phase contrast-enhanced imaging is superior to single-phase approaches in distinguishing hepatic structures, but automated approaches for detailed segmentation of hepatic structures are lacking. This study evaluates deep learning architectures for segmenting liver structures from multi-phase Gd-EOB-DTPA-enhanced T1-weighted VIBE MRI scans. We utilized 458 T1-weighted VIBE scans of pathological livers, with 78 manually labeled for liver parenchyma, hepatic and portal veins, aorta, lesions, and ascites. An additional dataset of 47 labeled subjects was used for cross-scanner evaluation. Three models were evaluated using nested cross-validation: the conventional nnU-Net, the ResEnc nnU-Net, and the Swin UNETR. The late arterial phase was identified as the optimal fixed phase for co-registration. Both nnU-Net variants outperformed Swin UNETR across most tasks. The conventional nnU-Net achieved the highest segmentation performance for liver parenchyma (DSC: 0.97; 95% CI 0.97, 0.98), portal vein (DSC: 0.83; 95% CI 0.80, 0.87), and hepatic vein (DSC: 0.78; 95% CI 0.77, 0.80). Lesion and ascites segmentation proved challenging for all models, with the conventional nnU-Net performing best. This study demonstrates the effectiveness of deep learning, particularly nnU-Net variants, for detailed liver structure segmentation from multi-phase MRI. The developed models and preprocessing pipeline offer potential for improved liver disease assessment and surgical planning in clinical practice.
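Segmentation quality here is reported as the Dice similarity coefficient, DSC = 2|A∩B| / (|A| + |B|); a minimal NumPy implementation:

```python
# Dice similarity coefficient between predicted and reference binary masks.
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    return 2.0 * np.logical_and(pred, ref).sum() / denom if denom else 1.0

pred = np.zeros((64, 64, 64), dtype=bool); pred[20:40, 20:40, 20:40] = True
ref = np.zeros_like(pred);                 ref[22:40, 20:40, 20:40] = True
print(f"DSC = {dice(pred, ref):.3f}")
```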

Comparative study of 2D vs. 3D AI-enhanced ultrasound for fetal crown-rump length evaluation in the first trimester.

Zhang Y, Huang Y, Chen C, Hu X, Pan W, Luo H, Huang Y, Wang H, Cao Y, Yi Y, Xiong Y, Ni D

PubMed | Jul 16, 2025
Accurate fetal growth evaluation is crucial for monitoring fetal health, with crown-rump length (CRL) being the gold standard for estimating gestational age and assessing growth during the first trimester. To enhance the accuracy and efficiency of CRL evaluation, we developed an artificial intelligence (AI)-based model (3DCRL-Net) using the 3D U-Net architecture for automatic landmark detection, CRL plane localization, and measurement in 3D ultrasound. We then compared its performance to that of experienced radiologists using both 2D and 3D ultrasound for fetal growth assessment. This prospective consecutive study collected fetal data from 1,326 ultrasound screenings conducted at 11-14 weeks of gestation (June 2021 to June 2023). Three experienced radiologists performed fetal screening using 2D video (2D-RAD) and 3D volume (3D-RAD) to obtain the CRL plane and measurement. The 3DCRL-Net model automatically outputs the landmark positions, CRL plane localization, and measurement. Three specialists audited the planes obtained by the radiologists and 3DCRL-Net as standard or non-standard. Landmark detection, plane localization, measurement accuracy, and time efficiency were evaluated on the internal testing dataset against 3D-RAD. On the external dataset, CRL plane localization, measurement accuracy, and time efficiency were compared among the three groups. The internal dataset consisted of 126 cases in the testing set (training:validation:testing = 8:1:1), and the external dataset included 245 cases. On the internal testing set, 3DCRL-Net achieved a mean absolute distance error of 1.81 mm for the nine landmarks, higher accuracy in standard plane localization than 3D-RAD (91.27% vs. 80.16%), and strong consistency in CRL measurements (mean absolute error [MAE]: 1.26 mm; mean difference: 0.37 mm, P = 0.70). The average time required per fetal case was 2.02 s for 3DCRL-Net versus 2 min for 3D-RAD (P < 0.001). On the external testing dataset, 3DCRL-Net demonstrated high performance in standard plane localization, comparable to 2D-RAD and 3D-RAD (accuracy: 91.43% vs. 93.06% vs. 86.12%), with strong consistency in CRL measurements relative to 2D-RAD (MAE: 1.58 mm; mean difference: 1.12 mm, P = 0.25). For 2D-RAD vs. 3DCRL-Net, the Pearson correlation and R² were 0.96 and 0.93, respectively, with an MAE of 0.11 ± 0.12 weeks. The average time required per fetal case was 5 s for 3DCRL-Net, compared with 2 min for 3D-RAD and 35 s for 2D-RAD (P < 0.001). The 3DCRL-Net model provides a rapid, accurate, and fully automated solution for CRL measurement in 3D ultrasound, achieving expert-level performance and significantly improving the efficiency and reliability of first-trimester fetal growth assessment.
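Once the crown and rump landmarks are detected, the measurement step reduces to a Euclidean distance; gestational age can then be estimated from CRL, e.g., with the Robinson formula (the choice of dating formula is an assumption here, not stated in the abstract):

```python
# CRL from two 3D landmarks, then gestational age via the Robinson formula.
import numpy as np

def crl_mm(crown: np.ndarray, rump: np.ndarray, spacing_mm: float) -> float:
    """CRL from two 3D landmark coordinates given isotropic voxel spacing."""
    return float(np.linalg.norm((crown - rump) * spacing_mm))

def gestational_age_weeks(crl: float) -> float:
    """Robinson & Fleming: GA (days) = 8.052 * sqrt(CRL in mm) + 23.73."""
    return (8.052 * np.sqrt(crl) + 23.73) / 7.0

crown, rump = np.array([40, 62, 118]), np.array([40, 60, 10])  # voxel indices
crl = crl_mm(crown, rump, spacing_mm=0.5)
print(f"CRL = {crl:.1f} mm, GA ≈ {gestational_age_weeks(crl):.1f} weeks")
```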