Page 87 of 1591585 results

Test-Time-Scaling for Zero-Shot Diagnosis with Visual-Language Reasoning

Ji Young Byun, Young-Jin Park, Navid Azizan, Rama Chellappa

arXiv preprint, Jun 11 2025
As a cornerstone of patient care, clinical decision-making significantly influences patient outcomes and can be enhanced by large language models (LLMs). Although LLMs have demonstrated remarkable performance, their application to visual question answering in medical imaging, particularly for reasoning-based diagnosis, remains largely unexplored. Furthermore, supervised fine-tuning for reasoning tasks is largely impractical due to limited data availability and high annotation costs. In this work, we introduce a zero-shot framework for reliable medical image diagnosis that enhances the reasoning capabilities of LLMs in clinical settings through test-time scaling. Given a medical image and a corresponding textual prompt, a vision-language model generates multiple descriptions or interpretations of visual features. These interpretations are then fed to an LLM, where a test-time scaling strategy consolidates multiple candidate outputs into a reliable final diagnosis. We evaluate our approach across various medical imaging modalities, including radiology, ophthalmology, and histopathology, and demonstrate that the proposed test-time scaling strategy enhances diagnostic accuracy for both our and baseline methods. Additionally, we provide an empirical analysis showing that the proposed approach, which allows unbiased prompting in the first stage, improves the reliability of LLM-generated diagnoses and enhances classification accuracy.
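
The abstract above does not specify how candidate outputs are consolidated, so the following is only a minimal sketch of one common test-time scaling idea, majority voting over sampled answers. The function name `consolidate_diagnoses` and the sample labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def consolidate_diagnoses(candidates):
    """Return (label, vote_share) for the most frequent candidate diagnosis.

    Assumption: plain majority voting; the paper's actual strategy may differ.
    """
    if not candidates:
        raise ValueError("need at least one candidate diagnosis")
    # Normalize case and whitespace so trivially different strings share a vote.
    normalized = [c.strip().lower() for c in candidates]
    counts = Counter(normalized)
    # most_common orders equal counts by first occurrence in the input.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(normalized)


# Hypothetical candidate outputs sampled from an LLM for a single image.
samples = ["Pneumonia", "pneumonia ", "Normal", "Pneumonia", "Effusion"]
label, share = consolidate_diagnoses(samples)
print(label, share)
```

The vote share can serve as a rough agreement signal: a low share suggests the sampled candidates disagree and the case may merit review.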

ADAgent: LLM Agent for Alzheimer's Disease Analysis with Collaborative Coordinator

Wenlong Hou, Guangqian Yang, Ye Du, Yeung Lau, Lihao Liu, Junjun He, Ling Long, Shujun Wang

arXiv preprint, Jun 11 2025
Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative disease. Early and precise diagnosis of AD is crucial for timely intervention and treatment planning to alleviate the progressive neurodegeneration. However, most existing methods rely on single-modality data, which contrasts with the multifaceted approach used by medical experts. While some deep learning approaches process multi-modal data, they are limited to specific tasks with a small set of input modalities and cannot handle arbitrary combinations. This highlights the need for a system that can address diverse AD-related tasks, process multi-modal or missing input, and integrate multiple advanced methods for improved performance. In this paper, we propose ADAgent, the first specialized AI agent for AD analysis, built on a large language model (LLM) to address user queries and support decision-making. ADAgent integrates a reasoning engine, specialized medical tools, and a collaborative outcome coordinator to facilitate multi-modal diagnosis and prognosis tasks in AD. Extensive experiments demonstrate that ADAgent outperforms SOTA methods, achieving significant improvements in accuracy, including a 2.7% increase in multi-modal diagnosis, a 0.7% improvement in multi-modal prognosis, and enhancements in MRI and PET diagnosis tasks.

Autonomous Computer Vision Development with Agentic AI

Jin Kim, Muhammad Wahi-Anwa, Sangyun Park, Shawn Shin, John M. Hoffman, Matthew S. Brown

arXiv preprint, Jun 11 2025
Agentic Artificial Intelligence (AI) systems leveraging Large Language Models (LLMs) exhibit significant potential for complex reasoning, planning, and tool utilization. We demonstrate that a specialized computer vision system can be built autonomously from a natural language prompt using Agentic AI methods. This involved extending SimpleMind (SM), an open-source Cognitive AI environment with configurable tools for medical image analysis, with an LLM-based agent, implemented using OpenManus, to automate the planning (tool configuration) for a particular computer vision task. We provide a proof-of-concept demonstration that an agentic system can interpret a computer vision task prompt and plan a corresponding SimpleMind workflow by decomposing the task and configuring appropriate tools. From the user input prompt, "provide sm (SimpleMind) config for lungs, heart, and ribs segmentation for cxr (chest x-ray)", the agent LLM was able to generate the plan (a tool configuration file in YAML format) and execute the SM-Learn (training) and SM-Think (inference) scripts autonomously. The computer vision agent automatically configured, trained, and tested itself on 50 chest x-ray images, achieving mean Dice scores of 0.96, 0.82, and 0.83 for lungs, heart, and ribs, respectively. This work shows the potential for autonomous planning and tool configuration that has traditionally been performed by a data scientist in the development of computer vision applications.
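
For readers unfamiliar with the Dice score reported above, here is a minimal sketch of how it is computed for two binary segmentation masks. The function name `dice_score`, the flat-list mask representation, and the empty-mask convention are illustrative assumptions, not part of SimpleMind.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy six-pixel masks: two pixels overlap, each mask has three foreground pixels.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
print(dice_score(pred, truth))
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels.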

Cross-dataset Evaluation of Dementia Longitudinal Progression Prediction Models

Zhang, C., An, L., Wulan, N., Nguyen, K.-N., Orban, C., Chen, P., Chen, C., Zhou, J. H., Liu, K., Yeo, B. T. T., Alzheimer's Disease Neuroimaging Initiative, Australian Imaging Biomarkers and Lifestyle Study of Aging

medRxiv preprint, Jun 11 2025
Introduction: Accurately predicting Alzheimer's Disease (AD) progression is useful for clinical care. The 2019 TADPOLE (The Alzheimer's Disease Prediction Of Longitudinal Evolution) challenge evaluated 92 algorithms from 33 teams worldwide. Unlike typical clinical prediction studies, TADPOLE accommodates (1) a variable number of observed timepoints across patients, (2) missing data across modalities and visits, and (3) prediction over an open-ended time horizon, which better reflects real-world data. However, TADPOLE only used the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, so how well top algorithms generalize to other cohorts remains unclear. Methods: We tested five algorithms in three external datasets covering 2,312 participants and 13,200 timepoints. The algorithms included FROG, the overall TADPOLE winner, which used a unique Longitudinal-to-Cross-sectional (L2C) transformation to convert variable-length longitudinal histories into feature vectors of the same length across participants. We also considered two FROG variants. One variant replaced all XGBoost models from the original FROG with a single feedforward neural network (FNN), which we refer to as L2C-FNN. We also included the minimal recurrent neural network (MinimalRNN), which was ranked second at publication time, as well as AD Course Map (AD-Map), which outperformed MinimalRNN at publication time. All five models (three FROG variants, MinimalRNN, and AD-Map) were trained on ADNI and tested on the external datasets. Results: L2C-FNN performed the best overall. For predicting cognition and ventricle volume, L2C-FNN and AD-Map were the best. For clinical diagnosis prediction, L2C-FNN was the best, while AD-Map was the worst. L2C-FNN also maintained its edge over other models regardless of the number of observed timepoints and regardless of the prediction horizon, from 0 to 6 years into the future.
Conclusions: L2C-FNN shows strong potential for both short-term and long-term dementia progression prediction. Pretrained ADNI models are available: https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/predict_phenotypes/Zhang2025_L2CFNN.

Slide-free surface histology enables rapid colonic polyp interpretation across specialties and foundation AI

Yong, A., Husna, N., Tan, K. H., Manek, G., Sim, R., Loi, R., Lee, O., Tang, S., Soon, G., Chan, D., Liang, K.

medRxiv preprint, Jun 11 2025
Colonoscopy is a mainstay of colorectal cancer screening and has helped to lower cancer incidence and mortality. The resection of polyps during colonoscopy is critical for tissue diagnosis and prevention of colorectal cancer, albeit resulting in increased resource requirements and expense. Discarding resected benign polyps without sending them for histopathological processing and confirmatory diagnosis, known as the resect and discard strategy, could enhance efficiency but is not commonly practiced due to endoscopists' predominant preference for pathological confirmation. The inaccessibility of histopathology from unprocessed resected tissue hampers endoscopic decisions. We show that intraprocedural fibre-optic microscopy with ultraviolet-C surface excitation (FUSE) of polyps post-resection enables rapid diagnosis, potentially complementing endoscopic interpretation and incorporating pathologist oversight. In a clinical study of 28 patients, slide-free FUSE microscopy of freshly resected polyps yielded mucosal views that greatly magnified the surface patterns observed on endoscopy and revealed previously unavailable histopathological signatures. We term this new cross-specialty readout surface histology. In blinded interpretations of 42 polyps (19 training, 23 reading) by endoscopists and pathologists of varying experience, surface histology differentiated normal/benign, low-grade dysplasia, and high-grade dysplasia and cancer, with 100% performance in classifying high/low risk. This FUSE dataset was also successfully interpreted by foundation AI models pretrained on histopathology slides, illustrating a new potential for these models to not only expedite conventional pathology tasks but also autonomously provide instant expert feedback during procedures that typically lack pathologists. Surface histology readouts during colonoscopy promise to empower endoscopist decisions and broadly enhance confidence and participation in resect and discard.
One Sentence Summary: Rapid microscopy of resected polyps during colonoscopy yielded accurate diagnoses, promising to enhance colorectal screening.

AI-based Hepatic Steatosis Detection and Integrated Hepatic Assessment from Cardiac CT Attenuation Scans Enhances All-cause Mortality Risk Stratification: A Multi-center Study

Yi, J., Patel, K., Miller, R. J., Marcinkiewicz, A. M., Shanbhag, A., Hijazi, W., Dharmavaram, N., Lemley, M., Zhou, J., Zhang, W., Liang, J. X., Ramirez, G., Builoff, V., Slipczuk, L., Travin, M., Alexanderson, E., Carvajal-Juarez, I., Packard, R. R., Al-Mallah, M., Ruddy, T. D., Einstein, A. J., Feher, A., Miller, E. J., Acampa, W., Knight, S., Le, V., Mason, S., Calsavara, V. F., Chareonthaitawee, P., Wopperer, S., Kwan, A. C., Wang, L., Berman, D. S., Dey, D., Di Carli, M. F., Slomka, P.

medRxiv preprint, Jun 11 2025
Background: Hepatic steatosis (HS) is a common cardiometabolic risk factor frequently present but under-diagnosed in patients with suspected or known coronary artery disease. We used artificial intelligence (AI) to automatically quantify hepatic tissue measures for identifying HS from CT attenuation correction (CTAC) scans during myocardial perfusion imaging (MPI) and evaluate their added prognostic value for all-cause mortality prediction. Methods: This study included 27039 consecutive patients [57% male] with MPI scans from nine sites. We used an AI model to segment liver and spleen on low dose CTAC scans and quantify the liver measures, and the difference of liver minus spleen (LmS) measures. HS was defined as mean liver attenuation < 40 Hounsfield units (HU) or LmS attenuation < -10 HU. Additionally, we used seven sites to develop an AI liver risk index (LIRI) for comprehensive hepatic assessment by integrating the hepatic measures and two external sites to validate its improved prognostic value and generalizability for all-cause mortality prediction over HS. Findings: Median (interquartile range [IQR]) age was 67 [58, 75] years and body mass index (BMI) was 29.5 [25.5, 34.7] kg/m², with diabetes in 8950 (33%) patients. The algorithm identified HS in 6579 (24%) patients. During median [IQR] follow-up of 3.58 [1.86, 5.15] years, 4836 (18%) patients died. HS was associated with increased mortality risk overall (adjusted hazard ratio (HR): 1.14 [1.05, 1.24], p=0.0016) and in subpopulations. LIRI provided higher prognostic value than HS after adjustments overall (adjusted HR 1.5 [1.32, 1.69], p<0.0001 vs HR 1.16 [1.02, 1.31], p=0.0204) and in subpopulations. Interpretations: AI-based hepatic measures automatically identify HS from CTAC scans in patients undergoing MPI without additional radiation dose or physician interaction. Integrated liver assessment combining multiple hepatic imaging measures improved risk stratification for all-cause mortality.
Funding: National Heart, Lung, and Blood Institute/National Institutes of Health. Research in context. Evidence before this study: Existing studies show that fully automated hepatic quantification analysis from chest computed tomography (CT) scans is feasible. While hepatic measures show significant potential for improving risk stratification and patient management, CT attenuation correction (CTAC) scans from patients undergoing myocardial perfusion imaging (MPI) have rarely been utilized for concurrent and automated volumetric hepatic analysis beyond their current use for attenuation correction and coronary artery calcium burden assessment. We conducted a literature review on PubMed and Google Scholar on April 1st, 2025, using the following keywords: ("liver" OR "hepatic") AND ("quantification" OR "measure") AND ("risk stratification" OR "survival analysis" OR "prognosis" OR "prognostic prediction") AND ("CT" OR "computed tomography"). Previous studies have established approaches for the identification of hepatic steatosis (HS) and its prognostic value in various small-scale cohorts using either invasive biopsy or non-invasive imaging approaches. However, in CT-based non-invasive imaging, existing research predominantly focuses on manual region-of-interest (ROI)-based hepatic quantification from selected CT slices or on identifying hepatic steatosis without comprehensive prognostic assessment in large-scale and multi-site cohorts, which hinders the association evaluation of hepatic steatosis for risk stratification in clinical routine with less precise estimates, weak statistical reliability, and limited subgroup analysis to assess bias effects. No existing studies investigated the prognostic value of hepatic steatosis measured in consecutive patients undergoing MPI. These patients usually present with multiple cardiovascular risk factors such as hypertension, dyslipidemia, diabetes and family history of coronary disease.
Whether hepatic measures could provide added prognostic value over existing cardiometabolic factors is unknown. Furthermore, despite the diverse hepatic measures on CT in existing literature, integrated AI-based assessment has not been investigated before, though it may further improve risk stratification over HS. Lastly, previous research relied on dedicated CT scans performed for screening purposes. CTAC scans obtained routinely with MPI had never been utilized for automated HS detection and prognostic evaluation, despite being readily available at no additional cost or radiation exposure. Added value of this study: In this multi-center (nine sites), international (three countries) study of 27039 consecutive patients undergoing myocardial perfusion imaging (MPI) with PET or SPECT, we used an innovative artificial intelligence (AI)-based approach for automatically segmenting the entire liver and spleen volumes from low-dose ungated CT attenuation correction (CTAC) scans acquired during MPI, followed by the identification of hepatic steatosis. We evaluated the added prognostic value of several key hepatic metrics derived from volumetric quantification of CTAC scans, namely liver measures (mean attenuation, coefficient of variation (CoV), entropy, and standard deviation) and similar measures for the difference of liver minus spleen (LmS), with adjustment for existing clinical and MPI variables. An HS imaging criterion (HSIC: a patient has moderate or severe hepatic steatosis if the mean liver attenuation is < 40 Hounsfield units (HU) or the difference of liver mean attenuation and spleen mean attenuation is < -10 HU) was used to detect HS. These hepatic metrics were assessed for their ability to predict all-cause mortality in a large-scale and multi-center patient cohort.
Additionally, we developed and validated an eXtreme Gradient Boosting decision tree model for integrated liver assessment and risk stratification by combining the hepatic metrics with the demographic variables to derive a liver risk index (LIRI). Our results demonstrated strong associations between the hepatic metrics and all-cause mortality, even after adjustment for clinical variables, myocardial perfusion, and atherosclerosis biomarkers. Our results revealed significant differences in the association of HS with mortality in different sex, age, and race subpopulations. Similar differences were also observed in various chronic disease subpopulations such as obese and diabetic subpopulations. These results highlighted the modifying effects of various patient characteristics, partially accounting for the inconsistent association observed in existing studies. Compared with individual hepatic measures, LIRI significantly improved mortality prediction over HSIC-based HS in external testing. Together, these results demonstrate the feasibility of HS detection and integrated liver assessment from cardiac low-dose CT scans from MPI, an approach that is also expected to apply to generic chest CT scans that cover the liver and spleen, whereas prior studies used dedicated abdominal CT scans for such purposes. Implications of all the available evidence: Routine point-of-care analysis of hepatic quantification can be seamlessly integrated into all MPI using CTAC scans to noninvasively identify HS at no additional cost or radiation exposure. The automatically derived hepatic metrics enhance risk stratification by providing additional prognostic value beyond existing clinical and imaging factors, and the LIRI enables comprehensive assessment of the liver and further improves risk stratification and patient management.
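
The HS imaging criterion stated in this abstract (mean liver attenuation < 40 HU, or liver minus spleen attenuation < -10 HU) is a simple threshold rule, sketched below. The two thresholds come from the abstract; the function and constant names are hypothetical, and the inputs are assumed to be mean attenuation values already extracted from the segmented organs.

```python
# Thresholds as stated in the abstract, in Hounsfield units (HU).
LIVER_HU_THRESHOLD = 40.0
LMS_HU_THRESHOLD = -10.0


def has_hepatic_steatosis(mean_liver_hu, mean_spleen_hu):
    """Apply the HS imaging criterion to mean liver and spleen attenuation values."""
    # LmS: liver minus spleen mean attenuation.
    lms = mean_liver_hu - mean_spleen_hu
    return mean_liver_hu < LIVER_HU_THRESHOLD or lms < LMS_HU_THRESHOLD


# Hypothetical (liver, spleen) attenuation pairs in HU.
print(has_hepatic_steatosis(35.0, 45.0))  # low liver attenuation
print(has_hepatic_steatosis(50.0, 65.0))  # liver well below spleen
print(has_hepatic_steatosis(55.0, 50.0))  # neither condition met
```

Both comparisons are strict, so a liver value of exactly 40 HU or an LmS of exactly -10 HU does not meet the criterion on its own.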

Sonopermeation combined with stroma normalization enables complete cure using nano-immunotherapy in murine breast tumors.

Neophytou C, Charalambous A, Voutouri C, Angeli S, Panagi M, Stylianopoulos T, Mpekris F

PubMed paper, Jun 10 2025
Nano-immunotherapy shows great promise in improving patient outcomes, as seen in advanced triple-negative breast cancer, but it does not cure the disease, with median survival under two years. Therefore, understanding resistance mechanisms and developing strategies to enhance its effectiveness in breast cancer is crucial. A key resistance mechanism is the pronounced desmoplasia in the tumor microenvironment, which leads to dysfunction of tumor blood vessels and thus to hypoperfusion, limited drug delivery, and hypoxia. Ultrasound sonopermeation and agents that normalize the tumor stroma have been employed separately to restore vascular abnormalities in tumors with some success. Here, we performed in vivo studies in two murine, orthotopic breast tumor models to explore whether combining ultrasound sonopermeation with a stroma normalization drug can synergistically improve tumor perfusion and enhance the efficacy of nano-immunotherapy. We found that the proposed combinatorial treatment can drastically reduce primary tumor growth, and in many cases tumors were no longer measurable. Overall survival studies showed that all mice that received the combination treatment survived, and rechallenge experiments revealed that the survivors obtained immunological memory. Employing ultrasound elastography and contrast-enhanced ultrasound along with proteomics analysis, flow cytometry, and immunofluorescence staining, we found that the combinatorial treatment reduced tumor stiffness to normal levels, restoring tumor perfusion and oxygenation. Furthermore, it increased infiltration and activity of immune cells and altered the levels of immunosupportive chemokines. Finally, using machine learning analysis, we identified that tumor stiffness, CD8+ T cells, and M2-type macrophages were strong predictors of treatment response.

Robotic Central Pancreatectomy with Omental Pedicle Flap: Tactics and Tips.

Kawano F, Lim MA, Kemprecos HJ, Tsai K, Cheah D, Tigranyan A, Kaviamuthan K, Pillai A, Chen JC, Polites G, Mise Y, Cohen M, Saiura A, Conrad C

PubMed paper, Jun 10 2025
Robotic central pancreatectomy is increasingly used for pre- or low-grade malignant tumors in the pancreatic body, balancing preservation of pancreatic function with removal of the target lesion [1-3]. Today, there is no established reconstruction method, and high rates of postoperative pancreatic fistula (POPF) remain a significant concern [4,5]. We developed a novel technique involving transgastric pancreaticogastrostomy with an omental pedicle advancement flap to reduce the risk of POPF. Additionally, preoperative deep-learning 3D organ modeling plays a crucial role in improving spatial understanding to enhance procedural safety [6,7]. METHODS: A 76-year-old female patient with a 33-mm, biopsy-confirmed high-risk IPMN underwent robotic-assisted central pancreatectomy. Preoperative CT was processed with a deep-learning system to create a patient-specific 3D model, enabling virtual simulation of port configurations. The optimal setup was selected based on the spatial relationship between port site, tumor location, and anatomy. A transgastric pancreaticogastrostomy with omental flap reinforcement was performed to reduce POPF, leading to a simpler reconstruction compared to pancreaticojejunostomy. The procedure lasted 218 min with minimal blood loss (50 ml). No complications occurred, and the patient was discharged on postoperative Day 3 after drain removal. Final pathology showed low-grade dysplasia. This approach, facilitated by robotic assistance, effectively preserves pancreatic function while treating a low-grade malignant tumor. Preoperative 3D organ modeling enhances spatial understanding with the goal of increasing procedural safety. Finally, the omental pedicle advancement flap technique shows promise in possibly reducing the incidence or at least the impact of POPF.

A Deep Learning Model for Identifying the Risk of Mesenteric Malperfusion in Acute Aortic Dissection Using Initial Diagnostic Data: Algorithm Development and Validation.

Jin Z, Dong J, Li C, Jiang Y, Yang J, Xu L, Li P, Xie Z, Li Y, Wang D, Ji Z

PubMed paper, Jun 10 2025
Mesenteric malperfusion (MMP) is an uncommon but devastating complication of acute aortic dissection (AAD) that combines 2 life-threatening conditions: aortic dissection and acute mesenteric ischemia. The complex pathophysiology of MMP poses substantial diagnostic and management challenges. Currently, delayed diagnosis remains a critical contributor to poor outcomes because of the absence of reliable individualized risk assessment tools. This study aims to develop and validate a deep learning-based model that integrates multimodal data to identify patients with AAD at high risk of MMP. This multicenter retrospective study included 525 patients with AAD from 2 hospitals. The training and internal validation cohort consisted of 450 patients from Beijing Anzhen Hospital, whereas the external validation cohort comprised 75 patients from Nanjing Drum Tower Hospital. Three machine learning models were developed: the benchmark model using laboratory parameters, the multiorgan feature-based AAD complicating MMP (MAM) model based on computed tomography angiography images, and the integrated model combining both data modalities. Model performance was assessed using the area under the curve, accuracy, sensitivity, specificity, and Brier score. To improve interpretability, gradient-weighted class activation mapping was used to identify and visualize discriminative imaging features. Univariate and multivariate regression analyses were used to evaluate the prognostic significance of the risk score generated by the optimal model. In the external validation cohort, the integrated model demonstrated superior performance, with an area under the curve of 0.780 (95% CI 0.777-0.785), which was significantly greater than those of the benchmark model (0.586, 95% CI 0.574-0.586) and the MAM model (0.732, 95% CI 0.724-0.734). This highlights the benefits of multimodal integration over single-modality approaches.
Additional classification metrics revealed that the integrated model had an accuracy of 0.760 (95% CI 0.758-0.764), a sensitivity of 0.667 (95% CI 0.659-0.675), a specificity of 0.783 (95% CI 0.781-0.788), and a Brier score of 0.143 (95% CI 0.143-0.145). Moreover, gradient-weighted class activation mapping visualizations of the MAM model revealed that during positive predictions, the model focused more on key anatomical areas, particularly the superior mesenteric artery origin and intestinal regions with characteristic gas or fluid accumulation. Univariate and multivariate analyses also revealed that the risk score derived from the integrated model was independently associated with in-hospital mortality risk among patients with AAD undergoing endovascular or surgical treatment (odds ratio 1.030, 95% CI 1.004-1.056; P=.02). Our findings demonstrate that compared with unimodal approaches, an integrated deep learning model incorporating both imaging and clinical data has greater diagnostic accuracy for MMP in patients with AAD. This model may serve as a valuable tool for early risk identification, facilitating timely therapeutic decision-making. Further prospective validation is warranted to confirm its clinical utility. Chinese Clinical Registry Center ChiCTR2400086050; http://www.chictr.org.cn/showproj.html?proj=226129.
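
The accuracy, sensitivity, specificity, and Brier score reported above follow standard definitions, which the sketch below computes from binary labels and predicted probabilities. The function name, the 0.5 decision threshold, and the toy data are assumptions for illustration; the abstract does not state the threshold used in the study.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Return accuracy, sensitivity, specificity, and Brier score as a dict.

    y_true: list of 0/1 labels; y_prob: predicted probability of the positive class.
    Assumption: a probability >= threshold counts as a positive prediction.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    n = len(y_true)
    return {
        "accuracy": (tp + tn) / n,
        # Fraction of true positives detected; NaN if there are no positives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of true negatives correctly rejected; NaN if there are no negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Mean squared difference between predicted probability and outcome.
        "brier": sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n,
    }


# Toy data: five hypothetical patients, two of them positive.
y_true = [1, 1, 0, 0, 0]
y_prob = [0.9, 0.4, 0.2, 0.6, 0.1]
print(classification_metrics(y_true, y_prob))
```

Unlike the threshold-based metrics, the Brier score uses the probabilities directly, so it also reflects calibration; lower values are better.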

Arthroscopy-validated diagnostic performance of sub-5-min deep learning super-resolution 3T knee MRI in children and adolescents.

Vosshenrich J, Breit HC, Donners R, Obmann MM, Harder D, Ahlawat S, Walter SS, Serfaty A, Cantarelli Rodrigues T, Recht M, Stern SE, Fritz J

PubMed paper, Jun 10 2025
This study aims to determine the diagnostic performance of sub-5-min combined sixfold parallel imaging (PIx3)-simultaneous multislice (SMSx2)-accelerated deep learning (DL) super-resolution 3T knee MRI in children and adolescents. Children with painful knee conditions who underwent PIx3-SMSx2-accelerated DL super-resolution 3T knee MRI and arthroscopy between October 2022 and December 2023 were retrospectively included. Nine fellowship-trained musculoskeletal radiologists independently scored the MRI studies for image quality and the presence of artifacts (Likert scales, range: 1 = very bad/severe, 5 = very good/absent), as well as structural abnormalities. Interreader agreement and diagnostic performance were assessed. Forty-four children (mean age: 15 ± 2 years; range: 9-17 years; 24 boys) who underwent knee MRI and arthroscopic surgery within 22 days (range, 2-133) were evaluated. Overall image quality was very good (median rating: 5 [IQR: 4-5]). Motion artifacts (5 [5-5]) and image noise (5 [4-5]) were absent. Arthroscopy-verified abnormalities were detected with good or better interreader agreement (κ ≥ 0.74). Sensitivity, specificity, accuracy, and AUC values were 100%, 84%, 93%, and 0.92, respectively, for anterior cruciate ligament tears; 71%, 97%, 93%, and 0.84 for medial meniscus tears; 65%, 100%, 86%, and 0.82 for lateral meniscus tears; 100%, 100%, 100%, and 1.00 for discoid lateral menisci; 100%, 95%, 96%, and 0.98 for medial patellofemoral ligament tears; and 55%, 100%, 98%, and 0.77 for articular cartilage defects. Clinical sub-5-min PIx3-SMSx2-accelerated DL super-resolution 3T knee MRI provides excellent image quality and high diagnostic performance for diagnosing internal derangement in children and adolescents.