Page 95 of 160 (1600 results)

CardioCoT: Hierarchical Reasoning for Multimodal Survival Analysis

Shaohao Rui, Haoyang Su, Jinyi Xiang, Lian-Ming Wu, Xiaosong Wang

arXiv preprint, May 25, 2025
Accurate prediction of the recurrence risk of major adverse cardiovascular events (MACE) in acute myocardial infarction patients, based on postoperative cardiac MRI and associated clinical notes, is crucial for precision treatment and personalized intervention. Existing methods primarily focus on risk stratification capability while overlooking the need for intermediate robust reasoning and model interpretability in clinical practice. Moreover, end-to-end risk prediction using LLMs/VLMs faces significant challenges due to data limitations and modeling complexity. To bridge this gap, we propose CardioCoT, a novel two-stage hierarchical reasoning-enhanced survival analysis framework designed to enhance both model interpretability and predictive performance. In the first stage, we employ an evidence-augmented self-refinement mechanism to guide LLM/VLMs in generating robust hierarchical reasoning trajectories based on associated radiological findings. In the second stage, we integrate the reasoning trajectories with imaging data for risk model training and prediction. CardioCoT demonstrates superior performance in MACE recurrence risk prediction while providing interpretable reasoning processes, offering valuable insights for clinical decision-making.
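Survival-analysis risk models like the one in the second stage are commonly evaluated with Harrell's concordance index; the abstract does not name its evaluation metric, so this is an illustrative assumption. A minimal sketch:

```python
from itertools import combinations

def concordance_index(times, events, risk_scores):
    """Harrell's C-index: the fraction of comparable patient pairs in which
    the patient with the higher predicted risk experiences the event earlier.
    events[i] is 1 if the event (e.g. MACE recurrence) was observed, 0 if censored."""
    concordant, permissible = 0.0, 0
    for (t_i, e_i, r_i), (t_j, e_j, r_j) in combinations(zip(times, events, risk_scores), 2):
        if t_i == t_j:
            continue  # tied times are skipped in this simplified version
        # A pair is comparable only if the patient with the shorter time had an event.
        if (t_i < t_j and not e_i) or (t_j < t_i and not e_j):
            continue
        permissible += 1
        shorter_risk, longer_risk = (r_i, r_j) if t_i < t_j else (r_j, r_i)
        if shorter_risk > longer_risk:
            concordant += 1
        elif shorter_risk == longer_risk:
            concordant += 0.5
    return concordant / permissible

# Perfectly ranked toy cohort: higher risk -> earlier event.
print(concordance_index([2, 5, 8], [1, 1, 1], [0.9, 0.5, 0.1]))  # 1.0
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect risk ordering.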

CDPDNet: Integrating Text Guidance with Hybrid Vision Encoders for Medical Image Segmentation

Jiong Wu, Yang Xing, Boxiao Yu, Wei Shao, Kuang Gong

arXiv preprint, May 25, 2025
Most publicly available medical segmentation datasets are only partially labeled, with annotations provided for a subset of anatomical structures. When multiple datasets are combined for training, this incomplete annotation poses challenges, as it limits the model's ability to learn shared anatomical representations among datasets. Furthermore, vision-only frameworks often fail to capture complex anatomical relationships and task-specific distinctions, leading to reduced segmentation accuracy and poor generalizability to unseen datasets. In this study, we proposed a novel CLIP-DINO Prompt-Driven Segmentation Network (CDPDNet), which combined a self-supervised vision transformer with CLIP-based text embedding and introduced task-specific text prompts to tackle these challenges. Specifically, the framework was constructed upon a convolutional neural network (CNN) and incorporated DINOv2 to extract both fine-grained and global visual features, which were then fused using a multi-head cross-attention module to overcome the limited long-range modeling capability of CNNs. In addition, CLIP-derived text embeddings were projected into the visual space to help model complex relationships among organs and tumors. To further address the partial label challenge and enhance inter-task discriminative capability, a Text-based Task Prompt Generation (TTPG) module that generated task-specific prompts was designed to guide the segmentation. Extensive experiments on multiple medical imaging datasets demonstrated that CDPDNet consistently outperformed existing state-of-the-art segmentation methods. Code and pretrained model are available at: https://github.com/wujiong-hub/CDPDNet.git.
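The multi-head cross-attention fusion described above lets CNN feature tokens attend to DINOv2 patch tokens. A single-head NumPy sketch of the underlying scaled dot-product cross-attention (shapes and token counts are illustrative assumptions, not CDPDNet's actual dimensions):

```python
import numpy as np

def cross_attention(query, key_value, d_k):
    """Scaled dot-product cross-attention: query tokens (e.g. CNN features)
    attend to key/value tokens (e.g. DINOv2 patch tokens).
    query: (Nq, d), key_value: (Nk, d); returns fused features of shape (Nq, d)."""
    scores = query @ key_value.T / np.sqrt(d_k)           # (Nq, Nk) similarity
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # row-wise softmax
    return weights @ key_value                            # weighted sum of values

rng = np.random.default_rng(0)
cnn_tokens = rng.normal(size=(16, 64))    # fine-grained local CNN features
dino_tokens = rng.normal(size=(49, 64))   # global ViT patch tokens
fused = cross_attention(cnn_tokens, dino_tokens, d_k=64)
print(fused.shape)  # (16, 64)
```

Each CNN token is replaced by a softmax-weighted mixture of ViT tokens, which is what gives the CNN branch access to long-range context.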

Integrating Large language models into radiology workflow: Impact of generating personalized report templates from summary.

Gupta A, Hussain M, Nikhileshwar K, Rastogi A, Rangarajan K

PubMed paper, May 25, 2025
To evaluate the feasibility of large language models (LLMs) in converting radiologist-generated report summaries into personalized report templates, and to assess their impact on scan reporting time and quality. In this retrospective study, 100 CT scans from oncology patients were randomly divided into two equal sets. Two radiologists generated conventional reports for one set and summary reports for the other, and vice versa. Three LLMs - GPT-4, Google Gemini, and Claude Opus - generated complete reports from the summaries using institution-specific generic templates. Two expert radiologists qualitatively evaluated the radiologist summaries and LLM-generated reports using the ACR RADPEER scoring system, using conventional radiologist reports as reference. Reporting time for conventional versus summary-based reports was compared, and LLM-generated reports were analyzed for errors. Quantitative similarity and linguistic metrics were computed to assess report alignment across models with the original radiologist-generated report summaries. Statistical analyses were performed using Python 3.0 to identify significant differences in reporting times, error rates and quantitative metrics. The average reporting time was significantly shorter for the summary method (6.76 min) compared to the conventional method (8.95 min) (p < 0.005). Among the 100 radiologist summaries, 10 received RADPEER scores worse than 1, with three deemed to have clinically significant discrepancies. Only one LLM-generated report received a worse RADPEER score than its corresponding summary. Error frequencies among LLM-generated reports showed no significant differences across models, with template-related errors being most common (χ<sup>2</sup> = 1.146, p = 0.564). Quantitative analysis indicated significant differences in similarity and linguistic metrics among the three LLMs (p < 0.05), reflecting unique generation patterns.
Summary-based scan reporting, combined with the use of LLMs to generate complete personalized report templates, can shorten reporting time while maintaining report quality, making the workflow more efficient. However, human oversight remains necessary to address errors in the generated reports.
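The reported χ² = 1.146, p = 0.564 is a Pearson chi-squared test of error frequencies across the three models; with two degrees of freedom the p-value has the closed form exp(-χ²/2). A stdlib-only sketch on hypothetical error counts (the abstract does not give the per-model contingency table):

```python
import math

def chi2_statistic(table):
    """Pearson chi-squared statistic for an R x C contingency table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical counts of template-related vs other errors for three LLMs.
table = [[12, 5], [10, 6], [14, 4]]          # rows: GPT-4, Gemini, Claude
stat = chi2_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)  # (3-1)(2-1) = 2
p = math.exp(-stat / 2)                      # chi-squared survival function, df = 2 only
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

For other degrees of freedom one would use `scipy.stats.chi2.sf` instead of the df = 2 shortcut.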

PolyPose: Localizing Deformable Anatomy in 3D from Sparse 2D X-ray Images using Polyrigid Transforms

Vivek Gopalakrishnan, Neel Dey, Polina Golland

arXiv preprint, May 25, 2025
Determining the 3D pose of a patient from a limited set of 2D X-ray images is a critical task in interventional settings. While preoperative volumetric imaging (e.g., CT and MRI) provides precise 3D localization and visualization of anatomical targets, these modalities cannot be acquired during procedures, where fast 2D imaging (X-ray) is used instead. To integrate volumetric guidance into intraoperative procedures, we present PolyPose, a simple and robust method for deformable 2D/3D registration. PolyPose parameterizes complex 3D deformation fields as a composition of rigid transforms, leveraging the biological constraint that individual bones do not bend in typical motion. Unlike existing methods that either assume no inter-joint movement or fail outright in this under-determined setting, our polyrigid formulation enforces anatomically plausible priors that respect the piecewise rigid nature of human movement. This approach eliminates the need for expensive deformation regularizers that require patient- and procedure-specific hyperparameter optimization. Across extensive experiments on diverse datasets from orthopedic surgery and radiotherapy, we show that this strong inductive bias enables PolyPose to successfully align the patient's preoperative volume to as few as two X-ray images, thereby providing crucial 3D guidance in challenging sparse-view and limited-angle settings where current registration methods fail.
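The core idea of a polyrigid deformation is that each point moves under a blend of per-bone rigid transforms. A deliberately simplified NumPy sketch using a linear blend (PolyPose itself composes transforms more carefully; the weights and transforms below are toy values, not the paper's parameterization):

```python
import numpy as np

def polyrigid_warp(points, rotations, translations, weights):
    """Warp 3D points by a weighted blend of K rigid transforms.
    points: (N, 3); rotations: (K, 3, 3); translations: (K, 3);
    weights: (N, K), each row sums to 1 (soft assignment of points to bones)."""
    # Apply every rigid transform to every point -> (K, N, 3).
    moved = np.einsum('kij,nj->kni', rotations, points) + translations[:, None, :]
    # Blend per point with its bone weights.
    return np.einsum('nk,kni->ni', weights, moved)

# Two "bones": the identity and a pure 1-unit translation along x.
R = np.stack([np.eye(3), np.eye(3)])
t = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pts = np.array([[0.0, 0.0, 0.0]])
w = np.array([[0.5, 0.5]])  # a point influenced equally by both bones
print(polyrigid_warp(pts, R, t, w))  # the point moves halfway: [0.5, 0, 0]
```

Because each component transform is rigid, bones cannot bend, which is exactly the anatomical prior the paper exploits.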

Evaluation of synthetic training data for 3D intraoral reconstruction of cleft patients from single images.

Lingens L, Lill Y, Nalabothu P, Benitez BK, Mueller AA, Gross M, Solenthaler B

PubMed paper, May 24, 2025
This study investigates the effectiveness of synthetic training data in predicting 2D landmarks for 3D intraoral reconstruction in cleft lip and palate patients. We take inspiration from existing landmark prediction and 3D reconstruction techniques for faces and demonstrate their potential in medical applications. We generated both real and synthetic datasets from intraoral scans and videos. A convolutional neural network was trained using a negative-Gaussian log-likelihood loss function to predict 2D landmarks and their corresponding confidence scores. The predicted landmarks were then used to fit a statistical shape model to generate 3D reconstructions from individual images. We analyzed the model's performance on real patient data and explored the dataset size required to overcome the domain gap between synthetic and real images. Our approach generates satisfying results on synthetic data and shows promise when tested on real data. The method achieves rapid 3D reconstruction from single images and can therefore provide significant value in day-to-day medical work. Our results demonstrate that synthetic training data are viable for training models to predict 2D landmarks and reconstruct 3D meshes in patients with cleft lip and palate. This approach offers an accessible, low-cost alternative to traditional methods, using smartphone technology for noninvasive, rapid, and accurate 3D reconstructions in clinical settings.
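The negative-Gaussian log-likelihood loss mentioned above trains the network to output both a landmark position and a confidence (here an isotropic sigma, which is an assumption; the paper may predict a full covariance). A minimal stdlib sketch:

```python
import math

def gaussian_nll(pred_xy, pred_sigma, true_xy):
    """Negative Gaussian log-likelihood for one 2D landmark with an isotropic
    predicted uncertainty sigma. Being confident (small sigma) about a wrong
    position is penalized heavily, which yields calibrated confidence scores."""
    nll = 0.0
    for p, t in zip(pred_xy, true_xy):
        nll += math.log(pred_sigma * math.sqrt(2 * math.pi)) \
               + (p - t) ** 2 / (2 * pred_sigma ** 2)
    return nll

# Same 2-pixel error: the overconfident prediction costs more than the cautious one.
confident = gaussian_nll((10.0, 10.0), 0.5, (12.0, 10.0))
cautious = gaussian_nll((10.0, 10.0), 2.0, (12.0, 10.0))
print(confident > cautious)  # True
```

In training, sigma is typically produced by the network itself (e.g. via a softplus head) so the loss can trade accuracy against stated uncertainty per landmark.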

Classifying athletes and non-athletes by differences in spontaneous brain activity: a machine learning and fMRI study.

Peng L, Xu L, Zhang Z, Wang Z, Zhong X, Wang L, Peng Z, Xu R, Shao Y

PubMed paper, May 24, 2025
Different types of sports training can induce distinct changes in brain activity and function; however, it remains unclear if there are commonalities across various sports disciplines. Moreover, the relationship between these brain activity alterations and the duration of sports training requires further investigation. This study employed resting-state functional magnetic resonance imaging (rs-fMRI) techniques to analyze spontaneous brain activity using the amplitude of low-frequency fluctuations (ALFF) and fractional amplitude of low-frequency fluctuations (fALFF) in 86 highly trained athletes compared to 74 age- and gender-matched non-athletes. Our findings revealed significantly higher ALFF values in the Insula_R (Right Insula), OFCpost_R (Right Posterior orbital gyrus), and OFClat_R (Right Lateral orbital gyrus) in athletes compared to controls, whereas fALFF in the Postcentral_R (Right Postcentral) was notably higher in controls. Additionally, we identified a significant negative correlation between fALFF values in the Postcentral_R of athletes and their years of professional training. Utilizing machine learning algorithms, we achieved accurate classification of brain activity patterns distinguishing athletes from non-athletes with over 96.97% accuracy. These results suggest that the functional reorganization observed in athletes' brains may signify an adaptation to prolonged training, potentially reflecting enhanced processing efficiency. This study emphasizes the importance of examining the impact of long-term sports training on brain function, which could influence cognitive and sensory systems crucial for optimal athletic performance. Furthermore, machine learning methods could be used in the future to select athletes based on differences in brain activity.
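ALFF and fALFF are spectral measures of a voxel's resting-state time series: ALFF is the amplitude in the low-frequency band (conventionally 0.01-0.08 Hz), and fALFF normalizes it by the amplitude over the whole spectrum. A minimal sketch on a synthetic time series (the band limits and TR are standard conventions, not values stated in the abstract):

```python
import numpy as np

def alff_falff(timeseries, tr, low=0.01, high=0.08):
    """ALFF: mean FFT amplitude of a demeaned voxel time series in the
    low-frequency band. fALFF: band amplitude as a fraction of total amplitude."""
    ts = timeseries - timeseries.mean()
    amps = np.abs(np.fft.rfft(ts))
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    band = (freqs >= low) & (freqs <= high)
    return amps[band].mean(), amps[band].sum() / amps.sum()

rng = np.random.default_rng(0)
t = np.arange(200) * 2.0                                  # 200 volumes, TR = 2 s
sig = np.sin(2 * np.pi * 0.04 * t) + 0.3 * rng.normal(size=200)  # 0.04 Hz oscillation
a, fa = alff_falff(sig, tr=2.0)
print(a > 0 and 0 < fa <= 1)  # True
```

Voxel-wise maps of these values are what the study feeds into its machine learning classifier to separate athletes from non-athletes.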

Quantitative image quality metrics enable resource-efficient quality control of clinically applied AI-based reconstructions in MRI.

White OA, Shur J, Castagnoli F, Charles-Edwards G, Whitcher B, Collins DJ, Cashmore MTD, Hall MG, Thomas SA, Thompson A, Harrison CA, Hopkinson G, Koh DM, Winfield JM

PubMed paper, May 24, 2025
AI-based MRI reconstruction techniques improve efficiency by reducing acquisition times whilst maintaining or improving image quality. Recent recommendations from professional bodies suggest centres should perform quality assessments on AI tools. However, monitoring long-term performance presents challenges, due to model drift or system updates. Radiologist-based assessments are resource-intensive and may be subjective, highlighting the need for efficient quality control (QC) measures. This study explores using image quality metrics (IQMs) to assess AI-based reconstructions. 58 patients undergoing standard-of-care rectal MRI were imaged using AI-based and conventional T2-weighted sequences. Paired and unpaired IQMs were calculated. Sensitivity of IQMs to detect retrospective perturbations in AI-based reconstructions was assessed using control charts, and statistical comparisons between the four MR systems in the evaluation were performed. Two radiologists evaluated the image quality of the perturbed images, giving an indication of their clinical relevance. Paired IQMs demonstrated sensitivity to changes in AI-reconstruction settings, identifying deviations outside ± 2 standard deviations of the reference dataset. Unpaired metrics showed less sensitivity. Paired IQMs showed no difference in performance between 1.5 T and 3 T systems (p > 0.99), whilst minor but significant (p < 0.0379) differences were noted for unpaired IQMs. IQMs are effective for QC of AI-based MR reconstructions, offering resource-efficient alternatives to repeated radiologist evaluations. Future work should expand this to other imaging applications and assess additional measures.
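The ±2 standard deviation control-chart rule described above is straightforward to implement. A sketch with hypothetical SSIM-like IQM values (the abstract does not list its specific metrics or thresholds beyond the ±2 SD rule):

```python
import numpy as np

def control_chart_flags(reference, new_values, n_sd=2.0):
    """Flag IQM values falling outside mean +/- n_sd standard deviations of a
    reference dataset - a simple Shewhart-style control chart rule for QC."""
    mu, sd = np.mean(reference), np.std(reference, ddof=1)
    lower, upper = mu - n_sd * sd, mu + n_sd * sd
    return [(v < lower) or (v > upper) for v in new_values]

# Reference IQM values from accepted AI reconstructions (hypothetical numbers).
reference = [0.95, 0.94, 0.96, 0.95, 0.93, 0.95, 0.94, 0.96]
flags = control_chart_flags(reference, [0.95, 0.80])
print(flags)  # [False, True] - the second reconstruction would trigger review
```

Running such a check on every reconstructed exam gives an automated early-warning signal for model drift without repeated radiologist review.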

Deep learning reconstruction combined with contrast-enhancement boost in dual-low dose CT pulmonary angiography: a two-center prospective trial.

Shen L, Lu J, Zhou C, Bi Z, Ye X, Zhao Z, Xu M, Zeng M, Wang M

PubMed paper, May 24, 2025
To investigate whether deep learning reconstruction (DLR) combined with the contrast-enhancement-boost (CE-boost) technique can improve the diagnostic quality of CT pulmonary angiography (CTPA) at low radiation and contrast doses, compared with routine CTPA using hybrid iterative reconstruction (HIR). This prospective two-center study included 130 patients who underwent CTPA for suspected pulmonary embolism. Patients were randomly divided into two groups: the routine CTPA group, reconstructed using HIR; and the dual-low dose CTPA group, reconstructed using HIR and DLR, each additionally combined with CE-boost to generate HIR-boost and DLR-boost images. Signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) of the pulmonary arteries were quantitatively assessed. Two experienced radiologists independently ranked the CT images (5, best; 1, worst) based on overall image noise and vascular contrast. Diagnostic performance for PE detection was calculated for each dataset. Patient demographics were similar between groups. Compared to HIR images of the routine group, DLR-boost images of the dual-low dose group scored significantly better qualitatively (p < 0.001). The CT values of the pulmonary arteries were comparable between the DLR-boost and HIR images (p > 0.05), whereas the SNRs and CNRs of the pulmonary arteries in the DLR-boost images were the highest among all five datasets (p < 0.001). The AUCs of DLR, HIR-boost, and DLR-boost were 0.933, 0.924, and 0.986, respectively (all p > 0.05). DLR combined with the CE-boost technique can significantly improve the image quality of CTPA with reduced radiation and contrast doses, facilitating a more accurate diagnosis of pulmonary embolism.
Question: The dual-low dose protocol is essential for detecting pulmonary embolism (PE) in follow-up CT pulmonary angiography (CTPA), yet effective solutions are still lacking.
Findings: DLR-boost with reduced radiation and contrast doses demonstrated higher quantitative and qualitative image quality than hybrid iterative reconstruction in routine CTPA.
Clinical relevance: A DLR-boost-based low-radiation, low-contrast-dose CTPA protocol offers a novel strategy to further enhance image quality and diagnostic accuracy for pulmonary embolism patients.
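The SNR and CNR figures of merit used to compare the reconstructions can be computed directly from ROI statistics. A sketch using one common pair of definitions (exact definitions vary between papers, and the ROI values below are synthetic, not study data):

```python
import numpy as np

def snr_cnr(artery_roi, muscle_roi, background_roi):
    """SNR and CNR from ROI pixel values in Hounsfield units, using
    SNR = mean_artery / SD_background and
    CNR = (mean_artery - mean_muscle) / SD_background."""
    sd_noise = np.std(background_roi, ddof=1)
    snr = np.mean(artery_roi) / sd_noise
    cnr = (np.mean(artery_roi) - np.mean(muscle_roi)) / sd_noise
    return snr, cnr

rng = np.random.default_rng(1)
artery = rng.normal(400, 20, 500)   # enhanced pulmonary artery, ~400 HU
muscle = rng.normal(50, 15, 500)    # paraspinal muscle, ~50 HU
air = rng.normal(0, 10, 500)        # background region used as the noise estimate
snr, cnr = snr_cnr(artery, muscle, air)
print(snr > cnr > 0)  # True
```

Both DLR denoising (smaller SD_background) and CE-boost (larger artery mean) push these ratios upward, which is why the combined DLR-boost images score highest.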

Preoperative risk assessment of invasive endometrial cancer using MRI-based radiomics: a systematic review and meta-analysis.

Gao Y, Liang F, Tian X, Zhang G, Zhang H

PubMed paper, May 24, 2025
Image-derived machine learning (ML) is a robust and growing field in diagnostic imaging systems for both clinicians and radiologists. Accurate preoperative radiological evaluation of the invasive ability of endometrial cancer (EC) can increase the degree of clinical benefit. The present study aimed to investigate the diagnostic performance of magnetic resonance imaging (MRI)-derived artificial intelligence for accurate preoperative assessment of the invasive risk. The PubMed, Embase, Cochrane Library and Web of Science databases were searched, and pertinent English-language papers were collected. The pooled sensitivity, specificity, diagnostic odds ratio (DOR), and positive and negative likelihood ratios (PLR and NLR, respectively) of all the papers were calculated using Stata software. The results were plotted on a summary receiver operating characteristic (SROC) curve, publication bias and threshold effects were evaluated, and meta-regression and subgroup analyses were conducted to explore the possible causes of between-study heterogeneity. MRI-based radiomics revealed pooled sensitivity (SEN) and specificity (SPE) values of 0.85 and 0.82 for the prediction of high-grade EC; 0.80 and 0.85 for deep myometrial invasion (DMI); 0.85 and 0.73 for lymphovascular space invasion (LVSI); 0.79 and 0.85 for microsatellite instability (MSI); and 0.90 and 0.72 for lymph node metastasis (LNM), respectively. For LVSI prediction and high-grade histological analysis, meta-regression revealed that the image segmentation and MRI-based radiomics modeling contributed to heterogeneity (p = 0.003 and 0.04). Through a systematic review and meta-analysis of the reported literature, preoperative MRI-derived ML could help clinicians accurately evaluate EC risk factors, potentially guiding individual treatment thereafter.
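The likelihood ratios and DOR reported in such meta-analyses follow directly from pooled sensitivity and specificity (DOR = PLR / NLR). A sketch applied to the deep myometrial invasion figures from the abstract:

```python
def likelihood_ratios(sensitivity, specificity):
    """Derive the positive and negative likelihood ratios and the diagnostic
    odds ratio from pooled sensitivity and specificity."""
    plr = sensitivity / (1 - specificity)        # how much a positive test raises odds
    nlr = (1 - sensitivity) / specificity        # how much a negative test lowers odds
    dor = plr / nlr                              # overall discriminative power
    return plr, nlr, dor

# Pooled sensitivity 0.80 and specificity 0.85 for deep myometrial invasion (DMI).
plr, nlr, dor = likelihood_ratios(0.80, 0.85)
print(round(plr, 2), round(nlr, 2), round(dor, 1))  # 5.33 0.24 22.7
```

A PLR above 5 and an NLR near 0.2 are conventionally read as moderately strong evidence for ruling in and ruling out disease, respectively.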