
Image Aesthetic Reasoning: A New Benchmark for Medical Image Screening with MLLMs

Zheng Sun, Yi Wei, Long Yu

arXiv preprint, May 29 2025
Multimodal Large Language Models (MLLMs) are widely applied across many domains, such as multimodal understanding and generation. With the development of diffusion models (DM) and unified MLLMs, the performance of image generation has improved significantly. However, image screening is rarely studied, and its performance with MLLMs is unsatisfactory due to the lack of data and the weak image aesthetic reasoning ability of MLLMs. In this work, we propose a complete solution to address these problems in terms of data and methodology. For data, we collect a comprehensive medical image screening dataset with 1500+ samples; each sample consists of a medical image, four generated images, and a multiple-choice answer. The dataset evaluates aesthetic reasoning ability under four aspects: (1) Appearance Deformation, (2) Principles of Physical Lighting and Shadow, (3) Placement Layout, and (4) Extension Rationality. For methodology, we utilize long chains of thought (CoT) and Group Relative Policy Optimization with a Dynamic Proportional Accuracy reward, called DPA-GRPO, to enhance the image aesthetic reasoning ability of MLLMs. Our experimental results reveal that even state-of-the-art closed-source MLLMs, such as GPT-4o and Qwen-VL-Max, exhibit performance akin to random guessing in image aesthetic reasoning. In contrast, by leveraging the reinforcement learning approach, we are able to surpass the scores of both large-scale models and leading closed-source models using a much smaller model. We hope our work on medical image screening will serve as a standard configuration for image aesthetic reasoning in the future.
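
As a rough illustration of the "group relative" part of GRPO mentioned in the abstract above, the sketch below normalizes each sampled response's reward against the mean and standard deviation of its own group. It does not reproduce the Dynamic Proportional Accuracy reward, which the abstract does not specify. The function name, the example rewards, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampling group (GRPO-style advantage).

    rewards: scalar rewards for several responses to the same prompt.
    eps: small constant (assumed) that avoids division by zero when
         every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one multiple-choice item.
rewards = [1.0, 0.0, 0.5, 0.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value network is needed to provide a baseline.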

Deep Learning CAIPIRINHA-VIBE Improves and Accelerates Head and Neck MRI.

Nitschke LV, Lerchbaumer M, Ulas T, Deppe D, Nickel D, Geisel D, Kubicka F, Wagner M, Walter-Rittel T

PubMed paper, May 29 2025
The aim of this study was to evaluate image quality for contrast-enhanced (CE) neck MRI with a deep learning-reconstructed VIBE sequence with acceleration factors (AF) 4 (DL4-VIBE) and 6 (DL6-VIBE). Patients referred for neck MRI were examined in a 3-Tesla scanner in this prospective, single-center study. Four CE fat-saturated (FS) VIBE sequences were acquired in each patient: Star-VIBE (4:01 min), VIBE (2:05 min), DL4-VIBE (0:24 min), DL6-VIBE (0:17 min). Image quality was evaluated by three radiologists with a 5-point Likert scale and included overall image quality, muscle contour delineation, conspicuity of mucosa and pharyngeal musculature, FS uniformity, and motion artifacts. Objective image quality was assessed with signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), and quantification of metal artifacts. 68 patients (60.3% male; mean age 57.4±16 years) were included in this study. DL4-VIBE was superior for overall image quality, delineation of muscle contours, differentiation of mucosa and pharyngeal musculature, vascular delineation, and motion artifacts. Notably, DL4-VIBE exhibited exceptional FS uniformity (p<0.001). SNR and CNR were superior for DL4-VIBE compared to all other sequences (p<0.001). Metal artifacts were least pronounced in the standard VIBE, followed by DL4-VIBE (p<0.001). Although DL6-VIBE was inferior to DL4-VIBE, it demonstrated improved FS homogeneity, delineation of pharyngeal mucosa, and CNR compared to Star-VIBE and VIBE. DL4-VIBE significantly improves image quality for CE neck MRI with a fraction of the scan time of conventional sequences.

A combined attention mechanism for brain tumor segmentation of lower-grade glioma in magnetic resonance images.

Hedibi H, Beladgham M, Bouida A

PubMed paper, May 29 2025
Low-grade gliomas (LGGs) are among the most difficult brain tumors to segment reliably in FLAIR MRI, and effective delineation of these lesions is critical for clinical diagnosis, treatment planning, and patient monitoring. Nevertheless, conventional U-Net-based approaches usually suffer from the loss of critical structural details owing to repetitive down-sampling, while the encoder features often retain irrelevant information that is not properly utilized by the decoder. To address these challenges, this paper proposes a dual-attention U-shaped design, named ECASE-Unet, which integrates Efficient Channel Attention (ECA) and Squeeze-and-Excitation (SE) blocks in both the encoder and decoder stages. By selectively recalibrating channel-wise information, the model emphasizes diagnostically significant regions of interest and reduces noise. Furthermore, dilated convolutions are introduced at the bottleneck layer to capture multi-scale contextual cues without inflating computational complexity, and dropout regularization is systematically applied to prevent overfitting on heterogeneous data. Experimental results on the Kaggle Low-Grade-Glioma dataset suggest that ECASE-Unet greatly outperforms previous segmentation algorithms, reaching a Dice coefficient of 0.9197 and an Intersection over Union (IoU) of 0.8521. Comprehensive ablation studies further reveal that integrating ECA and SE modules delivers complementary benefits, supporting the model's robust efficacy in precisely identifying LGG boundaries. These findings underline the potential of ECASE-Unet to expedite clinical operations and improve patient outcomes. Future work will focus on improving the model's applicability to new MRI modalities and studying the integration of clinical characteristics for a more comprehensive characterization of brain tumors.
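
For readers less familiar with the two metrics reported above, here is a minimal sketch of how the Dice coefficient and IoU are computed for a single pair of binary masks. The function name, the flat 0/1 list representation, and the toy masks are illustrative assumptions, not the authors' evaluation code.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks are empty: treated as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou


# Toy 6-pixel masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
```

On a single mask pair the two are linked by Dice = 2·IoU / (1 + IoU). Scores averaged over many images need not satisfy that identity exactly.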

Motion-resolved parametric imaging derived from short dynamic [18F]FDG PET/CT scans.

Artesani A, van Sluis J, Providência L, van Snick JH, Slart RHJA, Noordzij W, Tsoumpas C

PubMed paper, May 29 2025
This study aims to assess the added value of utilizing short-dynamic whole-body PET/CT scans and implementing motion correction before quantifying metabolic rate, offering more insights into physiological processes. While this approach may not be commonly adopted, addressing motion effects is crucial due to their demonstrated potential to cause significant errors in parametric imaging. A 15-minute dynamic FDG PET acquisition protocol was utilized for four lymphoma patients undergoing therapy evaluation. Parametric imaging was obtained using a population-based input function (PBIF) derived from twelve patients with full 65-minute dynamic FDG PET acquisition. AI-based registration methods were employed to correct misalignments both between PET and ACCT and between PET frames. Tumour characteristics were assessed using both parametric images and standardized uptake values (SUV). The motion correction process significantly reduced mismatches between images without significantly altering voxel intensity values, except for SUVmax. Following the alignment of the attenuation correction map with the PET frame, an increase in SUVmax in FDG-avid lymph nodes was observed, indicating its susceptibility to spatial misalignments. In contrast, the Patlak Ki parameter was highly sensitive to misalignment across PET frames, which notably altered the Patlak slope. Upon completion of the motion correction process, the parametric representation revealed heterogeneous behaviour among lymph nodes compared to SUV images. Notably, a reduced volume of elevated metabolic rate was determined in the mediastinal lymph nodes in contrast with an SUV of 5 g/ml, indicating potential perfusion or inflammation. Motion-resolved short-dynamic PET can enhance the utility and reliability of parametric imaging, an aspect often overlooked in commercial software.
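
The Patlak slope referred to above is the slope Ki of a straight line fitted to C_T(t)/C_p(t) against ∫C_p dτ / C_p(t) for late time points. The sketch below fits that line with ordinary least squares on synthetic curves. The plasma curve, the true Ki and intercept, the `t_star` cut-off, and all function names are assumptions for illustration; they are not the study's population-based input function or software.

```python
import math


def cumulative_trapezoid(t, y):
    """Running trapezoidal integral of y over t, starting at 0."""
    out = [0.0]
    for i in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out


def patlak_fit(t, cp, ct, t_star=0.0):
    """Least-squares Patlak fit; returns (Ki, intercept V).

    t: frame times, cp: plasma activity, ct: tissue activity.
    Only frames with t >= t_star enter the fit.
    """
    integral = cumulative_trapezoid(t, cp)
    pts = [(integral[i] / cp[i], ct[i] / cp[i])
           for i in range(len(t)) if t[i] >= t_star and cp[i] > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    ki = sxy / sxx
    return ki, my - ki * mx


# Synthetic 15-minute example (all values are made up for the sketch).
t = [float(m) for m in range(16)]
cp = [10.0 * math.exp(-0.1 * m) + 1.0 for m in t]
true_ki, true_v = 0.03, 0.4
ct = [true_ki * a + true_v * c for a, c in zip(cumulative_trapezoid(t, cp), cp)]
ki, v = patlak_fit(t, cp, ct, t_star=5.0)
```

Because every frame contributes one point to the fit, a spatial shift between frames changes the tissue values read at a voxel and therefore the fitted slope, which is consistent with the sensitivity described in the abstract.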

Exploring best-performing radiomic features with combined multilevel discrete wavelet decompositions for multiclass COVID-19 classification using chest X-ray images.

Özcan H

PubMed paper, May 29 2025
Discrete wavelet transforms have been applied in many machine learning models for the analysis of COVID-19; however, little is known about the impact of combined multilevel wavelet decompositions on disease identification. This study proposes a computer-aided diagnosis system for addressing the combined multilevel effects of multiscale radiomic features on multiclass COVID-19 classification using chest X-ray images. A two-level discrete wavelet transform was applied to an optimal region of interest to obtain multiscale decompositions. Both approximation and detail coefficients were extensively investigated in varying frequency bands through 1240 experimental models. High dimensionality in the feature space was managed using a proposed filter- and wrapper-based feature selection approach. A comprehensive comparison was conducted between the bands and features to explore best-performing ensemble algorithm models. The results indicated that incorporating multilevel decompositions could lead to improved model performance. An inclusive region of interest, encompassing both lungs and the mediastinal regions, was identified to enhance feature representation. The light gradient-boosting machine, applied on combined bands with the features of basic, gray-level, Gabor, histogram of oriented gradients and local binary patterns, achieved the highest weighted precision, sensitivity, specificity, and accuracy of 97.50%, 97.50%, 98.75%, and 97.50%, respectively. The COVID-19-versus-the-rest receiver operating characteristic area under the curve was 0.9979. These results underscore the potential of combining decomposition levels with the original signals and employing an inclusive region of interest for effective COVID-19 detection, while the feature selection and training processes remain efficient within a practical computational time.
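
To make the idea of a two-level decomposition concrete, the following sketch applies an orthonormal Haar transform twice to a one-dimensional signal, yielding one approximation band and two detail bands. The study works on two-dimensional X-ray regions and the abstract does not name a wavelet family, so the Haar choice, the 1D setting, and the function names are simplifying assumptions.

```python
import math


def haar_step(signal):
    """One Haar analysis step: returns (approximation, detail) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_two_level(signal):
    """Two-level decomposition: returns (A2, D2, D1)."""
    if len(signal) % 4 != 0:
        raise ValueError("length must be a multiple of 4 for two levels")
    a1, d1 = haar_step(signal)   # level 1: half-resolution bands
    a2, d2 = haar_step(a1)       # level 2: decompose the approximation again
    return a2, d2, d1


# Toy 8-sample signal standing in for one image row.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a2, d2, d1 = haar_two_level(signal)
```

Features can then be computed per band (A2, D2, D1) and concatenated with features from the original signal, which is the kind of band combination the abstract evaluates.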

Research on multi-algorithm and explainable AI techniques for predictive modeling of acute spinal cord injury using multimodal data.

Tai J, Wang L, Xie Y, Li Y, Fu H, Ma X, Li H, Li X, Yan Z, Liu J

PubMed paper, May 29 2025
Machine learning technology has been extensively applied in the medical field, particularly in the context of disease prediction and patient rehabilitation assessment. Acute spinal cord injury (ASCI) is a sudden trauma that frequently results in severe neurological deficits and a significant decline in quality of life. Early prediction of neurological recovery is crucial for personalized treatment planning. Although such methods have been extensively explored in other medical fields, this study is the first to apply multiple machine learning methods and Shapley Additive Explanations (SHAP) analysis specifically to ASCI for predicting neurological recovery. A total of 387 ASCI patients were included, with clinical, imaging, and laboratory data collected. Key features were selected using univariate analysis, Lasso regression, and other feature selection techniques, integrating clinical, radiomics, and laboratory data. A range of machine learning models, including XGBoost, Logistic Regression, KNN, SVM, Decision Tree, Random Forest, LightGBM, ExtraTrees, Gradient Boosting, and Gaussian Naive Bayes, were evaluated, with Gaussian Naive Bayes exhibiting the best performance. Radiomics features extracted from T2-weighted fat-suppressed MRI scans, such as original_glszm_SizeZoneNonUniformity and wavelet-HLL_glcm_SumEntropy, significantly enhanced predictive accuracy. SHAP analysis identified critical clinical features, including IMLL, INR, BMI, Cys C, and RDW-CV, in the predictive model. The model was validated and demonstrated excellent performance across multiple metrics. The clinical utility and interpretability of the model were further enhanced through the application of patient clustering and nomogram analysis. This model has the potential to serve as a reliable tool for clinicians in the formulation of personalized treatment plans and prognosis assessment.

CT-Based Radiomics for Predicting PD-L1 Expression in Non-small Cell Lung Cancer: A Systematic Review and Meta-analysis.

Salimi M, Vadipour P, Khosravi A, Salimi B, Mabani M, Rostami P, Seifi S

PubMed paper, May 29 2025
The efficacy of immunotherapy in non-small cell lung cancer (NSCLC) is intricately associated with baseline PD-L1 expression rates. The standard method for measuring PD-L1 is immunohistochemistry, which is invasive and may not capture tumor heterogeneity. The primary aim of the current study is to assess whether CT-based radiomics models can accurately predict PD-L1 expression status in NSCLC and to evaluate their quality and potential gaps in their design. Scopus, PubMed, Web of Science, Embase, and IEEE databases were systematically searched up until February 14, 2025, to retrieve relevant studies. Data from validation cohorts of models that classified patients by tumor proportion score (TPS) of 1% (TPS1) and 50% (TPS50) were extracted and analyzed separately. Quality assessment was performed through METRICS and QUADAS-2 tools. Diagnostic test accuracy meta-analysis was conducted using a bivariate random-effects approach to pool values of performance metrics. The qualitative synthesis included 22 studies, and the meta-analysis analyzed 11 studies with 997 individual subjects. The pooled AUC, sensitivity, and specificity of TPS1 models were 0.85, 0.76, and 0.79, respectively. The pooled AUC, sensitivity, and specificity of TPS50 models were 0.88, 0.72, and 0.86, respectively. The QUADAS-2 tool identified a substantial risk of bias regarding the flow and timing and index test domains. Certain methodological limitations were highlighted by the METRICS score, which averaged 58.1% and ranged from 24% to 83.4%. CT-based radiomics demonstrates strong potential as a non-invasive method for predicting PD-L1 expression in NSCLC. While promising, significant methodological gaps must be addressed to achieve the generalizability and reliability required for clinical application.

Gaussian random fields as an abstract representation of patient metadata for multimodal medical image segmentation.

Cassidy B, McBride C, Kendrick C, Reeves ND, Pappachan JM, Raad S, Yap MH

PubMed paper, May 29 2025
Growing rates of chronic wound occurrence, especially in patients with diabetes, have become a concerning trend. Chronic wounds are difficult and costly to treat, and have become a serious burden on health care systems worldwide. Innovative deep learning methods for the detection and monitoring of such wounds have the potential to reduce the impact on patients and clinicians. We present a novel multimodal segmentation method which allows for the introduction of patient metadata into the training workflow whereby the patient data are expressed as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on different metadata categories. Using the Diabetic Foot Ulcer Challenge 2022 test set, when compared to the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908), we demonstrate improvements of +0.0220 and +0.0229 for intersection over union and Dice similarity coefficient, respectively. This paper presents the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models using specific metadata categories, followed by average merging of prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf.

Ultrasound image-based contrastive fusion non-invasive liver fibrosis staging algorithm.

Dong X, Tan Q, Xu S, Zhang J, Zhou M

PubMed paper, May 29 2025
The diagnosis of liver fibrosis is usually based on histopathological examination of liver puncture specimens. Although liver puncture is accurate, it has invasive risks and high economic costs, which are difficult for some patients to accept. Therefore, this study uses deep learning technology to build a liver fibrosis diagnosis model to achieve non-invasive staging of liver fibrosis, avoid complications, and reduce costs. This study uses ultrasound examination to obtain pure liver parenchyma image section data. With the consent of the patient, combined with the results of percutaneous liver puncture biopsy, the degree of liver fibrosis indicated by ultrasound examination data is judged. The concept of Fibrosis Contrast Layer (FCL) is creatively introduced in our experimental method, which can help our model more keenly capture the significant differences in the characteristics of liver fibrosis of various grades. Finally, through label fusion (LF), the characteristics of liver specimens of the same fibrosis stage are abstracted and fused to improve the accuracy and stability of the diagnostic model. Experimental evaluation demonstrated that our model achieved an accuracy of 85.6%, outperforming baseline models such as ResNet (81.9%), InceptionNet (80.9%), and VGG (80.8%). Even under a small-sample condition (30% data), the model maintained an accuracy of 84.8%, significantly outperforming traditional deep-learning models exhibiting sharp performance declines. The training results show that in the whole sample data set and 30% small sample data set training environments, the FCLLF model's test performance results are better than those of traditional deep learning models such as VGG, ResNet, and InceptionNet. The performance of the FCLLF model is more stable, especially in the small sample data set environment. Our proposed FCLLF model effectively improves the accuracy and stability of liver fibrosis staging using non-invasive ultrasound imaging.

Automated classification of midpalatal suture maturation stages from CBCTs using an end-to-end deep learning framework.

Milani OH, Mills L, Nikho A, Tliba M, Allareddy V, Ansari R, Cetin AE, Elnagar MH

PubMed paper, May 29 2025
Accurate classification of midpalatal suture maturation stages is critical for orthodontic diagnosis, treatment planning, and the assessment of maxillary growth. Cone Beam Computed Tomography (CBCT) imaging offers detailed insights into this craniofacial structure but poses unique challenges for deep learning image recognition model design due to its high dimensionality, noise artifacts, and variability in image quality. To address these challenges, we propose a novel technique that highlights key image features through a simple filtering process to improve image clarity prior to analysis, thereby enhancing the learning process and better aligning with the distribution of the input data domain. Our preprocessing steps include region-of-interest extraction, followed by high-pass and Sobel filtering for emphasis of low-level features. The feature extraction integrates Convolutional Neural Network (CNN) architectures, such as EfficientNet and ResNet18, alongside our novel Multi-Filter Convolutional Residual Attention Network (MFCRAN) enhanced with Discrete Cosine Transform (DCT) layers. Moreover, to better capture the inherent order within the data classes, we augment the supervised training process with a ranking loss by attending to the relationship within the label domain. Furthermore, to adhere to diagnostic constraints while training the model, we introduce a tailored data augmentation strategy to improve classification accuracy and robustness. To validate our method, we employed a k-fold cross-validation protocol on a private dataset comprising 618 CBCT images, annotated into five stages (A, B, C, D, and E) by expert evaluators. The experimental results demonstrate the effectiveness of our proposed approach, achieving the highest classification accuracy of 79.02%, significantly outperforming competing architectures, which achieved accuracies ranging from 71.87% to 78.05%. This work introduces a novel and fully automated framework for midpalatal suture maturation classification, marking a substantial advancement in orthodontic diagnostics and treatment planning.