Vision-Language Model-Based Semantic-Guided Imaging Biomarker for Lung Nodule Malignancy Prediction.

Zhuang L, Tabatabaei SMH, Salehi-Rad R, Tran LM, Aberle DR, Prosper AE, Hsu W

PubMed · Aug 8, 2025
Machine learning models have utilized semantic features, deep features, or both to assess lung nodule malignancy. However, their reliance on manual annotation during inference, limited interpretability, and sensitivity to imaging variations hinder their application in real-world clinical settings. This research therefore integrates semantic features derived from radiologists' assessments of nodules, guiding the model to learn clinically relevant, robust, and explainable imaging features for predicting lung cancer. We obtained 938 low-dose CT scans from the National Lung Screening Trial (NLST) with 1,246 nodules and semantic features. Additionally, the Lung Image Database Consortium dataset contributed 1,018 CT scans with 2,625 lesions annotated for nodule characteristics. Three external datasets were obtained from UCLA Health, the LUNGx Challenge, and the Duke Lung Cancer Screening dataset. We fine-tuned a pretrained Contrastive Language-Image Pretraining (CLIP) model with a parameter-efficient fine-tuning approach to align imaging and semantic text features and predict the one-year lung cancer diagnosis. Our model outperformed state-of-the-art (SOTA) models on the NLST test set with an AUROC of 0.901 and an AUPRC of 0.776, and it remained robust on the external datasets. Using CLIP, we also obtained zero-shot predictions of semantic features such as nodule margin (AUROC: 0.812), nodule consistency (0.812), and pleural attachment (0.840). Our approach surpasses SOTA models in predicting lung cancer across datasets collected from diverse clinical settings, providing explainable outputs that help clinicians understand the basis of model predictions. It also prevents the model from learning shortcuts and generalizes across clinical settings. The code is available at https://github.com/luotingzhuang/CLIP_nodule.
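
The zero-shot semantic-feature inference described here follows the standard CLIP recipe: embed the image and a pair of contrasting text prompts, then softmax over the image-text similarities. Below is a minimal 2D sketch using Hugging Face transformers; the checkpoint, prompt wording, and input file are illustrative assumptions, not the authors' fine-tuned 3D setup.

```python
# Minimal zero-shot sketch of CLIP-style semantic-feature inference.
# Assumptions: a 2D nodule patch and illustrative prompts; the paper
# fine-tunes CLIP on 3D CT with parameter-efficient fine-tuning.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("nodule_patch.png")  # hypothetical cropped nodule patch
prompts = [
    "a lung nodule with a smooth, well-defined margin",
    "a lung nodule with a spiculated, poorly defined margin",
]

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # P(margin class | image)
print(dict(zip(prompts, probs[0].tolist())))
```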

Development and Validation of Pneumonia Patients Prognosis Prediction Model in Emergency Department Disposition Time.

Hwang S, Heo S, Hong S, Cha WC, Yoo J

PubMed · Aug 7, 2025
This study aimed to develop and evaluate an artificial intelligence model to predict 28-day mortality of pneumonia patients at the time of disposition from the emergency department (ED). A multicenter retrospective study was conducted on data from pneumonia patients who visited the ED of a tertiary academic hospital over an 8-month period and from the Medical Information Mart for Intensive Care (MIMIC-IV) database. We combined chest X-ray (CXR) information, clinical data, and the CURB-65 score to develop three models, with the CURB-65 score as a baseline. A total of 2,874 ED visits were analyzed. The random survival forest (RSF) model using CXR, clinical data, and CURB-65 achieved a C-index of 0.872 in the test set, significantly outperforming the CURB-65 score. This study developed a prognosis prediction model for pneumonia patients, highlighting its potential to support clinical decision making in the ED through multimodal clinical information.
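
A random survival forest scored by the C-index can be fit in a few lines with scikit-survival. The sketch below uses synthetic stand-ins for the CXR, clinical, and CURB-65 features; shapes, scales, and the train/test split are placeholder assumptions, not the study's data.

```python
# Sketch: random survival forest + concordance index (scikit-survival).
# Synthetic stand-ins for CXR-derived, clinical, and CURB-65 features.
import numpy as np
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_censored
from sksurv.util import Surv

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))              # placeholder feature matrix
time = rng.exponential(scale=28, size=500)  # days to event or censoring
event = rng.random(500) < 0.3               # True = death within follow-up
y = Surv.from_arrays(event=event, time=time)

rsf = RandomSurvivalForest(n_estimators=200, min_samples_leaf=10, random_state=0)
rsf.fit(X[:400], y[:400])

risk = rsf.predict(X[400:])                 # higher = higher predicted risk
cindex = concordance_index_censored(event[400:], time[400:], risk)[0]
print(f"C-index: {cindex:.3f}")
```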

AIMR-MediTell: Attention-Infused Mask RNN for Medical Image Interpretation and Report Generation.

Chen L, Yang L, Bedir O

PubMed · Aug 7, 2025
Medical diagnostics often rely on the interpretation of complex medical images. However, manual analysis and report generation by medical practitioners are time-consuming, and the inherent ambiguity in chest X-rays presents significant challenges for automated systems in producing interpretable results. To address this, we propose the Attention-Infused Mask Recurrent Neural Network (AIMR-MediTell), a deep learning framework that integrates instance segmentation using Mask R-CNN with attention-based feature extraction to identify and highlight abnormal regions in chest X-rays. The framework also incorporates an encoder-decoder structure with pretrained BioWordVec embeddings to generate explanatory reports based on the augmented images. We evaluated AIMR-MediTell on the Open-I dataset, achieving a BLEU-4 score of 0.415 and outperforming existing models. Our results demonstrate the effectiveness of the proposed model, showing that incorporating masked regions enhances report accuracy and interpretability. By identifying abnormal regions and automating report generation for X-ray images, our approach has the potential to significantly improve the efficiency and accuracy of medical image analysis.
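
The instance-segmentation stage builds on Mask R-CNN, which is available off the shelf in torchvision; the sketch below shows only that generic stage as a stand-in for the paper's region extractor (the attention-infused RNN decoder and BioWordVec embeddings are the paper's custom components and are not reproduced). The input file and score threshold are assumptions.

```python
# Sketch: a Mask R-CNN instance-segmentation stage (torchvision),
# used as a generic stand-in for the paper's abnormal-region extractor.
import torch
from torchvision.io import read_image, ImageReadMode
from torchvision.models.detection import (
    maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights,
)

weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT   # COCO-pretrained weights
model = maskrcnn_resnet50_fpn(weights=weights).eval()

img = read_image("chest_xray.png", mode=ImageReadMode.RGB)  # hypothetical file
batch = [weights.transforms()(img)]
with torch.no_grad():
    pred = model(batch)[0]                        # boxes, labels, scores, masks

keep = pred["scores"] > 0.5                       # keep confident regions
masks = pred["masks"][keep]                       # (N, 1, H, W) soft masks
print(f"{keep.sum().item()} candidate regions to feed the report decoder")
```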

CT-GRAPH: Hierarchical Graph Attention Network for Anatomy-Guided CT Report Generation

Hamza Kalisch, Fabian Hörst, Jens Kleesiek, Ken Herrmann, Constantin Seibold

arXiv preprint · Aug 7, 2025
As medical imaging is central to diagnostic processes, automating the generation of radiology reports has become increasingly relevant for assisting radiologists with their heavy workloads. Most current methods rely solely on global image features, failing to capture the fine-grained organ relationships crucial for accurate reporting. To this end, we propose CT-GRAPH, a hierarchical graph attention network that explicitly models radiological knowledge by structuring anatomical regions into a graph, linking fine-grained organ features to coarser anatomical systems and a global patient context. Our method leverages pretrained 3D medical feature encoders and anatomical masks to obtain global and organ-level features. These features are further refined within the graph and then integrated into a large language model to generate detailed medical reports. We evaluate our approach on the task of report generation on the large-scale chest CT dataset CT-RATE. We provide an in-depth analysis of pretrained feature encoders for CT report generation and show that our method achieves a substantial absolute improvement of 7.9% in F1 score over current state-of-the-art methods. The code is publicly available at https://github.com/hakal104/CT-GRAPH.
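
The organ-to-system-to-global hierarchy can be expressed directly as a graph and refined with graph attention layers. A minimal PyTorch Geometric sketch follows; the node counts, feature width, and edges are toy assumptions, not CT-GRAPH's actual anatomy graph or encoders.

```python
# Sketch: attention over a tiny organ -> system -> global hierarchy
# with PyTorch Geometric; sizes and edges are illustrative assumptions.
import torch
from torch_geometric.nn import GATConv

# Nodes 0-3: organs, 4-5: anatomical systems, 6: global patient context.
x = torch.randn(7, 256)  # stand-in for pretrained 3D encoder features
edge_index = torch.tensor([
    [0, 1, 2, 3, 4, 5],   # sources: organs feed systems, systems feed global
    [4, 4, 5, 5, 6, 6],   # targets
])

gat1 = GATConv(256, 64, heads=4)       # 4-head attention, 256 -> 256
gat2 = GATConv(64 * 4, 256, heads=1)

h = gat1(x, edge_index).relu()
h = gat2(h, edge_index)
report_context = h[6]                  # refined global node -> LLM prompt
print(report_context.shape)            # torch.Size([256])
```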

Enhancing image retrieval through optimal barcode representation.

Khosrowshahli R, Kheiri F, Asilian Bidgoli A, Tizhoosh HR, Makrehchi M, Rahnamayan S

PubMed · Aug 7, 2025
Binary encoding of data has proven to be a versatile tool for optimizing data processing and memory efficiency in various machine learning applications. This includes deep barcoding: generating barcodes from deep-learning feature extraction for retrieving similar cases among millions of indexed images. Despite recent advances in barcode generation methods, converting high-dimensional feature vectors (e.g., deep features) into compact and discriminative binary barcodes remains an open problem. Difference-based binarization of features is among the most efficient binarization methods, transforming continuous feature vectors into binary sequences that capture trend information. However, its performance depends strongly on the ordering of the input features, leading to a significant combinatorial challenge. This research addresses the problem by optimizing feature orderings against retrieval performance metrics. Our approach identifies optimal feature orderings, yielding substantial improvements in retrieval effectiveness over arbitrary or default orderings. We assess the proposed approach on various medical and non-medical image retrieval tasks, including medical images from The Cancer Genome Atlas (TCGA), a comprehensive publicly available dataset, and a COVID-19 chest X-ray dataset, as well as the non-medical benchmark datasets CIFAR-10, CIFAR-100, and Fashion-MNIST. Our findings demonstrate that optimizing the binary barcode representation significantly enhances accuracy in fast image retrieval across a wide range of applications, highlighting the applicability and potential of barcodes in numerous domains.
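
Difference-based binarization itself is tiny: after fixing a feature ordering, each bit records whether the next feature exceeds the previous one, and retrieval compares barcodes by Hamming distance. A NumPy sketch under those assumptions; the paper's ordering-optimization loop (searching permutations against a retrieval metric) is only hinted at by the `order` argument.

```python
# Sketch: difference-based binarization of a feature vector and
# Hamming-distance retrieval; `order` is where the paper's optimized
# feature permutation would plug in.
import numpy as np

def barcode(features: np.ndarray, order: np.ndarray) -> np.ndarray:
    """Binary trend code: bit i = 1 iff f[order[i+1]] > f[order[i]]."""
    v = features[order]
    return (v[1:] > v[:-1]).astype(np.uint8)

def retrieve(query: np.ndarray, index: np.ndarray, k: int = 5) -> np.ndarray:
    """Indices of the k nearest barcodes by Hamming distance."""
    dists = (index != query).sum(axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 512))   # deep features for 1000 images
order = np.arange(512)                 # default ordering; optimize this
codes = np.stack([barcode(f, order) for f in feats])
print(retrieve(codes[0], codes))       # top-5 similar images (incl. self)
```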

Application of prediction model based on CT radiomics in prognosis of patients with non-small cell lung cancer.

Peng Z, Wang Y, Qi Y, Hu H, Fu Y, Li J, Li W, Li Z, Guo W, Shen C, Jiang J, Yang B

PubMed · Aug 6, 2025
To establish and validate the utility of computed tomography (CT) radiomics for the prognosis of patients with non-small cell lung cancer (NSCLC). Overall, 215 patients with a pathologic diagnosis of NSCLC were included; chest CT images and clinical data were collected before treatment, and follow-up was conducted to assess brain metastasis and survival. Radiomics features were extracted from the chest CT lung-window images of each patient, key features were screened, the radiomics score (Radscore) was calculated, and radiomics, clinical, and combined models were constructed using clinically independent predictive factors. A nomogram was constructed from the final combined model to visualize prediction results. Predictive efficacy was evaluated using the concordance index (C-index), and survival (Kaplan-Meier) and calibration curves were drawn for further evaluation. The training set included 151 patients (43 with brain metastasis and 108 without), and the validation set included 64 patients (18 with brain metastasis and 46 without). Multivariate analysis revealed that lymph node metastasis, lymphocyte percentage, and neuron-specific enolase (NSE) were independent predictors of brain metastasis in patients with NSCLC. The areas under the curve (AUC) of these models were 0.733, 0.836, and 0.849, respectively, in the training set, and 0.739, 0.779, and 0.816 in the validation set. Multivariate Cox regression analysis revealed that the number of brain metastases, distant metastases elsewhere, and C-reactive protein levels were independent predictors of postoperative survival in patients with brain metastases (P < 0.05). The calibration curve showed that the predicted values of the prognostic model agreed well with the actual values. A model based on CT radiomics features can effectively predict NSCLC brain metastasis and its prognosis, providing guidance for individualized treatment of NSCLC patients.
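
The combined model described here follows a standard pattern: a radiomics score plus the independent clinical predictors fed to a classifier and scored by AUC. A hedged scikit-learn sketch with synthetic stand-ins for Radscore, lymph node metastasis, lymphocyte percentage, and NSE; the distributions and split are placeholders, not the study's data.

```python
# Sketch: combined radiomics + clinical model for brain-metastasis risk.
# Synthetic data; columns mirror the predictors named in the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 215
X = np.column_stack([
    rng.normal(size=n),            # Radscore (from selected CT features)
    rng.integers(0, 2, size=n),    # lymph node metastasis (0/1)
    rng.normal(30, 8, size=n),     # lymphocyte percentage
    rng.normal(15, 5, size=n),     # NSE (ng/mL)
])
y = rng.random(n) < 0.28           # brain metastasis within follow-up

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"AUC: {roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]):.3f}")
```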

Small Lesions-aware Bidirectional Multimodal Multiscale Fusion Network for Lung Disease Classification

Jianxun Yu, Ruiquan Ge, Zhipeng Wang, Cheng Yang, Chenyu Lin, Xianjun Fu, Jikui Liu, Ahmed Elazab, Changmiao Wang

arXiv preprint · Aug 6, 2025
Medical diagnosis faces challenges such as the misdiagnosis of small lesions. Deep learning, particularly multimodal approaches, has shown great potential in medical disease diagnosis. However, the difference in dimensionality between medical imaging and electronic health record data complicates effective alignment and fusion. To address these issues, we propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). The model employs a feature pyramid structure combined with an efficient 3D multi-scale convolutional attention module to extract lesion-specific features from 3D medical images. To further enhance multimodal data integration, MMCAF-Net incorporates a multi-scale cross-attention module that resolves dimensional inconsistencies, enabling more effective feature fusion. We evaluated MMCAF-Net on the Lung-PET-CT-Dx dataset, and the results showed a significant improvement in diagnostic accuracy over current state-of-the-art methods. The code is available at https://github.com/yjx1234/MMCAF-Net.
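
The dimensional mismatch between imaging and EHR data is typically resolved by projecting both modalities into a shared token space and letting one attend to the other. A minimal single-scale, one-direction cross-attention sketch in PyTorch with assumed shapes; the paper's module is multiscale and bidirectional (swapping query and key/value gives the reverse direction).

```python
# Sketch: cross-attention fusing image tokens with EHR features.
# Shapes are illustrative assumptions, not the paper's architecture.
import torch
import torch.nn as nn

d = 256
img_tokens = torch.randn(8, 64, d)     # batch of 8, 64 tokens from a 3D CNN
ehr = torch.randn(8, 12)               # 12 tabular EHR variables

ehr_proj = nn.Linear(12, d)            # lift EHR into the token space
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

ehr_tokens = ehr_proj(ehr).unsqueeze(1)          # (8, 1, d)
fused, weights = attn(query=ehr_tokens,          # EHR queries image evidence
                      key=img_tokens, value=img_tokens)
print(fused.shape, weights.shape)                # (8, 1, 256), (8, 1, 64)
```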

On the effectiveness of multimodal privileged knowledge distillation in two vision transformer based diagnostic applications

Simon Baur, Alexandra Benova, Emilio Dolgener Cantú, Jackie Ma

arXiv preprint · Aug 6, 2025
Deploying deep learning models in clinical practice often requires leveraging multiple data modalities, such as images, text, and structured data, to achieve robust and trustworthy decisions. However, not all modalities are always available at inference time. In this work, we propose multimodal privileged knowledge distillation (MMPKD), a training strategy that utilizes additional modalities available solely during training to guide a unimodal vision model. Specifically, we use a text-based teacher model for chest radiographs (MIMIC-CXR) and a tabular metadata-based teacher model for mammography (CBIS-DDSM) to distill knowledge into a vision transformer student model. We show that MMPKD improves the zero-shot ability of the resulting attention maps to localize regions of interest in input images, although, contrary to what prior research suggests, this effect does not generalize across domains.
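
The core of privileged distillation is a loss that is only computable at training time, when the teacher's extra modality is present; at inference the student runs alone. A hedged sketch of the usual KL-based objective; the temperature and weighting are conventional choices, not necessarily the paper's.

```python
# Sketch: privileged knowledge distillation loss. The teacher sees the
# privileged modality (text/tabular); the student sees only the image.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    """Hard-label CE plus soft KL to the (frozen) multimodal teacher."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard + (1 - alpha) * soft

# Training step (hypothetical models): teacher gets image + privileged
# input, student gets the image only.
# logits_t = teacher(image, report_text).detach()
# loss = distillation_loss(student(image), logits_t, labels)
```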

Benchmarking Uncertainty and its Disentanglement in multi-label Chest X-Ray Classification

Simon Baur, Wojciech Samek, Jackie Ma

arXiv preprint · Aug 6, 2025
Reliable uncertainty quantification is crucial for trustworthy decision-making and the deployment of AI models in medical imaging. While prior work has explored the ability of neural networks to quantify predictive, epistemic, and aleatoric uncertainty using an information-theoretic approach in synthetic or well-defined settings such as natural image classification, its applicability to real-life medical diagnosis tasks remains underexplored. In this study, we provide an extensive uncertainty quantification benchmark for multi-label chest X-ray classification using the MIMIC-CXR-JPG dataset. We evaluate 13 uncertainty quantification methods for convolutional (ResNet) and transformer-based (Vision Transformer) architectures across a wide range of tasks. Additionally, we extend Evidential Deep Learning, HetClass NNs, and Deep Deterministic Uncertainty to the multi-label setting. Our analysis provides insights into the effectiveness of uncertainty estimation and the ability to disentangle epistemic and aleatoric uncertainty, revealing method- and architecture-specific strengths and limitations.
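
The information-theoretic decomposition such benchmarks rely on splits predictive entropy into an aleatoric part (expected entropy) and an epistemic part (mutual information) over sampled predictions, e.g., from an ensemble or MC dropout. A sketch under the assumption that each of the multi-label outputs is treated as an independent binary distribution.

```python
# Sketch: entropy-based uncertainty decomposition from S stochastic
# forward passes (ensemble members or MC-dropout samples).
# probs: (S, N, L) sigmoid outputs for N images and L binary labels.
import torch

def binary_entropy(p, eps=1e-8):
    return -(p * (p + eps).log() + (1 - p) * (1 - p + eps).log())

def decompose(probs: torch.Tensor):
    mean_p = probs.mean(dim=0)                     # (N, L) predictive mean
    predictive = binary_entropy(mean_p)            # total uncertainty
    aleatoric = binary_entropy(probs).mean(dim=0)  # expected data noise
    epistemic = predictive - aleatoric             # mutual information
    return predictive, aleatoric, epistemic

probs = torch.rand(10, 4, 14)   # e.g., 10 samples, 4 images, 14 CXR labels
pred, alea, epi = decompose(probs)
print(pred.shape, alea.mean().item(), epi.mean().item())
```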

Quantum Federated Learning in Healthcare: The Shift from Development to Deployment and from Models to Data.

Bhatia AS, Kais S, Alam MA

PubMed · Aug 6, 2025
Healthcare organizations hold high volumes of sensitive data, while traditional technologies have limited storage capacity and computational resources. Sharing healthcare data for machine learning is made still more arduous by strict regulations on patient privacy. In recent years, federated learning has offered a way to accelerate distributed machine learning while addressing concerns about data privacy and governance. Meanwhile, the blend of quantum computing and machine learning has attracted significant attention from academic institutions and research communities. The objective of this work is to develop a federated quantum machine learning (FQML) framework to tackle the optimization, security, and privacy challenges of medical imaging tasks in the healthcare industry. We propose federated quantum convolutional neural networks (QCNNs) with distributed training across edge devices. To demonstrate the feasibility of the proposed FQML framework, we performed extensive experiments on two benchmark medical datasets (Pneumonia MNIST and CT kidney disease analysis) that are partitioned among the healthcare institutions/clients in a non-independent and non-identically distributed manner. The framework is validated and assessed via large-scale simulations. The quantum simulation experiments achieve performance on par with well-known classical CNN models, 86.3% accuracy on the pneumonia dataset and 92.8% on the CT kidney dataset, while requiring fewer model parameters and consuming less data. Moreover, a client selection mechanism is proposed to reduce the computation overhead in each communication round, effectively improving the convergence rate.
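
Setting the quantum circuits aside, the federated backbone described here is standard FedAvg: each selected client trains locally and the server averages parameters, weighted by client data size. A minimal PyTorch sketch under those assumptions; the `train_locally` method and `num_samples` attribute are hypothetical client hooks, and classical models stand in for the paper's QCNN clients.

```python
# Sketch: FedAvg aggregation with per-round client selection.
# Classical local models stand in for the paper's quantum CNN clients.
import random
import torch

def fedavg(state_dicts, weights):
    """Weighted average of client state dicts (weights sum to 1)."""
    avg = {k: torch.zeros_like(v, dtype=torch.float32)
           for k, v in state_dicts[0].items()}
    for sd, w in zip(state_dicts, weights):
        for k in avg:
            avg[k] += w * sd[k].float()
    return avg

def communication_round(global_model, clients, frac=0.5):
    """One round: select a client subset, train locally, aggregate."""
    chosen = random.sample(clients, max(1, int(frac * len(clients))))
    states, sizes = [], []
    for c in chosen:
        c.model.load_state_dict(global_model.state_dict())
        c.train_locally()              # hypothetical local training hook
        states.append(c.model.state_dict())
        sizes.append(c.num_samples)    # hypothetical client data size
    total = sum(sizes)
    global_model.load_state_dict(fedavg(states, [s / total for s in sizes]))
```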