Page 20 of 24236 results

Deep learning approaches for classification tasks in medical X-ray, MRI, and ultrasound images: a scoping review.

Laçi H, Sevrani K, Iqbal S

pubmed logopapers · May 7, 2025
Medical images occupy the largest part of existing medical information, and dealing with them is challenging not only in terms of management but also in terms of interpretation and analysis. Hence, analyzing, understanding, and classifying them becomes an expensive and time-consuming task, especially if performed manually. Deep learning is considered a good solution for image classification, segmentation, and transfer learning tasks, since it offers a large number of algorithms to solve such complex problems. PRISMA-ScR guidelines were followed to conduct this scoping review, with the aim of exploring how deep learning is being used to classify a broad spectrum of diseases diagnosed using X-ray, MRI, or ultrasound image modalities. Findings contribute to the existing research by outlining the characteristics of the adopted datasets and the preprocessing or augmentation techniques applied to them. The authors summarized all relevant studies based on the deep learning models used and the accuracy achieved for classification. Whenever possible, they included details about the hardware and software configurations, as well as the architectural components of the models employed. Moreover, the models that achieved the highest accuracy in disease classification were highlighted, along with their strengths. The authors also discussed the limitations of current approaches and proposed future directions for medical image classification.

A deep learning model combining circulating tumor cells and radiological features in the multi-classification of mediastinal lesions in comparison with thoracic surgeons: a large-scale retrospective study.

Wang F, Bao M, Tao B, Yang F, Wang G, Zhu L

pubmed logopapers · May 7, 2025
CT images and circulating tumor cells (CTCs) are indispensable for diagnosing mediastinal lesions by providing radiological and intra-tumoral information. This study aimed to develop and validate a deep multimodal fusion network (DMFN) combining CTCs and CT images for the multi-classification of mediastinal lesions. In this retrospective diagnostic study, we enrolled 1074 patients with 1500 enhanced CT images and 1074 CTC results between Jan 1, 2020, and Dec 31, 2023. Patients were divided into the training cohort (n = 434), validation cohort (n = 288), and test cohort (n = 352). The DMFN and monomodal convolutional neural network (CNN) models were developed and validated using the CT images and CTC results. The diagnostic performances of the DMFN and monomodal CNN models were assessed against paraffin-embedded pathology from surgical tissues. The predictive abilities were compared with those of thoracic resident physicians, attending physicians, and chief physicians by the area under the receiver operating characteristic (ROC) curve, and diagnostic results were visualized in a heatmap. For binary classification, the predictive performance of the DMFN (AUC = 0.941, 95% CI 0.901-0.982) was better than that of the monomodal CNN model (AUC = 0.710, 95% CI 0.664-0.756). In addition, the DMFN model achieved better predictive performance than the thoracic chief physicians, attending physicians, and resident physicians (P = 0.054, 0.020, and 0.016, respectively). For multi-classification, the DMFN achieved encouraging predictive abilities (AUC = 0.884, 95% CI 0.837-0.931), significantly outperforming the monomodal CNN (AUC = 0.722, 95% CI 0.705-0.739) and also surpassing the chief physicians (AUC = 0.787, 95% CI 0.714-0.862), attending physicians (AUC = 0.632, 95% CI 0.612-0.654), and resident physicians (AUC = 0.541, 95% CI 0.508-0.574). This study showed the feasibility and effectiveness of a CNN model combining CT images and CTC levels in predicting the diagnosis of mediastinal lesions.
It could serve as a useful method to assist thoracic surgeons in improving diagnostic accuracy and has the potential to inform management decisions.
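The abstract does not include implementation details; as a rough illustration of the late-fusion idea described above (combining a CNN image embedding with a scalar CTC marker before classification), a minimal NumPy sketch with hypothetical shapes and random stand-in weights might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for trained parameters (random here, purely for illustration):
W_proj = rng.standard_normal((3, 4))   # pooled image features -> 4-d embedding
W_cls = rng.standard_normal((5, 3))    # fused 5-d feature -> 3 lesion classes
b_cls = np.zeros(3)

def image_branch(img):
    # Stand-in for a CNN backbone: global-average-pool over spatial
    # dimensions, then a linear projection to an embedding.
    pooled = img.mean(axis=(0, 1))       # (channels,)
    return pooled @ W_proj               # (4,) embedding

def fuse_and_classify(img, ctc_count):
    emb = image_branch(img)
    ctc_feat = np.array([np.log1p(ctc_count)])     # scale the tabular CTC marker
    fused = np.concatenate([emb, ctc_feat])        # late fusion by concatenation
    logits = fused @ W_cls + b_cls
    e = np.exp(logits - logits.max())
    return e / e.sum()                             # class probabilities
```

The key design choice is simple concatenation of the two modalities before the classification head, which is one common fusion strategy; the actual DMFN architecture may fuse features differently.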

Enhancing efficient deep learning models with multimodal, multi-teacher insights for medical image segmentation.

Hossain KF, Kamran SA, Ong J, Tavakkoli A

pubmed logopapers · May 7, 2025
The rapid evolution of deep learning has dramatically enhanced the field of medical image segmentation, leading to the development of models with unprecedented accuracy in analyzing complex medical images. Deep learning-based segmentation holds significant promise for advancing clinical care and enhancing the precision of medical interventions. However, these models' high computational demand and complexity present significant barriers to their application in resource-constrained clinical settings. To address this challenge, we introduce Teach-Former, a novel knowledge distillation (KD) framework that leverages a Transformer backbone to effectively condense the knowledge of multiple teacher models into a single, streamlined student model. Moreover, it excels in the contextual and spatial interpretation of relationships across multimodal images for more accurate and precise segmentation. Teach-Former stands out by harnessing multimodal inputs (CT, PET, MRI) and distilling both the final predictions and the intermediate attention maps, ensuring a richer spatial and contextual knowledge transfer. Through this technique, the student model inherits the capacity for fine segmentation while operating with a significantly reduced parameter set and computational footprint. Additionally, a novel training strategy optimizes knowledge transfer, ensuring the student model captures the intricate mapping of features essential for high-fidelity segmentation. The efficacy of Teach-Former was evaluated on two extensive multimodal datasets, HECKTOR21 and PI-CAI22, encompassing various image types. The results demonstrate that our KD strategy reduces model complexity while surpassing existing state-of-the-art methods.
The findings of this study indicate that the proposed methodology could facilitate efficient segmentation of complex multimodal medical images, supporting clinicians in achieving more precise diagnoses and comprehensive monitoring of pathological conditions ( https://github.com/FarihaHossain/TeachFormer ).
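The distillation objective sketched in the abstract (soft targets from multiple teachers plus intermediate attention maps) can be written as a combined loss. The following is a generic multi-teacher KD formulation with hypothetical weighting, not the authors' exact Teach-Former loss:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax over logits.
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_teacher_kd_loss(student_logits, teacher_logits_list,
                          student_attn, teacher_attn_list,
                          T=2.0, alpha=0.5):
    # Soft-target term: KL divergence from the averaged teacher
    # distribution to the student's (both at temperature T).
    p_t = np.mean([softmax(t, T) for t in teacher_logits_list], axis=0)
    p_s = softmax(student_logits, T)
    kd = float(np.sum(p_t * (np.log(p_t) - np.log(p_s))))
    # Attention-transfer term: MSE between the student's intermediate
    # attention map and each teacher's.
    at = float(np.mean([(np.asarray(student_attn) - np.asarray(a)) ** 2
                        for a in teacher_attn_list]))
    return alpha * kd + (1 - alpha) * at
```

Averaging the teachers' softened distributions is one simple way to merge multiple teachers; learned or per-sample teacher weighting is another common choice.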

Radiological evaluation and clinical implications of deep learning- and MRI-based synthetic CT for the assessment of cervical spine injuries.

Fischer G, Schlosser TPC, Dietrich TJ, Kim OC, Zdravkovic V, Martens B, Fehlings MG, Jans L, Vereecke E, Stienen MN, Hejrati N

pubmed logopapers · May 7, 2025
Efficient evaluation of soft tissues and bony structures following cervical spine trauma is critical. We sought to evaluate the diagnostic validity of magnetic resonance imaging (MRI)-based synthetic CT (sCT) compared with conventional computed tomography (CT) for cervical spine injuries. In a prospective, multicenter study, patients with cervical spine injuries underwent CT and MRI within 48 h after injury. A panel of five clinicians independently reviewed the images for diagnostic accuracy, lesion characterization (AO Spine classification), and soft tissue trauma. Fracture visibility, anterior (AVH) and posterior vertebral wall height (PVH), vertebral body angle (VBA), and segmental kyphosis (SK), with corresponding interobserver reliability (intraclass correlation coefficients, ICC) and intermodal differences (Fleiss' kappa), were recorded. The accuracy of estimated Hounsfield unit (HU) values and mean cortical surface distances were also measured. Thirty-seven patients (44 cervical spine fractures) were enrolled. sCT demonstrated a sensitivity of 97.3% for visualizing fractures. Intermodal agreement regarding injury classification indicated almost perfect agreement (κ = 0.922; p < 0.001). Inter-reader ICCs were good to excellent (CT vs. sCT): AVH (0.88, 0.87); PVH (0.87, 0.88); VBA (0.78, 0.76); SK (0.77, 0.93). Intermodal comparison showed mean absolute differences of 0.3 mm (AVH), 0.3 mm (PVH), 1.15° (VBA), and 0.51° (SK). MRI visualized additional soft tissue trauma in 56.8% of patients. Voxelwise comparisons of sCT showed good to excellent agreement with CT in terms of HUs (mean absolute error of 20 (SD ± 62)) and a mean absolute cortical surface distance of 0.45 mm (SD ± 0.13). sCT is a promising, radiation-free imaging technique for diagnosing cervical spine injuries with similar accuracy to CT.
Question: How accurate is MRI-based synthetic CT (sCT) for fracture visualization and classification compared with the gold standard of CT for cervical spine injuries? Findings: sCT demonstrated 97.3% sensitivity in detecting fractures and exhibited near-perfect intermodal agreement in classifying injuries according to the AO Spine classification system. Clinical relevance: sCT is a promising, radiation-free imaging modality that offers comparable accuracy to CT in visualizing and classifying cervical spine injuries. The combination of conventional MRI sequences for soft tissue evaluation with sCT reconstruction for bone visualization provides comprehensive diagnostic information.

Automated Detection of Black Hole Sign for Intracerebral Hemorrhage Patients Using Self-Supervised Learning.

Wang H, Schwirtlich T, Houskamp EJ, Hutch MR, Murphy JX, do Nascimento JS, Zini A, Brancaleoni L, Giacomozzi S, Luo Y, Naidech AM

pubmed logopapers · May 7, 2025
Intracerebral Hemorrhage (ICH) is a devastating form of stroke. Hematoma expansion (HE), growth of the hematoma on interval scans, predicts death and disability. Accurate prediction of HE is crucial for targeted interventions to improve patient outcomes. The black hole sign (BHS) on non-contrast computed tomography (CT) scans is a predictive marker for HE. An automated method to recognize the BHS and predict HE could speed precise patient selection for treatment. In this paper, we present a novel framework leveraging self-supervised learning (SSL) techniques for BHS identification on head CT images. A ResNet-50 encoder model was pre-trained on over 1.7 million unlabeled head CT images. Layers for binary classification were added on top of the pre-trained model. The resulting model was fine-tuned using the training data and evaluated on the held-out test set to collect AUC and F1 scores. The evaluations were performed at the scan and slice levels. We ran different panels, one using two multi-center datasets for external validation and one including parts of them in the pre-training. Our model demonstrated strong performance in identifying the BHS compared with the baseline model. Specifically, the model achieved scan-level AUC scores between 0.75 and 0.89 and F1 scores between 0.60 and 0.70. Furthermore, it exhibited robustness and generalizability on one external dataset, achieving a scan-level AUC score of up to 0.85 and an F1 score of up to 0.60, while it performed less well on another dataset with more heterogeneous samples. The negative effects could be mitigated by including parts of the external datasets in the fine-tuning process. This study introduced a novel framework integrating SSL into medical image classification, particularly BHS identification from head CT scans. The resulting pre-trained head CT encoder model showed potential to minimize manual annotation, which would significantly reduce labor, time, and costs.
After fine-tuning, the framework demonstrated promising performance for a specific downstream task, identifying the BHS to predict HE, upon comprehensive evaluation on diverse datasets. This approach holds promise for enhancing medical image analysis, particularly in scenarios with limited data availability. ICH = Intracerebral Hemorrhage; HE = Hematoma Expansion; BHS = Black Hole Sign; CT = Computed Tomography; SSL = Self-supervised Learning; AUC = Area Under the receiver operating characteristic Curve; CNN = Convolutional Neural Network; SimCLR = Simple framework for Contrastive Learning of visual Representations; HU = Hounsfield Unit; CLAIM = Checklist for Artificial Intelligence in Medical Imaging; VNA = Vendor Neutral Archive; DICOM = Digital Imaging and Communications in Medicine; NIfTI = Neuroimaging Informatics Technology Initiative; INR = International Normalized Ratio; GPU = Graphics Processing Unit; NIH = National Institutes of Health.
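The fine-tuning step described above (a frozen SSL pre-trained encoder with new binary-classification layers trained on labeled data) can be sketched in miniature. Here a fixed random projection stands in for the ResNet-50 encoder, and all shapes, learning rates, and names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the SSL pre-trained encoder: a frozen random projection
# with a ReLU, in place of the real ResNet-50 backbone.
W_enc = rng.standard_normal((16, 8))

def encode(X):
    return np.maximum(X @ W_enc, 0.0)   # frozen features

def finetune_head(X, y, lr=0.05, steps=1000):
    # Train only the binary-classification head (logistic regression)
    # on top of the frozen encoder features.
    F = encode(X)
    w = np.zeros(F.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
        g = p - y                        # gradient of binary cross-entropy
        w -= lr * F.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def predict(X, w, b):
    return (encode(X) @ w + b > 0).astype(int)
```

In practice the encoder is often partially unfrozen during fine-tuning; freezing it, as here, is the cheapest variant and the one that most directly reuses the SSL representation.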

Prompt Engineering for Large Language Models in Interventional Radiology.

Dietrich N, Bradbury NC, Loh C

pubmed logopapers · May 7, 2025
Prompt engineering plays a crucial role in optimizing artificial intelligence (AI) and large language model (LLM) outputs by refining input structure, a key factor in medical applications where precision and reliability are paramount. This Clinical Perspective provides an overview of prompt engineering techniques and their relevance to interventional radiology (IR). It explores key strategies, including zero-shot, one- or few-shot, chain-of-thought, tree-of-thought, self-consistency, and directional stimulus prompting, demonstrating their application in IR-specific contexts. Practical examples illustrate how these techniques can be effectively structured for workplace and clinical use. Additionally, the article discusses best practices for designing effective prompts and addresses challenges in the clinical use of generative AI, including data privacy and regulatory concerns. It concludes with an outlook on the future of generative AI in IR, highlighting advances including retrieval-augmented generation, domain-specific LLMs, and multimodal models.
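As an illustration of how the prompting strategies named above are assembled programmatically, a minimal sketch follows; the system line, example pairs, and IR question are hypothetical placeholders, not content from the article:

```python
def build_prompt(query, examples=None, chain_of_thought=False):
    """Assemble an LLM prompt for an interventional-radiology question.

    examples: optional (question, answer) pairs -> few-shot prompting;
    chain_of_thought: append a step-by-step reasoning cue.
    Omitting both yields a plain zero-shot prompt.
    """
    lines = ["You are assisting an interventional radiologist."]
    for q, a in (examples or []):           # few-shot demonstrations
        lines += [f"Q: {q}", f"A: {a}"]
    lines.append(f"Q: {query}")
    if chain_of_thought:
        lines.append("A: Let's think step by step.")  # chain-of-thought cue
    else:
        lines.append("A:")
    return "\n".join(lines)
```

Self-consistency prompting would then sample this same prompt several times and take a majority vote over the answers; tree-of-thought extends it by branching and scoring intermediate reasoning steps.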

Multistage Diffusion Model With Phase Error Correction for Fast PET Imaging.

Gao Y, Huang Z, Xie X, Zhao W, Yang Q, Yang X, Yang Y, Zheng H, Liang D, Liu J, Chen R, Hu Z

pubmed logopapers · May 7, 2025
Fast PET imaging is clinically important for reducing motion artifacts and improving patient comfort. While recent diffusion-based deep learning methods have shown promise, they often fail to capture the true PET degradation process, suffer from accumulated inference errors, introduce artifacts, and require extensive reconstruction iterations. To address these challenges, we propose a novel multistage diffusion framework tailored for fast PET imaging. At the coarse level, we design a multistage structure to approximate the temporal non-linear PET degradation process in a data-driven manner, using paired PET images collected under different acquisition durations. A Phase Error Correction Network (PECNet) ensures consistency across stages by correcting accumulated deviations. At the fine level, we introduce a deterministic cold diffusion mechanism, which simulates intra-stage degradation through interpolation between known acquisition durations, significantly reducing reconstruction iterations to as few as 10. Evaluations on [<sup>68</sup>Ga]FAPI and [<sup>18</sup>F]FDG PET datasets demonstrate the superiority of our approach, achieving peak PSNRs of 36.2 dB and 39.0 dB, respectively, with average SSIMs over 0.97. Our framework offers high-fidelity PET imaging with fewer iterations, making it practical for accelerated clinical imaging.
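The deterministic cold diffusion idea, degradation by interpolation between acquisition durations and restoration by iteratively re-degrading a predicted clean image, can be sketched generically. This follows the standard cold-diffusion recipe under a linear-interpolation degradation, not the authors' exact implementation:

```python
import numpy as np

def degrade(x_clean, x_fast, t, T=10):
    # Deterministic "cold" degradation: interpolate from the
    # full-duration image toward the short-duration (noisier) image.
    s = t / T
    return (1.0 - s) * x_clean + s * x_fast

def restore(x_fast, predict_clean, T=10):
    # Reverse process: at each step, predict the clean image from the
    # current degraded one, then re-degrade it to the next (less
    # degraded) level. With T=10 this needs only 10 iterations.
    x = x_fast
    for t in range(T, 0, -1):
        x0_hat = predict_clean(x, t)     # a trained network in practice
        x = degrade(x0_hat, x_fast, t - 1, T)
    return x
```

With a perfect predictor the loop recovers the clean image exactly; with a learned predictor, errors at each step are what a correction network like PECNet would be meant to absorb.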

The added value of artificial intelligence using Quantib Prostate for the detection of prostate cancer at multiparametric magnetic resonance imaging.

Russo T, Quarta L, Pellegrino F, Cosenza M, Camisassa E, Lavalle S, Apostolo G, Zaurito P, Scuderi S, Barletta F, Marzorati C, Stabile A, Montorsi F, De Cobelli F, Brembilla G, Gandaglia G, Briganti A

pubmed logopapers · May 7, 2025
Artificial intelligence (AI) has been proposed to assist radiologists in reporting multiparametric magnetic resonance imaging (mpMRI) of the prostate. We evaluated the diagnostic performance of radiologists with different levels of experience when reporting mpMRI with the support of available AI-based software (Quantib Prostate). This is a single-center study (NCT06298305) involving 110 patients. Those with a positive mpMRI (PI-RADS ≥ 3) underwent targeted plus systematic biopsy (TBx plus SBx), while those with a negative mpMRI but a high clinical suspicion of prostate cancer (PCa) underwent SBx. Three readers with different levels of experience, identified as R1, R2, and R3, reviewed all mpMRI. Inter-reader agreement among the three readers with or without the assistance of Quantib Prostate, as well as sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy for the detection of clinically significant PCa (csPCa), were assessed. Overall, 102 patients underwent prostate biopsy, and the csPCa detection rate was 47%. Using Quantib Prostate resulted in an increased number of lesions identified for R3 (101 vs. 127). Inter-reader agreement increased slightly from 0.37 without Quantib Prostate to 0.41 with it. The PPV, NPV, and diagnostic accuracy (measured by the area under the curve [AUC]) of R3 improved (0.51 vs. 0.55, 0.65 vs. 0.82, and 0.56 vs. 0.62, respectively). Conversely, no changes were observed for R1 and R2. Using Quantib Prostate did not enhance the detection rate of csPCa for readers with some experience in prostate imaging. However, for an inexperienced reader, this AI-based software improved performance. Name of registry: clinicaltrials.gov. NCT06298305. Date of registration: 2022-09.

ChatOCT: Embedded Clinical Decision Support Systems for Optical Coherence Tomography in Offline and Resource-Limited Settings.

Liu C, Zhang H, Zheng Z, Liu W, Gu C, Lan Q, Zhang W, Yang J

pubmed logopapers · May 7, 2025
Optical Coherence Tomography (OCT) is a critical imaging modality for diagnosing ocular and systemic conditions, yet its accessibility is hindered by the need for specialized expertise and high computational demands. To address these challenges, we introduce ChatOCT, an offline-capable, domain-adaptive clinical decision support system (CDSS) that integrates structured expert Q&A generation, OCT-specific knowledge injection, and activation-aware model compression. Unlike existing systems, ChatOCT functions without internet access, making it suitable for low-resource environments. ChatOCT is built upon LLaMA-2-7B, incorporating domain-specific knowledge from PubMed and OCT News through a two-stage training process: (1) knowledge injection for OCT-specific expertise and (2) Q&A instruction tuning for structured, interactive diagnostic reasoning. To ensure feasibility in offline environments, we apply activation-aware weight quantization, reducing GPU memory usage to ~4.74 GB and enabling deployment on standard OCT hardware. A novel expert answer generation framework mitigates hallucinations by structuring responses in a multi-step process, ensuring accuracy and interpretability. ChatOCT outperforms state-of-the-art baselines such as LLaMA-2, PMC-LLaMA-13B, and ChatDoctor by 10-15 points in coherence, relevance, and clinical utility, reducing GPU memory requirements by 79% while maintaining real-time responsiveness (~20 ms inference time). Expert ophthalmologists rated ChatOCT's outputs as clinically actionable and aligned with real-world decision-making needs, confirming its potential to assist frontline healthcare providers. ChatOCT represents an innovative offline clinical decision support system for optical coherence tomography (OCT) that runs entirely on local embedded hardware, enabling real-time analysis in resource-limited settings without internet connectivity.
By offering a scalable, generalizable pipeline that integrates knowledge injection, instruction tuning, and model compression, ChatOCT provides a blueprint for next-generation, resource-efficient clinical AI solutions across multiple medical domains.
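The abstract does not detail the activation-aware quantization scheme; as a generic illustration of low-bit weight quantization and the resulting memory arithmetic, the sketch below uses plain symmetric per-channel 4-bit quantization (a simpler stand-in). Weights alone for a 7B-parameter model at 4 bits come to roughly 3.3 GiB; the reported ~4.74 GB presumably also covers activations and runtime state:

```python
import numpy as np

def quantize_per_channel(W, n_bits=4):
    # Symmetric per-output-channel weight quantization to n-bit integers.
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(W).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0
    q = np.round(W / scale).astype(np.int8)   # values fit in [-qmax, qmax]
    return q, scale

def dequantize(q, scale):
    return q * scale

# Back-of-envelope weight memory for a 7B-parameter model at 4 bits:
weight_gib = 7e9 * 4 / 8 / 2**30   # bytes -> GiB, weights only
```

Activation-aware methods refine this by protecting the weight channels that matter most for typical activations, which is what lets 4-bit models keep near-original accuracy.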

Accelerated inference for thyroid nodule recognition in ultrasound imaging using FPGA.

Ma W, Wu X, Zhang Q, Li X, Wu X, Wang J

pubmed logopapers · May 7, 2025
Thyroid cancer is the most prevalent malignant tumour in the endocrine system, with its incidence steadily rising in recent years. Current central processing units (CPUs) and graphics processing units (GPUs) face significant challenges in terms of processing speed, energy consumption, cost, and scalability in the identification of thyroid nodules, making them inadequate for the demands of future green, efficient, and accessible healthcare. To overcome these limitations, this study proposes an efficient quantized inference method using a field-programmable gate array (FPGA). We employ the YOLOv4-tiny neural network model, enhancing software performance with the K-means++ optimization algorithm and improving hardware performance through techniques such as 8-bit weight quantization, batch normalization, and convolutional layer fusion. The study is based on the ZYNQ7020 FPGA platform. Experimental results demonstrate an average accuracy of 81.44% on the TN3K dataset and 81.20% on the internal test set from a Chinese tertiary hospital. The power consumption of the FPGA platform, CPU (Intel Core i5-10200H), and GPU (NVIDIA RTX 4090) was 3.119 watts, 45 watts, and 68 watts, respectively, with energy efficiency ratios of 5.45, 0.31, and 5.56. This indicates that the FPGA's energy efficiency is 17.6 times that of the CPU and 0.98 times that of the GPU. These results show that the FPGA not only significantly outperforms the CPU in speed but also consumes far less power than the GPU. Moreover, using mid-to-low-end FPGAs yields performance comparable to that of commercial-grade GPUs. This technology presents a novel solution for medical imaging diagnostics, with the potential to significantly enhance the speed, accuracy, and environmental sustainability of ultrasound image analysis, thereby supporting the future development of medical care.
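The headline efficiency comparisons are simple ratios of the reported numbers and can be checked directly (values taken from the abstract):

```python
# Reported power draw (watts) and energy-efficiency ratios per platform.
power_w = {"FPGA": 3.119, "CPU": 45.0, "GPU": 68.0}
efficiency = {"FPGA": 5.45, "CPU": 0.31, "GPU": 5.56}

# The paper's headline comparisons follow directly from the ratios:
fpga_vs_cpu = efficiency["FPGA"] / efficiency["CPU"]   # ~17.6x the CPU
fpga_vs_gpu = efficiency["FPGA"] / efficiency["GPU"]   # ~0.98x the GPU
```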