Page 16 of 99982 results

Integration of nested cross-validation, automated hyperparameter optimization, and high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models.

Calle P, Bates A, Reynolds JC, Liu Y, Cui H, Ly S, Wang C, Zhang Q, de Armendi AJ, Shettar SS, Fung KM, Tang Q, Pan C

PubMed, Sep 10, 2025
The variability and biases in the real-world performance benchmarking of deep learning models for medical imaging compromise their trustworthiness for real-world deployment. The common approach of holding out a single fixed test set fails to quantify the variance in the estimation of test performance metrics. This study introduces NACHOS (Nested and Automated Cross-validation and Hyperparameter Optimization using Supercomputing) to reduce and quantify the variance of test performance metrics of deep learning models. NACHOS integrates Nested Cross-Validation (NCV) and Automated Hyperparameter Optimization (AHPO) within a parallelized high-performance computing (HPC) framework. NACHOS was demonstrated on a chest X-ray repository and an Optical Coherence Tomography (OCT) dataset under multiple data partitioning schemes. Beyond performance estimation, DACHOS (Deployment with Automated Cross-validation and Hyperparameter Optimization using Supercomputing) is introduced to leverage AHPO and cross-validation to build the final model on the full dataset, improving expected deployment performance. The findings underscore the importance of NCV in quantifying and reducing estimation variance, AHPO in optimizing hyperparameters consistently across test folds, and HPC in ensuring computational feasibility. By integrating these methodologies, NACHOS and DACHOS provide a scalable, reproducible, and trustworthy framework for DL model evaluation and deployment in medical imaging. To maximize public availability, the full open-source codebase is provided at https://github.com/thepanlab/NACHOS.
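The nested cross-validation scheme the abstract describes can be sketched with scikit-learn. This is a generic illustration on synthetic data, not the authors' NACHOS code: the outer loop estimates test performance and its spread across folds, while the inner loop tunes hyperparameters.

```python
# Minimal nested cross-validation sketch: the outer loop estimates test
# performance (and its fold-to-fold variance), while the inner loop runs
# automated hyperparameter search on each outer training fold.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Inner loop: hyperparameter optimization, refit on the outer training fold.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=inner,
)

# Outer loop: each fold yields an independent test estimate; their spread
# quantifies the variance a single fixed hold-out set would hide.
scores = cross_val_score(search, X, y, cv=outer)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

In NACHOS, each outer/inner fold combination can run as an independent HPC job, which is what makes the otherwise quadratic cost of nesting feasible.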

Non-invasive prediction of invasive lung adenocarcinoma and high-risk histopathological characteristics in resectable early-stage adenocarcinoma by [18F]FDG PET/CT radiomics-based machine learning models: a prospective cohort study.

Cao X, Lv Z, Li Y, Li M, Hu Y, Liang M, Deng J, Tan X, Wang S, Geng W, Xu J, Luo P, Zhou M, Xiao W, Guo M, Liu J, Huang Q, Hu S, Sun Y, Lan X, Jin Y

PubMed, Sep 10, 2025
Precise preoperative discrimination of invasive lung adenocarcinoma (IA) from preinvasive lesions (adenocarcinoma in situ [AIS]/minimally invasive adenocarcinoma [MIA]) and prediction of high-risk histopathological features are critical for optimizing resection strategies in early-stage lung adenocarcinoma (LUAD). In this multicenter study, 813 LUAD patients (tumors ≤3 cm) formed the training cohort. A total of 1,709 radiomic features were extracted from the PET/CT images. Feature selection was performed using the max-relevance and min-redundancy (mRMR) algorithm and least absolute shrinkage and selection operator (LASSO). Hybrid machine learning models integrating [18F]FDG PET/CT radiomics and clinical-radiological features were developed using H2O.ai AutoML. Models were validated in a prospective internal cohort (N = 256, 2021-2022) and an external multicenter cohort (N = 418). Performance was assessed via AUC, calibration, decision curve analysis (DCA), and survival assessment. The hybrid model achieved AUCs of 0.93 (95% CI: 0.90-0.96) for distinguishing IA from AIS/MIA (internal test) and 0.92 (0.90-0.95) in external testing. For predicting high-risk histopathological features (grade III, lymphatic/pleural/vascular/nerve invasion, STAS), AUCs were 0.82 (0.77-0.88) and 0.85 (0.81-0.89) in the internal and external sets. DCA confirmed superior net benefit over the CT model. The model stratified progression-free (P = 0.002) and overall survival (P = 0.017) in the TCIA cohort. PET/CT radiomics-based models enable accurate non-invasive prediction of invasiveness and high-risk pathology in early-stage LUAD, guiding optimal surgical resection.
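The LASSO stage of the feature-selection pipeline described above can be sketched with scikit-learn (mRMR is usually applied first but has no scikit-learn builtin, so only the LASSO step is shown; the radiomics matrix here is a random stand-in, not study data):

```python
# Sketch of LASSO-based radiomic feature selection: the L1 penalty drives
# most coefficients to exactly zero, keeping a sparse feature subset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Stand-in for a radiomics matrix: rows = patients, columns = features.
X, y = make_classification(n_samples=200, n_features=100, n_informative=10,
                           random_state=0)
X = StandardScaler().fit_transform(X)  # LASSO is scale-sensitive

# LassoCV chooses the regularization strength by cross-validation.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print(f"kept {selected.size} of {X.shape[1]} features")
```

The surviving columns would then feed the downstream AutoML model search.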

Out-of-the-Box Large Language Models for Detecting and Classifying Critical Findings in Radiology Reports Using Various Prompt Strategies.

Talati IA, Chaves JMZ, Das A, Banerjee I, Rubin DL

PubMed, Sep 10, 2025
Background: The increasing complexity and volume of radiology reports present challenges for timely critical findings communication. Purpose: To evaluate the performance of two out-of-the-box LLMs in detecting and classifying critical findings in radiology reports using various prompt strategies. Methods: The analysis included 252 radiology reports of varying modalities and anatomic regions extracted from the MIMIC-III database, divided into a prompt engineering tuning set of 50 reports, a holdout test set of 125 reports, and a pool of 77 remaining reports used as examples for few-shot prompting. An external test set of 180 chest radiography reports was extracted from the CheXpert Plus database. Reports were manually reviewed to identify critical findings and classify such findings into one of three categories (true critical finding, known/expected critical finding, equivocal critical finding). Following prompt engineering using various prompt strategies, a final prompt for optimal true critical findings detection was selected. Two general-purpose LLMs, GPT-4 and Mistral-7B, processed reports in the test sets using the final prompt. Evaluation included automated text similarity metrics (BLEU-1, ROUGE-F1, G-Eval) and manual performance metrics (precision, recall). Results: For true critical findings, zero-shot, few-shot static (five examples), and few-shot dynamic (five examples) prompting yielded BLEU-1 of 0.691, 0.778, and 0.748; ROUGE-F1 of 0.706, 0.797, and 0.773; and G-Eval of 0.428, 0.573, and 0.516. Precision and recall for true critical findings, known/expected critical findings, and equivocal critical findings, in the holdout test set for GPT-4 were 90.1% and 86.9%, 80.9% and 85.0%, and 80.5% and 94.3%; in the holdout test set for Mistral-7B were 75.6% and 77.4%, 34.1% and 70.0%, and 41.3% and 74.3%; in the external test set for GPT-4 were 82.6% and 98.3%, 76.9% and 71.4%, and 70.8% and 85.0%; and in the external test set for Mistral-7B were 75.0% and 93.1%, 33.3% and 92.9%, and 34.0% and 80.0%. Conclusion: Out-of-the-box LLMs were used to detect and classify arbitrary numbers of critical findings in radiology reports. The optimal model for true critical findings entailed a few-shot static approach. Clinical Impact: The study shows a role of contemporary general-purpose models in adapting to specialized medical tasks using minimal data annotation.
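The BLEU-1 scores reported above boil down to clipped unigram precision with a brevity penalty. A simplified stand-alone version (an illustration, not the paper's evaluation code, which would use a standard library implementation):

```python
# Rough sketch of BLEU-1: clipped unigram precision times a brevity
# penalty that punishes candidates shorter than the reference.
import math
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clip each unigram's count by its count in the reference.
    overlap = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = overlap / len(cand)
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

print(bleu1("acute intracranial hemorrhage", "acute intracranial hemorrhage"))  # 1.0
```

A perfect match scores 1.0; extra, missing, or substituted words pull the score toward 0, which is why few-shot prompting that matched the reference phrasing scored higher above.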

Integrating Perfusion with AI-derived Coronary Calcium on CT attenuation scans to improve selection of low-risk studies for stress-only SPECT MPI.

Miller RJH, Barrett O, Shanbhag A, Rozanski A, Dey D, Lemley M, Van Kriekinge SD, Kavanagh PB, Feher A, Miller EJ, Einstein AJ, Ruddy TD, Bateman T, Kaufmann PA, Liang JX, Berman DS, Slomka PJ

PubMed, Sep 10, 2025
In many contemporary laboratories, a completely normal stress perfusion SPECT MPI is required to cancel rest imaging. We hypothesized that an artificial intelligence (AI)-derived CAC score of 0 from computed tomography attenuation correction (CTAC) scans obtained during hybrid SPECT/CT may identify additional patients at low risk of major adverse cardiovascular events (MACE) who could be selected for stress-only imaging. Patients without known coronary artery disease who underwent SPECT/CT MPI and had stress total perfusion deficit (TPD) < 5% were included. Stress TPD was categorized as no abnormality (stress TPD 0%) or minimal abnormality (stress TPD 1-4%). CAC was automatically quantified from the CTAC scans. We evaluated associations with MACE. In total, 6,884 patients (49.4% male; median age 63 years) were included. Of these, 9.7% experienced MACE (15% non-fatal MI, 2.7% unstable angina, 38.5% coronary revascularization, and 43.8% deaths). Compared to patients with TPD 0%, those with TPD 1-4% and CAC 0 had lower MACE risk (hazard ratio [HR] 0.58; 95% confidence interval [CI] 0.45-0.76), while those with TPD 1-4% and a CAC score > 0 had higher MACE risk (HR 1.90; 95% CI 1.56-2.30). Compared to canceling rest scans only in patients with normal perfusion (TPD 0%), canceling rest scans also in patients with CAC 0 would allow more than twice as many rest scans (55% vs 25%) to be canceled. Using AI-derived CAC 0 on CTAC scans from hybrid SPECT/CT in patients with stress TPD < 5% can double the proportion of patients in whom stress-only procedures could be safely performed.
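The two rest-scan cancelation rules being compared reduce to a simple predicate per patient. A toy sketch with hypothetical records (illustrative values only, not study data):

```python
# Toy comparison of the two stress-only selection rules: perfusion-only
# (cancel rest scan iff stress TPD == 0%) versus perfusion plus
# AI-derived calcium (also cancel when TPD is 1-4% and CAC == 0).
patients = [
    {"tpd": 0, "cac": 0},    # normal perfusion
    {"tpd": 2, "cac": 0},    # minimal abnormality, zero calcium
    {"tpd": 3, "cac": 150},  # minimal abnormality, calcium present
    {"tpd": 1, "cac": 0},    # minimal abnormality, zero calcium
]

def cancel_rest_perfusion_only(p):
    return p["tpd"] == 0

def cancel_rest_with_cac(p):
    return p["tpd"] == 0 or (1 <= p["tpd"] <= 4 and p["cac"] == 0)

n_only = sum(map(cancel_rest_perfusion_only, patients))
n_cac = sum(map(cancel_rest_with_cac, patients))
print(n_only, n_cac)  # the CAC-augmented rule cancels strictly more scans
```

In the study's cohort the augmented rule roughly doubled the cancelable fraction (55% vs 25%) without raising MACE risk in the added group.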

Vision-Language Semantic Aggregation Leveraging Foundation Model for Generalizable Medical Image Segmentation

Wenjun Yu, Yinchen Zhou, Jia-Xuan Jiang, Shubin Zeng, Yuee Li, Zhong Wang

arXiv preprint, Sep 10, 2025
Multimodal models have achieved remarkable success in natural image segmentation, yet they often underperform when applied to the medical domain. Through extensive study, we attribute this performance gap to the challenges of multimodal fusion, primarily the significant semantic gap between abstract textual prompts and fine-grained medical visual features, as well as the resulting feature dispersion. To address these issues, we revisit the problem from the perspective of semantic aggregation. Specifically, we propose an Expectation-Maximization (EM) Aggregation mechanism and a Text-Guided Pixel Decoder. The former mitigates feature dispersion by dynamically clustering features into compact semantic centers to enhance cross-modal correspondence. The latter is designed to bridge the semantic gap by leveraging domain-invariant textual knowledge to effectively guide deep visual representations. The synergy between these two mechanisms significantly improves the model's generalization ability. Extensive experiments on public cardiac and fundus datasets demonstrate that our method consistently outperforms existing SOTA approaches across multiple domain generalization benchmarks.
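The EM-style semantic aggregation described above can be sketched as soft clustering of feature vectors into a few centers. This is a generic NumPy illustration of the E-step/M-step alternation, not the paper's implementation:

```python
# Minimal EM-style aggregation: features are softly assigned to a small
# set of cluster centers (E-step), and centers are re-estimated as
# responsibility-weighted means (M-step), yielding compact semantic centers.
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(256, 32))                       # stand-in pixel features
centers = feats[rng.choice(256, size=4, replace=False)]  # 4 semantic centers

for _ in range(10):
    # E-step: soft responsibilities from negative squared distances,
    # shifted per row for numerical stability before exponentiation.
    d2 = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    resp = np.exp(-(d2 - d2.min(axis=1, keepdims=True)))
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: centers become responsibility-weighted feature means.
    centers = (resp.T @ feats) / resp.sum(axis=0)[:, None]

print(centers.shape)  # a compact dictionary of semantic centers
```

Clustering dispersed features onto a handful of centers like this is what tightens the correspondence between visual features and the textual prompts.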

Individual hearts: computational models for improved management of cardiovascular disease.

van Osta N, van Loon T, Lumens J

PubMed, Sep 9, 2025
Cardiovascular disease remains a leading cause of morbidity and mortality worldwide, with conventional management often applying standardised approaches that struggle to address individual variability in increasingly complex patient populations. Computational models, both knowledge-driven and data-driven, have the potential to reshape cardiovascular medicine by offering innovative tools that integrate patient-specific information with physiological understanding or statistical inference to generate insights beyond conventional diagnostics. This review traces how computational modelling has evolved from theoretical research tools into clinical decision support systems that enable personalised cardiovascular care. We examine this evolution across three key domains: enhancing diagnostic accuracy through improved measurement techniques, deepening mechanistic insights into cardiovascular pathophysiology and enabling precision medicine through patient-specific simulations. The review covers the complementary strengths of data-driven approaches, which identify patterns in large clinical datasets, and knowledge-driven models, which simulate cardiovascular processes based on established biophysical principles. Applications range from artificial intelligence-guided measurements and model-informed diagnostics to digital twins that enable in silico testing of therapeutic interventions in the digital replicas of individual hearts. This review outlines the main types of cardiovascular modelling, highlighting their strengths, limitations and complementary potential through current clinical and research applications. We also discuss future directions, emphasising the need for interdisciplinary collaboration, pragmatic model design and integration of hybrid approaches. While progress is promising, challenges remain in validation, regulatory approval and clinical workflow integration. 
With continued development and thoughtful implementation, computational models hold the potential to enable more informed decision-making and advance truly personalised cardiovascular care.

A comprehensive review of techniques, algorithms, advancements, challenges, and clinical applications of multi-modal medical image fusion for improved diagnosis.

Zubair M, Hussain M, Albashrawi MA, Bendechache M, Owais M

PubMed, Sep 9, 2025
Multi-modal medical image fusion (MMIF) is increasingly recognized as an essential technique for enhancing diagnostic precision and facilitating effective clinical decision-making within computer-aided diagnosis systems. MMIF combines data from X-ray, MRI, CT, PET, SPECT, and ultrasound to create detailed, clinically useful images of patient anatomy and pathology. These integrated representations significantly advance diagnostic accuracy, lesion detection, and segmentation. This comprehensive review surveys the evolution, methodologies, algorithms, current advancements, and clinical applications of MMIF. We present a critical comparative analysis of traditional fusion approaches, including pixel-, feature-, and decision-level methods, and delve into recent advancements driven by deep learning, generative models, and transformer-based architectures, highlighting differences in robustness, computational efficiency, and interpretability between conventional methods and contemporary techniques. The article addresses extensive clinical applications across oncology, neurology, and cardiology, demonstrating MMIF's vital role in precision medicine through improved patient-specific therapeutic outcomes. Moreover, the review thoroughly investigates the persistent challenges affecting MMIF's broad adoption, including issues related to data privacy, heterogeneity, computational complexity, interpretability of AI-driven algorithms, and integration within clinical workflows. It also identifies significant future research avenues, such as the integration of explainable AI, adoption of privacy-preserving federated learning frameworks, development of real-time fusion systems, and standardization efforts for regulatory compliance.
This review organizes key knowledge, outlines challenges, and highlights opportunities, guiding researchers, clinicians, and developers in advancing MMIF for routine clinical use and promoting personalized healthcare. To support further research, we provide a shared GitHub repository that includes popular multi-modal medical imaging datasets along with recent models.
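Of the fusion families the review surveys, pixel-level fusion is the simplest to sketch: a per-pixel weighted average of co-registered modality images, with weights from a local activity measure. A toy NumPy illustration on random stand-in images (modern MMIF uses learned, multi-scale fusion instead):

```python
# Toy pixel-level fusion: each output pixel is a weighted average of the
# two input modalities, weighted by local variance (a crude saliency cue).
import numpy as np

rng = np.random.default_rng(0)
mri = rng.random((64, 64))  # stand-ins for co-registered MRI / PET slices
pet = rng.random((64, 64))

def local_activity(img, k=3):
    # Local variance over a k x k window via a sliding-window view.
    pad = k // 2
    p = np.pad(img, pad, mode="reflect")
    win = np.lib.stride_tricks.sliding_window_view(p, (k, k))
    return win.var(axis=(-2, -1))

w_mri = local_activity(mri)
w_pet = local_activity(pet)
# Pixels where a modality is locally "busier" contribute more.
fused = (w_mri * mri + w_pet * pet) / (w_mri + w_pet + 1e-12)
print(fused.shape)
```

Feature- and decision-level fusion move the same weighting idea up the pipeline, combining extracted features or per-model predictions rather than raw pixels.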

Intraoperative 2D/3D Registration via Spherical Similarity Learning and Inference-Time Differentiable Levenberg-Marquardt Optimization

Minheng Chen, Youyong Kong

arXiv preprint, Sep 8, 2025
Intraoperative 2D/3D registration aligns preoperative 3D volumes with real-time 2D radiographs, enabling accurate localization of instruments and implants. A recent fully differentiable similarity learning framework approximates geodesic distances on SE(3), expanding the capture range of registration and mitigating the effects of substantial disturbances, but existing Euclidean approximations distort manifold structure and slow convergence. To address these limitations, we explore similarity learning in non-Euclidean spherical feature spaces to better capture and fit complex manifold structure. We extract feature embeddings using a CNN-Transformer encoder, project them into spherical space, and approximate their geodesic distances with Riemannian distances in the bi-invariant SO(4) space. This enables a more expressive and geometrically consistent deep similarity metric, enhancing the ability to distinguish subtle pose differences. During inference, we replace gradient descent with fully differentiable Levenberg-Marquardt optimization to accelerate convergence. Experiments on real and synthetic datasets show superior accuracy in both patient-specific and patient-agnostic scenarios.
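The core spherical-similarity idea above, embeddings projected onto a unit hypersphere where distance is arc length, can be sketched in a few lines. This is a generic illustration of geodesic distance on the sphere, not the paper's bi-invariant SO(4) formulation:

```python
# Embeddings are normalized onto the unit sphere; the geodesic distance
# between two points u, v is then the arc length arccos(<u, v>).
import numpy as np

def to_sphere(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def geodesic(u, v):
    # Clip guards arccos against floating-point drift outside [-1, 1].
    return np.arccos(np.clip(u @ v, -1.0, 1.0))

a = to_sphere(np.array([1.0, 2.0, 2.0]))
b = to_sphere(np.array([2.0, 1.0, 2.0]))

print(geodesic(a, a))  # identical embeddings: distance ~0
print(geodesic(a, b))  # nearby embeddings: small arc length
```

Training a similarity head against such a geodesic target, rather than a Euclidean one, is what preserves the manifold structure that the authors argue flat approximations distort.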

New imaging techniques and trends in radiology.

Kantarcı M, Aydın S, Oğul H, Kızılgöz V

PubMed, Sep 8, 2025
Radiology is a field of medicine inherently intertwined with technology. Obtaining images with ultrasound (US), computed tomography (CT), and magnetic resonance imaging (MRI) depends heavily on it. Although radiation dose reduction is not applicable to US and MRI, advancements in technology have made it possible in CT, with ongoing studies aimed at further optimization. The resolution and diagnostic quality of images obtained through advancements in each modality are steadily improving. Additionally, technological progress has significantly shortened acquisition times for CT and MRI. The use of artificial intelligence (AI), which is becoming increasingly widespread worldwide, has also been incorporated into radiology. This technology can produce more accurate and reproducible results in US examinations. Machine learning offers great potential for improving image quality, creating more distinct and useful images, and even developing new US imaging modalities. Furthermore, AI technologies are increasingly prevalent in CT and MRI for image evaluation, image generation, and enhanced image quality.

Curia: A Multi-Modal Foundation Model for Radiology

Corentin Dancette, Julien Khlaut, Antoine Saporta, Helene Philippe, Elodie Ferreres, Baptiste Callard, Théo Danielou, Léo Alberge, Léo Machado, Daniel Tordjman, Julie Dupuis, Korentin Le Floch, Jean Du Terrail, Mariam Moshiri, Laurent Dercle, Tom Boeken, Jules Gregory, Maxime Ronot, François Legou, Pascal Roux, Marc Sapoval, Pierre Manceron, Paul Hérent

arXiv preprint, Sep 8, 2025
AI-assisted radiological interpretation is based on predominantly narrow, single-task models. This approach is impractical for covering the vast spectrum of imaging modalities, diseases, and radiological findings. Foundation models (FMs) hold the promise of broad generalization across modalities and in low-data settings. However, this potential has remained largely unrealized in radiology. We introduce Curia, a foundation model trained on the entire cross-sectional imaging output of a major hospital over several years, which to our knowledge is the largest such corpus of real-world data, encompassing 150,000 exams (130 TB). On a newly curated 19-task external validation benchmark, Curia accurately identifies organs, detects conditions like brain hemorrhages and myocardial infarctions, and predicts outcomes in tumor staging. Curia meets or surpasses the performance of radiologists and recent foundation models, and exhibits clinically significant emergent properties in cross-modality and low-data regimes. To accelerate progress, we release our base model's weights at https://huggingface.co/raidium/curia.
