Latest Papers on Radiology AI. Sources: medrxiv, Tags: Clinical Pilot.

The HeartMagic prospective observational study protocol - characterizing subtypes of heart failure with preserved ejection fraction

Meyer, P., Rocca, A., Banus, J., Ogier, A. C., Georgantas, C., Calarnou, P., Fatima, A., Vallee, J.-P., Deux, J.-F., Thomas, A., Marquis, J., Monney, P., Lu, H., Ledoux, J.-B., Tillier, C., Crowe, L. A., Abdurashidova, T., Richiardi, J., Hullin, R., van Heeswijk, R. B.

•preprint•Sep 16 2025

Introduction: Heart failure (HF) is a life-threatening syndrome with significant morbidity and mortality. While evidence-based drug treatments have effectively reduced morbidity and mortality in HF with reduced ejection fraction (HFrEF), few therapies have been demonstrated to improve outcomes in HF with preserved ejection fraction (HFpEF). The multifaceted clinical presentation is one of the main reasons why the current understanding of HFpEF remains limited. This may be caused by the existence of several HFpEF disease subtypes that each need different treatments. There is therefore an unmet need for a holistic approach that combines comprehensive imaging with metabolomic, transcriptomic and genomic mapping to subtype HFpEF patients. This protocol details the approach employed in the HeartMagic study to address this gap in understanding. Methods: This prospective multi-center observational cohort study will include 500 consecutive patients with current or recent hospitalization for treatment of HFpEF at two Swiss university hospitals, along with 50 age-matched HFrEF patients and 50 age-matched healthy controls. Diagnosis of heart failure is based on clinical signs and symptoms, and subgrouping of HF patients is based on the left-ventricular ejection fraction. In addition to routine clinical workup, participants undergo genomic, transcriptomic, and metabolomic analyses, while the anatomy, composition, and function of the heart are quantified by comprehensive echocardiography and magnetic resonance imaging (MRI). Quantitative MRI is also applied to characterize the kidney. The primary outcome is a composite of one-year cardiovascular mortality or rehospitalization. Machine learning (ML) based multi-modal clustering will be employed to identify distinct HFpEF subtypes in the holistic data. The clinical importance of these subtypes will be evaluated based on their association with the primary outcome.
Statistical analysis will include group comparisons across modalities, survival analysis for the primary outcome, and integrative multi-modal clustering combining clinical, imaging, ECG, genomic, transcriptomic, and metabolomic data to identify and validate HFpEF subtypes. Discussion: The integration of comprehensive MRI with extensive genomic and metabolomic profiling in this study will result in an unprecedented panoramic view of HFpEF and should enable us to distinguish functional subgroups of HFpEF patients. This approach has the potential to provide new insights into HFpEF and should provide a basis for personalized therapies. Beyond this, identifying HFpEF subtypes with specific molecular and structural characteristics could lead to new targeted pharmacological interventions, with the potential to improve patient outcomes.

MRI Classification Cardiac Prospective Clinical Pilot Academic Lab GenAI

Ultra-low-field MRI for imaging of severe multiple sclerosis: a case-controlled study

Bergsland, N., Burnham, A., Dwyer, M. G., Bartnik, A., Schweser, F., Kennedy, C., Tranquille, A., Semy, M., Schnee, E., Young-Hong, D., Eckert, S., Hojnacki, D., Reilly, C., Benedict, R. H., Weinstock-Guttman, B., Zivadinov, R.

•preprint•Aug 27 2025

Background: Severe multiple sclerosis (MS) presents challenges for clinical research due to mobility constraints and specialized care needs. Traditional MRI studies often exclude this population, limiting understanding of severe MS progression. Portable, ultra-low-field MRI enables bedside imaging. Objectives: To (i) assess the feasibility of portable MRI in severe MS, and (ii) compare measurement approaches for automated tissue volumetry from ultra-low-field MRI. Methods: This prospective study enrolled 40 progressive MS patients (24 severely disabled, 16 less severe) from academic and skilled nursing settings. Participants underwent 0.064T MRI for tissue volumetry using conventional and artificial intelligence (AI)-driven segmentation. Clinical assessments included physical disability and cognition. Group comparisons and MRI-clinical associations were assessed. Results: MRI passed rigorous quality control, reflecting complete brain coverage and lack of motion artifact, in 38/40 participants. In terms of severe versus less severe disease, the largest effect sizes were obtained with conventionally calculated gray matter (GM) volume (partial η²=0.360), cortical GM volume (partial η²=0.349), and whole brain volume (partial η²=0.290), while an AI-based approach yielded the highest effect size for white matter volume (partial η²=0.209). For clinical outcomes, the most consistent associations were found using conventional processing, while AI-based methods were dependent on algorithm and input image, especially for cortical GM volume. Conclusion: Portable, ultra-low-field MRI is a feasible bedside tool that can provide insights into late-stage neurodegeneration in individuals living with severe MS. However, careful consideration is required in implementing tissue volumetry pipelines, as findings are heavily dependent on the choice of algorithm and input.

MRI Segmentation Neurological Prospective Clinical Pilot Academic Lab

Using deep learning methods to shorten acquisition time in children's renal cortical imaging

Gan, C., Niu, P., Pan, B., Chen, X., Xu, L., Huang, K., Chen, H., Wang, Q., Ding, L., Yin, Y., Wu, S., Gong, N.-j.

•preprint•Aug 13 2025

Purpose: This study evaluates the capability of diffusion-based generative models to reconstruct diagnostic-quality renal cortical images from reduced-acquisition-time pediatric 99mTc-DMSA scintigraphy. Materials and Methods: A prospective study was conducted on 99mTc-DMSA scintigraphic data from consecutive pediatric patients with suspected urinary tract infections (UTIs) acquired between November 2023 and October 2024. A diffusion model, SR3, was trained to reconstruct standard-quality images from simulated reduced-count data. Performance was benchmarked against U-Net, U2-Net, Restormer, and a Poisson-based variant of SR3 (PoissonSR3). Quantitative assessment employed peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), Fréchet inception distance (FID), and learned perceptual image patch similarity (LPIPS). Renal contrast and anatomic fidelity were assessed using the target-to-background ratio (TBR) and the Dice similarity coefficient, respectively. Wilcoxon signed-rank tests were used for statistical analysis. Results: The training cohort comprised 94 participants (mean age 5.16±3.90 years; 48 male) with corresponding Poisson-downsampled images, while the test cohort included 36 patients (mean age 6.39±3.16 years; 14 male). SR3 outperformed all models, achieving the best PSNR (30.976±2.863, P<.001), SSIM (0.760±0.064, P<.001), FID (25.687±16.223, P<.001), and LPIPS (0.055±0.022, P<.001). Further, SR3 maintained excellent renal contrast (TBR: left kidney 7.333±2.176; right kidney 7.156±1.808) and anatomical consistency (Dice coefficient: left kidney 0.749±0.200; right kidney 0.745±0.176), representing significant improvements over the fast scan (all P < .001). While Restormer, U-Net, and PoissonSR3 showed statistically significant improvements across all metrics, U2-Net exhibited limited improvement restricted to SSIM and left kidney TBR (P < .001).
Conclusion: SR3 enables high-quality reconstruction of 99mTc-DMSA images from 4-fold accelerated acquisitions, demonstrating potential for substantial reduction in imaging duration while preserving both diagnostic image quality and renal anatomical integrity.
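The abstract does not say how the reduced-count images were simulated beyond calling them Poisson-downsampled. One common way to mimic a shorter acquisition is binomial thinning, where each recorded count is kept with a fixed probability; thinning Poisson counts yields Poisson counts at the lower rate. The following is a minimal sketch under that assumption, together with a PSNR function. The names `thin_counts` and `psnr`, the 3x3 toy image, and the 0.25 keep fraction (standing in for a 4-fold shorter scan) are illustrative and not taken from the study.

```python
import math
import random


def thin_counts(image, keep_fraction, rng):
    """Binomial thinning: keep each recorded count with probability keep_fraction."""
    return [
        [sum(1 for _ in range(count) if rng.random() < keep_fraction) for count in row]
        for row in image
    ]


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf
    return 10 * math.log10(data_range ** 2 / mse)


# Hypothetical 3x3 count image standing in for a standard-duration scan.
full_scan = [[2, 5, 3], [8, 40, 9], [3, 6, 2]]
rng = random.Random(0)

# Simulate a 4-fold shorter acquisition, then rescale to the original count level.
fast_scan = thin_counts(full_scan, 0.25, rng)
rescaled = [[4 * count for count in row] for row in fast_scan]

print("PSNR of rescaled fast scan (dB):", psnr(full_scan, rescaled, data_range=40))
```

A reconstruction model would be scored the same way, with its output in place of `rescaled`.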

SPECT Reconstruction Abdominal Prospective Clinical Pilot Academic Lab

Artificial Intelligence quantified prostate specific membrane antigen imaging in metastatic castrate-resistant prostate cancer patients treated with Lutetium-177-PSMA-617

Yu, S. L., Wang, X., Wen, S., Holler, S., Bodkin, M., Kolodney, J., Najeeb, S., Hogan, T.

•preprint•Aug 12 2025

PURPOSE: The VISION study¹ found that Lutetium-177 (177Lu)-PSMA-617 ("Lu-177") improved overall survival in metastatic castrate-resistant prostate cancer (mCRPC). We assessed whether artificial intelligence-enhanced PSMA imaging in mCRPC patients starting Lu-177 could identify those with better treatment outcomes. PATIENTS AND METHODS: We conducted a single-site, tertiary-center, retrospective cohort study in 51 consecutive mCRPC patients treated 2022-2024 with Lu-177. These patients had received most standard treatments, with disease progression. Planned treatment was Lu-177 every 6 weeks while continuing androgen deprivation therapy. Before starting treatment, PSMA images were analyzed for SUVmax and quantified tumor volume using artificial intelligence software (aPROMISE, Exinni Inc.). RESULTS: Fifty-one mCRPC patients were treated with Lu-177; 33 (65%) received 4 or more treatment cycles, and these 33 had Kaplan-Meier median overall survival (OS) of 19.3 months, with 23 (70%) surviving at the 24-month data analysis. At first cycle Lu-177, these 33 had significantly more favorable levels of serum albumin, alkaline phosphatase, calcium, glucose, prostate specific antigen (PSA), ECOG performance status, and F18 PSMA imaging SUV-maximum values - reflecting PSMA "target expression". In a "protocol-eligibility" analysis, 30 of the 51 patients (59%) were considered "protocol-eligible" and 21 (41%) "protocol-ineligible" based on initial clinical parameters, as defined in Methods. "Protocol-eligible" patients had OS of 14.6 months and 63% survival at 24 months. AI-enhanced F18 PSMA quantified imaging found "protocol-eligible" tumor volume in mL to be only 39% of the volume in "ineligible" patients. CONCLUSION: In this cohort of mCRPC patients receiving Lu-177, pre-treatment AI-assisted F18 PSMA imaging findings of higher PSMA SUV and lower tumor volume were associated with the patients' ability to receive four or more treatment cycles, protocol eligibility, and better overall survival.
KEY POINTS: Question: In mCRPC patients initiating Lu-177 therapy, can AI-assisted F18 PSMA imaging identify patients who have better treatment outcomes? Findings: Thirty-three (65%) of a cohort of 51 consecutive mCRPC patients were able to receive 4 or more cycles of Lu-177. These patients had significantly more favorable serum albumin, alkaline phosphatase, calcium, glucose, PSA, performance status, and higher AI-PSMA scan SUV-maximum values, with a trend toward lower PSMA tumor volumes in mL. They had Kaplan-Meier median OS of 19.3 months and 70% survived at 24 months. AI-enhanced PSMA tumor volumes (mL) in "protocol eligible" patients were significantly lower - only 40% - than tumor volumes of "protocol ineligible" patients. Meaning: In this cohort of mCRPC patients receiving Lu-177, pre-treatment AI-assisted F18 PSMA imaging findings of higher PSMA SUV and lower tumor volume were associated with the patients' ability to receive four or more treatment cycles, protocol eligibility, and better overall survival.
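For readers unfamiliar with how a Kaplan-Meier median overall survival is obtained, here is a minimal sketch of the product-limit estimator. The follow-up times and event flags are invented toy values, not data from this cohort, and the function names `kaplan_meier` and `median_survival` are illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient (e.g. months)
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented toy follow-up data (months); 0 marks a censored patient.
toy_times = [6, 7, 10, 15, 19, 25]
toy_events = [1, 0, 1, 1, 0, 1]

toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print("Median survival (months):", median_survival(toy_curve))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple fraction of deaths.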

PET Segmentation Abdominal Retrospective Clinical Clinical Pilot Startup

CAPoxy: a feasibility study to investigate multispectral imaging in nailfold capillaroscopy

Taylor-Williams, M., Khalil, I., Manning, J., Dinsdale, G., Berks, M., Porcu, L., Wilkinson, S., Bohndiek, S., Murray, A.

•preprint•Aug 5 2025

Background: Nailfold capillaroscopy enables visualisation of structural abnormalities in the microvasculature of patients with systemic sclerosis (SSc). The objective of this feasibility study was to determine whether multispectral imaging (MSI) could provide functional assessment (differences in haemoglobin concentration or oxygenation) of capillaries to aid discrimination between healthy controls and patients with SSc. MSI of nailfold capillaries visualizes the smallest blood vessels and the impact of SSc on angiogenesis and their deformation, making it suitable for evaluating oxygenation-sensitive imaging techniques. Imaging of the nailfold capillaries offers tissue-specific oxygenation information, unlike pulse oximetry, which measures arterial blood oxygenation as a single-point measurement. Methods: The CAPoxy study was a single-centre, cross-sectional feasibility study of nailfold capillary multispectral imaging, comparing a cohort of patients with SSc to controls. A nine-band multispectral camera was used to image 22 individuals (10 patients with SSc and 12 controls). Linear mixed-effects models and summary statistics were used to compare the different regions of the nailfold (capillaries, surrounding edges, and outside area) between SSc and controls. A machine learning model was used to compare the two groups. Results: Patients with SSc exhibited higher indicators of haemoglobin concentration in the capillary and adjacent regions compared to controls, which were significant in the regions surrounding the capillaries (p<0.001). There were also spectral differences between the SSc and control groups that could indicate differences in oxygenation of the capillaries and surrounding tissue. Additionally, a machine learning model distinguished SSc patients from healthy controls with an accuracy of 84%, suggesting potential for multispectral imaging to classify SSc based on structural and functional microvascular changes.
Conclusions: The data indicate that multispectral imaging differentiates patients with SSc from controls based on differences in vascular function. Further work to develop a targeted spectral camera would improve the contrast between patients with SSc and controls, enabling better imaging. Key messages: Multispectral imaging holds promise for providing functional oxygenation measurement in nailfold capillaroscopy. Significant oxygenation differences between individuals with systemic sclerosis and healthy controls can be detected with multispectral imaging in the tissue surrounding capillaries.

Mixed Modality Classification Retrospective Clinical Clinical Pilot Academic Lab

The impacts of artificial intelligence on the workload of diagnostic radiology services: A rapid review and stakeholder contextualisation

Sutton, C., Prowse, J., Elshehaly, M., Randell, R.

•preprint•Jul 24 2025

Background: Advancements in imaging technology, alongside increasing longevity and co-morbidities, have led to heightened demand for diagnostic radiology services. However, there is a shortfall in radiology and radiography staff to acquire, read and report on such imaging examinations. Artificial intelligence (AI) has been identified, notably by AI developers, as a potential way to reduce the workload of diagnostic radiology services and so address this staffing shortfall. Methods: A rapid review complemented with data from interviews with UK radiology service stakeholders was undertaken. ArXiv, Cochrane Library, Embase, Medline and Scopus databases were searched for publications in English published between 2007 and 2022. Following screening, 110 full texts were included. Interviews with 15 radiology service managers, clinicians and academics were carried out between May and September 2022. Results: Most literature was published in 2021 and 2022, with a distinct focus on AI for diagnostics of lung and chest disease (n = 25), notably COVID-19 and respiratory system cancers, closely followed by AI for breast screening (n = 23). AI contributions to streamlining the workload of radiology services were categorised as autonomous, augmentative and assistive. However, percentage estimates of workload reduction varied considerably, with the most significant reduction identified in national screening programmes. AI was also recognised as aiding radiology services by providing a second opinion, assisting in prioritisation of images for reading, and improving quantification in diagnostics. Stakeholders saw AI as having the potential to remove some of the laborious work and contribute to service resilience. Conclusions: This review has shown there is limited data on real-world experiences from radiology services for the implementation of AI in clinical production.
Autonomous, augmentative and assistive AI can, as noted in the article, decrease workload and aid reading and reporting; however, the governance surrounding these advancements lags.

Mixed Modality Classification Review Clinical Pilot Academic Lab Policy Ethics

A View-Agnostic Deep Learning Framework for Comprehensive Analysis of 2D-Echocardiography

Anisuzzaman, D. M., Malins, J. G., Jackson, J. I., Lee, E., Naser, J. A., Rostami, B., Bird, J. G., Spiegelstein, D., Amar, T., Ngo, C. C., Oh, J. K., Pellikka, P. A., Thaden, J. J., Lopez-Jimenez, F., Poterucha, T. J., Friedman, P. A., Pislaru, S., Kane, G. C., Attia, Z. I.

•preprint•Jul 11 2025

Echocardiography traditionally requires experienced operators to select and interpret clips from specific viewing angles. Clinical decision-making is therefore limited for handheld cardiac ultrasound (HCU), which is often collected by novice users. In this study, we developed a view-agnostic deep learning framework to estimate left ventricular ejection fraction (LVEF), patient age, and patient sex from any of several views containing the left ventricle. Model performance was: (1) consistently strong across retrospective transthoracic echocardiography (TTE) datasets; (2) comparable between prospective HCU versus TTE (625 patients; LVEF r² 0.80 vs. 0.86, LVEF [> or ≤40%] AUC 0.981 vs. 0.993, age r² 0.85 vs. 0.87, sex classification AUC 0.985 vs. 0.996); (3) comparable between prospective HCU data collected by experts versus novice users (100 patients; LVEF r² 0.78 vs. 0.66, LVEF AUC 0.982 vs. 0.966). This approach may broaden the clinical utility of echocardiography by lessening the need for user expertise in image acquisition.
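The AUC values above summarise how well a continuous model output separates two groups, such as LVEF ≤40% versus >40%. As a minimal sketch, the AUC can be computed as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. The function name `pairwise_auc` and the toy scores are illustrative assumptions, not the authors' evaluation code.

```python
def pairwise_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    A score should be higher for cases more likely to be positive. For a
    'reduced LVEF' label, one could for example use the negated predicted LVEF.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented toy scores: higher means "more likely reduced LVEF".
reduced = [0.9, 0.8]
preserved = [0.1, 0.85]
print("AUC:", pairwise_auc(reduced, preserved))
```

This quadratic-time form is only suitable for small samples; rank-based implementations give the same value more efficiently.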

Ultrasound Classification Cardiac Prospective Clinical Pilot Academic Lab Benchmark SOTA

Cardiac Measurement Calculation on Point-of-Care Ultrasonography with Artificial Intelligence

Mercaldo, S. F., Bizzo, B. C., Sadore, T., Halle, M. A., MacDonald, A. L., Newbury-Chaet, I., L'Italien, E., Schultz, A. S., Tam, V., Hegde, S. M., Mangion, J. R., Mehrotra, P., Zhao, Q., Wu, J., Hillis, J.

•preprint•Jun 28 2025

Introduction: Point-of-care ultrasonography (POCUS) enables clinicians to obtain critical diagnostic information at the bedside, especially in resource-limited settings. This information may include 2D cardiac quantitative data, although measuring the data manually can be time-consuming and subject to user experience. Artificial intelligence (AI) can potentially automate this quantification. This study assessed the interpretation of key cardiac measurements on POCUS images by an AI-enabled device (AISAP Cardio V1.0). Methods: This retrospective diagnostic accuracy study included 200 POCUS cases from four hospitals (two in Israel and two in the United States). Each case was independently interpreted by three cardiologists and the device for seven measurements (left ventricular (LV) ejection fraction, inferior vena cava (IVC) maximal diameter, left atrial (LA) area, right atrial (RA) area, LV end diastolic diameter, right ventricular (RV) fractional area change and aortic root diameter). The endpoints were the root mean square error (RMSE) of the device compared to the average cardiologist measurement (LV ejection fraction and IVC maximal diameter were primary endpoints; the other measurements were secondary endpoints). Predefined passing criteria were based on the upper bounds of the RMSE 95% confidence intervals (CIs). The inter-cardiologist RMSE was also calculated for reference. Results: The device achieved the passing criteria for six of the seven measurements. While not achieving the passing criterion for RV fractional area change, it still achieved a better RMSE than the inter-cardiologist RMSE.
The RMSE was 6.20% (95% CI: 5.57 to 6.83; inter-cardiologist RMSE of 8.23%) for LV ejection fraction, 0.25 cm (95% CI: 0.20 to 0.29; 0.36 cm) for IVC maximal diameter, 2.39 cm² (95% CI: 1.96 to 2.82; 4.39 cm²) for LA area, 2.11 cm² (95% CI: 1.75 to 2.47; 3.49 cm²) for RA area, 5.06 mm (95% CI: 4.58 to 5.55; 4.67 mm) for LV end diastolic diameter, 10.17% (95% CI: 9.01 to 11.33; 14.12%) for RV fractional area change and 0.19 cm (95% CI: 0.16 to 0.21; 0.24 cm) for aortic root diameter. Discussion: The device accurately calculated these cardiac measurements, especially when benchmarked against inter-cardiologist variability. Its use could assist clinicians who utilize POCUS and better enable their clinical decision-making.
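The endpoint described above compares the device against the mean of the three cardiologist readings and checks the upper bound of a 95% CI against a predefined limit. The abstract does not state how the CI was derived or what the limits were, so the sketch below assumes a percentile bootstrap over cases and uses an invented limit; the function names and toy readings are illustrative only.

```python
import math
import random


def rmse(predicted, reference):
    """Root mean square error between paired measurements."""
    return math.sqrt(
        sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)
    )


def bootstrap_rmse_ci(predicted, reference, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for RMSE, resampling cases with replacement.

    Assumption: the study's actual CI method is not given in the abstract.
    """
    rng = random.Random(seed)
    n = len(predicted)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(rmse([predicted[i] for i in idx], [reference[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented toy LVEF readings (%): three cardiologists and the device, per case.
cardiologist_reads = [(55, 60, 58), (35, 30, 40), (62, 65, 60), (45, 50, 48), (25, 28, 30)]
device_reads = [57, 38, 60, 52, 26]

reference = [sum(reads) / len(reads) for reads in cardiologist_reads]
point = rmse(device_reads, reference)
ci_low, ci_high = bootstrap_rmse_ci(device_reads, reference)

HYPOTHETICAL_LIMIT = 10.0  # invented passing limit, not the study's criterion
print(f"RMSE {point:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print("Passes:", ci_high < HYPOTHETICAL_LIMIT)
```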

Ultrasound Segmentation Cardiac Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Sung, J., Kitonsa, P. J., Nalutaaya, A., Isooba, D., Birabwa, S., Ndyabayunga, K., Okura, R., Magezi, J., Nantale, D., Mugabi, I., Nakiiza, V., Dowdy, D. W., Katamba, A., Kendall, E. A.

•preprint•Jun 24 2025

Background: Computer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics. Methods: We screened for TB in Ugandan individuals aged ≥15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores above a threshold of 0.1 (range, 0 - 1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches. Findings: Of 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores ≥0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of ≥0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046). Interpretation: The accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated.
Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening. Funding: National Institutes of Health. Research in context. Evidence before this study: The World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and populations and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported areas under the curve (AUCs) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained with adjusting CAD thresholds between individuals based on personal characteristics.
Added value of this study: In this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were ≥15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of ≥90% sensitivity and ≥70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity. Implications of all the available evidence: To obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.
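To make the universal-versus-stratified comparison concrete, the sketch below computes sensitivity and specificity when the CAD threshold is either the same for everyone or looked up per stratum. The records, stratum names, and stratified thresholds are invented for illustration (only the 0.65 universal threshold echoes the abstract), and the toy gap is far larger than the 1 to 2 percentage points reported in the study. How the authors selected their stratified thresholds is not described here, so none of this should be read as their method.

```python
def sensitivity_specificity(records, threshold_for):
    """Compute (sensitivity, specificity) for a screening rule.

    records:       iterable of (cad_score, stratum, has_tb)
    threshold_for: function mapping a stratum to its CAD score threshold
    """
    tp = fn = tn = fp = 0
    for score, stratum, has_tb in records:
        flagged = score >= threshold_for(stratum)
        if has_tb and flagged:
            tp += 1
        elif has_tb:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: older participants without TB tend to score higher.
toy_records = [
    (0.70, "younger", True), (0.50, "younger", True),
    (0.20, "younger", False), (0.30, "younger", False), (0.10, "younger", False),
    (0.90, "older", True), (0.80, "older", True),
    (0.60, "older", False), (0.70, "older", False), (0.30, "older", False),
]

universal = sensitivity_specificity(toy_records, lambda stratum: 0.65)

# Hypothetical per-stratum thresholds, chosen by hand for this toy example.
stratified_thresholds = {"younger": 0.45, "older": 0.75}
stratified = sensitivity_specificity(toy_records, stratified_thresholds.__getitem__)

print("Universal  (sens, spec):", universal)
print("Stratified (sens, spec):", stratified)
```

In practice the per-stratum thresholds would need to be tuned on separate data from the data used to report accuracy, to avoid optimistic estimates.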

X-Ray Detection Chest Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

CEREBLEED: Automated quantification and severity scoring of intracranial hemorrhage on non-contrast CT

Cepeda, S., Esteban-Sinovas, O., Arrese, I., Sarabia, R.

•preprint•Jun 13 2025

Background: Intracranial hemorrhage (ICH), whether spontaneous or traumatic, is a neurological emergency with high morbidity and mortality. Accurate assessment of severity is essential for neurosurgical decision-making. This study aimed to develop and evaluate a fully automated, deep learning-based tool for the standardized assessment of ICH severity, based on the segmentation of the hemorrhage and intracranial structures, and the computation of an objective severity index. Methods: Non-contrast cranial CT scans from patients with spontaneous or traumatic ICH were retrospectively collected from public datasets and a tertiary care center. Deep learning models were trained to segment hemorrhages and intracranial structures. These segmentations were used to compute a severity index reflecting bleeding burden and mass effect through volumetric relationships. Segmentation performance was evaluated on a hold-out test cohort. In a prospective cohort, the severity index was assessed in relation to expert-rated CT severity, clinical outcomes, and the need for urgent neurosurgical intervention. Results: A total of 1,110 non-contrast cranial CT scans were analyzed, 900 from the retrospective cohort and 200 from the prospective evaluation cohort. The binary segmentation model achieved a median Dice score of 0.90 for total hemorrhage. The multilabel model yielded Dice scores ranging from 0.55 to 0.94 across hemorrhage subtypes. The severity index significantly correlated with expert-rated CT severity (p < 0.001), the modified Rankin Scale (p = 0.007), and the Glasgow Outcome Scale-Extended (p = 0.039), and independently predicted the need for urgent surgery (p < 0.001). A threshold of approximately 300 was identified as a decision point for surgical management (AUC = 0.83). Conclusion: We developed a fully automated and openly accessible pipeline for the analysis of non-contrast cranial CT in intracranial hemorrhage.
It computes a novel index that objectively quantifies hemorrhage severity and is significantly associated with clinically relevant outcomes, including the need for urgent neurosurgical intervention.
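Segmentation quality above is reported as a Dice score. As a minimal sketch, the Dice coefficient for two binary masks is twice the overlap divided by the total number of foreground voxels in both masks. The flattened toy masks and the convention of returning 1.0 when both masks are empty are assumptions made here; the abstract does not give the formula for the severity index, so it is not reproduced.

```python
from statistics import median


def dice(predicted_mask, true_mask):
    """Dice similarity coefficient for two flattened binary masks (0/1 values)."""
    overlap = sum(1 for p, t in zip(predicted_mask, true_mask) if p and t)
    total = sum(1 for p in predicted_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * overlap / total


# Invented toy masks for three scans, flattened to 1D for brevity.
toy_cases = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 1, 0], [1, 1, 1, 0]),
    ([0, 1, 1, 0], [0, 0, 1, 1]),
]

scores = [dice(pred, truth) for pred, truth in toy_cases]
print("Per-scan Dice:", scores)
print("Median Dice:", median(scores))
```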

CT Segmentation Neurological Retrospective Clinical Clinical Pilot Academic Lab Open Code
