Sort by:
Page 1 of 19 results

CAPoxy: a feasibility study to investigate multispectral imaging in nailfold capillaroscopy

Taylor-Williams, M., Khalil, I., Manning, J., Dinsdale, G., Berks, M., Porcu, L., Wilkinson, S., Bohndiek, S., Murray, A.

medrxiv logopreprintAug 5 2025
BackgroundNailfold capillaroscopy enables visualisation of structural abnormalities in the microvasculature of patients with systemic sclerosis (SSc). The objective of this feasibility study was to determine whether multispectral imaging could provide functional assessment (differences in haemoglobin concentration or oxygenation) of capillaries to aid discrimination between healthy controls and patients with SSc. MSI of nailfold capillaries visualizes the smallest blood vessels and the impact of SSc on angiogenesis and their deformation, making it suitable for evaluating oxygenation-sensitive imaging techniques. Imaging of the nailfold capillaries offers tissue-specific oxygenation information, unlike pulse oximetry, which measures arterial blood oxygenation as a single-point measurement. MethodsThe CAPoxy study was a single-centre, cross-sectional, feasibility study of nailfold capillary multispectral imaging, comparing a cohort of patients with SSc to controls. A nine-band multispectral camera was used to image 22 individuals (10 patients with SSc and 12 controls). Linear mixed-effects models and summary statistics were used to compare the different regions of the nailfold (capillaries, surrounding edges, and outside area) between SSc and controls. A machine learning model was used to compare the two groups. ResultsPatients with SSc exhibited higher indicators of haemoglobin concentration in the capillary and adjacent regions compared to controls, which were significant in the regions surrounding the capillaries (p<0.001). There were also spectral differences between the SSc and controls groups that could indicate differences in oxygenation of the capillaries and surrounding tissue. Additionally, a machine learning model distinguished SSc patients from healthy controls with an accuracy of 84%, suggesting potential for multispectral imaging to classify SSc based on structural and functional microvascular changes. ConclusionsData indicates that multispectral imaging differentiates between patients with SSc from controls based on differences in vascular function. Further work to develop a targeted spectral camera would further improve the contrast between patients with SSc and controls, enabling better imaging. Key messagesMultispectral imaging holds promise for providing functional oxygenation measurement in nailfold capillaroscopy. Significant oxygenation differences between individuals with systemic sclerosis and healthy controls can be detected with multispectral imaging in the tissue surrounding capillaries.

The impacts of artificial intelligence on the workload of diagnostic radiology services: A rapid review and stakeholder contextualisation

Sutton, C., Prowse, J., Elshehaly, M., Randell, R.

medrxiv logopreprintJul 24 2025
BackgroundAdvancements in imaging technology, alongside increasing longevity and co-morbidities, have led to heightened demand for diagnostic radiology services. However, there is a shortfall in radiology and radiography staff to acquire, read and report on such imaging examinations. Artificial intelligence (AI) has been identified, notably by AI developers, as a potential solution to impact positively the workload of radiology services for diagnostics to address this staffing shortfall. MethodsA rapid review complemented with data from interviews with UK radiology service stakeholders was undertaken. ArXiv, Cochrane Library, Embase, Medline and Scopus databases were searched for publications in English published between 2007 and 2022. Following screening 110 full texts were included. Interviews with 15 radiology service managers, clinicians and academics were carried out between May and September 2022. ResultsMost literature was published in 2021 and 2022 with a distinct focus on AI for diagnostics of lung and chest disease (n = 25) notably COVID-19 and respiratory system cancers, closely followed by AI for breast screening (n = 23). AI contribution to streamline the workload of radiology services was categorised as autonomous, augmentative and assistive contributions. However, percentage estimates, of workload reduction, varied considerably with the most significant reduction identified in national screening programmes. AI was also recognised as aiding radiology services through providing second opinion, assisting in prioritisation of images for reading and improved quantification in diagnostics. Stakeholders saw AI as having the potential to remove some of the laborious work and contribute service resilience. ConclusionsThis review has shown there is limited data on real-world experiences from radiology services for the implementation of AI in clinical production. Autonomous, augmentative and assistive AI can, as noted in the article, decrease workload and aid reading and reporting, however the governance surrounding these advancements lags.

A View-Agnostic Deep Learning Framework for Comprehensive Analysis of 2D-Echocardiography

Anisuzzaman, D. M., Malins, J. G., Jackson, J. I., Lee, E., Naser, J. A., Rostami, B., Bird, J. G., Spiegelstein, D., Amar, T., Ngo, C. C., Oh, J. K., Pellikka, P. A., Thaden, J. J., Lopez-Jimenez, F., Poterucha, T. J., Friedman, P. A., Pislaru, S., Kane, G. C., Attia, Z. I.

medrxiv logopreprintJul 11 2025
Echocardiography traditionally requires experienced operators to select and interpret clips from specific viewing angles. Clinical decision-making is therefore limited for handheld cardiac ultrasound (HCU), which is often collected by novice users. In this study, we developed a view-agnostic deep learning framework to estimate left ventricular ejection fraction (LVEF), patient age, and patient sex from any of several views containing the left ventricle. Model performance was: (1) consistently strong across retrospective transthoracic echocardiography (TTE) datasets; (2) comparable between prospective HCU versus TTE (625 patients; LVEF r2 0.80 vs. 0.86, LVEF [> or [&le;]40%] AUC 0.981 vs. 0.993, age r2 0.85 vs. 0.87, sex classification AUC 0.985 vs. 0.996); (3) comparable between prospective HCU data collected by experts versus novice users (100 patients; LVEF r2 0.78 vs. 0.66, LVEF AUC 0.982 vs. 0.966). This approach may broaden the clinical utility of echocardiography by lessening the need for user expertise in image acquisition.

Cardiac Measurement Calculation on Point-of-Care Ultrasonography with Artificial Intelligence

Mercaldo, S. F., Bizzo, B. C., Sadore, T., Halle, M. A., MacDonald, A. L., Newbury-Chaet, I., L'Italien, E., Schultz, A. S., Tam, V., Hegde, S. M., Mangion, J. R., Mehrotra, P., Zhao, Q., Wu, J., Hillis, J.

medrxiv logopreprintJun 28 2025
IntroductionPoint-of-care ultrasonography (POCUS) enables clinicians to obtain critical diagnostic information at the bedside especially in resource limited settings. This information may include 2D cardiac quantitative data, although measuring the data manually can be time-consuming and subject to user experience. Artificial intelligence (AI) can potentially automate this quantification. This study assessed the interpretation of key cardiac measurements on POCUS images by an AI-enabled device (AISAP Cardio V1.0). MethodsThis retrospective diagnostic accuracy study included 200 POCUS cases from four hospitals (two in Israel and two in the United States). Each case was independently interpreted by three cardiologists and the device for seven measurements (left ventricular (LV) ejection fraction, inferior vena cava (IVC) maximal diameter, left atrial (LA) area, right atrial (RA) area, LV end diastolic diameter, right ventricular (RV) fractional area change and aortic root diameter). The endpoints were the root mean square error (RMSE) of the device compared to the average cardiologist measurement (LV ejection fraction and IVC maximal diameter were primary endpoints; the other measurements were secondary endpoints). Predefined passing criteria were based on the upper bounds of the RMSE 95% confidence intervals (CIs). The inter-cardiologist RMSE was also calculated for reference. ResultsThe device achieved the passing criteria for six of the seven measurements. While not achieving the passing criterion for RV fractional area change, it still achieved a better RMSE than the inter-cardiologist RMSE. The RMSE was 6.20% (95% CI: 5.57 to 6.83; inter-cardiologist RMSE of 8.23%) for LV ejection fraction, 0.25cm (95% CI: 0.20 to 0.29; 0.36cm) for IVC maximal diameter, 2.39cm2 (95% CI: 1.96 to 2.82; 4.39cm2) for LA area, 2.11cm2 (95% CI: 1.75 to 2.47; 3.49cm2) for RA area, 5.06mm (95% CI: 4.58 to 5.55; 4.67mm) for LV end diastolic diameter, 10.17% (95% CI: 9.01 to 11.33; 14.12%) for RV fractional area change and 0.19cm (95% CI: 0.16 to 0.21; 0.24cm) for aortic root diameter. DiscussionThe device accurately calculated these cardiac measurements especially when benchmarked against inter-cardiologist variability. Its use could assist clinicians who utilize POCUS and better enable their clinical decision-making.

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Sung, J., Kitonsa, P. J., Nalutaaya, A., Isooba, D., Birabwa, S., Ndyabayunga, K., Okura, R., Magezi, J., Nantale, D., Mugabi, I., Nakiiza, V., Dowdy, D. W., Katamba, A., Kendall, E. A.

medrxiv logopreprintJun 24 2025
BackgroundComputer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics. MethodsWe screened for TB in Ugandan individuals aged [&ge;]15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores above a threshold of 0.1 (range, 0 - 1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches. FindingsOf 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores [&ge;]0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of [&ge;]0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046). InterpretationThe accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening. FundingNational Institutes of Health Research in contextO_ST_ABSEvidence before this studyC_ST_ABSThe World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and population and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported area under the curves (AUC) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained with adjusting CAD thresholds between individuals based on personal characteristics. Added value of this studyIn this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were [&ge;]15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of [&ge;]90% sensitivity and [&ge;]70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity. Implications of all the available evidenceTo obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.

CEREBLEED: Automated quantification and severity scoring of intracranial hemorrhage on non-contrast CT

Cepeda, S., Esteban-Sinovas, O., Arrese, I., Sarabia, R.

medrxiv logopreprintJun 13 2025
BackgroundIntracranial hemorrhage (ICH), whether spontaneous or traumatic, is a neurological emergency with high morbidity and mortality. Accurate assessment of severity is essential for neurosurgical decision-making. This study aimed to develop and evaluate a fully automated, deep learning-based tool for the standardized assessment of ICH severity, based on the segmentation of the hemorrhage and intracranial structures, and the computation of an objective severity index. MethodsNon-contrast cranial CT scans from patients with spontaneous or traumatic ICH were retrospectively collected from public datasets and a tertiary care center. Deep learning models were trained to segment hemorrhages and intracranial structures. These segmentations were used to compute a severity index reflecting bleeding burden and mass effect through volumetric relationships. Segmentation performance was evaluated on a hold-out test cohort. In a prospective cohort, the severity index was assessed in relation to expert-rated CT severity, clinical outcomes, and the need for urgent neurosurgical intervention. ResultsA total of 1,110 non-contrast cranial CT scans were analyzed, 900 from the retrospective cohort and 200 from the prospective evaluation cohort. The binary segmentation model achieved a median Dice score of 0.90 for total hemorrhage. The multilabel model yielded Dice scores ranging from 0.55 to 0.94 across hemorrhage subtypes. The severity index significantly correlated with expert-rated CT severity (p < 0.001), the modified Rankin Scale (p = 0.007), and the Glasgow Outcome Scale-Extended (p = 0.039), and independently predicted the need for urgent surgery (p < 0.001). A threshold [~]300 was identified as a decision point for surgical management (AUC = 0.83). ConclusionWe developed a fully automated and openly accessible pipeline for the analysis of non-contrast cranial CT in intracranial hemorrhage. It computes a novel index that objectively quantifies hemorrhage severity and is significantly associated with clinically relevant outcomes, including the need for urgent neurosurgical intervention.

Slide-free surface histology enables rapid colonic polyp interpretation across specialties and foundation AI

Yong, A., Husna, N., Tan, K. H., Manek, G., Sim, R., Loi, R., Lee, O., Tang, S., Soon, G., Chan, D., Liang, K.

medrxiv logopreprintJun 11 2025
Colonoscopy is a mainstay of colorectal cancer screening and has helped to lower cancer incidence and mortality. The resection of polyps during colonoscopy is critical for tissue diagnosis and prevention of colorectal cancer, albeit resulting in increased resource requirements and expense. Discarding resected benign polyps without sending for histopathological processing and confirmatory diagnosis, known as the resect and discard strategy, could enhance efficiency but is not commonly practiced due to endoscopists predominant preference for pathological confirmation. The inaccessibility of histopathology from unprocessed resected tissue hampers endoscopic decisions. We show that intraprocedural fibre-optic microscopy with ultraviolet-C surface excitation (FUSE) of polyps post-resection enables rapid diagnosis, potentially complementing endoscopic interpretation and incorporating pathologist oversight. In a clinical study of 28 patients, slide-free FUSE microscopy of freshly resected polyps yielded mucosal views that greatly magnified the surface patterns observed on endoscopy and revealed previously unavailable histopathological signatures. We term this new cross-specialty readout surface histology. In blinded interpretations of 42 polyps (19 training, 23 reading) by endoscopists and pathologists of varying experience, surface histology differentiated normal/benign, low-grade dysplasia, and high-grade dysplasia and cancer, with 100% performance in classifying high/low risk. This FUSE dataset was also successfully interpreted by foundation AI models pretrained on histopathology slides, illustrating a new potential for these models to not only expedite conventional pathology tasks but also autonomously provide instant expert feedback during procedures that typically lack pathologists. Surface histology readouts during colonoscopy promise to empower endoscopist decisions and broadly enhance confidence and participation in resect and discard. One Sentence SummaryRapid microscopy of resected polyps during colonoscopy yielded accurate diagnoses, promising to enhance colorectal screening.

Novel Deep Learning Framework for Simultaneous Assessment of Left Ventricular Mass and Longitudinal Strain: Clinical Feasibility and Validation in Patients with Hypertrophic Cardiomyopathy

Park, J., Yoon, Y. E., Jang, Y., Jung, T., Jeon, J., Lee, S.-A., Choi, H.-M., Hwang, I.-C., Chun, E. J., Cho, G.-Y., Chang, H.-J.

medrxiv logopreprintMay 23 2025
BackgroundThis study aims to present the Segmentation-based Myocardial Advanced Refinement Tracking (SMART) system, a novel artificial intelligence (AI)-based framework for transthoracic echocardiography (TTE) that incorporates motion tracking and left ventricular (LV) myocardial segmentation for automated LV mass (LVM) and global longitudinal strain (LVGLS) assessment. MethodsThe SMART system demonstrates LV speckle tracking based on motion vector estimation, refined by structural information using endocardial and epicardial segmentation throughout the cardiac cycle. This approach enables automated measurement of LVMSMART and LVGLSSMART. The feasibility of SMART is validated in 111 hypertrophic cardiomyopathy (HCM) patients (median age: 58 years, 69% male) who underwent TTE and cardiac magnetic resonance imaging (CMR). ResultsLVGLSSMART showed a strong correlation with conventional manual LVGLS measurements (Pearsons correlation coefficient [PCC] 0.851; mean difference 0 [-2-0]). When compared to CMR as the reference standard for LVM, the conventional dimension-based TTE method overestimated LVM (PCC 0.652; mean difference: 106 [90-123]), whereas LVMSMART demonstrated excellent agreement with CMR (PCC 0.843; mean difference: 1 [-11-13]). For predicting extensive myocardial fibrosis, LVGLSSMART and LVMSMART exhibited performance comparable to conventional LVGLS and CMR (AUC: 0.72 and 0.66, respectively). Patients identified as high-risk for extensive fibrosis by LVGLSSMART and LVMSMART had significantly higher rates of adverse outcomes, including heart failure hospitalization, new-onset atrial fibrillation, and defibrillator implantation. ConclusionsThe SMART technique provides a comparable LVGLS evaluation and a more accurate LVM assessment than conventional TTE, with predictive values for myocardial fibrosis and adverse outcomes. These findings support its utility in HCM management.

The effect of medical explanations from large language models on diagnostic decisions in radiology

Spitzer, P., Hendriks, D., Rudolph, J., Schläger, S., Ricke, J., Kühl, N., Hoppe, B., Feuerriegel, S.

medrxiv logopreprintMay 18 2025
Large language models (LLMs) are increasingly used by physicians for diagnostic support. A key advantage of LLMs is the ability to generate explanations that can help physicians understand the reasoning behind a diagnosis. However, the best-suited format for LLM-generated explanations remains unclear. In this large-scale study, we examined the effect of different formats for LLM explanations on clinical decision-making. For this, we conducted a randomized experiment with radiologists reviewing patient cases with radiological images (N = 2020 assessments). Participants received either no LLM support (control group) or were supported by one of three LLM-generated explanations: (1) a standard output providing the diagnosis without explanation; (2) a differential diagnosis comparing multiple possible diagnoses; or (3) a chain-of-thought explanation offering a detailed reasoning process for the diagnosis. We find that the format of explanations significantly influences diagnostic accuracy. The chain-of-thought explanations yielded the best performance, improving the diagnostic accuracy by 12.2% compared to the control condition without LLM support (P = 0.001). The chain-of-thought explanations are also superior to the standard output without explanation (+7.2%; P = 0.040) and the differential diagnosis format (+9.7%; P = 0.004). We further assessed the robustness of these findings across case difficulty and different physician backgrounds such as general vs. specialized radiologists. Evidently, explaining the reasoning for a diagnosis helps physicians to identify and correct potential errors in LLM predictions and thus improve overall decisions. Altogether, the results highlight the importance of how explanations in medical LLMs are generated to maximize their utility in clinical practice. By designing explanations to support the reasoning processes of physicians, LLMs can improve diagnostic performance and, ultimately, patient outcomes.
Page 1 of 19 results
Show
per page
1

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.