Page 7 of 45441 results

SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection

Yao Wang, Dong Yang, Zhi Qiao, Wenjian Huang, Liuzhi Yang, Zhen Qian

arXiv preprint · Sep 1, 2025
Abnormality detection in medical imaging is a critical task requiring both high efficiency and accuracy to support effective diagnosis. While convolutional neural networks (CNNs) and Transformer-based models are widely used, both face intrinsic challenges: CNNs have limited receptive fields, restricting their ability to capture broad contextual information, and Transformers encounter prohibitive computational costs when processing high-resolution medical images. Mamba, a recent innovation in natural language processing, has gained attention for its ability to process long sequences with linear complexity, offering a promising alternative. Building on this foundation, we present SpectMamba, the first Mamba-based architecture designed for medical image detection. A key component of SpectMamba is the Hybrid Spatial-Frequency Attention (HSFA) block, which separately learns high- and low-frequency features. This approach effectively mitigates the loss of high-frequency information caused by frequency bias and correlates frequency-domain features with spatial features, thereby enhancing the model's ability to capture global context. To further improve long-range dependencies, we propose the Visual State-Space Module (VSSM) and introduce a novel Hilbert Curve Scanning technique to strengthen spatial correlations and local dependencies, further optimizing the Mamba framework. Comprehensive experiments show that SpectMamba achieves state-of-the-art performance while being both effective and efficient across various medical image detection tasks.
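
The abstract does not spell out how Hilbert Curve Scanning orders the patch sequence; the sketch below is one plausible reading, mapping a square grid of image patches (side a power of two) onto a 1D sequence along a Hilbert curve so that patches that are close in the image tend to stay adjacent in the sequence a Mamba block consumes. Function names and the reordering step are illustrative, not from the paper.

def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve of side 2**order to (x, y)."""
    x = y = 0
    t, s = d, 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                  # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_scan_indices(order):
    """Row-major patch indices visited in Hilbert order."""
    side = 1 << order
    return [y * side + x
            for x, y in (hilbert_d2xy(order, d) for d in range(side * side))]

# e.g. reorder a (batch, H*W, channels) patch sequence before the state-space block:
# seq = seq[:, hilbert_scan_indices(order), :]

Compared with a raster scan, this ordering keeps spatially close patches close in the 1D sequence, which is the locality property the abstract credits for strengthening spatial correlations.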

YOLOv8-BCD: a real-time deep learning framework for pulmonary nodule detection in computed tomography imaging.

Zhu W, Wang X, Xing J, Xu XS, Yuan M

PubMed · Sep 1, 2025
Lung cancer remains one of the malignant tumors with the highest global morbidity and mortality rates. Detecting pulmonary nodules in computed tomography (CT) images is essential for early lung cancer screening, but traditional detection methods often suffer from low accuracy and efficiency, limiting their clinical effectiveness. This study aims to devise an advanced deep-learning framework capable of high-precision, rapid identification of pulmonary nodules in CT imaging, thereby facilitating earlier and more accurate diagnosis of lung cancer. To address these issues, this paper proposes an improved deep-learning framework named YOLOv8-BCD, based on YOLOv8 and integrating the BiFormer attention mechanism, Content-Aware ReAssembly of Features (CARAFE) up-sampling, and Depth-wise Over-Parameterized Depth-wise Convolution (DO-DConv). To overcome common challenges such as low resolution, noise, and artifacts in lung CT images, the model employs Super-Resolution Generative Adversarial Network (SRGAN)-based image enhancement during preprocessing. The BiFormer attention mechanism is introduced into the backbone to strengthen feature extraction, particularly for small nodules, while the CARAFE and DO-DConv modules are incorporated into the head to improve feature-fusion efficiency and reduce computational complexity. Experimental comparisons on 550 CT images from the LUng Nodule Analysis 2016 (LUNA16) dataset demonstrated that YOLOv8-BCD achieved a detection accuracy of 86.4% and a mean average precision at an intersection-over-union (IoU) threshold of 0.5 (mAP@0.5) of 88.3%, surpassing YOLOv8 by 2.2% in accuracy and 4.5% in mAP@0.5. Additional evaluation on the external TianChi lung nodule dataset further confirmed the model's generalization capability, achieving an mAP@0.5 of 83.8% and an mAP@0.5-0.95 of 43.9% at an inference speed of 98 frames per second (FPS). The YOLOv8-BCD model effectively assists clinicians by significantly reducing interpretation time, improving diagnostic accuracy, and minimizing the risk of missed diagnoses, thereby enhancing patient outcomes.
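
For readers unfamiliar with the metric quoted above: under mAP@0.5, a predicted box counts as a true positive only when its intersection-over-union with a ground-truth nodule box is at least 0.5. A minimal sketch of that criterion (the corner-coordinate box format is an assumption, not from the paper):

def iou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2) corner coordinates in pixels."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((10, 10, 50, 50), (30, 30, 70, 70)) >= 0.5)  # False: IoU is about 0.14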

CXR-MultiTaskNet: a unified deep learning framework for joint disease localization and classification in chest radiographs.

Reddy KD, Patil A

PubMed · Aug 31, 2025
Chest X-ray (CXR) analysis is a challenging problem in automated medical diagnosis, where complex visual patterns of thoracic diseases must be precisely identified through multi-label classification and lesion localization. Current approaches typically treat classification and localization in isolation, resulting in piecemeal systems that do not exploit shared representations, are often not clinically interpretable, and handle multi-label diseases poorly. Although multi-task learning frameworks such as DeepChest and CLN aim to close this gap, they suffer from task interference and poor explainability, which limits their practical application in real-world clinical workflows. To address these limitations, we present a unified multi-task deep learning framework, CXR-MultiTaskNet, for simultaneously classifying thoracic diseases and localizing lesions in chest X-rays. Our framework comprises a standard ResNet50 feature extractor, two task-specific heads for multi-task learning, and a Grad-CAM-based explainability module that provides accurate predictions and enhances clinical explainability. We formulate a joint loss that balances the classification and localization objectives, weighting the detection term more heavily to cope with extreme class imbalance and the varying detectability of different disease manifestation types. A dual-attention-based hierarchical feature extraction approach makes the detection steps traceable through visual attention maps, rendering the pipeline more interpretable than a conventional CNN-embedding model, and yields both disease-level and pixel-level predictions, enabling explainable, comprehensive analysis of each image and localization of each detected abnormality. The training objective was further tuned for X-ray images so that smaller lesions receive higher significance. Experimental evaluations on a benchmark chest X-ray dataset demonstrate a macro F1-score of 0.965 (micro F1-score 0.968) for disease classification and a mean IoU of 0.851 with an mAP@0.5 of 0.927 for lesion localization, consistently surpassing state-of-the-art single-task and multi-task baselines. The presented framework provides an integrated approach to chest X-ray analysis that is clinically useful, interpretable, and scalable for automation, allowing efficient diagnostic pathways and enhanced clinical decision-making, and can serve as a foundation for next-generation explainable AI in radiology. Keywords: model interpretability, chest X-ray disease detection, detection region localization, weakly supervised transfer learning.
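
The abstract gives the joint loss only in outline; the snippet below is a generic multi-task sketch consistent with that description, not the paper's actual formulation. The 14-label head, pos_weight, and lambda_loc values are hypothetical placeholders for the class-imbalance and small-lesion weighting mentioned above.

import torch
import torch.nn as nn

cls_loss = nn.BCEWithLogitsLoss(pos_weight=torch.full((14,), 5.0))  # up-weight rare labels
box_loss = nn.SmoothL1Loss()        # localization (box regression) term
lambda_loc = 2.0                    # hypothetical emphasis on the detection term

def joint_loss(cls_logits, cls_targets, box_preds, box_targets):
    # weighted sum of the classification and localization objectives
    return cls_loss(cls_logits, cls_targets) + lambda_loc * box_loss(box_preds, box_targets)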

Impact of pre-test probability on AI-LVO detection: a systematic review of LVO prevalence across clinical contexts.

Olivé-Gadea M, Mayol J, Requena M, Rodrigo-Gisbert M, Rizzo F, Garcia-Tornel A, Simonetti R, Diana F, Muchada M, Pagola J, Rodriguez-Luna D, Rodriguez-Villatoro N, Rubiera M, Molina CA, Tomasello A, Hernandez D, de Dios Lascuevas M, Ribo M

PubMed · Aug 31, 2025
Rapid identification of large vessel occlusion (LVO) in acute ischemic stroke (AIS) is essential for reperfusion therapy. Screening tools, including artificial intelligence (AI) based algorithms, have been developed to accelerate detection, but their performance depends heavily on pre-test LVO prevalence. This study aimed to review LVO prevalence across clinical contexts and analyze its impact on AI-algorithm performance. We systematically reviewed studies reporting consecutive suspected-AIS cohorts. Cohorts were grouped into four clinical scenarios based on patient selection criteria: (a) high suspicion of LVO by stroke specialists (direct-to-angiosuite candidates), (b) high suspicion of LVO according to pre-hospital scales, and (c) and (d) any suspected AIS without a severity cut-off, in a hospital or pre-hospital setting, respectively. We analyzed LVO prevalence in each scenario and assessed the false discovery rate (FDR), the proportion of AI-positive studies that are false positives, when applying eight commercially available LVO-detection algorithms. We included 87 cohorts from 80 studies. Median LVO prevalence was (a) 84% (77-87%), (b) 35% (26-42%), (c) 19% (14-25%), and (d) 14% (8-22%). At high prevalence (a), the FDR ranged from 0.007 (1 false positive in 142 positives) to 0.023 (1 in 43), whereas in the low-prevalence scenarios (c and d) it ranged from 0.168 (1 in 6) to 0.543 (over 1 in 2). To ensure meaningful clinical impact, AI algorithms must be evaluated within the specific populations and care pathways where they are applied.
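
The prevalence-to-FDR relationship the review measures can be reproduced from first principles: for fixed sensitivity and specificity, the FDR is the false positives' share of all AI-positive studies. The 0.96/0.96 operating point below is illustrative only, chosen because it roughly reproduces the scenario-(a) figure; it is not the measured performance of any algorithm in the review.

def fdr(prevalence, sensitivity, specificity):
    tp = prevalence * sensitivity              # true positives per study read
    fp = (1 - prevalence) * (1 - specificity)  # false positives per study read
    return fp / (tp + fp)

for p in (0.84, 0.35, 0.19, 0.14):  # median prevalence in scenarios (a)-(d)
    print(f"prevalence {p:.0%}: FDR = {fdr(p, 0.96, 0.96):.3f}")
# prints roughly 0.008, 0.072, 0.151, 0.204: the same detector that is almost
# always right in an angiosuite cohort is wrong about 1 in 5 times pre-hospital.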

Flow Matching-Based Data Synthesis for Robust Anatomical Landmark Localization.

Hadzic A, Bogensperger L, Berghold A, Urschler M

PubMed · Aug 29, 2025
Anatomical landmark localization (ALL) plays a crucial role in medical imaging for applications such as therapy planning and surgical interventions. State-of-the-art deep learning methods for ALL are often trained on small datasets due to the scarcity of large, annotated medical data. This constraint often leads to overfitting on the training dataset, which in turn reduces the model's ability to generalize to unseen data. To address these challenges, we propose a multi-channel generative approach utilizing Flow Matching to synthesize diverse annotated images for data augmentation in ALL tasks. Each synthetically generated sample consists of a medical image paired with a multi-channel heatmap that encodes its landmark configuration, from which the corresponding landmark annotations can be derived. We assess the quality of synthetic image-heatmap pairs automatically using a Statistical Shape Model to evaluate landmark plausibility and compute the Fréchet Inception Distance score to quantify image quality. Our results show that pairs synthesized via Flow Matching exhibit superior quality and diversity compared with those generated by other state-of-the-art generative models like Generative Adversarial Networks or diffusion models. Furthermore, we investigate the effect of integrating synthetic data into the training process of an ALL network. In our experiments, the ALL network trained with Flow Matching-generated data demonstrates improved robustness, particularly in scenarios with limited training data or occlusions, compared with baselines that utilize solely real images or synthetic data from alternative generative models.
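
As background, the core of Flow Matching training is a simple regression: sample a point on a straight path between noise and data and teach a network to predict the constant velocity along it. The sketch below shows that objective in its common rectified form, with the paper's image-plus-heatmap sample abstracted into a single tensor x1 and `model` a hypothetical velocity network; it is not the authors' code.

import torch

def flow_matching_loss(model, x1):
    """x1: batch of real samples, e.g. image and heatmap channels stacked."""
    x0 = torch.randn_like(x1)                     # noise endpoint
    t = torch.rand(x1.shape[0], *([1] * (x1.dim() - 1)), device=x1.device)
    xt = (1 - t) * x0 + t * x1                    # point on the straight path
    v_target = x1 - x0                            # constant target velocity
    v_pred = model(xt, t.flatten())               # network predicts the velocity field
    return ((v_pred - v_target) ** 2).mean()

Sampling then integrates the learned field from noise to data, producing an image and its landmark heatmaps jointly, which is what makes the annotations come for free.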

Artificial intelligence software to detect small hepatic lesions on hepatobiliary-phase images using multiscale sampling.

Maeda S, Nakamura Y, Higaki T, Karasudani A, Yamaguchi T, Ishihara M, Baba T, Kondo S, Fonseca D, Awai K

PubMed · Aug 29, 2025
To investigate the effect of multiscale sampling artificial intelligence (msAI) software adapted to small hepatic lesions on the diagnostic performance of readers interpreting gadoxetic acid-enhanced hepatobiliary-phase (HBP) images. HBP images of 30 patients harboring 186 hepatic lesions were included. Three board-certified radiologists, 9 radiology residents, and 2 general physicians interpreted the HBP image data sets twice, once with and once without the msAI software, at 2-week intervals. Jackknife free-response receiver-operating characteristic analysis was performed to calculate the figure of merit (FOM) for detecting hepatic lesions. The negative consultation ratio (NCR), the percentage of correct diagnoses that the AI software turned into incorrect ones, was calculated. We defined readers with an NCR below 10% as those who correctly dismissed the false findings presented by the software. The msAI software significantly improved the lesion localization fraction (LLF) for all readers (0.74 vs 0.82, p < 0.01), whereas the FOM did not improve significantly (0.76 vs 0.78, p = 0.45). In the lesion-size-based subgroup analysis, the LLF improved significantly with the AI software even for lesions smaller than 6 mm (0.40 vs 0.53, p < 0.01), whereas the FOM showed no significant difference (0.63 vs 0.66, p = 0.51). Among the 10 readers with an NCR below 10%, both the LLF and the FOM were significantly better with the software (LLF 0.77 vs 0.82, FOM 0.79 vs 0.84, both p < 0.01). The detectability of small hepatic lesions on HBP images improved with the msAI software, especially when its results were properly evaluated by the reader.
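
As a note on the NCR bookkeeping described above, the metric is computed per reader from paired with/without-AI reads; a minimal sketch under that reading (the record layout is hypothetical, not from the paper):

def negative_consultation_ratio(reads):
    """reads: iterable of (correct_without_ai, correct_with_ai) booleans per call."""
    flipped = sum(1 for without, with_ai in reads if without and not with_ai)
    correct_without = sum(1 for without, _ in reads if without)
    return flipped / correct_without if correct_without else 0.0

# A reader with NCR < 0.10 rarely lets the software talk them out of a correct call.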

Adapting Foundation Model for Dental Caries Detection with Dual-View Co-Training

Tao Luo, Han Wu, Tong Yang, Dinggang Shen, Zhiming Cui

arXiv preprint · Aug 28, 2025
Accurate dental caries detection from panoramic X-rays plays a pivotal role in preventing lesion progression. However, current detection methods often yield suboptimal accuracy due to subtle contrast variations and diverse lesion morphology of dental caries. In this work, inspired by the clinical workflow where dentists systematically combine whole-image screening with detailed tooth-level inspection, we present DVCTNet, a novel Dual-View Co-Training network for accurate dental caries detection. Our DVCTNet starts with employing automated tooth detection to establish two complementary views: a global view from panoramic X-ray images and a local view from cropped tooth images. We then pretrain two vision foundation models separately on the two views. The global-view foundation model serves as the detection backbone, generating region proposals and global features, while the local-view model extracts detailed features from corresponding cropped tooth patches matched by the region proposals. To effectively integrate information from both views, we introduce a Gated Cross-View Attention (GCV-Atten) module that dynamically fuses dual-view features, enhancing the detection pipeline by integrating the fused features back into the detection model for final caries detection. To rigorously evaluate our DVCTNet, we test it on a public dataset and further validate its performance on a newly curated, high-precision dental caries detection dataset, annotated using both intra-oral images and panoramic X-rays for double verification. Experimental results demonstrate DVCTNet's superior performance against existing state-of-the-art (SOTA) methods on both datasets, indicating the clinical applicability of our method. Our code and labeled dataset are available at https://github.com/ShanghaiTech-IMPACT/DVCTNet.
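
The GCV-Atten module is described only at a high level; below is a minimal sketch of one plausible gated cross-view fusion in PyTorch, where global region features attend to their matched local tooth-patch features and a learned gate controls how much of the cross-view signal is mixed back in. The dimensions and the residual form are assumptions, not the paper's design.

import torch
import torch.nn as nn

class GatedCrossViewAttention(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, global_feats, local_feats):
        # global_feats: (B, N, D) region proposals; local_feats: (B, N, D) matched tooth patches
        cross, _ = self.attn(global_feats, local_feats, local_feats)
        g = self.gate(torch.cat([global_feats, cross], dim=-1))
        return global_feats + g * cross  # gated residual fusion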

Mitosis detection in domain shift scenarios: a Mamba-based approach

Gennaro Percannella, Mattia Sarno, Francesco Tortorella, Mario Vento

arXiv preprint · Aug 28, 2025
Mitosis detection in histopathology images plays a key role in tumor assessment. Although machine learning algorithms could aid physicians in performing this task accurately, they suffer a significant performance drop when evaluated on images from domains different from the training ones. In this work, we propose a Mamba-based approach for mitosis detection under domain shift, inspired by the promising performance demonstrated by Mamba in medical image segmentation tasks. Specifically, our approach exploits a VM-UNet architecture for the addressed task, together with stain augmentation operations that further improve model robustness against domain shift. Our approach has been submitted to track 1 of the MItosis DOmain Generalization (MIDOG) challenge. Preliminary experiments, conducted on the MIDOG++ dataset, show large room for improvement for the proposed method.
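
Stain augmentation, mentioned above, is commonly implemented as HED color jitter: decompose each RGB tile into hematoxylin-eosin-DAB stain space, randomly perturb the stain channels, and recompose. The sketch below shows that standard recipe; the sigma range is illustrative, not the paper's setting.

import numpy as np
from skimage.color import rgb2hed, hed2rgb

def hed_jitter(rgb, rng, sigma=0.05):
    """rgb: float image in [0, 1], shape (H, W, 3)."""
    hed = rgb2hed(rgb)                                 # to stain space
    alpha = rng.uniform(1 - sigma, 1 + sigma, size=3)  # per-stain scale
    beta = rng.uniform(-sigma, sigma, size=3)          # per-stain shift
    return np.clip(hed2rgb(hed * alpha + beta), 0.0, 1.0)

# usage: augmented = hed_jitter(tile, np.random.default_rng(0))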

Automated system of analysis to quantify pediatric hip morphology.

Gartland CN, Healy J, Lynham RS, Nowlan NC, Green C, Redmond SJ

PubMed · Aug 28, 2025
Developmental dysplasia of the hip (DDH), a developmental deformity with an incidence of 0.1-3.4%, lacks an objective and reliable definition and assessment metric by which to conduct timely diagnosis. This work aims to address that challenge by developing a system of analysis that accurately detects 22 key anatomical landmarks in anteroposterior pelvic radiographs of the juvenile hip, from which a range of novel salient morphological measures can be derived. A coarse-to-fine approach was implemented, comparing six variations of the U-Net deep neural network architecture for the coarse model and four for the fine model; the variations differed in the data augmentation applied, image input size, network attention gates, and loss function design. The best-performing combination achieved a root-mean-square error in landmark positional accuracy of 3.79 mm, with a bias and precision of 0.03 ± 17.6 mm in the x-direction and 1.76 ± 22.5 mm in the y-direction in the image frame of reference. Average errors for each morphological metric are in line with the performance of clinical experts. Future work will use this system to perform a population analysis to accurately characterize hip joint morphology and to develop an objective and reliable assessment metric for DDH.
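
U-Net landmark detectors of this kind typically regress one heatmap per landmark and read the coordinates off the peaks; assuming that standard formulation (the abstract does not detail it), the final decoding step looks like:

import numpy as np

def heatmaps_to_landmarks(heatmaps):
    """heatmaps: (22, H, W) array, one channel per anatomical landmark."""
    n, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, -1).argmax(axis=1)   # peak per channel
    ys, xs = np.unravel_index(flat, (h, w))
    return np.stack([xs, ys], axis=1)               # (22, 2) pixel coordinates

In a coarse-to-fine setup, the coarse model would supply a region of interest around each landmark and the fine model would repeat this decoding at higher resolution within it.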

The African Breast Imaging Dataset for Equitable Cancer Care: Protocol for an Open Mammogram and Ultrasound Breast Cancer Detection Dataset

Musinguzi, D., Katumba, A., Kawooya, M. G., Malumba, R., Nakatumba-Nabende, J., Achuka, S. A., Adewole, M., Anazodo, U.

medRxiv preprint · Aug 28, 2025
Introduction: Breast cancer is one of the most common cancers globally. Its incidence in Africa has increased sharply, surpassing that in high-income countries. Mortality remains high due to late-stage diagnosis, when treatment is less effective. We propose the first open, longitudinal breast imaging dataset from Africa, comprising point-of-care ultrasound scans, mammograms, biopsy pathology, and clinical profiles, to support early detection using machine learning. Methods and Analysis: We will engage women through community outreach and train them in self-examination. Those with suspected lesions, particularly with a family history of breast cancer, will be invited to participate. A total of 100 women will undergo baseline assessment at medical centers, including clinical exams, blood tests, and mammograms. Follow-up point-of-care ultrasound scans and clinical data will be collected at 3 and 6 months, with final assessments at 9 months including mammograms. Ethics and Dissemination: The study has been approved by the Institutional Review Boards at ECUREI and the MAI Lab. Findings will be disseminated through peer-reviewed journals and scientific conferences.