Optimization and Application of Attention-Fused Deep Learning Models for Enhanced Computer Vision Image Segmentation

Wei Deng, Jiajun Ling, Zhicheng Yin*

Abstract


Image segmentation, a core task in computer vision, has been transformed by deep learning, especially by Fully Convolutional Networks
(FCN) and U-Net. However, these models struggle to capture long-range dependencies and to separate subtle foreground regions from the
background, which leaves object boundaries unclear. Attention mechanisms (AMs) let models focus selectively on salient information and
ignore irrelevant information. Although attention modules such as CBAM and Non-Local blocks are effective, they add substantial
computational cost, which hinders their use in resource-limited environments. To address the need for an attention scheme that is both
effective and efficient, this paper introduces the Gated Double Path Attention (GDPA) module. GDPA recalibrates features by running the
spatial and channel attention branches in parallel rather than sequentially, which reduces processing time. It also reduces computational
cost by using lightweight convolutions: 1D convolutions in the channel branch and Adaptive Deformable Convolution in the spatial
branch. We integrate the module into a U-Net backbone and evaluate it on a challenging medical imaging task, brain tumor segmentation
on the BraTS 2020 dataset. In our experiments, the GDPA-UNet model achieves a significant improvement over the baseline U-Net and
other commonly used attention-augmented models. It scores better on two standard evaluation metrics: the Dice coefficient, which
measures region overlap, and the Hausdorff distance, which measures boundary accuracy. These results support the effectiveness and
efficiency of the proposed model.
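
As an illustration of the two metrics named in the abstract, the following is a minimal sketch in plain Python, not the authors' evaluation code. The function names, the list-of-rows mask format, and the toy masks are assumptions made for this example. The Hausdorff distance here is the plain maximum variant computed over all foreground pixels; the abstract does not say which variant the paper reports, and BraTS evaluations often use a 95th-percentile version instead.

```python
import math


def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are lists of rows containing 0/1 integers (an assumed format).
    """
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * inter / total


def foreground_points(mask):
    """Return (row, col) coordinates of every foreground pixel."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance between the foreground pixel sets."""
    a, b = foreground_points(pred), foreground_points(target)
    if not a or not b:
        raise ValueError("both masks need at least one foreground pixel")

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Toy masks (hypothetical): the target has one extra pixel at row 0, col 3.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 1, 1],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]

print(round(dice_score(pred, target), 4))  # 0.8889
print(hausdorff_distance(pred, target))    # 1.0
```

A higher Dice score means better region overlap, while a lower Hausdorff distance means the predicted boundary strays less from the reference, so the two metrics move in opposite directions as a segmentation improves.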

Keywords


Image Segmentation; Computer Vision; Attention Mechanism; Deep Learning; Model Optimization; U-Net; Medical Imaging



DOI: http://dx.doi.org/10.70711/aitr.v3i4.8192
