冯增喜
Email:
Affiliation: School of Building Services Science and Engineering
Journal: Signal, Image and Video Processing
Keywords: Small object detection · UAV images · Attention mechanism · Cross-scale feature fusion
Abstract: Object detection in Unmanned Aerial Vehicle (UAV) images has diverse applications and is attracting increasing research interest. Despite the success of detection in natural scenes, UAV images pose two distinct challenges that limit the performance of existing methods: a high prevalence of small objects and significant variation in object scale. To address these, we propose Att-YOLO, a novel small object detection model for UAV images that improves YOLOv7 with attentive learning and attentional fusion. First, during feature extraction, we introduce an attentive representation learning module with a spatial attention mechanism to highlight foreground features and a channel attention module to reduce background noise. Second, we design an attentional feature fusion strategy that leverages multi-scale feature correlations, assigning dynamic weights to better integrate cross-layer information, which is crucial for handling scale variation. Third, to improve small object detection, we extend the Generalized Efficient Layer Aggregation Network (GELAN) with Swin Transformer blocks, enabling the model to capture both local and global features effectively. Additionally, Wise-IoU (WIoU) v3 is used as the bounding box regression loss to improve localization precision. Extensive experiments show that Att-YOLO is competitive with state-of-the-art methods, achieving a mean Average Precision (mAP) of 41.8% on the VisDrone2019 dataset and 28.2% on the UAVDT dataset, while maintaining superior AP50 and showing advantageous computational efficiency.
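The "dynamic weights" idea in the fusion step can be illustrated with a minimal sketch. This is not the Att-YOLO implementation, which the abstract does not specify. It assumes only that each feature level receives a score, that the scores are normalized with a softmax, and that the fused feature is the weighted sum. The names `softmax` and `fuse_features`, and the use of flat lists in place of feature maps, are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(features, scores):
    """Fuse same-length feature vectors by a softmax-weighted sum.

    features: list of feature vectors, one per scale (assumed already
              resized to a common shape and flattened for this sketch).
    scores:   one score per scale; in a real model these would be
              learned or predicted from the features (assumption).
    """
    if len(features) != len(scores):
        raise ValueError("need exactly one score per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")
    weights = softmax(scores)
    fused = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]
    return fused, weights


if __name__ == "__main__":
    shallow = [1.0, 3.0]  # stand-in for a high-resolution feature
    deep = [3.0, 5.0]     # stand-in for a low-resolution feature
    fused, weights = fuse_features([shallow, deep], [0.0, 0.0])
    print(weights, fused)
```

With equal scores the two levels contribute equally. Raising one score shifts the fused feature toward that level, which is the sense in which the weights are dynamic.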
First author: 冯增喜
Paper type: Journal article
Volume: 19(14): 1235.
ISSN: 1863-1703
Translated work: No
Publication date: 2025-10-03
