Profile
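Education: 1. 1998-2002: Xi'an University of Architecture and Technology (bachelor's degree); 2. 2002-2005: Xi'an University of Architecture and Technology (master's degree); 3. 2011-2017: Xi'an University of Architecture and Technology (doctorate). Work experience: 1. 2005 to 2018.10: faculty member, School of Information and Control Engineering, Xi'an University of Architecture and Technology; 2. 2018 to present: faculty member and Vice Dean, School of Building Services Science and Engineering, Xi'an University of Architecture and Technology. Professional service: Deputy Secretary-General, Intelligent Building and Building Automation Committee, Shaanxi Association of Automation.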
冯增喜
Associate Professor
Paper Publications
A Small Object Detection Framework on UAV Images via Attentive Representation Learning and Attentional Feature Fusion
Release time: 2025-12-20
Affiliation of Author(s):
School of Building Services Science and Engineering
Journal:
Signal Image and Video Processing
Key Words:
Small object detection · UAV images · Attention mechanism · Cross-scale feature fusion
Abstract:
Object detection in Unmanned Aerial Vehicle (UAV) images has diverse applications and is attracting increasing research interest. Despite the success of detection in natural scenes, UAV images pose two distinct challenges that limit the performance of existing methods: a high prevalence of small objects and large variations in object scale. To address these challenges, we propose Att-YOLO, a novel small object detection model for UAV images that improves YOLOv7 with attentive learning and attentional fusion. First, during feature extraction, we introduce an attentive representation learning module that combines a spatial attention mechanism to highlight foreground features with a channel attention module to reduce background noise. Second, we design an attentional feature fusion strategy that exploits multi-scale feature correlations and assigns dynamic weights to better integrate cross-layer information, which is crucial for handling scale variations. Third, to improve small object detection, we extend the Generalized Efficient Layer Aggregation Network (GELAN) with Swin Transformer blocks, enabling the model to capture both local and global features effectively. Additionally, Wise-IoU (WIoU) v3 is used as the bounding box regression loss to improve localization precision. Extensive experiments show that Att-YOLO is competitive with state-of-the-art methods, reaching a mean Average Precision (mAP) of 41.8% on the VisDrone2019 dataset and 28.2% on the UAVDT dataset while maintaining superior AP50 and favorable computational efficiency.
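The idea of assigning dynamic weights when fusing features from different layers can be illustrated with a minimal sketch. This is not the Att-YOLO fusion module: the function names `softmax` and `fuse_features` are hypothetical, the features are plain lists assumed to be already resized to a common length, and the per-input score (mean activation) is an assumed stand-in for a learned attention branch.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(features: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Fuse same-length feature vectors with input-dependent weights.

    Assumption: each vector is scored by its mean activation. A real
    model would typically learn this scoring function instead.
    """
    if not features:
        raise ValueError("need at least one feature vector")
    length = len(features[0])
    if length == 0 or any(len(f) != length for f in features):
        raise ValueError("feature vectors must be non-empty and equal in length")

    # One scalar score per input, turned into weights that sum to 1.
    scores = [sum(f) / length for f in features]
    weights = softmax(scores)

    # Element-wise weighted sum across the inputs.
    fused = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]
    return fused, weights


# Toy features standing in for a shallow and a deep layer.
shallow = [0.2, 0.9, 0.1, 0.4]
deep = [0.6, 0.5, 0.7, 0.8]
fused, weights = fuse_features([shallow, deep])
print("weights:", weights)
print("fused:", fused)
```

Because the weights are computed from the inputs themselves, they change from sample to sample, unlike a fixed sum or concatenation of layers.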
First Author:
fengzengxi
Indexed by:
Journal paper
Volume:
19(14): 1235.
ISSN No.:
1863-1703
Translation or Not:
no
Date of Publication:
2025-10-03
