TY - JOUR
T1 - Consumer-Centric Insights into Resilient Small Object Detection
T2 - SCIoU Loss and Recursive Transformer Network
AU - Wang, Le
AU - Shi, Yu
AU - Mao, Guojun
AU - Dharejo, Fayaz Ali
AU - Javed, Sajid
AU - Alathbah, Moath
N1 - Publisher Copyright: © 1975-2011 IEEE.
PY - 2024/2/1
Y1 - 2024/2/1
AB - As an emerging consumer electronics product, the unmanned aerial vehicle (UAV) has received growing attention in both the enterprise and individual consumer electronics markets in recent years, where it is used for a variety of tasks. Object detectors based on deep neural networks are convenient to embed into UAV products; however, drone-captured images pose challenges of object occlusion, large scale differences, and complex backgrounds for these methods, because they are not designed to detect small and tiny objects in aerial images. To address this problem, we propose an improved YOLO paradigm called SR-YOLO, with an Efficient Neck, Shape CIoU, and Recursion Bottleneck Transformer, for better object detection performance in consumer-level UAV products. First, an efficient neck structure is presented to retain richer features through a small object detection layer and an up-sampling operator suited to small object detection. Second, we design a new prediction-box loss function called shape complete-IoU (SCIoU), which uses a width (height) limiting factor to address the deficiency that CIoU focuses only on aspect ratios; SCIoU takes into account both the aspect ratio and the ratio of the two boxes' widths. Moreover, by combining a recurrent neural network with a multi-head self-attention mechanism in a cyclic manner, we construct a recursive bottleneck transformer to mitigate the impact of the highly dense scenes and occlusion present in UAV images. We conduct extensive experiments on two public datasets, VisDrone2019 and TinyPerson; the results show that the proposed model surpasses the compared YOLO model by 8.1% and 3.2% in mAP_50, respectively. In addition, the analysis and a case study further validate the superiority and effectiveness of SR-YOLO.
KW - Bottleneck transformer
KW - Small object detection
KW - Unmanned aerial vehicle (UAV) image
KW - You only look once
UR - http://www.scopus.com/inward/record.url?scp=85177060045&partnerID=8YFLogxK
U2 - 10.1109/TCE.2023.3330788
DO - 10.1109/TCE.2023.3330788
M3 - Article
AN - SCOPUS:85177060045
SN - 0098-3063
VL - 70
SP - 2178
EP - 2187
JO - IEEE Transactions on Consumer Electronics
JF - IEEE Transactions on Consumer Electronics
IS - 1
ER -