Deformable Part Region Learning and Feature Aggregation Tree Representation for Object Detection

Citations

Web of Science: 13

Scopus: 13
Abstract

Region-based object detection infers object regions for one or more categories in an image. Owing to recent advances in deep learning and region proposal methods, object detectors based on convolutional neural networks (CNNs) have flourished and provided promising detection results. However, the accuracy of convolutional object detectors often degrades because of the low feature discriminability caused by geometric variation or transformation of an object. In this article, we propose deformable part region (DPR) learning, which allows decomposed part regions to deform according to the geometric transformation of an object. Because the ground truth of the part models is not available in many cases, we design part model losses for detection and segmentation, and learn the geometric parameters by minimizing an integral loss that includes those part losses. As a result, we can train our DPR network without extra supervision and make multi-part models deformable according to object geometric variation. Moreover, we propose a novel feature aggregation tree (FAT) to learn more discriminative region of interest (RoI) features via bottom-up tree construction. The FAT learns stronger semantic features by aggregating part RoI features along the bottom-up pathways of the tree. We also present a spatial and channel attention mechanism for the aggregation between different node features. Based on the proposed DPR and FAT networks, we design a new cascade architecture that refines detection tasks iteratively. Without bells and whistles, we achieve impressive detection and segmentation results on the MSCOCO and PASCAL VOC datasets. Our Cascade D-PRD achieves 57.9 box AP with the Swin-L backbone. We also provide an extensive ablation study to demonstrate the effectiveness and usefulness of the proposed methods for large-scale object detection.
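The bottom-up aggregation idea in the abstract can be illustrated with a small sketch. The abstract does not give the tree topology or the form of the attention, so everything here is an assumption for illustration only: part RoI features are taken to be already pooled into plain channel vectors, the tree is binary, and a per-channel softmax stands in for the spatial and channel attention. The function names `channel_attention_merge` and `aggregate_bottom_up` are hypothetical and do not come from the article.

```python
import math


def channel_attention_merge(left, right):
    """Merge two equal-length channel vectors.

    Assumption: a per-channel softmax over the two children stands in
    for the attention mechanism mentioned in the abstract.
    """
    if len(left) != len(right):
        raise ValueError("child features must have the same number of channels")
    merged = []
    for a, b in zip(left, right):
        m = max(a, b)  # subtract the max for numerical stability
        wa, wb = math.exp(a - m), math.exp(b - m)
        merged.append((wa * a + wb * b) / (wa + wb))
    return merged


def aggregate_bottom_up(part_features):
    """Merge leaf features pairwise, level by level, until one root remains."""
    if not part_features:
        raise ValueError("need at least one part feature")
    level = [list(f) for f in part_features]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(channel_attention_merge(level[i], level[i + 1]))
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # an unpaired node is carried up unchanged
        level = next_level
    return level[0]


# Four hypothetical part features with two channels each.
parts = [[1.0, 0.0], [1.0, 2.0], [3.0, 0.0], [3.0, 2.0]]
root = aggregate_bottom_up(parts)
```

In this sketch each merged channel is a convex combination of its children, so the root stays within the range of the leaf values while leaning toward the stronger responses. A learned attention module would replace the fixed softmax in a real detector.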

Keywords

Convolutional object detector; deformable part model; feature aggregation tree; cascade detection; large scale object detection; instance segmentation
Title
Deformable Part Region Learning and Feature Aggregation Tree Representation for Object Detection
Author
Bae, Seung-Hwan
DOI
10.1109/TPAMI.2023.3268864
Publication date
2023-09-01
Type
Article
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
45
Issue
9
Pages
10817-10834