Decode-MOT: How Can We Hurdle Frames to Go Beyond Tracking-by-Detection?

Citations

Web of Science: 26
Scopus: 32

Abstract

The speed of tracking-by-detection (TBD) depends heavily on how many times the detector is run, because detection is the most expensive operation in TBD. In many practical cases, however, multi-object tracking (MOT) can be achieved with tracking-by-motion (TBM) alone. TBM is a viable alternative with little loss of MOT accuracy when object cardinality and motion vary little across consecutive frames. The MOT problem can therefore be recast as finding the best TBD/TBM mechanism. To this end, we propose a novel decision coordinator for MOT (Decode-MOT), which determines the best TBD/TBM mechanism according to scene and tracking contexts. Specifically, Decode-MOT learns tracking and scene contextual similarities between frames. Because these contextual similarities can vary significantly with the tracker used and the tracking scene, we train Decode-MOT via self-supervision. Evaluation results on the MOT Challenge datasets show that our method greatly boosts tracking speed while maintaining state-of-the-art MOT accuracy. Our code will be available at https://github.com/reussite-cv/Decode-MOT.
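
The TBD/TBM switching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Decode-MOT implementation: the learned, self-supervised decision coordinator is replaced here by a hand-set threshold on a caller-supplied frame similarity score, and the names `Track`, `propagate`, `nearest_velocity`, `run_tracker`, `detect`, `similarity`, and `threshold` are all hypothetical. The motion step assumes simple constant-velocity propagation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


@dataclass
class Track:
    box: Box
    velocity: Tuple[float, float] = (0.0, 0.0)


def propagate(track: Track) -> Track:
    """Tracking-by-motion step: shift the box by its per-frame velocity."""
    x, y, w, h = track.box
    vx, vy = track.velocity
    return Track((x + vx, y + vy, w, h), track.velocity)


def nearest_velocity(box: Box, tracks: List[Track]) -> Tuple[float, float]:
    """Estimate velocity as the offset from the closest existing track."""
    if not tracks:
        return (0.0, 0.0)
    prev = min(
        tracks,
        key=lambda t: (t.box[0] - box[0]) ** 2 + (t.box[1] - box[1]) ** 2,
    )
    return (box[0] - prev.box[0], box[1] - prev.box[1])


def run_tracker(
    frames: Sequence[object],
    detect: Callable[[object], List[Box]],
    similarity: Callable[[object, object], float],
    threshold: float = 0.8,  # assumed hand-set value, not a learned decision
) -> Tuple[List[List[Box]], int]:
    """Return per-frame boxes and the number of detector calls."""
    tracks: List[Track] = []
    outputs: List[List[Box]] = []
    detector_calls = 0
    last_detected = None  # last frame on which the detector was run

    for frame in frames:
        # Run the detector on the first frame, or when the current frame
        # differs too much from the last detected one.
        use_detector = (
            last_detected is None
            or similarity(last_detected, frame) < threshold
        )
        if use_detector:
            boxes = detect(frame)
            detector_calls += 1
            tracks = [Track(b, nearest_velocity(b, tracks)) for b in boxes]
            last_detected = frame
        else:
            # Skip detection and rely on motion only.
            tracks = [propagate(t) for t in tracks]
        outputs.append([t.box for t in tracks])

    return outputs, detector_calls
```

In this sketch the speed gain comes from the frames on which `detect` is skipped; the returned detector-call count makes that saving visible. How well such a fixed threshold would work in practice is not something the abstract addresses, since the paper learns the decision instead.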

Keywords

Multi-object tracking; tracking-by-detection; tracking-by-motion; scene and tracking contextual learning; hierarchical association
Title
Decode-MOT: How Can We Hurdle Frames to Go Beyond Tracking-by-Detection?
Authors
Lee, Seong-Ho; Park, Dae-Hyeon; Bae, Seung-Hwan
DOI
10.1109/TIP.2023.3298538
Publication date
2023
Type
Article
Journal
IEEE Transactions on Image Processing, vol. 32
Pages
4378-4392