qertgig.blogg.se - Video duplicate detector

Tracking multiple athletes in sports videos is a very challenging Multi-Object Tracking (MOT) task: athletes often look alike and frequently overlap one another, so an ordinary occlusion problem turns into a severe duplicate detection problem. In this paper, duplicate detection is newly and precisely defined as occlusion misreporting on the same athlete by multiple detection boxes in one frame. To address this problem, we design a novel transformer-based Duplicate Detection Decontaminator (D\(^3\)) for training, and a dedicated algorithm, Rally-Hungarian (RH), for matching. Once a duplicate detection occurs, D\(^3\) immediately modifies the training procedure by generating enhanced box losses. RH, motivated by the substitution rules of team sports, is especially well suited to sports videos. Moreover, to complement existing tracking datasets, which lack shot changes, we release a new dataset based on sports videos, named RallyTrack. Extensive experiments on RallyTrack show that combining D\(^3\) and RH dramatically improves tracking performance, with gains of 9.2 in MOTA and 4.5 in HOTA. Meanwhile, experiments on the MOT series and DanceTrack show that D\(^3\) can accelerate convergence during training, saving up to 80% of the original training time on MOT17. Finally, our model, which is trained only on volleyball videos, can be applied directly to basketball and soccer videos, which shows the advantage of our method.
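To make the definition of duplicate detection above concrete, here is a minimal Python sketch that flags pairs of detection boxes in a single frame that overlap heavily and may therefore be reporting the same athlete. It is not the D\(^3\) module or the Rally-Hungarian algorithm, whose details are not given here. The `(x1, y1, x2, y2)` box format, the function names `iou` and `find_duplicate_pairs`, and the 0.7 threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def find_duplicate_pairs(boxes, threshold=0.7):
    """Return index pairs of boxes in one frame whose IoU reaches the threshold.

    The threshold is a hypothetical value for illustration, not one taken
    from the paper.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(boxes)), 2)
        if iou(boxes[i], boxes[j]) >= threshold
    ]


# Hypothetical frame: two boxes on nearly the same region, one far away.
frame_boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
print(find_duplicate_pairs(frame_boxes))
```

A pure overlap test like this cannot tell a true duplicate from two different athletes who are genuinely occluding each other, which is the ambiguity the abstract describes and the reason a learned component is proposed instead of a fixed threshold.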