Attentional Action-Driven Deep Networks for Visual Object Tracking

Author: Sangdoo Yun (윤상두)
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/08/2017

Thesis (Ph.D.) -- Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. Advisor: Jin Young Choi (최진영).

This dissertation proposes a novel visual tracking method controlled by sequential actions learned through deep reinforcement learning. Recent trackers based on deep networks adopt a tracking-by-detection scheme, which selects the target position with the highest matching score. This scheme achieves good performance in a simple manner but is inefficient in exploring candidates. We propose an efficient action-driven deep tracker that is controlled by sequential actions trained by deep reinforcement learning. In contrast to existing trackers based on deep networks, the proposed tracker is designed to achieve light computation as well as satisfactory tracking accuracy in both location and scale. The deep network that controls the actions is pre-trained on various training sequences and fine-tuned during tracking for online adaptation to target and background changes. The pre-training uses deep reinforcement learning as well as supervised learning. The use of reinforcement learning allows even partially labeled data to be used successfully for semi-supervised learning. In addition, this dissertation tackles the problem of tracking an object that interacts with other objects in a complex scene, such as basketball game scenes containing various interactions among players and painting motions. For this purpose, we design a multi-agent architecture that handles diverse interaction movements among neighboring objects near the target object. The multi-agent architecture is designed so that the main tracker can determine a proper action by using the states of neighboring trackers.
Through extensive evaluation, the proposed tracker is shown to achieve competitive performance while running much faster than state-of-the-art deep network-based trackers.

Chapter 1 Introduction
  1.1 Background
  1.2 Related Work
    1.2.1 Visual Object Tracking
    1.2.2 Action-Driven Approach
    1.2.3 Deep Reinforcement Learning
    1.2.4 Multi-agent Reinforcement Learning
  1.3 Contents of Research
  1.4 Thesis Organization
Chapter 2 Preliminaries
  2.1 Reinforcement Learning
    2.1.1 Markov Decision Process
    2.1.2 Reinforcement Learning Problem
  2.2 Policy Gradient
Chapter 3 Action-Driven Visual Tracking
  3.1 Overview
  3.2 Problem Settings
  3.3 Action-Decision Network Architecture
  3.4 Training Methods for Action-Decision Network
    3.4.1 Training ADNet with Supervised Learning
    3.4.2 Training ADNet with Reinforcement Learning
  3.5 Online Adaptation in Tracking
  3.6 Implementation Details
    3.6.1 Pretraining the ADNet
    3.6.2 Online Adaptation of the ADNet
Chapter 4 Interacted Action-Driven Visual Tracking
  4.1 Overview
  4.2 Problem Settings
  4.3 Proposed Method
    4.3.1 Baseline
    4.3.2 Multi-agent Architecture
  4.4 Training Methods for Multi-agent ADNet
    4.4.1 Training Multi-agent ADNet with Supervised Learning
    4.4.2 Training of Multi-agent ADNet with Reinforcement Learning
  4.5 Online Adaptation in Tracking
  4.6 Implementation Details
    4.6.1 Pretraining
    4.6.2 Online Adaptation
Chapter 5 Experiments
  5.1 Action-Driven Visual Tracking
    5.1.1 Datasets
    5.1.2 Self-comparison
    5.1.3 Analysis
    5.1.4 State-of-the-art Comparison
    5.1.5 Qualitative Results
  5.2 Interacted Action-Driven Tracking
    5.2.1 Datasets
    5.2.2 Self Comparison
    5.2.3 Quantitative Results
    5.2.4 Qualitative Results
Chapter 6 Conclusion
  6.1 Concluding Remarks
  6.2 Future Works
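
As a rough illustration of the action-driven idea in the abstract, the sketch below moves a bounding box through a sequence of discrete actions until a stop action is chosen, instead of scoring many candidate boxes. It is not the thesis implementation. The action set, the step sizes, and the names `Box`, `apply_action`, `toy_policy`, and `track_frame` are all assumptions made for this example, and `toy_policy` is a stand-in that peeks at a known target box where the thesis uses a learned deep network.

```python
from dataclasses import dataclass

# Hypothetical action set; this record does not list the actual actions.
# "stop" comes first so that ties are resolved in favor of stopping.
ACTIONS = ("stop", "left", "right", "up", "down", "scale_up", "scale_down")


@dataclass(frozen=True)
class Box:
    cx: float  # center x
    cy: float  # center y
    w: float   # width
    h: float   # height


def apply_action(box: Box, action: str, step: float = 0.1, scale: float = 1.1) -> Box:
    """Return the box after one discrete action (step sizes are assumed values)."""
    dx, dy = step * box.w, step * box.h
    if action == "stop":
        return box
    if action == "left":
        return Box(box.cx - dx, box.cy, box.w, box.h)
    if action == "right":
        return Box(box.cx + dx, box.cy, box.w, box.h)
    if action == "up":
        return Box(box.cx, box.cy - dy, box.w, box.h)
    if action == "down":
        return Box(box.cx, box.cy + dy, box.w, box.h)
    if action == "scale_up":
        return Box(box.cx, box.cy, box.w * scale, box.h * scale)
    if action == "scale_down":
        return Box(box.cx, box.cy, box.w / scale, box.h / scale)
    raise ValueError(f"unknown action: {action}")


def toy_policy(box: Box, target: Box) -> str:
    """Stand-in for the learned network: pick the action whose result is
    closest to a known target box (L1 distance over center and size)."""
    def cost(b: Box) -> float:
        return (abs(b.cx - target.cx) + abs(b.cy - target.cy)
                + abs(b.w - target.w) + abs(b.h - target.h))
    return min(ACTIONS, key=lambda a: cost(apply_action(box, a)))


def track_frame(box: Box, target: Box, max_steps: int = 20):
    """Apply actions one after another until 'stop' or the step limit."""
    taken = []
    for _ in range(max_steps):
        action = toy_policy(box, target)
        taken.append(action)
        if action == "stop":
            break
        box = apply_action(box, action)
    return box, taken


if __name__ == "__main__":
    final_box, taken = track_frame(Box(50, 50, 20, 20), Box(60, 50, 20, 20))
    print(final_box, taken)
```

In this toy setting only a handful of actions are evaluated per frame, which is the kind of saving over exhaustive candidate scoring that the abstract attributes to the action-driven approach.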

SNU Open Repository and Archive