
Object Detection in 20 Years: A Survey

Authors: Guo Yuhong, Shi Zhenwei, Ye Jieping, Zou Zhengxia
Publication date: 15/05/2019

Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, and text detection, and makes an in-depth analysis of their challenges as well as technical improvements in recent years.

Comment: This work has been submitted to the IEEE TPAMI for possible publication

arXiv.org e-Print Archive

Automatic Designs in Deep Neural Networks

Author: Liu Lanlan
Publication date: 01/01/2020

To train a Deep Neural Network (DNN) that performs well on a task, many design steps are taken, including data design, model design and loss design. Although remarkable progress has been made in all these domains of DNN design, the unexplored design space of each component is still vast. This motivates research on automated techniques that lift some of the heavy work from human researchers when exploring the design space. Automated design can help human researchers make numerous or challenging design choices and reduce the expertise required of them. Much effort has been made towards the automated design of DNNs, including synthetic data generation, automated data augmentation, neural architecture search and so on. Despite this effort, the automation of DNN design is still far from complete. This thesis contributes in two ways: identifying new problems in the DNN design pipeline that can be solved automatically, and proposing new solutions to problems that have already been explored by automated design. The first part of this thesis presents two problems that were usually solved with manual design but can benefit from automated design. To tackle the problem of inefficient computation caused by using a static DNN architecture for different inputs, some manual efforts have been made to use different networks for different inputs as needed, such as cascade models. We propose an automated dynamic inference framework that removes this manual effort and automatically chooses different architectures for different inputs during inference. To tackle the problem of designing differentiable loss functions for non-differentiable performance metrics, researchers usually design the loss manually for each individual task. We propose a unified loss framework that reduces the amount of manual loss design across tasks.
The second part of this thesis develops new techniques in domains where automated design has already been shown to be effective. In the synthetic data generation domain, we propose a novel method to automatically generate synthetic data for small-data object detection. The generated synthetic data can supplement the limited annotated real data available in small-data object detection tasks, such as rare disease detection. In the architecture search domain, we propose an architecture search method customized for generative adversarial networks (GANs). GANs are known to be unstable to train, and the proposed method stabilizes GAN training during the architecture search process.

PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163208/1/llanlan_1.pd

Deep Blue Documents at the University of Michigan

Generative Adversarial Networks for Online Visual Object Tracking Systems

Author: Zin Ghsoun
Publication venue: Scholars Commons @ Laurier
Publication date: 01/01/2019

Object tracking is one of the essential tasks in the computer vision domain, with numerous applications in fields such as human-computer interaction, video surveillance, augmented reality, and robotics. Object tracking refers to the process of detecting and locating the target object in a series of frames in a video. The state-of-the-art tracking-by-detection framework typically tracks the target object in two steps. The first step is drawing multiple samples near the target region of the previous frame. The second step is classifying each sample as either the target object or the background. Visual object tracking remains one of the most challenging tasks due to variations in visual data such as target occlusion, background clutter, illumination changes, and scale changes, as well as challenges stemming from the tracking problem itself, including fast motion, out of view, motion blur, deformation, and in-plane and out-of-plane rotation. These challenges continue to be tackled by researchers as they investigate more effective algorithms that are able to track any object under various changing conditions. To keep the research community motivated, several annual tracker benchmarking competitions are organized to consolidate performance measures and evaluation protocols in different tracking subfields, such as the Visual Object Tracking (VOT) challenges and the Multiple Object Tracking (MOT) challenges [1, 2]. Despite the excellent performance achieved with deep learning, modern deep tracking methods are still limited in several aspects. The variety of appearance changes over time remains a problem for deep trackers, owing to spatial overlap between positive samples. Furthermore, existing methods impose a high computational load and run slowly.
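The two tracking-by-detection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the names `draw_candidates`, `track_step` and `score_fn`, the Gaussian sampling of shifts and scales, and the fallback threshold are all hypothetical, and a toy overlap score stands in for the learned target/background classifier a real tracker would use.

```python
import random
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def draw_candidates(prev_box: Box, n: int, shift: float = 0.1,
                    scale: float = 0.05,
                    rng: Optional[random.Random] = None) -> List[Box]:
    """Step 1: draw n candidate boxes near the previous target box.

    Gaussian jitter of position and size is an assumption made here.
    """
    rng = rng or random.Random()
    x, y, w, h = prev_box
    candidates = []
    for _ in range(n):
        s = max(1.0 + rng.gauss(0.0, scale), 0.1)  # keep sizes positive
        new_w, new_h = w * s, h * s
        cx = x + w / 2 + rng.gauss(0.0, shift) * w
        cy = y + h / 2 + rng.gauss(0.0, shift) * h
        candidates.append((cx - new_w / 2, cy - new_h / 2, new_w, new_h))
    return candidates


def track_step(prev_box: Box, score_fn: Callable[[Box], float],
               n: int = 256, threshold: float = 0.5,
               rng: Optional[random.Random] = None) -> Tuple[Box, float]:
    """Step 2: score each candidate as target vs. background, keep the best.

    score_fn is a placeholder for a classifier returning a target score.
    If every candidate scores below the threshold, the previous box is kept.
    """
    candidates = draw_candidates(prev_box, n, rng=rng)
    scores = [score_fn(box) for box in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    if scores[best] < threshold:
        return prev_box, scores[best]
    return candidates[best], scores[best]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; used only as a toy score."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    true_box = (52.0, 48.0, 20.0, 20.0)  # made-up target position
    box, score = track_step((50.0, 50.0, 20.0, 20.0),
                            lambda b: iou(b, true_box),
                            rng=random.Random(0))
    print(box, score)
```

In a real tracker the scoring function would be the network's classification branch, and the chosen box would feed the next frame's sampling step.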
Recently, Generative Adversarial Networks (GANs) have shown excellent results in solving a variety of computer vision problems, which makes them attractive candidates for improving results in other computer vision applications, namely visual object tracking. In this thesis, we explore the impact of using the Residual Network (ResNet) as an alternative feature extractor to the Visual Geometry Group (VGG) network, which is commonly used in the literature. Furthermore, we attempt to address the limitations of object tracking by exploiting the ongoing advancement in Generative Adversarial Networks. We describe a generative adversarial network intended to improve the tracker’s classifier during the online training phase. The network generates adaptive masks to augment the positive samples detected by the convolutional layer of the tracker’s model in order to improve the model’s classifier by making the samples more difficult. We then integrate this network with the Multi-Domain Convolutional Neural Network (MDNet) tracker and present the results. Furthermore, we introduce a novel tracker, MDResNet, by substituting the convolutional layers of MDNet, which were originally taken from the Visual Geometry Group (VGG-M) network, with layers taken from the Residual Deep Network (ResNet-50), and compare the results. We also introduce a new tracker, Region of Interest with Adversarial Learning (ROIAL), by integrating the generative adversarial network with the Real-Time Multi-Domain Convolutional Network (RT-MDNet) tracker. Finally, we integrate the GAN with MDResNet and MDNet and compare the results with ROIAL.

Wilfrid Laurier University