Human Detection and Tracking for Video Surveillance A Cognitive Science Approach
With crimes on the rise around the world, video surveillance is becoming
more important day by day. Due to the lack of human resources to monitor this
increasing number of cameras manually, new computer vision algorithms to
perform lower- and higher-level tasks are being developed. We have developed a
new method incorporating the widely used Histograms of Oriented Gradients
(HOG), the theory of visual saliency, and the saliency prediction model Deep
Multi-Level Network to detect human beings in video sequences. Furthermore, we
implemented the k-means algorithm to cluster the HOG feature vectors of the
positively detected windows and determined the path followed by a person in
the video. We achieved a detection precision of 83.11% and a recall of 41.27%,
and obtained these results 76.866 times faster than classification on normal
images.
Comment: ICCV 2017, Venice, Italy; 5 pages, figures
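The two stages named in the abstract, HOG description of detection windows followed by k-means clustering to recover a path, can be sketched as below. This is a minimal illustration, not the authors' pipeline: the descriptor is a single orientation histogram rather than a full block-normalized HOG, and the detection centres are synthetic stand-in data.

```python
import numpy as np
from sklearn.cluster import KMeans

def hog_descriptor(patch, n_bins=9):
    """Tiny HOG-style descriptor: one gradient-orientation histogram
    over the whole patch (a real HOG uses many normalized cell blocks)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

# Hypothetical centres of positively detected windows across frames:
# a person near (50, 60) early on, near (120, 80) later.
rng = np.random.default_rng(0)
detections = np.vstack([
    rng.normal(loc=(50, 60), scale=3, size=(20, 2)),
    rng.normal(loc=(120, 80), scale=3, size=(20, 2)),
])

# Clustering the detections; the ordered cluster centres trace the path.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(detections)
path = km.cluster_centers_
```

In the paper the clustering is applied to the HOG feature vectors of the positive windows; here it is applied to window centres purely to keep the sketch short.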
Persistent Evidence of Local Image Properties in Generic ConvNets
Supervised training of a convolutional network for object classification
should make explicit any information related to the class of objects and
disregard any auxiliary information associated with the capture of the image or
the variation within the object class. Does this happen in practice? Although
this seems to pertain to the very final layers in the network, if we look at
earlier layers we find that this is not the case. Surprisingly, strong spatial
information is implicit. This paper addresses this issue, in particular
exploiting the image representation at the first fully connected layer, i.e.
the global image descriptor, which has recently been shown to be most
effective in a range of visual recognition tasks. We empirically demonstrate
evidence for this finding in the context of four different tasks: 2D landmark
detection, 2D object keypoint prediction, estimation of the RGB values of the
input image, and recovery of the semantic label of each pixel. We base our
investigation on a simple framework with ridge regression applied commonly
across these tasks, and show results which all support our insight. Such
spatial information can be used for computing correspondence of landmarks to
a good accuracy, and could potentially be useful for improving the training
of convolutional nets for classification purposes.
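The probing framework described, ridge regression from a global descriptor to a spatial target, can be sketched as follows. The descriptors here are synthetic stand-ins for fc-layer features, constructed so that a 2-D position is linearly decodable from them, mimicking the paper's finding rather than reproducing its experiments.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n, d = 200, 64

# Hypothetical "fc-layer descriptors": a random linear mixing of an
# (x, y) landmark position into d dimensions, plus a little noise.
positions = rng.uniform(0, 1, size=(n, 2))
mixing = rng.normal(size=(2, d))
descriptors = positions @ mixing + 0.01 * rng.normal(size=(n, d))

# Ridge regression probe: train on 150 images, test on the rest.
reg = Ridge(alpha=1.0).fit(descriptors[:150], positions[:150])
pred = reg.predict(descriptors[150:])
err = np.abs(pred - positions[150:]).mean()
```

A small test error indicates the spatial signal survives in the descriptor; in the paper the same probe is run per task (landmarks, keypoints, RGB, semantic labels).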
Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations
We present a method for estimating articulated human pose from a single
static image based on a graphical model with novel pairwise relations that make
adaptive use of local image measurements. More precisely, we specify a
graphical model for human pose which exploits the fact that local image
measurements can be used both to detect parts (or joints) and to predict
the spatial relationships between them (Image Dependent Pairwise Relations).
These spatial relationships are represented by a mixture model. We use Deep
Convolutional Neural Networks (DCNNs) to learn conditional probabilities for
the presence of parts and their spatial relationships within image patches.
Hence our model combines the representational flexibility of graphical models
with the efficiency and statistical power of DCNNs. Our method significantly
outperforms state-of-the-art methods on the LSP and FLIC datasets and also
performs very well on the Buffy dataset without any training.
Comment: NIPS 2014, camera-ready
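The scoring structure the abstract describes, unary part scores combined with image-dependent pairwise compatibilities, can be illustrated on a toy two-part model. The score tables below are hand-made numbers standing in for DCNN outputs; the paper runs dynamic programming on a tree of parts rather than this exhaustive search.

```python
import numpy as np

# Toy two-part model (say, elbow and wrist) with 3 candidate locations each.
unary_elbow = np.array([0.1, 0.9, 0.2])   # stand-in DCNN part scores
unary_wrist = np.array([0.3, 0.2, 0.8])

# pairwise[i, j]: image-dependent compatibility of elbow candidate i
# with wrist candidate j (in the paper, predicted from the image patch).
pairwise = np.array([
    [0.5, 0.1, 0.0],
    [0.0, 0.2, 0.6],
    [0.1, 0.0, 0.1],
])

# Joint inference: maximize unary + unary + pairwise over all pairs.
total = unary_elbow[:, None] + unary_wrist[None, :] + pairwise
best_elbow, best_wrist = np.unravel_index(np.argmax(total), total.shape)
```

Note that the pairwise term can override the individually best detections, which is exactly the benefit of conditioning the relations on the image.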
End-to-end weakly-supervised semantic alignment
We tackle the task of semantic alignment where the goal is to compute dense
semantic correspondence aligning two images depicting objects of the same
category. This is a challenging task due to large intra-class variation,
changes in viewpoint and background clutter. We present the following three
principal contributions. First, we develop a convolutional neural network
architecture for semantic alignment that is trainable in an end-to-end manner
from weak image-level supervision in the form of matching image pairs. The
outcome is that parameters are learnt from rich appearance variation present in
different but semantically related images without the need for tedious manual
annotation of correspondences at training time. Second, the main component of
this architecture is a differentiable soft inlier scoring module, inspired by
the RANSAC inlier scoring procedure, that computes the quality of the
alignment based only on geometrically consistent correspondences, thereby
reducing the effect of background clutter. Third, we demonstrate that the
proposed approach
achieves state-of-the-art performance on multiple standard benchmarks for
semantic alignment.
Comment: In 2018 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR 2018)
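The key idea of the soft inlier module, replacing RANSAC's hard 0/1 inlier vote with a smooth function of the residual so the score is differentiable, can be sketched as below. The Gaussian weighting and the sample points are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def soft_inlier_score(src, dst, transform, sigma=1.0):
    """Differentiable stand-in for a hard inlier count: each
    correspondence contributes exp(-(r/sigma)^2) instead of a 0/1 vote,
    so gradients flow through the alignment quality."""
    residuals = np.linalg.norm(src @ transform.T - dst, axis=1)
    return np.exp(-(residuals / sigma) ** 2).sum()

# Hypothetical correspondences: three consistent with the identity
# transform, one gross outlier (e.g. a background match).
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
dst = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
score = soft_inlier_score(src, dst, np.eye(2), sigma=0.5)
```

The outlier contributes essentially nothing to the score, which is how geometrically inconsistent (cluttered) matches are suppressed during end-to-end training.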
Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network
We propose a heterogeneous multi-task learning framework for human pose
estimation from a monocular image with a deep convolutional neural network. In
particular, we simultaneously learn a pose-joint regressor and a sliding-window
body-part detector in a deep network architecture. We show that including the
body-part detection task helps to regularize the network, directing it to
converge to a good solution. We report competitive and state-of-the-art
results on several datasets. We also empirically show that the learned neurons
in the middle layer of our network are tuned to localized body parts.
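The heterogeneous objective, a regression loss on joint coordinates plus a classification loss on sliding-window part detection, can be sketched as a single combined loss. The specific losses and the weighting below are illustrative assumptions; in the actual network both terms back-propagate into a shared trunk.

```python
import numpy as np

def multitask_loss(joint_pred, joint_true, part_logit, part_label, weight=0.5):
    """Combined objective: squared error on joint coordinates plus
    binary cross-entropy on a body-part detection score. The detection
    term acts as a regularizer for the pose regressor."""
    reg = np.mean((joint_pred - joint_true) ** 2)
    p = 1.0 / (1.0 + np.exp(-part_logit))            # sigmoid
    cls = -(part_label * np.log(p) + (1 - part_label) * np.log(1 - p))
    return reg + weight * cls

loss = multitask_loss(
    joint_pred=np.array([0.4, 0.6]),
    joint_true=np.array([0.5, 0.5]),
    part_logit=2.0,          # hypothetical window score for a true part
    part_label=1.0,
)
```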
Multi-Person Pose Estimation with Local Joint-to-Person Associations
Despite the recent success of neural networks for human pose estimation,
current approaches are limited to pose estimation of a single person and cannot
handle humans in groups or crowds. In this work, we propose a method that
estimates the poses of multiple persons in an image in which a person can be
occluded by another person or might be truncated. To this end, we consider
multi-person pose estimation as a joint-to-person association problem. We
construct a fully connected graph from a set of detected joint candidates in an
image and resolve the joint-to-person association and outlier detection using
integer linear programming. Since solving joint-to-person association jointly
for all persons in an image is an NP-hard problem and even approximations are
expensive, we solve the problem locally for each person. On the challenging
MPII Human Pose Dataset for multiple persons, our approach achieves the
accuracy of a state-of-the-art method, but it is 6,000 to 19,000 times faster.
Comment: Accepted to European Conference on Computer Vision (ECCV) Workshops,
Crowd Understanding, 201
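The local joint-to-person association can be illustrated as a tiny exhaustive search over candidate assignments, standing in for the paper's integer linear program: pick one candidate per joint type for the current person, maximizing detector scores plus a pairwise compatibility term. All numbers below are hand-made for illustration.

```python
import itertools
import numpy as np

# Two joint types (head, neck) with two detected candidates each.
cand_scores = [np.array([0.9, 0.4]),    # head candidate scores
               np.array([0.3, 0.8])]    # neck candidate scores

# pair_compat[i, j]: compatibility of head candidate i with neck
# candidate j (e.g. from their relative position).
pair_compat = np.array([[0.7, 0.0],
                        [0.1, 0.2]])

# Exhaustive search over assignments; an ILP solver scales this to
# many joints and candidates, solved locally per person in the paper.
best_val, best_pick = -np.inf, None
for i, j in itertools.product(range(2), range(2)):
    val = cand_scores[0][i] + cand_scores[1][j] + pair_compat[i, j]
    if val > best_val:
        best_val, best_pick = val, (i, j)
```

Here the pairwise term makes the jointly consistent pair win even though the second neck candidate has the higher individual score, which is the point of solving the association jointly rather than per joint.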