14 research outputs found

    The NIV Shorter Concordance

    No full text

    Guiding visual surveillance by tracking human attention

    No full text
    We describe a novel method for directing the attention of an automated surveillance system. Our starting premise is that the attention of people in a scene can be used as an indicator of interesting areas and events. To determine people’s attention from passive visual observations, we develop a system for automatic tracking and detection of individual heads to infer their gaze direction. The former is achieved by combining a histograms of oriented gradients (HOG) based head detector with frame-to-frame tracking using multiple point features to provide stable head images. The latter is achieved using a head pose classification method which uses randomised ferns with decision branches based on both HOG and colour based features to determine a coarse gaze direction for each person in the scene. By building both static and temporally varying maps of areas where people look, we are able to identify interesting regions.
    Ben Benfold, Ian Reid
    http://www.bmva.org/bmvc/2009/index.ht
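The randomised-fern pose classifier described above can be sketched generically. The following is a minimal fern classifier over numeric feature vectors, with each fern applying a few binary comparisons between feature values and indexing a class histogram. It is an illustrative sketch only: the fern count, depth, and the binary-test form are assumptions, and the actual HOG and colour features the paper uses are outside this snippet.

```python
import numpy as np

class RandomisedFerns:
    """Minimal randomised-fern classifier sketch. Each fern applies `depth`
    binary comparisons between feature values and indexes a per-fern class
    histogram; prediction multiplies the per-fern posteriors (log-sum)."""

    def __init__(self, n_ferns=10, depth=4, n_classes=8, rng=None):
        self.rng = rng if rng is not None else np.random.default_rng(0)
        self.n_ferns, self.depth, self.n_classes = n_ferns, depth, n_classes
        self.pairs = None   # feature-index pairs for each binary test
        self.hists = None   # per-fern, per-leaf class histograms

    def _leaf(self, X):
        # Bit b of the leaf code is set when feature i > feature j.
        i, j = self.pairs[..., 0], self.pairs[..., 1]
        bits = (X[:, i] > X[:, j]).astype(int)               # (n, ferns, depth)
        return (bits * (2 ** np.arange(self.depth))).sum(-1)  # (n, ferns)

    def fit(self, X, y):
        d = X.shape[1]
        self.pairs = self.rng.integers(0, d, (self.n_ferns, self.depth, 2))
        # Initialise with ones (Laplace smoothing) so no leaf has zero mass.
        self.hists = np.ones((self.n_ferns, 2 ** self.depth, self.n_classes))
        leaves = self._leaf(X)
        for f in range(self.n_ferns):
            np.add.at(self.hists[f], (leaves[:, f], y), 1.0)
        self.hists /= self.hists.sum(-1, keepdims=True)
        return self

    def predict(self, X):
        leaves = self._leaf(X)
        logp = np.zeros((len(X), self.n_classes))
        for f in range(self.n_ferns):
            logp += np.log(self.hists[f, leaves[:, f]])
        return logp.argmax(1)
```

Ferns trade the full tree structure of random forests for a flat set of tests, which keeps both training (histogram counting) and evaluation cheap enough for per-frame use.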

    Stable multi-target tracking in real-time surveillance video

    No full text
    The majority of existing pedestrian trackers concentrate on maintaining the identities of targets; however, systems for remote biometric analysis or activity recognition in surveillance video often require stable bounding-boxes around pedestrians rather than approximate locations. We present a multi-target tracking system that is designed specifically for the provision of stable and accurate head location estimates. By performing data association over a sliding window of frames, we are able to correct many data association errors and fill in gaps where observations are missed. The approach is multi-threaded and combines asynchronous HOG detections with simultaneous KLT tracking and Markov-Chain Monte-Carlo Data Association (MCMCDA) to provide guaranteed real-time tracking in high definition video. Where previous approaches have used ad-hoc models for data association, we use a more principled approach based on a Minimal Description Length (MDL) objective which accurately models the affinity between observations. We demonstrate by qualitative and quantitative evaluation that the system is capable of providing precise location estimates for large crowds of pedestrians in real-time. To facilitate future performance comparisons, we make a new dataset with hand annotated ground truth head locations publicly available.
    Ben Benfold and Ian Reid
    http://cvpr2011.org/index.htm
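The sliding-window MCMCDA and MDL machinery in the abstract is far beyond a short snippet, but the per-frame association step it generalises can be illustrated with a greedy, gated nearest-neighbour matcher. This is a deliberately simplified stand-in for the paper's method, shown only to make the detection-to-track linking problem concrete; the gate value and the use of head centres are assumptions.

```python
import numpy as np

def associate(tracks, detections, gate=30.0):
    """Greedy gated nearest-neighbour association of detections to tracks.
    `tracks` and `detections` are lists of (x, y) head centres; returns the
    matched (track, detection) index pairs and the unmatched detections,
    which would seed new tracks in a full tracker."""
    tracks = np.asarray(tracks, float)
    detections = np.asarray(detections, float)
    pairs, used_t, used_d = [], set(), set()
    if len(tracks) and len(detections):
        cost = np.linalg.norm(tracks[:, None] - detections[None, :], axis=-1)
        # Repeatedly take the globally cheapest remaining pair inside the gate.
        for idx in np.argsort(cost, axis=None):
            t, d = divmod(int(idx), len(detections))
            if t in used_t or d in used_d or cost[t, d] > gate:
                continue
            pairs.append((t, d))
            used_t.add(t)
            used_d.add(d)
    unmatched = [d for d in range(len(detections)) if d not in used_d]
    return pairs, unmatched
```

The MCMCDA approach in the paper improves on exactly this kind of greedy one-shot matching by sampling alternative association hypotheses over a window of frames, so a single bad frame does not lock in a wrong identity.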

    Unsupervised learning of a scene-specific coarse gaze estimator

    No full text
    We present a method to estimate the coarse gaze directions of people from surveillance data. Unlike previous work, we aim to do this without recourse to a large hand-labelled corpus of training data. In contrast, we propose a method for learning a classifier without any hand labelled data using only the output from an automatic tracking system. A Conditional Random Field is used to model the interactions between the head motion, walking direction, and appearance to recover the gaze directions and simultaneously train randomised decision tree classifiers. Experiments demonstrate performance exceeding that of conventionally trained classifiers on two large surveillance datasets.
    Ben Benfold and Ian Reid
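The weak signal this abstract exploits, walking direction as a noisy proxy for gaze, can be made concrete with a small quantiser that turns a velocity vector into one of N coarse gaze bins. The eight-bin discretisation and the bin layout here are assumptions for illustration, not the paper's parameterisation.

```python
import math

def direction_bin(dx, dy, n_bins=8):
    """Quantise a walking-direction vector (dx, dy) into one of n_bins coarse
    gaze bins. Bin 0 is centred on the +x direction; bins increase
    anticlockwise. In an unsupervised setup like the abstract's, such bins
    act as noisy pseudo-labels for training an appearance classifier."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    width = 2 * math.pi / n_bins
    return int(((angle + width / 2) // width) % n_bins)
```

A CRF as described in the abstract then smooths these noisy labels by coupling them with head motion and appearance, rather than trusting the walking direction in every frame.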

    Gaze directed camera control for face image acquisition

    No full text
    Face recognition in surveillance situations usually requires high resolution face images to be captured from remote active cameras. Since the recognition accuracy is typically a function of the face direction, with frontal faces more likely to lead to reliable recognition, we propose a system which optimises the capturing of such images by using coarse gaze estimates from a static camera. By considering the potential information gain from observing each target, our system automatically sets the pan, tilt and zoom values (i.e. the field of view) of multiple cameras observing different tracked targets in order to maximise the likelihood of correct identification. The expected gain in information is influenced by the controllable field of view, and by the false positive and negative rates of the identification process, which are in turn a function of the gaze angle. We validate the approach using a combination of simulated situations and real tracking output to demonstrate superior performance over alternative approaches, notably using no gaze information, or using gaze inferred from direction of travel (i.e. assuming each person is always looking directly ahead). We also show results from a live implementation with a static camera and two pan-tilt-zoom devices, involving real-time tracking, processing and control.
    Eric Sommerlade, Ben Benfold and Ian Reid
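The dispatch criterion the abstract describes, ranking targets by expected information gain, can be sketched with a toy scoring function: prior identity uncertainty (binary entropy) scaled by an observation-reliability term that falls off as the face turns away from frontal. The cosine fall-off and the binary identity model are assumptions made for illustration only, not the paper's probabilistic model.

```python
import math

def expected_info_gain(p_identity, gaze_angle_deg):
    """Toy expected-information-gain score for pointing a PTZ camera at one
    target: how uncertain we are about the identity, times how reliable a
    face observation at this gaze angle would be (cosine fall-off assumed)."""
    def h(p):  # binary entropy in bits
        return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    reliability = max(0.0, math.cos(math.radians(gaze_angle_deg)))
    return h(p_identity) * reliability

def pick_target(targets):
    """Dispatch the camera to the target with the largest expected gain.
    `targets` is a list of (p_identity, gaze_angle_deg) tuples."""
    return max(range(len(targets)), key=lambda i: expected_info_gain(*targets[i]))
```

Under this scoring, an uncertain target facing the camera beats an equally uncertain target seen in profile, which is the core intuition behind using gaze to drive camera control.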

    Understanding interactions and guiding visual surveillance by tracking attention

    No full text
    The central tenet of this paper is that by determining where people are looking, other tasks involved with understanding and interrogating a scene are simplified. To this end we describe a fully automatic method to determine a person’s attention based on real-time visual tracking of their head and a coarse classification of their head pose. We estimate the head pose, or coarse gaze, using randomised ferns with decision branches based on both histograms of gradient orientations and colour based features. We use the coarse gaze for three applications to demonstrate its value: (i) we show how by building static and temporally varying maps of areas where people look we are able to identify interesting regions; (ii) we show how by determining the gaze of people in the scene we can more effectively control a multi-camera surveillance system to acquire faces for identification; (iii) we show how by identifying where people are looking we can more effectively classify human interactions.
    Ian Reid, Ben Benfold, Alonso Patron, and Eric Sommerlade

    Cognitive visual tracking and camera control

    No full text
    Cognitive visual tracking is the process of observing and understanding the behavior of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario show the effectiveness of our approach and its potential for real applications of cognitive vision. © 2011 Elsevier Inc. All rights reserved.

    A distributed camera system for multi-resolution surveillance

    No full text
    We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor. Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database. Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table. We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility to multi-camera systems for intelligent surveillance. © 2009 IEEE
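The SQL-repository pattern this abstract describes, where tracker clients insert rows as messages and a supervisor polls tables as reads, can be sketched with Python's built-in sqlite3. The table and column names below are invented for illustration, and the real system uses a server-based SQL database shared across processes rather than an in-memory one.

```python
import sqlite3

# A shared database standing in for the central SQL repository; a real
# deployment would connect each camera process to a server-hosted database.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE detections
                (frame INTEGER, target_id INTEGER, x REAL, y REAL)""")

def publish(frame, target_id, x, y):
    """Tracker-side client call: write one detection into the shared table."""
    conn.execute("INSERT INTO detections VALUES (?, ?, ?, ?)",
                 (frame, target_id, x, y))
    conn.commit()

def latest_targets(frame):
    """Supervisor-side poll: read all targets seen in a given frame, from
    which the supervisor could decide whether to dispatch a PTZ camera."""
    rows = conn.execute("SELECT target_id, x, y FROM detections "
                        "WHERE frame = ?", (frame,)).fetchall()
    return {t: (x, y) for t, x, y in rows}

publish(1, 7, 120.0, 80.0)
publish(1, 9, 300.0, 95.0)
```

Using the database as the communication channel gives asynchronous messaging and archiving for free, at the cost of polling latency, which the abstract's architecture accepts in exchange for simplicity.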