557 research outputs found
Robust pedestrian detection and tracking in crowded scenes
In this paper, a robust computer vision approach to detecting and tracking pedestrians in unconstrained crowded scenes is presented. Pedestrian detection is performed via a 3D clustering process within a region-growing framework. The clustering process avoids hard thresholds by using biometrically inspired constraints and a number of plan view statistics. Pedestrian tracking is achieved by formulating the track matching process as a weighted bipartite graph and using a Weighted Maximum Cardinality Matching scheme. The approach is evaluated using both indoor and outdoor sequences, captured using a variety of different camera placements and orientations, that feature significant challenges in terms of the number of pedestrians present, their interactions and scene lighting conditions. The evaluation is performed against a manually generated groundtruth for all sequences. Results point to the extremely accurate performance of the proposed approach in all cases.
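The track matching idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `match_tracks`, the similarity scores, and the brute-force search are assumptions chosen for readability, and a practical system would likely use a polynomial-time matching algorithm instead. The sketch picks, among all matchings with the most track-detection pairs, the one with the highest total weight.

```python
from itertools import permutations


def match_tracks(weights):
    """Return {track_index: detection_index} for a small weighted bipartite graph.

    weights[i][j] is a hypothetical similarity score between track i and
    detection j, or None when that pairing is not allowed (no edge).
    Among all matchings with the most pairs, the one with the highest
    total weight is chosen.
    """
    n_tracks = len(weights)
    n_dets = len(weights[0]) if weights else 0
    # One None slot per track lets any track stay unmatched.
    slots = list(range(n_dets)) + [None] * n_tracks
    best_key, best_pairs = (-1, float("-inf")), {}
    for assignment in set(permutations(slots, n_tracks)):
        pairs = {
            i: j
            for i, j in enumerate(assignment)
            if j is not None and weights[i][j] is not None
        }
        # Compare cardinality first, then total weight.
        key = (len(pairs), sum(weights[i][j] for i, j in pairs.items()))
        if key > best_key:
            best_key, best_pairs = key, pairs
    return best_pairs


# Track 0 slightly prefers detection 0, but track 1 can only use detection 0.
scores = [[0.9, 0.8],
          [0.85, None]]
print(match_tracks(scores))
```

In this toy case a greedy choice would give detection 0 to track 0 and leave track 1 unmatched, whereas the maximum cardinality matching pairs track 0 with detection 1 and track 1 with detection 0.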
Radar-based Feature Design and Multiclass Classification for Road User Recognition
The classification of individual traffic participants is a complex task,
especially for challenging scenarios with multiple road users or under bad
weather conditions. Radar sensors provide a way of measuring such scenes that is
orthogonal to well-established camera systems. In order to gain
accurate classification results, 50 different features are extracted from the
measurement data and tested on their performance. From these features a
suitable subset is chosen and passed to random forest and long short-term
memory (LSTM) classifiers to obtain class predictions for the radar input.
Moreover, it is shown why data imbalance is an inherent problem in automotive
radar classification when the dataset is not sufficiently large. To overcome
this issue, classifier binarization is used among other techniques in order to
better account for underrepresented classes. A new method to couple the
resulting probabilities is proposed and compared to others with great success.
Final results show substantial improvements when compared to ordinary
multiclass classification.
Comment: 8 pages, 6 figures
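Classifier binarization, as mentioned in this abstract, splits a multiclass problem into several binary ones whose outputs must then be combined. The sketch below shows only a simple normalisation baseline for one-vs-rest outputs; it is not the coupling method proposed in the paper, and the function name and class labels are hypothetical.

```python
def couple_by_normalisation(binary_probs):
    """Turn one-vs-rest probabilities into a multiclass distribution.

    binary_probs maps a class label to the probability reported by that
    class's binary classifier (class versus all other classes). The
    values are rescaled so that they sum to one.
    """
    total = sum(binary_probs.values())
    if total == 0:
        # No classifier responded: fall back to a uniform distribution.
        return {label: 1.0 / len(binary_probs) for label in binary_probs}
    return {label: p / total for label, p in binary_probs.items()}


# Hypothetical outputs of three binary classifiers for one radar object.
one_vs_rest = {"pedestrian": 0.6, "cyclist": 0.6, "car": 0.3}
print(couple_by_normalisation(one_vs_rest))
```

Normalisation keeps the ranking of the classes unchanged; more elaborate coupling schemes can also change the ranking.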
A study on detection of risk factors of a toddler’s fall injuries using visual dynamic motion cues
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The research in this thesis is intended to aid caregivers’ supervision of toddlers to prevent accidental injuries, especially injuries due to falls in the home environment. There have been very few attempts to develop an automatic system to tackle young children’s accidents despite the fact that they are particularly vulnerable to home accidents and a caregiver cannot give continuous supervision. Vision-based analysis methods have been developed to recognise toddlers’ fall risk factors related to changes in their behaviour or environment. First of all, suggestions to prevent fall events of young children at home were collected from well-known organisations for child safety. A large number of fall records of toddlers who had sought treatment at a hospital were analysed to identify a toddler’s fall risk factors. The factors include clutter being a tripping or slipping hazard on the floor and a toddler moving around or climbing furniture or room structures.
The major technical problem in detecting the risk factors is to classify foreground objects into human and non-human, and novel approaches have been proposed for the classification. Unlike most existing studies, which focus on human appearance such as skin colour for human detection, the approaches addressed in this thesis use cues related to dynamic motions. The first cue is based on the fact that there is relative motion between human body parts, while typical indoor clutter does not have such parts with diverse motions. In addition, other motion cues are employed to differentiate a human from a pet, since a pet also moves its parts diversely. These cues are the angle changes of an ellipse fitted to each object and the history of its actual heights, which capture the various posture changes and the different body size of pets. The methods work well as long as foreground regions are correctly segmented.
Video Surveillance Analysis as a Context for Embedded Systems and Artificial Intelligence Education
Video surveillance analysis is an exciting, active research area and an important industry application. It is a multidisciplinary field that draws on signal processing, embedded systems, and artificial intelligence topics, and is well suited to motivate student engagement in all of these areas. This paper describes the benefits of the convergence of these topics, presents a versatile video surveillance analysis process that can be used as the basis for many investigations, and presents two template exercises in tracking detected targets and in evaluating runtime efficiency. The processing chain consists of detecting changes in a scene and locating and characterizing the resulting targets. The analysis is illustrated for targets in outdoor scenes using a variety of classification features. Sample code for the processing chain is also included.
Video foreground extraction for mobile camera platforms
Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, it is often the case that the algorithm needs to operate under the following challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles.
The first problem addresses passenger detection and tracking tasks for public transport buses, investigating the problem of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Using the SVM classifier, the appearance transformation models capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions.
In the second problem, we present a system for pedestrian detection involving scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination. This produces clusters of images that differ in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which are then used to detect candidate foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be detected.
In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) techniques (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches include: a) a new histogram feature, formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook based HOG feature with the branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook based HOGB approach.
In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations.
The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.
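The kernel density estimation background model mentioned in this abstract can be sketched for a single pixel as follows. This is a simplified illustration rather than the thesis method: it assumes one grey-level intensity per pixel instead of colour and gradient, and the function names, bandwidth, and threshold are hypothetical values chosen for the example.

```python
import math


def kde_density(value, samples, bandwidth=5.0):
    """Gaussian kernel density estimate of `value` given past pixel samples."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return sum(
        norm * math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
    ) / len(samples)


def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """Flag a pixel as candidate foreground when its density is low."""
    return kde_density(value, samples, bandwidth) < threshold


# Hypothetical intensity history of one background pixel in an image cluster.
background = [100, 102, 98, 101, 99]
print(is_foreground(100, background))  # value close to the history
print(is_foreground(200, background))  # value far from the history
```

A pixel whose new value is well explained by its history receives a high density and is treated as background; a value far from every stored sample receives a density near zero and becomes a candidate foreground pixel.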
Object and feature based modelling of attention in meeting and surveillance videos
MPhil
The aim of the thesis is to create and validate models of visual attention. To
this end, a novel unsupervised object detection and tracking framework has been
developed by the author. It is demonstrated on people, faces and moving objects,
and the output is integrated in the modelling of visual attention. The proposed approach
integrates several types of modules in initialisation, target estimation and validation.
Tracking is first used to introduce high-level features, by extending a popular model
based on low-level features [1]. Two automatic models of visual attention are further
implemented. One is based on winner-take-all and inhibition of return as the mechanisms
of selection on a saliency model with high- and low-level features combined.
The other is based only on high-level object tracking results and statistical properties
from the collected eye-traces, with the possibility of activating inhibition of return
as an additional mechanism. The parameters of the tracking framework are thoroughly
investigated and its success demonstrated. Eye-tracking experiments show that high-level
features are much better at explaining the allocation of attention by the subjects
in the study. Low-level features alone do correlate significantly with real allocation
of attention. However, combining them with high-level features lowers the correlation
score in comparison to using high-level features alone. Further, findings
in collected eye-traces are studied with qualitative methods, mainly to discover directions
for future research in the area. Similarities and dissimilarities between automatic
models of attention and collected eye-traces are discussed.
Camera-based measurement of cyclist motion
Heavy goods vehicles are overrepresented in cyclist fatality statistics in the United Kingdom relative to their proportion of total traffic volume. In particular, the statistics highlight a problem for vehicles turning left across the path of a cyclist on their inside. In this article, we present a camera-based system to detect and track cyclists in the blind spot. The system uses boosted classifiers and geometric constraints to detect cyclist wheels, and Canny edge detection to locate the ground contact point. The locations of these points are mapped into physical coordinates using a calibration system based on the ground plane. A Kalman Filter is used to track and predict the future motion of the cyclist. Full-scale tests were conducted using a construction vehicle fitted with two cameras, and the results compared with measurements from an ultrasonic-sensor system. Errors were comparable to the ultrasonic system, with an average error standard deviation of 4.3 cm when the cyclist was 1.5 m from the heavy goods vehicle, and 7.1 cm at a distance of 1 m. When results were compared to manually extracted cyclist position data, errors were less than 4 cm at separations of 1.5 and 1 m. Compared to the ultrasonic system, the camera system requires simple hardware and can easily differentiate cyclists from stationary or moving background objects such as parked cars or roadside furniture. However, the cameras suffer from reduced robustness and accuracy at close range and cannot operate in low-light conditions. C. Eddy was supported by the UK Engineering and Physical Sciences Research Council (EPSRC). C.C. de Saxe was supported by the Cambridge Commonwealth, European and International Trust, UK, and the Council for Scientific and Industrial Research (CSIR), South Africa.
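The abstract states only that a Kalman Filter tracks and predicts cyclist motion. As a rough illustration of how such a filter works, the sketch below tracks one coordinate with a constant-velocity motion model. The model, the function name `kalman_step`, the time step, and the noise values `q` and `r` are all assumptions made for this example and are not taken from the article.

```python
def kalman_step(x, P, z, dt=0.1, q=1e-3, r=0.05 ** 2):
    """One predict-and-update cycle of a 1D constant-velocity Kalman filter.

    x = (position, velocity), P = 2x2 covariance as nested tuples,
    z = measured position. q and r are assumed process and measurement
    noise variances.
    """
    p, v = x
    (p00, p01), (p10, p11) = P
    # Predict: state transition F = [[1, dt], [0, 1]], covariance F P F^T + Q.
    p_pred, v_pred = p + v * dt, v
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q
    # Update: only position is measured, so H = [1, 0].
    s = a00 + r
    k0, k1 = a00 / s, a10 / s
    resid = z - p_pred
    x_new = (p_pred + k0 * resid, v_pred + k1 * resid)
    P_new = (((1 - k0) * a00, (1 - k0) * a01),
             (a10 - k1 * a00, a11 - k1 * a01))
    return x_new, P_new


# Hypothetical cyclist moving at 2 m/s, observed every 0.1 s without noise.
state, cov = (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))
for k in range(1, 51):
    state, cov = kalman_step(state, cov, z=2.0 * 0.1 * k)
print(state)
```

Although the filter starts with zero velocity, the position measurements alone pull the velocity estimate towards the true value, which is what allows future motion to be predicted from the state.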
Pedestrian Dynamics: Modeling and Analyzing Cognitive Processes and Traffic Flows to Evaluate Facility Service Level
Walking is the oldest and foremost mode of transportation through history, and the prevalence of walking has increased. An effective pedestrian model is crucial to evaluate pedestrian facility service level and to enhance pedestrian safety, performance, and satisfaction. The objectives of this study were to: (1) validate the efficacy of utilizing a queueing network model, which predicts cognitive information processing time and task performance; (2) develop a generalized queueing network based cognitive information processing model that can be utilized and applied to construct pedestrian cognitive structure and estimate the reaction time with the first moment of the service time distribution; (3) investigate pedestrian behavior through naturalistic and experimental observations to analyze the effects of environment settings and psychological factors in pedestrians; and (4) develop pedestrian level of service (LOS) metrics that are quick and practical to identify improvement points in pedestrian facility design. Two empirical and two analytical studies were conducted to address the research objectives. The first study investigated the efficacy of utilizing a queueing network in modeling and predicting the cognitive information processing time. A motion capture system was utilized to collect detailed pedestrian movement. The predicted reaction time using the queueing network was compared with the results from the empirical study to validate the performance of the model. No significant difference between model and empirical results was found with respect to mean reaction time. The second study endeavored to develop a generalized queueing network system so the task can be modeled with the approximated queueing network and the first moment of any service time distribution. There was no significant difference between empirical study results and the proposed model with respect to mean reaction time.
The third study investigated methods to quantify pedestrian traffic behavior and analyze physical and cognitive behavior from real-world observation and a field experiment. Footage from indoor and outdoor corridors was used to quantify pedestrian behavior. Effects of environmental setting and/or psychological factors on travel performance were tested. Finally, ad hoc and tailor-made LOS metrics were presented for simple realistic service level assessments. The proposed methodologies were composed of space revision LOS, delay-based LOS, preferred walking speed-based LOS, and ‘blocking probability’.