11,861 research outputs found
A system for learning statistical motion patterns
Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast accurate fuzzy k-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction
A system for learning statistical motion patterns
Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast accurate fuzzy k-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction
Vision-based toddler tracking at home
This paper presents a vision-based toddler tracking system for detecting risk factors of a toddler's fall within the home environment. The risk factors have environmental and behavioral aspects and the research in this paper focuses on the behavioral aspects. Apart from common image processing tasks such as background subtraction, the vision-based toddler tracking involves human classification, acquisition of motion and position information, and handling of regional merges and splits. The human classification is based on dynamic motion vectors of the human body. The center of mass of each contour is detected and connected with the closest center of mass in the next frame to obtain position, speed, and directional information. This tracking system is further enhanced by dealing with regional merges and splits due to multiple object occlusions. In order to identify the merges and splits, two directional detections of closest region centers are conducted between every two successive frames. Merges and splits of a single object due to errors in the background subtraction are also handled. The tracking algorithms have been developed, implemented and tested
Understanding Traffic Density from Large-Scale Web Camera Data
Understanding traffic density from large-scale web camera (webcam) videos is
a challenging problem because such videos have low spatial and temporal
resolution, high occlusion and large perspective. To deeply understand traffic
density, we explore both deep learning based and optimization based methods. To
avoid individual vehicle detection and tracking, both methods map the image
into vehicle density map, one based on rank constrained regression and the
other one based on fully convolution networks (FCN). The regression based
method learns different weights for different blocks in the image to increase
freedom degrees of weights and embed perspective information. The FCN based
method jointly estimates vehicle density map and vehicle count with a residual
learning framework to perform end-to-end dense prediction, allowing arbitrary
image resolution, and adapting to different vehicle scales and perspectives. We
analyze and compare both methods, and get insights from optimization based
method to improve deep model. Since existing datasets do not cover all the
challenges in our work, we collected and labelled a large-scale traffic video
dataset, containing 60 million frames from 212 webcams. Both methods are
extensively evaluated and compared on different counting tasks and datasets.
FCN based method significantly reduces the mean absolute error from 10.99 to
5.31 on the public dataset TRANCOS compared with the state-of-the-art baseline.Comment: Accepted by CVPR 2017. Preprint version was uploaded on
http://welcome.isr.tecnico.ulisboa.pt/publications/understanding-traffic-density-from-large-scale-web-camera-data
Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
Unsupervised segmentation of action segments in egocentric videos is a
desirable feature in tasks such as activity recognition and content-based video
retrieval. Reducing the search space into a finite set of action segments
facilitates a faster and less noisy matching. However, there exist a
substantial gap in machine understanding of natural temporal cuts during a
continuous human activity. This work reports on a novel gaze-based approach for
segmenting action segments in videos captured using an egocentric camera. Gaze
is used to locate the region-of-interest inside a frame. By tracking two simple
motion-based parameters inside successive regions-of-interest, we discover a
finite set of temporal cuts. We present several results using combinations (of
the two parameters) on a dataset, i.e., BRISGAZE-ACTIONS. The dataset contains
egocentric videos depicting several daily-living activities. The quality of the
temporal cuts is further improved by implementing two entropy measures.Comment: To appear in 2017 IEEE International Conference On Signal and Image
Processing Application
- …