1,890 research outputs found

    Visual Human Tracking and Group Activity Analysis: A Video Mining System for Retail Marketing

    Get PDF
    Thesis (PhD) - Indiana University, Computer Sciences, 2007In this thesis we present a system for automatic human tracking and activity recognition from video sequences. The problem of automated analysis of visual information in order to derive descriptors of high level human activities has intrigued computer vision community for decades and is considered to be largely unsolved. A part of this interest is derived from the vast range of applications in which such a solution may be useful. We attempt to find efficient formulations of these tasks as applied to the extracting customer behavior information in a retail marketing context. Based on these formulations, we present a system that visually tracks customers in a retail store and performs a number of activity analysis tasks based on the output from the tracker. In tracking we introduce new techniques for pedestrian detection, initialization of the body model and a formulation of the temporal tracking as a global trans-dimensional optimization problem. Initial human detection is addressed by a novel method for head detection, which incorporates the knowledge of the camera projection model.The initialization of the human body model is addressed by newly developed shape and appearance descriptors. Temporal tracking of customer trajectories is performed by employing a human body tracking system designed as a Bayesian jump-diffusion filter. This approach demonstrates the ability to overcome model dimensionality ambiguities as people are leaving and entering the scene. Following the tracking, we developed a two-stage group activity formulation based upon the ideas from swarming research. For modeling purposes, all moving actors in the scene are viewed here as simplistic agents in the swarm. This allows to effectively define a set of inter-agent interactions, which combine to derive a distance metric used in further swarm clustering. This way, in the first stage the shoppers that belong to the same group are identified by deterministically clustering bodies to detect short term events and in the second stage events are post-processed to form clusters of group activities with fuzzy memberships. Quantitative analysis of the tracking subsystem shows an improvement over the state of the art methods, if used under similar conditions. Finally, based on the output from the tracker, the activity recognition procedure achieves over 80% correct shopper group detection, as validated by the human generated ground truth results

    Human Activity Recognition System Based-on Sequential Logic Circuits and Statistical Models

    Get PDF
    this research proposed the human activityrecognition system that described complete flow of processes fromlowest process (dealing with images) to highest process (recognizehuman activity). We proposed human action recognition thatmanage image sequence then recognize human action with simplehuman model by model-based recognition technique. Theexperimental result shows good accuracy which up to 93%correctly recognized. We proposed the human activity processwith 3 methods that consecutive improved. All of those methodscan use the result of action recognition as inputs. First method isFSM recognizer. The human model in Finite State Machine (FSM)recognizer can be modeled by rational condition that make it easyto understand and consume low computation cost but it hard todefine complex activity condition so it is unsuitable method forcomplex activity. The second recognizer applied Hidden MarkovModel (HMM) for activity modeling. The HMM recognizer candealing with much more complex activity and give fair recognitionrate. However, HMM recognizer is not involve feature prioritythat should has effect to accuracy so we proposed the thirdrecognizer that used graph similarity measurement for activitymodeling and activity classification. The third one, GraphSimilarity Measurement (GSM) recognizer involved featurepriority for recognition method then show better result thanHMM in most measurement. GSM recognizer has ~84% accuracyin average. FSM recognizer is suitable for simple activity with lowcomputation cost while HMM is suitable for much more complexactivity and use single feature for recognition process. However,HMM method may not give best result for the activity that usemultiple features. GSM is also suitable for complex activity and,furthermore, give better result than HMM for the activity thattrained from multiple features

    A review on intelligent monitoring and activity interpretation

    Get PDF
    This survey paper provides a tour of the various monitoring and activity interpretation frameworks found in the literature. The needs of monitoring and interpretation systems are presented in relation to the area where they have been developed or applied. Their evolution is studied to better understand the characteristics of current systems. After this, the main features of monitoring and activity interpretation systems are defined.Este trabajo presenta una revisión de los marcos de trabajo para monitorización e interpretación de actividades presentes en la literatura. Dependiendo del área donde dichos marcos se han desarrollado o aplicado, se han identificado diferentes necesidades. Además, para comprender mejor las particularidades de los marcos de trabajo, esta revisión realiza un recorrido por su evolución histórica. Posteriormente, se definirían las principales características de los sistemas de monitorización e interpretación de actividades.This work was partially supported by Spanish Ministerio de Economía y Competitividad / FEDER under DPI2016-80894-R grant

    Deep Learning for Semantic Video Understanding

    Get PDF
    The field of computer vision has long strived to extract understanding from images and videos sequences. The recent flood of video data along with massive increments in computing power have provided the perfect environment to generate advanced research to extract intelligence from video data. Video data is ubiquitous, occurring in numerous everyday activities such as surveillance, traffic, movies, sports, etc. This massive amount of video needs to be analyzed and processed efficiently to extract semantic features towards video understanding. Such capabilities could benefit surveillance, video analytics and visually challenged people. While watching a long video, humans have the uncanny ability to bypass unnecessary information and concentrate on the important events. These key events can be used as a higher-level description or summary of a long video. Inspired by the human visual cortex, this research affords such abilities in computers using neural networks. Useful or interesting events are first extracted from a video and then deep learning methodologies are used to extract natural language summaries for each video sequence. Previous approaches of video description either have been domain specific or use a template based approach to fill detected objects such as verbs or actions to constitute a grammatically correct sentence. This work involves exploiting temporal contextual information for sentence generation while working on wide domain datasets. Current state-of- the-art video description methodologies are well suited for small video clips whereas this research can also be applied to long sequences of video. This work proposes methods to generate visual summaries of long videos, and in addition proposes techniques to annotate and generate textual summaries of the videos using recurrent networks. End to end video summarization immensely depends on abstractive summarization of video descriptions. State-of- the-art neural language & attention joint models have been used to generate textual summaries. Interesting segments of long video are extracted based on image quality as well as cinematographic and consumer preference. This novel approach will be a stepping stone for a variety of innovative applications such as video retrieval, automatic summarization for visually impaired persons, automatic movie review generation, video question and answering systems

    A Review on Intelligent Monitoring and Activity Interpretation

    Get PDF

    Challenges of Big Data Analysis

    Full text link
    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require new computational and statistical paradigm. This article give overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasis on the viability of the sparsest solution in high-confidence set and point out that exogeneous assumptions in most statistical methods for Big Data can not be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions

    Articulated human tracking and behavioural analysis in video sequences

    Get PDF
    Recently, there has been a dramatic growth of interest in the observation and tracking of human subjects through video sequences. Arguably, the principal impetus has come from the perceived demand for technological surveillance, however applications in entertainment, intelligent domiciles and medicine are also increasing. This thesis examines human articulated tracking and the classi cation of human movement, rst separately and then as a sequential process. First, this thesis considers the development and training of a 3D model of human body structure and dynamics. To process video sequences, an observation model is also designed with a multi-component likelihood based on edge, silhouette and colour. This is de ned on the articulated limbs, and visible from a single or multiple cameras, each of which may be calibrated from that sequence. Second, for behavioural analysis, we develop a methodology in which actions and activities are described by semantic labels generated from a Movement Cluster Model (MCM). Third, a Hierarchical Partitioned Particle Filter (HPPF) was developed for human tracking that allows multi-level parameter search consistent with the body structure. This tracker relies on the articulated motion prediction provided by the MCM at pose or limb level. Fourth, tracking and movement analysis are integrated to generate a probabilistic activity description with action labels. The implemented algorithms for tracking and behavioural analysis are tested extensively and independently against ground truth on human tracking and surveillance datasets. Dynamic models are shown to predict and generate synthetic motion, while MCM recovers both periodic and non-periodic activities, de ned either on the whole body or at the limb level. Tracking results are comparable with the state of the art, however the integrated behaviour analysis adds to the value of the approach.Overseas Research Students Awards Scheme (ORSAS
    corecore