1,288 research outputs found

    Activity recognition and abnormality detection with the switching hidden semi-Markov model

    This paper addresses the problem of learning and recognizing human activities of daily living (ADL), which is an important research issue in building a pervasive and smart environment. In dealing with ADL, we argue that it is beneficial to exploit both the inherent hierarchical organization of the activities and their typical duration. To this end, we introduce the Switching Hidden Semi-Markov Model (S-HSMM), a two-layered extension of the hidden semi-Markov model (HSMM) for the modeling task. Activities are modeled in the S-HSMM in two ways: the bottom layer represents atomic activities and their duration using HSMMs; the top layer represents a sequence of high-level activities where each high-level activity is made of a sequence of atomic activities. We consider two methods for modeling duration: the classic explicit duration model using multinomial distribution, and the novel use of the discrete Coxian distribution. In addition, we propose an effective scheme to detect abnormality without the need for training on abnormal data. Experimental results show that the S-HSMM performs better than existing models including the flat HSMM and the hierarchical hidden Markov model in both classification and abnormality detection tasks, alleviating the need for presegmented training data. Furthermore, our discrete Coxian duration model yields better computation time and generalization error than the classic explicit duration model

    Efficient duration and hierarchical modeling for human activity recognition

    A challenge in building pervasive and smart spaces is to learn and recognize human activities of daily living (ADLs). In this paper, we address this problem and argue that in dealing with ADLs, it is beneficial to exploit both their typical duration patterns and inherent hierarchical structures. We exploit efficient duration modeling using the novel Coxian distribution to form the Coxian hidden semi-Markov model (CxHSMM) and apply it to the problem of learning and recognizing ADLs with complex temporal dependencies. The Coxian duration model has several advantages over existing duration parameterization using multinomial or exponential family distributions, including its denseness in the space of non-negative distributions, low number of parameters, computational efficiency and the existence of closed-form estimation solutions. Further, we combine both hierarchical and duration extensions of the hidden Markov model (HMM) to form the novel switching hidden semi-Markov model (SHSMM), and empirically compare its performance with existing models. The model can learn what an occupant normally does during the day from unsegmented training data and then perform online activity classification, segmentation and abnormality detection. Experimental results show that Coxian modeling outperforms a range of baseline models for the task of activity segmentation. We also achieve a recognition accuracy competitive with the current state-of-the-art multinomial duration model, while gaining a significant reduction in computation. Furthermore, cross-validation model selection on the number of phases K in the Coxian indicates that only a small K is required to achieve the optimal performance. Finally, our models are further tested in a more challenging setting in which the tracking is often lost and the activities considerably overlap. With a small number of labels supplied during training in a partially supervised learning mode, our models again deliver reliable performance with a small number of phases, making our proposed framework an attractive choice for activity modeling
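
To make the "low number of parameters" point concrete, here is a minimal sketch of a discrete Coxian duration distribution. It assumes one common formulation (a start-phase distribution `mu` over K phases, each phase holding for a geometric number of steps with parameter `lam[i]`, then moving to the next phase until the last one exits); the function name `coxian_pmf` and this parameterization are illustrative assumptions, not the authors' code.

```python
def coxian_pmf(mu, lam, max_d):
    """Return [P(D=1), ..., P(D=max_d)] for a K-phase discrete Coxian.

    Assumed formulation: start in phase i with probability mu[i]; at each
    time step, phase i ends with probability lam[i] and hands over to phase
    i+1 (the last phase ends the whole duration), otherwise it persists.
    """
    K = len(mu)
    alive = list(mu)  # alive[i]: probability of being in phase i at this step
    pmf = []
    for _ in range(max_d):
        # The duration ends at this step only by leaving the last phase.
        pmf.append(alive[K - 1] * lam[K - 1])
        nxt = [0.0] * K
        for i in range(K):
            nxt[i] += alive[i] * (1.0 - lam[i])   # stay in phase i
            if i + 1 < K:
                nxt[i + 1] += alive[i] * lam[i]   # advance to phase i+1
        alive = nxt
    return pmf


# Two phases are described by 2 start probabilities and 2 exit probabilities,
# regardless of how long the durations can be.
two_phase = coxian_pmf(mu=[1.0, 0.0], lam=[0.5, 0.5], max_d=200)
```

An explicit multinomial duration model needs one parameter per possible duration value, whereas this sketch uses about 2K numbers for K phases, which is the trade-off the abstract refers to.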

    Efficient duration modelling in the hierarchical hidden semi-Markov models and their applications

    Modeling patterns in temporal data has arisen as an important problem in engineering and science. This has led to the popularity of several dynamic models, in particular the renowned hidden Markov model (HMM) [Rabiner, 1989]. Despite its widespread success in many cases, the standard HMM often fails to model more complex data whose elements are correlated hierarchically or over a long period. Such problems are, however, frequently encountered in practice. Existing efforts to overcome this weakness often address only one of these two aspects, mainly due to computational intractability. Motivated by this modeling challenge in many real-world problems, in particular video surveillance and segmentation, this thesis aims to develop tractable probabilistic models that can jointly model duration and hierarchical information in a unified framework. We believe that jointly exploiting statistical strength from both properties will lead to more accurate and robust models for the needed task. To tackle the modeling aspect, we base our work on an intersection between dynamic graphical models and statistics of lifetime modeling. Realizing that the key bottleneck found in existing work lies in the choice of the duration distribution for a state, we have successfully integrated the discrete Coxian distribution [Cox, 1955], a special class of phase-type distributions, into the HMM to form a novel and powerful stochastic model termed the Coxian Hidden Semi-Markov Model (CxHSMM). We show that this model can still be expressed as a dynamic Bayesian network, and inference and learning can be derived analytically. Most importantly, it has four superior features over existing semi-Markov modelling: the parameter space is compact, computation is fast (almost the same as the HMM), closed-form estimation can be derived, and the Coxian is flexible enough to approximate a large class of distributions. Next, we exploit hierarchical decomposition in the data by borrowing an analogy from the hierarchical hidden Markov model in [Fine et al., 1998, Bui et al., 2004] and introduce a new type of shallow structured graphical model that combines both duration and hierarchical modelling into a unified framework, termed the Coxian Switching Hidden Semi-Markov Model (CxSHSMM). The top layer is a Markov sequence of switching variables, while the bottom layer is a sequence of concatenated CxHSMMs whose parameters are determined by the switching variable at the top. Again, we provide a thorough analysis along with inference and learning machinery. We also show that semi-Markov models with arbitrary depth structure can easily be developed. In all cases we further address two practical issues: missing observations due to unstable tracking and the use of partially labelled data to improve training accuracy. Motivated by real-world problems, our application contribution is a framework to recognize complex activities of daily living (ADLs) and detect anomalies to provide better intelligent caring services for the elderly. Coarser activities with self duration distributions are represented using the CxHSMM. Complex activities are made of a sequence of coarser activities and represented at the top level in the CxSHSMM. Intensive experiments are conducted to evaluate our solutions against existing methods. In many cases, the superiority of the joint modeling and the Coxian parameterization over traditional methods is confirmed. The robustness of our proposed models is further demonstrated in a series of more challenging experiments, in which the tracking is often lost and activities considerably overlap. Our final contribution is an application of the switching Coxian model to segment education-oriented videos into coherent topical units. Our results again demonstrate that such segmentation processes can benefit greatly from the joint modeling of duration and hierarchy

    A system for learning statistical motion patterns

    Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast accurate fuzzy k-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction
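
The abstract does not say which statistical test is used on the learned patterns, so the following is only a rough sketch of how a "chain of Gaussian distributions" could be scored. It assumes isotropic 2D Gaussians, trajectories already resampled to the chain length, and a hand-picked threshold; the names `pattern_score` and `is_anomalous` are hypothetical.

```python
import math


def log_gauss_2d(point, mean, sigma):
    """Log-density of an isotropic 2D Gaussian (an assumed simplification)."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return -(dx * dx + dy * dy) / (2 * sigma * sigma) - math.log(2 * math.pi * sigma * sigma)


def pattern_score(trajectory, chain):
    """Average log-likelihood of a trajectory under one motion pattern.

    chain is a list of (mean, sigma) pairs; the trajectory is assumed to be
    resampled so that point k is compared with Gaussian k.
    """
    if len(trajectory) != len(chain):
        raise ValueError("trajectory must be resampled to the chain length")
    total = sum(log_gauss_2d(p, m, s) for p, (m, s) in zip(trajectory, chain))
    return total / len(chain)


def is_anomalous(trajectory, patterns, threshold):
    """Flag a trajectory whose best-matching pattern still scores below threshold."""
    best = max(pattern_score(trajectory, chain) for chain in patterns)
    return best < threshold


# One learned pattern: movement along the x-axis.
patterns = [[((0.0, 0.0), 1.0), ((1.0, 0.0), 1.0), ((2.0, 0.0), 1.0)]]
on_route = [(0.0, 0.1), (1.0, -0.1), (2.0, 0.0)]
off_route = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]
```

With a threshold of -5.0, `on_route` is accepted and `off_route` is flagged; in practice the threshold would have to be calibrated on held-out normal trajectories.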

    Human behavior recognition with generic exponential family duration modeling in the hidden semi-Markov model

    The ability to learn and recognize human activities of daily living (ADLs) is important in building pervasive and smart environments. In this paper, we tackle this problem using the hidden semi-Markov model. We discuss the state-of-the-art duration modeling choices and then address a large class of exponential family distributions to model state durations. Inference and learning are efficiently addressed by providing a graphical representation for the model in terms of a dynamic Bayesian network (DBN). We investigate both discrete and continuous distributions from the exponential family (Poisson and Inverse Gaussian respectively) for the problem of learning and recognizing ADLs. A full comparison between the exponential family duration models and other existing models, including the traditional multinomial and the new Coxian, is also presented. Our work thus completes a thorough investigation into duration modeling and its application to human activity recognition in a real-world smart home surveillance scenario.

    Automated camera ranking and selection using video content and scene context

    When observing a scene with multiple cameras, an important problem to solve is to automatically identify “what camera feed should be shown and when?” The answer to this question is of interest for a number of applications and scenarios ranging from sports to surveillance. In this thesis we present a framework for the ranking of each video frame and camera across time and the camera network, respectively. This ranking is then used for automated video production. In the first stage, information from each camera view and from the objects in it is extracted and represented in a way that allows for object- and frame-ranking. First, objects are detected and ranked within and across camera views. This ranking takes into account both visible and contextual information related to the object. Then content ranking is performed based on the objects in the view and camera-network level information. We propose two novel techniques for content ranking, namely Routing Based Ranking (RBR) and Multivariate Gaussian based Ranking (MVG). In RBR we use a rule-based framework where weighted fusion of object- and frame-level information takes place, while in MVG the rank is estimated as a multivariate Gaussian distribution. Through experimental and subjective validation we demonstrate that the proposed content ranking strategies allow the identification of the best camera at each time. The second part of the thesis focuses on the automatic generation of N-to-1 videos based on the ranked content. We demonstrate that in such production settings it is undesirable to have frequent inter-camera switching. This motivates the need for a compromise between selecting the best camera most of the time and minimising inter-camera switching. We demonstrate that state-of-the-art techniques for this task are inadequate and fail in dynamic scenes. We propose three novel methods for automated camera selection. The first method (¡go f) performs a joint optimization of a cost function that depends on both the view quality and inter-camera switching so that a pleasing best-view video sequence can be composed. The other two methods (¡dbn and ¡util) include the selection decision in the ranking strategy. In ¡dbn we model the best-camera selection as a state sequence via Directed Acyclic Graphs (DAG) designed as a Dynamic Bayesian Network (DBN), which encodes the contextual knowledge about the camera network and employs the past information to minimize the inter-camera switches. In comparison, ¡util utilizes the past as well as the future information in a Partially Observable Markov Decision Process (POMDP) where the camera selection at a certain time is influenced by the past information and its repercussions in the future. The performance of the proposed approach is demonstrated on multiple real and synthetic multi-camera setups. We compare the proposed architectures with various baseline methods with encouraging results. The performance of the proposed approaches is also validated through extensive subjective testing
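
The trade-off between view quality and inter-camera switching can be illustrated with a small dynamic-programming sketch. This is not the thesis's method: the additive objective, the fixed `switch_cost` penalty and the function name `select_cameras` are assumptions chosen only to show how a switching penalty suppresses rapid cuts.

```python
def select_cameras(quality, switch_cost):
    """Pick one camera per frame, maximizing total quality minus switch penalties.

    quality[t][c] is an assumed per-frame score for camera c at frame t;
    switch_cost (>= 0) is subtracted every time the selected camera changes.
    """
    num_cams = len(quality[0])
    best = list(quality[0])  # best[c]: best total score of a path ending on camera c
    back = []                # back[t-1][c]: camera chosen at frame t-1 on that path
    for t in range(1, len(quality)):
        top = max(range(num_cams), key=lambda c: best[c])
        new_best, pointers = [], []
        for c in range(num_cams):
            stay = best[c]
            switch = best[top] - switch_cost  # switching from the best predecessor
            if stay >= switch:
                new_best.append(stay + quality[t][c])
                pointers.append(c)
            else:
                new_best.append(switch + quality[t][c])
                pointers.append(top)
        best = new_best
        back.append(pointers)
    # Trace the optimal path backwards from the best final camera.
    cam = max(range(num_cams), key=lambda c: best[c])
    path = [cam]
    for pointers in reversed(back):
        cam = pointers[cam]
        path.append(cam)
    path.reverse()
    return path


scores = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
greedy = select_cameras(scores, switch_cost=0.0)   # follows the best view every frame
smooth = select_cameras(scores, switch_cost=5.0)   # stays on one camera
```

With no penalty the selection alternates with the per-frame best view; with a large penalty it holds a single camera, which is the compromise the abstract describes.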