1,996 research outputs found

    A Deep-structured Conditional Random Field Model for Object Silhouette Tracking

    In this work, we introduce a deep-structured conditional random field (DS-CRF) model for the purpose of state-based object silhouette tracking. The proposed DS-CRF model consists of a series of state layers, where each state layer spatially characterizes the object silhouette at a particular point in time. The interactions between adjacent state layers are established by inter-layer connectivity dynamically determined based on inter-frame optical flow. By incorporating both spatial and temporal context in a dynamic fashion within such a deep-structured probabilistic graphical model, the proposed DS-CRF model allows us to develop a framework that can accurately and efficiently track object silhouettes that change greatly over time, including under conditions such as occlusion and multiple targets within the scene. Experimental results using video surveillance datasets containing different scenarios such as occlusion and multiple targets showed that the proposed DS-CRF approach provides strong object silhouette tracking performance when compared to baseline methods such as mean-shift tracking, as well as state-of-the-art methods such as context tracking and boosted particle filtering. Comment: 17 pages
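The abstract states that links between adjacent state layers are determined from inter-frame optical flow. A minimal sketch of that idea follows. It is not the authors' implementation: the function name `flow_to_links`, the backward-flow convention (each vector points from a pixel in frame t to its origin in frame t-1), and the nearest-pixel rounding are all assumptions made for illustration.

```python
def flow_to_links(flow):
    """Link each pixel of frame t to one pixel of frame t-1.

    flow: nested list of shape H x W holding (dx, dy) tuples. This is a
    hypothetical backward flow, an assumption not taken from the paper.
    Returns an H x W nested list of (row, col) positions in frame t-1.
    """
    height = len(flow)
    width = len(flow[0])
    links = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = flow[y][x]
            # Follow the flow vector, round to the nearest pixel,
            # and clamp so the link stays inside the image.
            src_x = min(max(int(round(x + dx)), 0), width - 1)
            src_y = min(max(int(round(y + dy)), 0), height - 1)
            row.append((src_y, src_x))
        links.append(row)
    return links


# Toy example: the scene moved one pixel to the right between frames,
# so every pixel links to its left neighbour in the previous frame.
flow = [[(-1.0, 0.0) for _ in range(4)] for _ in range(3)]
links = flow_to_links(flow)
```

In this sketch each node receives exactly one temporal link, and pixels on the left border link to themselves because of the clamping. A full model would likely connect each node to a neighbourhood around the flow target rather than to a single pixel.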

    Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition

    Many of the state-of-the-art algorithms for gesture recognition are based on Conditional Random Fields (CRFs). Successful approaches, such as the Latent-Dynamic CRFs, extend the CRF by incorporating latent variables, whose values are mapped to the values of the labels. In this paper we propose a novel methodology to set the latent values according to the gesture complexity. We use a heuristic that iterates through the samples associated with each label value, estimating their complexity. We then use this estimate to assign the latent values to the label values. We evaluate our method on the task of recognizing human gestures from video streams. The experiments were performed on binary datasets, generated by grouping different labels. Our results demonstrate that our approach outperforms the arbitrary assignment in many cases, increasing the accuracy by up to 10%. Comment: Conference paper published at 2016 29th SIBGRAPI, Conference on Graphics, Patterns and Images (SIBGRAPI). 8 pages, 7 figures
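The abstract does not say how the complexity estimate is turned into an assignment of latent values. One plausible reading is sketched below: a fixed budget of latent states is split across labels in proportion to a per-label complexity score. The function `allocate_latent_states`, the scores, and the proportional rule are all hypothetical and are not taken from the paper.

```python
import math


def allocate_latent_states(complexity, total_states):
    """Split a budget of latent states across labels.

    complexity: dict mapping label -> non-negative complexity score
                (how the score is computed is left open; it is an assumption).
    total_states: total number of latent states to distribute.
    Every label gets at least one state; the rest are shared in
    proportion to complexity using largest-remainder rounding.
    """
    labels = list(complexity)
    if total_states < len(labels):
        raise ValueError("need at least one latent state per label")
    counts = {label: 1 for label in labels}
    spare = total_states - len(labels)
    total = sum(complexity.values())
    if spare == 0 or total <= 0:
        return counts
    shares = {label: spare * complexity[label] / total for label in labels}
    for label in labels:
        counts[label] += math.floor(shares[label])
    leftover = total_states - sum(counts.values())
    # Hand out the remaining states to the largest fractional parts.
    by_fraction = sorted(
        labels,
        key=lambda label: shares[label] - math.floor(shares[label]),
        reverse=True,
    )
    for label in by_fraction[:leftover]:
        counts[label] += 1
    return counts


# Binary example with made-up scores: "wave" is three times as complex as "rest".
even_split = allocate_latent_states({"wave": 3.0, "rest": 1.0}, 6)
odd_split = allocate_latent_states({"wave": 3.0, "rest": 1.0}, 5)
```

The arbitrary baseline mentioned in the abstract would correspond to ignoring the scores, for example by giving every label the same number of latent states.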

    A probabilistic model for assistive robotics devices to support activities of daily living

    University of Technology, Sydney. Faculty of Engineering and Information Technology. This thesis explores probabilistic techniques to model interactions between humans and robotic devices. The work is motivated by the rapid increase in the ageing population and the role that assistive robotic devices can play in maintaining independence and quality of life as assistants and/or companions for these communities. While there are substantial social and ethical implications in this pursuit, it is advocated that robotic systems are bound to acquire more sophisticated assistive capabilities if they are to operate in unstructured, dynamic, human-centred environments, responsive to the needs of their human operators. Such cognitive assistive systems postulate advances along the complete processing pipeline, from sensing, to anticipating user actions and environmental changes, and to delivering natural supportive actuation. Within the boundaries of the human-robot interaction context, it can be expected that acute awareness of human intentions plays a key role in delivering practical assistive actions. This work is thereby focused on the human behaviours likely to result from merging sensed human-robot interactions and the learning gained from past experiences, proposing a framework that facilitates the path towards integrating tightly knit human-robot interaction models. Human behaviour is complex in nature and interactions with the environment and other objects occur in different and unpredictable ways. Moreover, observed sensory data is often incomplete and noisy. Inferring human intention is thus a challenging problem. This work defends the thesis that in many real-world scenarios these complex behaviours can be naturally simplified by decomposing them into smaller activities, so that their temporal dependencies can be learned more efficiently with the aid of probabilistic hierarchical models.
To that end, a strategy is devised in the first part of the thesis to efficiently represent human Activities of Daily Living, or ADLs, by decomposing them into a flexible semantic structure of “Action Primitives” (APs), atomic actions that are shown to encapsulate complex activities when combined within a temporal probabilistic framework at multiple levels of abstraction. A Hierarchical Hidden Markov Model (HHMM) is proposed as a powerful tool capable of modelling and learning these complex and uncertain human behaviours using knowledge gained from past interactions. The ADLs performed by humans consist of a variety of complex locomotion-related tasks, as well as activities that involve grasping and manipulation of objects used in everyday life. Two widely used devices that provide assistance to users with mobility impairments while carrying out their ADLs, a power walker and a robotic wheelchair, are instrumented and used to model patterns of navigational activities (i.e. visiting locations of interest), as well as some additional platform-specific support activities (e.g. standing up using the support of the assistive walker). Human indications while performing these activities are captured using low-level sensing fitted on the mobility devices (e.g. strain gauges, laser range finders). Grasping- and manipulation-related ADLs are modelled using data captured from a stream of video images, where the data comprise hand-object interactions and their motion in 3D space. The inference accuracy of the proposed framework in predicting APs and recognising long-term user intentions is compared with traditional discriminative models (sequential Support Vector Machines (SVM)), other generative models (layered Dynamic Bayesian Networks (DBN)), and combinations thereof, to provide a complete picture that highlights the benefits of the proposed approach.
Results from real data collected from a set of trials conducted by actor users demonstrate that all techniques are able to predict APs with good accuracy, yet successful inference of long-term tasks is substantially reduced in the case of the layered DBN and SVM models. These findings validate the thesis’ proposal that the combination of decomposing tasks at multiple levels and exploiting their inherent temporal nature plays a critical role in predicting complex interactive tasks.