
    A semantic feature for human motion retrieval

    With the explosive growth of motion capture data, an efficient search engine for retrieving motions from a large motion repository has become essential in animation production. However, because of the high dimensionality of the data space and the complexity of matching methods, most existing approaches cannot return results in real time. This paper proposes a high-level semantic feature in a low-dimensional space that represents the essential characteristics of different motion classes. Based on statistical training of a Gaussian Mixture Model, this feature can effectively achieve motion matching at both the global clip level and the local frame level. Experimental results show that our approach can retrieve similar motions, with rankings, from a large motion database in real time, and can also annotate motions automatically on the fly. Copyright © 2013 John Wiley & Sons, Ltd.
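
    As a rough illustration of the idea (not the paper's implementation), the sketch below derives a low-dimensional semantic pose feature as the vector of Gaussian Mixture Model component posteriors; the pose representation, component count, and function names are assumptions.

```python
# A minimal sketch: map raw pose vectors to a K-dimensional semantic feature
# given by GMM component responsibilities (posteriors). Hypothetical setup.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_pose_gmm(training_poses: np.ndarray, n_components: int = 16) -> GaussianMixture:
    """Fit a GMM over raw pose vectors (shape: [n_poses, pose_dim])."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag", random_state=0)
    gmm.fit(training_poses)
    return gmm

def semantic_feature(gmm: GaussianMixture, pose: np.ndarray) -> np.ndarray:
    """Map one pose to the K-dim vector of component responsibilities."""
    return gmm.predict_proba(pose.reshape(1, -1)).ravel()
```

    Frame-level matching then reduces to comparing K-dimensional vectors (e.g. by Euclidean distance), which is far cheaper than matching in the raw pose space.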

    Human motion retrieval based on freehand sketch

    In this paper, we present an integrated framework for human motion retrieval based on freehand sketches. Following a few simple rules, the user can retrieve a desired motion by sketching several key postures. To make sketch-based retrieval efficient and accurate, the 3D postures are projected onto several 2D planes, and a limb direction feature is proposed to represent both the input sketch and the projected postures. Furthermore, a novel index structure based on the k-d tree is constructed to index the motions in the database, which speeds up the retrieval process. With our posture-by-posture retrieval algorithm, a continuous motion can be obtained directly or generated using a pre-computed graph structure. Moreover, our system provides an intuitive user interface. The experimental results demonstrate the effectiveness of our method. © 2014 John Wiley & Sons, Ltd.
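
    The following sketch illustrates, under assumed joint names and limb pairs, the two ingredients the abstract names: a limb direction feature computed from a 2D projected posture, and a k-d tree index (here SciPy's cKDTree rather than the paper's custom structure) for fast posture lookup.

```python
# A minimal sketch with hypothetical joint names and limb pairs.
import numpy as np
from scipy.spatial import cKDTree

LIMBS = [("shoulder_l", "elbow_l"), ("elbow_l", "wrist_l"),
         ("hip_r", "knee_r"), ("knee_r", "ankle_r")]  # assumed subset of limbs

def limb_direction_feature(joints2d: dict) -> np.ndarray:
    """Concatenate unit direction vectors of each limb in a 2D projection."""
    dirs = []
    for parent, child in LIMBS:
        v = joints2d[child] - joints2d[parent]
        dirs.append(v / (np.linalg.norm(v) + 1e-8))
    return np.concatenate(dirs)

# Index all projected database postures once, then query each sketched key posture.
def build_index(features: np.ndarray) -> cKDTree:
    return cKDTree(features)

def nearest_postures(index: cKDTree, query: np.ndarray, k: int = 10):
    dists, ids = index.query(query, k=k)
    return ids, dists
```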

    A framework for improving the performance of verification algorithms with a low false positive rate requirement and limited training data

    In this paper we address the problem of matching patterns in the so-called verification setting, in which a novel query pattern is verified against a single training pattern: the decision sought is whether the two match (i.e. belong to the same class) or not. Unlike previous work, which has universally focused on developing more discriminative distance functions between patterns, here we consider the equally important and pervasive task of selecting a distance threshold that fits a particular operational requirement, specifically the target false positive rate (FPR). First, we argue on theoretical grounds that a data-driven approach is inherently ill-conditioned when the desired FPR is low, because by the very nature of the challenge only a small portion of the training data affects or is affected by the desired threshold. This leads us to propose a general, statistical model-based method instead. Our approach is based on the interpretation of an inter-pattern distance as implicitly defining a pattern embedding which approximately distributes patterns according to an isotropic multivariate normal distribution in some space. This interpretation is then used to show that the distribution of training inter-pattern distances is the non-central chi-squared distribution, parameterized differently for each class. Thus, to make the class-specific threshold choice, we propose a novel analysis-by-synthesis iterative algorithm which estimates the three free parameters of the model (for each class) using task-specific constraints. The validity of the premises of our work and the effectiveness of the proposed method are demonstrated by applying the method to the task of set-based face verification on a large database of pseudo-random head motion videos.
    Comment: IEEE/IAPR International Joint Conference on Biometrics, 201
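
    A minimal sketch of the model-based thresholding idea: if impostor (non-matching) inter-pattern distances follow a non-central chi-squared law, the class-specific threshold for a target FPR is a quantile of the fitted model. Plain maximum-likelihood fitting via SciPy stands in here for the paper's analysis-by-synthesis estimation.

```python
# Hypothetical sketch: fit a non-central chi-squared model to impostor
# distances and read the threshold off its quantile function.
import numpy as np
from scipy import stats

def fit_ncx2(impostor_distances: np.ndarray):
    """Fit (df, nc, loc, scale) of a non-central chi-squared distribution."""
    return stats.ncx2.fit(impostor_distances, floc=0.0)

def threshold_for_fpr(params, target_fpr: float) -> float:
    """A false positive occurs when an impostor distance falls below the
    threshold, so take the target-FPR quantile of the fitted model."""
    df, nc, loc, scale = params
    return stats.ncx2.ppf(target_fpr, df, nc, loc=loc, scale=scale)
```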

    Real-time motion data annotation via action string

    Despite the explosive growth of motion capture data, there is still a lack of efficient and reliable methods to automatically annotate all the motions in a database. Moreover, with the popularity of mocap devices in home entertainment systems, real-time human motion annotation or recognition is becoming ever more important. This paper presents a new motion annotation method that achieves both of these goals at the same time. It uses a probabilistic pose feature based on the Gaussian Mixture Model to represent each pose. After training a clustered pose feature model, a motion clip can be represented as an action string. A dynamic programming-based string matching method is then introduced to compare the differences between action strings. Finally, to meet the real-time requirement, we construct a hierarchical action string structure to quickly label each given action string. The experimental results demonstrate the efficacy and efficiency of our method.
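
    As an illustration of dynamic-programming matching between action strings (sequences of pose-cluster labels), the sketch below uses the classic Levenshtein edit distance; the paper's exact cost scheme may differ.

```python
# A minimal sketch: edit distance between two action strings.
def edit_distance(a: list, b: list) -> int:
    """Levenshtein distance between two label sequences."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j          # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution or match
    return dp[m][n]

# e.g. edit_distance([3, 3, 7, 1], [3, 7, 7, 1]) == 1 (one substitution)
```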

    Review of Person Re-identification Techniques

    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all existing re-identification approaches, feature vectors are extracted from segmented still images or video frames, and different similarity or dissimilarity measures are applied to these vectors. Some methods use simple constant metrics, whereas others utilise models to obtain optimised metrics. Some create models based on local colour or texture information, and others build models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher accuracy rate and lower computational costs. This study summarises several developments in the recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.
    Comment: Published 201
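
    As a generic illustration of the matching step shared by the surveyed approaches, the sketch below ranks gallery feature vectors against a probe under a Mahalanobis-style metric; the random descriptors and the metric matrix M are placeholders that a real feature extractor and a metric-learning method would supply.

```python
# Hypothetical sketch: rank gallery identities by metric distance to a probe.
import numpy as np

def mahalanobis_distance(x: np.ndarray, y: np.ndarray, M: np.ndarray) -> float:
    d = x - y
    return float(d @ M @ d)

gallery = np.random.rand(100, 64)  # stand-in appearance descriptors
probe = np.random.rand(64)
M = np.eye(64)                     # constant metric; a learned M replaces this
ranking = np.argsort([mahalanobis_distance(probe, g, M) for g in gallery])
```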

    Unsupervised Video Understanding by Reconciliation of Posture Similarities

    Understanding human activity and being able to explain it in detail far surpasses mere action classification in both complexity and value. The challenge is thus to describe an activity on the basis of its most fundamental constituents: the individual postures and their distinctive transitions. Supervised learning of such a fine-grained representation based on elementary poses is very tedious and does not scale. Therefore, we propose a completely unsupervised deep learning procedure based solely on video sequences, which starts from scratch without requiring pre-trained networks, predefined body models, or keypoints. A combinatorial sequence matching algorithm proposes relations between frames from subsets of the training data, while a CNN reconciles the transitivity conflicts of the different subsets to learn a single concerted pose embedding despite changes in appearance across sequences. Without any manual annotation, the model learns a structured representation of postures and their temporal development. The model not only enables retrieval of similar postures but also temporal super-resolution. Additionally, based on a recurrent formulation, next frames can be synthesized.
    Comment: Accepted by ICCV 201
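
    A minimal sketch of learning a pose embedding from proposed frame relations, assuming triplets (anchor, positive, negative) have already been produced by some sequence-matching step; a small CNN with a triplet loss stands in for the paper's architecture and its transitivity reconciliation.

```python
# Hypothetical sketch: train a small CNN pose embedding with a triplet loss.
import torch
import torch.nn as nn

embed = nn.Sequential(
    nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
    nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 128),
)
loss_fn = nn.TripletMarginLoss(margin=0.2)
opt = torch.optim.Adam(embed.parameters(), lr=1e-4)

def train_step(anchor, positive, negative):  # each: [B, 3, H, W] frame crops
    opt.zero_grad()
    loss = loss_fn(embed(anchor), embed(positive), embed(negative))
    loss.backward()
    opt.step()
    return loss.item()
```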

    Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data

    Localization is a key requirement for mobile robot autonomy and human-robot interaction. Vision-based localization is accurate and flexible; however, it incurs a high computational burden that limits its application on many resource-constrained platforms. In this paper, we address the problem of performing real-time localization in large-scale 3D point cloud maps of ever-growing size. While most systems using multi-modal information reduce localization time by employing side-channel information in a coarse manner (e.g. WiFi for a rough prior position estimate), we propose to inter-weave the map with rich sensory data. This multi-modal approach achieves two key goals simultaneously. First, it enables us to harness additional sensory data to localize against a map covering a vast area in real time; second, it allows us to roughly localize devices that are not equipped with a camera. The key to our approach is a localization policy based on a sequential Monte Carlo estimator. The localizer uses this policy to attempt point matching only in nodes where it is likely to succeed, significantly increasing the efficiency of the localization process. The proposed multi-modal localization system is evaluated extensively in a large museum building. The results show that our multi-modal approach not only increases localization accuracy but also significantly reduces computational time.
    Comment: Presented at IEEE-RAS International Conference on Humanoid Robots (Humanoids) 201
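
    A minimal sketch of the kind of sequential Monte Carlo localization policy described: particles are re-weighted by a cheap side-channel likelihood, and expensive visual point matching is attempted only where posterior mass concentrates. All names, models, and thresholds below are hypothetical.

```python
# Hypothetical sketch: a particle filter step plus a gating rule for when
# to attempt costly visual point matching.
import numpy as np

def smc_step(particles, weights, sensor_likelihood, motion_noise=0.5):
    # Propagate particles with a simple random-walk motion model.
    particles = particles + np.random.normal(0.0, motion_noise, particles.shape)
    # Re-weight by the side-channel sensor likelihood (e.g. a WiFi signature).
    weights = weights * sensor_likelihood(particles)
    weights = weights / weights.sum()
    # Systematic resampling to avoid weight degeneracy.
    n = len(weights)
    positions = (np.arange(n) + np.random.rand()) / n
    idx = np.minimum(np.searchsorted(np.cumsum(weights), positions), n - 1)
    return particles[idx], np.full(n, 1.0 / n)

def should_try_point_matching(particles, node_center, radius=2.0, mass=0.6):
    """Attempt visual matching only if enough posterior mass is near a node."""
    near = np.linalg.norm(particles - node_center, axis=-1) < radius
    return near.mean() >= mass
```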