25 research outputs found

    Action Classification with Locality-constrained Linear Coding

    Full text link
    We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information of human body variations in each spatiotemporal subsequence of a video sequence. Our proposed method divides the input video into equally spaced overlapping spatiotemporal subsequences, each of which is decomposed into blocks and then cells. We use the Histogram of Oriented Gradient (HOG3D) feature to encode the information in each cell. We justify the use of LLC for encoding the block descriptor by demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor is obtained via a logistic regression classifier with L2 regularization. We evaluate and compare our algorithm with ten state-of-the-art algorithms on five benchmark datasets. Experimental results show that, on average, our algorithm gives better accuracy than these ten algorithms.Comment: ICPR 201

    Collaborative voting of 3D features for robust gesture estimation

    Get PDF
    © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Human body analysis raises special interest because it enables a wide range of interactive applications. In this paper we present a gesture estimator that discriminates body poses in depth images. A novel collaborative method is proposed to learn 3D features of the human body and, later, to estimate specific gestures. The collaborative estimation framework is inspired by decision forests, where each selected point (anchor point) contributes to the estimation by casting votes. The main idea is to detect a body part by accumulating the inference of other trained body parts. The collaborative voting encodes the global context of human pose, while 3D features represent local appearance. Body parts contributing to the detection are interpreted as a voting process. Experimental results for different 3D features prove the validity of the proposed algorithm.Peer ReviewedPostprint (author's final draft

    3D facial geometric features for constrained local model

    Get PDF
    We propose a 3D Constrained Local Model framework for deformable face alignment in depth image. Our framework exploits the intrinsic 3D geometric information in depth data by utilizing robust histogram-based 3D geometric features that are based on normal vectors. In addition, we demonstrate the fusion of intensity data and 3D features that further improves the facial landmark localization accuracy. The experiments are conducted on publicly available FRGC database. The results show that our 3D features based CLM completely outperforms the raw depth features based CLM in term of fitting accuracy and robustness, and the fusion of intensity and 3D depth feature further improves the performance. Another benefit is that the proposed 3D features in our framework do not require any pre-processing procedure on the data

    Toward a New Approach in Fruit Recognition using Hybrid RGBD Features and Fruit Hierarchy Property

    Get PDF
    We present hierarchical multi-feature classification (HMC) system for multiclass fruit recognition problem. Our approach to HMC exploits the advantages of combining multimodal features  and  the  fruit  hierarchy  property.  In  the construction of hybrid features, we take the advantage of using color feature in the fruit recognition problem and combine it with 3D shape feature of depth channel of RGBD (Red, Green, Blue, Depth) images. Meanwhile, given a set of fruit species and variety, with a preexisting hierarchy among them, we consider the problem of assigning images to one of these fruit variety from the point of view of a hierarchy. We report on computational experiment using this approach. We show that the use of hierarchy structure along with hybrid RGBD features can improve the classification performance

    Articulated Clinician Detection Using 3D Pictorial Structures on RGB-D Data

    Full text link
    Reliable human pose estimation (HPE) is essential to many clinical applications, such as surgical workflow analysis, radiation safety monitoring and human-robot cooperation. Proposed methods for the operating room (OR) rely either on foreground estimation using a multi-camera system, which is a challenge in real ORs due to color similarities and frequent illumination changes, or on wearable sensors or markers, which are invasive and therefore difficult to introduce in the room. Instead, we propose a novel approach based on Pictorial Structures (PS) and on RGB-D data, which can be easily deployed in real ORs. We extend the PS framework in two ways. First, we build robust and discriminative part detectors using both color and depth images. We also present a novel descriptor for depth images, called histogram of depth differences (HDD). Second, we extend PS to 3D by proposing 3D pairwise constraints and a new method that makes exact inference tractable. Our approach is evaluated for pose estimation and clinician detection on a challenging RGB-D dataset recorded in a busy operating room during live surgeries. We conduct series of experiments to study the different part detectors in conjunction with the various 2D or 3D pairwise constraints. Our comparisons demonstrate that 3D PS with RGB-D part detectors significantly improves the results in a visually challenging operating environment.Comment: The supplementary video is available at https://youtu.be/iabbGSqRSg

    Depth sensor based object detection using surface curvature

    Get PDF
    An object detection system finds objects from an image or video sequence of the real world. The good performance of object detection has been largely driven by the development of well-established robust feature sets. By using conventional color images as input, researchers have achieved major success. Recent dramatic advances in depth imaging technology triggered significant attention to revisit object detection problems using depth images as input. Using depth information, we propose a feature, Histogram of Oriented Curvature (HOC), designed specifically to capture local surface shape for object detection with depth sensor. We form the HOC feature as a concatenation of the local histograms of Gaussian curvature and mean curvature. The linear Support Vector Machine (SVM) is employed for the object detection task in this work. We evaluate our proposed HOC feature on two widely used datasets and compare the results with other well-known object detection methods applied on both RGB images and depth images. Our experimental results show that the proposed HOC feature generally outperform the HOG and HOGD features in object detection task, and can achieve similar or higher results compared with the state-of-the-art depth descriptor HONV on some object categories

    Active nonrigid ICP algorithm

    No full text
    © 2015 IEEE.The problem of fitting a 3D facial model to a 3D mesh has received a lot of attention the past 15-20 years. The majority of the techniques fit a general model consisting of a simple parameterisable surface or a mean 3D facial shape. The drawback of this approach is that is rather difficult to describe the non-rigid aspect of the face using just a single facial model. One way to capture the 3D facial deformations is by means of a statistical 3D model of the face or its parts. This is particularly evident when we want to capture the deformations of the mouth region. Even though statistical models of face are generally applied for modelling facial intensity, there are few approaches that fit a statistical model of 3D faces. In this paper, in order to capture and describe the non-rigid nature of facial surfaces we build a part-based statistical model of the 3D facial surface and we combine it with non-rigid iterative closest point algorithms. We show that the proposed algorithm largely outperforms state-of-the-art algorithms for 3D face fitting and alignment especially when it comes to the description of the mouth region

    Review of Local Descriptor in RGB-D Object Recognition

    Get PDF
    The emergence of an RGB-D (Red-Green-Blue-Depth) sensor which is capable of providing depth and RGB images gives hope to the computer vision community. Moreover, the use of local features began to increase over the last few years and has shown impressive results, especially in the field of object recognition. This article attempts to provide a survey of the recent technical achievements in this area of research. We review the use of local descriptors as the feature representation which is extracted from RGB-D images, in instances and category-level object recognition. We also highlight the involvement of depth images and how they can be combined with RGB images in constructing a local descriptor. Three different approaches are used in involving depth images into compact feature representation, that is classical approach using distribution based, kernel-trick, and feature learning. In this article, we show that the involvement of depth data successfully improves the accuracy of object recognition
    corecore