
    Three-Dimensional Integral Imaging for Gesture Recognition Under Occlusions

    In recent years, three-dimensional (3-D) imaging has been applied to human action and gesture recognition, usually in the form of depth maps from RGB-D sensors. An alternative that has remained largely unexplored is 3-D integral imaging, aside from a recent preliminary study showing that it can be an effective sensory modality with some advantages over conventional monocular imaging. Since integral imaging has also been shown to be a powerful tool in other visual tasks (e.g., object reconstruction and recognition) under challenging conditions (e.g., low illumination, occlusions), and its passive long-range operation brings benefits over active close-range devices, a natural question is whether these advantages also hold for gesture recognition. Furthermore, occlusions arise in many real-world gesture recognition scenarios, yet they remain an elusive problem that has scarcely been addressed. As far as we know, this letter is the first to analyze the potential of integral imaging for gesture recognition under occlusions, by comparing it to monocular imaging and to RGB-D sensory data. Empirical results corroborate the benefits of 3-D integral imaging for gesture recognition, particularly under occlusions.

    Human gesture recognition under degraded environments using 3D-integral imaging and deep learning

    In this paper, we propose a spatio-temporal human gesture recognition algorithm for degraded conditions using three-dimensional integral imaging and deep learning. The proposed algorithm leverages the advantages of integral imaging together with deep learning to provide an efficient human gesture recognition system under degraded environments such as occlusion and low illumination. The 3D data captured using integral imaging serve as the input to a convolutional neural network (CNN). The spatial features extracted by the convolutional and pooling layers of the network are fed into a bi-directional long short-term memory (BiLSTM) network, which is designed to capture the temporal variation in the input data. We have compared the proposed approach with conventional 2D imaging and with previously reported approaches using spatio-temporal interest points with support vector machines (STIP-SVMs) and distortion-invariant non-linear correlation-based filters. Our experimental results suggest that the proposed approach is promising, especially in degraded environments. Using the proposed approach, we find a substantial improvement over previously published methods and find that 3D integral imaging provides superior performance over the conventional 2D imaging system. To the best of our knowledge, this is the first report that examines deep learning algorithms based on 3D integral imaging for human activity recognition in degraded environments.
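
    The CNN-plus-BiLSTM pipeline described above can be illustrated with a minimal PyTorch sketch; this is not the authors' code, and the layer sizes, number of gesture classes, and frame resolution are illustrative assumptions.

```python
# Minimal sketch of a per-frame CNN feeding a bi-directional LSTM classifier.
# All hyperparameters below are assumptions, not the paper's settings.
import torch
import torch.nn as nn

class GestureCNNBiLSTM(nn.Module):
    def __init__(self, num_gestures=10, feat_dim=128, hidden=64):
        super().__init__()
        # Per-frame spatial feature extractor (convolution + pooling layers).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        # Bi-directional LSTM captures temporal variation across frames.
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_gestures)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width) of 3D-reconstructed frames
        b, t, c, h, w = clips.shape
        feats = self.cnn(clips.reshape(b * t, c, h, w)).reshape(b, t, -1)
        seq, _ = self.bilstm(feats)
        return self.classifier(seq[:, -1])  # gesture-class logits

# Example: a batch of 2 clips, 16 frames each, 64x64 reconstructed images.
logits = GestureCNNBiLSTM()(torch.randn(2, 16, 3, 64, 64))
```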

    Integral imaging techniques for flexible sensing through image-based reprojection

    In this work, a 3D reconstruction approach for flexible sensing, inspired by integral imaging techniques, is proposed. The method supports different integral imaging operations, such as generating a depth map or reconstructing images on a given 3D plane of a scene captured by a set of cameras located at unknown, arbitrary positions and orientations. By means of a photo-consistency measure proposed in this work, all-in-focus images can also be generated by projecting the points of the 3D plane into the sensor planes of the cameras and gathering the associated RGB values. The proposed method obtains consistent results in real scenes with different object surfaces as well as changes in texture and lighting.
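
    The reprojection step described above can be sketched in a few lines of NumPy; this is an assumption-laden stand-in, not the paper's implementation, and the projection-matrix and image interfaces are hypothetical.

```python
# Project points on a 3D plane at depth z into several calibrated cameras and
# gather the associated RGB values. P_list holds 3x4 projection matrices;
# images are HxWx3 arrays of the same scene. Interfaces are assumptions.
import numpy as np

def reproject_plane(images, P_list, xs, ys, z):
    """Return per-camera RGB samples for the 3D points (x, y, z)."""
    pts = np.stack([xs, ys, np.full_like(xs, z), np.ones_like(xs)], axis=0)  # 4xN
    samples = []
    for img, P in zip(images, P_list):
        uvw = P @ pts                                # homogeneous pixel coordinates
        u = (uvw[0] / uvw[2]).round().astype(int)
        v = (uvw[1] / uvw[2]).round().astype(int)
        h, w = img.shape[:2]
        valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        rgb = np.full((pts.shape[1], 3), np.nan)
        rgb[valid] = img[v[valid], u[valid]]
        samples.append(rgb)
    return np.stack(samples)                         # (num_cameras, N, 3)

def all_in_focus_and_consistency(samples):
    """Mean colour as the refocused value; colour variance as a simple
    photo-consistency score (lower variance means more consistent)."""
    return np.nanmean(samples, axis=0), np.nanvar(samples, axis=0).sum(axis=-1)
```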

    Roadmap on 3D integral imaging: Sensing, processing, and display

    This Roadmap article on three-dimensional integral imaging provides an overview of research activities in the field. The article discusses various aspects of the field, including sensing of 3D scenes, processing of captured information, and 3D display and visualization of information. The paper consists of a series of 15 sections in which experts present various aspects of the field: sensing, processing, displays, augmented reality, microscopy, object recognition, and other applications. Each section represents the vision of its author in describing the progress, potential, and challenging issues in this field.

    Facial Point Detection using Boosted Regression and Graph Models

    Finding fiducial facial points in any frame of a video showing rich, naturalistic facial behaviour is an unsolved problem. Yet this is a crucial step for geometric-feature-based facial expression analysis and for methods that use appearance-based features extracted at fiducial facial point locations. In this paper we present a method based on a combination of Support Vector Regression and Markov Random Fields to drastically reduce the time needed to search for a point's location and to increase the accuracy and robustness of the algorithm. Using Markov Random Fields allows us to constrain the search space by exploiting the constellations that facial points can form. The regressors, on the other hand, learn a mapping between the appearance of the area surrounding a point and the position of that point, which makes detection very fast and robust to variations in appearance due to facial expression and moderate changes in head pose. The proposed point detection algorithm was tested on 1855 images, and the results show that it outperforms current state-of-the-art point detectors.
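
    The regression stage described above can be approximated with a small scikit-learn sketch; the SVR stand-in, patch size, and training data below are assumptions for illustration only, and the Markov Random Field step that constrains point constellations is omitted.

```python
# Learn a mapping from a patch's appearance to the offset towards the true
# facial point. Placeholder data; the MRF constellation constraint is omitted.
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

def patch_features(image, cx, cy, half=8):
    """Flatten the grey-level patch around (cx, cy) as a feature vector."""
    return image[cy - half:cy + half, cx - half:cx + half].ravel()

# X: appearance features of patches sampled near a facial point;
# Y: (dx, dy) offsets from each patch centre to the annotated point.
X = np.random.rand(200, 256)             # placeholder 16x16 training patches
Y = np.random.uniform(-5, 5, (200, 2))   # placeholder offsets in pixels

regressor = MultiOutputRegressor(SVR(kernel="rbf", C=1.0))
regressor.fit(X, Y)

# At test time, offsets predicted from patches sampled on a grid vote for the
# point location; an MRF over point pairs would then prune constellations
# that are anatomically implausible.
predicted_offset = regressor.predict(X[:1])
```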

    Gait recognition based on shape and motion analysis of silhouette contours

    This paper presents a three-phase gait recognition method that analyses the spatio-temporal shape and dynamic motion (STS-DM) characteristics of a human subject's silhouettes to identify the subject in the presence of most of the challenging factors that affect existing gait recognition systems. In phase 1, phase-weighted magnitude spectra of the Fourier descriptor of the silhouette contours at ten phases of a gait period are used to analyse the spatio-temporal changes of the subject's shape. A component-based Fourier descriptor, grounded in anatomical studies of the human body, is used to achieve robustness against shape variations caused by all common types of small carrying conditions: with folded hands, at the subject's back, and in an upright position. In phase 2, a full-body shape and motion analysis is performed by fitting ellipses to contour segments of ten phases of a gait period and using histogram matching, with the Bhattacharyya distance of the ellipse parameters as dissimilarity scores. In phase 3, dynamic time warping is used to analyse the angular rotation pattern of the subject's leading knee, together with arm swing, over a gait period to achieve identification that is invariant to walking speed, limited clothing variations, hairstyle changes, and shadows under the feet. The match scores generated in the three phases are fused using weight-based score-level fusion for robust identification in the presence of missing and distorted frames and occlusion in the scene. Experimental analyses on various publicly available data sets show that STS-DM outperforms several state-of-the-art gait recognition methods.
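
    Two of the ingredients named above, a Fourier descriptor of a silhouette contour and weight-based score-level fusion, can be sketched as follows; the coefficient count, normalisation, and fusion weights are illustrative assumptions rather than the paper's settings.

```python
# Fourier descriptor of a closed silhouette contour plus weighted score fusion.
import numpy as np

def fourier_descriptor(contour, num_coeffs=32):
    """contour: (N, 2) array of (x, y) boundary points. Returns a magnitude
    spectrum made translation- and scale-invariant."""
    z = contour[:, 0] + 1j * contour[:, 1]      # complex boundary signal
    spectrum = np.fft.fft(z)
    mags = np.abs(spectrum[1:num_coeffs + 1])   # drop DC term (translation)
    return mags / (mags[0] + 1e-12)             # normalise by first harmonic (scale)

def fuse_scores(scores, weights):
    """Weight-based score-level fusion of the per-phase dissimilarity scores."""
    scores, weights = np.asarray(scores, float), np.asarray(weights, float)
    return float(np.dot(weights, scores) / weights.sum())

# Example: fuse the three phases' dissimilarity scores with fixed weights.
final_score = fuse_scores([0.21, 0.34, 0.18], [0.4, 0.3, 0.3])
```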

    Depth and All-in-Focus Image Estimation in Synthetic Aperture Integral Imaging Under Partial Occlusions

    A common assumption in integral imaging reconstruction is that a pixel is photo-consistent if all viewpoints observed by the different cameras converge at a single point when focusing at the proper depth. However, occlusions between objects in the scene prevent this assumption from being fulfilled. In this paper, a novel depth and all-in-focus image estimation method is presented, based on a photo-consistency measure that applies a median criterion to the elemental images. The interest of this approach is that it detects which cameras correctly see a partially occluded object at a certain depth, which allows a precise estimate of the object's depth. In addition, a robust solution is proposed to detect the boundary limits between partially occluded objects, which are subsequently used during the regularization step of the depth estimation process. The experimental results show that the proposed method outperforms other state-of-the-art depth estimation methods in a synthetic aperture integral imaging framework.
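
    The median criterion described above can be sketched as follows; the interfaces are assumed and this is not the paper's code. The idea is that cameras whose line of sight hits an occluder instead of the object behave as outliers, which a median-based cost tolerates better than a mean-based one.

```python
# Median-based photo-consistency over reprojected elemental-image samples.
# Interfaces are assumptions for illustration.
import numpy as np

def median_photo_consistency(samples):
    """samples: (num_cameras, 3) RGB values reprojected for one pixel at one
    candidate depth. Returns (cost, all_in_focus_colour)."""
    median = np.nanmedian(samples, axis=0)
    cost = np.nanmedian(np.abs(samples - median).sum(axis=1))  # robust to occluded views
    return cost, median

def estimate_depth(samples_per_depth, depths):
    """Pick the candidate depth whose reprojections are most photo-consistent."""
    costs, colours = zip(*(median_photo_consistency(s) for s in samples_per_depth))
    best = int(np.argmin(costs))
    return depths[best], colours[best]
```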

    Duodepth: Static Gesture Recognition Via Dual Depth Sensors

    Static gesture recognition is an effective non-verbal communication channel between a user and their devices; however, many modern methods are sensitive to the relative pose of the user's hands with respect to the capture device, as parts of the gesture can become occluded. We present two methodologies for gesture recognition via synchronized recording from two depth cameras to alleviate this occlusion problem. One is a more classic approach using iterative closest point registration to accurately fuse the point clouds and a single PointNet architecture for classification; the other is a dual PointNet architecture for classification without registration. On a manually collected dataset of 20,100 point clouds, we show a 39.2% reduction in misclassification for the fused point cloud method, and 53.4% for the dual PointNet, when compared to a standard single-camera pipeline.
    Comment: 26th International Conference on Image Processing
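
    The fused pipeline described above can be sketched with Open3D; this is an assumed stand-in rather than the Duodepth code, and the PointNet classifier referenced at the end is hypothetical.

```python
# Register the second camera's point cloud onto the first with iterative
# closest point (ICP), concatenate the points, and hand the result to a
# single point-cloud classifier such as PointNet. Parameters are assumptions.
import numpy as np
import open3d as o3d

def fuse_with_icp(cloud_a, cloud_b, max_dist=0.02):
    """cloud_a, cloud_b: open3d.geometry.PointCloud from the two depth cameras."""
    result = o3d.pipelines.registration.registration_icp(
        cloud_b, cloud_a, max_dist, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    cloud_b_aligned = cloud_b.transform(result.transformation)
    # Single fused point set for the downstream (single) PointNet classifier.
    return np.vstack([np.asarray(cloud_a.points),
                      np.asarray(cloud_b_aligned.points)])

# fused = fuse_with_icp(pcd_front, pcd_side)   # (N, 3) array of fused points
# logits = pointnet(fused)                     # hypothetical PointNet classifier
```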