39 research outputs found

    Real-Time 3D Image Visualization System for Digital Video on a Single Chip

    Implementation of a real-time image visualization system on a reconfigurable chip (FPGA) is proposed. The system uses an innovative stereoscopic image capture, processing, and visualization technique. Implementation is a two-stage process. In the first stage, the stereo pair is captured using two image sensors. The captured images are then synchronized and sent to the second stage for fusion. A controller module is designed and placed on the FPGA for this purpose. The second stage reconstructs and visualizes the 3D image. An innovative technique employing a dual-processor architecture on the same FPGA is developed for this purpose. The whole system is placed on a single PCB, resulting in a fast processing time and the ability to view 3D video in real time. The system is simulated, implemented, and tested on real images. Results show that this system is a low-cost solution for efficient 3D video visualization using a single chip.

    3D human action recognition in multiple view scenarios

    This paper presents a novel view-independent approach to the recognition of human gestures of several people in low-resolution sequences from multiple calibrated cameras. In contrast to other multi-ocular gesture recognition systems, which classify a fusion of features coming from different views, our system first performs a data fusion (a 3D representation of the scene) and then feature extraction and classification. Motion descriptors introduced by Bobick et al. for 2D data are extended to 3D, and a set of features based on 3D invariant statistical moments is computed. Finally, a Bayesian classifier is employed to perform recognition over a small set of actions. Results are provided showing the effectiveness of the proposed algorithm in a SmartRoom scenario.
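
    As a rough illustration of extending a 2D motion descriptor to 3D, the sketch below applies the motion-history recursion of Bobick and Davis (moving cells are set to a duration `tau`, all others decay by one per frame) to a sparse voxel grid. The abstract does not give the paper's exact formulation, so the sparse-dictionary representation, the function names, and the use of a symmetric difference between consecutive occupancy sets as the motion indicator are all assumptions made for this example.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]


def moving_voxels(prev: Set[Voxel], curr: Set[Voxel]) -> Set[Voxel]:
    """Assumed motion indicator: voxels whose occupancy changed between frames."""
    return prev ^ curr


def update_motion_history(history: Dict[Voxel, int],
                          moving: Set[Voxel],
                          tau: int) -> Dict[Voxel, int]:
    """One step of the motion-history recursion on a sparse voxel grid.

    Voxels that moved in the current frame are set to tau. Every other voxel
    decays by one per frame and is dropped once its value reaches zero.
    """
    updated = {v: h - 1 for v, h in history.items()
               if h > 1 and v not in moving}
    for v in moving:
        updated[v] = tau
    return updated


def motion_energy(history: Dict[Voxel, int]) -> Set[Voxel]:
    """Binary counterpart of the history: every voxel that moved recently."""
    return set(history)
```

    Recent motion carries high values and older motion fades out, so a single volume summarizes where and how recently movement occurred. Statistical moments could then be computed over the stored values.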

    Spatio-temporal alignment and hyperspherical radon transform for 3D gait recognition in multi-view environments

    This paper presents a view-invariant approach to gait recognition in multi-camera scenarios, exploiting a joint spatio-temporal data representation and analysis. First, multi-view information is employed to generate a 3D voxel reconstruction of the scene under study. The analyzed subject is tracked, and its centroid and orientation allow recentering and aligning the volume associated with it, thus obtaining a representation invariant to translation, rotation, and scaling. The temporal periodicity of the walking cycle is extracted to align the input data in the time domain. Finally, the Hyperspherical Radon Transform is presented as an efficient tool to obtain features from spatio-temporal gait templates for classification purposes. Experimental results prove the validity and robustness of the proposed method for gait recognition tasks with several covariates.
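
    The abstract does not say how the periodicity of the walking cycle is extracted. One common option, shown here purely as an assumed stand-in, is to pick the lag that maximizes the autocorrelation of a per-frame scalar signal (for example, a hypothetical count of occupied voxels in the leg region). The function name and parameters are illustrative.

```python
from typing import Sequence


def estimate_period(signal: Sequence[float], min_lag: int, max_lag: int) -> int:
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation.

    The sum is deliberately not normalized by the overlap length, so longer
    lags are slightly penalized and the fundamental period is preferred
    over its integer multiples.
    """
    n = len(signal)
    if not 0 < min_lag <= max_lag < n:
        raise ValueError("lags must satisfy 0 < min_lag <= max_lag < len(signal)")
    mean = sum(signal) / n
    centered = [s - mean for s in signal]

    def autocorr(lag: int) -> float:
        return sum(centered[i] * centered[i + lag] for i in range(n - lag))

    return max(range(min_lag, max_lag + 1), key=autocorr)
```

    Once a period is known, sequences can be cut into whole cycles and resampled to a common length before feature extraction.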

    On using gait to enhance frontal face extraction

    Visual surveillance finds increasing deployment for monitoring urban environments. Operators need to be able to determine identity from surveillance images and often use face recognition for this purpose. In surveillance environments, it is necessary to handle pose variation of the human head, low frame rate, and low-resolution input images. We describe the first use of gait to enable face acquisition and recognition, by analysis of 3-D head motion and gait trajectory, with super-resolution analysis. We use region- and distance-based refinement of head pose estimation. We develop a direct mapping to relate the 2-D image with a 3-D model. In gait trajectory analysis, we model the looming effect so as to obtain the correct face region. Based on head position and the gait trajectory, we can reconstruct high-quality frontal face images, which are demonstrated to be suitable for face recognition. The contributions of this research include the construction of a 3-D model for pose estimation from planar imagery and the first use of gait information to enhance the face extraction process, allowing for deployment in surveillance scenarios.

    3D model-based human motion capture

    Master's thesis, Master of Engineering

    GPU-Based Optimization of a Free-Viewpoint Video System

    We present a method for optimizing the reconstruction and rendering of 3D objects from multiple images by utilizing the latest features of consumer-level graphics hardware based on shader model 4.0. We accelerate visual hull reconstruction by rewriting a shape-from-silhouette algorithm to execute on the GPU's parallel architecture. Rendering is optimized through the application of geometry shaders to generate billboarding microfacets textured with captured images. We also present a method for handling occlusion in the camera selection process that is optimized for execution on the GPU. Execution time is further improved by rendering intermediate results directly to texture to minimize the number of data transfers between graphics and main memory. We show our GPU-based system to be significantly more efficient than a purely CPU-based approach, due to the parallel nature of the GPU, while maintaining graphical quality.

    A Multi-camera Network System for Markerless 3D Human Body Voxel Reconstruction

    This paper presents a fully automated system for real-time 3D human visual hull reconstruction and skeleton voxel extraction. The main contributions include: (1) A novel network-based system is presented, which uses AXIS network cameras as the video capture device and performs parallel processing among data capture, 3D voxel reconstruction, and display. (2) A new human visual hull reconstruction algorithm is given. This approach first segments the foreground accurately using an efficient Gaussian Mixture Model (GMM) and a shadow model in HSV color space, then extends the standard Shape-From-Silhouette (SFS) algorithm with online Region-of-Interest (ROI) estimation and binary searching, and finally constructs a skeleton probability visual hull with a distance transform. Experiments with real video sequences show that the system can process eleven 640x480 video sequences at a frame rate of 15 fps, and construct human body voxels reliably in complex scenarios with cast shadows, various body configurations, and multiple persons.
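
    For readers unfamiliar with the standard Shape-From-Silhouette step that this work extends, a minimal sketch follows: a voxel is kept only if its projection lands on a foreground pixel in every view. The ROI estimation, binary searching, and GMM segmentation described in the abstract are not shown. The toy orthographic projections stand in for calibrated cameras, and all names are assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

Voxel = Tuple[int, int, int]
Pixel = Tuple[int, int]
View = Tuple[Callable[[Voxel], Pixel], Set[Pixel]]  # (projection, silhouette)


def carve_visual_hull(grid_size: int, views: List[View]) -> Set[Voxel]:
    """Keep every voxel whose projection is foreground in all views."""
    hull: Set[Voxel] = set()
    for x in range(grid_size):
        for y in range(grid_size):
            for z in range(grid_size):
                voxel = (x, y, z)
                if all(project(voxel) in silhouette
                       for project, silhouette in views):
                    hull.add(voxel)
    return hull


def example_hull() -> Set[Voxel]:
    """Toy setup: three axis-aligned orthographic views of a 4x4x4 grid."""
    square = {(1, 1), (1, 2), (2, 1), (2, 2)}  # same silhouette in each view
    views: List[View] = [
        (lambda v: (v[0], v[1]), square),  # looking along z
        (lambda v: (v[0], v[2]), square),  # looking along y
        (lambda v: (v[1], v[2]), square),  # looking along x
    ]
    return carve_visual_hull(4, views)
```

    The exhaustive triple loop visits every voxel, which is the cost that region-of-interest restrictions and GPU implementations aim to reduce.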

    Regression-Based Human Motion Capture From Voxel Data
