1,346 research outputs found

    VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera

    We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our approach the first monocular RGB method usable in real-time applications such as 3D character control; thus far, the only monocular methods for such applications employed specialized RGB-D cameras. Our method's accuracy is quantitatively on par with the best offline 3D monocular RGB pose estimation methods. Our results are qualitatively comparable to, and sometimes better than, results from monocular RGB-D approaches, such as the Kinect. However, we show that our approach is more broadly applicable than RGB-D solutions, i.e. it works for outdoor scenes, community videos, and low quality commodity RGB cameras. Comment: Accepted to SIGGRAPH 201

    Interactive and Audience Adaptive Digital Signage Using Real-Time Computer Vision

    In this paper we present the development of an interactive, content‐aware and cost‐effective digital signage system. Using a monocular camera installed within the frame of a digital signage display, we employ real‐time computer vision algorithms to extract temporal, spatial and demographic features of the observers, which are further used for observer‐specific broadcasting of digital signage content. The number of observers is obtained by the Viola and Jones face detection algorithm, whilst facial images are registered using multi‐view Active Appearance Models. The distance of the observers from the system is estimated from the interpupillary distance of registered faces. Demographic features, including gender and age group, are determined using SVM classifiers to achieve individual observer‐specific selection and adaptation of the digital signage broadcasting content. The developed system was evaluated at the laboratory study level and in a field study performed for audience measurement research. Comparison of our monocular localization module with the Kinect stereo‐system reveals a comparable level of accuracy. The facial characterization module is evaluated on the FERET database with 95% accuracy for gender classification and 92% for age group. Finally, the field study demonstrates the applicability of the developed system in real‐life environments.
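    The abstract above says observer distance is estimated from the interpupillary distance of registered faces, without giving the formula. One common way to do this, shown here only as a sketch under a pinhole-camera assumption and not as the authors' method, is to scale a nominal real-world interpupillary distance by the ratio of focal length to measured pixel distance. The function name, the 63 mm default, and the focal length used in the example are all illustrative assumptions.

```python
def distance_from_ipd(ipd_pixels, focal_length_pixels, ipd_mm=63.0):
    """Estimate camera-to-face distance in millimetres with a pinhole model.

    ipd_pixels: interpupillary distance measured in the image, in pixels.
    focal_length_pixels: camera focal length in pixels (from calibration).
    ipd_mm: assumed real interpupillary distance; 63 mm is a nominal
            adult value chosen for illustration, not taken from the paper.
    """
    if ipd_pixels <= 0:
        raise ValueError("ipd_pixels must be positive")
    # Similar triangles: real_size / distance = pixel_size / focal_length.
    return focal_length_pixels * ipd_mm / ipd_pixels


if __name__ == "__main__":
    # Hypothetical numbers: a face whose pupils are 42 px apart,
    # seen by a camera with a 1000 px focal length.
    print(distance_from_ipd(42.0, 1000.0))
```

    Under this model the estimate is only as good as the assumed real interpupillary distance and the assumption of a frontal face, so it is a rough range cue rather than a measurement.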

    Multimodal human hand motion sensing and analysis - a review


    Design and Development of a Twisted String Exoskeleton Robot for the Upper Limb

    High-intensity, task-specific upper-limb treatment based on active, highly repetitive movements is an effective approach for patients with motor disorders. However, given the severe shortage of medical services in the United States and the significant financial costs that stroke survivors can continue to incur, patients often choose not to return to the hospital or clinic to complete their recovery. Robot-assisted therapy can therefore be considered an alternative rehabilitation approach, because it yields results similar to or better than those of patients who receive intensive conventional therapy from professional physicians. The primary objective of this study was to design and fabricate an effective mobile assistive robotic system that can provide stroke patients with shoulder and elbow assistance. To reduce the size of the actuators and minimize the weight carried by users, two sets of dual twisted-string actuators, each with 7 strands (1 neutral and 6 effective), were used to extend and contract the strings, driving the rotational movements of the shoulder and elbow joints through a Bowden cable mechanism. Furthermore, movements of non-disabled people were captured as templates of training trajectories to provide effective rehabilitation. The specific aims of this study included the development of a two-degree-of-freedom prototype for the elbow and shoulder joints; an adaptive robust control algorithm with cross-coupling dynamics that can compensate for both nonlinear factors of the system and asynchronization between individual actuators; and an approach for extracting reference trajectories for the assistive robot from non-disabled people, based on the Microsoft Kinect sensor and the dynamic time warping algorithm. Finally, the data acquisition and control system of the robot was implemented on an Intel Galileo and a Xilinx FPGA embedded system.
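    The abstract above names dynamic time warping as the tool for extracting reference trajectories but does not describe how it was applied. As a minimal sketch of the textbook algorithm, and not the authors' implementation, the function below computes the DTW alignment cost between two one-dimensional sequences such as joint-angle recordings. The function name and the absolute-difference local cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two 1-D sequences.

    Uses absolute difference as the local cost and allows the three
    standard steps (match, insertion, deletion). Runs in
    O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # The same motion performed at two speeds aligns at zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

    Because the warping absorbs differences in speed, recordings of the same movement at different tempos can be compared or averaged into a template, which is the usual reason for choosing DTW in motion-capture settings.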

    3D garment digitisation for virtual wardrobe using a commodity depth sensor

    5-Aminovaleric acid (5AVA) is an important five-carbon platform chemical that can be used for the synthesis of polymers and other chemicals of industrial interest. Enzymatic conversion of L-lysine to 5AVA has been achieved by employing lysine 2-monooxygenase encoded by the davB gene and 5-aminovaleramidase encoded by the davA gene. Additionally, a recombinant Escherichia coli strain expressing the davB and davA genes has been developed for bioconversion of L-lysine to 5AVA. To use glucose and xylose derived from lignocellulosic biomass as substrates, rather than L-lysine as a substrate, we previously examined direct fermentative production of 5AVA from glucose by metabolically engineered E. coli strains. However, the yield and productivity of 5AVA achieved by recombinant E. coli strains remain very low. Thus, Corynebacterium glutamicum, a highly efficient L-lysine producing microorganism, should be useful in the development of direct fermentative production of 5AVA using L-lysine as a precursor for 5AVA. Here, we report the development of metabolically engineered C. glutamicum strains for enhanced fermentative production of 5AVA from glucose. Various expression vectors containing different promoters and origins of replication were examined for optimal expression of Pseudomonas putida davB and davA genes encoding lysine 2-monooxygenase and delta-aminovaleramidase, respectively. Among them, expression of the C. glutamicum codon-optimized davA gene fused with His-Tag at its N-Terminal and the davB gene as an operon under a strong synthetic H promoter (plasmid p36davAB3) in C. glutamicum enabled the most efficient production of 5AVA. Flask culture and fed-batch culture of this strain produced 6.9 and 19.7 g/L (together with 11.9 g/L glutaric acid as major byproduct) of 5AVA, respectively. Homology modeling suggested that endogenous gamma-aminobutyrate aminotransferase encoded by the gabT gene might be responsible for the conversion of 5AVA to glutaric acid in recombinant C. glutamicum. Fed-batch culture of a C. glutamicum gabT mutant-harboring p36davAB3 produced 33.1 g/L 5AVA with much reduced (2.0 g/L) production of glutaric acid. Corynebacterium glutamicum was successfully engineered to produce 5AVA from glucose by optimizing the expression of two key enzymes, lysine 2-monooxygenase and delta-aminovaleramidase. In addition, production of glutaric acid, a major byproduct, was significantly reduced by employing C. glutamicum gabT mutant as a host strain. The metabolically engineered C. glutamicum strains developed in this study should be useful for enhanced fermentative production of the novel C5 platform chemical 5AVA from renewable resources.

    Computational Multimedia for Video Self Modeling

    Video self modeling (VSM) is a behavioral intervention technique in which a learner models a target behavior by watching a video of himself or herself. This is the idea behind the psychological theory of self-efficacy: you can learn to perform certain tasks because you see yourself doing them, which provides the most ideal form of behavior modeling. The effectiveness of VSM has been demonstrated for many different types of disabilities and behavioral problems, ranging from stuttering, inappropriate social behaviors, autism and selective mutism to sports training. However, there is an inherent difficulty associated with the production of VSM material. Prolonged and persistent video recording is required to capture the rare snippets, if they exist at all, that can be strung together to form novel video sequences of the target skill. To solve this problem, in this dissertation, we use computational multimedia techniques to facilitate the creation of synthetic visual content for self-modeling that can be used by a learner and his/her therapist with a minimum amount of training data. There are three major technical contributions in my research. First, I developed an Adaptive Video Re-sampling algorithm to synthesize realistic lip-synchronized video with minimal motion jitter. Second, to denoise and complete the depth map captured by structured-light sensing systems, I introduced a layer-based probabilistic model to account for various types of uncertainties in the depth measurement. Third, I developed a simple and robust bundle-adjustment-based framework for calibrating a network of multiple wide-baseline RGB and depth cameras.