3,300 research outputs found

    Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

    Full text link
    We describe the first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image. We estimate a full 3D mesh and show that 2D joints alone carry a surprising amount of information about body shape. The problem is challenging because of the complexity of the human body, articulation, occlusion, clothing, lighting, and the inherent ambiguity in inferring 3D from 2D. To solve this, we first use a recently published CNN-based method, DeepCut, to predict (bottom-up) the 2D body joint locations. We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints. We do so by minimizing an objective function that penalizes the error between the projected 3D model joints and detected 2D joints. Because SMPL captures correlations in human shape across the population, we are able to robustly fit it to very little data. We further leverage the 3D model to prevent solutions that cause interpenetration. We evaluate our method, SMPLify, on the Leeds Sports, HumanEva, and Human3.6M datasets, showing superior pose accuracy with respect to the state of the art.Comment: To appear in ECCV 201

    Hand-Shape Recognition Using the Distributions of Multi-Viewpoint Image Sets

    Get PDF
    This paper proposes a method for recognizing hand-shapes by using multi-viewpoint image sets. The recognition of a hand-shape is a difficult problem, as appearance of the hand changes largely depending on viewpoint, illumination conditions and individual characteristics. To overcome this problem, we apply the Kernel Orthogonal Mutual Subspace Method (KOMSM) to shift-invariance features obtained from multi-viewpoint images of a hand. When applying KOMSM to hand recognition with a lot of learning images from each class, it is necessary to consider how to run the KOMSM with heavy computational cost due to the kernel trick technique. We propose a new method that can drastically reduce the computational cost of KOMSM by adopting centroids and the number of images belonging to the centroids, which are obtained by using k-means clustering. The validity of the proposed method is demonstrated through evaluation experiments using multi-viewpoint image sets of 30 classes of hand-shapes

    Vision-Based 2D and 3D Human Activity Recognition

    Get PDF

    Towards gestural understanding for intelligent robots

    Get PDF
    Fritsch JN. Towards gestural understanding for intelligent robots. Bielefeld: Universität Bielefeld; 2012.A strong driving force of scientific progress in the technical sciences is the quest for systems that assist humans in their daily life and make their life easier and more enjoyable. Nowadays smartphones are probably the most typical instances of such systems. Another class of systems that is getting increasing attention are intelligent robots. Instead of offering a smartphone touch screen to select actions, these systems are intended to offer a more natural human-machine interface to their users. Out of the large range of actions performed by humans, gestures performed with the hands play a very important role especially when humans interact with their direct surrounding like, e.g., pointing to an object or manipulating it. Consequently, a robot has to understand such gestures to offer an intuitive interface. Gestural understanding is, therefore, a key capability on the way to intelligent robots. This book deals with vision-based approaches for gestural understanding. Over the past two decades, this has been an intensive field of research which has resulted in a variety of algorithms to analyze human hand motions. Following a categorization of different gesture types and a review of other sensing techniques, the design of vision systems that achieve hand gesture understanding for intelligent robots is analyzed. For each of the individual algorithmic steps – hand detection, hand tracking, and trajectory-based gesture recognition – a separate Chapter introduces common techniques and algorithms and provides example methods. The resulting recognition algorithms are considering gestures in isolation and are often not sufficient for interacting with a robot who can only understand such gestures when incorporating the context like, e.g., what object was pointed at or manipulated. Going beyond a purely trajectory-based gesture recognition by incorporating context is an important prerequisite to achieve gesture understanding and is addressed explicitly in a separate Chapter of this book. Two types of context, user-provided context and situational context, are reviewed and existing approaches to incorporate context for gestural understanding are reviewed. Example approaches for both context types provide a deeper algorithmic insight into this field of research. An overview of recent robots capable of gesture recognition and understanding summarizes the currently realized human-robot interaction quality. The approaches for gesture understanding covered in this book are manually designed while humans learn to recognize gestures automatically during growing up. Promising research targeted at analyzing developmental learning in children in order to mimic this capability in technical systems is highlighted in the last Chapter completing this book as this research direction may be highly influential for creating future gesture understanding systems

    Computational Intelligence Techniques in Visual Pattern Recognition

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Hand gesture recognition in uncontrolled environments

    Get PDF
    Human Computer Interaction has been relying on mechanical devices to feed information into computers with low efficiency for a long time. With the recent developments in image processing and machine learning methods, the computer vision community is ready to develop the next generation of Human Computer Interaction methods, including Hand Gesture Recognition methods. A comprehensive Hand Gesture Recognition based semantic level Human Computer Interaction framework for uncontrolled environments is proposed in this thesis. The framework contains novel methods for Hand Posture Recognition, Hand Gesture Recognition and Hand Gesture Spotting. The Hand Posture Recognition method in the proposed framework is capable of recognising predefined still hand postures from cluttered backgrounds. Texture features are used in conjunction with Adaptive Boosting to form a novel feature selection scheme, which can effectively detect and select discriminative texture features from the training samples of the posture classes. A novel Hand Tracking method called Adaptive SURF Tracking is proposed in this thesis. Texture key points are used to track multiple hand candidates in the scene. This tracking method matches texture key points of hand candidates within adjacent frames to calculate the movement directions of hand candidates. With the gesture trajectories provided by the Adaptive SURF Tracking method, a novel classi�er called Partition Matrix is introduced to perform gesture classification for uncontrolled environments with multiple hand candidates. The trajectories of all hand candidates extracted from the original video under different frame rates are used to analyse the movements of hand candidates. An alternative gesture classifier based on Convolutional Neural Network is also proposed. The input images of the Neural Network are approximate trajectory images reconstructed from the tracking results of the Adaptive SURF Tracking method. For Hand Gesture Spotting, a forward spotting scheme is introduced to detect the starting and ending points of the prede�ned gestures in the continuously signed gesture videos. A Non-Sign Model is also proposed to simulate meaningless hand movements between the meaningful gestures. The proposed framework can perform well with unconstrained scene settings, including frontal occlusions, background distractions and changing lighting conditions. Moreover, it is invariant to changing scales, speed and locations of the gesture trajectories

    Using Prospect Theory to Investigate Decision-Making Bias within an Information Security Context

    Get PDF
    Information security is an issue that has increased in importance over the past decade. In this time both practitioner and academic circles have researched and developed practices and process to more effectively handle information security. Even with growth in these areas there has been little research conducted into how decision makers actually behave. This is problematic because decision makers in the Department of Defense have been observed exhibiting risk seeking behavior when making information security decisions that seemingly violate accepted norms. There are presently no models in the literature that provide sufficient insight into this phenomenon. This study used Prospect Theory as a framework to develop a survey in an effort to obtain insight into how decision makers actually behave while making information security decisions. The survey was distributed to Majors in the Air Force who represented likely future information security decision makers. The results of the study were mixed, showing that prospect theory had only limited explanatory power in this context. The most significant finding showed that negatively connotated decision frames result in significantly more risk seeking behavior. These results provide insight into decision maker behavior and highlight the fact that there are biases in information security decision making

    Computational Modeling of Human Dorsal Pathway for Motion Processing

    Get PDF
    Reliable motion estimation in videos is of crucial importance for background iden- tification, object tracking, action recognition, event analysis, self-navigation, etc. Re- constructing the motion field in the 2D image plane is very challenging, due to variations in image quality, scene geometry, lighting condition, and most importantly, camera jit- tering. Traditional optical flow models assume consistent image brightness and smooth motion field, which are violated by unstable illumination and motion discontinuities that are common in real world videos. To recognize observer (or camera) motion robustly in complex, realistic scenarios, we propose a biologically-inspired motion estimation system to overcome issues posed by real world videos. The bottom-up model is inspired from the infrastructure as well as functionalities of human dorsal pathway, and the hierarchical processing stream can be divided into three stages: 1) spatio-temporal processing for local motion, 2) recogni- tion for global motion patterns (camera motion), and 3) preemptive estimation of object motion. To extract effective and meaningful motion features, we apply a series of steer- able, spatio-temporal filters to detect local motion at different speeds and directions, in a way that\u27s selective of motion velocity. The intermediate response maps are cal- ibrated and combined to estimate dense motion fields in local regions, and then, local motions along two orthogonal axes are aggregated for recognizing planar, radial and circular patterns of global motion. We evaluate the model with an extensive, realistic video database that collected by hand with a mobile device (iPad) and the video content varies in scene geometry, lighting condition, view perspective and depth. We achieved high quality result and demonstrated that this bottom-up model is capable of extracting high-level semantic knowledge regarding self motion in realistic scenes. Once the global motion is known, we segment objects from moving backgrounds by compensating for camera motion. For videos captured with non-stationary cam- eras, we consider global motion as a combination of camera motion (background) and object motion (foreground). To estimate foreground motion, we exploit corollary dis- charge mechanism of biological systems and estimate motion preemptively. Since back- ground motions for each pixel are collectively introduced by camera movements, we apply spatial-temporal averaging to estimate the background motion at pixel level, and the initial estimation of foreground motion is derived by comparing global motion and background motion at multiple spatial levels. The real frame signals are compared with those derived by forward predictions, refining estimations for object motion. This mo- tion detection system is applied to detect objects with cluttered, moving backgrounds and is proved to be efficient in locating independently moving, non-rigid regions. The core contribution of this thesis is the invention of a robust motion estimation system for complicated real world videos, with challenges by real sensor noise, complex natural scenes, variations in illumination and depth, and motion discontinuities. The overall system demonstrates biological plausibility and holds great potential for other applications, such as camera motion removal, heading estimation, obstacle avoidance, route planning, and vision-based navigational assistance, etc
    corecore