1,504 research outputs found

    Lifting GIS Maps into Strong Geometric Context for Scene Understanding

    Contextual information can have a substantial impact on the performance of visual tasks such as semantic segmentation, object detection, and geometric estimation. Data stored in Geographic Information Systems (GIS) offers a rich source of contextual information that has been largely untapped by computer vision. We propose to leverage such information for scene understanding by combining GIS resources with large sets of unorganized photographs using Structure from Motion (SfM) techniques. We present a pipeline to quickly generate strong 3D geometric priors from 2D GIS data using SfM models aligned with minimal user input. Given an image resectioned against this model, we generate robust predictions of depth, surface normals, and semantic labels. We show that the predicted geometry is substantially more accurate than that of other single-image depth estimation methods. We then demonstrate the utility of these contextual constraints for re-scoring pedestrian detections, and use these GIS contextual features alongside object detection score maps to improve a CRF-based semantic segmentation framework, boosting accuracy over baseline models.

    Extraction and selection of muscle based features for facial expression recognition

    In this study we propose a new set of muscle-activity-based features for facial expression recognition. We extract muscular activities by observing the displacements of facial feature points in an expression video. The facial feature points are initialized on muscular regions of influence in the first frame of the video. These points are tracked through optical flow in sequential frames. Displacements of feature points on the image plane are used to estimate the 3D orientation of a head model and relative displacements of its vertices. We model the human skin as a linear system of equations. The estimated deformation of the wireframe model produces an over-determined system of equations that can be solved under the constraint of the facial anatomy to obtain muscle activation levels. We apply sequential forward feature selection to choose the most descriptive set of muscles for recognition of basic facial expressions.
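
The selection step named in the abstract above can be illustrated with a minimal sketch of generic sequential forward selection. This is not the authors' implementation: the muscle names, the gain values, the redundancy penalty, and the `toy_score` function are all hypothetical stand-ins for a real cross-validated recognition accuracy.

```python
from typing import Callable, FrozenSet, List, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    k: int,
) -> List[str]:
    """Greedily add the feature that most improves score() until k are chosen."""
    selected: List[str] = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Evaluate every candidate together with the features already chosen.
        best = max(remaining, key=lambda f: score(frozenset(selected + [f])))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-muscle usefulness values (assumed, not from the paper).
GAINS = {
    "zygomaticus_major": 0.5,
    "risorius": 0.45,
    "orbicularis_oculi": 0.4,
    "frontalis": 0.3,
}
# Hypothetical pair assumed to carry largely redundant information.
REDUNDANT = frozenset({"zygomaticus_major", "risorius"})


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for classifier accuracy: sum of gains minus a redundancy penalty."""
    total = sum(GAINS[name] for name in subset)
    if REDUNDANT <= subset:
        total -= 0.4
    return total


chosen = sequential_forward_selection(list(GAINS), toy_score, k=2)
print(chosen)
```

With these assumed values, the second pick skips `risorius` despite its high individual gain, because the penalty makes it less useful once `zygomaticus_major` is already selected. That interaction is what distinguishes forward selection from ranking features independently.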

    Robust Hand Motion Capture and Physics-Based Control for Grasping in Real Time

    Hand motion capture technologies are being explored because of high demand in fields such as video games, virtual reality, sign language recognition, human-computer interaction, and robotics. However, existing systems suffer from several limitations: they are high-cost (expensive capture devices), intrusive (additional wear-on sensors or complex configurations), and restrictive (limited motion varieties and restricted capture space). This dissertation focuses on algorithms and applications for a hand motion capture system that is low-cost, non-intrusive, low-restriction, high-accuracy, and robust. More specifically, we develop a real-time and fully automatic hand tracking system using a low-cost depth camera. We first introduce an efficient shape-indexed cascaded pose regressor that directly estimates 3D hand poses from depth images. A distinctive property of our hand pose regressor is that it uses a low-dimensional parametric hand geometric model to learn 3D shape-indexed features robust to variations in hand shapes, viewpoints, and hand poses. We further introduce a hybrid tracking scheme that effectively complements our hand pose regressor with model-based hand tracking. In addition, we develop a rapid 3D hand shape modeling method that uses a small number of depth images to accurately construct a subject-specific skinned mesh model for hand tracking. This step not only automates the whole tracking system but also improves the robustness and accuracy of model-based tracking and hand pose regression. We also propose a physically realistic human grasping synthesis method that can grasp a wide variety of objects. Given an object to be grasped, our method computes the required controls (e.g. forces and torques) that advance the simulation to achieve realistic grasping. Our method combines the power of data-driven synthesis and physics-based grasping control. We first introduce a data-driven method to synthesize a realistic grasping motion from large sets of prerecorded grasping motion data. We then transform the synthesized kinematic motion into a physically realistic one using our online physics-based motion control method. We also provide a performance interface that allows the user to act in front of a depth camera to control a virtual object.

    Analyzing Structured Scenarios by Tracking People and Their Limbs

    The analysis of human activities is a fundamental problem in computer vision. Though complex, interactions between people and their environment often exhibit a spatio-temporal structure that can be exploited during analysis. This structure can be leveraged to mitigate the effects of missing or noisy visual observations caused, for example, by sensor noise, inaccurate models, or occlusion. Trajectories of people and their hands and feet, often sufficient for recognition of human activities, lead to a natural qualitative spatio-temporal description of these interactions. This work introduces the following contributions to the task of human activity understanding: 1) a framework that efficiently detects and tracks multiple interacting people and their limbs, 2) an event recognition approach that integrates both logical and probabilistic reasoning in analyzing the spatio-temporal structure of multi-agent scenarios, and 3) an effective computational model of the visibility constraints imposed on humans as they navigate through their environment. The tracking framework mixes probabilistic models with deterministic constraints and uses AND/OR search and lazy evaluation to efficiently obtain the globally optimal solution in each frame. Our high-level reasoning framework efficiently and robustly interprets noisy visual observations to deduce the events comprising structured scenarios. This is accomplished by combining First-Order Logic, Allen's Interval Logic, and Markov Logic Networks with an event hypothesis generation process that reduces the size of the ground Markov network. When applied to outdoor one-on-one basketball videos, our framework tracks the players and, guided by the game rules, analyzes their interactions with each other and the ball, annotating the videos with the relevant basketball events that occurred. 
Finally, motivated by studies of spatial behavior, we use a set of features from visibility analysis to represent spatial context in the interpretation of human spatial activities. We demonstrate the effectiveness of our representation on trajectories generated by humans in a virtual environment.
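
As a small illustration of the Allen's Interval Logic mentioned in the abstract above, the sketch below classifies the temporal relation between two time intervals into one of Allen's thirteen relations. It is a generic, assumed example and not the authors' reasoning framework; the event names and frame numbers are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if b1 < a0:
        return "after"
    if b1 == a0:
        return "met-by"
    # Intervals sharing one or both endpoints.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    # Strict containment.
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a0 < b0 else "overlapped-by"


# Hypothetical events, given as (start_frame, end_frame).
dribble = (0, 40)
shot = (40, 55)
possession = (0, 55)

print(allen_relation(dribble, shot))        # meets
print(allen_relation(shot, possession))     # finishes
print(allen_relation(dribble, possession))  # starts
```

Qualitative relations of this kind can then serve as predicates in logical rules, for example requiring that a shot event is met by a dribble event within the same possession.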

    Real Time Animation of Virtual Humans: A Trade-off Between Naturalness and Control

    Virtual humans are employed in many interactive applications using 3D virtual environments, including (serious) games. The motion of such virtual humans should look realistic (or ‘natural’) and allow interaction with the surroundings and other (virtual) humans. Current animation techniques differ in the trade-off they offer between motion naturalness and the control that can be exerted over the motion. We show mechanisms to parametrize, combine (on different body parts), and concatenate motions generated by different animation techniques. We discuss several aspects of motion naturalness and show how it can be evaluated. We conclude by showing the promise of combinations of different animation paradigms to enhance both naturalness and control.

    Industrial Heritage Education and User Tracking in Virtual Reality

    Industrial heritage provides one of the most important records of social and technological progress and has international potential for education and development. This chapter presents the potential of virtual reality devices for informal education in technical and natural sciences. The hypothetical virtual appearance of a nineteenth-century industrial power plant in the Slovak city of Piešťany was intricately reconstructed by combining identified, conserved valuable parts of the building with preserved original equipment and archival plans. The resulting interactive virtual tool educates viewers about the lost heritage by allowing them to look closer and experience the former atmosphere of industrial work. During the virtual visits, users are motion tracked and invited to take photographs to mark the most interesting motifs. Data gathered from observing the users were analyzed to find behavioral patterns and to provide feedback about the exhibition’s attractiveness, which is used in further presentations.

    Subsurface robotic exploration for geomorphology, astrobiology and mining during MINAR6 campaign, Boulby Mine, UK: part II (Results and Discussion)

    Acknowledgement. The authors of this paper would like to thank the Kempe Foundation for its generous funding support to develop KORE, the workshop at the Teknikens Hus, Luleå, for their invaluable and unconditional support in helping with the fabrication of the KORE components, and the organizers of the MINAR campaign comprising the UK Centre of Astrobiology, ICL Boulby Mine and STFC Boulby Underground Laboratory, UK. MPZ has been partially funded by the Spanish State Research Agency (AEI) Project No. MDM-2017-0737 Unidad de Excelencia ‘María de Maeztu’ - Centro de Astrobiología (INTA-CSIC).

    Analysis domain model for shared virtual environments

    The field of shared virtual environments, which also encompasses online games and social 3D environments, has a system landscape consisting of multiple solutions that share great functional overlap. However, there is little system interoperability between the different solutions. A shared virtual environment has an associated problem domain that is highly complex, raising difficult challenges for the development process, starting with the architectural design of the underlying system. This paper has two main contributions. The first contribution is a broad domain analysis of shared virtual environments, which enables developers to have a better understanding of the whole rather than the part(s). The second contribution is a reference domain model for discussing and describing solutions: the Analysis Domain Model.