23,321 research outputs found

MetaSpace II: Object and full-body tracking for interaction and navigation in social VR

Author: Schmandt Chris
Sra Misha
Publication date: 09/12/2015

MetaSpace II (MS2) is a social Virtual Reality (VR) system where multiple users can not only see and hear but also interact with each other, grasp and manipulate objects, walk around in space, and get tactile feedback. MS2 allows walking in physical space by tracking each user's skeleton in real time and allows users to feel by employing passive haptics, i.e., when users touch or manipulate an object in the virtual world, they simultaneously touch or manipulate a corresponding object in the physical world. To enable these elements in VR, MS2 creates a correspondence in spatial layout and object placement by building the virtual world on top of a 3D scan of the real world. Through the association between the real and virtual world, users are able to walk freely while wearing a head-mounted device, avoid obstacles like walls and furniture, and interact with people and objects. Most current virtual reality (VR) environments are designed for a single-user experience where interactions with virtual objects are mediated by hand-held input devices or hand gestures. Additionally, users are only shown a representation of their hands in VR floating in front of the camera as seen from a first-person perspective. We believe that representing each user as a full-body avatar controlled by the natural movements of the person in the real world (see Figure 1d) can greatly enhance believability and a user's sense of immersion in VR. Comment: 10 pages, 9 figures. Video: http://living.media.mit.edu/projects/metaspace-ii

arXiv.org e-Print Archive

eScholarship - University of California

Detecting and tracking multiple interacting objects without class-specific models

Author: Bose Biswajit
Grimson Eric
Wang Xiaogang
Publication date: 25/04/2006

We propose a framework for detecting and tracking multiple interacting objects from a single, static, uncalibrated camera. The number of objects is variable and unknown, and object-class-specific models are not available. We use background subtraction results as measurements for object detection and tracking. Given these constraints, the main challenge is to associate pixel measurements with (possibly interacting) object targets. We first track clusters of pixels, and note when they merge or split. We then build an inference graph, representing relations between the tracked clusters. Using this graph and a generic object model based on spatial connectedness and coherent motion, we label the tracked clusters as whole objects, fragments of objects or groups of interacting objects. The outputs of our algorithm are entire tracks of objects, which may include corresponding tracks from groups of objects during interactions. Experimental results on multiple video sequences are shown.

DSpace@MIT

SAVASA project @ TRECVID 2012: interactive surveillance event detection

Author: Clawson Kathy
Direkoglu Cem
Gimenez Roberto
Jargalsaikhan Iveel
Li Hao
Little Suzanne
Martinez Llorens Ana
Mereu Anna
Nieto Marcos
O'Connor Noel E.
Rodriguez Aitor
Sanchez Pedro
Santos de la Camara Raul
Smeaton Alan F.
Villarroel Peniza Karina
Publication date: 26/11/2012

In this paper we describe our participation in the interactive surveillance event detection task at TRECVid 2012. The system we developed comprised individual classifiers brought together behind a simple video search interface that enabled users to select relevant segments based on down-sampled animated GIFs. Two types of user, 'experts' and 'end users', performed the evaluations. Due to time constraints we focussed on three events (ObjectPut, PersonRuns and Pointing) and two of the five available cameras (1 and 3). Results from the interactive runs as well as discussion of the performance of the underlying retrospective classifiers are presented.

DCU Online Research Access Service

Tracking of Individuals in Very Long Video Sequences

Author: Corlin Rasmus
Fihl Preben
Moeslund Thomas B.
Park Sangho
Trivedi Mohan M.
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2006

VBN

3D Object Reconstruction from Hand-Object Interactions

Author: Gall Juergen
Tzionas Dimitrios
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2015

Recent advances have enabled 3D object reconstruction approaches using a single off-the-shelf RGB-D camera. Although these approaches are successful for a wide range of object classes, they rely on stable and distinctive geometric or texture features. Many objects like mechanical parts, toys, household or decorative articles, however, are textureless and characterized by minimalistic shapes that are simple and symmetric. Existing in-hand scanning systems and 3D reconstruction techniques fail for such symmetric objects in the absence of highly distinctive features. In this work, we show that extracting 3D hand motion for in-hand scanning effectively facilitates the reconstruction of even featureless and highly symmetric objects, and we present an approach that fuses the rich additional information of hands into a 3D reconstruction pipeline, significantly contributing to the state of the art in in-hand scanning. Comment: International Conference on Computer Vision (ICCV) 2015, http://files.is.tue.mpg.de/dtzionas/In-Hand-Scannin

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

Learning to Refine Human Pose Estimation

Author: Fieraru Mihai
Khoreva Anna
Pishchulin Leonid
Schiele Bernt
Publication date: 01/01/2018

Multi-person pose estimation in images and videos is an important yet challenging task with many applications. Despite the large improvements in human pose estimation enabled by the development of convolutional neural networks, there remain many difficult cases where even state-of-the-art models fail to correctly localize all body joints. This motivates the need for an additional refinement step that addresses these challenging cases and can be easily applied on top of any existing method. In this work, we introduce a pose refinement network (PoseRefiner) which takes as input both the image and a given pose estimate and learns to directly predict a refined pose by jointly reasoning about the input-output space. In order for the network to learn to refine incorrect body joint predictions, we employ a novel data augmentation scheme for training, where we model "hard" human pose cases. We evaluate our approach on four popular large-scale pose estimation benchmarks such as MPII Single- and Multi-Person Pose Estimation, PoseTrack Pose Estimation, and PoseTrack Pose Tracking, and report systematic improvement over the state of the art. Comment: To appear in CVPRW (2018). Workshop: Visual Understanding of Humans in Crowd Scene and the 2nd Look Into Person Challenge (VUHCS-LIP

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Automatic Video-based Analysis of Human Motion

Author: Fihl Preben
Publication venue: Aalborg Universitet
Publication date: 15/10/2011

VBN