86,844 research outputs found

    The Importance of Hand Motions for Communication and Interaction in Virtual Reality

    Get PDF
    Virtual reality (VR) is a growing method of communication and play. Recent advances have enabled hand-tracking technologies for consumer VR headsets, allowing virtual hands to mimic a user\u27s real hand movements in real-time. A growing number of users now utilize hand-tracking when using VR to manipulate objects or to create gestures when interacting with others. As VR grows as a tool and communication platform, it is important to understand how the rising prevalence of hand-tracking technology might affect users\u27 experiences. The goal of this dissertation is to investigate, through a series of experiments, how using hand motions in VR influences our experience when we communicate with others or interact with the environment. In our daily lives hand motions play a major role in interpersonal communication. Our hands can help emphasize or clarify our speech, or even supplement words entirely. When interacting with the world, hands are our primary tool for manipulating objects and performing dexterous tasks. Bringing these capabilities into VR, a space that has so far been lacking in such detailed expression and interaction, may have unexpected effects. Overall, we show that using hand-tracking and hand motions in VR is beneficial to many metrics that are used to measure the quality of experiences in virtual environments. When using accurate hand motions, people feel more comfortable and embodied within their virtual avatars, or they feel more socially present. We recommend tracking and displaying hand motions in virtual environments if embodiment or communication are the most important criteria

    Overview of Bayesian sequential Monte Carlo methods for group and extended object tracking

    Get PDF
    This work presents the current state-of-the-art in techniques for tracking a number of objects moving in a coordinated and interacting fashion. Groups are structured objects characterized with particular motion patterns. The group can be comprised of a small number of interacting objects (e.g. pedestrians, sport players, convoy of cars) or of hundreds or thousands of components such as crowds of people. The group object tracking is closely linked with extended object tracking but at the same time has particular features which differentiate it from extended objects. Extended objects, such as in maritime surveillance, are characterized by their kinematic states and their size or volume. Both group and extended objects give rise to a varying number of measurements and require trajectory maintenance. An emphasis is given here to sequential Monte Carlo (SMC) methods and their variants. Methods for small groups and for large groups are presented, including Markov Chain Monte Carlo (MCMC) methods, the random matrices approach and Random Finite Set Statistics methods. Efficient real-time implementations are discussed which are able to deal with the high dimensionality and provide high accuracy. Future trends and avenues are traced. © 2013 Elsevier Inc. All rights reserved

    A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

    Full text link
    Tracking humans that are interacting with the other subjects or environment remains unsolved in visual tracking, because the visibility of the human of interests in videos is unknown and might vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject's interaction with the surrounding, e.g., crossing behind another object, entering a building, or getting into a vehicle, etc. We introduce a Causal And-Or Graph (C-AOG) to represent the causal-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason the visibility fluent change (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search of a feasible causal graph structure that enables fast search algorithm, e.g., dynamic programming method. We apply the proposed method on challenging video sequences to evaluate its capabilities of estimating visibility fluent changes of subjects and tracking subjects of interests over time. Results with comparisons demonstrate that our method outperforms the alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions.Comment: accepted by CVPR 201

    Multi-Object Tracking with Interacting Vehicles and Road Map Information

    Full text link
    In many applications, tracking of multiple objects is crucial for a perception of the current environment. Most of the present multi-object tracking algorithms assume that objects move independently regarding other dynamic objects as well as the static environment. Since in many traffic situations objects interact with each other and in addition there are restrictions due to drivable areas, the assumption of an independent object motion is not fulfilled. This paper proposes an approach adapting a multi-object tracking system to model interaction between vehicles, and the current road geometry. Therefore, the prediction step of a Labeled Multi-Bernoulli filter is extended to facilitate modeling interaction between objects using the Intelligent Driver Model. Furthermore, to consider road map information, an approximation of a highly precise road map is used. The results show that in scenarios where the assumption of a standard motion model is violated, the tracking system adapted with the proposed method achieves higher accuracy and robustness in its track estimations

    MetaSpace II: Object and full-body tracking for interaction and navigation in social VR

    Full text link
    MetaSpace II (MS2) is a social Virtual Reality (VR) system where multiple users can not only see and hear but also interact with each other, grasp and manipulate objects, walk around in space, and get tactile feedback. MS2 allows walking in physical space by tracking each user's skeleton in real-time and allows users to feel by employing passive haptics i.e., when users touch or manipulate an object in the virtual world, they simultaneously also touch or manipulate a corresponding object in the physical world. To enable these elements in VR, MS2 creates a correspondence in spatial layout and object placement by building the virtual world on top of a 3D scan of the real world. Through the association between the real and virtual world, users are able to walk freely while wearing a head-mounted device, avoid obstacles like walls and furniture, and interact with people and objects. Most current virtual reality (VR) environments are designed for a single user experience where interactions with virtual objects are mediated by hand-held input devices or hand gestures. Additionally, users are only shown a representation of their hands in VR floating in front of the camera as seen from a first person perspective. We believe, representing each user as a full-body avatar that is controlled by natural movements of the person in the real world (see Figure 1d), can greatly enhance believability and a user's sense immersion in VR.Comment: 10 pages, 9 figures. Video: http://living.media.mit.edu/projects/metaspace-ii

    3D Tracking Using Multi-view Based Particle Filters

    Get PDF
    Visual surveillance and monitoring of indoor environments using multiple cameras has become a field of great activity in computer vision. Usual 3D tracking and positioning systems rely on several independent 2D tracking modules applied over individual camera streams, fused using geometrical relationships across cameras. As 2D tracking systems suffer inherent difficulties due to point of view limitations (perceptually similar foreground and background regions causing fragmentation of moving objects, occlusions), 3D tracking based on partially erroneous 2D tracks are likely to fail when handling multiple-people interaction. To overcome this problem, this paper proposes a Bayesian framework for combining 2D low-level cues from multiple cameras directly into the 3D world through 3D Particle Filters. This method allows to estimate the probability of a certain volume being occupied by a moving object, and thus to segment and track multiple people across the monitored area. The proposed method is developed on the basis of simple, binary 2D moving region segmentation on each camera, considered as different state observations. In addition, the method is proved well suited for integrating additional 2D low-level cues to increase system robustness to occlusions: in this line, a naĂŻve color-based (HSI) appearance model has been integrated, resulting in clear performance improvements when dealing with complex scenarios

    RGBD Datasets: Past, Present and Future

    Full text link
    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes.Comment: 8 pages excluding references (CVPR style

    Saying What You're Looking For: Linguistics Meets Video Search

    Full text link
    We present an approach to searching large video corpora for video clips which depict a natural-language query in the form of a sentence. This approach uses compositional semantics to encode subtle meaning that is lost in other systems, such as the difference between two sentences which have identical words but entirely different meaning: "The person rode the horse} vs. \emph{The horse rode the person". Given a video-sentence pair and a natural-language parser, along with a grammar that describes the space of sentential queries, we produce a score which indicates how well the video depicts the sentence. We produce such a score for each video clip in a corpus and return a ranked list of clips. Furthermore, this approach addresses two fundamental problems simultaneously: detecting and tracking objects, and recognizing whether those tracks depict the query. Because both tracking and object detection are unreliable, this uses knowledge about the intended sentential query to focus the tracker on the relevant participants and ensures that the resulting tracks are described by the sentential query. While earlier work was limited to single-word queries which correspond to either verbs or nouns, we show how one can search for complex queries which contain multiple phrases, such as prepositional phrases, and modifiers, such as adverbs. We demonstrate this approach by searching for 141 queries involving people and horses interacting with each other in 10 full-length Hollywood movies.Comment: 13 pages, 8 figure

    A Dose of Reality: Overcoming Usability Challenges in VR Head-Mounted Displays

    Get PDF
    We identify usability challenges facing consumers adopting Virtual Reality (VR) head-mounted displays (HMDs) in a survey of 108 VR HMD users. Users reported significant issues in interacting with, and being aware of their real-world context when using a HMD. Building upon existing work on blending real and virtual environments, we performed three design studies to address these usability concerns. In a typing study, we show that augmenting VR with a view of reality significantly corrected the performance impairment of typing in VR. We then investigated how much reality should be incorporated and when, so as to preserve users’ sense of presence in VR. For interaction with objects and peripherals, we found that selectively presenting reality as users engaged with it was optimal in terms of performance and users’ sense of presence. Finally, we investigated how this selective, engagement-dependent approach could be applied in social environments, to support the user’s awareness of the proximity and presence of others
    • 

    corecore