A fast and robust hand-driven 3D mouse
The development of new interaction paradigms calls for natural interaction: people should be able to interact with technology using the same models they use to interact with everyday real life, that is, through gestures, expressions and voice. Following this idea, in this paper we propose a non-intrusive, vision-based tracking system able to capture hand motion and simple hand gestures. The proposed device allows the hand to be used as a "natural" 3D mouse, where the forefinger tip or the palm centre identifies a 3D marker and hand gestures simulate the mouse buttons. The approach is based on a monoscopic tracking algorithm which is computationally fast and robust against noise and cluttered backgrounds. Two image streams are processed in parallel, exploiting multi-core architectures, and their results are combined to obtain a constrained stereoscopic problem. The system has been implemented and thoroughly tested in an experimental environment where the 3D hand mouse has been used to interact with objects in a virtual reality application. We also provide results on the performance of the tracker, which demonstrate the precision and robustness of the proposed system.
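The fusion step described above, in which two independently processed streams are combined into a constrained stereoscopic problem, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the camera parameters, the `detect_fingertip` placeholder and the rectified-pair assumption are ours.

```python
# Sketch: two image streams processed in parallel, detections fused by
# standard stereo triangulation (depth from disparity). Calibration
# values below are assumed, not from the paper.
from concurrent.futures import ThreadPoolExecutor

FOCAL_PX = 700.0    # focal length in pixels (assumed calibration)
BASELINE_M = 0.12   # distance between the two cameras, metres (assumed)

def detect_fingertip(frame):
    """Placeholder for the per-stream monoscopic tracker: returns the
    fingertip position (x, y) in pixel coordinates."""
    return frame["tip"]  # stand-in; a real tracker would process pixels

def triangulate(left_xy, right_xy):
    """Recover a 3D point from matched detections in a rectified pair."""
    disparity = left_xy[0] - right_xy[0]
    if disparity <= 0:
        raise ValueError("point at or beyond infinity")
    z = FOCAL_PX * BASELINE_M / disparity
    x = left_xy[0] * z / FOCAL_PX
    y = left_xy[1] * z / FOCAL_PX
    return (x, y, z)

def hand_mouse_position(left_frame, right_frame):
    # The two streams are processed in parallel on separate cores,
    # then fused into one constrained stereo estimate.
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(detect_fingertip, left_frame)
        right = pool.submit(detect_fingertip, right_frame)
        return triangulate(left.result(), right.result())
```

With a 70-pixel disparity this yields a depth of 1.2 m for the calibration above; the constrained setup means only one matched point pair must be triangulated per frame, which is what keeps the tracker fast.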
An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display
We present a tele-immersive system that enables people to interact with each
other in a virtual world using body gestures in addition to verbal
communication. Beyond the obvious applications, including general online
conversations and gaming, we hypothesize that our proposed system would be
particularly beneficial to education by offering rich visual contents and
interactivity. One distinct feature is the integration of egocentric pose
recognition that allows participants to use their gestures to demonstrate and
manipulate virtual objects simultaneously. This functionality enables the
instructor to effectively and efficiently explain and illustrate complex
concepts or sophisticated problems in an intuitive manner. The highly
interactive and flexible environment can capture and sustain more student
attention than the traditional classroom setting and, thus, delivers a
compelling experience to the students. Our main focus here is to investigate
possible solutions for the system design and implementation and devise
strategies for fast, efficient computation suitable for visual data processing
and network transmission. We describe the technique and experiments in detail
and provide quantitative performance results, demonstrating our system can be
run comfortably and reliably for different application scenarios. Our
preliminary results are promising and demonstrate the potential for more
compelling directions in cyberlearning. Comment: IEEE International Symposium on Multimedia 201
Analysis domain model for shared virtual environments
The field of shared virtual environments, which also encompasses online games and social 3D environments, has a system landscape consisting of multiple solutions that share great functional overlap. However, there is little interoperability between the different solutions. A shared virtual environment has an associated problem domain that is highly complex, raising difficult challenges for the development process, starting with the architectural design of the underlying system. This paper makes two main contributions. The first is a broad domain analysis of shared virtual environments, which enables developers to better understand the whole rather than only its parts. The second is a reference domain model for discussing and describing solutions: the Analysis Domain Model.
Robust multi-clue face tracking system
In this paper we present a multi-clue face tracking system based on the combination of a face detector and two independent trackers. The detector, a variant of the Viola-Jones algorithm, is tuned to generate a very low false-positive rate; it initiates the tracking system and updates its state. The trackers, based on 3DRS and optical flow respectively, have been chosen to complement each other in different conditions. The main focus of this work is the integration of the two trackers and the design of a closed-loop detector-tracker system, aiming at superior robustness with real-time operation on a PC platform. Tests were carried out to assess the actual performance of the system. With an average of about 95% correct face location rate and no significant false positives, the proposed approach appears to be particularly robust to complex backgrounds, ambient light variation, face orientation and scale changes, partial occlusions, different facial expressions and the presence of other unwanted faces.
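The closed-loop detector-tracker integration described above might be organised along these lines. This is a minimal sketch under our own assumptions (box format, drift threshold, averaging fusion), not the paper's actual algorithm:

```python
# Sketch of a detector-tracker closed loop: a detector periodically
# re-seeds two independent trackers, and tracker agreement gates the
# fused output. Boxes are (x1, y1, x2, y2) tuples; the drift threshold
# is an assumed parameter.
def fuse(det_box, track_a, track_b, max_drift=40):
    """Return (face_box, new_state_a, new_state_b). Prefer a fresh
    detection; otherwise accept the trackers only while they agree."""
    if det_box is not None:
        # Detection available: re-initialise both trackers from it.
        return det_box, det_box, det_box
    drift = sum(abs(a - b) for a, b in zip(track_a, track_b))
    if drift <= max_drift:
        # Trackers agree: average their boxes for the fused estimate.
        fused = tuple((a + b) // 2 for a, b in zip(track_a, track_b))
        return fused, track_a, track_b
    # Trackers diverged: declare the face lost until the detector fires.
    return None, track_a, track_b
```

The design choice this illustrates is that the slow-but-reliable detector corrects the fast-but-drifting trackers, while disagreement between the two complementary trackers serves as a cheap failure signal.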
Assessment of Audio Interfaces for use in Smartphone Based Spatial Learning Systems for the Blind
Recent advancements in indoor positioning and mobile computing promise the development of smartphone-based indoor navigation systems. Currently, preliminary implementations of such systems only use visual interfaces, meaning that they are inaccessible to blind and low-vision users. According to the World Health Organization, about 39 million people in the world are blind. This necessitates the development and evaluation of non-visual interfaces for indoor navigation systems that support safe and efficient spatial learning and navigation behavior. This thesis research has empirically evaluated several approaches through which spatial information about the environment can be conveyed through audio. In the first experiment, blindfolded participants standing at an origin in a lab learned the distance and azimuth of target objects that were specified by four audio modes. The first three modes were perceptual interfaces and did not require cognitive mediation on the part of the user. The fourth mode was a non-perceptual mode in which object descriptions were given via spatial language using clock-face angles. After learning the targets through the four modes, the participants spatially updated the positions of the targets and localized them by walking to each of them from two indirect waypoints. The results indicate that the hand-motion-triggered mode was better than the head-motion-triggered mode and comparable to the auditory snapshot mode. In the second experiment, blindfolded participants learned target object arrays with two spatial audio modes and a visual mode. In the first mode head tracking was enabled, whereas in the second mode hand tracking was enabled. In the third mode, serving as a control, the participants were allowed to learn the targets visually. We again compared spatial updating performance across these modes and found no significant performance differences between them.
These results indicate that 3D audio interfaces can be developed on sensor-rich, off-the-shelf smartphones, without the need for expensive head-tracking hardware. Finally, a third study evaluated room-layout learning by blindfolded participants using an Android smartphone. Three perceptual modes and one non-perceptual mode were tested for cognitive map development. As expected, the perceptual interfaces performed significantly better than the non-perceptual, language-based mode in an allocentric pointing judgment and in overall subjective rating. In sum, the perceptual interfaces led to better spatial learning performance and higher user ratings, and there was no significant difference between cognitive maps developed through spatial audio based on tracking the user's head or hand. These results have important implications, as they support the development of accessible, perceptually driven interfaces for smartphones.
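The non-perceptual spatial-language mode described above reduces to mapping each target's azimuth onto a clock-face direction. A minimal sketch of that conversion, with the angle convention (0° straight ahead, clockwise positive) and the phrasing assumed by us rather than taken from the thesis:

```python
# Sketch of the "spatial language with clock-face angles" mode:
# convert an azimuth in degrees into the nearest clock position and a
# spoken description. Conventions here are assumptions.
def azimuth_to_clockface(azimuth_deg):
    """Map an azimuth (0 = ahead, clockwise positive) to 1..12."""
    hour = round((azimuth_deg % 360) / 30) % 12
    return 12 if hour == 0 else hour

def describe_target(azimuth_deg, distance_m):
    """Render one target as a spatial-language phrase."""
    return (f"object at {azimuth_to_clockface(azimuth_deg)} o'clock, "
            f"{distance_m:.1f} metres")
```

Unlike the perceptual modes, output like `describe_target(90, 2.5)` must be decoded by the listener into an egocentric direction, which is exactly the cognitive mediation the thesis contrasts against.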
Integration and coordination in a cognitive vision system
In this paper, we present a case study that exemplifies
general ideas of system integration and coordination.
The application field of assistant technology provides an
ideal test bed for complex computer vision systems including
real-time components, human-computer interaction, dynamic
3-d environments, and information retrieval aspects.
In our scenario the user is wearing an augmented reality device
that supports her/him in everyday tasks by presenting
information that is triggered by perceptual and contextual
cues. The system integrates a wide variety of visual functions
like localization, object tracking and recognition, action
recognition, interactive object learning, etc. We show
how different kinds of system behavior are realized using
the Active Memory Infrastructure that provides the technical
basis for distributed computation and a data- and event-driven
integration approach.
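The data- and event-driven integration pattern behind the Active Memory Infrastructure can be illustrated with a toy shared memory in which components react to insert events. This is our own minimal sketch of the pattern, not the actual framework's API:

```python
# Toy event-driven shared memory: components subscribe with a
# predicate and are notified when matching data is inserted, so
# visual functions stay decoupled from the behaviors they trigger.
class ActiveMemory:
    def __init__(self):
        self._store = {}
        self._listeners = []

    def subscribe(self, predicate, callback):
        """Register a component to be notified of matching inserts."""
        self._listeners.append((predicate, callback))

    def insert(self, key, item):
        """Store an item and fire every matching listener."""
        self._store[key] = item
        for predicate, callback in self._listeners:
            if predicate(item):
                callback(key, item)
```

In the assistance scenario above, an object-recognition component inserting a percept would, through such a mechanism, trigger the augmented-reality display component without either knowing about the other.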
Using film cutting in interface design
It has been suggested that computer interfaces could be made more usable if their designers utilized cinematography techniques, which have evolved to guide
the viewer through a narrative despite frequent discontinuities in the presented scene (i.e., cuts between shots). Because of differences between the domains of
film and interface design, it is not straightforward to understand how such techniques can be transferred. May and Barnard (1995) argued that a psychological
model of watching film could support such a transference. This article presents an extended account of this model, which allows identification of the practice of collocation
of objects of interest in the same screen position before and after a cut. To verify that filmmakers do, in fact, use such techniques successfully, eye movements
were measured while participants watched the entirety of a commercial film.