Understanding egocentric human actions with temporal decision forests
Understanding human actions is a fundamental task in computer vision with a wide range of applications including pervasive health-care, robotics and game control. This thesis focuses on the problem of egocentric action recognition from RGB-D data, wherein the world is viewed through the eyes of the actor whose hands describe the actions.
The main contributions of this work are its findings regarding egocentric actions as described by hands in two application scenarios and a proposal of a new technique based on temporal decision forests. The thesis first introduces a novel framework to recognise fingertip writing in mid-air in the context of human-computer interaction. This framework detects whether the user is writing and tracks the fingertip over time to generate spatio-temporal trajectories, which are recognised using a Hough forest variant that encourages temporal consistency in prediction. A problem with using such a forest approach for action recognition is that the learning of temporal dynamics is limited to hand-crafted temporal features and temporal regression, which may break the temporal continuity and lead to inconsistent predictions. To overcome this limitation, the thesis proposes transition forests. Besides any temporal information that is encoded in the feature space, the forest automatically learns the temporal dynamics during training, and these are exploited at inference in an online and efficient manner, achieving state-of-the-art results. The last contribution of this thesis is the introduction of the first RGB-D benchmark to allow for the study of egocentric hand-object actions with both hand and object pose annotations. This study conducts an extensive evaluation of different baselines, state-of-the-art approaches and temporal decision forest models using colour, depth and hand pose features. Furthermore, it extends the transition forest model to incorporate data from different modalities and demonstrates the benefit of using hand pose features to recognise egocentric human actions. The thesis concludes by discussing and analysing the contributions and proposing a few ideas for future work.
Finger-stylus for non touch-enable systems
Since the computer was invented, people have used many devices to interact with it. Initially there were the keyboard, mouse, etc., but with the advancement of technology, new ways of interacting are being discovered that are more common and natural to humans, such as the stylus for touch-enabled systems. In the current age of technology, the user is expected to touch the machine interface to give input. Hand gestures offer a way to interact with machines in which the natural bare hand is used to communicate without touching the machine interface. This gives the user the feeling of interacting in a natural way with another human rather than with a traditional machine. This paper presents a technique where the user need not touch the machine interface to draw on the screen. Here the finger draws shapes on the monitor like a stylus, without touching the monitor. This method can be used in many applications, including games. The finger is used as an input device that acts like a paintbrush or finger-stylus and is used to make shapes in front of the camera. Fingertip extraction and motion tracking were done in Matlab under real-time constraints. This work is an early attempt to replace the stylus with the natural finger without touching the screen.
Freeform 3D interactions in everyday environments
PhD Thesis
Personal computing is continuously moving away from traditional input using
mouse and keyboard, as new input technologies emerge. Recently, natural user interfaces
(NUI) have led to interactive systems that are inspired by our physical interactions
in the real-world, and focus on enabling dexterous freehand input in 2D or 3D. Another
recent trend is Augmented Reality (AR), which follows a similar goal to further reduce
the gap between the real and the virtual, but predominately focuses on output, by overlaying
virtual information onto a tracked real-world 3D scene.
Whilst AR and NUI technologies have been developed for both immersive 3D output as
well as seamless 3D input, these have mostly been looked at separately. NUI focuses on
sensing the user and enabling new forms of input; AR traditionally focuses on capturing
the environment around us and enabling new forms of output that are registered to the
real world. The output of NUI systems is mainly presented on a 2D display, while
the input technologies for AR experiences, such as data gloves and body-worn motion
trackers are often uncomfortable and restricting when interacting in the real world.
NUI and AR can be seen as very complementary, and bringing these two fields together
can lead to new user experiences that radically change the way we interact with
our everyday environments. The aim of this thesis is to enable real-time, low latency,
dexterous input and immersive output without heavily instrumenting the user. The
main challenge is to retain and to meaningfully combine the positive qualities that are
attributed to both NUI and AR systems.
I review work in the intersecting research fields of AR and NUI, and explore freehand
3D interactions with varying degrees of expressiveness, directness and mobility
in various physical settings. There are a number of technical challenges that arise when
designing a mixed NUI/AR system, which I will address in this work: What can we capture,
and how? How do we represent the real in the virtual? And how do we physically
couple input and output? This is achieved by designing new systems, algorithms, and
user experiences that explore the combination of AR and NUI.
Review of three-dimensional human-computer interaction with focus on the leap motion controller
Modern hardware and software development has led to an evolution of user interfaces from command-line to natural user interfaces for virtual immersive environments. Gestures imitating real-world interaction tasks increasingly replace classical two-dimensional interfaces based on Windows/Icons/Menus/Pointers (WIMP) or touch metaphors. Thus, the purpose of this paper is to survey the state-of-the-art Human-Computer Interaction (HCI) techniques with a focus on the special field of three-dimensional interaction. This includes an overview of currently available interaction devices, their applications and the underlying methods for gesture design and recognition. The focus is on interfaces based on the Leap Motion Controller (LMC) and corresponding methods of gesture design and recognition. Further, a review of evaluation methods for the proposed natural user interfaces is given.