Context-aware gestural interaction in the smart environments of the ubiquitous computing era
A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy.
Technology is becoming pervasive, and current interfaces are not adequate for interaction with the smart environments of the ubiquitous computing era. Recently, researchers have begun to address this issue by introducing the concept of the natural user interface, which is mainly based on gestural interaction. Many issues remain open in this emerging domain; in particular, there is a lack of common guidelines for the coherent implementation of gestural interfaces.
This research investigates gestural interaction between humans and smart environments. It proposes a novel framework for the high-level organisation of context information. The framework is conceived to support a novel approach that uses functional gestures to reduce gesture ambiguity, shrink the number of gestures in taxonomies, and improve usability.
To validate this framework, a proof-of-concept prototype has been developed, implementing a novel method for the view-invariant recognition of deictic and dynamic gestures. Tests were conducted to assess the gesture recognition accuracy and the usability of interfaces developed following the proposed framework. The results show that the method provides accurate gesture recognition from very different viewpoints, while the usability tests yielded high scores.
Further investigation of the context information tackled the problem of user status, understood here as human activity, and a technique based on an innovative application of electromyography is proposed. Tests show that the proposed technique achieves good activity recognition accuracy.
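The abstract does not detail the electromyography pipeline, so as illustration only, here is a minimal sketch of standard time-domain sEMG features (mean absolute value, root mean square, thresholded zero crossings) of the kind commonly fed to an activity classifier. The function name and the noise threshold are hypothetical, not taken from the thesis.

```python
import numpy as np

def emg_window_features(window, zc_threshold=0.01):
    """Standard time-domain features for one windowed sEMG channel.

    These are textbook EMG features, not necessarily the ones used in
    the thesis: mean absolute value, RMS, and zero-crossing count with
    a small amplitude threshold to suppress noise-induced crossings.
    """
    mav = np.mean(np.abs(window))            # mean absolute value
    rms = np.sqrt(np.mean(window ** 2))      # root mean square
    signs = np.sign(window)
    crossings = np.where(np.diff(signs) != 0)[0]
    # keep only crossings whose preceding sample clears the threshold
    zc = int(np.sum(np.abs(window[crossings]) > zc_threshold))
    return np.array([mav, rms, zc])

# Example: features from one synthetic 200-sample window
window = 0.5 * np.sin(np.linspace(0, 8 * np.pi, 200))
feats = emg_window_features(window)
```

A classifier (e.g. SVM or random forest) would then be trained on such feature vectors computed per channel and per sliding window.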
The context is also treated as system status. In ubiquitous computing, a system can adopt different paradigms: wearable, environmental, and pervasive. A novel paradigm, called the synergistic paradigm, is presented, combining the advantages of the wearable and environmental paradigms. Moreover, it augments the user's interaction possibilities and ensures better gesture recognition accuracy than the other paradigms.
The Evolution of First Person Vision Methods: A Survey
The emergence of new wearable technologies such as action cameras and smart glasses has increased the interest of computer vision scientists in the first-person perspective. Nowadays, this field is attracting the attention and investment of companies aiming to develop commercial devices with First Person Vision recording capabilities. Owing to this interest, an increasing demand for methods to process these videos, possibly in real time, is expected. Current approaches present particular combinations of image features and quantitative methods to accomplish specific objectives such as object detection, activity recognition, and user-machine interaction. This paper summarizes the evolution of the state of the art in First Person Vision video analysis between 1997 and 2014, highlighting, among other things, the most commonly used features, methods, challenges, and opportunities within the field.
Keywords: First Person Vision, Egocentric Vision, Wearable Devices, Smart Glasses, Computer Vision, Video Analytics, Human-Machine Interaction
Robust Signal Processing Techniques for Wearable Inertial Measurement Unit (IMU) Sensors
Activity and gesture recognition using wearable motion sensors, also known as inertial measurement units (IMUs), provides important context for many ubiquitous sensing applications, including healthcare monitoring, human-computer interfaces, and context-aware smart homes and offices. Such systems are gaining popularity due to their minimal cost and their ability to provide sensing functionality at any time and place. However, several factors can affect system performance, such as sensor location and orientation displacement, activity and gesture inconsistency, movement speed variation, and the lack of fine motion information.
This research is focused on developing signal processing solutions that ensure system robustness with respect to these factors. Firstly, for existing systems that have already been designed to work with a certain sensor orientation and location, this research proposes opportunistic calibration algorithms that leverage camera information from the environment to ensure the system performs correctly despite location or orientation displacement of the sensors. The calibration algorithms require no extra effort from the users; calibration is done seamlessly when the users appear in front of an environmental camera and perform arbitrary movements. Secondly, an orientation-independent and speed-independent approach is proposed and studied by exploring a novel orientation-independent feature set and by intelligently selecting only the relevant and consistent portions of various activities and gestures. Thirdly, to address the challenge that an IMU cannot capture the tiny motions that matter to some applications, a sensor fusion framework is proposed that fuses complementary sensor modalities to enhance system performance and robustness. For example, American Sign Language has a large vocabulary of signs, and a recognition system based solely on IMU sensors would not perform very well. To demonstrate the feasibility of sensor fusion techniques, a robust real-time American Sign Language recognition approach is developed using wrist-worn IMU and surface electromyography (EMG) sensors.
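The dissertation's exact feature set is not given in the abstract, but the core idea behind orientation-independent IMU features can be sketched: the magnitude of the acceleration vector is unchanged by any rotation of the sensor frame, so windowed statistics of it survive orientation displacement. The function name and window length below are illustrative assumptions.

```python
import numpy as np

def orientation_invariant_features(acc, win=50):
    """Sketch of an orientation-independent IMU representation.

    acc: (T, 3) array of accelerometer samples in the sensor frame.
    |a| = sqrt(ax^2 + ay^2 + az^2) is invariant to any rotation of the
    sensor, so mean/std of the magnitude per window do not depend on
    how the device was mounted.
    """
    mag = np.linalg.norm(acc, axis=1)          # rotation-invariant magnitude
    n = len(mag) // win                        # non-overlapping windows
    mags = mag[: n * win].reshape(n, win)
    return np.column_stack([mags.mean(axis=1), mags.std(axis=1)])

# Sanity check: rotating the raw signal leaves the features unchanged
rng = np.random.default_rng(0)
acc = rng.normal(size=(200, 3))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
f_original = orientation_invariant_features(acc)
f_rotated = orientation_invariant_features(acc @ R.T)
```

The same invariance argument extends to the gyroscope magnitude; richer invariant features would be needed in practice, since the magnitude alone discards directional information.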
Visual on-line learning in distributed camera networks
Automatic detection of persons is an important application in visual surveillance. In general, state-of-the-art systems have two main disadvantages. First, a general detector usually has to be learned that is applicable to a wide range of scenes; the training is therefore time-consuming and requires a huge amount of labeled data. Second, the data is usually processed centrally, which leads to heavy network traffic. The goal of this paper is to overcome these problems with a person detection system based on distributed smart cameras (DSCs). Assuming a large number of cameras with partly overlapping views, the main idea is to reduce the model complexity of the detector by training a specific detector for each camera. These detectors are initialized by a pre-trained classifier that is then adapted to a specific camera by co-training. In particular, for co-training we apply an on-line learning method (boosting for feature selection), where the information exchange is realized by mapping the overlapping views onto each other using a homography. This yields a compact, scene-dependent representation that allows the classifiers to be trained and evaluated on an embedded device. Moreover, since the information transfer is reduced to exchanging positions, the required network traffic is minimal. The power of the approach is demonstrated in various experiments on different publicly available data sets. In fact, we show that on-line learning and applying DSCs can benefit from each other.
Index Terms: visual on-line learning, object detection, multi-camera networks
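The position exchange described above can be sketched concretely: a detection in one camera is mapped into a second camera's image plane by a planar homography. This is a generic illustration of homography transfer, not the paper's implementation; the 3x3 matrix below is a made-up example, whereas in practice it would be estimated from scene correspondences (e.g. points on the ground plane).

```python
import numpy as np

def transfer_point(H, pt):
    """Map an image point between overlapping views via homography H.

    pt is an (x, y) pixel position; H is a 3x3 homography. The point is
    lifted to homogeneous coordinates, transformed, and dehomogenized,
    which is how a detection position can be shared between cameras
    instead of sending image data.
    """
    p = np.array([pt[0], pt[1], 1.0])
    q = H @ p
    return q[:2] / q[2]        # back to inhomogeneous pixel coordinates

# Hypothetical homography: scale by 2 and translate by (10, 5)
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0,  5.0],
              [0.0, 0.0,  1.0]])
mapped = transfer_point(H, (3.0, 4.0))
```

Exchanging only such (x, y) positions, rather than image patches, is what keeps the network traffic of the co-training scheme minimal.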
Sensor fusion in smart camera networks for ambient intelligence
This short report introduces the topics of PhD research that was conducted in 2008-2013 and defended in July 2013. The PhD thesis covers sensor fusion theory, gathers it into a framework with design rules for fusion-friendly design of vision networks, and elaborates on the rules through fusion experiments performed with four distinct applications of Ambient Intelligence.
Practical and Rich User Digitization
A long-standing vision in computer science has been to evolve computing devices into proactive assistants that enhance our productivity, health and wellness, and many other facets of our lives. User digitization is crucial in achieving this vision, as it allows computers to intimately understand their users, capturing activity, pose, routine, and behavior. Today's consumer devices, like smartphones and smartwatches, provide a glimpse of this potential, offering coarse digital representations of users with metrics such as step count, heart rate, and a handful of human activities like running and biking. Even these very low-dimensional representations are already bringing value to millions of people's lives, but there is significant potential for improvement. At the other end, professional, high-fidelity, comprehensive user digitization systems exist: for example, motion capture suits and multi-camera rigs that digitize our full body and appearance, and scanning machines such as MRI that capture our detailed anatomy. However, these carry significant practicality burdens for the user, including financial, privacy, ergonomic, aesthetic, and instrumentation considerations, that preclude consumer use. In general, the higher the fidelity of capture, the lower the user's practicality. Most conventional approaches strike a balance between user practicality and digitization fidelity.
My research aims to break this trend, developing sensing systems that increase user digitization fidelity to create new and powerful computing experiences while retaining or even improving user practicality and accessibility, allowing such technologies to have a societal impact. Armed with such knowledge, our future devices could offer longitudinal health tracking, more productive work environments, full-body avatars in extended reality, and embodied telepresence experiences, to name just a few domains. (PhD thesis)