
    Spatially Aware Computing for Natural Interaction

    Spatial information refers to the location of an object in a physical or digital world, as well as its position relative to the other objects around it. In this dissertation, three systems are designed and developed, each applying spatial information in a different field; the ultimate goal is to increase the user-friendliness and efficiency of those applications by utilizing spatial information. The first system is a novel Web page data extraction application that uses 2D spatial information to discover structured records in a Web page. The extracted information is useful for re-organizing the layout of a Web page to fit mobile browsing. The second application utilizes the 3D spatial information of a mobile device within a large paper-based workspace to implement interactive paper that combines the merits of paper documents and mobile devices; it can overlay digital information on top of a paper document based on the location of the mobile device within the workspace. The third application further integrates 3D spatial information with sound detection to realize an automatic camera management system, which controls multiple cameras in a conference room and creates an engaging video by intelligently switching camera shots among meeting participants based on their activities. Evaluations have been made on all three applications, and the results are promising. In summary, this dissertation comprehensively explores the usage of spatial information in various applications to improve their usability.
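    As a rough illustration of the first application's idea, discovering repeated records from the 2D layout of a rendered page, the Python sketch below groups elements whose bounding boxes share a left edge and width. The element model, thresholds, and grouping rule are illustrative assumptions, not the dissertation's actual algorithm.

```python
# Minimal sketch (assumed data model, not the dissertation's method): group
# rendered page elements into candidate records by 2D layout similarity.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Element:
    tag: str
    x: float   # left edge of the rendered bounding box
    y: float   # top edge
    w: float   # width
    h: float   # height

def group_by_layout(elements, x_tol=5.0, w_tol=10.0):
    """Cluster elements that share a left edge and width, a common cue that
    they are repeated records laid out in a column."""
    buckets = defaultdict(list)
    for el in elements:
        key = (round(el.x / x_tol), round(el.w / w_tol))
        buckets[key].append(el)
    # Keep only groups with several aligned members, sorted top-to-bottom.
    return [sorted(g, key=lambda e: e.y) for g in buckets.values() if len(g) >= 3]
```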

    Automatic visual detection of human behavior: a review from 2000 to 2014

    Due to advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video has become a very active research topic. In this paper, we perform a systematic literature review on this topic, covering the period from 2000 to 2014 and a selection of 193 papers retrieved from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research for designing automatic visual human behavior detection systems. This work is funded by the Portuguese Foundation for Science and Technology (FCT, Fundação para a Ciência e a Tecnologia) under research grant SFRH/BD/84939/2012.

    Virtual Reality Applications and Development

    Virtual Reality (VR) has existed for many years; however, it has only recently gained widespread popularity and commercial use. This change comes from innovations in head-mounted displays (HMDs) and from the work of many software engineers crafting quality user experiences (UX). In this thesis, four areas within VR are explored. The first area of research is the use of VR for virtual environments and fire simulations. The second is the use of VR for eye tracking and medical simulations. The third is multiplayer development for more immersive collaborative simulations. The fourth is the development of typing in 3D for virtual reality. Extending from this final area of research, the thesis presents an application that offers more practical and granular detail about developing for VR using the real-time development platform Unity.

    Low Cost Open Source Modal Virtual Environment Interfaces Using Full Body Motion Tracking and Hand Gesture Recognition

    Virtual environments provide insightful and meaningful ways to explore data sets through immersive experiences. One of the ways immersion is achieved is through natural interaction methods instead of only a keyboard and mouse. Intuitive tracking systems for natural interfaces suitable for such environments are often expensive. Recently, however, devices such as gesture-tracking gloves and skeletal tracking systems have emerged in the consumer market. This project integrates gestural interfaces into an open source virtual reality toolkit using consumer-grade input devices and generates a set of tools to enable multimodal gestural interface creation. The AnthroTronix AcceleGlove is used to augment body tracking data from a Microsoft Kinect with fine-grained hand gesture data. The tools are found to be useful, as a sample gestural interface is implemented using them. The project concludes by suggesting studies targeting gestural interfaces using such devices, as well as other areas for further research.
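    A minimal sketch of the kind of multimodal fusion described above, combining a coarse body pose from the Kinect skeleton with a fine-grained hand pose from the glove, might look like the following Python. The data structures and the gesture-to-command mapping are assumptions for illustration; the project's actual toolkit API is not shown here.

```python
# Illustrative sketch only: fuse a coarse Kinect body pose with a fine-grained
# glove hand pose into a single interface command. All types are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SkeletonFrame:
    right_hand_raised: bool   # derived from Kinect joint positions

@dataclass
class GloveFrame:
    finger_pose: str          # e.g. "fist", "point", "open", from glove sensors

def fuse_gesture(skeleton: SkeletonFrame, glove: GloveFrame) -> Optional[str]:
    """Map a (body pose, hand pose) pair to an interface command."""
    if skeleton.right_hand_raised and glove.finger_pose == "point":
        return "select"
    if skeleton.right_hand_raised and glove.finger_pose == "open":
        return "grab"
    return None
```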

    Group Action Recognition Using Space-Time Interest Points

    Group action recognition is a challenging task in computer vision due to the large complexity induced by multiple motion patterns. This paper aims at analyzing group actions in video clips containing several activities. We combine the probability summation framework with space-time (ST) interest points for this task. First, ST interest points are extracted from video clips to form the feature space. Then we use k-means for feature clustering and build a compact representation, which is then used for group action classification. The proposed approach has been applied to classification tasks including four classes: badminton, tennis, basketball, and soccer videos. The experimental results demonstrate the advantages of the proposed approach.
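    The clustering step described above is essentially a bag-of-words representation. A minimal Python sketch, assuming the ST interest point descriptors have already been extracted and using k-means from scikit-learn, could look like the following; the array shapes and vocabulary size are illustrative, not taken from the paper.

```python
# Minimal bag-of-words sketch: cluster space-time interest point descriptors
# with k-means, then represent each clip as a histogram of cluster assignments.
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(all_descriptors: np.ndarray, k: int = 200) -> KMeans:
    """Fit a visual vocabulary on ST interest point descriptors
    (one row per detected interest point)."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_descriptors)

def clip_histogram(codebook: KMeans, clip_descriptors: np.ndarray) -> np.ndarray:
    """Quantize one clip's descriptors and return a normalized histogram,
    the compact representation used for classification."""
    words = codebook.predict(clip_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```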

    An Overview of Self-Adaptive Technologies Within Virtual Reality Training

    This overview presents the current state of the art of self-adaptive technologies within virtual reality (VR) training. Virtual reality training and assessment is increasingly used in five key areas: medical, industrial and commercial training, serious games, rehabilitation, and remote training such as Massive Open Online Courses (MOOCs). Adaptation can be applied to five core technologies of VR: haptic devices, stereo graphics, adaptive content, assessment, and autonomous agents. Automation of VR training can contribute to automation of actual procedures, including remote and robot-assisted surgery, which reduces injury and improves the accuracy of the procedure. Automated haptic interaction can enable tele-presence and tactile interaction with virtual artefacts from either remote or simulated environments. Automation, machine learning and data-driven features play an important role in providing trainee-specific, individually adaptive training content. Data from trainee assessment can form an input to autonomous systems for customised training and automated difficulty levels that match individual requirements. Self-adaptive technology has previously been developed within individual technologies of VR training. One conclusion of this review is that an enhanced, portable framework combining automation of these core technologies does not yet exist, and that producing such a reusable automation framework for VR training would be beneficial.

    A new pose-based representation for recognizing actions from multiple cameras

    We address the problem of recognizing actions from arbitrary views for a multi-camera system. We argue that poses are important for understanding human actions and that the strength of the pose representation affects the overall performance of the action recognition system. Based on this idea, we present a new view-independent representation for human poses. Assuming that the data is initially provided in the form of volumetric data, the volume of the human body is first divided into a sequence of horizontal layers, and the intersections of the body segments with each layer are coded with enclosing circles. The circular features in all layers, namely (i) the number of circles, (ii) the area of the outer circle, and (iii) the area of the inner circle, are then used to generate a pose descriptor. The pose descriptors of all frames in an action sequence are further combined to generate corresponding motion descriptors. Action recognition is then performed with a simple nearest neighbor classifier. Experiments performed on the benchmark IXMAS multi-view dataset demonstrate that the performance of our method is comparable to other methods in the literature.
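    A rough Python sketch of the per-layer features is given below. The abstract does not specify how the outer and inner circles are computed, so the stand-ins used here (an enclosing circle of all voxels in a layer versus that of its largest connected segment) are assumptions for illustration only.

```python
# Rough sketch of a per-layer pose descriptor in the spirit described above;
# the circle definitions below are assumptions, not the paper's exact method.
import numpy as np
from scipy import ndimage

def enclosing_circle_area(points: np.ndarray) -> float:
    """Approximate enclosing circle area: farthest point from the centroid is
    used as the radius (a cheap stand-in for a true minimum enclosing circle)."""
    if len(points) == 0:
        return 0.0
    center = points.mean(axis=0)
    radius = np.linalg.norm(points - center, axis=1).max()
    return float(np.pi * radius ** 2)

def layer_features(volume: np.ndarray) -> np.ndarray:
    """volume: binary voxel grid shaped (layers, height, width).
    Returns one (n_circles, outer_area, inner_area) triple per horizontal layer."""
    feats = []
    for layer in volume:
        labels, n = ndimage.label(layer)
        outer = enclosing_circle_area(np.argwhere(layer))
        inner = 0.0
        if n > 0:
            # "Inner" stand-in: enclosing circle of the largest connected segment.
            sizes = ndimage.sum(layer, labels, index=range(1, n + 1))
            biggest = int(np.argmax(sizes)) + 1
            inner = enclosing_circle_area(np.argwhere(labels == biggest))
        feats.append((n, outer, inner))
    return np.asarray(feats)
```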

    Multi-sensor fusion for human-robot interaction in crowded environments

    Robot assistants are becoming a promising solution to the challenges associated with the ageing population. Human-Robot Interaction (HRI) allows a robot to understand the intentions of humans in an environment and react accordingly. This thesis proposes HRI techniques to facilitate the transition of robots from lab-based research to real-world environments. The HRI aspects addressed in this thesis are illustrated in the following scenario: an elderly person, engaged in conversation with friends, wishes to attract a robot's attention. This composite task consists of many problems. The robot must detect and track the subject in a crowded environment. To engage with the user, it must track their hand movement. Knowledge of the subject's gaze would ensure that the robot does not react to the wrong person. Understanding the subject's group participation would enable the robot to respect existing human-human interaction. Many existing solutions to these problems are too constrained for natural HRI in crowded environments: some require initial calibration or static backgrounds, and others deal poorly with occlusions, illumination changes, or real-time operation requirements. This work proposes algorithms that fuse multiple sensors to remove these restrictions and increase accuracy over the state of the art. The main contributions of this thesis are: a hand and body detection method, with a probabilistic algorithm for their real-time association when multiple users and hands are detected in crowded environments; an RGB-D sensor-fusion hand tracker, which increases position and velocity accuracy by combining a depth-image based hand detector with Monte-Carlo updates using colour images; a sensor-fusion gaze estimation system, combining IR and depth cameras on a mobile robot to give better accuracy than traditional visual methods, without the constraints of traditional IR techniques; and a group detection method, based on sociological concepts of static and dynamic interactions, which incorporates real-time gaze estimates to enhance detection accuracy.
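    As an illustration of the Monte-Carlo fusion idea behind the RGB-D hand tracker, the sketch below re-weights particles with a colour-image likelihood and corrects them with depth-based detections. The class, likelihood interface, and parameters are assumptions for illustration, not the thesis's implementation.

```python
# Illustrative particle-filter sketch: colour-likelihood weighting plus
# correction from depth-based hand detections. Interfaces are hypothetical.
import numpy as np

class HandParticleFilter:
    def __init__(self, n=500, noise=0.01):
        # State per particle: (x, y, z, vx, vy, vz)
        self.particles = np.zeros((n, 6))
        self.weights = np.full(n, 1.0 / n)
        self.noise = noise

    def predict(self, dt):
        """Constant-velocity motion model plus process noise."""
        self.particles[:, :3] += self.particles[:, 3:] * dt
        self.particles += np.random.normal(0.0, self.noise, self.particles.shape)

    def update(self, colour_likelihood):
        """Re-weight particles with a likelihood evaluated on the colour image
        (e.g. a skin-colour score at each particle's projected pixel)."""
        likes = np.array([colour_likelihood(p[:3]) for p in self.particles])
        self.weights *= likes
        total = self.weights.sum()
        n = len(self.particles)
        self.weights = self.weights / total if total > 0 else np.full(n, 1.0 / n)
        self._resample()

    def correct_with_depth_detection(self, detection, mix=0.3):
        """Pull a fraction of particles toward a depth-based hand detection."""
        n_reset = int(mix * len(self.particles))
        idx = np.random.choice(len(self.particles), n_reset, replace=False)
        self.particles[idx, :3] = detection + np.random.normal(0, self.noise, (n_reset, 3))

    def _resample(self):
        n = len(self.particles)
        idx = np.random.choice(n, n, p=self.weights)
        self.particles = self.particles[idx]
        self.weights = np.full(n, 1.0 / n)

    def estimate(self):
        return np.average(self.particles[:, :3], axis=0, weights=self.weights)
```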

    Recognition and Understanding of Meetings: Overview of the European AMI and AMIDA Projects

    The AMI and AMIDA projects are concerned with the recognition and interpretation of multiparty (face-to-face and remote) meetings. Within these projects we have developed the following: (1) an infrastructure for recording meetings using multiple microphones and cameras; (2) a one-hundred-hour, manually annotated meeting corpus; (3) a number of techniques for indexing and summarizing meeting videos using automatic speech recognition and computer vision; and (4) an extensible framework for browsing and searching meeting videos. We give an overview of the various techniques developed in AMI (mainly involving face-to-face meetings), their integration into our meeting browser framework, and future plans for AMIDA (Augmented Multiparty Interaction with Distant Access), the follow-up project to AMI. Technical and business information related to these two projects can be found at www.amiproject.org, on the Scientific and Business portals respectively.