5,924 research outputs found

    Exploring Vision-Based Interfaces: How to Use Your Head in Dual Pointing Tasks

    Get PDF
    The utility of vision-based face tracking for dual pointing tasks is evaluated. We first describe a 3-D face tracking technique based on real-time parametric motion-stereo, which is non-invasive, robust, and self-initialized. The tracker provides a real-time estimate of a ?frontal face ray? whose intersection with the display surface plane is used as a second stream of input for scrolling or pointing, in paral-lel with hand input. We evaluated the performance of com-bined head/hand input on a box selection and coloring task: users selected boxes with one pointer and colors with a second pointer, or performed both tasks with a single pointer. We found that performance with head and one hand was intermediate between single hand performance and dual hand performance. Our results are consistent with previously reported dual hand conflict in symmetric pointing tasks, and suggest that a head-based input stream should be used for asymmetric control

    Sensory processing and world modeling for an active ranging device

    Get PDF
    In this project, we studied world modeling and sensory processing for laser range data. World Model data representation and operation were defined. Sensory processing algorithms for point processing and linear feature detection were designed and implemented. The interface between world modeling and sensory processing in the Servo and Primitive levels was investigated and implemented. In the primitive level, linear features detectors for edges were also implemented, analyzed and compared. The existing world model representations is surveyed. Also presented is the design and implementation of the Y-frame model, a hierarchical world model. The interfaces between the world model module and the sensory processing module are discussed as well as the linear feature detectors that were designed and implemented

    A Customizable Camera-based Human Computer Interaction System Allowing People With Disabilities Autonomous Hands Free Navigation of Multiple Computing Task

    Full text link
    Many people suffer from conditions that lead to deterioration of motor control and makes access to the computer using traditional input devices difficult. In particular, they may loose control of hand movement to the extent that the standard mouse cannot be used as a pointing device. Most current alternatives use markers or specialized hardware to track and translate a user's movement to pointer movement. These approaches may be perceived as intrusive, for example, wearable devices. Camera-based assistive systems that use visual tracking of features on the user's body often require cumbersome manual adjustment. This paper introduces an enhanced computer vision based strategy where features, for example on a user's face, viewed through an inexpensive USB camera, are tracked and translated to pointer movement. The main contributions of this paper are (1) enhancing a video based interface with a mechanism for mapping feature movement to pointer movement, which allows users to navigate to all areas of the screen even with very limited physical movement, and (2) providing a customizable, hierarchical navigation framework for human computer interaction (HCI). This framework provides effective use of the vision-based interface system for accessing multiple applications in an autonomous setting. Experiments with several users show the effectiveness of the mapping strategy and its usage within the application framework as a practical tool for desktop users with disabilities.National Science Foundation (IIS-0093367, IIS-0329009, 0202067

    Architecture and applications of the FingerMouse: a smart stereo camera for wearable computing HCI

    Get PDF
    In this paper we present a visual input HCI system for wearable computers, the FingerMouse. It is a fully integrated stereo camera and vision processing system, with a specifically designed ASIC performing stereo block matching at 5Mpixel/s (e.g. QVGA 320×240at 30fps) and a disparity range of 47, consuming 187mW (78mW in the ASIC). It is button-sized (43mm×18mm) and can be worn on the body, capturing the user's hand and processing in real-time its coordinates as well as a 1-bit image of the hand segmented from the background. Alternatively, the system serves as a smart depth camera, delivering foreground segmentation and tracking, depth maps and standard images, with a processing latency smaller than 1ms. This paper describes the FingerMouse functionality and its applications, and how the specific architecture outperforms other systems in size, latency and power consumptio

    Simultaneous localization and map-building using active vision

    No full text
    An active approach to sensing can provide the focused measurement capability over a wide field of view which allows correctly formulated Simultaneous Localization and Map-Building (SLAM) to be implemented with vision, permitting repeatable long-term localization using only naturally occurring, automatically-detected features. In this paper, we present the first example of a general system for autonomous localization using active vision, enabled here by a high-performance stereo head, addressing such issues as uncertainty-based measurement selection, automatic map-maintenance, and goal-directed steering. We present varied real-time experiments in a complex environment.Published versio

    Kinect Range Sensing: Structured-Light versus Time-of-Flight Kinect

    Full text link
    Recently, the new Kinect One has been issued by Microsoft, providing the next generation of real-time range sensing devices based on the Time-of-Flight (ToF) principle. As the first Kinect version was using a structured light approach, one would expect various differences in the characteristics of the range data delivered by both devices. This paper presents a detailed and in-depth comparison between both devices. In order to conduct the comparison, we propose a framework of seven different experimental setups, which is a generic basis for evaluating range cameras such as Kinect. The experiments have been designed with the goal to capture individual effects of the Kinect devices as isolatedly as possible and in a way, that they can also be adopted, in order to apply them to any other range sensing device. The overall goal of this paper is to provide a solid insight into the pros and cons of either device. Thus, scientists that are interested in using Kinect range sensing cameras in their specific application scenario can directly assess the expected, specific benefits and potential problem of either device.Comment: 58 pages, 23 figures. Accepted for publication in Computer Vision and Image Understanding (CVIU

    Intelligent Agent Architectures: Reactive Planning Testbed

    Get PDF
    An Integrated Agent Architecture (IAA) is a framework or paradigm for constructing intelligent agents. Intelligent agents are collections of sensors, computers, and effectors that interact with their environments in real time in goal-directed ways. Because of the complexity involved in designing intelligent agents, it has been found useful to approach the construction of agents with some organizing principle, theory, or paradigm that gives shape to the agent's components and structures their relationships. Given the wide variety of approaches being taken in the field, the question naturally arises: Is there a way to compare and evaluate these approaches? The purpose of the present work is to develop common benchmark tasks and evaluation metrics to which intelligent agents, including complex robotic agents, constructed using various architectural approaches can be subjected

    Direct interaction with large displays through monocular computer vision

    Get PDF
    Large displays are everywhere, and have been shown to provide higher productivity gain and user satisfaction compared to traditional desktop monitors. The computer mouse remains the most common input tool for users to interact with these larger displays. Much effort has been made on making this interaction more natural and more intuitive for the user. The use of computer vision for this purpose has been well researched as it provides freedom and mobility to the user and allows them to interact at a distance. Interaction that relies on monocular computer vision, however, has not been well researched, particularly when used for depth information recovery. This thesis aims to investigate the feasibility of using monocular computer vision to allow bare-hand interaction with large display systems from a distance. By taking into account the location of the user and the interaction area available, a dynamic virtual touchscreen can be estimated between the display and the user. In the process, theories and techniques that make interaction with computer display as easy as pointing to real world objects is explored. Studies were conducted to investigate the way human point at objects naturally with their hand and to examine the inadequacy in existing pointing systems. Models that underpin the pointing strategy used in many of the previous interactive systems were formalized. A proof-of-concept prototype is built and evaluated from various user studies. Results from this thesis suggested that it is possible to allow natural user interaction with large displays using low-cost monocular computer vision. Furthermore, models developed and lessons learnt in this research can assist designers to develop more accurate and natural interactive systems that make use of human’s natural pointing behaviours

    Intraoperative Endoscopic Augmented Reality in Third Ventriculostomy

    Get PDF
    In neurosurgery, as a result of the brain-shift, the preoperative patient models used as a intraoperative reference change. A meaningful use of the preoperative virtual models during the operation requires for a model update. The NEAR project, Neuroendoscopy towards Augmented Reality, describes a new camera calibration model for high distorted lenses and introduces the concept of active endoscopes endowed with with navigation, camera calibration, augmented reality and triangulation modules
    corecore