
    Evaluation of Background Subtraction Algorithms with Post-processing

    Processing a video stream to segment foreground objects from the background is a critical first step in many computer vision applications. Background subtraction (BGS) is a commonly used technique for achieving this segmentation. The popularity of BGS largely comes from its computational efficiency, which allows applications such as human-computer interaction, video surveillance, and traffic monitoring to meet their real-time goals. Numerous BGS algorithms and a number of post-processing techniques that aim to improve the results of these algorithms have been proposed. In this paper, we evaluate several popular, state-of-the-art BGS algorithms and examine how post-processing techniques affect their performance. Our experimental results demonstrate that post-processing techniques can significantly improve the foreground segmentation masks produced by a BGS algorithm. We provide recommendations for achieving robust foreground segmentation based on the lessons learned from this comparative study.
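    The abstract does not name the specific algorithms or post-processing steps evaluated; the sketch below, using OpenCV's MOG2 subtractor followed by median filtering and morphological opening/closing, is only meant to illustrate the kind of BGS-plus-post-processing chain the paper studies.

```python
# Illustrative sketch (not the paper's exact setup): a BGS algorithm
# followed by typical post-processing of the raw foreground mask.
import cv2

def segment_foreground(video_path):
    cap = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        raw_mask = subtractor.apply(frame)              # raw BGS output
        _, mask = cv2.threshold(raw_mask, 200, 255,     # drop shadow pixels
                                cv2.THRESH_BINARY)      # (marked as 127)
        mask = cv2.medianBlur(mask, 5)                  # remove salt noise
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # erase specks
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill holes
        yield frame, mask
    cap.release()

# Usage (hypothetical file name):
# for frame, mask in segment_foreground("traffic.mp4"):
#     cv2.imshow("foreground", mask)
```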

    Neural Networks for Computer-Human Interfaces: Glove-TalkII

    Glove-TalkII is a system with an adaptive interface built from neural networks. Glove-TalkII maps hand gestures continuously to 10 control parameters of a parallel formant speech synthesizer. The mapping allows the hand to act as an artificial vocal tract that produces speech in real time, giving an unlimited vocabulary and unlimited control of fundamental frequency and volume. The best version of Glove-TalkII uses several input devices (including a CyberGlove, a ContactGlove, a 3-space tracker, and a foot-pedal), a parallel formant speech synthesizer, and 3 neural networks. The gesture-to-speech task is divided into vowel and consonant production by using a gating network to weight the outputs of a vowel and a consonant neural network. The gating network and the consonant network are trained with examples from the user. The vowel network implements a fixed, user-defined relationship between hand-position and vowel sound and does not require any training examples from the user. Volume, fundamental frequency, and stop consonants are produced with a fixed mapping from the input devices. One subject has trained to speak intelligibly with Glove-TalkII. He speaks slowly, with speech quality similar to a text-to-speech synthesizer but with far more natural-sounding pitch variations. Characteristics of the neural networks both enhance and detract from control intimacy.
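    As a concrete illustration of the gating scheme described above, the following minimal sketch blends the outputs of a vowel network and a consonant network with a scalar gate. The layer sizes, the tanh hidden layer, and the 16-dimensional hand-feature vector are assumptions for illustration, not the original Glove-TalkII networks.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer network: hand features -> outputs."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gesture_to_controls(x, vowel_net, consonant_net, gate_net):
    v = mlp(x, *vowel_net)           # 10 vowel synthesizer controls
    c = mlp(x, *consonant_net)       # 10 consonant synthesizer controls
    g = sigmoid(mlp(x, *gate_net))   # scalar weight in (0, 1)
    return g * v + (1.0 - g) * c     # blended formant parameters

# Toy random weights: 16 hand features -> 10 controls, plus a gate.
rng = np.random.default_rng(0)
def random_net(n_in, n_hidden, n_out):
    return (rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(size=(n_hidden, n_out)), np.zeros(n_out))

controls = gesture_to_controls(rng.normal(size=16),
                               random_net(16, 12, 10),
                               random_net(16, 12, 10),
                               random_net(16, 4, 1))
```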

    Glove-TalkII: Mapping Hand Gestures to Speech Using Neural Networks - An Approach to Building Adaptive Interfaces

    Glove-TalkII is a system which translates hand gestures to speech through an adaptive interface. Hand gestures are mapped continuously to 10 control parameters of a parallel formant speech synthesizer. The mapping allows the hand to act as an artificial vocal tract that produces speech in real time. This gives an unlimited vocabulary in addition to direct control of fundamental frequency and volume. Currently, the best version of Glove-TalkII uses several input devices (including a CyberGlove, a 3-space tracker, a keyboard and a foot-pedal), a parallel formant speech synthesizer and 3 neural networks. The gesture-to-speech task is divided into vowel and consonant production by using a gating network to weight the outputs of a vowel and a consonant neural network. The gating network and the consonant network are trained with examples from the user. The vowel network implements a fixed, user-defined relationship between hand-position and vowel sound and does not require any training examples from the user.

    OpenVL: Towards A Novel Software Architecture for Computer Vision

    This paper presents our progress on OpenVL, a novel software architecture that addresses efficiency (by facilitating hardware acceleration), reusability, and scalability for computer vision. A logical image understanding pipeline is introduced to allow parallel processing. We also discuss our middleware, VLUT, which enables applications to operate transparently over a heterogeneous collection of hardware implementations. OpenVL works as a state machine, with an event-driven mechanism that provides users with application-level interaction. Various explicit and implicit synchronization and communication methods are supported among distributed processes in the logical pipelines. The intent of OpenVL is to allow users to quickly and easily recover useful information from multiple scenes, across various software environments and hardware platforms. We implement two different human tracking systems to validate the critical underlying concepts of OpenVL.
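    Since the abstract describes the architecture only at a high level, the following is a hypothetical sketch of a logical pipeline with event-driven, application-level callbacks. The class and method names (Pipeline, add_stage, on_event) are illustrative inventions, not the actual OpenVL/VLUT API.

```python
from typing import Any, Callable, Dict, List

class Pipeline:
    """Toy logical pipeline with event-driven callbacks."""

    def __init__(self) -> None:
        self.stages: List[Callable] = []
        self.handlers: Dict[str, List[Callable]] = {}

    def add_stage(self, stage: Callable) -> None:
        self.stages.append(stage)            # e.g. detect, track, classify

    def on_event(self, name: str, handler: Callable) -> None:
        self.handlers.setdefault(name, []).append(handler)

    def emit(self, name: str, payload: Any) -> None:
        for handler in self.handlers.get(name, []):
            handler(payload)                 # application-level interaction

    def process(self, frame: Any) -> Any:
        data = frame
        for stage in self.stages:            # logical order; a backend
            data = stage(data, self.emit)    # could run stages in parallel
        return data

# Usage sketch: react when a hypothetical tracking stage finds a person.
pipeline = Pipeline()
pipeline.add_stage(lambda img, emit: img)    # stand-in processing stage
pipeline.on_event("person_found", lambda box: print("person at", box))
```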

    Glove-Talk: A neural network interface between a data-glove and a speech synthesizer

    To illustrate the potential of multilayer neural networks for adaptive interfaces, we used a VPL DataGlove connected to a DECtalk speech synthesizer via five neural networks to implement a hand-gesture-to-speech system. Using minor variations of the standard back-propagation learning procedure, the complex mapping of hand movements to speech is learned using data obtained from a single "speaker" in a simple training phase. With a 203-word gesture-to-word vocabulary, the wrong word is produced less than 1% of the time, and no word is produced about 5% of the time. Adaptive control of the speaking rate and word stress is also available. The training times and final performance speed are improved by using small, separate networks for each naturally defined subtask. The system demonstrates that neural networks can be used to develop the complex mappings required in a high-bandwidth interface that adapts to the individual user.
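    To make the closing point concrete, that small, separate networks per naturally defined subtask train faster than one monolithic network, here is a sketch using scikit-learn MLPs in place of the original hand-rolled back-propagation nets; the feature split and layer sizes are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor

# Separate small nets, one per subtask: a classifier picks the word
# from hand-shape features, and a regressor maps wrist motion to
# speaking rate. Each net sees only the inputs relevant to its task.
word_net = MLPClassifier(hidden_layer_sizes=(40,), max_iter=2000)
rate_net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000)

def train(shape_feats, words, motion_feats, rates):
    word_net.fit(shape_feats, words)      # gesture -> word class
    rate_net.fit(motion_feats, rates)     # motion -> speaking rate

def gesture_to_speech(shape_feat, motion_feat):
    word = word_net.predict(shape_feat.reshape(1, -1))[0]
    rate = rate_net.predict(motion_feat.reshape(1, -1))[0]
    return word, rate                     # passed on to the synthesizer
```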

    Design of Virtual 3D Instruments for Musical Interaction

    An environment for designing virtual instruments with 3D geometry has been prototyped and applied to real-time sound control and design. It was implemented by extending a real-time visual programming language called Max/FTS, running on an SGI Onyx, with software objects to interface CyberGloves and Polhemus sensors and to compute human movement and virtual object features. Virtual input devices with the behaviours of a rubber balloon and a rubber sheet were designed for the control of sound spatialization and timbre parameters. Informal evaluation showed that a sonification inspired by the physical world appears natural and effective. More research is required for a natural sonification of virtual input device features such as shape, taking into account possible co-articulation of these features. While both hands can be used for manipulation, left-hand-only interaction with a virtual instrument may be a useful replacement for and extension of the standard music synthesizer keyboard modulation wheel...
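    The published mapping lives inside Max/FTS, so the sketch below only outlines the idea: geometric features of a virtual balloon deformed by the tracked hand drive spatialization and timbre parameters. The feature set and the particular mapping are assumptions, not the paper's design.

```python
import numpy as np

def balloon_features(vertices, center):
    """Compute simple shape features of a deformed virtual balloon."""
    radii = np.linalg.norm(vertices - center, axis=1)
    return {
        "mean_radius": radii.mean(),   # overall size
        "deformation": radii.std(),    # deviation from a sphere
        "position": center,            # where the balloon sits in space
    }

def features_to_sound(feat):
    """Map shape features to sound parameters (illustrative mapping)."""
    return {
        # louder and brighter as the balloon is squeezed smaller
        "volume": float(np.clip(1.5 - feat["mean_radius"], 0.0, 1.0)),
        "brightness": float(np.clip(feat["deformation"] * 4.0, 0.0, 1.0)),
        # pan the sound with the balloon's horizontal position
        "pan": float(np.clip(feat["position"][0], -1.0, 1.0)),
    }

verts = np.random.default_rng(1).normal(size=(100, 3)) * 0.5
print(features_to_sound(balloon_features(verts, np.zeros(3))))
```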