4,922 research outputs found

    Recognition of Gestural Object Reference with Auditory Feedback

    Bax I, Bekel H, Heidemann G. Recognition of Gestural Object Reference with Auditory Feedback. In: Kaynak O, ed. Artificial Neural Networks and Neural Information Processing. Proceedings. Lecture Notes in Computer Science. Vol 2712. Berlin: Springer; 2003: 425-432.
    We present a cognitively motivated vision architecture for the evaluation of pointing gestures. The system views a scene of several structured objects and a pointing human hand. A neural classifier gives an estimate of the pointing direction, and the object correspondence is then established using a sub-symbolic representation of both the scene and the pointing direction. The system achieves high robustness because the result (the indicated location) does not primarily depend on the accuracy of the pointing direction classification. Instead, the scene is analysed for low-level saliency features to restrict the set of all possible pointing locations to a subset of highly likely locations. This transformation of the "continuous" pointing problem into a "discrete" one simultaneously facilitates auditory feedback whenever the object reference changes, which leads to significantly improved human-machine interaction.
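    The paper's architecture is not given as code; a minimal sketch of the core idea (restricting a noisy continuous pointing estimate to a small set of discrete salient locations, then picking the best match) might look as follows. All names and the cosine-similarity scoring rule are illustrative assumptions, not the paper's implementation:

    ```python
    import numpy as np

    def indicated_object(hand_pos, pointing_dir, salient_points):
        """Pick the salient scene location best matching a noisy pointing direction.

        hand_pos:       (2,) array, hand position in image coordinates
        pointing_dir:   (2,) unit vector, estimated pointing direction
        salient_points: (N, 2) array of candidate object locations
        Returns the index of the most likely indicated candidate.
        """
        offsets = salient_points - hand_pos
        norms = np.linalg.norm(offsets, axis=1)
        # Cosine similarity between the pointing direction and the
        # direction from the hand to each salient candidate:
        cos_sim = (offsets @ pointing_dir) / np.maximum(norms, 1e-9)
        return int(np.argmax(cos_sim))
    ```

    Because only the best of a few candidates must be selected, small errors in the estimated direction rarely change the winner, which is the robustness argument the abstract makes.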

    miMic: The microphone as a pencil

    miMic, a sonic analogue of paper and pencil, is proposed: an augmented microphone for vocal and gestural sonic sketching. Vocalizations are classified and interpreted as instances of sound models, which the user can play with by vocal and gestural control. The physical device is based on a modified microphone, with embedded inertial sensors and buttons. Sound models can be selected by vocal imitations that are automatically classified, and each model is mapped to vocal and gestural features for real-time control. With miMic, the sound designer can explore a vast sonic space and quickly produce expressive sonic sketches, which may be turned into sound prototypes by further adjustment of model parameters.

    Granular synthesis for display of time-varying probability densities

    We present a method for displaying time-varying probabilistic information to users using an asynchronous granular synthesis technique. We extend the basic synthesis technique to include distribution over waveform source, spatial position, pitch and time inside waveforms. To enhance the synthesis in interactive contexts, we "quicken" the display by integrating predictions of user behaviour into the sonification. This includes summing the derivatives of the distribution during exploration of static densities, and using Monte-Carlo sampling to predict future user states in nonlinear dynamic systems. These techniques can be used to improve user performance in continuous control systems and in the interactive exploration of high-dimensional spaces. This technique provides feedback on users' potential goals and their progress toward achieving them; modulating the feedback with quickening can help shape the user's actions toward achieving these goals. We have applied these techniques to a simple nonlinear control problem as well as to the sonification of on-line probabilistic gesture recognition. We are applying these displays to mobile, gestural interfaces, where visual display is often impractical. The granular synthesis approach is theoretically elegant and easily applied in contexts where dynamic probabilistic displays are required.
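    The granular display described above can be sketched in miniature: each draw from the distribution to be displayed is mapped to one grain's pitch and spatial position, and grain onsets arrive asynchronously (here with exponential inter-onset gaps). The mappings, names and constants are assumptions for illustration, not the authors' parameterisation:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def grain_stream(density_samples, n_grains, base_freq=220.0):
        """Map draws from a probability density to granular-synthesis parameters.

        density_samples: (n,) draws from the distribution being displayed
        Returns per-grain onset times (s), pitches (Hz) and stereo pans in [0, 1].
        """
        draws = rng.choice(density_samples, size=n_grains)
        # Asynchronous onsets: exponential gaps give a Poisson-like grain cloud.
        onsets = np.cumsum(rng.exponential(0.02, size=n_grains))
        pitches = base_freq * 2.0 ** draws          # draw -> pitch (octaves)
        pans = np.clip(0.5 + 0.5 * draws, 0.0, 1.0)  # draw -> spatial position
        return onsets, pitches, pans
    ```

    A time-varying density is displayed by calling this repeatedly on fresh samples, so the grain cloud's pitch and spatial spread track the distribution as it evolves.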

    Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed

    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture that revisits claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
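    The core tenet in Part 2 (word hypotheses driving updates of sound-to-phoneme probabilities) can be illustrated with a toy update rule. The additive-then-renormalize scheme, the learning rate and all names below are assumptions for illustration, not the paper's actual model:

    ```python
    from collections import defaultdict

    def update_accent_map(p_phoneme_given_sound, hypothesized_word,
                          heard_sounds, lr=0.1):
        """Nudge sound->phoneme probabilities toward a hypothesized word.

        p_phoneme_given_sound: dict sound -> dict phoneme -> probability
        hypothesized_word:     list of native phonemes the listener believes was said
        heard_sounds:          aligned list of acoustic categories actually heard
        """
        for sound, phoneme in zip(heard_sounds, hypothesized_word):
            dist = p_phoneme_given_sound.setdefault(sound, defaultdict(float))
            dist[phoneme] += lr
            total = sum(dist.values())
            for k in dist:              # renormalize to a probability distribution
                dist[k] /= total
        return p_phoneme_given_sound
    ```

    Repeated updates of this kind make the accented sound increasingly likely to be decoded as the intended native phoneme, which is how, on average, recognition of later words improves.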

    An empirical study of embodied music listening, and its applications in mediation technology


    Agency in mid-air interfaces

    Touchless interfaces allow users to view, control and manipulate digital content without physically touching an interface. They are being explored in a wide range of application scenarios, from medical surgery to car dashboard controllers. One aspect of touchless interaction that has not been explored to date is the Sense of Agency (SoA). The SoA refers to the subjective experience of voluntary control over actions in the external world. In this paper, we investigated the SoA in touchless systems using the intentional binding paradigm. We first compared touchless systems with physical interactions, and then augmented the interaction with different types of haptic feedback to explore how different outcome modalities influence users’ SoA. From our experiments, we demonstrated that an intentional binding effect is observed in both physical and touchless interactions, with no statistical difference. Additionally, we found that haptic and auditory feedback help to increase SoA compared with visual feedback in touchless interfaces. We discuss these findings and identify design opportunities that take agency into consideration.
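    Intentional binding is typically quantified as compression of the perceived interval between a voluntary action and its outcome relative to the actual interval. A toy computation of the effect over a set of trials might be (the names and the simple mean-error measure are illustrative, not the paper's analysis):

    ```python
    def binding_effect(estimated_ms, actual_ms):
        """Mean estimation error of the action-outcome interval, in ms.

        estimated_ms: per-trial interval estimates reported by a participant
        actual_ms:    the true action-outcome intervals for the same trials
        A negative result means intervals were underestimated on average,
        i.e. perceived compression consistent with intentional binding.
        """
        errors = [est - act for est, act in zip(estimated_ms, actual_ms)]
        return sum(errors) / len(errors)
    ```

    Comparing this quantity across conditions (physical vs. touchless, or haptic vs. auditory vs. visual outcomes) is the kind of contrast the experiments above report.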

    An Introduction to 3D User Interface Design

    3D user interface design is a critical component of any virtual environment (VE) application. In this paper, we present a broad overview of three-dimensional (3D) interaction and user interfaces. We discuss the effect of common VE hardware devices on user interaction, as well as interaction techniques for generic 3D tasks and the use of traditional two-dimensional interaction styles in 3D environments. We divide most user interaction tasks into three categories: navigation, selection/manipulation, and system control. Throughout the paper, our focus is on presenting not only the available techniques, but also practical guidelines for 3D interaction design and widely held myths. Finally, we briefly discuss two approaches to 3D interaction design, and some example applications with complex 3D interaction requirements. We also present an annotated online bibliography as a reference companion to this article.

    Ambient hues and audible cues: An approach to automotive user interface design using multi-modal feedback

    The use of touchscreen interfaces for in-vehicle information, entertainment, and the control of comfort settings is proliferating. Moreover, using these interfaces requires the same visual and manual resources needed for safe driving. Guided by prevalent research on the human visual system, attention, and multimodal redundancy, the Hues and Cues design paradigm was developed to make touchscreen automotive user interfaces more suitable for use while driving. This paradigm was applied to a prototype of an automotive user interface and evaluated with respect to driver performance using the dual-task Lane Change Test (LCT). Each level of the design paradigm was evaluated in light of possible gender differences. The results of the repeated-measures experiment suggest that, compared to interfaces without the Hues and Cues paradigm applied, the Hues and Cues interface requires less mental effort to operate, is more usable, and is more preferred. However, the driver-performance results were mixed: interfaces with only visual feedback resulted in better task times, and interfaces with only auditory feedback showed significant gender differences in the driving task. Overall, the results show that the presentation of multimodal feedback can be useful in designing automotive interfaces, but must be flexible enough to account for individual differences.

    Tele-media-art: web-based inclusive teaching of body expression

    International conference held in Olhão, Algarve, 26-28 April 2018. The Tele-Media-Art project aims to improve the online distance-learning and artistic teaching process in two test scenarios, the doctorate in digital art-media and the lifelong learning course "The Experience of Diversity", by exploiting multimodal telepresence facilities encompassing diversified visual, auditory and sensory channels, as well as rich forms of gestural/body interaction. To this end, a telepresence system was developed and installed at Palácio Ceia in Lisbon, Portugal, headquarters of the Portuguese Open University, supporting methodologies of artistic teaching in a mixed regime (face-to-face and online distance) that are inclusive of blind and partially sighted students. This system has already been tested with a group of subjects, including blind people. Although positive results were achieved, more development and further tests will be carried out in the future. This project was financed by the Calouste Gulbenkian Foundation under grant number 142793.

    Modeling the development of pronunciation in infant speech acquisition.

    Pronunciation is an important part of speech acquisition, but little attention has been given to the mechanism or mechanisms by which it develops. Speech sound qualities, for example, have just been assumed to develop by simple imitation. In most accounts this is then assumed to be by acoustic matching, with the infant comparing his output to that of his caregiver. There are theoretical and empirical problems with both of these assumptions, and we present a computational model, Elija, that does not learn to pronounce speech sounds this way. Elija starts by exploring the sound-making capabilities of his vocal apparatus. Then he uses the natural responses he gets from a caregiver to learn equivalence relations between his vocal actions and his caregiver's speech. We show that Elija progresses from a babbling stage to learning the names of objects. This demonstrates the viability of a non-imitative mechanism in learning to pronounce.
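    The non-imitative mechanism described above can be caricatured in a few lines: the learner babbles random vocal actions and, whenever the caregiver responds naturally (e.g. by reformulating the babble as a word), stores an equivalence between its own action and the adult form. This is a toy of the idea only, with all names assumed; it is not the Elija model itself:

    ```python
    import random

    def explore_and_learn(vocal_actions, caregiver, n_trials=100, seed=0):
        """Babbling stage: try random vocal actions, remember caregiver responses.

        vocal_actions: list of actions the learner can produce
        caregiver:     function mapping a vocal action to a response word, or None
        Returns an equivalence table: own action -> caregiver's word.
        """
        rng = random.Random(seed)
        table = {}
        for _ in range(n_trials):
            action = rng.choice(vocal_actions)
            response = caregiver(action)     # natural reformulation, if any
            if response is not None:
                table[action] = response     # associate own action with adult word
        return table
    ```

    No acoustic matching between the learner's output and the caregiver's speech is involved; the association comes entirely from the caregiver's contingent responses, which is the point the abstract argues.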
