1,730 research outputs found

    Combining Residual Networks with LSTMs for Lipreading

    We propose an end-to-end deep learning architecture for word-level visual speech recognition. The system combines spatiotemporal convolutional, residual, and bidirectional Long Short-Term Memory networks. We train and evaluate it on the Lipreading In-The-Wild benchmark, a challenging database of 500 target words consisting of 1.28-second video excerpts from BBC TV broadcasts. The proposed network attains a word accuracy of 83.0%, a 6.8% absolute improvement over the current state of the art, without using information about word boundaries during training or testing. Comment: Submitted to Interspeech 201

    earGram Actors: an interactive audiovisual system based on social behavior

    In multi-agent systems, local interactions among system components that follow relatively simple rules often result in complex overall systemic behavior. Complex behavioral and morphological patterns have been used to generate and organize audiovisual systems for artistic purposes. In this work, we propose to use the Actor model of social interactions to drive a concatenative synthesis engine called earGram in real time. The Actor model was originally developed to explore the emergence of complex visual patterns, whereas earGram was originally developed to facilitate the creative exploration of concatenative sound synthesis. The integrated audiovisual system allows a human performer to interact with the system dynamics while receiving visual and auditory feedback. The interaction happens indirectly, by disturbing the rules governing the social relationships among the actors, which results in a wide range of dynamic spatiotemporal patterns. A performer thus improvises within the behavioral scope of the system while evaluating the apparent connections between parameter values and the actual complexity of the system output.

    Biometric liveness checking using multimodal fuzzy fusion
