Combining Residual Networks with LSTMs for Lipreading
We propose an end-to-end deep learning architecture for word-level visual
speech recognition. The system is a combination of spatiotemporal
convolutional, residual and bidirectional Long Short-Term Memory networks. We
train and evaluate it on the Lipreading In-The-Wild benchmark, a challenging
database of 500 target words consisting of 1.28-second video excerpts from BBC
TV broadcasts. The proposed network attains a word accuracy of 83.0%,
a 6.8% absolute improvement over the current state of the art, without
using information about word boundaries during training or testing.
Comment: Submitted to Interspeech 201
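The two reported figures can be unpacked with a little arithmetic. The sketch below assumes both numbers are percentages of word accuracy, and derives the accuracy of the previous best system and the share of its word errors that the new system removes. The function names are illustrative and do not come from the paper.

```python
def implied_baseline(accuracy: float, absolute_gain: float) -> float:
    """Accuracy of the previous best system implied by an absolute gain."""
    return accuracy - absolute_gain


def relative_error_reduction(accuracy: float, absolute_gain: float) -> float:
    """Fraction of the baseline's word errors removed by the new system."""
    baseline_error = 100.0 - implied_baseline(accuracy, absolute_gain)
    return absolute_gain / baseline_error


# Figures quoted in the abstract, assumed to be percentages.
baseline = implied_baseline(83.0, 6.8)
reduction = relative_error_reduction(83.0, 6.8)
print(f"implied baseline accuracy: {baseline:.1f}%")
print(f"relative error reduction: {reduction:.1%}")
```

Under that assumption, the previous state of the art sits at about 76.2% word accuracy, so the 6.8-point gain removes roughly 29% of the remaining word errors.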
earGram Actors: an interactive audiovisual system based on social behavior
In multi-agent systems, local interactions among system components following relatively simple rules often result in complex overall systemic behavior. Complex behavioral and morphological patterns have been used to generate and organize audiovisual systems for artistic purposes. In this work, we propose to use the Actor model of social interactions to drive a concatenative synthesis engine called earGram in real time. The Actor model was originally developed to explore the emergence of complex visual patterns, whereas earGram was originally developed to facilitate the creative exploration of concatenative sound synthesis. The integrated audiovisual system allows a human performer to interact with the system dynamics while receiving visual and auditory feedback. The interaction happens indirectly, by disturbing the rules governing the social relationships among the actors, which results in a wide range of dynamic spatiotemporal patterns. A performer thus improvises within the behavioral scope of the system while evaluating the apparent connections between parameter values and the actual complexity of the system output.
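The general idea of simple local rules driving sound selection can be illustrated with a toy sketch. This is not the Actor model or earGram: the attraction rule, the 2-D descriptor space, the unit names in `UNITS`, and every function below are hypothetical stand-ins chosen for brevity. Each agent moves toward or away from the centroid of the others, and its position picks the nearest sound unit, which is how concatenative synthesis typically selects units by descriptor proximity. Flipping the sign of `attraction` plays the role of a performer disturbing the rule.

```python
import math
import random

# Hypothetical corpus: each sound unit is a point in a 2-D descriptor space.
UNITS = {
    "low_soft": (0.1, 0.1),
    "low_loud": (0.1, 0.9),
    "high_soft": (0.9, 0.1),
    "high_loud": (0.9, 0.9),
}


def centroid(points):
    """Mean position of a list of 2-D points."""
    return (
        sum(p[0] for p in points) / len(points),
        sum(p[1] for p in points) / len(points),
    )


def step(actors, attraction, rate=0.1):
    """Move each actor toward (attraction > 0) or away from (attraction < 0)
    the centroid of the other actors, clamped to the unit square."""
    moved = []
    for i, (x, y) in enumerate(actors):
        cx, cy = centroid([a for j, a in enumerate(actors) if j != i])
        nx = min(1.0, max(0.0, x + rate * attraction * (cx - x)))
        ny = min(1.0, max(0.0, y + rate * attraction * (cy - y)))
        moved.append((nx, ny))
    return moved


def nearest_unit(position):
    """Name of the sound unit closest to a position in descriptor space."""
    return min(UNITS, key=lambda name: math.dist(position, UNITS[name]))


def spread(actors):
    """Mean distance of the actors from their common centroid."""
    center = centroid(actors)
    return sum(math.dist(a, center) for a in actors) / len(actors)


rng = random.Random(0)  # seeded so repeated runs give the same trajectory
actors = [(rng.random(), rng.random()) for _ in range(5)]
for attraction in (1.0, -1.0):  # the "performer" flips the social rule
    for _ in range(20):
        actors = step(actors, attraction)
    print(attraction, round(spread(actors), 3), [nearest_unit(a) for a in actors])
```

With positive attraction the group contracts and the actors tend to select the same unit. With negative attraction the group disperses and the selection diversifies, a crude analogue of the link between parameter values and output complexity described above.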
The role of HG in the analysis of temporal iteration and interaural correlation