14,240 research outputs found

    Zero-shot keyword spotting for visual speech recognition in-the-wild

    Visual keyword spotting (KWS) is the problem of estimating whether a text query occurs in a given recording using only video information. This paper focuses on visual KWS for words unseen during training, a practical real-world setting that has so far received no attention from the community. To this end, we devise an end-to-end architecture comprising (a) a state-of-the-art visual feature extractor based on spatiotemporal Residual Networks, (b) a grapheme-to-phoneme model based on sequence-to-sequence neural networks, and (c) a stack of recurrent neural networks which learn how to correlate visual features with the keyword representation. Unlike prior work on KWS, which tries to learn word representations merely from sequences of graphemes (i.e. letters), we propose the use of a grapheme-to-phoneme encoder-decoder model which learns how to map words to their pronunciation. We demonstrate that our system obtains very promising visual-only KWS results on the challenging LRS2 database for keywords unseen during training. We also show that our system outperforms a baseline which addresses KWS via automatic speech recognition (ASR), and that it drastically improves over other recently proposed ASR-free KWS methods. Comment: Accepted at ECCV-201

    Character Recognition

    Character recognition is one of the most widely used pattern recognition technologies in practical applications. This book presents recent advances relevant to character recognition, from technical topics such as image processing, feature extraction, and classification to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    Dyslex_Re : The Real-Time Assistance for Dyslexic People

    DYSLEX_RE is a real-time reading assistant app for dyslexic people. Dyslexia, also known as reading disorder, is characterized by trouble with reading. Different people are affected to varying degrees. Problems may include difficulties in spelling words, reading at high speed, writing some words, sounding out words in the head, pronouncing words when reading aloud, and understanding what one reads. Some cases run in families. OpenDyslexic is a free typeface designed to avoid some of the common reading errors caused by dyslexia. The font includes regular, bold, italic, bold-italic, and monospaced styles. This application is developed in English using a multisensory approach and is a suitable learning ecosystem for dyslexic children. Previous studies show that many applications have been developed in Malay and Spanish, and that these applications recognize only some letters of the alphabet. Our application, by contrast, works with the whole alphabet using OCR. The main objective of the proposed system that uses Google's mobile vision AP