51 research outputs found

    Learning Multi-Boosted HMMs for Lip-Password Based Speaker Verification

    Visual speech encoding based on facial landmark registration

    Studies of Visual Speech Recognition (VSR) largely ignore state-of-the-art approaches to facial landmark localization, and they also lack robust visual features and a temporal encoding of those features. In this work, we propose a temporal encoding of visual speech that integrates fast, accurate, state-of-the-art facial landmark detection based on an ensemble of regression trees learned using gradient boosting. The main contribution of this work is a fast and simple encoding of visual speech features derived from vertically symmetric point pairs (VeSPP) of facial landmarks in the lip region, together with a demonstration of their usefulness in temporal sequence comparisons using Dynamic Time Warping. VSR can be either speaker dependent (SD) or speaker independent (SI), and each poses different kinds of challenges. In this work, we consider the SD scenario and obtain 82.65% recognition accuracy on the OuluVS database. Unlike recent VSR research that makes use of auxiliary information such as audio, depth and color channels, our approach does not impose such constraints.
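
    The temporal sequence comparison named in this abstract, Dynamic Time Warping, can be sketched as follows. This is a minimal textbook version, not the authors' implementation: the function names `dtw_distance` and `classify_by_nearest_template`, the Euclidean frame distance, and the nearest-template decision rule are all assumptions made for illustration, and the feature vectors stand in for whatever VeSPP-derived features the paper actually uses.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic dynamic-programming DTW between two sequences of feature vectors.

    Each sequence is a list of equal-dimension numeric tuples (one per video
    frame). The per-frame cost is Euclidean distance (an assumption; the
    abstract does not state which local distance is used).
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


def classify_by_nearest_template(query, templates):
    """Return the label whose template has the smallest DTW distance to query.

    `templates` maps a hypothetical label (e.g. a phrase id) to one reference
    sequence. This decision rule is illustrative only.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

    Because the warping path may repeat frames, two utterances of the same phrase spoken at different speeds can still align with low cost, which is what makes DTW a natural fit for comparing variable-length lip-feature sequences.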

    Visual Passwords Using Automatic Lip Reading

    This paper presents a visual password system to increase security. The system relies mainly on recognizing the speaker using the visual speech signal alone. The proposed scheme works in two stages: setting the visual password, and verification. In the setting stage, the system asks the user to utter a selected password; a video recording of the user's face is captured and processed by a special word-based VSR system, which extracts a sequence of feature vectors. In the verification stage, the same procedure is executed and the extracted features are compared with the stored visual password. The proposed scheme has been evaluated using a video database of 20 speakers (10 female and 10 male), and a second video database of 15 additional male speakers, with different experiment sets. The evaluation demonstrated the system's feasibility, with an average error rate in the range of 7.63% to 20.51% in the worst tested scenario; the scheme therefore has potential to be a practical approach when supported by other conventional authentication methods such as usernames and passwords.
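
    The two-stage flow described in this abstract (store a feature sequence at setup, then compare a fresh sequence against it) can be sketched as below. Everything beyond that flow is an assumption for illustration: the class name `VisualPasswordStore`, the index-mapped mean Euclidean distance in `sequence_distance`, and the fixed acceptance threshold are hypothetical, and the abstract does not say how the paper actually compares features.

```python
import math


def sequence_distance(template, candidate):
    """Mean Euclidean distance between frames of two feature sequences.

    Each frame of `template` is paired with the proportionally located frame
    of `candidate`, so sequences of different lengths can be compared.
    (An illustrative choice, not the comparison used in the paper.)
    """
    if not template or not candidate:
        raise ValueError("feature sequences must be non-empty")
    total = 0.0
    for i, frame in enumerate(template):
        if len(template) > 1:
            j = round(i * (len(candidate) - 1) / (len(template) - 1))
        else:
            j = 0
        total += math.dist(frame, candidate[j])
    return total / len(template)


class VisualPasswordStore:
    """Hypothetical store for per-user visual password templates."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed: accept when distance <= threshold
        self._templates = {}

    def enroll(self, user_id, features):
        # Setting stage: keep the feature sequence extracted from the video.
        self._templates[user_id] = [tuple(f) for f in features]

    def verify(self, user_id, features):
        # Verification stage: compare new features with the stored template.
        template = self._templates.get(user_id)
        if template is None:
            return False
        candidate = [tuple(f) for f in features]
        return sequence_distance(template, candidate) <= self.threshold
```

    In such a scheme the threshold trades false acceptances against false rejections, so it would be tuned on held-out recordings rather than fixed by hand.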

    Audio-Visual Biometrics and Forgery

    Activity Report 2002

    Speech Recognition

    Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that operate in real-world environments, such as mobile communication services and smart homes.

    Biometric Liveness Detection Using Gaze Information

    This thesis is concerned with liveness detection for biometric systems, and in particular for face recognition systems. Biometric systems are well studied and have the potential to provide satisfactory solutions for a variety of applications. However, presentation attacks (spoofing), in which an attempt is made to subvert the system through a deliberate presentation at the sensor, are a serious challenge to their use in unattended applications. Liveness detection techniques can help protect biometric systems from attacks made through the presentation of artefacts and recordings at the sensor. In this work, novel techniques for liveness detection using gaze information are presented. The notion of natural gaze stability is introduced and used to develop a number of novel features that rely on directing the gaze of the user and establishing its behaviour. These features are then used to develop systems for detecting spoofing attempts. The attack scenarios considered in this work include the use of hand-held photos and photo masks, as well as video replay, to subvert the system. The proposed features, and the systems based on them, were evaluated extensively using data captured from genuine and fake attempts. The results of the evaluations indicate that gaze-based features can be used to discriminate between genuine and impostor attempts. Combining features through feature selection and score fusion substantially improved the performance of the proposed features.