
    Alternative visual units for an optimized phoneme-based lipreading system

    Lipreading is understanding speech from observed lip movements. An observed series of lip motions is an ordered sequence of visual lip gestures. These gestures are commonly known, but not yet formally defined, as "visemes". In this article, we describe a structured approach which allows us to create speaker-dependent visemes with a fixed number of visemes within each set. We create sets of visemes of sizes two to 45. Each set of visemes is based upon clustering phonemes, so each set has a unique phoneme-to-viseme mapping. We first present an experiment using these maps and the Resource Management Audio-Visual (RMAV) dataset which shows the effect of changing the viseme map size in speaker-dependent machine lipreading, and demonstrate that word recognition with phoneme classifiers is possible. Furthermore, we show that there are intermediate units between visemes and phonemes which are better still. Second, we present a novel two-pass training scheme for phoneme classifiers. This approach uses our new intermediary visual units from the first experiment as classifiers in the first pass; we then use the phoneme-to-viseme maps to retrain these into phoneme classifiers. This method significantly improves on previous lipreading results with RMAV speakers.
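    A phoneme-to-viseme map of the kind described above is a many-to-one grouping of phonemes into visual classes. A minimal sketch of how such a map collapses a phoneme sequence (the class labels and groupings below are invented for illustration, not the paper's actual speaker-dependent maps):

    ```python
    # Toy phoneme-to-viseme map: phonemes that look alike on the lips
    # collapse into one visual class (a viseme). Labels are hypothetical.
    P2V = {
        "p": "V1", "b": "V1", "m": "V1",   # bilabials share one viseme
        "f": "V2", "v": "V2",              # labiodentals share another
        "t": "V3", "d": "V3", "s": "V3",   # alveolars grouped together
    }

    def phonemes_to_visemes(phoneme_seq):
        """Map a phoneme sequence onto the smaller viseme alphabet."""
        return [P2V[p] for p in phoneme_seq]

    print(phonemes_to_visemes(["b", "m", "f", "t"]))  # ['V1', 'V1', 'V2', 'V3']
    ```

    Varying the number of distinct viseme labels (here three; in the paper, two to 45) changes how coarsely the phoneme space is quantised.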

    Expressing Robot Personality through Talking Body Language

    Social robots must master the nuances of human communication as a means to convey an effective message and generate trust. It is well known that non-verbal cues are very important in human interactions, and therefore a social robot should produce body language coherent with its discourse. In this work, we report on a system that endows a humanoid robot with the ability to adapt its body language to the sentiment of its speech. A combination of talking beat gestures with emotional cues such as eye lighting, body posture, and voice intonation and volume permits a rich variety of behaviors. The developed approach is not purely reactive, and it easily allows a kind of personality to be assigned to the robot. We present several videos of the robot in two different scenarios, showing discreet and histrionic personalities. This work has been partially supported by the Basque Government (IT900-16 and Elkartek 2018/00114) and the Spanish Ministry of Economy and Competitiveness (RTI 2018-093337-B-100, MINECO/FEDER, EU).

    Survey of Finite Element Method-Based Real-Time Simulations

    The finite element method (FEM) has deservedly gained a reputation as the most powerful, highly efficient, and versatile numerical method in the field of structural analysis. Although typical use of FE programs implies so-called "off-line" computation, the rapid pace of hardware development over the past couple of decades has been the major impetus for numerous researchers to consider the possibility of real-time simulation based on FE models. Limitations of the hardware available in various phases of development demanded remarkable innovativeness in the quest for suitable solutions to the challenge. Different approaches have been proposed depending on the demands of the specific field of application. Though it is still a relatively young field in global terms, an immense amount of work has already been done, calling for a representative survey. This paper aims to provide such a survey, which of course cannot be exhaustive.

    Performance analysis of unimodal and multimodal models in valence-based empathy recognition

    The human ability to empathise is a core aspect of successful interpersonal relationships. In this regard, human-robot interaction can be improved through the automatic perception of empathy, among other human attributes, allowing robots to affectively adapt their actions to interactants' feelings in any given situation. This paper presents our contribution to the generalised track of the One-Minute Gradual (OMG) Empathy Prediction Challenge by describing our approach to predict a listener's valence during semi-scripted actor-listener interactions. We extract visual and acoustic features from the interactions and feed them into a bidirectional long short-term memory network to capture the time-dependencies of valence-based empathy during the interactions. Generalised and personalised unimodal and multimodal valence-based empathy models are then trained to assess the impact of each modality on system performance. Furthermore, we analyse whether intra-subject dependencies in empathy perception affect system performance. We assess the models by computing the concordance correlation coefficient (CCC) between the predicted and self-annotated valence scores. The results support the suitability of employing multimodal data to recognise participants' valence-based empathy during the interactions, and highlight the subject-dependency of empathy. In particular, we obtained our best result with a personalised multimodal model, which achieved a CCC of 0.11 on the test set. Funding: Bavarian State Ministry of Education, Science and the Arts.
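    The concordance correlation coefficient used to assess these models combines Pearson correlation with a penalty for mean and scale differences between the two series. A minimal pure-Python sketch of the standard formula (not the challenge's official evaluation code):

    ```python
    def ccc(x, y):
        """Concordance correlation coefficient between two equal-length series:
        2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        vx = sum((a - mx) ** 2 for a in x) / n          # population variance of x
        vy = sum((b - my) ** 2 for b in y) / n          # population variance of y
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
        return 2 * cov / (vx + vy + (mx - my) ** 2)

    print(ccc([0.1, 0.4, 0.3, 0.8], [0.1, 0.4, 0.3, 0.8]))  # perfect agreement -> 1.0
    ```

    Unlike plain Pearson correlation, CCC drops below 1 when predictions track the targets but are shifted or rescaled, which is why it is favoured for continuous affect annotation.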

    Form, Function and Etiquette – Potential Users’ Perspectives on Social Domestic Robots

    Social Domestic Robots (SDRs) will soon be launched en masse on commercial markets. Previously, social robots only inhabited scientific labs; now there is an opportunity to conduct experiments to investigate human-robot relationships (including user expectations of social interaction) within more naturalistic, domestic spaces, as well as to test models of technology acceptance. To this end we exposed 20 participants to advertisements prepared by three robotics companies, explaining and "pitching" their SDRs' functionality (namely, Pepper by SoftBank; Jibo by Jibo, Inc.; and Buddy by Blue Frog Robotics). Participants were interviewed and the data were thematically analyzed to critically examine their initial reactions, concerns and impressions of the three SDRs. Using this approach, we aim to complement existing survey results pertaining to SDRs, and to understand the reasoning people use when evaluating SDRs based on what is publicly available to them, namely, advertising. Herein, we unpack issues raised concerning form/function, security/privacy, and the perceived emotional impact of owning an SDR. We discuss implications for the adequate design of socially engaged robotics for domestic applications, and provide four practical steps that could improve the relationships between people and SDRs. An additional contribution is made by expanding existing models of technology acceptance in domestic settings with a new factor of privacy.

    CGAMES'2009


    Multi-Level Liveness Verification for Face-Voice Biometric Authentication

    In this paper we present the details of the multi-level liveness verification (MLLV) framework proposed for realizing a secure face-voice biometric authentication system that can thwart different types of audio and video replay attacks. The proposed MLLV framework, based on novel feature extraction and multimodal fusion approaches, uncovers the static and dynamic relationships between voice and face information in speaking faces, and allows multiple levels of security. Experiments with three different speaking-face corpora, VidTIMIT, UCBN and AVOZES, show a significant improvement in system performance in terms of DET curves and equal error rates (EER) for different types of replay and synthesis attacks.
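    The equal error rate reported above is the operating point on the DET curve where the false-accept and false-reject rates coincide. A rough sketch of estimating it from genuine and impostor score lists via a simple threshold sweep (an illustrative approximation, not the authors' evaluation code):

    ```python
    def equal_error_rate(genuine, impostor):
        """Approximate EER: sweep candidate thresholds over all observed scores
        and return the average of FAR and FRR where they are closest."""
        best = None
        for t in sorted(genuine + impostor):
            far = sum(s >= t for s in impostor) / len(impostor)  # false-accept rate
            frr = sum(s < t for s in genuine) / len(genuine)     # false-reject rate
            gap = abs(far - frr)
            if best is None or gap < best[0]:
                best = (gap, (far + frr) / 2)
        return best[1]

    # Perfectly separable scores give an EER of 0.0
    print(equal_error_rate([0.7, 0.8, 0.9], [0.1, 0.2, 0.3]))
    ```

    Lower EER means better separation of genuine users from (replayed or synthesised) impostor attempts.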