91 research outputs found

    VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices

    In this paper, we address the problem of lip-voice synchronisation in videos containing a human face and voice. Our approach is based on determining whether the lip motion and the voice in a video are synchronised, depending on their audio-visual correspondence score. We propose an audio-visual cross-modal transformer-based model that outperforms several baseline models in the audio-visual synchronisation task on the standard lip-reading speech benchmark dataset LRS2. While existing methods focus mainly on lip synchronisation in speech videos, we also consider the special case of singing voice. Singing voice is a more challenging use case for synchronisation due to sustained vowel sounds. We also investigate the relevance of lip synchronisation models trained on speech datasets in the context of singing voice. Finally, we use the frozen visual features learned by our lip synchronisation model in the singing voice separation task to outperform a baseline audio-visual model which was trained end-to-end. The demos, source code, and the pre-trained model will be made available on https://ipcv.github.io/VocaLiST/. Comment: Submitted to Interspeech 2022; Project Page: https://ipcv.github.io/VocaLiST

    Speech inpainting: Context-based speech synthesis guided by video

    Audio and visual modalities are inherently connected in speech signals: lip movements and facial expressions are correlated with speech sounds. This motivates studies that incorporate the visual modality to enhance an acoustic speech signal or even restore missing audio information. Specifically, this paper focuses on the problem of audio-visual speech inpainting, which is the task of synthesizing the speech in a corrupted audio segment so that it is consistent with the corresponding visual content and the uncorrupted audio context. We present an audio-visual transformer-based deep learning model that leverages visual cues that provide information about the content of the corrupted audio. It outperforms the previous state-of-the-art audio-visual model and audio-only baselines. We also show that visual features extracted with AV-HuBERT, a large audio-visual transformer for speech recognition, are suitable for synthesizing speech. Comment: Accepted in Interspeech2

    Implementation of the System of Environmental and Economic Accounts for Water in a GIS: GuaSEEAW

    The System of Environmental and Economic Accounts for Water provides the conceptual framework for organizing hydrological and economic information coherently and consistently. GuaSEEAW (System of Economic and Environmental Accounts for Water in Guadiana River Basin) is a project funded by the European Commission's DG Environment that aims to analyze the feasibility of implementing the system at the scale of a river basin through intensive use of Geographic Information Systems. Water accounts give managers a new perspective by contrasting the hydrological data they have been handling so far with economic information. Indicators for improving knowledge and management of the basin can be derived from the system of environmental and economic accounts for water. This work was carried out thanks to the GUASEEAW and GUASEEAW+ projects, funded by the Directorate-General for the Environment of the European Commission.

    Self-assembled trityl radical capsules: implications for dynamic nuclear polarization

    A new class of guest-induced, bi-radical self-assembled organic capsules is reported. They are formed by the inclusion of a tetramethylammonium (TMA) cation between two monomers of the stable trityl radical OX63. OX63 is extensively used in dissolution dynamic nuclear polarization (DNP) where it leads to NMR sensitivity enhancements of several orders of magnitude. The supramolecular properties of OX63 have a strong impact on its DNP properties. An especially relevant case is the polarization of choline-containing metabolites, where complex formation between choline and OX63 results in faster relaxation

    Models and Observations of Sunspot Penumbrae

    The mysteries of sunspot penumbrae have been under intense scrutiny for the past 10 years. During this time, some models have been proposed and refuted, while the surviving ones have had to be modified, adapted and evolved to explain the ever-increasing array of observational constraints. In this contribution I will review two of the present models, emphasizing their contributions to this field, but also pinpointing some of their inadequacies in explaining a number of recent observations at very high spatial resolution. To help explain these new observations I propose some modifications to each of them. These modifications bring those two seemingly opposite models closer together into a general picture that agrees well with recent 3D magneto-hydrodynamic simulations. Comment: 9 pages, 1 color figure. Review talk to appear in the proceedings of the International Workshop of 2008 Solar Total Eclipse: Solar Magnetism, Corona and Space Weather--Chinese Space Solar Telescope Scienc

    DAS-28-based EULAR response and HAQ improvement in rheumatoid arthritis patients switching between TNF antagonists

    Introduction: No definitive data are available regarding the value of switching to an alternative TNF antagonist in rheumatoid arthritis patients who fail to respond to the first one. The aim of this study was to evaluate treatment response in a clinical setting based on HAQ improvement and EULAR response criteria in RA patients who were switched to a second or a third TNF antagonist due to failure with the first one. Methods: This was an observational, prospective study of a cohort of 417 RA patients treated with TNF antagonists in three university hospitals in Spain between January 1999 and December 2005. A database was created at the participating centres, with well-defined operational instructions. The main outcome variables were analyzed using parametric or non-parametric tests depending on the level of measurement and distribution of each variable. Results: Mean (± SD) DAS-28 on starting the first, second and third TNF antagonist was 5.9 (± 2.0), 5.1 (± 1.5) and 6.1 (± 1.1). At the end of follow-up, it decreased to 3.3 (± 1.6; Δ = -2.6; p > 0.0001), 4.2 (± 1.5; Δ = -1.1; p = 0.0001) and 5.4 (± 1.7; Δ = -0.7; p = 0.06). For the first TNF antagonist, DAS-28-based EULAR response level was good in 42% and moderate in 33% of patients. The second TNF antagonist yielded a good response in 20% and no response in 53% of patients, while the third one yielded a good response in 28% and no response in 72%. Mean baseline HAQ on starting the first, second and third TNF antagonist was 1.61, 1.52 and 1.87, respectively. At the end of follow-up, it decreased to 1.12 (Δ = -0.49; p < 0.0001), 1.31 (Δ = -0.21; p = 0.004) and 1.75 (Δ = -0.12; p = 0.1), respectively. Sixty-four percent of patients had a clinically important improvement in HAQ (defined as ≥ -0.22) with the first TNF antagonist and 46% with the second. Conclusion: A clinically significant effect size was seen in less than half of RA patients cycling to a second TNF antagonist.