232 research outputs found

    Energy-based temporal neural networks for imputing missing values

    Imputing missing values in high-dimensional time series is a difficult problem. There have been some approaches to the problem [11,8] where neural architectures were trained as probabilistic models of the data. However, we argue that this approach is not optimal. We propose to view temporal neural networks with latent variables as energy-based models and train them for missing value recovery directly. In this paper we introduce two energy-based models. The first model is based on a one-dimensional convolution and the second model utilizes a recurrent neural network. We demonstrate how ideas from the energy-based learning framework can be used to train these models to recover missing values. The models are evaluated on a motion capture dataset

    Multimodal Representation Learning for Human Robot Interaction

    Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition

    Automatic speech recognition can potentially benefit from the lip motion patterns, complementing acoustic speech to improve the overall recognition performance, particularly in noise. In this paper we propose an audio-visual fusion strategy that goes beyond simple feature concatenation and learns to automatically align the two modalities, leading to enhanced representations which increase the recognition accuracy in both clean and noisy conditions. We test our strategy on the TCD-TIMIT and LRS2 datasets, designed for large vocabulary continuous speech recognition, applying three types of noise at different power ratios. We also exploit state of the art Sequence-to-Sequence architectures, showing that our method can be easily integrated. Results show relative improvements from 7% up to 30% on TCD-TIMIT over the acoustic modality alone, depending on the acoustic noise level. We anticipate that the fusion strategy can easily generalise to many other multimodal tasks which involve correlated modalities. Code available online on GitHub: https://github.com/georgesterpu/Sigmedia-AVSR
    Comment: In ICMI'18, October 16-20, 2018, Boulder, CO, USA. Equation (2) corrected on this version.
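
    The abstract describes a fusion that learns to align the two modalities instead of concatenating features. A minimal sketch of one common way to do this, dot-product cross-modal attention, is given here for illustration only. It is not taken from the linked repository. The function names, the use of unscaled dot-product scores, and the assumption that audio and video frames share the same feature dimension are all hypothetical choices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(audio_frames, video_frames):
    """For each audio frame, attend over all video frames and append
    the attended visual context to the audio features.

    Assumption: audio and video vectors have the same dimension so
    that a plain dot product can serve as the alignment score.
    """
    dim = len(video_frames[0])
    fused = []
    for query in audio_frames:
        # Alignment score between this audio frame and every video frame.
        scores = [sum(q * k for q, k in zip(query, key)) for key in video_frames]
        weights = softmax(scores)
        # Weighted average of the video frames (the visual context).
        context = [
            sum(w * frame[d] for w, frame in zip(weights, video_frames))
            for d in range(dim)
        ]
        fused.append(list(query) + context)
    return fused


if __name__ == "__main__":
    audio = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # toy audio features
    video = [[2.0, 0.0], [0.0, 2.0]]               # toy video features
    for row in cross_modal_attention(audio, video):
        print(row)
```

    Because the two streams usually run at different frame rates, the attention weights let each audio frame pick the video frames most relevant to it, so no hand-made alignment is needed.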

    The relationship between mandibular advancement, tongue movement, and treatment outcome in obstructive sleep apnea

    Study Objectives: To characterize how mandibular advancement enlarges the upper airway via posterior tongue advancement in people with obstructive sleep apnea (OSA) and whether this is associated with mandibular advancement splint (MAS) treatment outcome. Methods: One-hundred and one untreated people with OSA underwent a 3T magnetic resonance (MRI) scan. Dynamic mid-sagittal posterior tongue and mandible movements during passive jaw advancement were measured with tagged MRI. Upper airway cross-sectional areas were measured with the mandible in a neutral position and advanced to 70% of maximum advancement. Treatment outcome was determined after a minimum of 9 weeks of therapy. Results: Seventy-one participants completed the study: 33 were responders (AHI50% AHI reduction), 11 were partial responders (>50% AHI reduction but AHI>10 events/hr), and 27 nonresponders (AHI reduction 4 mm). In comparison, a model using only baseline AHI correctly classified 50.0% of patients (5-fold cross-validated 52.5%, n = 40). Conclusions: Tongue advancement and upper airway enlargement with mandibular advancement in conjunction with baseline AHI improve treatment response categorization to a satisfactory level (69.2%, 5-fold cross-validated 62.5%)

    Enhancing the operational stability of unencapsulated perovskite solar cells through Cu-Ag bilayer electrode incorporation

    We identify a facile strategy that significantly reduces electrode corrosion and device degradation in unencapsulated perovskite solar cells (PSCs) operating in ambient air. By employing Cu-Ag bilayer top electrodes in PSCs, we show enhanced operational lifetime compared with devices prepared from single-metal (Al, Ag and Cu) analogues. Time-of-flight secondary ion mass spectrometry depth profiles indicate that inserting a thin layer of Cu (10 nm) below the Ag (100 nm) electrode significantly reduces diffusion of species originating in the perovskite active layer into the electron transport layer and electrode. X-ray diffraction (XRD) analysis reveals the mutually beneficial relationship between the bilayer metals, whereby the thermally evaporated Ag inhibits Cu oxidation and the Cu prevents interfacial reactions between the perovskite and Ag. These results not only demonstrate a simple approach to preventing electrode and device degradation, thereby enhancing lifetime and stability, but also give insight into ageing-related ion migration and structural reorganisation

    EEG ERP preregistration template

    This preregistration template guides researchers who wish to preregister their EEG projects, more specifically studies investigating event-related potentials (ERPs) in the sensor space

    A Domain Adaptation Approach to Improve Speaker Turn Embedding Using Face Representation

    This paper proposes a novel approach to improve speaker modeling using knowledge transferred from face representation. In particular, we are interested in learning a discriminative metric which allows speaker turns to be compared directly, which is beneficial for tasks such as diarization and dialogue analysis. Our method improves the embedding space of speaker turns by applying maximum mean discrepancy loss to minimize the disparity between the distributions of facial and acoustic embedded features. This approach aims to discover the shared underlying structure of the two embedded spaces, thus enabling the transfer of knowledge from the richer face representation to the counterpart in speech. Experiments are conducted on broadcast TV news datasets, REPERE and ETAPE, to demonstrate the validity of our method. Quantitative results in verification and clustering tasks show promising improvement, especially in cases where speaker turns are short or the training data size is limited
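
    The maximum mean discrepancy (MMD) loss named in the abstract measures how far apart two sets of embeddings are as distributions. The following is a minimal sketch of the standard biased MMD estimate with a Gaussian kernel, not the authors' implementation. The kernel choice, the `bandwidth` value, and the toy "face" and "speech" vectors are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with each expectation replaced by an average over all pairs.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    # Hypothetical 2-D embeddings, for illustration only.
    face = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
    speech_near = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]
    speech_far = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
    print("near:", mmd_squared(face, speech_near))
    print("far: ", mmd_squared(face, speech_far))
```

    Used as a training loss, this quantity shrinks as the speech embedding distribution moves toward the face embedding distribution, which is the effect the abstract describes.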

    Learning Multimodal Temporal Representation for Dubbing Detection in Broadcast Media

    Person discovery in the absence of prior identity knowledge requires accurate association of visual and auditory cues. In broadcast data, multimodal analysis faces additional challenges due to narrated voices over muted scenes or dubbing in different languages. To address these challenges, we define and analyze the problem of dubbing detection in broadcast data, which has not been explored before. We propose a method to represent the temporal relationship between the auditory and visual streams. This method consists of canonical correlation analysis to learn a joint multimodal space, and long short term memory (LSTM) networks to model cross-modality temporal dependencies. Our contributions also include the introduction of a newly acquired dataset of face-speech segments from TV data, which we have made publicly available. The proposed method achieves promising performance on this real world dataset as compared to several baselines

    Multinational characterization of neurological phenotypes in patients hospitalized with COVID-19

    Neurological complications worsen outcomes in COVID-19. To define the prevalence of neurological conditions among hospitalized patients with a positive SARS-CoV-2 reverse transcription polymerase chain reaction test in geographically diverse multinational populations during early pandemic, we used electronic health records (EHR) from 338 participating hospitals across 6 countries and 3 continents (January–September 2020) for a cross-sectional analysis. We assessed the frequency of International Classification of Disease code of neurological conditions by countries, healthcare systems, time before and after admission for COVID-19 and COVID-19 severity. Among 35,177 hospitalized patients with SARS-CoV-2 infection, there was an increase in the proportion with disorders of consciousness (5.8%, 95% confidence interval [CI] 3.7–7.8%, pFDR < 0.001) and unspecified disorders of the brain (8.1%, 5.7–10.5%, pFDR < 0.001) when compared to the pre-admission proportion. During hospitalization, the relative risk of disorders of consciousness (22%, 19–25%), cerebrovascular diseases (24%, 13–35%), nontraumatic intracranial hemorrhage (34%, 20–50%), encephalitis and/or myelitis (37%, 17–60%) and myopathy (72%, 67–77%) were higher for patients with severe COVID-19 when compared to those who never experienced severe COVID-19. Leveraging a multinational network to capture standardized EHR data, we highlighted the increased prevalence of central and peripheral neurological phenotypes in patients hospitalized with COVID-19, particularly among those with severe disease

    Difference of clinical features in childhood Mycoplasma pneumoniae pneumonia

    Background: M. pneumoniae pneumonia (MP) has been reported in 10-40% of community-acquired pneumonia cases. We aimed to evaluate the difference of clinical features in children with MP, according to their age and chest radiographic patterns. Methods: The diagnosis of MP was made by examinations at both admission and discharge and by two serologic tests: the indirect microparticle agglutinin assay (≥1:40) and the cold agglutinins titer (≥1:32). A total of 191 children with MP were grouped by age: ≤2 years of age (29 patients), 3-5 years of age (81 patients), and ≥6 years of age (81 patients). They were also grouped by pneumonia pattern: bronchopneumonia group (96 patients) and segmental/lobar pneumonia group (95 patients). Results: Eighty-six patients (45%) were seroconverters, and the others showed increased antibody titers during hospitalization. Among the three age groups, the oldest children showed the longest duration of fever, highest C-reactive protein (CRP) values, and the most severe pneumonia pattern. The patients with segmental/lobar pneumonia were older and had longer fever duration and lower white blood cell (WBC) and lymphocyte counts, compared with those with bronchopneumonia. The patient group with the most severe pulmonary lesions had the most prolonged fever, highest CRP, highest rate of seroconverters, and lowest lymphocyte counts. Thrombocytosis was observed in 8% of patients at admission, but in 33% of patients at discharge. Conclusions: In MP, older children had more prolonged fever and more severe pulmonary lesions. The severity of pulmonary lesions was associated with the absence of diagnostic IgM antibodies at presentation and lymphocyte count. Short-term paired IgM serologic test may be mandatory for early and definitive diagnosis of MP.