
    End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations

    Conventional keyword search systems operate on automatic speech recognition (ASR) outputs, which causes them to have a complex indexing and search pipeline. This has led to interest in ASR-free approaches to simplify the search procedure. We recently proposed a neural ASR-free keyword search model which achieves competitive performance while maintaining an efficient and simplified pipeline, where queries and documents are encoded with a pair of recurrent neural network encoders and the encodings are combined with a dot-product. In this article, we extend this work with multilingual pretraining and detailed analysis of the model. Our experiments show that the proposed multilingual training significantly improves the model performance and that despite not matching a strong ASR-based conventional keyword search system for short queries and queries comprising in-vocabulary words, the proposed model outperforms the ASR-based system for long queries and queries that do not appear in the training data.
    Comment: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 202

    Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion

    The electronic version of this article is the complete one and can be found online at: http://dx.doi.org/10.1186/s13636-015-0063-8
    Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data, whereas STD focuses on a selected list of search terms that must be detected within the speech data. This paper presents the systems submitted to the STD ALBAYZIN 2014 evaluation, held as a part of the ALBAYZIN 2014 evaluation campaign within the context of the IberSPEECH 2014 conference. This is the first STD evaluation that deals with the Spanish language. The evaluation consists of retrieving the speech files that contain the search terms, indicating their start and end times within the appropriate speech file, along with a score value that reflects the confidence given to the detection of the search term. The evaluation is conducted on a Spanish spontaneous speech database, which comprises a set of talks from workshops and amounts to about 7 h of speech. We present the database, the evaluation metrics, the systems submitted to the evaluation, the results, and a detailed discussion. Four different research groups took part in the evaluation. Evaluation results show reasonable performance for a moderate out-of-vocabulary term rate. This paper compares the systems submitted to the evaluation and provides an in-depth analysis based on several search term properties (term length, in-vocabulary/out-of-vocabulary terms, single-word/multi-word terms, and in-language/foreign terms).
    This work has been partly supported by project CMC-V2 (TEC2012-37585-C02-01) from the Spanish Ministry of Economy and Competitiveness. This research was also funded by the European Regional Development Fund, the Galician Regional Government (GRC2014/024, “Consolidation of Research Units: AtlantTIC Project” CN2012/160)

    Can environment or allergy explain international variation in prevalence of wheeze in childhood?

    Asthma prevalence in children varies substantially around the world, but the contribution of known risk factors to this international variation is uncertain. The International Study of Asthma and Allergies in Childhood (ISAAC) Phase Two studied 8–12-year-old children in 30 centres worldwide with parent-completed symptom and risk factor questionnaires and aeroallergen skin prick testing. We used multilevel logistic regression modelling to investigate the effect of adjustment for individual and ecological risk factors on the between-centre variation in prevalence of recent wheeze. Adjustment for single individual-level risk factors changed the centre-level variation from a reduction of up to 8.4% (and 8.5% for atopy) to an increase of up to 6.8%. Modelling the 11 most influential environmental factors among all children simultaneously, the centre-level variation changed little overall (2.4% increase). Modelling only factors that decreased the variance, the 6 most influential factors (synthetic and feather quilt, mother’s smoking, heating stoves, dampness and foam pillows) in combination resulted in a 21% reduction in variance. Ecological (centre-level) risk factors generally explained higher proportions of the variation than did individual risk factors. Single environmental factors and aeroallergen sensitisation measured at the individual (child) level did not explain much of the between-centre variation in wheeze prevalence.

    The pitfall of the echocardiography in congenital heart disease
