125 research outputs found

    Statistical language modeling based on variable-length sequences

    In natural language, and especially in spontaneous speech, people often group words into phrases that become usual expressions. This happens for phonological reasons (to make pronunciation easier) or semantic reasons (a phrase is easier to remember when a meaning is assigned to a block of words). Classical language models do not adequately take such phrases into account. A better approach is to model some word sequences as if they were individual dictionary elements: sequences are treated as additional vocabulary entries, on which language models are computed. In this paper, we present a method for automatically retrieving the most relevant phrases from a corpus of written sentences. The originality of our approach is that the phrases are extracted from a linguistically tagged corpus and are therefore linguistically viable. To measure the contribution of classes in retrieving phrases, we implemented the same algorithm without using classes; the class-based method outperformed the other method by 11%. Our approach uses information-theoretic criteria that ensure high statistical consistency and make the decision to select a potential sequence optimal with respect to the language perplexity. We propose several variants of language models with and without word sequences. Among them, we present a model in which the trigger pairs are linguistically more significant. We show that the use of sequences decreases the word error rate and improves the normalized perplexity. For instance, the best sequence model improves the perplexity by 16% and the accuracy of our dictation system (MAUD) by approximately 14%. Experiments, in terms of perplexity and recognition rate, were carried out on a vocabulary of 20,000 words extracted from a corpus of 43 million words made up of two years of the French newspaper Le Monde. The acoustic model (HMM) is trained with the Bref80 corpus. © 2002 Published b

    Angiodysplasies des maxillaires de l’enfant (Angiodysplasias of the jaws in children)

    Angiodysplasias of the jaws are rare and serious conditions. Their poor prognosis is mainly due to formidable hemorrhagic complications, most often caused by ill-advised dental extractions performed without prior radiological examination. Magnetic resonance angiography has allowed a better, non-invasive understanding of these lesions by studying their angioarchitecture and hemodynamics. In the past, the only effective treatment was a mutilating surgical excision that led to severe aesthetic and functional complications and substantial morbidity. Today, interventional radiology has transformed management thanks to superselective arteriography and the discovery of new occlusion materials that are easier to handle. Superselective embolization has become the first-line treatment because it is conservative and has a low complication rate. The aim of this work is to review the etiopathogenic, clinical, and paraclinical aspects of angiodysplasias and to define the therapeutic approach. Keywords: angiodysplasia, embolization, surgery

    High frequency magnetic oscillations of the organic metal $\theta$-(ET)$_4$ZnBr$_4$(C$_6$H$_4$Cl$_2$) in pulsed magnetic field of up to 81 T

    De Haas-van Alphen oscillations of the organic metal $\theta$-(ET)$_4$ZnBr$_4$(C$_6$H$_4$Cl$_2$) are studied in pulsed magnetic fields up to 81 T. The long decay time of the pulse allows determining reliable field-dependent amplitudes of Fourier components with frequencies up to several kiloteslas. The Fourier spectrum is in agreement with the model of a linear chain of coupled orbits. In this model, all the observed frequencies are linear combinations of the frequency linked to the basic orbit $\alpha$ and to the magnetic-breakdown orbit $\beta$. Comment: 6 pages, 4 figures

    Deep Sequential Models for Task Satisfaction Prediction

    Detecting and understanding implicit signals of user satisfaction are essential for experimentation aimed at predicting searcher satisfaction. As retrieval systems have advanced, search tasks have steadily emerged as accurate units not only for capturing searchers' goals but also for understanding how well a system helps the user achieve those goals. However, a major portion of existing work on modeling searcher satisfaction has focused on query-level satisfaction. The few existing approaches to task satisfaction prediction have narrowly focused on simple tasks aimed at solving atomic information needs. In this work we go beyond such atomic tasks and consider the problem of predicting a user's satisfaction when engaged in complex search tasks composed of many different queries and subtasks. We begin by considering a holistic view of user interactions with the search engine result page (SERP) and extract detailed interaction sequences of their activity. We then look at the query-level abstraction and propose a novel deep sequential architecture which leverages the extracted interaction sequences to predict query-level satisfaction. Further, we enrich this model with auxiliary features which have been traditionally used for satisfaction prediction and propose a unified multi-view model which combines the benefit of user interaction sequences with auxiliary features. Finally, we go beyond the query-level abstraction and consider the query sequences issued by the user to complete a complex task, in order to make task-level satisfaction predictions. We propose a number of functional composition techniques which take into account query-level satisfaction estimates along with the query sequence to predict task-level satisfaction. Through rigorous experiments, we demonstrate that the proposed deep sequential models significantly outperform established baselines at both query and task satisfaction prediction. Our findings have implications for metric development for gauging user satisfaction and for designing systems which help users accomplish complex search tasks.

    Biosensor immunoassay for traces of hazelnut protein in olive oil

    The fraudulent addition of hazelnut oil to more expensive olive oil not only causes economic loss but may also cause problems for allergic individuals, who may inadvertently be exposed to potentially allergenic hazelnut proteins. To improve consumer safety, a rapid and sensitive direct biosensor immunoassay, based on a highly specific monoclonal antibody, was developed to detect the presence of hazelnut proteins in olive oils. The sample preparation was easy (extraction with buffer), the assay was fast (4.5 min only), and the limit of detection was low (0.08 μg/g of hazelnut proteins in olive oil). Recoveries obtained with an olive oil mixed with different amounts of a hazelnut oil containing hazelnut protein varied between 93% and 109%.

    Abstracts from the 3rd International Genomic Medicine Conference (3rd IGMC 2015)


    A COMPARATIVE STUDY BETWEEN POLYCLASS AND MULTICLASS LANGUAGE MODELS

    In this work, we introduce the concept of Multiclass for language modeling and compare it to the Polyclass model. The originality of the Multiclass model is its ability to parse a string of classes/tags into variable-length independent sequences. A few experimental tests were carried out on a class corpus extracted from the automatically labeled French « Le Monde » word corpus. This corpus contains 43 million words. In our experiments, Multiclass models outperform first-order Polyclass models but are slightly outperformed by second-order Polyclass models.