19 research outputs found

    PROSPECTS FOR A MERCURY ION FREQUENCY STANDARD

    A mercury ion frequency standard has been built and operated at Orsay. It is briefly described and the results obtained for its stability are given. Measurement of various physical parameters and calculation of their effect on the short- and medium-term stability make it possible to discuss potential improvements. Systematic effects are studied and a method for measuring them is given. It is shown that good stability and accuracy can be achieved.

    Highlighting latent structure in documents

    Extensible Markup Language (XML) is playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere. It is a simple, very flexible text format used to annotate data by means of markup. XML documents can be checked for syntactic well-formedness and semantic coherence through DTD and schema validation, which makes their processing easier. In particular, data with nested structure can easily be represented with embedded tags. This structured representation should be used in information retrieval models that take structure into account. As such, it is metadata and therefore a contribution to the Semantic Web. However, huge quantities of raw text exist today, and the issue is how to provide these texts with a sensible XML structure in an easy way. Here we present an automatic method to extract tree structure from raw texts. This work has been supported by Paris XI University (BQR2002 project).
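
    As an illustration only (not the method described above), the following Python sketch shows how blank-line-separated blocks of raw text can be wrapped into a nested XML tree with embedded tags; the heading heuristic, element names and sample text are assumptions made for the example.

    import xml.etree.ElementTree as ET

    def text_to_xml(raw):
        # Illustrative heuristic, not the paper's algorithm: a single short line
        # without final punctuation is treated as a section title, everything
        # else becomes a paragraph inside the current section.
        root = ET.Element("document")
        section = None
        for block in (b.strip() for b in raw.split("\n\n") if b.strip()):
            lines = block.splitlines()
            if len(lines) == 1 and len(lines[0]) < 60 and not lines[0].endswith((".", "!", "?")):
                section = ET.SubElement(root, "section", {"title": lines[0]})
            else:
                parent = section if section is not None else root
                ET.SubElement(parent, "p").text = block
        return root

    raw_text = "Introduction\n\nXML annotates data with markup.\n\nMethod\n\nWe build a tree from headings."
    print(ET.tostring(text_to_xml(raw_text), encoding="unicode"))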

    A New Method Based on Context for Combining Statistical Language Models

    In this paper we propose a new method to extract from a corpus the histories for which a given language model is better than another one. The decision is based on a measure derived from perplexity. This measure makes it possible, for a given history, to compare two language models and then to choose the better one for this history. Using this principle, and with a 20K-word vocabulary, we combined two language models: a bigram and a distant bigram. The contribution of the distant bigram is significant and outperforms the bigram model by 7.5%. Moreover, performance in the Shannon game is improved. We show in this article that the proposed framework is cheaper than the maximum entropy principle for combining language models. In addition, the selected histories for which one model is better than the other have been collected and studied. Almost all of them are the beginnings of very frequently used French phrases. Finally, by using this principle, we achieve a better trigram model in terms of parameters and perplexity. This model is a combination of a bigram and a trigram based on selected histories.
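
    The Python sketch below illustrates the general idea of the per-history comparison under assumptions of our own (the probability interfaces, the smoothing floor and the use of total log-probability as the perplexity-derived measure are ours, not the paper's exact formulation): for each immediate history, the model that assigns the higher log-probability to the following words is selected.

    import math
    from collections import defaultdict

    def choose_model_per_history(corpus, model_a, model_b, floor=1e-6):
        """corpus: list of token lists; model_x(context, word) -> probability (assumed interface)."""
        scores = defaultdict(lambda: [0.0, 0.0])              # immediate history -> [log P_a, log P_b]
        for sentence in corpus:
            for i in range(2, len(sentence)):
                context = (sentence[i - 2], sentence[i - 1])  # enough context for a bigram or a distant bigram
                word = sentence[i]
                logs = scores[sentence[i - 1]]                # decisions are indexed by the immediate history
                logs[0] += math.log(max(model_a(context, word), floor))
                logs[1] += math.log(max(model_b(context, word), floor))
        # A higher total log-probability on a history means a lower perplexity on that history.
        return {h: ("A" if la >= lb else "B") for h, (la, lb) in scores.items()}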

    A first evaluation campaign for language models

    This article describes a comparative evaluation campaign for language models which has been set up by AUPELF-UREF, an agency in charge of the promotion of the French language. Three laboratories participated in the first part of this campaign. The language models were compared with an original scheme derived from the Shannon game. The results of this evaluation, as well as a description of the method and of the evaluated language models, are presented.
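
    As a hedged illustration of a Shannon-game-style scheme (the campaign's actual protocol may differ), the sketch below scores a language model by the rank at which it proposes the true next word; the guesser interface and the guess budget are assumptions.

    def shannon_game_rank(guesser, history, true_word, max_guesses=100):
        """guesser(history) -> candidate words in decreasing probability (assumed interface)."""
        for rank, guess in enumerate(guesser(history), start=1):
            if guess == true_word:
                return rank
            if rank >= max_guesses:
                break
        return max_guesses + 1                      # true word not proposed within the guess budget

    def mean_guess_rank(guesser, tokens):
        # Average rank over a held-out token sequence; a lower mean rank is better.
        ranks = [shannon_game_rank(guesser, tokens[:i], tokens[i]) for i in range(1, len(tokens))]
        return sum(ranks) / len(ranks)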

    PEAS, the first instantiation of a comparative framework for evaluating parsers of French

    This paper presents PEAS, the first comparative evaluation framework for parsers of French whose annotation formalism allows the annotation of both constituents and functional relations. A test corpus containing an assortment of different text types has been built and part of it has been manually annotated. Precision/Recall and crossing brackets metrics will be adapted to our formalism and applied to the parses produced by one parser from academia and another one from industry in order to validate the framework.
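
    As background for the metrics named above, here is a minimal Python sketch of labelled Precision/Recall and crossing-brackets counting over constituent spans; the (label, start, end) representation and the toy parses are illustrative and not taken from PEAS.

    def precision_recall(gold, predicted):
        # Constituents are (label, start, end) tuples over token positions.
        gold_set, pred_set = set(gold), set(predicted)
        matched = len(gold_set & pred_set)
        precision = matched / len(pred_set) if pred_set else 0.0
        recall = matched / len(gold_set) if gold_set else 0.0
        return precision, recall

    def crossing_brackets(gold, predicted):
        # Count predicted spans that overlap a gold span without either containing the other.
        def crosses(a, b):
            return a[1] < b[1] < a[2] < b[2] or b[1] < a[1] < b[2] < a[2]
        return sum(any(crosses(p, g) for g in gold) for p in predicted)

    gold = [("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)]
    pred = [("NP", 0, 2), ("VP", 1, 5), ("S", 0, 5)]
    print(precision_recall(gold, pred), crossing_brackets(gold, pred))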

    Introduction of Rules into a Stochastic Approach for Language Modelling
