    FreDist: Automatic construction of distributional thesauri for French

    In this article we present FreDist, a freely available software package for the automatic construction of distributional thesauri from text corpora, as well as an evaluation of various distributional similarity metrics for French. Following the work of Lin (1998) and Curran (2004), we use a large corpus of journalistic text and implement different choices for the type of lexical context relation, the weight function, and the measure function needed to build a distributional thesaurus. Using the EuroWordNet and WOLF wordnet resources for French as gold-standard references for our evaluation, we obtain the novel result that combining bigram and syntactic dependency context relations produces higher-quality distributional thesauri. In addition, we hope that our software package and a joint release of our best thesauri for French will be useful to the NLP community.
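
    A minimal sketch of the pipeline the abstract describes (a context relation, a weight function, a measure function). The function names, toy corpus, and the PPMI/cosine choices here are illustrative assumptions, not FreDist's actual API:

        # Sketch only: bigram contexts + positive PMI weighting + cosine
        # similarity; all names and the corpus are invented for illustration.
        from collections import Counter
        from math import log, sqrt

        def bigram_contexts(tokens):
            """Context relation: collect left/right neighbours as features."""
            ctx = {}
            for i, w in enumerate(tokens):
                c = ctx.setdefault(w, Counter())
                if i > 0:
                    c[("L", tokens[i - 1])] += 1
                if i + 1 < len(tokens):
                    c[("R", tokens[i + 1])] += 1
            return ctx

        def ppmi_weight(ctx):
            """Weight function: positive pointwise mutual information."""
            total = sum(sum(c.values()) for c in ctx.values())
            w_freq = {w: sum(c.values()) for w, c in ctx.items()}
            c_freq = Counter()
            for c in ctx.values():
                c_freq.update(c)
            return {w: {f: log(n * total / (w_freq[w] * c_freq[f]))
                        for f, n in c.items()
                        if n * total > w_freq[w] * c_freq[f]}  # keep PMI > 0
                    for w, c in ctx.items()}

        def cosine(u, v):
            """Measure function: cosine over sparse weighted context vectors."""
            num = sum(x * v[f] for f, x in u.items() if f in v)
            den = sqrt(sum(x * x for x in u.values())) * \
                  sqrt(sum(x * x for x in v.values()))
            return num / den if den else 0.0

        tokens = "the cat sat on the mat while the dog sat on the rug".split()
        vecs = ppmi_weight(bigram_contexts(tokens))
        print(cosine(vecs["cat"], vecs["dog"]))  # high: cat/dog share contexts

    A full system would combine this bigram relation with syntactic dependency contexts, which is the combination the abstract reports as yielding higher-quality thesauri.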

    Generation, transport and focusing of high-brightness heavy ion beams

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Physics, 2006. Includes bibliographical references (p. 195-201). The Neutralized Transport Experiment (NTX) has been built at the Heavy Ion Fusion Virtual National Laboratory. NTX is the first successful integrated beam system experiment that explores various physical phenomena and determines the final spot size of a high-intensity ion beam on a scaled version of a Heavy Ion Fusion driver. The final spot size is determined by the conditions of the beam produced in the injector, the beam dynamics in the focusing lattice, and the plasma neutralization dynamics in the final transport. A high-brightness ion source using an aperturing technique delivers a 25 mA singly charged potassium ion beam at 300 keV with a normalized edge emittance of 0.05 π-mm-mr. The ion beam is injected into a large-bore magnetic quadrupole lattice, which produces a 20 mm radius beam converging at 20 mr. The converging ion beam is then injected into a plasma neutralization drift section, where it is compressed ballistically down to a 1 mm spot size. NTX provides the first experimental proof of plasma-neutralized ballistic transport of a space-charge-dominated ion beam, information about higher-order aberration effects on the spot size, validation of numerical tools based on excellent agreement between measurements and simulations over a broad parameter regime, and the development of new diagnostics to study the ion beam dynamics. Theoretical and experimental results are presented on the beam dynamics in the ion diode, the downstream quadrupole lattice, and the final neutralized transport. By Enrique Henestroza. Ph.D.
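
    The quoted figures can be sanity-checked with a back-of-envelope emittance calculation (this check is mine, not the thesis's): for a ballistically focused beam, the emittance-limited spot radius is roughly the unnormalized emittance divided by the convergence half-angle.

        # Rough consistency check of the abstract's numbers. Assumptions:
        # nonrelativistic K-39 ions, r_min ~ eps / theta at ballistic focus.
        from math import sqrt

        MC2 = 39 * 931.494e6    # K-39 rest energy, eV (approximate)
        T = 300e3               # kinetic energy, eV (from the abstract)
        EPS_N = 0.05            # normalized edge emittance, mm-mr
        THETA = 20.0            # convergence half-angle, mr

        beta = sqrt(2 * T / MC2)    # v/c; nonrelativistic since T << mc^2
        eps = EPS_N / beta          # unnormalized emittance, mm-mr (gamma ~ 1)
        r_min = eps / THETA         # emittance-limited spot radius, mm

        print(f"beta={beta:.2e}  eps={eps:.1f} mm-mr  r_min={r_min:.2f} mm")
        # beta ~ 4e-3, eps ~ 12 mm-mr, r_min ~ 0.6 mm: the same order as
        # the ~1 mm spot the experiment reports.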

    Sensitivity Analysis of the DARHT-II 2.5MV/2kA Diode

    No full text

    Analyse syntaxique probabiliste en dépendances : approches efficaces à large contexte avec ressources lexicales distributionnelles (Probabilistic dependency parsing: efficient large-context approaches with distributional lexical resources)

    This thesis explores ways to improve the accuracy and coverage of efficient statistical dependency parsing. We employ transition-based parsing with models learned using Support Vector Machines (Cortes and Vapnik, 1995), and our experiments are carried out on French. Transition-based parsing is very fast due to the computational efficiency of its underlying algorithms, which are based on a local optimization of attachment decisions. Our first research thread is thus to increase the syntactic context used. Starting from the arc-eager transition system (Nivre, 2008), we propose a variant that simultaneously considers multiple candidate governors for right-directed attachments. We also test parse correction, inspired by Hall and Novák (2005), which revises each attachment in a parse by considering multiple alternative governors in the local syntactic neighborhood. We find that multiple-candidate approaches slightly improve parsing accuracy overall, as well as for prepositional phrase attachment and coordination, two linguistic phenomena that exhibit high syntactic ambiguity.

    Our second research thread explores semi-supervised approaches for improving parsing accuracy and coverage. We test self-training within the journalistic domain, as well as for adaptation to the medical domain, using a two-stage parsing approach based on that of McClosky et al. (2006). We then turn to lexical modeling over a large corpus: we model generalized lexical classes to reduce data sparseness, and prepositional phrase attachment preferences to improve disambiguation. We find that semi-supervised approaches can sometimes improve parsing accuracy and coverage without increasing time complexity.
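
    As a concrete illustration of the framework the thesis starts from, here is a minimal sketch of the arc-eager transition system (Nivre, 2008); the `choose` policy is a stub standing in for the thesis's SVM-scored transition classifier, and all names are mine:

        # Arc-eager dependency parsing, toy version. Tokens are 1..n and
        # 0 is the artificial root; `arcs` maps dependent -> governor.
        def arc_eager_parse(n, choose):
            stack, buffer, arcs = [0], list(range(1, n + 1)), {}
            while buffer:
                action = choose(stack, buffer, arcs)  # learned in the thesis
                s, b = stack[-1], buffer[0]
                if action == "LEFT-ARC" and s and s not in arcs:
                    arcs[s] = b            # attach stack top under b, pop it
                    stack.pop()
                elif action == "RIGHT-ARC":
                    arcs[b] = s            # attach b under stack top, push b
                    stack.append(buffer.pop(0))
                elif action == "REDUCE" and s in arcs:
                    stack.pop()            # stack top already has a governor
                else:                      # SHIFT (also the fallback)
                    stack.append(buffer.pop(0))
            return arcs

        # A trivial always-RIGHT-ARC policy yields a right-branching chain:
        print(arc_eager_parse(3, lambda s, b, a: "RIGHT-ARC"))  # {1: 0, 2: 1, 3: 2}

    The multiple-candidate variant described above would, at each right-directed attachment, score several possible governors from the surrounding syntactic context rather than only the stack top.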