26,601 research outputs found

    MORSE: Semantic-ally Drive-n MORpheme SEgment-er

    Full text link
    We present in this paper a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. We validate our algorithm across datasets and present state-of-the-art results

    Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

    Get PDF
    The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state-of-affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations

    Multiscale Astronomical Image Processing Based on Nonlinear Partial Differential Equations

    Get PDF
    Astronomical applications of recent advances in the field of nonastronomical image processing are presented. These innovative methods, applied to multiscale astronomical images, increase signal-to-noise ratio, do not smear point sources or extended diffuse structures, and are thus a highly useful preliminary step for detection of different features including point sources, smoothing of clumpy data, and removal of contaminants from background maps. We show how the new methods, combined with other algorithms of image processing, unveil fine diffuse structures while at the same time enhance detection of localized objects, thus facilitating interactive morphology studies and paving the way for the automated recognition and classification of different features. We have also developed a new application framework for astronomical image processing that implements some recent advances made in computer vision and modern image processing, along with original algorithms based on nonlinear partial differential equations. The framework enables the user to easily set up and customize an image-processing pipeline interactively; it has various common and new visualization features and provides access to many astronomy data archives. Altogether, the results presented here demonstrate the first implementation of a novel synergistic approach based on integration of image processing, image visualization, and image quality assessment

    Paradigm Completion for Derivational Morphology

    Full text link
    The generation of complex derived word forms has been an overlooked problem in NLP; we fill this gap by applying neural sequence-to-sequence models to the task. We overview the theoretical motivation for a paradigmatic treatment of derivational morphology, and introduce the task of derivational paradigm completion as a parallel to inflectional paradigm completion. State-of-the-art neural models, adapted from the inflection task, are able to learn a range of derivation patterns, and outperform a non-neural baseline by 16.4%. However, due to semantic, historical, and lexical considerations involved in derivational morphology, future work will be needed to achieve performance parity with inflection-generating systems.Comment: EMNLP 201

    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Get PDF
    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic

    Removal of Confined Ionic Liquid from a Metal Organic Framework by Extraction with Molecular Solvents

    Get PDF
    This work was supported in part by NSF Grant No. CHE-1223988 and by EPSRC Grant No. EP/K00090X/1.Peer reviewedPostprin
    corecore