26,601 research outputs found
Recommended from our members
Multiple alignments of inflectional paradigms
Most models of inflectional morphology rely at their core on the identification of recurrent and diverging material across inflected forms. Across theoretical frameworks, this can be expressed in terms of morpheme segmentation, rules, processes, patterns or analogies.
Finding these recurrences in large structured lexicons is an important step in empirical computational morphology, where analyses are induced bottom-up from inflected forms. This can be done by aligning all the forms in each paradigm, a task of Multiple Sequence Alignments which is well known in other fields such as evolutionary biology and historical linguistics.
In this paper, we present the specific problems which arise when aligning inflected forms, provide a simple alignment format, define evaluation measures and compare two implemented methods on 13 inflectional lexicons. Our intent is to provide the conditions for the inter-operability of future systems, and for incremental improvements in this fundamental step for quantitative morphology
MORSE: Semantic-ally Drive-n MORpheme SEgment-er
We present in this paper a novel framework for morpheme segmentation which
uses the morpho-syntactic regularities preserved by word representations, in
addition to orthographic features, to segment words into morphemes. This
framework is the first to consider vocabulary-wide syntactico-semantic
information for this task. We also analyze the deficiencies of available
benchmarking datasets and introduce our own dataset that was created on the
basis of compositionality. We validate our algorithm across datasets and
present state-of-the-art results
Statistical parsing of morphologically rich languages (SPMRL): what, how and whither
The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state-of-affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations
Multiscale Astronomical Image Processing Based on Nonlinear Partial Differential Equations
Astronomical applications of recent advances in the field of nonastronomical image processing are presented. These innovative methods, applied to multiscale astronomical images, increase signal-to-noise ratio, do not smear point sources or extended diffuse structures, and are thus a highly useful preliminary step for detection of different features including point sources, smoothing of clumpy data, and removal of contaminants from background maps. We show how the new methods, combined with other algorithms of image processing, unveil fine diffuse structures while at the same time enhance detection of localized objects, thus facilitating interactive morphology studies and paving the way for the automated recognition and classification of different features. We have also developed a new application framework for astronomical image processing that implements some recent advances made in computer vision and modern image processing, along with original algorithms based on nonlinear partial differential equations. The framework enables the user to easily set up and customize an image-processing pipeline interactively; it has various common and new visualization features and provides access to many astronomy data archives. Altogether, the results presented here demonstrate the first implementation of a novel synergistic approach based on integration of image processing, image visualization, and image quality assessment
Paradigm Completion for Derivational Morphology
The generation of complex derived word forms has been an overlooked problem
in NLP; we fill this gap by applying neural sequence-to-sequence models to the
task. We overview the theoretical motivation for a paradigmatic treatment of
derivational morphology, and introduce the task of derivational paradigm
completion as a parallel to inflectional paradigm completion. State-of-the-art
neural models, adapted from the inflection task, are able to learn a range of
derivation patterns, and outperform a non-neural baseline by 16.4%. However,
due to semantic, historical, and lexical considerations involved in
derivational morphology, future work will be needed to achieve performance
parity with inflection-generating systems.Comment: EMNLP 201
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic
Removal of Confined Ionic Liquid from a Metal Organic Framework by Extraction with Molecular Solvents
This work was supported in part by NSF Grant No. CHE-1223988 and by EPSRC Grant No. EP/K00090X/1.Peer reviewedPostprin
- …