154,877 research outputs found

    Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging

    Get PDF
    We consider the construction of part-of-speech taggers for resource-poor languages. Recently, manually constructed tag dictionaries from Wiktionary and dictionaries projected via bitext have been used as type constraints to overcome the scarcity of annotated data in this setting. In this paper, we show that additional token constraints can be projected from a resource-rich source language to a resource-poor target language via word-aligned bitext. We present several models to this end; in particular a partially observed conditional random ïŹeld model, where coupled token and type constraints provide a partial signal for training. Averaged across eight previously studied Indo-European languages, our model achieves a 25% relative error reduction over the prior state of the art. We further present successful results on seven additional languages from different families, empirically demonstrating the applicability of coupled token and type constraints across a diverse set of languages

    Using Multi-Sense Vector Embeddings for Reverse Dictionaries

    Get PDF
    Popular word embedding methods such as word2vec and GloVe assign a single vector representation to each word, even if a word has multiple distinct meanings. Multi-sense embeddings instead provide different vectors for each sense of a word. However, they typically cannot serve as a drop-in replacement for conventional single-sense embeddings, because the correct sense vector needs to be selected for each word. In this work, we study the effect of multi-sense embeddings on the task of reverse dictionaries. We propose a technique to easily integrate them into an existing neural network architecture using an attention mechanism. Our experiments demonstrate that large improvements can be obtained when employing multi-sense embeddings both in the input sequence as well as for the target representation. An analysis of the sense distributions and of the learned attention is provided as well

    Definitions in ontologies

    Get PDF
    Definitions vary according to context of use and target audience. They must be made relevant for each context to fulfill their cognitive and linguistic goals. This involves adapting their logical structure, type of content, and form to each context of use. We examine from these perspectives the case of definitions in ontologies

    Towards OWL-based Knowledge Representation in Petrology

    Full text link
    This paper presents our work on development of OWL-driven systems for formal representation and reasoning about terminological knowledge and facts in petrology. The long-term aim of our project is to provide solid foundations for a large-scale integration of various kinds of knowledge, including basic terms, rock classification algorithms, findings and reports. We describe three steps we have taken towards that goal here. First, we develop a semi-automated procedure for transforming a database of igneous rock samples to texts in a controlled natural language (CNL), and then a collection of OWL ontologies. Second, we create an OWL ontology of important petrology terms currently described in natural language thesauri. We describe a prototype of a tool for collecting definitions from domain experts. Third, we present an approach to formalization of current industrial standards for classification of rock samples, which requires linear equations in OWL 2. In conclusion, we discuss a range of opportunities arising from the use of semantic technologies in petrology and outline the future work in this area.Comment: 10 pages. The paper has been accepted by OWLED2011 as a long presentatio

    Sense Tagging: Semantic Tagging with a Lexicon

    Full text link
    Sense tagging, the automatic assignment of the appropriate sense from some lexicon to each of the words in a text, is a specialised instance of the general problem of semantic tagging by category or type. We discuss which recent word sense disambiguation algorithms are appropriate for sense tagging. It is our belief that sense tagging can be carried out effectively by combining several simple, independent, methods and we include the design of such a tagger. A prototype of this system has been implemented, correctly tagging 86% of polysemous word tokens in a small test set, providing evidence that our hypothesis is correct.Comment: 6 pages, uses aclap LaTeX style file. Also in Proceedings of the SIGLEX Workshop "Tagging Text with Lexical Semantics

    An analysis of The Oxford Guide to practical lexicography (Atkins and Rundell 2008)

    Get PDF
    Since at least a decade ago, the lexicographic community at large has been demanding that a modern textbook be designed - one that Would place corpora in the centre of the lexicographic enterprise. Written by two of the most respected practising lexicographers, this book has finally arrived, and delivers on very many levels. This review article presents a critical analysis of its features
    • 

    corecore