    A Pattern Matching method for finding Noun and Proper Noun Translations from Noisy Parallel Corpora

    We present a pattern matching method for compiling a bilingual lexicon of nouns and proper nouns from unaligned, noisy parallel texts of Asian/Indo-European language pairs. Tagging information from only one of the two languages is used. Word frequency and position information for high- and low-frequency words are represented in two different vector forms for pattern matching. New anchor point finding and noise elimination techniques are introduced. We obtained a precision of 73.1%. We also show how the results can be used in the compilation of domain-specific noun phrases. Comment: 8 pages, uuencoded compressed postscript file. To appear in the Proceedings of the 33rd AC

    Onset-to-onset probability and gradient acceptability in Korean

    GO-WORDS: An Entropic Approach to Semantic Decomposition of Gene Ontology Terms

    The Gene Ontology (GO) has a large and growing number of terms that constitute its vocabulary. An entropy-based approach is presented to automate the characterization of the compositional semantics of GO terms. The motivation is to extend the machine-readability of GO and to offer insights for the continued maintenance and growth of GO. A prototype implementation illustrates the benefits of the approach.

    A Modular and Flexible Architecture for an Integrated Corpus Query System

    The paper describes the architecture of an integrated and extensible corpus query system developed at the University of Stuttgart and gives examples of some of the modules realized within this architecture. The modules form the core of a corpus workbench. Within the proposed architecture, information required for the evaluation of queries may be derived from different knowledge sources (the corpus text, databases, on-line thesauri) and by different means: either through direct lookup in a database or by calling external tools, which may infer the necessary information at the time of query evaluation. The information available and the method of information access can be stated declaratively and individually for each corpus, leading to a flexible, extensible and modular corpus workbench. Comment: 10 pages, uuencoded gzip'ped PostScript; presented at COMPLEX'9

    Optimality theory
