MBT: A Memory-Based Part of Speech Tagger-Generator
We introduce a memory-based approach to part of speech tagging. Memory-based
learning is a form of supervised learning based on similarity-based reasoning.
The part of speech tag of a word in a particular context is extrapolated from
the most similar cases held in memory. Supervised learning approaches are
useful when a tagged corpus is available as an example of the desired output of
the tagger. Based on such a corpus, the tagger-generator automatically builds a
tagger which is able to tag new text the same way, diminishing development time
for the construction of a tagger considerably. Memory-based tagging shares this
advantage with other statistical or machine learning approaches. Additional
advantages specific to a memory-based approach include (i) the relatively small
tagged corpus size sufficient for training, (ii) incremental learning, (iii)
explanation capabilities, (iv) flexible integration of information in case
representations, (v) its non-parametric nature, (vi) reasonably good results on
unknown words without morphological analysis, and (vii) fast learning and
tagging. In this paper we show that a large-scale application of the
memory-based approach is feasible: we obtain a tagging accuracy that is on a
par with that of known statistical approaches, and with attractive space and
time complexity properties when using {\em IGTree}, a tree-based formalism for
indexing and searching huge case bases. An additional advantage of IGTree is
that the optimal context size for disambiguation is computed dynamically.
Comment: 14 pages, 2 Postscript figures
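The memory-based extrapolation step described above can be sketched as a k-nearest-neighbour lookup over stored cases. The case representation (focus word, previous tag, next word), the feature weights, and the toy corpus below are illustrative assumptions, not the actual MBT/IGTree implementation:

```python
from collections import Counter

def make_case(words, tags, i):
    # A case pairs the focus word with a small disambiguation context:
    # the previous word's tag and the next word itself (assumed
    # features; MBT's real case representation is configurable).
    prev_tag = tags[i - 1] if i > 0 else "<s>"
    next_word = words[i + 1] if i + 1 < len(words) else "</s>"
    return (words[i], prev_tag, next_word)

def train(tagged_sentences):
    # Memory-based learning stores every training case verbatim,
    # with no abstraction over the corpus.
    memory = []
    for words, tags in tagged_sentences:
        for i in range(len(words)):
            memory.append((make_case(words, tags, i), tags[i]))
    return memory

def tag(memory, words, k=3):
    # Tag left to right; each tag is extrapolated from the k most
    # similar cases in memory. Similarity is weighted feature overlap,
    # with the focus word weighted highest (a stand-in for MBT's
    # information-gain feature weighting).
    weights = (4, 2, 1)
    tags = []
    for i in range(len(words)):
        case = (words[i],
                tags[i - 1] if i > 0 else "<s>",
                words[i + 1] if i + 1 < len(words) else "</s>")
        def score(stored):
            return sum(w for f, g, w in zip(case, stored[0], weights) if f == g)
        nearest = sorted(memory, key=score, reverse=True)[:k]
        tags.append(Counter(t for _, t in nearest).most_common(1)[0][0])
    return tags

corpus = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
          (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"])]
memory = train(corpus)
print(tag(memory, ["the", "cat", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Because the full training set is kept in memory, learning is incremental (appending new cases) and each decision can be explained by exhibiting its nearest neighbours, matching advantages (ii) and (iii) above.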
Semantic Ambiguity and Perceived Ambiguity
I explore some of the issues that arise when trying to establish a connection
between the underspecification hypothesis pursued in the NLP literature and
work on ambiguity in semantics and in the psychological literature. A theory of
underspecification is developed `from first principles', i.e., starting
from a definition of what it means for a sentence to be semantically ambiguous
and from what we know about the way humans deal with ambiguity. An
underspecified language is specified as the translation language of a grammar
covering sentences that display three classes of semantic ambiguity: lexical
ambiguity, scopal ambiguity, and referential ambiguity. The expressions of this
language denote sets of senses. A formalization of defeasible reasoning with
underspecified representations is presented, based on Default Logic. Some
issues to be confronted by such a formalization are discussed.
Comment: LaTeX, 47 pages. Uses tree-dvips.sty, lingmacros.sty, fullname.st
Preferential Multi-Context Systems
Multi-context systems (MCS) presented by Brewka and Eiter can be considered
as a promising way to interlink decentralized and heterogeneous knowledge
contexts. In this paper, we propose preferential multi-context systems (PMCS),
which provide a framework for incorporating a total preorder relation over
contexts in a multi-context system. In a given PMCS, its contexts are divided
into several parts according to the total preorder relation over them,
moreover, only information flows from a context to ones of the same part or
less preferred parts are allowed to occur. As such, the first preferred
parts of an PMCS always fully capture the information exchange between contexts
of these parts, and then compose another meaningful PMCS, termed the
-section of that PMCS. We generalize the equilibrium semantics for an MCS to
the (maximal) -equilibrium which represents belief states at least
acceptable for the -section of an PMCS. We also investigate inconsistency
analysis in PMCS and related computational complexity issues
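The part structure induced by the total preorder can be sketched directly. The context names, the rank encoding of the preorder (equal rank = equally preferred, lower rank = more preferred), and the function names below are assumptions for illustration, not the formal PMCS definitions:

```python
# A total preorder over contexts, encoded as a rank per context:
# lower rank means more preferred; equal ranks form one part.
rank = {"c1": 0, "c2": 0, "c3": 1, "c4": 2}

def parts(rank):
    # Divide the contexts into parts of equally preferred contexts,
    # ordered from most to least preferred.
    by_rank = {}
    for c, r in rank.items():
        by_rank.setdefault(r, set()).add(c)
    return [by_rank[r] for r in sorted(by_rank)]

def flow_allowed(src, dst, rank):
    # Information may flow from a context only to contexts in the
    # same part or in less preferred (higher-rank) parts.
    return rank[dst] >= rank[src]

def section(rank, k):
    # The k-section: the k most preferred parts. By the flow
    # restriction, these contexts exchange information only among
    # themselves, so they compose a meaningful PMCS of their own.
    return set().union(*parts(rank)[:k])
```

For the example preorder, `parts(rank)` yields three parts, `flow_allowed("c4", "c1", rank)` is rejected because `c1` is strictly more preferred than `c4`, and `section(rank, 2)` is closed under the allowed flows.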
Principle Based Semantics for HPSG
The paper presents a constraint based semantic formalism for HPSG. The
advantages of the formalism are shown with respect to a grammar for a fragment
of German that deals with (i) quantifier scope ambiguities triggered by
scrambling and/or movement and (ii) ambiguities that arise from the
collective/distributive distinction of plural NPs. The syntax-semantics
interface directly implements syntactic conditions on quantifier scoping and
distributivity. The construction of semantic representations is guided by
general principles governing the interaction between syntax and semantics. Each
of these principles acts as a constraint to narrow down the set of possible
interpretations of a sentence. Meanings of ambiguous sentences are represented
by single partial representations (so-called U(nderspecified) D(iscourse)
R(epresentation) S(tructure)s) to which further constraints can be added
monotonically to gain more information about the content of a sentence. There
is no need to build up a large number of alternative representations of the
sentence which are then filtered by subsequent discourse and world knowledge.
The advantage of UDRSs is not only that they allow for monotonic incremental
interpretation but also that they are equipped with truth conditions and a
proof theory that allows for inferences to be drawn directly on structures
where quantifier scope is not resolved.
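The monotonic narrowing of interpretations can be illustrated with a toy scope model: readings of an ambiguous sentence are the scope orders consistent with a set of constraints, and adding a constraint only removes readings. The quantifier labels and pairwise "outscopes" constraints below are assumed stand-ins for UDRS structure, not the formalism itself:

```python
from itertools import permutations

def readings(quants, constraints):
    # Enumerate scope orders (outermost quantifier first) consistent
    # with the constraints; a pair (a, b) means "a outscopes b".
    result = []
    for order in permutations(quants):
        pos = {q: i for i, q in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in constraints):
            result.append(order)
    return result

# "Every student read a book" with no scope constraints: two readings,
# represented by one underspecified constraint set rather than two
# alternative formulas.
ambiguous = readings(["every_student", "a_book"], [])

# Adding a syntactic or contextual constraint is monotonic: it narrows
# the set of interpretations without rebuilding any representation.
narrowed = readings(["every_student", "a_book"],
                    [("every_student", "a_book")])
```

This mirrors the UDRS strategy above: one partial representation stands for all readings, and discourse or world knowledge contributes constraints incrementally instead of filtering a list of fully built alternatives.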