Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity

Authors: Matwin, Stan; Nadeau, David; Turney, Peter D.
Publication date: 01/01/2006

In this paper, we propose a named-entity recognition (NER) system that addresses two major limitations frequently discussed in the field. First, the system requires no human intervention such as manually labeling training data or creating gazetteers. Second, the system can handle more than the three classical named-entity types (person, location, and organization). We describe the system’s architecture and compare its performance with a supervised system. We experimentally evaluate the system on a standard corpus, with the three classical named-entity types, and also on a new corpus, with a new named-entity type (car brands).

CiteSeerX

NRC Publications Archive

CogPrints Cognitive Sciences Eprint Archive

Parsing Argumentation Structures in Persuasive Essays

Authors: Gurevych, Iryna; Stab, Christian
Publication date: 22/07/2016

In this article, we present a novel approach for parsing argumentation structures. We identify argument components using sequence labeling at the token level and apply a new joint model for detecting argumentation structures. The proposed model globally optimizes argument component types and argumentative relations using integer linear programming. We show that our model considerably improves the performance of base classifiers and significantly outperforms challenging heuristic baselines. Moreover, we introduce a novel corpus of persuasive essays annotated with argumentation structures. We show that our annotation scheme and annotation guidelines successfully guide human annotators to substantial agreement. This corpus and the annotation guidelines are freely available to ensure reproducibility and to encourage future research in computational argumentation.

Comment: Under review in Computational Linguistics. First submission: 26 October 2015. Revised submission: 15 July 201

arXiv.org e-Print Archive

TUbiblio

Directory of Open Access Journals

TUdatalib Repository (TU Darmstadt)

Combination Strategies for Semantic Role Labeling

Authors: Carreras, X.; Comas, P. R.; Marquez, L.; Surdeanu, M.
Publication venue: AI Access Foundation
Publication date: 04/10/2011

This paper introduces and analyzes a battery of inference models for the problem of semantic role labeling: one based on constraint satisfaction, and several strategies that model the inference as a meta-learning problem using discriminative classifiers. These classifiers are developed with a rich set of novel features that encode proposition and sentence-level information. To our knowledge, this is the first work that: (a) performs a thorough analysis of learning-based inference models for semantic role labeling, and (b) compares several inference strategies in this context. We evaluate the proposed inference strategies in the framework of the CoNLL-2005 shared task using only automatically generated syntactic information. The extensive experimental evaluation and analysis indicate that all the proposed inference strategies are successful (they all outperform the current best results reported in the CoNLL-2005 evaluation exercise), but each of the proposed approaches has its advantages and disadvantages. Several important traits of a state-of-the-art SRL combination strategy emerge from this analysis: (i) individual models should be combined at the granularity of candidate arguments rather than at the granularity of complete solutions; (ii) the best combination strategy uses an inference model based on learning; and (iii) the learning-based inference benefits from max-margin classifiers and global feedback.

arXiv.org e-Print Archive

Crossref

Adaptive text mining: Inferring structure from sequences

Author: Witten, Ian H.
Publication venue: Elsevier BV
Publication date: 01/01/2004

Text mining is about inferring structure from sequences representing natural language text, and may be defined as the process of analyzing text to extract information that is useful for particular purposes. Although hand-crafted heuristics are a common practical approach for extracting information from text, a general, and generalizable, approach requires adaptive techniques. This paper studies the way in which the adaptive techniques used in text compression can be applied to text mining. It develops several examples: extraction of hierarchical phrase structures from text, identification of keyphrases in documents, locating proper names and quantities of interest in a piece of text, text categorization, word segmentation, acronym extraction, and structure recognition. We conclude that compression forms a sound unifying principle that allows many text mining problems to be tackled adaptively.

Research Commons@Waikato

SUBSTANTIVWÖRTER IN GERMAN

Author: Triono, Sulis
Publication date: 18/11/2014

This paper aims to describe Substantivwörter ’nouns’ in German. Substantivwörter can be: (1) Artikelwort ’article’, (2) Adjektiv vor sich ’adjective modifying a noun’, (3) ein weiteres Substantiv (als Attribut im Genitiv oder Präpositionalkasus) ’nominal functioning as an attribute in the genitive or a prepositional case’, and (4) substantivische Pronomina ’substantive pronouns’. There are six types of substantivische Pronomina: Personalpronomen ’personal pronoun’ such as ich, du, and er; Interrogativpronomen ’interrogative pronoun’ such as wer, was, and welche; Demonstrativpronomen ’demonstrative pronoun’ (dieser, jener, and ein solcher); Indefinitpronomen ’indefinite pronoun’ (einige, manche, and allen); Possessivpronomen ’possessive pronoun’ (wessen and wem); and Relativpronomen ’relative pronoun’ (ein Bild replaced by es or das). Substantivwörter function to express (1) Gattungsnamen and (2) Eigennamen. Gattungsnamen or Appellativa are used to name fruits and jobs. Gattungsnamen also name concrete objects such as Gold ’gold’ and Schnee ’snow’ and describe one’s character or personality, such as Härte ’hardness’ and Klugheit ’cleverness’. In addition, Gattungsnamen are used to express kinship relationships such as Onkel ’uncle’ and Grossvater ’grandfather’. Eigennamen ’proper nouns’ function to express Personennamen, that is, a person’s name comprising Vorname ’first name’ and Familienname ’family name’, such as Helmut Kohl. Eigennamen are used to name a place such as Goethes Haus, a mountain range such as die Alpen, and a country such as Deutschland. Eigennamen are also used for Produktnamen to name a building such as Humboldt-Universität, a book such as Goethes Faust, a painting such as the Mona Lisa, a ship such as the Titanic, and a song such as Mother by John Lennon.

Diponegoro University Institutional Repository

Access to recorded interviews: A research agenda

Authors: Heeren, W.F.L.; Jong, F.M.G. de; Oard, D.W.; Ordelman, R.J.F.
Publication venue: ACM
Publication date: 01/01/2008

Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

University of Twente Research Information

Recall of random and distorted positions: Implications for the theory of expertise.

Authors: Gobet, F.; Simon, H. A.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1996

This paper explores the question, important to the theory of expert performance, of the nature and number of chunks that chess experts hold in memory. It examines how memory contents determine players' abilities to reconstruct (a) positions from games, (b) positions distorted in various ways, and (c) random positions. Comparison of a computer simulation with a human experiment supports the usual estimate that chess Masters store some 50,000 chunks in memory. The observed impairment of recall when positions are modified by mirror image reflection implies that each chunk represents a specific pattern of pieces in a specific location. A good account of the results of the experiments is given by the template theory proposed by Gobet and Simon (in press) as an extension of Chase and Simon's (1973a) initial chunking proposal, and in agreement with other recent proposals for modification of the chunking theory (Richman, Staszewski & Simon, 1995) as applied to various recall tasks.

Brunel University Research Archive