Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity
In this paper, we propose a named-entity recognition (NER) system that addresses two major limitations frequently discussed in the field. First, the system requires no human intervention such as manually labeling training data or creating gazetteers. Second, the system can handle more than the three classical named-entity types (person, location, and organization). We describe the system’s architecture and compare its performance with a supervised system. We experimentally evaluate the system on a standard corpus, with the three classical named-entity types, and also on a new corpus, with a new named-entity type (car brands).
Parsing Argumentation Structures in Persuasive Essays
In this article, we present a novel approach for parsing argumentation
structures. We identify argument components using sequence labeling at the
token level and apply a new joint model for detecting argumentation structures.
The proposed model globally optimizes argument component types and
argumentative relations using integer linear programming. We show that our
model considerably improves the performance of base classifiers and
significantly outperforms challenging heuristic baselines. Moreover, we
introduce a novel corpus of persuasive essays annotated with argumentation
structures. We show that our annotation scheme and annotation guidelines
successfully guide human annotators to substantial agreement. This corpus and
the annotation guidelines are freely available to ensure reproducibility and
to encourage future research in computational argumentation.
Comment: Under review in Computational Linguistics. First submission: 26
October 2015. Revised submission: 15 July 201
Combination Strategies for Semantic Role Labeling
This paper introduces and analyzes a battery of inference models for the
problem of semantic role labeling: one based on constraint satisfaction, and
several strategies that model the inference as a meta-learning problem using
discriminative classifiers. These classifiers are developed with a rich set of
novel features that encode proposition and sentence-level information. To our
knowledge, this is the first work that: (a) performs a thorough analysis of
learning-based inference models for semantic role labeling, and (b) compares
several inference strategies in this context. We evaluate the proposed
inference strategies in the framework of the CoNLL-2005 shared task using only
automatically generated syntactic information. The extensive experimental
evaluation and analysis indicate that all the proposed inference strategies
are successful (they all outperform the current best results reported in the
CoNLL-2005 evaluation exercise), but each of the proposed approaches has its
advantages and disadvantages. Several important traits of a state-of-the-art
SRL combination strategy emerge from this analysis: (i) individual models
should be combined at the granularity of candidate arguments rather than at the
granularity of complete solutions; (ii) the best combination strategy uses an
inference model based on learning; and (iii) the learning-based inference
benefits from max-margin classifiers and global feedback.
Adaptive text mining: Inferring structure from sequences
Text mining is about inferring structure from sequences representing natural language text, and may be defined as the process of analyzing text to extract information that is useful for particular purposes. Although hand-crafted heuristics are a common practical approach for extracting information from text, a general, and generalizable, approach requires adaptive techniques. This paper studies the way in which the adaptive techniques used in text compression can be applied to text mining. It develops several examples: extraction of hierarchical phrase structures from text, identification of keyphrases in documents, locating proper names and quantities of interest in a piece of text, text categorization, word segmentation, acronym extraction, and structure recognition. We conclude that compression forms a sound unifying principle that allows many text mining problems to be tackled adaptively.
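As a rough illustration of the compression idea applied to text categorization, the sketch below assigns a document to the category whose sample text helps compress it most. This is not the paper's method: it substitutes Python's standard `zlib` (DEFLATE) for whatever adaptive compression models the authors use, and the function names and the tiny sample corpora are made up for the example.

```python
import zlib


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs for the UTF-8 encoding of `text`."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(document: str, corpora: dict) -> str:
    """Return the label whose sample text makes `document` cheapest to encode.

    The cost of a document under a category is approximated as the growth in
    compressed size when the document is appended to that category's text.
    """
    def extra_cost(category_text: str) -> int:
        return (compressed_size(category_text + " " + document)
                - compressed_size(category_text))

    return min(corpora, key=lambda label: extra_cost(corpora[label]))


# Hypothetical sample data, invented for this sketch.
corpora = {
    "sports": "the striker scored a late goal and "
              "the goalkeeper saved a penalty in the final match",
    "cooking": "simmer the onions in butter and "
               "stir the sauce until it thickens over low heat",
}

print(classify("the goalkeeper saved a penalty in the final match", corpora))
print(classify("stir the sauce until it thickens over low heat", corpora))
```

With realistic data the category samples would be far larger, and a general-purpose compressor with a small window such as DEFLATE is only a crude stand-in for a model trained per category.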
SUBSTANTIVWÖRTER IN GERMAN
This paper aims to describe Substantivwörter ’noun’ in German. Substantivwörter can
be: (1) Artikelwort ’article’, (2) Adjektiv vor sich ’adjective modifying a noun’, (3) ein
weiteres Substantiv (als Attribut im Genitiv oder Präpositionalkasus) ’nominal
functioning as an attribute in a genitive’, and (4) substantivische Pronomina
’substantive pronoun’. There are six types of Substantivische Pronomina, i.e.:
Personalpronomen ’personal pronoun’ such as ich, du, er; Interrogativpronomen
’interrogative pronoun’ such as wer, was, and welche; Demonstrativpronomen
’demonstrative pronoun’ (dieser, jener, and ein solcher); Indefinitpronomen ’indefinite
pronoun’ (einige, manche, and allen); Possessivpronomen ’possessive pronoun’ (wessen
and wem); and Relativpronomen ’relative pronoun’ (ein Bild replaced by es or das).
Substantivwörter function to express (1) Gattungsnamen and (2) Eigennamen.
Gattungsnamen or Appellativa are used to name fruits and jobs. Gattungsnamen
function to name concrete objects such as Gold ’gold’ and Schnee ’snow’ and to describe
one’s character or personality such as Härte ’hardness’ and Klugheit ’cleverness’. Besides,
Gattungsnamen are used to express kinship relationships such as Onkel ’uncle’ and
Grossvater ’grandfather’. Eigennamen ’proper nouns’ function to express
Personennamen, that is, a person’s name comprising Vorname ’first name’ and
Familienname ’family name’, such as Helmut Kohl. Eigennamen are used to name a
place such as Goethes Haus, a mountain range such as die Alpen, and a country such as
Deutschland. Eigennamen are also used for Produktnamen to name a building such as
Humboldt Universität, a book such as Goethes Faust, a painting such as Mona Lisa, a ship
such as Titanic, and a song title such as Mother by John Lennon.
Access to recorded interviews: A research agenda
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.
Recall of random and distorted positions: Implications for the theory of expertise.
This paper explores the question, important to the theory of expert performance, of the nature and number of chunks that chess experts hold in memory. It examines how memory contents determine players' abilities to reconstruct (a) positions from games, (b) positions distorted in various ways, and (c) random positions. Comparison of a computer simulation with a human experiment supports the usual estimate that chess Masters store some 50,000 chunks in memory. The observed impairment of recall when positions are modified by mirror-image reflection implies that each chunk represents a specific pattern of pieces in a specific location. A good account of the results of the experiments is given by the template theory proposed by Gobet and Simon (in press) as an extension of Chase and Simon's (1973a) initial chunking proposal, and in agreement with other recent proposals for modification of the chunking theory (Richman, Staszewski & Simon, 1995) as applied to various recall tasks.