
    Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity

    In this paper, we propose a named-entity recognition (NER) system that addresses two major limitations frequently discussed in the field. First, the system requires no human intervention such as manually labeling training data or creating gazetteers. Second, the system can handle more than the three classical named-entity types (person, location, and organization). We describe the system’s architecture and compare its performance with that of a supervised system. We experimentally evaluate the system on a standard corpus, with the three classical named-entity types, and also on a new corpus, with a new named-entity type (car brands).

    Parsing Argumentation Structures in Persuasive Essays

    In this article, we present a novel approach for parsing argumentation structures. We identify argument components using sequence labeling at the token level and apply a new joint model for detecting argumentation structures. The proposed model globally optimizes argument component types and argumentative relations using integer linear programming. We show that our model considerably improves the performance of base classifiers and significantly outperforms challenging heuristic baselines. Moreover, we introduce a novel corpus of persuasive essays annotated with argumentation structures. We show that our annotation scheme and annotation guidelines successfully guide human annotators to substantial agreement. This corpus and the annotation guidelines are freely available to ensure reproducibility and to encourage future research in computational argumentation. Comment: Under review in Computational Linguistics. First submission: 26 October 2015. Revised submission: 15 July 201

    Combination Strategies for Semantic Role Labeling

    This paper introduces and analyzes a battery of inference models for the problem of semantic role labeling: one based on constraint satisfaction, and several strategies that model the inference as a meta-learning problem using discriminative classifiers. These classifiers are developed with a rich set of novel features that encode proposition and sentence-level information. To our knowledge, this is the first work that: (a) performs a thorough analysis of learning-based inference models for semantic role labeling, and (b) compares several inference strategies in this context. We evaluate the proposed inference strategies in the framework of the CoNLL-2005 shared task using only automatically generated syntactic information. The extensive experimental evaluation and analysis indicate that all the proposed inference strategies are successful (they all outperform the current best results reported in the CoNLL-2005 evaluation exercise), but each of the proposed approaches has its advantages and disadvantages. Several important traits of a state-of-the-art SRL combination strategy emerge from this analysis: (i) individual models should be combined at the granularity of candidate arguments rather than at the granularity of complete solutions; (ii) the best combination strategy uses an inference model based on learning; and (iii) the learning-based inference benefits from max-margin classifiers and global feedback.

    Adaptive text mining: Inferring structure from sequences

    Text mining is about inferring structure from sequences representing natural language text, and may be defined as the process of analyzing text to extract information that is useful for particular purposes. Although hand-crafted heuristics are a common practical approach for extracting information from text, a general, and generalizable, approach requires adaptive techniques. This paper studies the way in which the adaptive techniques used in text compression can be applied to text mining. It develops several examples: extraction of hierarchical phrase structures from text, identification of keyphrases in documents, locating proper names and quantities of interest in a piece of text, text categorization, word segmentation, acronym extraction, and structure recognition. We conclude that compression forms a sound unifying principle that allows many text mining problems to be tackled adaptively.

    SUBSTANTIVWÖRTER IN GERMAN

    This paper aims to describe Substantivwörter ’nouns’ in German. Substantivwörter can be accompanied by or realized as: (1) Artikelwort ’article’, (2) Adjektiv vor sich ’adjective modifying a noun’, (3) ein weiteres Substantiv (als Attribut im Genitiv oder Präpositionalkasus) ’a further noun functioning as an attribute in the genitive or a prepositional case’, and (4) substantivische Pronomina ’substantive pronouns’. There are six types of substantivische Pronomina: Personalpronomen ’personal pronoun’ such as ich, du, er; Interrogativpronomen ’interrogative pronoun’ such as wer, was, and welche; Demonstrativpronomen ’demonstrative pronoun’ (dieser, jener, and ein solcher); Indefinitpronomen ’indefinite pronoun’ (einige, manche, and allen); Possessivpronomen ’possessive pronoun’ (wessen and wem); and Relativpronomen ’relative pronoun’ (ein Bild replaced by es or das). Substantivwörter function to express (1) Gattungsnamen ’common nouns’ and (2) Eigennamen ’proper nouns’. Gattungsnamen, or Appellativa, are used to name fruits and jobs. They also name concrete objects such as Gold ’gold’ and Schnee ’snow’, and describe character or personality traits such as Härte ’hardness’ and Klugheit ’cleverness’. In addition, Gattungsnamen are used to express kinship relationships such as Onkel ’uncle’ and Grossvater ’grandfather’. Eigennamen function as Personennamen, that is, a person’s name comprising Vorname ’first name’ and Familienname ’family name’, such as Helmut Kohl. Eigennamen are also used to name a place such as Goethes Haus, a mountain range such as die Alpen, and a country such as Deutschland. Finally, Eigennamen are used as Produktnamen to name a building such as Humboldt Universität, a book such as Goethes Faust, a painting such as Mona Lisa, a ship such as Titanic, and a song such as Mother by John Lennon.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues a coherent research agenda is proposed.