Developing a corpus of strategic conversation in The Settlers of Catan

Author: Afantenos S.
Asher N.
Benamara F.
Cadilhac A.
Degremont C.
Denis P.
Guhe M.
Keizer S.
Lascarides A.
Lemon O.
Muller P.
Paul S.
Rieser V.
Vieu L.
Publication date: 01/01/2012

We describe a dialogue model and an implemented annotation scheme for a pilot corpus of annotated online chats concerning bargaining negotiations in the game The Settlers of Catan. We will use this model and data to analyze how conversations proceed in the absence of strong forms of cooperativity, where agents have diverging motives. Here we concentrate on the description of our annotation scheme for negotiation dialogues, illustrated with our pilot data, and some perspectives for future research on the issue.

HAL - Lille 3

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Cross-lingual RST Discourse Parsing

Author: Braud Chloé
Coavoux Maximin
Søgaard Anders
Publication date: 01/01/2017

Discourse parsing is an integral part of understanding information flow and argumentative structure in documents. Most previous research has focused on inducing and evaluating models from the English RST Discourse Treebank. However, discourse treebanks for other languages exist, including Spanish, German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same underlying linguistic theory, but differ slightly in the way documents are annotated. In this paper, we present (a) a new discourse parser which is simpler than, yet competitive with, the state of the art for English (significantly better on 2/3 metrics), (b) a harmonization of discourse treebanks across languages, enabling us to present (c) what to the best of our knowledge are the first experiments on cross-lingual discourse parsing. Comment: To be published in EACL 2017, 13 pages

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

ANNIS: a linguistic database for exploring information structure

Author: Dipper Stefanie
Götze Michael
Stede Manfred
Wegst Tillmann
Publication date: 01/01/2004

In this paper, we discuss the design and implementation of our first version of the database "ANNIS" (ANNotation of Information Structure). For research based on empirical data, ANNIS provides a uniform environment for storing this data together with its linguistic annotations. A central database promotes standardized annotation, which facilitates interpretation and comparison of the data. ANNIS is used through a standard web browser and offers tier-based visualization of data and annotations, as well as search facilities that allow for cross-level and cross-sentential queries. The paper motivates the design of the system, characterizes its user interface, and provides an initial technical evaluation of ANNIS with respect to data size and query processing.

CiteSeerX

Hochschulschriftenserver - Universität Frankfurt am Main

Do peers see more in a paper than its authors?

Author: Divoli Anna
Hearst Marti
Nakov Preslav
Publication venue: eScholarship, University of California
Publication date: 01/01/2012

Recent years have shown a gradual shift in the content of biomedical publications that is freely accessible, from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that in citances (sentences containing citations to that article). We contrast the important points of an article as judged by its authors versus as seen by peers. Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects other citations and time have on the content of citances.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Sentiment and behaviour annotation in a corpus of dialogue summaries

Author: Alvares Alexandre Rossi
Carvalho Ariadne Maria Brito Rizzoni
Piwek Paul
Roman Norton Trevisan
Publication date: 01/01/2015

This paper proposes a scheme for sentiment annotation. We show how the task can be made tractable by focusing on one of the many aspects of sentiment: sentiment as it is recorded in behaviour reports of people and their interactions. Together with a number of measures for supporting the reliable application of the scheme, this allows us to obtain sufficient to good agreement scores (in terms of Krippendorff's alpha) on three key dimensions: polarity, evaluated party and type of clause. Evaluation of the scheme is carried out through the annotation of an existing corpus of dialogue summaries (in English and Portuguese) by nine annotators. Our contribution to the field is twofold: (i) a reliable multi-dimensional annotation scheme for sentiment in behaviour reports; and (ii) an annotated corpus that was used for testing the reliability of the scheme and which is made available to the research community.
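
As a rough illustration of the agreement statistic named in the abstract above, the following Python sketch computes Krippendorff's alpha for nominal labels from a coincidence matrix. The function name `krippendorff_alpha_nominal`, the input layout (one list of labels per annotated item) and the toy polarity labels are assumptions made for illustration; none of them come from the paper or its corpus.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists; each inner list holds the labels that the
    annotators gave to one item (None marks a missing annotation).
    """
    # Coincidence matrix: every ordered pair of labels given to the same
    # item by two different annotators, weighted by 1 / (m - 1).
    coincidences = Counter()
    for labels in units:
        values = [v for v in labels if v is not None]
        m = len(values)
        if m < 2:
            continue  # items with fewer than two labels cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per label and the overall number of pairable values.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    # Observed and expected disagreement (nominal metric: 0 if equal, else 1).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is ever used")
    return 1 - observed / expected


# Hypothetical toy data: four items, two annotators, polarity labels.
toy_units = [["pos", "pos"], ["neg", "neg"], ["pos", "neg"], ["neg", "neg"]]
print(round(krippendorff_alpha_nominal(toy_units), 3))
```

The sketch covers only the nominal case. Ordinal or interval dimensions would need a different distance function in place of the equal/unequal test.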

ZENODO

Open Research Online (The Open University)

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Harnessing Rhetorical Figures for Argument Mining: A Pilot Study in Relating Figures of Speech to Argument Structure

Author: Chesñevar
Chien
Gladkova
Grasso
Harris
Liu
Pallotta
Pang
Peldszus
Reed
Steen
Webber
Publication venue: IOS Press
Publication date: 01/01/2017

Crossref

University of Dundee Online Publications

Recognizing cited facts and principles in legal judgements

Author: Shulayeva Olga
Siddharthan Advaith
Wyner Adam
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

In common law jurisdictions, legal professionals cite facts and legal principles from precedent cases to support their arguments before the court for their intended outcome in a current case. This practice stems from the doctrine of stare decisis, where cases that have similar facts should receive similar decisions with respect to the principles. It is essential for legal professionals to identify such facts and principles in precedent cases, though this is a highly time-intensive task. In this paper, we present studies that demonstrate that human annotators can achieve reasonable agreement on which sentences in legal judgements contain cited facts and principles (respectively, κ=0.65 and κ=0.95 for inter- and intra-annotator agreement). We further demonstrate that it is feasible to automatically annotate sentences containing such legal facts and principles in a supervised machine learning framework based on linguistic features, reporting per category precision and recall figures of between 0.79 and 0.89 for classifying sentences in legal judgements as cited facts, principles or neither using a Bayesian classifier, with an overall κ of 0.72 with the human-annotated gold standard.
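
To make the κ figures in the abstract above concrete, here is a minimal Python sketch of Cohen's kappa for two annotators. It assumes that κ in the abstract denotes Cohen's kappa, which the listing does not state. The function name `cohen_kappa` and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items that received the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement, from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy annotations over eight sentences.
ann_1 = ["fact", "principle", "neither", "fact",
         "neither", "principle", "neither", "fact"]
ann_2 = ["fact", "principle", "neither", "neither",
         "neither", "principle", "fact", "fact"]
print(round(cohen_kappa(ann_1, ann_2), 3))
```

In this toy example the annotators agree on 6 of 8 sentences (0.75), but chance agreement is about 0.34, so kappa is noticeably lower than raw agreement.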

Aberdeen University Research

Crossref

Springer - Publisher Connector

Open Research Online (The Open University)

Cronfa at Swansea University

Reliability measurement without limits

Author: Carletta J.
Reidsma D.
Publication venue: MIT Press
Publication date: 01/01/2008

In computational linguistics, a reliability measurement of 0.8 on some statistic such as κ is widely thought to guarantee that hand-coded data is fit for purpose, with lower values suspect. We demonstrate that the main use of such data, machine learning, can tolerate data with a low reliability as long as any disagreement among human coders looks like random noise. When it does not, however, data can have a reliability of more than 0.8 and still be unsuitable for use: the disagreement may indicate erroneous patterns that machine-learning can learn, and evaluation against test data that contain these same erroneous patterns may lead us to draw wrong conclusions about our machine-learning algorithms. Furthermore, lower reliability values still held as acceptable by many researchers, between 0.67 and 0.8, may even yield inflated performance figures in some circumstances. Although this is a common sense result, it has implications for how we work that are likely to reach beyond the machine-learning applications we discuss. At the very least, computational linguists should look for any patterns in the disagreement among coders and assess what impact they will have.
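
The closing recommendation, to look for patterns in coder disagreement, can be sketched in Python as a simple tabulation of which label pairs the coders disagree on. This is only an illustrative heuristic, not the authors' analysis. The helper names `disagreement_table` and `asymmetry`, and the toy labels, are assumptions.

```python
from collections import Counter


def disagreement_table(labels_a, labels_b):
    """Count each (coder A label, coder B label) pair where the coders differ."""
    return Counter((a, b) for a, b in zip(labels_a, labels_b) if a != b)


def asymmetry(table, x, y):
    """Count of 'A says x, B says y' minus count of 'A says y, B says x'.

    A value near zero is what symmetric, noise-like disagreement would give;
    a large absolute value hints that one coder systematically prefers a label.
    """
    return table[(x, y)] - table[(y, x)]


# Hypothetical toy codings: "q" = question, "s" = statement.
# Systematic case: coder B relabels most of coder A's questions as statements.
sys_a = ["q", "q", "q", "s", "s", "s", "q", "s"]
sys_b = ["s", "s", "s", "s", "s", "s", "q", "s"]

# Noise-like case: the two disagreements go in opposite directions.
noise_a = ["q", "s", "q", "s"]
noise_b = ["s", "q", "q", "s"]

print(asymmetry(disagreement_table(sys_a, sys_b), "q", "s"))
print(asymmetry(disagreement_table(noise_a, noise_b), "q", "s"))
```

A table like this does not replace a reliability statistic. It only shows where the disagreement is concentrated, which a single κ value hides.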

University of Twente Research Information

Parsing Argumentation Structures in Persuasive Essays

Author: Gurevych Iryna
Stab Christian
Publication date: 22/07/2016

In this article, we present a novel approach for parsing argumentation structures. We identify argument components using sequence labeling at the token level and apply a new joint model for detecting argumentation structures. The proposed model globally optimizes argument component types and argumentative relations using integer linear programming. We show that our model considerably improves the performance of base classifiers and significantly outperforms challenging heuristic baselines. Moreover, we introduce a novel corpus of persuasive essays annotated with argumentation structures. We show that our annotation scheme and annotation guidelines successfully guide human annotators to substantial agreement. This corpus and the annotation guidelines are freely available for ensuring reproducibility and to encourage future research in computational argumentation. Comment: Under review in Computational Linguistics. First submission: 26 October 2015. Revised submission: 15 July 201

arXiv.org e-Print Archive

TUbiblio

Directory of Open Access Journals

TUdatalib Repository (TU Darmstadt)

What and how do people ask in Russian on social question-answering services?

Author: Braslavski P.
Mukhin M.
Браславский П. И.
Мухин М. Ю.
Publication venue: Издательство РГГУ
Publication date: 01/01/2012

In our study we surveyed different approaches to the study of questions in traditional linguistics, question answering (QA), and, recently, in community question answering (CQA). We adapted a functional-semantic classification scheme for CQA data and manually labeled 2,000 questions in Russian originating from the [email protected] CQA service. About half of them are purely conversational and do not aim at obtaining actual information. In the subset of meaningful questions the major classes are requests for recommendations, or how-questions, and fact-seeking questions. The data demonstrate a variety of interrogative sentences as well as a host of formally non-interrogative expressions with the meaning of questions and requests. The observations can be of interest both for linguistics and for practical applications.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin