88,274 research outputs found
Anaphor Assisted Document-Level Relation Extraction
Document-level relation extraction (DocRE) involves identifying relations
between entities distributed in multiple sentences within a document. Existing
methods focus on building a heterogeneous document graph to model the internal
structure of an entity and the external interaction between entities. However,
there are two drawbacks in existing methods. On one hand, anaphor plays an
important role in reasoning to identify relations between entities but is
ignored by these methods. On the other hand, these methods achieve
cross-sentence entity interactions implicitly by utilizing a document or
sentences as intermediate nodes. Such an approach has difficulties in learning
fine-grained interactions between entities across different sentences,
resulting in sub-optimal performance. To address these issues, we propose an
Anaphor-Assisted (AA) framework for DocRE tasks. Experimental results on the
widely-used datasets demonstrate that our model achieves a new state-of-the-art
performance.Comment: Accepted to EMNLP 2023 Main Conferenc
Identifying Relationships Among Sentences in Court Case Transcripts Using Discourse Relations
Case Law has a significant impact on the proceedings of legal cases.
Therefore, the information that can be obtained from previous court cases is
valuable to lawyers and other legal officials when performing their duties.
This paper describes a methodology of applying discourse relations between
sentences when processing text documents related to the legal domain. In this
study, we developed a mechanism to classify the relationships that can be
observed among sentences in transcripts of United States court cases. First, we
defined relationship types that can be observed between sentences in court case
transcripts. Then we classified pairs of sentences according to the
relationship type by combining a machine learning model and a rule-based
approach. The results obtained through our system were evaluated using human
judges. To the best of our knowledge, this is the first study where discourse
relationships between sentences have been used to determine relationships among
sentences in legal court case transcripts.Comment: Conference: 2018 International Conference on Advances in ICT for
Emerging Regions (ICTer
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification
Convolution kernels support the modeling of complex syntactic information in machine-learning tasks. However, such models are highly sensitive to the type and size of syntactic structure used. It is therefore an important challenge to automatically identify high impact sub-structures relevant to a given task. In this paper we present a systematic study investigating (combinations of) sequence and convolution kernels using different types of substructures in document-level sentiment classification. We show that minimal sub-structures extracted from constituency and dependency trees guided by a polarity lexicon show 1.45 point absolute improvement in accuracy over a bag-of-words classifier on a widely used sentiment corpus
Parsing Argumentation Structures in Persuasive Essays
In this article, we present a novel approach for parsing argumentation
structures. We identify argument components using sequence labeling at the
token level and apply a new joint model for detecting argumentation structures.
The proposed model globally optimizes argument component types and
argumentative relations using integer linear programming. We show that our
model considerably improves the performance of base classifiers and
significantly outperforms challenging heuristic baselines. Moreover, we
introduce a novel corpus of persuasive essays annotated with argumentation
structures. We show that our annotation scheme and annotation guidelines
successfully guide human annotators to substantial agreement. This corpus and
the annotation guidelines are freely available for ensuring reproducibility and
to encourage future research in computational argumentation.Comment: Under review in Computational Linguistics. First submission: 26
October 2015. Revised submission: 15 July 201
Web based knowledge extraction and consolidation for automatic ontology instantiation
The Web is probably the largest and richest information repository available today. Search engines are the common access routes to this valuable source. However, the role of these search engines is often limited to the retrieval of lists of potentially relevant documents. The burden of analysing the returned documents and identifying the knowledge of interest is therefore left to the user. The Artequakt system aims to deploy natural language tools to automatically ex-tract and consolidate knowledge from web documents and instantiate a given ontology, which dictates the type and form of knowledge to extract. Artequakt focuses on the domain of artists, and uses the harvested knowledge to gen-erate tailored biographies. This paper describes the latest developments of the system and discusses the problem of knowledge consolidation
- ā¦