55,537 research outputs found
Qualitative System Identification from Imperfect Data
Experience in the physical sciences suggests that the only realistic means of
understanding complex systems is through the use of mathematical models.
Typically, this has come to mean the identification of quantitative models
expressed as differential equations. Quantitative modelling works best when the
structure of the model (i.e., the form of the equations) is known; and the
primary concern is one of estimating the values of the parameters in the model.
For complex biological systems, the model-structure is rarely known and the
modeler has to deal with both model-identification and parameter-estimation. In
this paper we are concerned with providing automated assistance to the first of
these problems. Specifically, we examine the identification by machine of the
structural relationships between experimentally observed variables. These
relationship will be expressed in the form of qualitative abstractions of a
quantitative model. Such qualitative models may not only provide clues to the
precise quantitative model, but also assist in understanding the essence of
that model. Our position in this paper is that background knowledge
incorporating system modelling principles can be used to constrain effectively
the set of good qualitative models. Utilising the model-identification
framework provided by Inductive Logic Programming (ILP) we present empirical
support for this position using a series of increasingly complex artificial
datasets. The results are obtained with qualitative and quantitative data
subject to varying amounts of noise and different degrees of sparsity. The
results also point to the presence of a set of qualitative states, which we
term kernel subsets, that may be necessary for a qualitative model-learner to
learn correct models. We demonstrate scalability of the method to biological
system modelling by identification of the glycolysis metabolic pathway from
data
Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. Is it important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
Parsing Argumentation Structures in Persuasive Essays
In this article, we present a novel approach for parsing argumentation
structures. We identify argument components using sequence labeling at the
token level and apply a new joint model for detecting argumentation structures.
The proposed model globally optimizes argument component types and
argumentative relations using integer linear programming. We show that our
model considerably improves the performance of base classifiers and
significantly outperforms challenging heuristic baselines. Moreover, we
introduce a novel corpus of persuasive essays annotated with argumentation
structures. We show that our annotation scheme and annotation guidelines
successfully guide human annotators to substantial agreement. This corpus and
the annotation guidelines are freely available for ensuring reproducibility and
to encourage future research in computational argumentation.Comment: Under review in Computational Linguistics. First submission: 26
October 2015. Revised submission: 15 July 201
Recommended from our members
Improving music genre classification using automatically induced harmony rules
We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing for each genre rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset, in which for each chord the degree and chord category are identified, and covering classical, jazz and pop genre classes. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 Ă— 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates
Recommended from our members
Improving music genre classification using automatically induced harmony rules
We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing for each genre rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset, in which for each chord the degree and chord category are identified, and covering classical, jazz and pop genre classes. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 Ă— 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates
The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants
Reasoning is a crucial part of natural language argumentation. To comprehend
an argument, one must analyze its warrant, which explains why its claim follows
from its premises. As arguments are highly contextualized, warrants are usually
presupposed and left implicit. Thus, the comprehension does not only require
language understanding and logic skills, but also depends on common sense. In
this paper we develop a methodology for reconstructing warrants systematically.
We operationalize it in a scalable crowdsourcing process, resulting in a freely
licensed dataset with warrants for 2k authentic arguments from news comments.
On this basis, we present a new challenging task, the argument reasoning
comprehension task. Given an argument with a claim and a premise, the goal is
to choose the correct implicit warrant from two options. Both warrants are
plausible and lexically close, but lead to contradicting claims. A solution to
this task will define a substantial step towards automatic warrant
reconstruction. However, experiments with several neural attention and language
models reveal that current approaches do not suffice.Comment: Accepted as NAACL 2018 Long Paper; see details on the front pag
- …