Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
Drilling activities in the oil and gas industry have been reported over
decades for thousands of wells on a daily basis, yet the analysis of this text
at large scale for information retrieval, sequence mining, and pattern analysis
is very challenging. Drilling reports contain interpretations that drillers
write based on measurements from downhole sensors and surface equipment,
and can be used for operation optimization and accident mitigation. In this
initial work, a methodology is proposed for automatic classification of
sentences written in drilling reports into three relevant labels (EVENT,
SYMPTOM and ACTION) for hundreds of wells in an actual field. The method
overcomes some of the main challenges in the text corpus, including the high
frequency of technical symbols, mistyped or abbreviated technical terms, and
incomplete sentences in the drilling reports. We obtain
state-of-the-art classification accuracy within this technical language and
illustrate advanced queries enabled by the tool.
Comment: 7 pages, 14 figures, technical report
Rude waiter but mouthwatering pastries! An exploratory study into Dutch aspect-based sentiment analysis
The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as aspect-based sentiment analysis. In this paper we present the first full aspect-based sentiment analysis pipeline for Dutch and apply it to customer reviews. For this purpose, we collected reviews from two different domains, namely restaurant and smartphone reviews. Both corpora have been manually annotated using newly developed guidelines that comply with standard practices in the field. For our experimental pipeline we treat aspect-based sentiment analysis as a task consisting of three main subtasks which have to be tackled incrementally: aspect term extraction, aspect category classification and polarity classification. First experiments on our Dutch restaurant corpus reveal that this is indeed a feasible approach that yields promising results.
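The three incremental subtasks can be illustrated with a toy sketch. This is not the authors' pipeline: the lexicons, the category names, the window-based polarity rule, and the English example sentence are all assumptions made up for illustration, standing in for trained classifiers on Dutch data.

```python
from dataclasses import dataclass

# Hypothetical toy lexicons, invented for this sketch only.
ASPECT_CATEGORIES = {"waiter": "SERVICE", "pastries": "FOOD"}
POLARITY_WORDS = {"rude": "negative", "mouthwatering": "positive"}


@dataclass
class AspectSentiment:
    term: str
    category: str
    polarity: str


def extract_aspect_terms(tokens):
    """Subtask 1: find aspect terms (here: a plain lexicon lookup)."""
    return [(i, tok) for i, tok in enumerate(tokens) if tok in ASPECT_CATEGORIES]


def classify_category(term):
    """Subtask 2: map an aspect term to an aspect category."""
    return ASPECT_CATEGORIES[term]


def classify_polarity(tokens, index, window=2):
    """Subtask 3: take the polarity of the nearest opinion word in a window."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    candidates = [
        (abs(j - index), POLARITY_WORDS[tokens[j]])
        for j in range(lo, hi)
        if tokens[j] in POLARITY_WORDS
    ]
    return min(candidates)[1] if candidates else "neutral"


def analyse(review):
    """Run the three subtasks in order, each building on the previous one."""
    tokens = [tok.strip("!.,").lower() for tok in review.split()]
    results = []
    for index, term in extract_aspect_terms(tokens):
        results.append(
            AspectSentiment(term, classify_category(term), classify_polarity(tokens, index))
        )
    return results


for item in analyse("Rude waiter but mouthwatering pastries!"):
    print(item)
```

The point of the sketch is the ordering: polarity is only assigned to terms that the first step extracted, so errors in earlier subtasks propagate to later ones.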
Parsing Argumentation Structures in Persuasive Essays
In this article, we present a novel approach for parsing argumentation
structures. We identify argument components using sequence labeling at the
token level and apply a new joint model for detecting argumentation structures.
The proposed model globally optimizes argument component types and
argumentative relations using integer linear programming. We show that our
model considerably improves the performance of base classifiers and
significantly outperforms challenging heuristic baselines. Moreover, we
introduce a novel corpus of persuasive essays annotated with argumentation
structures. We show that our annotation scheme and annotation guidelines
successfully guide human annotators to substantial agreement. This corpus and
the annotation guidelines are freely available to ensure reproducibility and
to encourage future research in computational argumentation.
Comment: Under review in Computational Linguistics. First submission: 26 October 2015. Revised submission: 15 July 201
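The idea of globally optimizing component types and relations together can be sketched on a tiny example. The sketch below is an assumption-laden stand-in, not the article's model: it replaces the integer linear program with exhaustive search, and the two labels, the scores, and the constraints (at least one claim, every premise supports exactly one other component, no cycles) are hypothetical.

```python
from itertools import product


def has_cycle(links):
    """Return True if following the outgoing links ever revisits a node."""
    for start in links:
        seen = {start}
        node = links[start]
        while node in links:
            if node in seen:
                return True
            seen.add(node)
            node = links[node]
    return False


def best_structure(type_scores, link_scores):
    """Jointly pick component types and links with the highest total score.

    type_scores: one dict per component, mapping "claim"/"premise" to a
                 score from a (hypothetical) base classifier.
    link_scores: dict mapping (source, target) to a relation score.
    """
    n = len(type_scores)
    best = (float("-inf"), None, None)
    for types in product(("claim", "premise"), repeat=n):
        if "claim" not in types:
            continue  # assumed constraint: at least one claim
        premises = [i for i, t in enumerate(types) if t == "premise"]
        options = [[j for j in range(n) if j != i] for i in premises]
        for targets in product(*options):
            links = dict(zip(premises, targets))
            if has_cycle(links):
                continue  # assumed constraint: relations form no cycles
            score = sum(type_scores[i][t] for i, t in enumerate(types))
            score += sum(link_scores.get(pair, 0.0) for pair in links.items())
            if score > best[0]:
                best = (score, types, links)
    return best


# Made-up scores for three components.
type_scores = [
    {"claim": 0.9, "premise": 0.1},
    {"claim": 0.4, "premise": 0.6},
    {"claim": 0.2, "premise": 0.8},
]
link_scores = {(1, 0): 0.7, (2, 0): 0.3, (2, 1): 0.6, (0, 1): 0.1}

score, types, links = best_structure(type_scores, link_scores)
print(types, links, round(score, 2))
```

Exhaustive search is only workable for a handful of components; an ILP solver expresses the same objective and constraints as linear inequalities over binary variables and scales much further.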