Cue Phrase Classification Using Machine Learning
Cue phrases may be used in a discourse sense to explicitly signal discourse
structure, but also in a sentential sense to convey semantic rather than
structural information. Correctly classifying cue phrases as discourse or
sentential is critical in natural language processing systems that exploit
discourse structure, e.g., for tasks such as anaphora resolution and
plan recognition. This paper explores the use of machine learning for
classifying cue phrases as discourse or sentential. Two machine learning
programs (Cgrendel and C4.5) are used to induce classification models from sets
of pre-classified cue phrases and their features in text and speech. Machine
learning is shown to be an effective technique not only for automating the
generation of classification models but also for improving upon previous
results. When compared to manually derived classification models already in the
literature, the learned models often perform with higher accuracy and contain
new linguistic insights into the data. In addition, the ability to
automatically construct classification models makes it easier to comparatively
analyze the utility of alternative feature representations of the data.
Finally, the ease of retraining makes the learning approach more scalable and
flexible than manual methods.
Comment: 42 pages, uses jair.sty, theapa.bst, theapa.sty
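The induction step described above can be sketched in code. The paper's learners (Cgrendel and C4.5) build rules and decision trees over features of pre-classified cue phrases; the toy ID3-style split below, with made-up boolean features and examples, illustrates the idea of choosing the most informative feature, not the paper's actual algorithm or data.

```python
# Hedged sketch: picking the most informative feature of cue phrase
# occurrences, in the spirit of C4.5's information-gain splits.
# Features and examples are illustrative assumptions.
from collections import Counter
import math

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_feature(examples, labels):
    """Return the boolean feature whose split yields the largest information gain."""
    base = entropy(labels)
    def gain(f):
        yes = [l for e, l in zip(examples, labels) if e[f]]
        no = [l for e, l in zip(examples, labels) if not e[f]]
        rem = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        return base - rem
    return max(examples[0], key=gain)

# Toy pre-classified cue phrase occurrences (textual features).
examples = [
    {"sentence_initial": True,  "after_comma": False},  # "Now, ..."
    {"sentence_initial": False, "after_comma": False},  # "... right now"
    {"sentence_initial": True,  "after_comma": False},  # "Well, ..."
    {"sentence_initial": False, "after_comma": True},   # "..., well ..."
]
labels = ["discourse", "sentential", "discourse", "sentential"]

print(best_feature(examples, labels))
```

On this toy data, sentence position perfectly separates the two senses, so it is selected as the split feature.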
An Empirical Analysis of the Role of Amplifiers, Downtoners, and Negations in Emotion Classification in Microblogs
The effect of amplifiers, downtoners, and negations has been studied in
general and particularly in the context of sentiment analysis. However, there
is only limited work which aims at transferring the results and methods to
discrete classes of emotions, e.g., joy, anger, fear, sadness, surprise, and
disgust. For instance, it is not straightforward to interpret which emotion
the phrase "not happy" expresses. With this paper, we aim at obtaining a better
understanding of such modifiers in the context of emotion-bearing words and
their impact on document-level emotion classification of microposts on
Twitter. We select an appropriate scope detection method for modifiers of
emotion words, incorporate it in a document-level emotion classification model
as an additional bag of words, and show that this approach improves the performance
of emotion classification. In addition, we build a term weighting approach
based on the different modifiers into a lexical model for the analysis of the
semantics of modifiers and their impact on emotion meaning. We show that
amplifiers separate emotions expressed with an emotion-bearing word more
clearly from other secondary connotations. Downtoners have the opposite effect.
In addition, we discuss the meaning of negations of emotion-bearing words. For
instance, we show empirically that "not happy" is closer to sadness than to
anger and that fear-expressing words in the scope of downtoners often express
surprise.
Comment: Accepted for publication at The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA), https://dsaa2018.isi.it
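The "additional bag of words" idea can be made concrete with a small sketch: tokens in the scope of a modifier are duplicated as marked features, so "not happy" contributes both "happy" and a negation-marked copy. The fixed two-token window and the word lists below are simplifying assumptions, not the scope detection method selected in the paper.

```python
# Hedged sketch of modifier-scoped bag-of-words features. The modifier
# lexicons and the fixed scope window are illustrative assumptions.
NEGATIONS = {"not", "no", "never"}
AMPLIFIERS = {"very", "really", "so"}
DOWNTONERS = {"slightly", "somewhat", "barely"}

def modifier_bow(tokens, window=2):
    """Return bag-of-words features plus marked copies of in-scope tokens."""
    feats = list(tokens)
    for i, tok in enumerate(tokens):
        prefix = ("NEG_" if tok in NEGATIONS else
                  "AMP_" if tok in AMPLIFIERS else
                  "DOWN_" if tok in DOWNTONERS else None)
        if prefix:
            # Mark tokens within the (assumed) scope window of the modifier.
            for t in tokens[i + 1:i + 1 + window]:
                feats.append(prefix + t)
    return feats

print(modifier_bow(["i", "am", "not", "happy", "today"]))
```

A document-level classifier can then treat the marked copies (e.g. "NEG_happy") as ordinary additional vocabulary items.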
A Machine Learning Approach to the Classification of Dialogue Utterances
The purpose of this paper is to present a method for automatic classification
of dialogue utterances and the results of applying that method to a corpus.
Superficial features of a set of training utterances (which we will call cues)
are taken as the basis for finding relevant utterance classes and for
extracting rules for assigning these classes to new utterances. Each cue is
assumed to partially contribute to the communicative function of an utterance.
Instead of relying on subjective judgments for the tasks of finding classes and
rules, we opt for using machine learning techniques to guarantee objectivity.
Comment: 12 pages, using nemlap.sty, harvard.sty and agsm.bst, to appear in Proceedings of NeMLaP-2, Bilkent University, Ankara, Turkey
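The "superficial features" (cues) described above can be illustrated with a small sketch. The specific cues below (initial word, final punctuation, utterance length) are assumptions for illustration, not the feature set actually used in the paper.

```python
# Hedged sketch of extracting surface cues from a dialogue utterance.
# The chosen cues are illustrative assumptions.
def surface_cues(utterance):
    tokens = utterance.lower().rstrip("?!.").split()
    return {
        "first_word": tokens[0] if tokens else "",
        "ends_question": utterance.rstrip().endswith("?"),
        "length": len(tokens),
    }

print(surface_cues("Can you repeat that?"))
```

Feature dictionaries of this kind can then feed either clustering (to find utterance classes) or rule induction (to assign classes to new utterances).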
Uncertainty Detection as Approximate Max-Margin Sequence Labelling
This paper reports experiments for the CoNLL 2010 shared task on learning to detect hedges and their scope in natural language text. We have addressed the experimental tasks as supervised linear maximum margin prediction problems. For sentence level hedge detection in the biological domain we use an L1-regularised binary support vector machine, while for sentence level weasel detection in the Wikipedia domain, we use an L2-regularised approach. We model the in-sentence uncertainty cue and scope detection task as an L2-regularised approximate maximum margin sequence labelling problem, using the BIO-encoding. In addition to surface level features, we use a variety of linguistic features based on a functional dependency analysis. A greedy forward selection strategy is used in exploring the large set of potential features.
Our official results for Task 1 are an F1-score of 85.2 for the biological domain and 55.4 for the Wikipedia set. For Task 2, our official result is 2.1 for the entire task, with a score of 62.5 for cue detection. After resolving errors and final bugs, our final results are, for Task 1, biological: 86.0 and Wikipedia: 58.2; for Task 2, scopes: 39.6 and cues: 78.5.
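The BIO encoding used for in-sentence cue and scope detection can be sketched briefly: each token is tagged B (begin), I (inside), or O (outside) relative to an annotated span. The example sentence and span indices below are illustrative, not taken from the shared-task data.

```python
# Hedged sketch of BIO encoding for span labelling. The sentence and
# span are illustrative assumptions.
def bio_tags(tokens, span_start, span_end):
    """Tag tokens B/I inside the half-open span [span_start, span_end), O elsewhere."""
    tags = []
    for i in range(len(tokens)):
        if i == span_start:
            tags.append("B")
        elif span_start < i < span_end:
            tags.append("I")
        else:
            tags.append("O")
    return tags

tokens = ["the", "results", "may", "indicate", "a", "role"]
print(bio_tags(tokens, 2, 4))  # hedge cue "may indicate"
```

A sequence labeller then predicts one such tag per token, turning span detection into a per-token classification problem.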
Exploring Different Dimensions of Attention for Uncertainty Detection
Neural networks with attention have proven effective for many natural
language processing tasks. In this paper, we develop attention mechanisms for
uncertainty detection. In particular, we generalize commonly used attention
mechanisms by introducing external attention and sequence-preserving attention.
These novel architectures differ from standard approaches in that they use
external resources to compute attention weights and preserve sequence
information. We compare them to other configurations along different dimensions
of attention. Our novel architectures set the new state of the art on a
Wikipedia benchmark dataset and perform similarly to the state-of-the-art model
on a biomedical benchmark which uses a large set of linguistic features.
Comment: accepted at EACL 2017
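The core of "external attention" can be sketched in a few lines: attention weights are computed from an external resource (here, a toy uncertainty-cue score per token) rather than from the hidden states themselves. The lexicon scores and toy vectors below are made up for illustration; in the paper's architecture these weights are part of a trained neural network.

```python
# Hedged sketch of external attention: weights come from external
# per-token scores, not from the hidden states. All values are toy
# illustrative assumptions.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def external_attention(hidden, external_scores):
    """Weighted sum of per-token hidden vectors, weights from external scores."""
    weights = softmax(external_scores)
    dim = len(hidden[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]

hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy per-token hidden states
scores = [0.1, 2.0, 0.1]                        # e.g. an uncertainty cue scores high
ctx = external_attention(hidden, scores)
print([round(c, 3) for c in ctx])
```

The high-scoring token dominates the resulting context vector, which is the intended effect of steering attention with an external resource.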