16,738 research outputs found
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
This paper describes the participation of the team "TwiSE" in the SemEval
2016 challenge. Specifically, we participated in Task 4, namely "Sentiment
Analysis in Twitter" for which we implemented sentiment classification systems
for subtasks A, B, C and D. Our approach consists of two steps. In the first
step, we generate and validate diverse feature sets for twitter sentiment
evaluation, inspired by the work of participants of previous editions of such
challenges. In the second step, we focus on the optimization of the evaluation
measures of the different subtasks. To this end, we examine different learning
strategies by validating them on the data provided by the task organisers. For
our final submissions we used an ensemble learning approach (stacked
generalization) for Subtask A and single linear models for the rest of the
subtasks. In the official leaderboard we were ranked 9/35, 8/19, 1/11 and 2/14
for subtasks A, B, C and D respectively.\footnote{We make the code available
for research purposes at
\url{https://github.com/balikasg/SemEval2016-Twitter\_Sentiment\_Evaluation}.
User Intent Prediction in Information-seeking Conversations
Conversational assistants are being progressively adopted by the general
population. However, they are not capable of handling complicated
information-seeking tasks that involve multiple turns of information exchange.
Due to the limited communication bandwidth in conversational search, it is
important for conversational assistants to accurately detect and predict user
intent in information-seeking conversations. In this paper, we investigate two
aspects of user intent prediction in an information-seeking setting. First, we
extract features based on the content, structural, and sentiment
characteristics of a given utterance, and use classic machine learning methods
to perform user intent prediction. We then conduct an in-depth feature
importance analysis to identify key features in this prediction task. We find
that structural features contribute most to the prediction performance. Given
this finding, we construct neural classifiers to incorporate context
information and achieve better performance without feature engineering. Our
findings can provide insights into the important factors and effective methods
of user intent prediction in information-seeking conversations.Comment: Accepted to CHIIR 201
Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
Convolutional neural network (CNN) is a neural network that can make use of
the internal structure of data such as the 2D structure of image data. This
paper studies CNN on text categorization to exploit the 1D structure (namely,
word order) of text data for accurate prediction. Instead of using
low-dimensional word vectors as input as is often done, we directly apply CNN
to high-dimensional text data, which leads to directly learning embedding of
small text regions for use in classification. In addition to a straightforward
adaptation of CNN from image to text, a simple but new variation which employs
bag-of-word conversion in the convolution layer is proposed. An extension to
combine multiple convolution layers is also explored for higher accuracy. The
experiments demonstrate the effectiveness of our approach in comparison with
state-of-the-art methods
Automatically extracting polarity-bearing topics for cross-domain sentiment classification
Joint sentiment-topic (JST) model was previously proposed to detect sentiment and topic simultaneously from text. The only supervision required by JST model learning is domain-independent polarity word priors. In this paper, we modify the JST model by incorporating word polarity priors through modifying the topic-word Dirichlet priors. We study the polarity-bearing topics extracted by JST and show that by augmenting the original feature space with polarity-bearing topics, the in-domain supervised classifiers learned from augmented feature representation achieve the state-of-the-art performance of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. Furthermore, using feature augmentation and selection according to the information gain criteria for cross-domain sentiment classification, our proposed approach performs either better or comparably compared to previous approaches. Nevertheless, our approach is much simpler and does not require difficult parameter tuning
A case study of predicting banking customers behaviour by using data mining
Data Mining (DM) is a technique that examines information stored in large database or data warehouse and find the patterns or trends in the data that are not yet known or suspected. DM techniques have been applied to a variety of different domains including Customer Relationship Management CRM). In this research, a new Customer Knowledge Management (CKM) framework based on data mining is proposed. The proposed data mining framework in this study manages relationships between banking organizations and their customers. Two typical data mining techniques - Neural Network and Association Rules - are applied to predict the behavior of customers and to increase the decision-making processes for recalling valued customers in banking industries. The experiments on the real world dataset are conducted and the different metrics are used to evaluate the performances of the two data mining models. The results indicate that the Neural Network model achieves better accuracy but takes longer time to train the model
- …