A Convolutional Neural Network for Modelling Sentences
The ability to accurately represent sentences is central to language
understanding. We describe a convolutional architecture dubbed the Dynamic
Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of
sentences. The network uses Dynamic k-Max Pooling, a global pooling operation
over linear sequences. The network handles input sentences of varying length
and induces a feature graph over the sentence that is capable of explicitly
capturing short and long-range relations. The network does not rely on a parse
tree and is easily applicable to any language. We test the DCNN in four
experiments: small scale binary and multi-class sentiment prediction, six-way
question classification and Twitter sentiment prediction by distant
supervision. The network achieves excellent performance in the first three
tasks and a greater than 25% error reduction in the last task with respect to
the strongest baseline.
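The abstract names Dynamic k-Max Pooling without describing it. As a rough illustration only, the sketch below assumes that k-max pooling keeps the k largest values of a sequence in their original order, and that k shrinks with network depth down to a fixed floor. The function names, the parameter names (`k_top`, `total_layers`) and the exact rule for choosing k are assumptions made for this example and are not stated in the abstract. The sketch pools a single row of activations and is not the authors' implementation.

```python
def k_max_pooling(values, k):
    """Return the k largest entries of `values`, kept in their original order."""
    if k >= len(values):
        return list(values)
    # Indices of the k largest values; ties go to the earlier position.
    top = sorted(range(len(values)), key=lambda i: (-values[i], i))[:k]
    # Re-sort the chosen indices so the output preserves sequence order.
    return [values[i] for i in sorted(top)]


def dynamic_k(layer, total_layers, sentence_length, k_top):
    """Assumed rule: k = max(k_top, ceil((total_layers - layer) / total_layers * sentence_length))."""
    numerator = (total_layers - layer) * sentence_length
    ceil_div = -(-numerator // total_layers)  # integer ceiling division
    return max(k_top, ceil_div)


# Hypothetical activations for one feature row of a six-word sentence.
row = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8]
k = dynamic_k(layer=1, total_layers=3, sentence_length=len(row), k_top=2)
pooled = k_max_pooling(row, k)
print(k, pooled)
```

Because the pooled values keep their relative order, later layers still see which feature came first, and because k depends on the sentence length, inputs of varying length can be handled.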
Organized Behavior Classification of Tweet Sets using Supervised Learning Methods
During the 2016 US elections Twitter experienced unprecedented levels of
propaganda and fake news through the collaboration of bots and hired persons,
the ramifications of which are still being debated. This work proposes an
approach to identify the presence of organized behavior in tweets. The Random
Forest, Support Vector Machine, and Logistic Regression algorithms are each
used to train a model with a data set of 850 records consisting of 299 features
extracted from tweets gathered during the 2016 US presidential election. The
features represent user and temporal synchronization characteristics to capture
coordinated behavior. These models are trained to classify tweet sets among the
categories: organic vs organized, political vs non-political, and pro-Trump vs
pro-Hillary vs neither. The random forest algorithm performs best, with
greater than 95% average accuracy and F-measure scores for each category. The
most valuable features for classification are user-based features, with media
use and marking tweets as favorites being the most dominant.
Comment: 51 pages, 5 figures