171 research outputs found
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
This paper describes the participation of the team "TwiSE" in the SemEval
2016 challenge. Specifically, we participated in Task 4, namely "Sentiment
Analysis in Twitter" for which we implemented sentiment classification systems
for subtasks A, B, C and D. Our approach consists of two steps. In the first
step, we generate and validate diverse feature sets for Twitter sentiment
evaluation, inspired by the work of participants of previous editions of such
challenges. In the second step, we focus on the optimization of the evaluation
measures of the different subtasks. To this end, we examine different learning
strategies by validating them on the data provided by the task organisers. For
our final submissions we used an ensemble learning approach (stacked
generalization) for Subtask A and single linear models for the rest of the
subtasks. On the official leaderboard we ranked 9/35, 8/19, 1/11 and 2/14
for subtasks A, B, C and D respectively. We make the code available
for research purposes at
https://github.com/balikasg/SemEval2016-Twitter_Sentiment_Evaluation.
Apprentissage Machine: de la théorie à la pratique
This book presents the scientific foundations of supervised learning theory, the most widely used algorithms developed in this field, and the two frameworks of semi-supervised learning and learning to rank, at a level accessible to master's and engineering students. We have taken care to provide a coherent exposition linking the theory to the algorithms developed in this area. The study does not stop at presenting these foundations, however: it also includes programs for some of the classical algorithms covered in the manuscript, written in C (a language that is both simple and popular), for readers who want to understand how these models, sometimes described as black boxes, actually work.
On a Topic Model for Sentences
Probabilistic topic models are generative models that describe the content of
documents by discovering the latent topics underlying them. However, the
structure of the textual input, for instance the grouping of words into
coherent text spans such as sentences, contains much information that is
generally lost with these models. In this paper, we propose sentenceLDA, an
extension of LDA whose goal is to overcome this limitation by incorporating the
structure of the text in the generative and inference processes. We illustrate
the advantages of sentenceLDA by comparing it with LDA using both intrinsic
(perplexity) and extrinsic (text classification) evaluation tasks on different
text collections.
Multitask Learning for Fine-Grained Twitter Sentiment Analysis
Traditional sentiment analysis approaches tackle problems like ternary
(3-category) and fine-grained (5-category) classification by learning the tasks
separately. We argue that such classification tasks are correlated and we
propose a multitask approach based on a recurrent neural network that benefits
from learning them jointly. Our study demonstrates the potential of multitask
models for this type of problem and improves on the state-of-the-art results in
the fine-grained sentiment classification problem.
Comment: International ACM SIGIR Conference on Research and Development in
Information Retrieval 201