171 research outputs found

    TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification

    Full text link
    This paper describes the participation of the team "TwiSE" in the SemEval 2016 challenge. Specifically, we participated in Task 4, namely "Sentiment Analysis in Twitter" for which we implemented sentiment classification systems for subtasks A, B, C and D. Our approach consists of two steps. In the first step, we generate and validate diverse feature sets for twitter sentiment evaluation, inspired by the work of participants of previous editions of such challenges. In the second step, we focus on the optimization of the evaluation measures of the different subtasks. To this end, we examine different learning strategies by validating them on the data provided by the task organisers. For our final submissions we used an ensemble learning approach (stacked generalization) for Subtask A and single linear models for the rest of the subtasks. In the official leaderboard we were ranked 9/35, 8/19, 1/11 and 2/14 for subtasks A, B, C and D respectively.\footnote{We make the code available for research purposes at \url{https://github.com/balikasg/SemEval2016-Twitter\_Sentiment\_Evaluation}.

    Apprentissage Machine: de la théorie à la pratique

    No full text
    International audienceCet ouvrage présente les fondements scientifiques de la théorie de l'apprentissage supervisé, les algorithmes les plus répandus développés suivant ce domaine ainsi que les deux cadres de l'apprentissage semi-supervisé et de l'ordonnancement, à un niveau accessible aux étudiants de master et aux élèves ingénieurs. Nous avons eu ici le souci de fournir un exposé cohérent reliant la théorie aux algorithmes développés dans cette sphère. Mais cette étude ne se limite pas à présenter ces fondements, vous trouverez ainsi quelques programmes des algorithmes classiques proposés dans ce manuscrit, écrits en langage C (langage à la fois simple et populaire), et à destination des lecteurs qui cherchent à connaître le fonctionnement de ces modèles désignés parfois comme des boîtes noires

    On a Topic Model for Sentences

    Full text link
    Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text spans such as sentences, contains much information which is generally lost with these models. In this paper, we propose sentenceLDA, an extension of LDA whose goal is to overcome this limitation by incorporating the structure of the text in the generative and inference processes. We illustrate the advantages of sentenceLDA by comparing it with LDA using both intrinsic (perplexity) and extrinsic (text classification) evaluation tasks on different text collections

    Multitask Learning for Fine-Grained Twitter Sentiment Analysis

    Get PDF
    Traditional sentiment analysis approaches tackle problems like ternary (3-category) and fine-grained (5-category) classification by learning the tasks separately. We argue that such classification tasks are correlated and we propose a multitask approach based on a recurrent neural network that benefits by jointly learning them. Our study demonstrates the potential of multitask models on this type of problems and improves the state-of-the-art results in the fine-grained sentiment classification problem.Comment: International ACM SIGIR Conference on Research and Development in Information Retrieval 201
    • …
    corecore