264 research outputs found

    Detection of stance and sentiment modifiers in political blogs

    The automatic detection of seven types of modifiers was studied: Certainty, Uncertainty, Hypotheticality, Prediction, Recommendation, Concession/Contrast and Source. A classifier aimed at detecting local cue words that signal the categories was the most successful method for five of the categories. For Prediction and Hypotheticality, however, better results were obtained with a classifier trained on tokens and bigrams present in the entire sentence. Unsupervised cluster features were shown to be useful for the categories Source and Uncertainty when only a subset of the available training data was used. However, when all of the 2,095 sentences that had been actively selected and manually annotated were used as training data, the cluster features had a very limited effect. Some of the classification errors made by the models could be avoided by extending the training data set, while other error types would require other features and feature representations, as well as the incorporation of pragmatic knowledge.
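
    The sentence-level representation mentioned in the abstract above (tokens and bigrams taken from the whole sentence) can be sketched as follows. This is a minimal illustration only: whitespace tokenisation, lowercasing and the function name `token_bigram_features` are assumptions made here, not the authors' feature extraction.

```python
from collections import Counter


def token_bigram_features(sentence):
    """Count lowercase tokens and adjacent-token bigrams in one sentence.

    Hypothetical helper: whitespace tokenisation is an assumption.
    """
    tokens = sentence.lower().split()
    features = Counter(tokens)
    # A bigram is a pair of neighbouring tokens, joined with a space.
    features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


features = token_bigram_features("It will probably rain tomorrow")
```

    The resulting counts could then be passed to any sentence classifier; a cue-word classifier would instead look only at individual local words.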

    How would Stance Detection Techniques Evolve after the Launch of ChatGPT?

    Stance detection refers to the task of extracting the standpoint (Favor, Against or Neither) towards a target in given texts. Such research is gaining increasing attention with the proliferation of social media content. The conventional framework for handling stance detection is to convert it into a text classification task. Deep learning models have already replaced rule-based models and traditional machine learning models in solving such problems. Current deep neural networks face two main challenges: insufficient labeled data and information in social media posts, and the unexplainable nature of deep learning models. A new pre-trained language model, ChatGPT, was launched on Nov 30, 2022. For stance detection tasks, our experiments show that ChatGPT can achieve SOTA or similar performance on commonly used datasets including SemEval-2016 and P-Stance. At the same time, ChatGPT can provide explanations for its own predictions, which is beyond the capability of any existing model. The explanations for the cases in which it cannot provide classification results are especially useful. ChatGPT has the potential to be the best AI model for stance detection tasks in NLP, or at least to change the research paradigm of this field. ChatGPT also opens up the possibility of building explanatory AI for stance detection.

    A survey on sentiment analysis in Urdu: A resource-poor language

    © 2020 Background/introduction: The dawn of the internet opened the doors to the easy and widespread sharing of information on subject matters such as products, services, events and political opinions. While the volume of studies conducted on sentiment analysis is rapidly expanding, these studies mostly address English language concerns. The primary goal of this study is to present a state-of-the-art survey identifying the progress and shortcomings of Urdu sentiment analysis and to propose rectifications. Methods: We described the advancements made thus far in this area by categorising the studies along three dimensions, namely: text pre-processing, lexical resources and sentiment classification. The pre-processing operations include word segmentation, text cleaning, spell checking and part-of-speech tagging. An evaluation of sophisticated lexical resources including corpora and lexicons was carried out, and investigations were conducted on sentiment analysis constructs such as opinion words, modifiers and negations. Results and conclusions: Performance is reported for each of the reviewed studies. The experimental results and proposals put forward in this paper provide the groundwork for further studies on Urdu sentiment analysis.

    Sentiment Analysis: An Overview from Linguistics

    Sentiment analysis is a growing field at the intersection of linguistics and computer science, which attempts to automatically determine the sentiment, or positive/negative opinion, contained in text. Sentiment can be characterized as positive or negative evaluation expressed through language. Common applications of sentiment analysis include the automatic determination of whether a review posted online (of a movie, a book, or a consumer product) is positive or negative towards the item being reviewed. Sentiment analysis is now a common tool in the repertoire of social media analysis carried out by companies, marketers and political analysts. Research on sentiment analysis extracts information from positive and negative words in text, from the context of those words, and from the linguistic structure of the text. This brief survey examines in particular the contributions that linguistic knowledge can make to the problem of automatically determining sentiment.

    Understanding misinformation on Twitter in the context of controversial issues

    Social media is slowly supplementing, or even replacing, traditional media outlets such as television, newspapers, and radio. However, social media presents some drawbacks when it comes to circulating information. These drawbacks include spreading false information, rumors, and fake news. At least three main factors create these drawbacks: the filter bubble effect, misinformation, and information overload. These factors make gathering accurate and credible information online very challenging, which in turn may affect public trust in online information. These issues are even more challenging when the issue under discussion is a controversial topic. In this thesis, four main controversial topics are studied, each of which comes from a different domain. This variation of domains can give a broad view of how misinformation is manifested in social media, and how it is manifested differently in different domains. This thesis aims to understand misinformation in the context of controversial issue discussions. This can be done through understanding how misinformation is manifested in social media as well as by understanding people’s opinions towards these controversial issues. In this thesis, three different aspects of a tweet are studied. These aspects are 1) the user sharing the information, 2) the information source shared, and 3) whether specific linguistic cues can help in assessing the credibility of information on social media. Finally, the web application tool TweetChecker is used to allow online users to have a more in-depth understanding of the discussions about five different controversial health issues. The results and recommendations of this study can be used to build solutions for the problem of trustworthiness of user-generated content on different social media platforms, especially for controversial issues.

    Improving multilingual sentiment analysis using linguistic knowledge

    The need for the automatic analysis of opinions in written texts, which has been growing in recent years in several domains, has made Sentiment Analysis a very popular field (Liu 2012). In this area, systems have traditionally classified sentences as positive or negative only according to the sentiment that words most frequently assume (e.g. “angry” negative, “beautiful” positive). Such strategies present two main limitations: 1. Multiple opinions often appear in the same sentence, with each expressing an opposing sentiment on different subjects (e.g. a positive opinion is expressed on the plot of a film, but a negative one on the actors' performance). 2. The most frequent sentiment, collected in sentiment dictionaries, does not take into account the fact that context often alters the orientation. Sentiment dictionaries have also been demonstrated to have small coverage (Di Bari, Sharoff et al. 2013, Di Bari 2015). As a consequence, I propose an automatic system based on deep linguistic knowledge given in particular by dependency parsing relations (Nivre 2005) and by attributes taken from the Appraisal framework (Martin and White 2005), a theory concerned with the language of evaluation, attitude and emotion within Systemic Functional Linguistics (Halliday 1978). As a basis for the creation of the automatic system, I tailored an annotation scheme called SentiML inspired by previous works (Whitelaw, Garg et al. 2005, Bloom, Garg et al. 2007, Bloom and Argamon 2009) and carried out the annotation task in three languages (English, Italian and Russian) by using MAE (Stubbs 2011). The resulting corpora consist of around 500 sentences and 9000 tokens for each language. The corpora contain both original texts and translations of different types: news, political speeches and TED talks (Cettolo, Girardi et al. 2012).
    The foundation of SentiML lies in the fact that an opinion can be captured in a pair usually consisting of two words with different functions: a target, the expression the sentiment refers to, and a modifier, the expression conveying the sentiment. The pair consisting of the target and the modifier together is called an appraisal group. Along with these main categories, the annotation includes their attributes, among which the most important are the appraisal type according to the Appraisal framework (‘affect’, ‘appreciation’, ‘judgement’) and the orientation (‘positive’ or ‘negative’, both out-of-context and contextual). A detailed manual analysis of the translation strategies (Baker 2002) and the appraisal types across the corpora, supported by insights from Corpus Linguistics, has been carried out. The most interesting expressions found during this analysis were afterwards analysed automatically as a further evaluation of the system. Nonetheless, the main evaluation consists of a comparison with a rule-based system that makes use of already existing tools such as a part-of-speech (POS) tagger and a sentiment dictionary. The main objective of this work is to demonstrate that the Appraisal framework and Sentiment Analysis can successfully support each other. The additional consideration that this has been done not only for English, but in parallel for Italian and Russian (and as one of the first applications of the Appraisal framework in these languages) and for different text types, makes the research unique. Moreover, because the methodology used to compare a variety of linguistic features (morphological, grammatical, lexical, syntactical) at work in sentiment analysis has been applied to three languages belonging to different families (Germanic, Romance and Slavonic), it is expected to be generalizable to other languages.
    As far as practical applications are concerned, the automatic system could be used in any field in which written opinions need to be analysed. In the meantime, the new individual resources such as the annotated corpora and the MaltParser models for Italian and Russian have been made publicly available.
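
    The first limitation described in this abstract, and the target/modifier pairing behind appraisal groups, can be illustrated with a toy sketch. Everything named here is an assumption for illustration: the dictionary entries in `PRIOR`, the helper names, and the hand-supplied (target, modifier) pairs, which in the thesis would come from dependency parsing rather than being listed by hand. It is not the SentiML system itself.

```python
from dataclasses import dataclass


@dataclass
class AppraisalGroup:
    target: str       # expression the sentiment refers to
    modifier: str     # expression conveying the sentiment
    orientation: str  # 'positive' or 'negative' (out-of-context here)


# Toy out-of-context sentiment dictionary; entries are illustrative only.
PRIOR = {"beautiful": "positive", "wooden": "negative"}


def sentence_label(tokens):
    """Label a whole sentence by summing dictionary orientations."""
    score = sum(1 if PRIOR[t] == "positive" else -1 for t in tokens if t in PRIOR)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def appraisal_groups(pairs):
    """Build one group per (target, modifier) pair whose modifier is in PRIOR."""
    return [AppraisalGroup(t, m, PRIOR[m]) for t, m in pairs if m in PRIOR]


tokens = "the plot is beautiful but the acting is wooden".split()
label = sentence_label(tokens)
groups = appraisal_groups([("plot", "beautiful"), ("acting", "wooden")])
```

    In this toy case the sentence-level score cancels out, whereas the two groups keep the positive opinion on the plot and the negative one on the acting apart. Contextual orientation, the second limitation, is not modelled in this sketch.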