    Viewpoint Discovery and Understanding in Social Networks

    The Web has evolved into a dominant platform where everyone has the opportunity to express their opinions, to interact with other users, and to debate emerging events happening around the world. On the one hand, this has enabled the presence of different viewpoints and opinions about a usually controversial topic (like Brexit); on the other hand, it has led to phenomena like media bias, echo chambers and filter bubbles, where users are exposed to only one point of view on the same topic. Therefore, there is a need for methods that are able to detect and explain the different viewpoints. In this paper, we propose a graph partitioning method that exploits social interactions to enable the discovery of different communities (representing different viewpoints) discussing a controversial topic in a social network like Twitter. To explain the discovered viewpoints, we describe a method, called Iterative Rank Difference (IRD), which allows detecting descriptive terms that characterize the different viewpoints as well as understanding how a specific term is related to a viewpoint (by detecting other related descriptive terms). The results of an experimental evaluation showed that our approach outperforms state-of-the-art methods on viewpoint discovery, while a qualitative analysis of the proposed IRD method on three different controversial topics showed that IRD provides comprehensive and deep representations of the different viewpoints.
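The abstract does not spell out how IRD works, so the following is only a minimal sketch of the general idea of comparing term ranks across two communities, not the authors' method. It assumes whitespace tokenization and a single (non-iterative) pass; the function names `term_ranks` and `rank_difference` are hypothetical.

```python
from collections import Counter


def term_ranks(docs):
    """Map each term to its frequency rank (1 = most frequent) within docs."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {term: rank for rank, term in enumerate(ordered, start=1)}


def rank_difference(docs_a, docs_b, top_k=3):
    """Return the top_k terms ranked much higher in docs_a than in docs_b.

    Assumption: a term missing from a community gets a rank one past the
    size of the joint vocabulary, so it counts as "ranked last" there.
    """
    ranks_a, ranks_b = term_ranks(docs_a), term_ranks(docs_b)
    vocab = set(ranks_a) | set(ranks_b)
    worst = len(vocab) + 1
    # Positive difference: the term sits higher in community A than in B.
    diff = {t: ranks_b.get(t, worst) - ranks_a.get(t, worst) for t in vocab}
    ordered = sorted(vocab, key=lambda t: (-diff[t], t))
    return ordered[:top_k]


if __name__ == "__main__":
    # Toy, made-up posts standing in for two discovered communities.
    community_a = ["leave eu now", "leave means leave", "eu vote"]
    community_b = ["remain eu", "remain in eu vote", "vote remain"]
    print(rank_difference(community_a, community_b, top_k=1))
    print(rank_difference(community_b, community_a, top_k=1))
```

Terms that both communities use equally often (such as "eu" in the toy data) get a difference near zero, so they do not surface as descriptive of either side.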

    Applications of opinion mining to data journalism

    Master's dissertation, Natural Language Processing and Language Industries, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2013. Nowadays social media play a central role in everyday life. A huge volume of user-generated data circulates on online social networks, such as Twitter, with an extraordinary impact on the media industry and on users' everyday lives. More and more people use social networks from their computers and smartphones to share their emotions and opinions about what is happening in the world. Natural language processing and, in particular, sentiment analysis are key technologies for making sense of the news-related data that circulates on online social networks. The application of opinion mining to news-oriented user-generated content, such as news-linking tweets, can provide novel views on news audience behaviour and help to interpret the evolution of sentiments. Applying this capability in the social news-sphere makes it possible (i) to measure the impact of news on readers and (ii) to gather elements that contain stories. From a broad perspective, the main aim of this research is to face this challenge, that is, to explore how opinion mining (or sentiment analysis) can be adopted in the field of digital media and data-driven journalism.

    Text mining for central banks: handbook

    An Improved Machine Learning Approach to Analyze the Sentiment of the Movie Reviews Using IMDB dataset

    Sentiment analysis is a sub-domain of opinion mining that focuses on extracting people's emotions and opinions towards a particular topic from structured, semi-structured or unstructured textual data. In this paper, we focus on sentiment analysis of the IMDB movie review database. The novel approach in this work is an improved Naïve Bayes algorithm built with the help of TF-IDF (Term Frequency-Inverse Document Frequency). The comparison is carried out on datasets of different sizes, on the basis of parameters such as mean square error, accuracy, precision, recall and F1 score, and our work has shown better accuracy than other classification algorithms. Keywords: Review, Sentiment Analysis, Modern Information Retrieval, Opinion Mining, Classifier.
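The abstract does not describe how Naïve Bayes is "improved" with TF-IDF, so the sketch below shows only one common way to combine the two: a multinomial Naïve Bayes whose per-class term counts are replaced by summed TF-IDF weights. It is an illustration under that assumption, not the paper's algorithm; the class name `TfidfNaiveBayes`, the smoothed IDF formula and the toy reviews are all hypothetical choices.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class TfidfNaiveBayes:
    """Multinomial Naive Bayes with TF-IDF weights in place of raw counts."""

    def fit(self, docs, labels):
        tokenized = [tokenize(d) for d in docs]
        self.n_docs = len(tokenized)
        # Document frequency: number of documents containing each term.
        df = Counter(t for toks in tokenized for t in set(toks))
        # Smoothed IDF (assumed variant); always positive.
        self.idf = {t: math.log((1 + self.n_docs) / (1 + c)) + 1
                    for t, c in df.items()}
        self.vocab = set(df)
        self.class_docs = Counter(labels)
        # class -> term -> summed TF-IDF weight over that class's documents
        self.weights = defaultdict(Counter)
        for toks, label in zip(tokenized, labels):
            for term, tf in Counter(toks).items():
                self.weights[label][term] += tf * self.idf[term]
        self.totals = {c: sum(w.values()) for c, w in self.weights.items()}
        return self

    def predict(self, text):
        # Terms never seen in training are ignored.
        counts = Counter(t for t in tokenize(text) if t in self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.class_docs:
            score = math.log(self.class_docs[label] / self.n_docs)  # log prior
            denom = self.totals[label] + len(self.vocab)  # Laplace smoothing
            for term, tf in counts.items():
                likelihood = (self.weights[label][term] + 1) / denom
                score += tf * self.idf[term] * math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Made-up miniature reviews, not taken from the IMDB dataset.
    reviews = ["great wonderful film", "loved this great movie",
               "terrible boring film", "awful boring plot"]
    labels = ["pos", "pos", "neg", "neg"]
    model = TfidfNaiveBayes().fit(reviews, labels)
    print(model.predict("great movie"), model.predict("boring plot"))
```

Weighting by IDF reduces the influence of terms that appear in most reviews regardless of sentiment (such as "film" in the toy data), which is the usual motivation for pairing TF-IDF with Naïve Bayes.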

    Statistical models for the analysis of short user-generated documents: author identification for conversational documents

    In recent years, short user-generated documents have been gaining popularity on the Internet and attention in the research communities. Documents of this kind are generated by users of various online services: platforms for instant messaging, for real-time status posting, for discussion and for writing reviews. Each of these services allows users to generate written texts with particular properties, which might require specific algorithms to be analysed. In this dissertation we present our work, which aims at analysing this kind of document. We conducted qualitative and quantitative studies to identify the properties that might allow for characterising them. We compared the properties of these documents with the properties of standard documents employed in the literature, such as newspaper articles, and defined a set of characteristics that are distinctive of documents generated online. We also observed two classes within online user-generated documents: conversational documents and those involving group discussions. We later focused on the class of conversational documents, which are short and spontaneous. We created a novel collection of real conversational documents retrieved online (e.g. Internet Relay Chat) and distributed it as part of an international competition (PAN @ CLEF'12). The competition was about author characterisation, which is one of the possible studies of authorship attribution documented in the literature. Another field of study is authorship identification, which became our main topic of research. We approached the authorship identification problem in its closed-class variant. For each problem we employed documents from the collection we released and from a collection of Twitter messages, as representative of conversational or short user-generated documents. We proved the unsuitability of standard authorship identification techniques for conversational documents and proposed novel methods capable of reaching better accuracy rates. As opposed to standard methods, which worked well only for a few authors, the proposed technique allowed for reaching significant results even for hundreds of users.