Detecting and Monitoring Hate Speech in Twitter
Social media are sensors in the real world that can be used to measure the pulse of societies.
However, the massive and unfiltered feed of messages posted on social media is a phenomenon that
nowadays raises social alarm, especially when these messages contain hate speech targeted at a
specific individual or group. In this context, governments and non-governmental organizations
(NGOs) are concerned about the possible negative impact that these messages can have on individuals
or on society. In this paper, we present HaterNet, an intelligent system currently being used by
the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that
identifies and monitors the evolution of hate speech on Twitter. The contributions of this research
are manifold: (1) It introduces the first intelligent system that monitors and visualizes, using social
network analysis techniques, hate speech in social media. (2) It introduces a novel public dataset on
hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification
approaches based on different document representation strategies and text classification models. (4)
The best approach is a combined LSTM+MLP neural network that takes as input the
tweet’s word, emoji, and expression tokens’ embeddings enriched by the tf-idf, and obtains an area
under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the
literature.
The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union’s Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.
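As an aside on the AUC figure reported in the abstract above: the area under the ROC curve can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on made-up labels and scores. The function name `auc` and the toy data are illustrative assumptions, not part of HaterNet.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth classes (1 = hate speech, by assumption).
    scores: iterable of classifier scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): two positives, two negatives.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of examples, so it is only meant to make the definition concrete; a rank-based implementation would be preferable on a dataset of thousands of tweets.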
REVIEW CLASSIFICATION USING NATURAL LANGUAGE PROCESSING AND DEEP LEARNING
Sentiment analysis is an ongoing research area in the field of Natural Language Processing (NLP). In this project, I evaluate my methods against an Amazon Reviews dataset, which contains more than 100 thousand customer reviews. The project classifies the reviews using three methods: a sentiment score computed by matching the words of each review against the positive and negative words in the Opinion Lexicon dataset; the sentiment polarity scores produced by a Python library called TextBlob; and a trained neural network. I created a neural network model that learns from the review stars and then compared its performance against both the Opinion Lexicon and TextBlob classification methods. The Opinion Lexicon method reaches an accuracy of 64.38%, TextBlob reaches 65.71%, and the neural network model achieves 96.46%. The model could help brands by classifying future customer reviews as positive, negative, or neutral.
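The lexicon-based method described above can be sketched roughly as follows. The abstract does not give the exact scoring rule, so this assumes the simplest variant: count positive words, subtract the count of negative words, and classify by the sign of the result. The word sets are a tiny hypothetical stand-in for the Opinion Lexicon, and the function names are illustrative.

```python
import re

# Hypothetical miniature lexicon; the real Opinion Lexicon holds thousands of words.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "broken"}


def lexicon_score(text):
    """Return (number of positive words) - (number of negative words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos - neg


def classify(text):
    """Map the lexicon score to one of three classes by its sign."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label = classify("I love this, great product")
```

A scorer of this kind ignores negation and context ("not good" still counts as positive), which is one plausible reason a trained model can outperform it.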
A Study on Youth's Political Satisfaction: The Case of Portugal
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Marketing Intelligence.
Political brands have taken their campaigns to social media in an effort to keep up with the times and connect with younger users. Obama and Trump are examples of successful cases whose online presence translated into votes at the ballot box. The present study addresses youth engagement and satisfaction with Portuguese political brands on Twitter and young people's likelihood of voting for candidates and parties, considering the sentiment results of the last 3 election periods: presidential, local and legislative. The sentiment analysis showed that the various types of users express a negative sentiment towards most political brands. No positive sentiment was identified among the 29 analysed actors and parties. Most political brands with a neutral sentiment belonged to the left wing, for which young people showed stronger voting intentions. Liberal Initiative (IL) was an exception on the right wing, with the same results. Furthermore, the study concluded that it is not possible to predict election results through Twitter, but that this social medium gives a close idea of the users’ collective mind.
Preferential Politics
Preferential voting is a unique system of voting that, while enjoying popularity abroad, has yet to make a significant impact on American political culture. However, within the past few years, preferential voting has been adopted by a number of cities across the country and by the state of Maine. This dissertation examines the growing role of preferential voting in the United States, the impact of preferential voting on the electoral process, and the public’s perception of preferential voting. This project uses survey data and data collected through Twitter to demonstrate that preferential voting is generally popular with the electorate and reduces campaign negativity, but that it can confuse certain voters. Ultimately, this project demonstrates that preferential voting has the potential to address many of the complaints directed towards plurality voting.
Advisor: Kevin Smit
SKEWER: Sentiment Knowledge Extraction With Entity Recognition
The California state legislature introduces approximately 5,000 new bills each legislative session. While the legislative hearings are recorded on video, the recordings are not easily accessible to the public. The lack of official transcripts or summaries also increases the effort required to gain meaningful insight from those recordings. Therefore, the news media and the general population are largely oblivious to what transpires during legislative sessions.
Digital Democracy, a project started by the Cal Poly Institute for Advanced Technology and Public Policy, is an online platform created to bring transparency to the California legislature. It features a searchable database of state legislative committee hearings, with each hearing accompanied by a transcript that was generated by an internal transcription tool.
This thesis presents SKEWER, a pipeline for building a spoken-word knowledge graph from those transcripts. SKEWER uses a number of natural language processing tools to extract named entities, phrases, and sentiments from the transcript texts and aggregates the results of those tools into a graph database. The resulting graph can be queried to discover the positions of legislators, lobbyists, and the general public towards specific bills or topics, and how those positions are expressed in committee hearings. Several case studies are presented to illustrate the new knowledge that can be acquired from the knowledge graph.
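The aggregation step described above can be illustrated with a minimal sketch. It assumes the NLP tools have already produced (speaker, topic, sentiment) records, and it uses a plain Python dictionary in place of the graph database the thesis mentions. The record values, speaker and bill names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical output of upstream entity and sentiment extraction:
# (speaker entity, bill or topic entity, sentiment score in [-1, 1]).
records = [
    ("Speaker A", "Bill 1", 0.6),
    ("Speaker A", "Bill 1", 0.2),
    ("Speaker B", "Bill 1", -0.5),
    ("Speaker B", "Bill 2", 0.4),
]


def build_graph(records):
    """Aggregate records into weighted edges: (speaker, topic) -> mean sentiment."""
    sums = defaultdict(lambda: [0.0, 0])  # edge -> [running total, count]
    for speaker, topic, sentiment in records:
        edge = sums[(speaker, topic)]
        edge[0] += sentiment
        edge[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def positions_on(graph, topic):
    """Query the graph: each speaker's mean sentiment towards one topic."""
    return {speaker: s for (speaker, t), s in graph.items() if t == topic}


graph = build_graph(records)
bill_1_positions = positions_on(graph, "Bill 1")
```

In a real graph database the speakers and bills would be nodes and the sentiment an edge property, so the same query would be expressed in the database's own query language rather than as a dictionary filter.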