894 research outputs found

    Enhancing Twitter Data Analysis with Simple Semantic Filtering: Example in Tracking Influenza-Like Illnesses

    Systems that exploit publicly available user generated content such as Twitter messages have been successful in tracking seasonal influenza. We developed a novel filtering method for Influenza-Like-Illnesses (ILI)-related messages using 587 million messages from Twitter micro-blogs. We first filtered messages based on syndrome keywords from the BioCaster Ontology, an extant knowledge model of laymen's terms. We then filtered the messages according to semantic features such as negation, hashtags, emoticons, humor and geography. The data covered 36 weeks for the US 2009 influenza season from 30th August 2009 to 8th May 2010. Results showed that our system achieved the highest Pearson correlation coefficient of 98.46% (p-value<2.2e-16), an improvement of 3.98% over the previous state-of-the-art method. The results indicate that simple NLP-based enhancements to existing approaches to mine Twitter data can increase the value of this inexpensive resource. Comment: 10 pages, 5 figures, IEEE HISB 2012 conference, Sept 27-28, 2012, La Jolla, California, U
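
The two-stage idea in this abstract (keyword filtering, then a semantic filter such as negation, followed by a Pearson correlation against surveillance data) can be sketched roughly as follows. This is a minimal illustration only, not the authors' system: the keyword set, the negation cues, the three-token window, and the function names `is_ili_related` and `pearson` are all assumptions made for the example, whereas the paper draws its terms from the BioCaster Ontology.

```python
import math
import re

# Hypothetical keyword list (assumption); the paper uses BioCaster Ontology terms.
ILI_KEYWORDS = {"flu", "fever", "cough", "sore throat"}
# Hypothetical negation cues (assumption), checked in a short window before a keyword.
NEGATIONS = {"no", "not", "never", "without"}


def is_ili_related(message: str, window: int = 3) -> bool:
    """Return True if a keyword occurs with no negation cue just before it."""
    tokens = re.findall(r"[a-z']+", message.lower())
    for keyword in ILI_KEYWORDS:
        kw_tokens = keyword.split()
        n = len(kw_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == kw_tokens:
                preceding = set(tokens[max(0, i - window):i])
                if not (NEGATIONS & preceding):
                    return True
    return False


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

In use, one would count the messages passing `is_ili_related` per week and pass those weekly counts, together with the official weekly ILI rates, to `pearson`.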

    Detecting and Monitoring Hate Speech in Twitter

    Social Media are sensors in the real world that can be used to measure the pulse of societies. However, the massive and unfiltered feed of messages posted in social media is a phenomenon that nowadays raises social alarms, especially when these messages contain hate speech targeted to a specific individual or group. In this context, governments and non-governmental organizations (NGOs) are concerned about the possible negative impact that these messages can have on individuals or on the society. In this paper, we present HaterNet, an intelligent system currently being used by the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that identifies and monitors the evolution of hate speech in Twitter. The contributions of this research are many-fold: (1) It introduces the first intelligent system that monitors and visualizes, using social network analysis techniques, hate speech in Social Media. (2) It introduces a novel public dataset on hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification approaches based on different document representation strategies and text classification models. (4) The best approach consists of a combination of an LSTM+MLP neural network that takes as input the tweet’s word, emoji, and expression tokens’ embeddings enriched by the tf-idf, and obtains an area under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the literature. The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union’s Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.

    AI approaches to understand human deceptions, perceptions, and perspectives in social media

    Social media platforms have created virtual space for sharing user generated information, connecting, and interacting among users. However, there are research and societal challenges: 1) The users are generating and sharing the disinformation; 2) It is difficult to understand citizens' perceptions or opinions expressed on wide variety of topics; and 3) There are overloaded information and echo chamber problems without overall understanding of the different perspectives taken by different people or groups. This dissertation addresses these three research challenges with advanced AI and Machine Learning approaches. To address the fake news, as deceptions on the facts, this dissertation presents Machine Learning approaches for fake news detection models, and a hybrid method for topic identification, whether they are fake or real. To understand the user's perceptions or attitude toward some topics, this study analyzes the sentiments expressed in social media text. The sentiment analysis of posts can be used as an indicator to measure how topics are perceived by the users and how their perceptions as a whole can affect decision makers in government and industry, especially during the COVID-19 pandemic. It is difficult to measure the public perception of government policies issued during the pandemic. The citizen responses to the government policies are diverse, ranging from security or goodwill to confusion, fear, or anger. This dissertation provides a near real-time approach to track and monitor public reactions toward government policies by continuously collecting and analyzing Twitter posts about the COVID-19 pandemic. To address the social media's overwhelming number of posts, content echo-chamber, and information isolation issue, this dissertation provides a multiple view-based summarization framework where the same contents can be summarized according to different perspectives. This framework includes components of choosing the perspectives, and advanced text summarization approaches. The proposed approaches in this dissertation are demonstrated with a prototype system to continuously collect Twitter data about COVID-19 government health policies and provide analysis of citizen concerns toward the policies, and the data is analyzed for fake news detection and for generating multiple-view summaries

    Social World Sensing via Social Image Analysis from Social Media

    Social imagery, the visuals shared by users via various platforms and applications, may be analyzed to elicit something of massmind (and individual) thinking. This work involves the exploration of seven topics from various subject areas (global public health, environmentalism, human rights, political expression, and human predation) through social imagery and data from social media. The coding techniques involve manual coding, the integration of multiple social data streams, computational text analysis, data visualizations, and other combinations of approaches.

    Efficient Text Classification with Linear Regression Using a Combination of Predictors for Flu Outbreak Detection

    Early prediction of disease outbreaks and seasonal epidemics such as Influenza may reduce their impact on daily lives. Today, the web can be used for surveillance of diseases. Search engines and Social Networking Sites can be used to track trends of different diseases more quickly than government agencies such as the Centers for Disease Control and Prevention (CDC). Today, Social Networking Sites (SNS) are widely used by diverse demographic populations. Thus, SNS data can be used effectively to track disease outbreaks and provide necessary warnings. Although the generated data of microblogging sites is valuable for real time analysis and outbreak predictions, the volume is huge. Therefore, one of the main challenges in analyzing this huge volume of data is to find the best approach for accurate analysis in an efficient time. Many studies report only the accuracy of applying different machine learning approaches, without regard to analysis time. Current SNS-based flu detection and prediction frameworks apply conventional machine learning approaches that require lengthy training and testing, which is not the optimal solution for new outbreaks with new signs and symptoms. The aim of this study is to propose an efficient and accurate framework that uses SNS data to track disease outbreaks and provide early warnings, even for the newest outbreaks. The presented framework of outbreak prediction consists of three main modules: text classification, mapping, and linear regression for weekly flu rate predictions. The text classification module utilizes the features of sentiment analysis and predefined keyword occurrences. Various classifiers, including FastText and six conventional machine learning algorithms, are evaluated to identify the most efficient and accurate one for the proposed framework. The text classifiers have been trained and tested using a pre-labeled dataset of flu-related and unrelated Twitter postings. The selected text classifier is then used to classify over 8,400,000 tweet documents. The flu-related documents are then mapped on a weekly basis using a mapping module. Lastly, the mapped results are passed together with historical Centers for Disease Control and Prevention (CDC) data to a linear regression module for weekly flu rate predictions. The evaluation of flu tweet classification shows that FastText, together with the extracted features, has achieved accurate results with an F-measure value of 89.9% in addition to its efficiency. Therefore, FastText has been chosen to be the classification module to work together with the other modules in the proposed framework, including the linear regression module, for flu trend predictions. The prediction results are compared with the available recent data from CDC as the ground truth and show a strong correlation of 96.2%
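
The mapping and regression modules described in this abstract could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' framework: it assumes tweets have already been classified as flu-related, groups them by ISO week (the abstract does not say which week convention is used), and fits a one-predictor least-squares line from weekly tweet counts to weekly flu rates. The function names `weekly_counts`, `fit_line`, and `predict` are hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_counts(tweet_dates):
    """Map dates of flu-related tweets to counts per (ISO year, ISO week)."""
    counts = Counter()
    for d in tweet_dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted weekly flu rate for a given weekly flu-tweet count."""
    return slope * x + intercept
```

A caller would fit the line on past weeks, where both tweet counts and reported rates are known, and then apply `predict` to the count for the current week.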

    A Sentiment and Content Analysis of Twitter Content Regarding the use of Antibiotics in Livestock

    On January 1, 2017, the final rule of the Veterinary Feed Directive (VFD) was put into place requiring antibiotics approved for both humans and animals to be discontinued for growth promotion. This change was brought on by the role growth promoters in livestock production play in the development of antibiotic resistance. Antibiotic resistance increases the costs associated with human health care by increasing the length of stays in the hospital and requiring more intensive medical care for patients. The purpose of this study was to explore sentiment and characteristics of social media content and the characteristics of the key influencers whose opinions had the greatest amount of reach on social media in regard to antibiotic use in livestock and antibiotic resistance. Nuvi, a social media monitoring program, provided sentiment for each tweet and coded 64.8% of the content (n = 129) as negative compared to 38.2% (n = 76) humans coded as negative. The contrast between human coders and Nuvi indicates there could be discrepancies between how Nuvi codes content and the way a human might interpret the content. No key influencer discussed antibiotic use in livestock positively. Findings suggest agricultural communicators should not rely completely on the output from sentiment analysis programs to evaluate how the public discusses issues related to agriculture, particularly controversial issues. Further, agricultural communications practitioners should prioritize monitoring the content shared by key influencers in an effort to better understand the content being shared by the most influential users. Recommendations for future research are provided

    Signal Fusion and Semantic Similarity Evaluation for Social Media Based Adverse Drug Event Detection

    Recent advancements in pharmacovigilance tasks have shown the usage of social media as a resource to obtain real-time signals for drug surveillance. Researchers demonstrated a good potential for the detection of Adverse Drug Events (ADEs) using social media much earlier than the traditional reporting systems maintained by official regulatory authorities like the United States Food and Drug Administration (FDA). Existing automated drug surveillance systems have used various types of social media channels and search query logs for monitoring ADE signals. In this thesis, we address two key performance issues related to automated drug surveillance systems. The first is to improve the ADE signal detection by analyzing signals from multiple social media channels, and the second is usage of semantic similarity to evaluate ADE narratives detected by drug surveillance systems. Most current approaches for detecting ADEs from social media rely on a single channel: forums or microblogs or query logs. In this study we propose a new methodology to fuse signals from different social media channels. We use graphical causal models to discover potentially hidden connections between data channels, and then use such associations to generate signals for ADEs. Further, prior work has placed little emphasis on the language of healthcare consumers, which is often casual and informal in expressing health issues on social media. There is a high potential to miss the semantic similarity between ADE terms extracted from social media and terms from formal official narratives when the two sets of terms do not share exact text. Thus, we exhibit the usage of semantic similarity to enhance accuracy of detected ADEs, and evaluated similarity measurement algorithms developed over biomedical vocabularies in ADE surveillance domain. We experimented on a dataset of drugs which had FDA black box warnings with a retrospective analysis spanning years 2008 to 2015. The results show a better detection rate and an improved performance in terms of precision, recall and timeliness using our proposed methods

    Using sentiment analysis to evaluate the impact of the COVID-19 outbreak on Italy’s country reputation and stock market performance

    During the recent Coronavirus disease 2019 (COVID-19) outbreak, the microblogging service Twitter has been widely used to share opinions and reactions to events. Italy was one of the first European countries to be severely affected by the outbreak and to establish lockdown and stay-at-home orders, potentially leading to country reputation damage. We resort to sentiment analysis to investigate changes in opinions about Italy reported on Twitter before and after the COVID-19 outbreak. Using different lexicons-based methods, we find a breakpoint corresponding to the date of the first established case of COVID-19 in Italy that causes a relevant change in sentiment scores used as a proxy of the country’s reputation. Next, we demonstrate that sentiment scores about Italy are associated with the values of the FTSE-MIB index, the Italian Stock Exchange main index, as they serve as early detection signals of changes in the values of FTSE-MIB. Lastly, we evaluate whether different machine learning classifiers were able to determine the polarity of tweets posted before and after the outbreak with a different level of accuracy

    An Information Diffusion-Based Recommendation Framework for Micro-Blogging

    Micro-blogging is increasingly evolving from a daily chatting tool into a critical platform for individuals and organizations to seek and share real-time news updates during emergencies. However, seeking and extracting useful information from micro-blogging sites poses significant challenges due to the volume of the traffic and the presence of a large body of irrelevant personal messages and spam. In this paper, we propose a novel recommendation framework to overcome this problem. By analyzing information diffusion patterns among a large set of micro-blogs that play the role of emergency news providers, our approach selects a small subset as recommended emergency news feeds for regular users. We evaluate our diffusion-based recommendation framework on Twitter during the early outbreak of H1N1 Flu. The evaluation results show that our method results in more balanced and comprehensive recommendations compared to benchmark approaches

    Decision Making in Emergency Management: The Role of Social Media

    Researchers and practitioners alike recognise the importance of emergency management (EM) in limiting the adverse impacts of crisis events, as well as the promise of social media to support these efforts. Decision making, which is crucial to ensure the effective management of immediate, emerging, and sustained crises, is one facet of EM potentially affected by social media. While much research has investigated social media in a crisis context more generally, little is known thus far about what it means for EM decision making. In this paper, we investigate the current knowledge base of this phenomenon and infer from it factors that are crucial for its understanding. To this end, we propose an analytical framework of EM decision making based on previous work on complex problem solving and social media networks. We then systematically review and rethink existing research from a decision-centred point of view to identify and synthesise key findings that are relevant to the role of social media in the EM decision-making process. Finally, we outline the research gaps that need to be closed to arrive at a more comprehensive understanding of social media for EM decision support and to begin moving towards theoretically grounded explanations of the phenomenon