10 research outputs found

    Are black friday deals worth it? Mining twitter users' sentiment and behavior response

    The Black Friday event has become a global opportunity for marketing and companies’ strategies aimed at increasing sales. The present study aims to understand consumer behavior through the analysis of user-generated content (UGC) on social media with respect to the Black Friday 2018 offers published by the 23 largest technology companies in Spain. To this end, we analyzed Twitter-based UGC about companies’ offers using a three-step data text mining process. First, a Latent Dirichlet Allocation Model (LDA) was used to divide the sample into topics related to Black Friday. In the next step, sentiment analysis (SA) using Python was carried out to determine the feelings towards the identified topics and offers published by the companies on Twitter. Thirdly and finally, a data-text mining process called textual analysis (TA) was performed to identify insights that could help companies to improve their promotion and marketing strategies as well as to better understand the customer behavior on social media. The results show that consumers had positive perceptions of such topics as exclusive promotions (EP) and smartphones (SM); by contrast, topics such as fraud (FA), insults and noise (IN), and customer support (CS) were negatively perceived by customers. Based on these results, we offer guidelines to practitioners to improve their social media communication. Our results also have theoretical implications that can promote further research in this area

    Mathematical Modeling of Public Opinion using Traditional and Social Media

    With the growth of the internet, data from text sources has become increasingly available to researchers in the form of online newspapers, journals, and blogs. This data presents a unique opportunity to analyze human opinions and behaviors without soliciting the public explicitly. In this research, I utilize newspaper articles and the social media service Twitter to infer self-reported public opinions and awareness of climate change. Climate change is one of the most important and heavily debated issues of our time, and analyzing large-scale text surrounding this issue reveals insights into self-reported public opinion. First, I inquire about public discourse on both climate change and energy system vulnerability following two large hurricanes. I apply topic modeling techniques to a corpus of articles about each hurricane in order to determine how these topics were reported on in the post-event news media. Next, I perform sentiment analysis on a large collection of data from Twitter using a previously developed tool called the hedonometer. I use this sentiment scoring technique to investigate how the Twitter community reports feeling about climate change. Finally, I generalize the sentiment analysis technique to many other topics of global importance, and compare it to more traditional public opinion polling methods. I determine that since traditional public opinion polls have limited reach and high associated costs, text data from Twitter may be the future of public opinion polling.

    Seasonality Pattern of Suicides in the US – a Comparative Analysis of a Twitter Based Bad-mood Index and Committed Suicides

    Is it possible that a general negative social climate exists, which, on the one hand, manifests itself in the number of suicides committed, and on the other hand, appears in the content of tweets posted on social media? In this paper, we attempt to identify the seasonal and weekly patterns of suicide-related tweets and compare them with similar data on the ratios of committed suicides in the US. For this work we used the data stream freely provided by the online social networking site Twitter to collect geo-located tweets in the US. To calculate the negative-mood Twitter index, three terms (words) were used, related to suicide (suicide) and bad mood (depression, depressed). The raw daily occurrences of these three words were summed and then divided by the number of all geo-located daily tweets in the US collected in the framework of the project. The weekly fluctuation of the temporal distribution of suicides and the ratio of bad-mood messages on Twitter fit together well. Tweets show a more intense deterioration of mood on Sundays; this tendency is more moderate in the suicide data. Monthly data, however, are much more challenging to interpret, since the fluctuations show a tendency completely opposite to what we expected. We present two possible explanations for this unexpected result.
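The index described in this abstract (daily occurrences of the three terms divided by the number of geo-located tweets collected that day) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `bad_mood_index`, the tokenization by a simple regular expression, and the sample tweets are all assumptions made for the example.

```python
import re
from collections import Counter

# The three terms are named in the abstract; everything else is illustrative.
BAD_MOOD_TERMS = ("suicide", "depression", "depressed")


def bad_mood_index(tweets_by_day):
    """Return {day: occurrences of the three terms / number of tweets that day}."""
    index = {}
    for day, tweets in tweets_by_day.items():
        if not tweets:
            continue  # no tweets collected for this day, so the ratio is undefined
        occurrences = 0
        for text in tweets:
            # Assumed tokenization: lowercase alphabetic runs.
            tokens = Counter(re.findall(r"[a-z]+", text.lower()))
            occurrences += sum(tokens[term] for term in BAD_MOOD_TERMS)
        index[day] = occurrences / len(tweets)
    return index


# Hypothetical sample data, keyed by day.
sample = {
    "sunday": [
        "I feel depressed today",
        "nice weather",
        "depression and suicide awareness",
        "coffee time",
    ],
    "monday": ["hello", "world"],
}

daily_index = bad_mood_index(sample)
```

Comparing such a daily series by weekday or by month is what would then reveal the weekly and seasonal patterns the abstract discusses.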

    Cues disseminated by professional associations that represent 5 health care professions across 5 nations: lexical analysis of tweets

    Background: Collaboration across health care professions is critical in efficiently and effectively managing complex and chronic health conditions, yet interprofessional care does not happen automatically. Professional associations have a key role in setting a profession’s agenda, maintaining professional identity, and establishing priorities. The associations’ external communication is commonly undertaken through social media platforms, such as Twitter. Despite the valuable insights potentially available into professional associations through such communication, to date, their messaging has not been examined. Objective: This study aimed to identify the cues disseminated by professional associations that represent 5 health care professions spanning 5 nations. Methods: Using a back-iterative application programming interface methodology, public tweets were sourced from professional associations that represent 5 health care professions that have key roles in community-based health care: general practice, nursing, pharmacy, physiotherapy, and social work. Furthermore, the professional associations spanned Australia, Canada, New Zealand, the United Kingdom, and the United States. A lexical analysis was conducted of the tweets using Leximancer (Leximancer Pty Ltd) to clarify relationships within the discourse. Results: After completing a lexical analysis of 50,638 tweets, 7 key findings were identified. First, the discourse was largely devoid of references to interprofessional care. Second, there was no explicit discourse pertaining to physiotherapists. Third, although all the professions represented in this study support patients, discourse pertaining to general practitioners was most likely to be connected with that pertaining to patients. Fourth, tweets pertaining to pharmacists were most likely to be connected with discourse pertaining to latest and research. 
Fifth, tweets about social workers were unlikely to be connected with discourse pertaining to health or care. Sixth, notwithstanding a few exceptions, the findings across the different nations were generally similar, suggesting their generality. Seventh and last, tweets pertaining to physiotherapists were most likely to refer to discourse pertaining to profession. Conclusions: The findings indicate that health care professional associations do not use Twitter to disseminate cues that reinforce the importance of interprofessional care. Instead, they largely use this platform to emphasize what they individually deem to be important and advance the interests of their respective professions. Therefore, there is considerable opportunity for professional associations to assert how the profession they represent complements other health care professions and how the professionals they represent can enact interprofessional care for the benefit of patients and carers

    Predicting the Outcomes of Important Events based on Social Media and Social Network Analysis

    Twitter is a well-known social network website that lets users post their opinions about current affairs, share their social events, and interact with others. It has now become one of the largest sources of news, with over 200 million monthly active users. It is possible to predict the outcomes of events based on social networks using machine learning and big data analytics. The massive data available from social networks can be utilized to improve prediction efficacy and accuracy. Achieving high accuracy in predicting the outcomes of political events using Twitter data is a challenging problem. The focus of this thesis is to investigate novel approaches to predicting the outcomes of political events from social media and social networks. The first proposed method predicts election results based on Twitter data analysis. The method extracts and analyses sentiment information from microblogs to predict the popularity of candidates. Experimental results have shown its advantages over the existing method for predicting the outcomes of political events. The second proposed method also predicts election results from Twitter data, analysing sentiment information using term weighting and selection to predict the popularity of candidates. Scaling factors are used for different types of terms, which help to select informative terms more effectively and achieve better prediction results than the previous method. The third method proposed in this thesis represents the social network using network connectivity constructed from retweet data together with social media content, leading to a new approach to predicting the outcome of political events. Two approaches, whole-network and sub-network, have been developed and compared. Experimental results show that the sub-network approach, which constructs sub-networks based on different topics, outperformed the whole-network approach.

    Modern Survey Estimation with Social Media and Auxiliary Data

    Traditional survey methods have been successful for nearly a century, but recently response rates have been declining and costs have been increasing, making the future of survey science uncertain. At the same time, new media sources are generating new forms of data, population data is increasingly readily available, and sophisticated machine learning algorithms are being created. This dissertation uses modern data sources and tools to improve survey estimates and advance the field of survey science. We begin by exploring the challenges of using data from new media, demonstrating how relationships between social media data and survey responses can appear deceptively strong. We examine a previously observed relationship between sentiment of ``jobs" tweets and consumer confidence, performing a sensitivity analysis on how sentiment of tweets is calculated and sorting ``jobs" tweets into categories based on their content, concluding that the original observed relationship was merely a chance occurrence. Next we track the relationship between sentiment of ``Trump" tweets and presidential approval. We develop a framework to interpret the strength of this observed relationship by implementing placebo analyses, in which we perform the same analysis but with tweets assumed to be unrelated to presidential approval, concluding that our observed relationship is not strong. Failing to find a meaningful signal, we next propose following a set of users over time. For a set of politically active users, we are able to find evidence of a political signal in terms of frequency and sentiment of their tweets around the 2016 presidential election. In a given corpus of tweets, there are likely to be several topics present, which has the potential to introduce bias when using the corpus to track survey responses. To help discover and sort tweets into these topics, we create a clustering-based topic modeling algorithm. 
Using the entire corpus, we create distances between words based on how often they appear together in the same tweet, create distances between tweets based on the distance between words in the tweets, and perform clustering on the resulting distances. We show that this method is effective using a validation set of tweets and apply it to the corpus of tweets from politically active users and ``jobs" tweets. Finally, we use population auxiliary data and machine learning algorithms to improve survey estimates. We develop an imputation-based estimation method that produces an unbiased estimate of the mean response of a finite population from a simple random sample when population auxiliary data are available. Our method allows for any prediction function or machine learning algorithm to be used to predict the response for out-of-sample observations, and is therefore able to accommodate a high dimensional setting and all covariate types. Exact unbiasedness is guaranteed by estimating the bias of the prediction function using subsamples of the original simple random sample. Importantly, the unbiasedness property does not depend on the accuracy of the imputation method. We apply this estimation method to simulated data, college tuition data, and the American Community Survey.
PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163193/1/fergr_1.pd
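The first two steps of the clustering-based topic model described in this abstract (word distances from co-occurrence within tweets, then tweet distances from word distances) might look roughly like the sketch below. The abstract does not specify the distance definitions, so both choices here are assumptions: word distance is taken as one minus the Jaccard similarity of the sets of tweets containing each word, and tweet distance as the mean pairwise word distance. The function names and sample tweets are hypothetical, and the final clustering step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def word_distances(tweets):
    """Assumed word distance: 1 - Jaccard similarity of the tweet sets containing each word."""
    containing = defaultdict(set)
    for i, text in enumerate(tweets):
        for word in set(text.lower().split()):
            containing[word].add(i)
    dist = {}
    # Keys are stored as alphabetically ordered pairs (a, b) with a < b.
    for a, b in combinations(sorted(containing), 2):
        shared = len(containing[a] & containing[b])
        total = len(containing[a] | containing[b])
        dist[(a, b)] = 1.0 - shared / total
    return dist


def tweet_distance(t1, t2, dist):
    """Assumed tweet distance: mean word distance over all cross-tweet word pairs."""
    w1, w2 = set(t1.lower().split()), set(t2.lower().split())
    pairs = [(a, b) for a in w1 for b in w2]
    if not pairs:
        return 1.0  # an empty tweet is treated as maximally distant

    def d(a, b):
        if a == b:
            return 0.0
        # Words never seen in the corpus default to the maximum distance.
        return dist.get((min(a, b), max(a, b)), 1.0)

    return sum(d(a, b) for a, b in pairs) / len(pairs)


# Hypothetical sample corpus.
tweets = ["jobs report today", "jobs numbers today", "election night"]
dist = word_distances(tweets)
```

The resulting pairwise tweet distances could then be passed to any distance-based clustering routine, which is the step the abstract describes next.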

    Gauge against the machine: improving representations within sociotechnical instruments to enrich context and identify biases

    The proliferation of digital data across all areas of society has transformed our ability to hypothesize, study, and understand social systems. From this richness of data we have seen the development of innovative instruments to study, and make decisions with, the digital artifacts of the modern day. These developments build on advancements in computation, connectivity, analytical methodologies, and sociological theories. The sociotechnical instruments we have developed have been revolutionary to how we understand society and how we conduct business, but with these broad leaps comes ample room (and need) for more nuanced advancements. As with the development of any field, as the digital humanities evolve there is opportunity for targeted progress and the need for more tectonic shifts in practices. Iterative improvements include building more full-featured instruments that include a broader set of variables when analyzing and presenting results. More profound topics such as fairness, accountability, transparency, and ethics need increased attention as well, especially to create equitable, pro-social tools. Both in academia and in industry, there is room to improve how we curate, study, and operationalize data sets and the AI pipelines that sit atop them. Here we use natural language processing, machine learning, tools from data ethics, and other methods to explore how we can contextualize results and improve representations within instruments used to understand sociotechnical systems. In the first study we examine the dynamics of responses to posts by US presidents on Twitter. These results offer a piece of culturally significant data in themselves: the ratio of response types is an unofficial measurement on the platform. Moreover, the results improve our understanding of the temporal dynamics that lead to the final counts that users may ultimately see. 
Deeply analyzing response activity dynamics provides insights on how the public responds to posts, the tenacity of supporters, and abnormalities that may be indicative of inauthentic behavior. The second study examines the interaction between gender biases in health records and language models and how to mitigate these biases. We present specific language that is more commonly associated with female and male patients. We go on to demonstrate how the deliberate augmentation of text can minimize the gender signal present in data while retaining performance on medically relevant tasks. We conclude by showing how much of this bias is domain specific, and the non-trivial interaction with general-purpose language models. Our final study investigates gender bias in resume text and relates this bias to the gender wage-gap. We show that language differences within occupations are associated with the gender pay gap. Our results highlight the value of utilizing high dimensional representations of individuals and the potential for previously undocumented biases to influence hiring pipelines

    INNODOCT/17. International conference on innovation, documentation and education

    INNODOCT/17 aims to provide a forum for academics and practitioners to share their research and to discuss ideas, current projects, results, and challenges related to new information and communication technologies, innovations, and methodologies applied to education and research, in areas such as science, engineering, social sciences, economics, management, marketing, and also tourism and hospitality.
    Garrigós Simón, FJ.; Estelles Miguel, S.; Lengua Lengua, I.; Onofre Montesa, J.; Dema Pérez, CM.; Oltra Gutiérrez, JV.; Yeamduan Narangajavana... (2018). INNODOCT/17. International conference on innovation, documentation and education. Editorial Universitat Politècnica de València. http://hdl.handle.net/10251/107064