15,850 research outputs found
Reading the Source Code of Social Ties
Though online social network research has exploded during the past years, not
much thought has been given to the exploration of the nature of social links.
Online interactions have been interpreted as indicative of one social process
or another (e.g., status exchange or trust), often with little systematic
justification regarding the relation between observed data and theoretical
concept. Our research aims to bridge this gap in computational social science
by proposing an unsupervised, parameter-free method to discover, with high
accuracy, the fundamental domains of interaction occurring in social networks.
By applying this method to two online datasets that differ in scope and type of
interaction (aNobii and Flickr) we observe the spontaneous emergence of three
domains of interaction representing the exchange of status, knowledge and
social support. By finding significant relations between the domains of
interaction and classic social network analysis issues (e.g., tie strength,
dyadic interaction over time) we show how the network of interactions induced
by the extracted domains can be used as a starting point for more nuanced
analysis of online social data that may one day incorporate the normative
grammar of social interaction. Our method finds applications in online social
media services ranging from recommendation to visual link summarization.
Comment: 10 pages, 8 figures, Proceedings of the 2014 ACM conference on Web
(WebSci'14
Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation
We introduce the multiresolution recurrent neural network, which extends the
sequence-to-sequence framework to model natural language generation as two
parallel discrete stochastic processes: a sequence of high-level coarse tokens,
and a sequence of natural language tokens. There are many ways to estimate or
learn the high-level coarse tokens, but we argue that a simple extraction
procedure is sufficient to capture a wealth of high-level discourse semantics.
Such a procedure allows training the multiresolution recurrent neural network by
maximizing the exact joint log-likelihood over both sequences. In contrast to
the standard log-likelihood objective w.r.t. natural language tokens (word
perplexity), optimizing the joint log-likelihood biases the model towards
modeling high-level abstractions. We apply the proposed model to the task of
dialogue response generation in two challenging domains: the Ubuntu technical
support domain, and Twitter conversations. On Ubuntu, the model outperforms
competing approaches by a substantial margin, achieving state-of-the-art
results according to both automatic evaluation metrics and a human evaluation
study. On Twitter, the model appears to generate more relevant and on-topic
responses according to automatic evaluation metrics. Finally, our experiments
demonstrate that the proposed model is more adept at overcoming the sparsity of
natural language and is better able to capture long-term structure.
Comment: 21 pages, 2 figures, 10 table
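As a reading aid, the contrast between the two objectives in the abstract above can be written out. This is a sketch in assumed notation, not the paper's own: $\mathbf{z}$ stands for the sequence of coarse tokens, $\mathbf{w}$ for the sequence of natural language tokens, and $\theta$ for the model parameters. The factorization shown is just the chain rule; the paper's exact conditioning structure may differ.

```latex
% Joint objective: the coarse tokens z are obtained by an extraction
% procedure, so they are observed and the joint log-likelihood is exact.
\log P_\theta(\mathbf{w}, \mathbf{z})
  = \log P_\theta(\mathbf{z}) + \log P_\theta(\mathbf{w} \mid \mathbf{z})

% Standard objective over natural language tokens only (word perplexity):
\log P_\theta(\mathbf{w})
```

Under this reading, the extra term $\log P_\theta(\mathbf{z})$ is what pushes the model towards the high-level abstractions the abstract mentions.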
Finding Truth in Cause-Related Advertising: A Lexical Analysis of Brands’ Health, Environment, and Social Justice Communications on Twitter
Consumers increasingly desire to make purchasing decisions based on factors such as health, the environment, and social justice. In response, there has been a commensurate rise in cause-related marketing to appeal to socially conscious consumers. However, a lack of regulation and standardization makes it difficult for consumers to assess marketing claims; this is further complicated by social media, which firms use to cultivate a personality for their brand through frequent conversational messages. Yet, little empirical research has been done to explore the relationship between cause-related marketing messages on social media and the true cause alignment of brands. In this paper, we explore this by pairing the marketing messages from the Twitter accounts of over 1,000 brands with third-party ratings of each brand with respect to health, the environment, and social justice. Specifically, we perform text regression to predict each brand’s true rating in each dimension based on the lexical content of its tweets, and find significant held-out correlation on each task, suggesting that a brand’s alignment with a social cause can be somewhat reliably signaled through its Twitter communications, though the signal is weak in many cases. To aid in the identification of brands that engage in misleading cause-related communication as well as terms that more likely indicate insincerity, we propose a procedure to rank both brands and terms by their volume of “conflicting” communications (i.e., “greenwashing”). We further explore how cause-related terms are used differently by brands that are strong vs. weak in actual alignment with the cause. The results provide insight into current practices in cause-related marketing in social media, and provide a framework for identifying and monitoring misleading communications.
Together, they can be used to promote transparency in cause-related marketing in social media, better enabling brands to communicate authentic values-based policy decisions, and consumers to make socially responsible purchase decisions.
Enhancing Twitter Data Analysis with Simple Semantic Filtering: Example in Tracking Influenza-Like Illnesses
Systems that exploit publicly available user generated content such as
Twitter messages have been successful in tracking seasonal influenza. We
developed a novel filtering method for Influenza-Like-Illnesses (ILI)-related
messages using 587 million messages from Twitter micro-blogs. We first filtered
messages based on syndrome keywords from the BioCaster Ontology, an extant
knowledge model of laymen's terms. We then filtered the messages according to
semantic features such as negation, hashtags, emoticons, humor and geography.
The data covered 36 weeks for the US 2009 influenza season from 30th August
2009 to 8th May 2010. Results showed that our system achieved the highest
Pearson correlation coefficient of 98.46% (p-value<2.2e-16), an improvement of
3.98% over the previous state-of-the-art method. The results indicate that
simple NLP-based enhancements to existing approaches to mine Twitter data can
increase the value of this inexpensive resource.
Comment: 10 pages, 5 figures, IEEE HISB 2012 conference, Sept 27-28, 2012, La
Jolla, California, U
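The two-stage pipeline described in the abstract above (keyword filtering, then semantic filtering, then correlation against reference data) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the keyword set and negation cues are hypothetical stand-ins, not the BioCaster Ontology terms, the negation check is a crude window heuristic rather than the authors' method, and only the negation feature is shown.

```python
import math
import re

# Hypothetical syndrome keywords; the paper draws its terms from the BioCaster Ontology.
SYNDROME_KEYWORDS = {"flu", "fever", "cough", "sore throat"}
# Hypothetical negation cues for a crude negation check.
NEGATION_CUES = {"no", "not", "never", "without"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def contains_keyword(tokens):
    """Return True if any keyword (possibly multi-word) occurs in the token list."""
    padded = " " + " ".join(tokens) + " "
    return any(f" {kw} " in padded for kw in SYNDROME_KEYWORDS)


def is_negated(tokens, window=3):
    """Return True if a keyword appears within `window` tokens after a negation cue."""
    for i, tok in enumerate(tokens):
        if tok in NEGATION_CUES and contains_keyword(tokens[i + 1:i + 1 + window]):
            return True
    return False


def keep_message(text):
    """Stage 1: keyword match. Stage 2: drop messages that negate the symptom."""
    tokens = tokenize(text)
    return contains_keyword(tokens) and not is_negated(tokens)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

In use, one would count the kept messages per week and pass those counts, together with the weekly reference ILI rates, to `pearson`; the reference series itself is not part of this sketch.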
Big Brother is Listening to You: Digital Eavesdropping in the Advertising Industry
In the Digital Age, information is more accessible than ever. Unfortunately, that accessibility has come at the expense of privacy. Now, more and more personal information is in the hands of corporations and governments, for uses not known to the average consumer. Although these entities have long been able to keep tabs on individuals, with the advent of virtual assistants and “always-listening” technologies, the ease with which a third party may extract information from a consumer has only increased. The stark reality is that lawmakers have left the American public behind. While other countries have enacted consumer privacy protections, the United States has no satisfactory legal framework in place to curb data collection by greedy businesses or to regulate how those companies may use and protect consumer data. This Article contemplates one use of that data: digital advertising. Inspired by stories of suspiciously well-targeted advertisements appearing on social media websites, this Article additionally questions whether companies have been honest about their collection of audio data. To address the potential harms consumers may suffer as a result of this deficient privacy protection, this Article proposes a framework wherein companies must acquire users' consent and the government must ensure that businesses do not use consumer information for harmful purposes.