294 research outputs found

    Slightly generalized Generalized Contagion: Unifying simple models of biological and social spreading

    We motivate and explore the basic features of generalized contagion, a model mechanism that unifies fundamental models of biological and social contagion. Generalized contagion builds on the elementary observation that spreading and contagion of all kinds involve some form of system memory. We discuss the three main classes of systems that generalized contagion affords, resembling: simple biological contagion; critical mass contagion of social phenomena; and an intermediate, and explosive, vanishing critical mass contagion. We also present a simple explanation of the global spreading condition in the context of a small seed of infected individuals. Comment: 8 pages, 5 figures; chapter to appear in "Spreading Dynamics in Social Systems"; Eds. Sune Lehmann and Yong-Yeol Ahn, Springer Nature.

    Positivity of the English language

    Over the last million years, human language has emerged and evolved as a fundamental instrument of social communication and semiotic representation. People use language in part to convey emotional information, leading to the central and contingent questions: (1) What is the emotional spectrum of natural language? and (2) Are natural languages neutrally, positively, or negatively biased? Here, we report that the human-perceived positivity of over 10,000 of the most frequently used English words exhibits a clear positive bias. More deeply, we characterize and quantify distributions of word positivity for four large and distinct corpora, demonstrating that their form is broadly invariant with respect to frequency of word use. Comment: Manuscript: 9 pages, 3 tables, 5 figures; Supplementary Information: 12 pages, 3 tables, 8 figures.

    Positive words carry less information than negative words

    We show that the frequency of word use is not only determined by the word length \cite{Zipf1935} and the average information content \cite{Piantadosi2011}, but also by its emotional content. We have analyzed three established lexica of affective word usage in English, German, and Spanish, to verify that these lexica have a neutral, unbiased, emotional content. Taking into account the frequency of word usage, we find that words with a positive emotional content are more frequently used. This lends support to the Pollyanna hypothesis \cite{Boucher1969} that there should be a positive bias in human expression. We also find that negative words contain more information than positive words, as the informativeness of a word increases uniformly as its valence decreases. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links. Comment: 16 pages, 3 figures, 3 tables.

    Stable Word-Clouds for Visualising Text-Changes Over Time

    Word-clouds are a useful tool for providing overviews of texts, visualising relevant words. Multiple word-clouds can also be used to visualise changes over time in a text. This requires that the words in the individual word-clouds have stable positions, as otherwise it is very difficult to see what changed between two consecutive word-clouds. Existing approaches have used coordinated positioning algorithms, which do not allow for their use in an online, dynamic context. In this paper we present a fast word-cloud algorithm that uses word orthogonality to determine which words can share the same space in the word-clouds combined with a simple, but fast spiral-based layout algorithm. The evaluation shows that the algorithm achieves its goal of creating series of word-clouds fast enough to enable use in an online, dynamic context.

    SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

    In the last few years thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiments, including lexical-based and supervised machine learning methods. Despite the vast interest in the theme and wide popularity of some methods, it is unclear which one is better for identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need to conduct a thorough apples-to-apples comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originating from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims at filling this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results highlight the extent to which the prediction performance of these methods varies considerably across datasets. Aiming at boosting the development of this research area, we open the methods' codes and datasets used in this article, deploying them in a benchmark system, which provides an open API for accessing and comparing sentence-level sentiment analysis methods.

    Emotional Sentence Annotation Helps Predict Fiction Genre

    Fiction, a prime form of entertainment, has evolved into multiple genres which one can broadly attribute to different forms of stories. In this paper, we examine the hypothesis that works of fiction can be characterised by the emotions they portray. To investigate this hypothesis, we use the works of fiction in Project Gutenberg and we attribute basic emotional content to each individual sentence using Ekman’s model. A time-smoothed version of the emotional content for each basic emotion is used to train extremely randomized trees. We show through 10-fold Cross-Validation that the emotional content of each work of fiction can help identify each genre with significantly higher probability than random. We also show that the most important differentiator between genre novels is fear.

    Collective emotions online and their influence on community life

    E-communities, social groups interacting online, have recently become an object of interdisciplinary research. As with face-to-face meetings, Internet exchanges may not only include factual information but also emotional information - how participants feel about the subject discussed or other group members. Emotions are known to be important in affecting interaction partners in offline communication in many ways. Could emotions in Internet exchanges affect others and systematically influence quantitative and qualitative aspects of the trajectory of e-communities? The development of automatic sentiment analysis has made large scale emotion detection and analysis possible using text messages collected from the web. It is not clear if emotions in e-communities primarily derive from individual group members' personalities or if they result from intra-group interactions, and whether they influence group activities. We show the collective character of affective phenomena on a large scale as observed in 4 million posts downloaded from Blogs, Digg and BBC forums. To test whether the emotions of a community member may influence the emotions of others, posts were grouped into clusters of messages with similar emotional valences. The frequency of long clusters was much higher than it would be if emotions occurred at random. Distributions for cluster lengths can be explained by preferential processes because conditional probabilities for consecutive messages grow as a power law with cluster length. For BBC forum threads, average discussion lengths were higher for larger values of absolute average emotional valence in the first ten comments and the average amount of emotion in messages fell during discussions. Our results prove that collective emotional states can be created and modulated via Internet communication and that emotional expressiveness is the fuel that sustains some e-communities. Comment: 23 pages including Supporting Information, accepted to PLoS ONE.
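
The grouping of posts into clusters of similar valence can be illustrated with a minimal sketch. It assumes each message already carries a discrete valence label (here -1, 0 or 1) and treats a cluster as a run of consecutive messages with the same label; the function names and the sample thread are hypothetical, and the abstract does not specify the authors' actual procedure.

```python
from collections import Counter
from itertools import groupby


def cluster_lengths(valences):
    """Return (label, run length) for each run of consecutive equal labels."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(valences)]


def length_histogram(valences):
    """Count how many clusters of each length occur in the sequence."""
    return Counter(length for _, length in cluster_lengths(valences))


# Hypothetical thread: assumed valence labels, one per message, in posting order.
thread = [1, 1, -1, -1, -1, 0, 1]
clusters = cluster_lengths(thread)
histogram = length_histogram(thread)
```

Comparing such a histogram against one built from a shuffled copy of the same labels is one simple way to check whether long clusters are more frequent than chance would suggest.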

    Pulsatile blood flow, shear force, energy dissipation and Murray's Law

    BACKGROUND: Murray's Law states that, when a parent blood vessel branches into daughter vessels, the cube of the radius of the parent vessel is equal to the sum of the cubes of the radii of daughter blood vessels. Murray derived this law by defining a cost function that is the sum of the energy cost of the blood in a vessel and the energy cost of pumping blood through the vessel. The cost is minimized when vessel radii are consistent with Murray's Law. This law has also been derived from the hypothesis that the shear force of moving blood on the inner walls of vessels is constant throughout the vascular system. However, this derivation, like Murray's earlier derivation, is based on the assumption of constant blood flow. METHODS: To determine the implications of the constant shear force hypothesis and to extend Murray's energy cost minimization to the pulsatile arterial system, a model of pulsatile flow in an elastic tube is analyzed. A new and exact solution for flow velocity, blood flow rate and shear force is derived. RESULTS: For medium and small arteries with pulsatile flow, Murray's energy minimization leads to Murray's Law. Furthermore, the hypothesis that the maximum shear force during the cycle of pulsatile flow is constant throughout the arterial system implies that Murray's Law is approximately true. The approximation is good for all but the largest vessels (aorta and its major branches) of the arterial system. CONCLUSION: A cellular mechanism that senses shear force at the inner wall of a blood vessel and triggers remodeling that increases the circumference of the wall when a shear force threshold is exceeded would result in the observed scaling of vessel radii described by Murray's Law
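
The relation stated in the abstract, that the cube of the parent radius equals the sum of the cubes of the daughter radii, can be checked numerically. The sketch below is only an illustration of that steady-flow relation; the function names are hypothetical and it does not implement the pulsatile-flow model analyzed in the paper.

```python
def murray_parent_radius(daughter_radii):
    """Parent radius implied by Murray's Law: r_parent^3 = sum of r_daughter^3."""
    return sum(r ** 3 for r in daughter_radii) ** (1.0 / 3.0)


def murray_residual(parent_radius, daughter_radii):
    """Relative deviation from Murray's Law; 0 means the law holds exactly."""
    parent_cubed = parent_radius ** 3
    return (parent_cubed - sum(r ** 3 for r in daughter_radii)) / parent_cubed


# A symmetric bifurcation with two daughters of unit radius (arbitrary units).
symmetric_parent = murray_parent_radius([1.0, 1.0])
residual = murray_residual(symmetric_parent, [1.0, 1.0])
```

For a symmetric bifurcation the parent radius is the cube root of 2 times the daughter radius, roughly 1.26 times as large.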

    Recruiting Injection Drug Users: A Three-Site Comparison of Results and Experiences with Respondent-Driven and Targeted Sampling Procedures

    Several recent studies have utilized respondent-driven sampling (RDS) methods to survey hidden populations such as commercial sex-workers, men who have sex with men (MSM) and injection drug users (IDU). Few studies, however, have provided a direct comparison between RDS and other more traditional sampling methods such as venue-based, targeted or time/space sampling. The current study sampled injection drug users in three U.S. cities using RDS and targeted sampling (TS) methods and compared their effectiveness in terms of recruitment efficiency, logistics, and sample demographics. Both methods performed satisfactorily. The targeted method required more staff time per recruited respondent and had a lower proportion of screened respondents who were eligible than RDS, while RDS respondents were offered higher incentives for participation.