268 research outputs found

    Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort

    In the last decade, drug overdose deaths reached staggering proportions in the US. Beyond the raw yearly death count, which is worrisome in itself, an alarming picture emerges from the steep acceleration of the death rate, which increased by 21% from 2015 to 2016. While traditional public health surveillance suffers from its own biases and limitations, digital epidemiology offers a new lens to extract signals from the Web and social media that might be complementary to official statistics. In this paper we present a computational approach to identify a digital cohort that might provide an updated and complementary view on the opioid crisis. We introduce an information retrieval algorithm suited to identifying relevant subspaces of discussion on social media, for mining data from users showing explicit interest in discussions about opioid consumption on Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5 million users were geolocated at the US state level, with a distribution in good agreement with the census population. A measure of prevalence of interest in opiate consumption has been estimated at the state level, producing a novel indicator with information that is not entirely encoded in standard surveillance. Finally, we provide a domain-specific vocabulary containing informal lexicon and street nomenclature extracted from user-generated content, which researchers and practitioners can use to implement novel digital public health surveillance methodologies that support policy makers in fighting the opioid epidemic.
    Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)

    Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach

    The increasing availability of temporal network data is calling for more research on extracting and characterizing mesoscopic structures in temporal networks and on relating such structures to specific functions or properties of the system. An outstanding challenge is the extension of the results achieved for static networks to time-varying networks, where the topological structure of the system and the temporal activity patterns of its components are intertwined. Here we investigate the use of a latent factor decomposition technique, non-negative tensor factorization, to extract the community-activity structure of temporal networks. The method is intrinsically temporal and makes it possible to identify communities and track their activity over time simultaneously. We represent the time-varying adjacency matrix of a temporal network as a three-way tensor and approximate this tensor as a sum of terms that can be interpreted as communities of nodes with an associated activity time series. We summarize known computational techniques for tensor decomposition and discuss some quality metrics that can be used to tune the complexity of the factorized representation. We subsequently apply tensor factorization to a temporal network for which a ground truth is available for both the community structure and the temporal activity patterns. The data we use describe the social interactions of students in a school, the associations between students and school classes, and the spatio-temporal trajectories of students over time. We show that non-negative tensor factorization is capable of recovering the class structure with high accuracy. In particular, the extracted tensor components can be validated either as known school classes, or in terms of correlated activity patterns, i.e., of spatial and temporal coincidences that are determined by the known school activity schedule.
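
    The decomposition described in this abstract can be illustrated with a minimal sketch. It assumes a node x node x time tensor and fits a non-negative rank-R model with multiplicative updates; the abstract does not say which solver the authors use, so the update rule, the function names (`khatri_rao`, `nonnegative_cp`) and all parameters here are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np


def khatri_rao(X, Y):
    """Column-wise Kronecker product; row index is (row of X) * len(Y) + (row of Y)."""
    rank = X.shape[1]
    return (X[:, None, :] * Y[None, :, :]).reshape(-1, rank)


def nonnegative_cp(T, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate T[i, j, t] by sum_r A[i, r] * B[j, r] * C[t, r] with A, B, C >= 0.

    Hypothetical helper: multiplicative (Lee-Seung style) updates are an
    assumption, chosen because they keep the factors non-negative.
    """
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A, B, C = rng.random((I, rank)), rng.random((J, rank)), rng.random((K, rank))
    # Mode unfoldings whose column order matches khatri_rao above.
    T0 = T.reshape(I, J * K)
    T1 = T.transpose(1, 0, 2).reshape(J, I * K)
    T2 = T.transpose(2, 0, 1).reshape(K, I * J)
    for _ in range(n_iter):
        KR = khatri_rao(B, C)
        A *= (T0 @ KR) / (A @ (KR.T @ KR) + eps)
        KR = khatri_rao(A, C)
        B *= (T1 @ KR) / (B @ (KR.T @ KR) + eps)
        KR = khatri_rao(A, B)
        C *= (T2 @ KR) / (C @ (KR.T @ KR) + eps)
    return A, B, C


def reconstruct(A, B, C):
    """Rebuild the tensor from its factors."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


# Toy temporal network: two groups of four nodes, active in disjoint time windows.
T = np.zeros((8, 8, 10))
T[:4, :4, :5] = 1.0
T[4:, 4:, 5:] = 1.0
A, B, C = nonnegative_cp(T, rank=2)
membership = A.argmax(axis=1)  # component with the largest loading, per node
```

    In this sketch each column of `A` (or `B`) plays the role of a community membership vector and the matching column of `C` is its activity time series. The rank is the complexity knob that the abstract says can be tuned with quality metrics.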

    Activity clocks: spreading dynamics on temporal networks of human contact

    Dynamical processes on time-varying complex networks are key to understanding and modeling a broad variety of processes in socio-technical systems. Here we focus on empirical temporal networks of human proximity and we aim at understanding the factors that, in simulation, shape the arrival time distribution of simple spreading processes. Abandoning the notion of wall-clock time in favour of node-specific clocks based on activity exposes robust statistical patterns in the arrival times across different social contexts. Using randomization strategies and generative models constrained by data, we show that these patterns can be understood in terms of heterogeneous inter-event time distributions coupled with heterogeneous numbers of events per edge. We also show, both empirically and by using a synthetic dataset, that significant deviations from the above behavior can be caused by the presence of edge classes with strong activity correlations.

    Predicting human mobility through the assimilation of social media traces into mobility models

    Predicting human mobility flows at different spatial scales is challenged by the heterogeneity of individual trajectories and the multi-scale nature of transportation networks. As vast amounts of digital traces of human behaviour become available, an opportunity arises to improve mobility models by integrating into them proxy data on mobility collected by a variety of digital platforms and location-aware services. Here we propose a hybrid model of human mobility that integrates a large-scale publicly available dataset from a popular photo-sharing system with the classical gravity model, under a stacked regression procedure. We validate the performance and generalizability of our approach using two ground-truth datasets on air travel and daily commuting in the United States: using two different cross-validation schemes we show that the hybrid model affords enhanced mobility prediction at both spatial scales.
    Comment: 17 pages, 10 figures

    Time-varying graph representation learning via higher-order skip-gram with negative sampling

    Representation learning models for graphs are a successful family of techniques that project nodes into feature spaces that can be exploited by other machine learning algorithms. Since many real-world networks are inherently dynamic, with interactions among nodes changing over time, these techniques can be defined both for static and for time-varying graphs. Here, we show how the skip-gram embedding approach can be generalized to perform implicit tensor factorization on different tensor representations of time-varying graphs. We show that higher-order skip-gram with negative sampling (HOSGNS) is able to disentangle the role of nodes and time, with a small fraction of the number of parameters needed by other approaches. We empirically evaluate our approach using time-resolved face-to-face proximity data, showing that the learned representations outperform state-of-the-art methods when used to solve downstream tasks such as network reconstruction. Good performance on predicting the outcome of dynamical processes such as disease spreading shows the potential of this method to estimate contagion risk, providing early risk awareness based on contact tracing data.
    Supplementary information: The online version contains supplementary material available at 10.1140/epjds/s13688-022-00344-8.

    Data on face-to-face contacts in an office building suggests a low-cost vaccination strategy based on community linkers

    Empirical data on contacts between individuals in social contexts play an important role in providing information for models describing human behavior and how epidemics spread in populations. Here, we analyze data on face-to-face contacts collected in an office building. The statistical properties of contacts are similar to other social situations, but important differences are observed in the contact network structure. In particular, the contact network is strongly shaped by the organization of the offices in departments, which has consequences in the design of accurate agent-based models of epidemic spread. We consider the contact network as a potential substrate for infectious disease spread and show that its sparsity tends to prevent outbreaks of rapidly spreading epidemics. Moreover, we define three typical behaviors according to the fraction $f$ of links each individual shares outside their own department: residents, wanderers and linkers. Linkers ($f \sim 50\%$) act as bridges in the network and have large betweenness centralities. Thus, a vaccination strategy targeting linkers efficiently prevents large outbreaks. As such a behavior may be spotted a priori in the offices' organization or from surveys, without the full knowledge of the time-resolved contact network, this result may help the design of efficient, low-cost vaccination or social-distancing strategies.
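
    A small sketch of how the fraction f of out-of-department links could be computed from an edge list and used to label individuals. The abstract only states that linkers have f around 50%; the cut-offs `low` and `high`, the reading of residents as low f and wanderers as high f, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def outside_fraction(edges, department):
    """Return, per node, the fraction of its links that leave its own department."""
    inside = defaultdict(int)
    outside = defaultdict(int)
    for u, v in edges:
        same = department[u] == department[v]
        for node in (u, v):
            if same:
                inside[node] += 1
            else:
                outside[node] += 1
    nodes = set(inside) | set(outside)
    return {n: outside[n] / (inside[n] + outside[n]) for n in nodes}


def classify(f, low=0.3, high=0.7):
    """Label a node from f; the thresholds are hypothetical, not from the paper."""
    if f < low:
        return "resident"
    if f > high:
        return "wanderer"
    return "linker"


# Toy office: a, b, c share one department; d, e share another.
department = {"a": "D1", "b": "D1", "c": "D1", "d": "D2", "e": "D2"}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]
fractions = outside_fraction(edges, department)
labels = {node: classify(f) for node, f in fractions.items()}
```

    Because the labels depend only on department membership and link counts, this kind of classification needs no time-resolved contact data, which is the point the abstract makes about spotting linkers a priori.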

    On the Dynamics of Human Proximity for Data Diffusion in Ad-Hoc Networks

    We report on a data-driven investigation aimed at understanding the dynamics of message spreading in a real-world dynamical network of human proximity. We use data collected by means of a proximity-sensing network of wearable sensors that we deployed at three different social gatherings, simultaneously involving several hundred individuals. We simulate a message spreading process over the recorded proximity network, focusing on both the topological and the temporal properties. We show that by using an appropriate technique to deal with the temporal heterogeneity of proximity events, a universal statistical pattern emerges for the delivery times of messages, robust across all the data sets. Our results are useful to set constraints for generic processes of data dissemination, as well as to validate established models of human mobility and proximity that are frequently used to simulate realistic behaviors.
    Comment: A. Panisson et al., On the dynamics of human proximity for data diffusion in ad-hoc networks, Ad Hoc Netw. (2011)

    Explainability Methods for Natural Language Processing: Applications to Sentiment Analysis

    Sentiment analysis is the process of classifying natural language sentences as expressing positive or negative sentiments, and it is a crucial task where the explanation of a prediction might arguably be as necessary as the prediction itself. We analysed different explanation techniques, and we applied them to the classification task of Sentiment Analysis. We explored how attention-based techniques can be exploited to extract meaningful sentiment scores with a lower computational cost than existing XAI methods.