268 research outputs found
Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort
In the last decade drug overdose deaths reached staggering proportions in the
US. Besides the raw yearly death count, which is worrisome per se, an alarming
picture comes from the steep acceleration of the death rate, which increased by 21%
from 2015 to 2016. While traditional public health surveillance suffers from
its own biases and limitations, digital epidemiology offers a new lens to
extract signals from Web and Social Media that might be complementary to
official statistics. In this paper we present a computational approach to
identify a digital cohort that might provide an updated and complementary view
on the opioid crisis. We introduce an information retrieval algorithm suitable
to identify relevant subspaces of discussion on social media, for mining data
from users showing explicit interest in discussions about opioid consumption in
Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5
million users were geolocated at the US state level, resembling the census
population distribution with a good agreement. A measure of prevalence of
interest in opiate consumption has been estimated at the state level, producing
a novel indicator with information that is not entirely encoded in the standard
surveillance. Finally, we provide a domain-specific vocabulary
containing informal lexicon and street nomenclature extracted from user-generated
content that can be used by researchers and practitioners to implement novel
digital public health surveillance methodologies for supporting policy makers
in fighting the opioid epidemic.
Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19)
Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach
The increasing availability of temporal network data is calling for more
research on extracting and characterizing mesoscopic structures in temporal
networks and on relating such structure to specific functions or properties of
the system. An outstanding challenge is the extension of the results achieved
for static networks to time-varying networks, where the topological structure
of the system and the temporal activity patterns of its components are
intertwined. Here we investigate the use of a latent factor decomposition
technique, non-negative tensor factorization, to extract the community-activity
structure of temporal networks. The method is intrinsically temporal and allows
us to simultaneously identify communities and track their activity over time.
We represent the time-varying adjacency matrix of a temporal network as a
three-way tensor and approximate this tensor as a sum of terms that can be
interpreted as communities of nodes with an associated activity time series. We
summarize known computational techniques for tensor decomposition and discuss
some quality metrics that can be used to tune the complexity of the factorized
representation. We subsequently apply tensor factorization to a temporal
network for which a ground truth is available for both the community structure
and the temporal activity patterns. The data we use describe the social
interactions of students in a school, the associations between students and
school classes, and the spatio-temporal trajectories of students over time. We
show that non-negative tensor factorization is capable of recovering the class
structure with high accuracy. In particular, the extracted tensor components
can be validated either as known school classes, or in terms of correlated
activity patterns, i.e., of spatial and temporal coincidences that are
determined by the known school activity schedule.
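The tensor representation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain CP (PARAFAC) model fitted with Lee-Seung-style multiplicative updates in NumPy, and the names `khatri_rao` and `nonneg_cp`, as well as the toy network, are hypothetical.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])

def nonneg_cp(T, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate T[i, j, t] by sum_r A[i, r] * B[j, r] * C[t, r], factors >= 0.

    Assumed fitting scheme: multiplicative updates, which keep every factor
    non-negative as long as the random initialization is non-negative.
    """
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))
    # Mode-wise unfoldings whose column order matches khatri_rao below.
    T1 = T.reshape(I, J * K)
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        KR = khatri_rao(B, C)
        A *= (T1 @ KR) / (A @ (KR.T @ KR) + eps)
        KR = khatri_rao(A, C)
        B *= (T2 @ KR) / (B @ (KR.T @ KR) + eps)
        KR = khatri_rao(A, B)
        C *= (T3 @ KR) / (C @ (KR.T @ KR) + eps)
    return A, B, C

# Toy temporal network (assumed data): two groups of four nodes, the first
# active during time steps 0-2 and the second during steps 3-5. Self-loops
# are kept so that the toy tensor has exactly two components.
n_nodes, n_steps = 8, 6
T = np.zeros((n_nodes, n_nodes, n_steps))
T[:4, :4, :3] = 1.0
T[4:, 4:, 3:] = 1.0

A, B, C = nonneg_cp(T, rank=2)
T_hat = np.einsum("ir,jr,tr->ijt", A, B, C)
rel_err = np.linalg.norm(T - T_hat) / np.linalg.norm(T)
print("relative reconstruction error:", rel_err)
```

In this sketch each column of `A` (or `B`) plays the role of a community membership vector and the matching column of `C` is its activity time series. The number of components (`rank`) is the complexity parameter that the quality metrics mentioned in the abstract would be used to tune.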
Activity clocks: spreading dynamics on temporal networks of human contact
Dynamical processes on time-varying complex networks are key to understanding
and modeling a broad variety of processes in socio-technical systems. Here we
focus on empirical temporal networks of human proximity and we aim at
understanding the factors that, in simulation, shape the arrival time
distribution of simple spreading processes. Abandoning the notion of wall-clock
time in favour of node-specific clocks based on activity exposes robust
statistical patterns in the arrival times across different social contexts.
Using randomization strategies and generative models constrained by data, we
show that these patterns can be understood in terms of heterogeneous
inter-event time distributions coupled with heterogeneous numbers of events per
edge. We also show, both empirically and by using a synthetic dataset, that
significant deviations from the above behavior can be caused by the presence of
edge classes with strong activity correlations.
Predicting human mobility through the assimilation of social media traces into mobility models
Predicting human mobility flows at different spatial scales is challenged by
the heterogeneity of individual trajectories and the multi-scale nature of
transportation networks. As vast amounts of digital traces of human behaviour
become available, an opportunity arises to improve mobility models by
integrating into them proxy data on mobility collected by a variety of digital
platforms and location-aware services. Here we propose a hybrid model of human
mobility that integrates a large-scale publicly available dataset from a
popular photo-sharing system with the classical gravity model, under a stacked
regression procedure. We validate the performance and generalizability of our
approach using two ground-truth datasets on air travel and daily commuting in
the United States: using two different cross-validation schemes we show that
the hybrid model affords enhanced mobility prediction at both spatial scales.
Comment: 17 pages, 10 figures
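For readers unfamiliar with the gravity model mentioned in this abstract, the sketch below fits its common power-law form, flow ≈ k · pop_i^α · pop_j^β / dist^γ, by least squares in log space. This covers only the gravity baseline, not the stacked-regression hybrid. The abstract does not state the functional form or the fitting procedure used in the paper, so both are assumptions here, and the function name `fit_gravity` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_gravity(pop_i, pop_j, dist, flow):
    """Fit log(flow) = log(k) + alpha*log(pop_i) + beta*log(pop_j) - gamma*log(dist)."""
    X = np.column_stack([
        np.ones_like(dist),   # intercept -> log(k)
        np.log(pop_i),        # origin population -> alpha
        np.log(pop_j),        # destination population -> beta
        -np.log(dist),        # distance decay -> gamma
    ])
    coef, *_ = np.linalg.lstsq(X, np.log(flow), rcond=None)
    log_k, alpha, beta, gamma = coef
    return np.exp(log_k), alpha, beta, gamma

# Synthetic, noise-free flows generated from known (assumed) parameters,
# used only to check that the fit recovers them.
rng = np.random.default_rng(0)
n = 200
pop_i = rng.uniform(1e3, 1e6, n)
pop_j = rng.uniform(1e3, 1e6, n)
dist = rng.uniform(10.0, 3000.0, n)
flow = 2.0 * pop_i**0.8 * pop_j**0.6 / dist**1.5

k, alpha, beta, gamma = fit_gravity(pop_i, pop_j, dist, flow)
print(k, alpha, beta, gamma)
```

In a stacked setup, predictions from a baseline like this one would be combined with predictions derived from the proxy data by a second-level regressor. That step is omitted because the abstract does not describe it in enough detail to reproduce.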
Time-varying graph representation learning via higher-order skip-gram with negative sampling
Representation learning models for graphs are a successful family of techniques that project nodes into feature spaces that can be exploited by other machine learning algorithms. Since many real-world networks are inherently dynamic, with interactions among nodes changing over time, these techniques can be defined both for static and for time-varying graphs. Here, we show how the skip-gram embedding approach can be generalized to perform implicit tensor factorization on different tensor representations of time-varying graphs. We show that higher-order skip-gram with negative sampling (HOSGNS) is able to disentangle the role of nodes and time, with a small fraction of the number of parameters needed by other approaches. We empirically evaluate our approach using time-resolved face-to-face proximity data, showing that the learned representations outperform state-of-the-art methods when used to solve downstream tasks such as network reconstruction. Good performance on predicting the outcome of dynamical processes such as disease spreading shows the potential of this method to estimate contagion risk, providing early risk awareness based on contact tracing data. Supplementary information: The online version contains supplementary material available at 10.1140/epjds/s13688-022-00344-8
Data on face-to-face contacts in an office building suggests a low-cost vaccination strategy based on community linkers
Empirical data on contacts between individuals in social contexts play an
important role in providing information for models describing human behavior
and how epidemics spread in populations. Here, we analyze data on face-to-face
contacts collected in an office building. The statistical properties of
contacts are similar to other social situations, but important differences are
observed in the contact network structure. In particular, the contact network
is strongly shaped by the organization of the offices in departments, which has
consequences for the design of accurate agent-based models of epidemic spread.
We consider the contact network as a potential substrate for infectious disease
spread and show that its sparsity tends to prevent outbreaks of rapidly
spreading epidemics. Moreover, we define three typical behaviors according to
the fraction of links each individual shares outside its own department:
residents, wanderers and linkers. Linkers act as bridges in the
network and have large betweenness centralities. Thus, a vaccination strategy
targeting linkers efficiently prevents large outbreaks. As such a behavior may
be spotted a priori in the offices' organization or from surveys, without the
full knowledge of the time-resolved contact network, this result may help the
design of efficient, low-cost vaccination or social-distancing strategies.
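The quantity behind the resident, wanderer and linker categories, the fraction of links each individual shares outside its own department, can be computed as in the sketch below. The cut-offs that separate the three behaviors are not given in this abstract, so the sketch stops at the fraction itself. The function name `outside_fraction` and the toy office data are hypothetical.

```python
from collections import defaultdict

def outside_fraction(edges, department):
    """Fraction of each individual's links that reach another department."""
    total = defaultdict(int)
    outside = defaultdict(int)
    for u, v in edges:
        # Count the undirected link once for each endpoint.
        for a, b in ((u, v), (v, u)):
            total[a] += 1
            if department[a] != department[b]:
                outside[a] += 1
    return {node: outside[node] / total[node] for node in total}

# Toy office (assumed data): three people in R&D, two in Sales.
department = {"ann": "R&D", "bob": "R&D", "cat": "R&D",
              "dan": "Sales", "eve": "Sales"}
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
         ("cat", "dan"), ("cat", "eve"), ("dan", "eve")]

fractions = outside_fraction(edges, department)
print(fractions)  # ann and bob -> 0.0; cat, dan and eve -> 0.5
```

Because this needs only department membership and an aggregated contact list, it reflects the abstract's point that such behavior can be assessed without the full time-resolved contact network.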
On the Dynamics of Human Proximity for Data Diffusion in Ad-Hoc Networks
We report on a data-driven investigation aimed at understanding the dynamics
of message spreading in a real-world dynamical network of human proximity. We
use data collected by means of a proximity-sensing network of wearable sensors
that we deployed at three different social gatherings, simultaneously involving
several hundred individuals. We simulate a message spreading process over the
recorded proximity network, focusing on both the topological and the temporal
properties. We show that by using an appropriate technique to deal with the
temporal heterogeneity of proximity events, a universal statistical pattern
emerges for the delivery times of messages, robust across all the data sets.
Our results are useful to set constraints for generic processes of data
dissemination, as well as to validate established models of human mobility and
proximity that are frequently used to simulate realistic behaviors.
Comment: A. Panisson et al., On the dynamics of human proximity for data
diffusion in ad-hoc networks, Ad Hoc Netw. (2011)
Explainability Methods for Natural Language Processing: Applications to Sentiment Analysis
Sentiment analysis is the process of classifying natural language sentences as expressing positive or negative sentiments, and it is a crucial task where the explanation of a prediction might arguably be as necessary as the prediction itself. We analysed different explanation techniques, and we applied them to the classification task of Sentiment Analysis. We explored how attention-based techniques can be exploited to extract meaningful sentiment scores with a lower computational cost than existing XAI methods.