Improving Individual Predictions using Social Networks Assortativity
Social networks are known to be assortative with
respect to many attributes, such as age, weight, wealth, level
of education, ethnicity and gender. This can be explained by
influences and homophilies. Independently of its origin, this
assortativity gives us information about each node given its
neighbors. Assortativity can thus be used to improve individual
predictions in a broad range of situations, when data are missing
or inaccurate. This paper presents a general framework based on
probabilistic graphical models to exploit social network structures
for improving individual predictions of node attributes. Using
this framework, we quantify the assortativity range leading to an
accuracy gain in several situations. We finally show how specific
characteristics of the network can improve performance further.
For instance, the gender assortativity in real-world mobile phone
data changes significantly according to some communication
attributes. In this case, individual predictions with 75% accuracy
are improved by up to 3%.
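To make the idea in the abstract above concrete, here is a minimal sketch of how known neighbor labels could sharpen an individual prediction of a binary attribute. It is a naive-Bayes-style simplification and not the paper's graphical-model framework: it assumes neighbor labels are conditionally independent given the node's own label, and the function name and the `p_same` parameter (the probability that a neighbor shares the node's label) are hypothetical.

```python
def posterior_with_neighbors(prior, neighbor_labels, p_same):
    """Posterior probability that a node has label 1.

    prior           -- probability of label 1 from the node's own data
    neighbor_labels -- iterable of known neighbor labels (0 or 1)
    p_same          -- assumed probability that a neighbor shares the
                       node's label (0.5 means no assortativity)
    """
    like1 = prior          # running weight for "node has label 1"
    like0 = 1.0 - prior    # running weight for "node has label 0"
    for label in neighbor_labels:
        # Each neighbor multiplies in the chance of seeing its label
        # under each hypothesis about the node's own label.
        like1 *= p_same if label == 1 else 1.0 - p_same
        like0 *= p_same if label == 0 else 1.0 - p_same
    return like1 / (like1 + like0)
```

With `p_same = 0.5` the neighbors carry no information and the posterior equals the prior; values above 0.5 pull the prediction toward the neighbors' majority label.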
Locally Adaptive Dynamic Networks
Our focus is on realistically modeling and forecasting dynamic networks of
face-to-face contacts among individuals. Important aspects of such data that
lead to problems with current methods include the tendency of the contacts to
move between periods of slow and rapid changes, and the dynamic heterogeneity
in the actors' connectivity behaviors. Motivated by this application, we
develop a novel method for Locally Adaptive DYnamic (LADY) network inference.
The proposed model relies on a dynamic latent space representation in which
each actor's position evolves in time via stochastic differential equations.
Using a state space representation for these stochastic processes and
Pólya-gamma data augmentation, we develop an efficient MCMC algorithm for
posterior inference along with tractable procedures for online updating and
forecasting of future networks. We evaluate performance in simulation studies,
and consider an application to face-to-face contacts among individuals in a
primary school.
A Socio-Informatic Approach to Automated Account Classification on Social Media
Automated accounts on social media have become increasingly problematic. We
propose a key feature in combination with existing methods to improve machine
learning algorithms for bot detection. We successfully improve classification
performance by including the proposed feature. Comment: International Conference on Social Media and Society
Far from random? The role of homophily in student supervision
The paper studies racial and gender homophily in student supervision relationships in a context of social transformation, South African academia. We develop a technique to separate choice homophily from that induced by the system; it comprises two permutation tests repeated at two levels of aggregation, the system and the departments. We find clear evidence of homophily in student supervision, along racial lines in particular. Roughly half of the observed homophily is induced by departmental composition and stays constant over time. Overall, choice homophily has similar magnitude along racial and gender dimensions. Further, we ask where choice homophily originates in the demographic groups of students and professors. We find that white (male) students have a high tendency to form same-type relations, while among professors it is black (female) professors who display the higher frequency. Group differences show that choice homophily is likely to originate from students in the former majority.
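The separation of choice homophily from system-induced homophily described above can be illustrated with a stratified permutation test. The sketch below is one plausible reading and not the authors' exact procedure: it shuffles supervisor group labels either across the whole system or only within departments, and compares the observed share of same-type ties to the shuffled share. All names (`permutation_test`, `pairs`, `departments`) are hypothetical.

```python
import random

def permutation_test(pairs, departments=None, n_perm=2000, seed=0):
    """Return (observed share, mean permuted share, one-sided p-value).

    pairs       -- list of (student_group, supervisor_group) tuples
    departments -- optional list of department ids, one per pair; when
                   given, supervisors are shuffled only within departments
    """
    rng = random.Random(seed)
    n = len(pairs)
    students = [s for s, _ in pairs]
    observed = sum(s == p for s, p in pairs) / n
    if departments is None:
        departments = [0] * n  # a single stratum: system-level test
    strata = {}
    for i, dept in enumerate(departments):
        strata.setdefault(dept, []).append(i)

    at_least_as_extreme = 0
    total = 0.0
    for _ in range(n_perm):
        supervisors = [p for _, p in pairs]
        for idx in strata.values():
            shuffled = [supervisors[i] for i in idx]
            rng.shuffle(shuffled)  # reassign supervisors within the stratum
            for i, value in zip(idx, shuffled):
                supervisors[i] = value
        share = sum(s == p for s, p in zip(students, supervisors)) / n
        total += share
        if share >= observed:
            at_least_as_extreme += 1
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, total / n_perm, p_value
```

If the system-level test rejects but the within-department test does not, the observed homophily is consistent with being induced by departmental composition rather than by individual choice.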
Network dynamics in regional clusters: The perspective of an emerging economy
Regional clusters are spatial agglomerations of firms operating in the same or connected industries, which enable innovation and economic performance for firms. A wealth of empirical literature shows that one of the key elements of the success of regional clusters is that they facilitate the formation of local inter-organizational networks, which act as conduits of knowledge and innovation. While most studies analyze the benefits and characteristics of regional cluster networks and focus on advanced economies and high-tech 'hot spots', this paper advances the existing literature by analyzing network dynamics and taking an emerging economy's perspective. Using longitudinal data on a wine cluster in Chile and stochastic actor-oriented models for network dynamics, this paper examines which micro-level effects influence the formation of new knowledge ties among wineries. It finds that the coexistence of cohesion effects (reciprocity and transitivity) and the presence of inter-firm knowledge base heterogeneity contribute to the stability of an informal hierarchical network structure over time. The empirical results have implications for cluster competitiveness and network studies, and for the burgeoning literature on corporate behavior in emerging economies. Keywords: regional clusters, knowledge networks, network dynamics, wine industry, Chile
Block-Approximated Exponential Random Graphs
An important challenge in the field of exponential random graphs (ERGs) is
the fitting of non-trivial ERGs on large graphs. By utilizing fast matrix
block-approximation techniques, we propose an approximate framework for such
non-trivial ERGs that results in dyadic independence (i.e., edge-independent)
distributions, while being able to meaningfully model both local information of
the graph (e.g., degrees) as well as global information (e.g., clustering
coefficient, assortativity, etc.) if desired. This allows one to efficiently
generate random networks with similar properties as an observed network, and
the models can be used for several downstream tasks such as link prediction.
Our methods are scalable to sparse graphs consisting of millions of nodes.
Empirical evaluation demonstrates competitiveness in terms of both speed and
accuracy with state-of-the-art methods -- which are typically based on
embedding the graph into some low-dimensional space -- for link prediction,
showcasing the potential of a more direct and interpretable probabilistic model
for this task. Comment: Accepted for DSAA 2020 conference
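As a rough illustration of what a dyadic-independence (edge-independent) distribution means in the abstract above, the sketch below samples an undirected graph in which each edge is drawn independently with a probability that depends only on the blocks of its two endpoints. This is not the paper's block-approximation method, and the quadratic loop would not scale to millions of nodes; `sample_block_graph`, `block_of`, and `block_probs` are hypothetical names.

```python
import random

def sample_block_graph(block_of, block_probs, seed=0):
    """Sample an undirected graph with independent edges.

    block_of    -- list giving the block index of each node
    block_probs -- square matrix (list of lists); block_probs[a][b] is
                   the edge probability between a node in block a and
                   a node in block b
    """
    rng = random.Random(seed)
    n = len(block_of)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = block_probs[block_of[i]][block_of[j]]
            # Each dyad is an independent Bernoulli(p) draw.
            if rng.random() < p:
                edges.append((i, j))
    return edges
```

Because every dyad is drawn independently, the probability matrix itself can be read as a link-prediction score for unobserved pairs.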
Using Machine Learning to Predict Swine Movements within a Regional Program to Improve Control of Infectious Diseases in the US.
Between-farm animal movement is one of the most important factors influencing the spread of infectious diseases in food animals, including in the US swine industry. Understanding the structural network of contacts in a food animal industry is a prerequisite to planning for efficient production strategies and for effective disease control measures. Unfortunately, data regarding between-farm animal movements in the US are not systematically collected and thus, such information is often unavailable. In this paper, we develop a procedure to replicate the structure of a network, making use of the partial data available, and subsequently use the model developed to predict animal movements among sites in 34 Minnesota counties. First, we summarized two networks of swine-producing facilities in Minnesota, then we used a machine learning technique referred to as random forest, an ensemble of independent classification trees, to estimate the probability of pig movements between farms and/or market sites located in two counties in Minnesota. The model was calibrated and tested by comparing predicted data and observed data in those two counties for which data were available. Finally, the model was used to predict animal movements in sites located across 34 Minnesota counties. Variables that were important in predicting pig movements included between-site distance, ownership, and production type of the sending and receiving farms and/or markets. Using a weighted-kernel approach to describe spatial variation in the centrality measures of the predicted network, we showed that the south-central region of the study area exhibited high aggregation of predicted pig movements. Our results show an overlap with the distribution of outbreaks of porcine reproductive and respiratory syndrome, which is believed to be transmitted, at least in part, through animal movements.
While the correspondence of movements and disease is not a causal test, it suggests that the predicted network may approximate actual movements. Accordingly, the predictions provided here might help to design and implement control strategies in the region. Additionally, the methodology here may be used to estimate contact networks for other livestock systems when only incomplete information regarding animal movements is available.