417 research outputs found

    Improving Individual Predictions using Social Networks Assortativity

    Social networks are known to be assortative with respect to many attributes, such as age, weight, wealth, level of education, ethnicity and gender. This can be explained by influence and homophily. Independently of its origin, this assortativity gives us information about each node given its neighbors. Assortativity can thus be used to improve individual predictions in a broad range of situations, when data are missing or inaccurate. This paper presents a general framework based on probabilistic graphical models to exploit social network structures for improving individual predictions of node attributes. Using this framework, we quantify the assortativity range leading to an accuracy gain in several situations. We finally show how specific characteristics of the network can improve performance further. For instance, the gender assortativity in real-world mobile phone data changes significantly according to some communication attributes. In this case, individual predictions with 75% accuracy are improved by up to 3%.
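
    To make the idea of "information about each node given its neighbors" concrete, here is a minimal sketch of a naive Bayes update. It is not the paper's graphical-model framework. It assumes two balanced labels ("a" and "b"), neighbor labels that are known exactly, and neighbors that are conditionally independent given the node's label. The function name and the `same_prob` parameter (the assumed probability that a neighbor shares the node's label) are hypothetical.

```python
from math import prod


def posterior_with_neighbors(prior_a, neighbor_labels, same_prob):
    """Posterior probability that a node has label "a".

    prior_a         -- probability of "a" from the individual predictor alone
    neighbor_labels -- iterable of "a"/"b" labels of the node's neighbors
    same_prob       -- assumed probability that a neighbor shares the node's
                       label (0.5 means no assortativity at all)
    """
    labels = list(neighbor_labels)
    # Likelihood of the observed neighbor labels if the node is "a" or "b".
    like_a = prod(same_prob if lab == "a" else 1 - same_prob for lab in labels)
    like_b = prod(same_prob if lab == "b" else 1 - same_prob for lab in labels)
    numerator = prior_a * like_a
    return numerator / (numerator + (1 - prior_a) * like_b)


if __name__ == "__main__":
    # A 0.75 prior combined with two agreeing and one disagreeing neighbor.
    print(posterior_with_neighbors(0.75, ["a", "a", "b"], 0.6))
```

    With `same_prob = 0.5` the neighbors carry no information and the function returns the prior unchanged, which matches the intuition that a gain requires some assortativity.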

    Locally Adaptive Dynamic Networks

    Our focus is on realistically modeling and forecasting dynamic networks of face-to-face contacts among individuals. Important aspects of such data that lead to problems with current methods include the tendency of the contacts to move between periods of slow and rapid changes, and the dynamic heterogeneity in the actors' connectivity behaviors. Motivated by this application, we develop a novel method for Locally Adaptive DYnamic (LADY) network inference. The proposed model relies on a dynamic latent space representation in which each actor's position evolves in time via stochastic differential equations. Using a state space representation for these stochastic processes and Pólya-gamma data augmentation, we develop an efficient MCMC algorithm for posterior inference along with tractable procedures for online updating and forecasting of future networks. We evaluate performance in simulation studies, and consider an application to face-to-face contacts among individuals in a primary school.

    A Socio-Informatic Approach to Automated Account Classification on Social Media

    Automated accounts on social media have become increasingly problematic. We propose a key feature in combination with existing methods to improve machine learning algorithms for bot detection. We successfully improve classification performance through including the proposed feature. Comment: International Conference on Social Media and Society

    Far from random? The role of homophily in student supervision

    The paper studies racial and gender homophily in student supervision relationships in a context of social transformation, South African academia. We develop a technique to separate choice homophily from that induced by the system, comprising two permutation tests repeated at two levels of aggregation, system and departments. We find clear evidence of homophily in student supervision, along racial lines in particular. Roughly half of the observed homophily is induced by the departments' composition and stays constant over time. Overall, choice homophily has similar magnitude along racial and gender dimensions. Further, we ask where choice homophily originates in the demographic groups of students and professors. We find that white (male) students have a high tendency to form same-type relations, while among professors it is black (female) professors who display the higher frequency. Group differences show that choice homophily is likely to originate from students in the former majority.
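
    As an illustration of what a permutation test for homophily can look like, here is a generic sketch. It is not the paper's two-test, two-level procedure. It assumes each supervision tie is given as a pair of group labels and that shuffling the supervisor labels across ties is an acceptable null model. The function name and parameters are hypothetical.

```python
import random


def homophily_permutation_test(student_groups, supervisor_groups,
                               n_perm=2000, seed=0):
    """Return (observed same-group ties, one-sided permutation p-value).

    student_groups[i] and supervisor_groups[i] are the group labels of the
    two ends of tie i. The null model reassigns supervisors to students at
    random while keeping both group compositions fixed.
    """
    rng = random.Random(seed)
    observed = sum(s == p for s, p in zip(student_groups, supervisor_groups))
    shuffled = list(supervisor_groups)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        count = sum(s == p for s, p in zip(student_groups, shuffled))
        if count >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    students = ["x"] * 10 + ["y"] * 10
    supervisors = ["x"] * 10 + ["y"] * 10  # every tie is same-group
    print(homophily_permutation_test(students, supervisors))
```

    Running the same test within each department instead of over the whole system would keep department composition fixed under the null, which is one way to read the distinction between system-induced and choice homophily.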

    Network dynamics in regional clusters: The perspective of an emerging economy

    Regional clusters are spatial agglomerations of firms operating in the same or connected industries, which enable innovation and economic performance for firms. A wealth of empirical literature shows that one of the key elements of the success of regional clusters is that they facilitate the formation of local inter-organizational networks, which act as conduits of knowledge and innovation. While most studies analyze the benefits and characteristics of regional cluster networks and focus on advanced economies and high tech 'hot spots', this paper advances the existing literature by analyzing network dynamics and taking an emerging economy's perspective. Using longitudinal data of a wine cluster in Chile and stochastic actor-oriented models for network dynamics, this paper examines what micro-level effects influence the formation of new knowledge ties among wineries. It finds that the coexistence of cohesion effects (reciprocity and transitivity) and the presence of inter-firm knowledge base heterogeneity contribute to the stability of an informal hierarchical network structure over time. Empirical results have interesting implications for cluster competitiveness and network studies, and for the burgeoning literature on corporate behavior in emerging economies. Keywords: regional clusters, knowledge networks, network dynamics, wine industry, Chile

    Block-Approximated Exponential Random Graphs

    An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs. By utilizing fast matrix block-approximation techniques, we propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions, while being able to meaningfully model both local information of the graph (e.g., degrees) as well as global information (e.g., clustering coefficient, assortativity, etc.) if desired. This allows one to efficiently generate random networks with similar properties as an observed network, and the models can be used for several downstream tasks such as link prediction. Our methods are scalable to sparse graphs consisting of millions of nodes. Empirical evaluation demonstrates competitiveness in terms of both speed and accuracy with state-of-the-art methods -- which are typically based on embedding the graph into some low-dimensional space -- for link prediction, showcasing the potential of a more direct and interpretable probabilistic model for this task. Comment: Accepted for DSAA 2020 conference

    Using Machine Learning to Predict Swine Movements within a Regional Program to Improve Control of Infectious Diseases in the US.

    Between-farm animal movement is one of the most important factors influencing the spread of infectious diseases in food animals, including in the US swine industry. Understanding the structural network of contacts in a food animal industry is a prerequisite to planning for efficient production strategies and for effective disease control measures. Unfortunately, data regarding between-farm animal movements in the US are not systematically collected and thus, such information is often unavailable. In this paper, we develop a procedure to replicate the structure of a network, making use of partial data available, and subsequently use the model developed to predict animal movements among sites in 34 Minnesota counties. First, we summarized two networks of swine producing facilities in Minnesota, then we used a machine learning technique referred to as random forest, an ensemble of independent classification trees, to estimate the probability of pig movements between farms and/or market sites located in two counties in Minnesota. The model was calibrated and tested by comparing predicted data and observed data in those two counties for which data were available. Finally, the model was used to predict animal movements in sites located across 34 Minnesota counties. Variables that were important in predicting pig movements included between-site distance, ownership, and production type of the sending and receiving farms and/or markets. Using a weighted-kernel approach to describe spatial variation in the centrality measures of the predicted network, we showed that the south-central region of the study area exhibited high aggregation of predicted pig movements. Our results show an overlap with the distribution of outbreaks of porcine reproductive and respiratory syndrome, which is believed to be transmitted, at least in part, through animal movements. While the correspondence of movements and disease is not a causal test, it suggests that the predicted network may approximate actual movements. Accordingly, the predictions provided here might help to design and implement control strategies in the region. Additionally, the methodology here may be used to estimate contact networks for other livestock systems when only incomplete information regarding animal movements is available.