
    Reliable online social network data collection

    Large quantities of information are shared through online social networks, making them attractive sources of data for social network research. When studying the usage of online social networks, these data may not properly describe users’ behaviours. For instance, the data collected often include content shared by the users only, or content accessible to the researchers, hence obfuscating a large amount of data that would help in understanding users’ behaviours and privacy concerns. Moreover, the data collection methods employed in experiments may also have an effect on data reliability when participants self-report inaccurate information or are observed while using a simulated application. Understanding the effects of these collection methods on data reliability is paramount for the study of social networks; for understanding user behaviour; for designing socially-aware applications and services; and for mining data collected from such social networks and applications. This chapter reviews previous research which has looked at social network data collection and user behaviour in these networks. We highlight shortcomings in the methods used in these studies, and introduce our own methodology and user study based on the Experience Sampling Method; we claim our methodology leads to the collection of more reliable data by capturing both those data which are shared and not shared. We conclude with suggestions for collecting and mining data from online social networks.

    Causal inference for social network data

    We describe semiparametric estimation and inference for causal effects using observational data from a single social network. Our asymptotic result is the first to allow for dependence of each observation on a growing number of other units as sample size increases. While previous methods have generally implicitly focused on one of two possible sources of dependence among social network observations, we allow for both dependence due to transmission of information across network ties, and for dependence due to latent similarities among nodes sharing ties. We describe estimation and inference for new causal effects that are specifically of interest in social network settings, such as interventions on network ties and network structure. Using our methods to reanalyze the Framingham Heart Study data used in one of the most influential and controversial causal analyses of social network data, we find that after accounting for network structure there is no evidence for the causal effects claimed in the original paper.

    Using Social Network Analysis on classroom video data

    We propose a novel application of Social Network Analysis (SNA) using classroom video data as a means of quantitatively and visually exploring the collaborations between students. The context for our study was a summer program that works with first generation students and deaf/hard-of-hearing students to engage in authentic science practice and develop a supportive community. We applied SNA to data from one activity during the two-week program to test our approach and as a means to begin to assess whether the goals of the program are being met. We used SNA to identify groups that were interacting in unexpected ways and then to highlight how individuals were contributing to the overall group behavior. We plan to expand our new use of SNA to video data on a larger scale.

    Social Network Analysis on Food Web and Dispute Data

    Several social science disciplines, especially anthropology and sociology, have long engaged in social network analyses. Social Network Analysis (SNA) uses network theory to analyse social networks, which often involve individual social actors (people) and the relations between them. Social network analysis aims at understanding the network structure by description, visualization, and statistical modeling. In this research, the use of SNA is illustrated on two different datasets: food web data and militarized interstate dispute data.

    The Researcher Social Network: a social network based on metadata of scientific publications

    Scientific journals can capture a scholar’s research career. A researcher’s publication data often reflect his/her research interests and social relations. It has been demonstrated that scientist collaboration networks can be constructed based on co-authorship data from journal papers. The problem with such a network is that researchers are limited within their professional social network. This work proposes the idea of constructing a researcher’s social network based on data harvested from metadata of scientific publications and personal online profiles. We hypothesize that data such as publication keywords, personal interests, the themes of the conferences where papers are published, and co-authors of the papers either directly or indirectly represent the authors’ research interests, and that by measuring the similarity between these data we are able to construct a researcher social network. Based on the four types of data mentioned above, social network graphs were plotted, studied and analyzed. These graphs were then rated by the researchers themselves. Based on this evaluation, we estimated the weight for each type of data, in order to blend all data together to construct one ideal researcher’s social network. Interestingly, our results showed that a graph based on publications’ keywords was more representative than the one based on publications’ co-authorship. The findings from the evaluation were used to propose a dynamic social network data model.
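The blending step described in this abstract can be pictured with a minimal sketch. Everything specific here is an assumption for illustration: the abstract does not name a similarity measure or report the estimated weights, so Jaccard similarity, the `WEIGHTS` values, and the `blended_edges` helper are hypothetical stand-ins rather than the authors' method.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical weights for the four data types named in the abstract.
# The paper estimates its weights from researcher ratings; these are made up.
WEIGHTS = {"keywords": 0.4, "interests": 0.2, "themes": 0.2, "coauthors": 0.2}

def blended_edges(profiles, weights=WEIGHTS, threshold=0.0):
    """Return {(u, v): score} for researcher pairs whose weighted similarity
    across all data types exceeds the threshold."""
    edges = {}
    for u, v in combinations(sorted(profiles), 2):
        score = sum(
            w * jaccard(profiles[u].get(kind, ()), profiles[v].get(kind, ()))
            for kind, w in weights.items()
        )
        if score > threshold:
            edges[(u, v)] = score
    return edges

# Toy profiles (invented data, for illustration only).
profiles = {
    "A": {"keywords": {"graphs", "privacy"}},
    "B": {"keywords": {"graphs"}},
    "C": {"keywords": {"optics"}},
}
edges = blended_edges(profiles)
```

With these toy profiles only the pair A and B shares a keyword, so only that edge is kept; raising `threshold` prunes weaker ties.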

    Social Network Data Management

    With the increasing usage of online social networks and the semantic web's graph-structured RDF framework, and the rising adoption of networks in various fields from biology to social science, there is a rapidly growing need for indexing, querying, and analyzing massive graph-structured data. Facebook has amassed over 500 million users creating huge volumes of highly connected data. Governments have made RDF datasets containing billions of triples available to the public. In the life sciences, researchers have started to connect disparate data sets of research results into one giant network of valuable information. Clearly, networks are becoming increasingly popular and growing rapidly in size, requiring scalable solutions for network data management. This thesis focuses on the following aspects of network data management. We present a hierarchical index structure for external memory storage of network data that aims to maximize data locality. We propose efficient algorithms to answer subgraph matching queries against network databases and discuss effective pruning strategies to improve performance. We show how adaptive cost models can speed up subgraph matching query answering by assigning budgets to index retrieval operations and adjusting the query plan while executing. We develop a cloud-oriented social network database, COSI, which handles massive network datasets too large for a single computer by partitioning the data across multiple machines and achieving high performance query answering through asynchronous parallelization and cluster-aware heuristics. Tracking multiple standing queries against a social network database is much faster with our novel multi-view maintenance algorithm, which exploits common substructures between queries. To capture uncertainty inherent in social network querying, we define probabilistic subgraph matching queries over deterministic graph data and propose algorithms to answer them efficiently. Finally, we introduce a general relational machine learning framework and rule-based language, Probabilistic Soft Logic, to learn from and probabilistically reason about social network data and describe applications to information integration and information fusion.

    Potential Networks, Contagious Communities, and Understanding Social Network Structure

    In this paper we study how the network of agents adopting a particular technology relates to the structure of the underlying network over which the technology adoption spreads. We develop a model and show that the network of agents adopting a particular technology may have characteristics that differ significantly from the social network of agents over which the technology spreads. For example, the network induced by a cascade may have a heavy-tailed degree distribution even if the original network does not. This provides evidence that online social networks created by technology adoption over an underlying social network may look fundamentally different from social networks and indicates that using data from many online social networks may mislead us if we try to use it to directly infer the structure of social networks. Our results provide an alternate explanation for certain properties repeatedly observed in data sets, for example: heavy-tailed degree distribution, network densification, shrinking diameter, and network community profile. These properties could be caused by a sort of 'sampling bias' rather than by attributes of the underlying social structure. By generating networks using cascades over traditional network models that do not themselves contain these properties, we can nevertheless reliably produce networks that contain all these properties. An opportunity for interesting future research is developing new methods that correctly infer underlying network structure from data about a network that is generated via a cascade spread over the underlying network. Comment: To appear in Proceedings of the 22nd International World Wide Web Conference (WWW 2013).
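The idea of a "network induced by a cascade" can be made concrete with a rough sketch. It assumes a simple independent-cascade style spread over a uniform random graph; the abstract does not specify the paper's model, so the functions `random_graph`, `cascade`, and `induced`, and the parameters `p` and `q`, are hypothetical and chosen only to show how the adopter network is a sampled subgraph of the underlying one.

```python
import random

def random_graph(n, p, rng):
    """Undirected random graph: each pair is linked with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def cascade(adj, seed_node, q, rng):
    """Spread adoption from seed_node; each newly adopted node gets one
    chance to convert each non-adopting neighbour, succeeding with prob. q."""
    adopted = {seed_node}
    frontier = [seed_node]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in sorted(adj[u]):
                if v not in adopted and rng.random() < q:
                    adopted.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return adopted

def induced(adj, nodes):
    """Subgraph of adj restricted to the given node set."""
    return {u: adj[u] & nodes for u in nodes}

rng = random.Random(0)  # fixed seed so the run is repeatable
underlying = random_graph(50, 0.1, rng)
adopters = cascade(underlying, 0, 0.5, rng)
observed = induced(underlying, adopters)

# Degree sequences of the two networks can then be compared.
underlying_degrees = sorted(len(nbrs) for nbrs in underlying.values())
observed_degrees = sorted(len(nbrs) for nbrs in observed.values())
```

Comparing `underlying_degrees` with `observed_degrees` over many runs is one way to look for the kind of 'sampling bias' the abstract describes; this toy setup is not claimed to reproduce the paper's results.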