Reliable online social network data collection
Large quantities of information are shared through online social networks, making them attractive sources of data for social network research. When studying the usage of online social networks, these data may not properly describe users’ behaviours. For instance, the data collected often include only content shared by the users, or content accessible to the researchers, hence obscuring a large amount of data that would help in understanding users’ behaviours and privacy concerns. Moreover, the data collection methods employed in experiments may also have an effect on data reliability when participants self-report inaccurate information or are observed while using a simulated application. Understanding the effects of these collection methods on data reliability is paramount for the study of social networks; for understanding user behaviour; for designing socially-aware applications and services; and for mining data collected from such social networks and applications. This chapter reviews previous research which has looked at social network data collection and user behaviour in these networks. We highlight shortcomings in the methods used in these studies, and introduce our own methodology and user study based on the Experience Sampling Method; we claim our methodology leads to the collection of more reliable data by capturing both those data which are shared and those which are not shared. We conclude with suggestions for collecting and mining data from online social networks.
Causal inference for social network data
We describe semiparametric estimation and inference for causal effects using
observational data from a single social network. Our asymptotic result is the
first to allow for dependence of each observation on a growing number of other
units as sample size increases. While previous methods have generally
implicitly focused on one of two possible sources of dependence among social
network observations, we allow for both dependence due to transmission of
information across network ties, and for dependence due to latent similarities
among nodes sharing ties. We describe estimation and inference for new causal
effects that are specifically of interest in social network settings, such as
interventions on network ties and network structure. Using our methods to
reanalyze the Framingham Heart Study data used in one of the most influential
and controversial causal analyses of social network data, we find that after
accounting for network structure there is no evidence for the causal effects
claimed in the original paper.
Using Social Network Analysis on classroom video data
We propose a novel application of Social Network Analysis (SNA) using
classroom video data as a means of quantitatively and visually exploring the
collaborations between students. The context for our study was a summer program
that works with first generation students and deaf/hard-of-hearing students to
engage in authentic science practice and develop a supportive community. We
applied SNA to data from one activity during the two-week program to test our
approach and as a means to begin to assess whether the goals of the program are
being met. We used SNA to identify groups that were interacting in unexpected
ways and then to highlight how individuals were contributing to the overall
group behavior. We plan to expand our new use of SNA to video data on a larger
scale.
Social Network Analysis on Food Web and Dispute Data
Several social science disciplines, especially anthropology and sociology, have long engaged in social network analysis. Social Network Analysis (SNA) uses network theory to analyse social networks, which typically consist of individual social actors (people) and the relations between them. SNA aims to understand network structure through description, visualization, and statistical modeling. In this research, the use of SNA is illustrated on two different datasets: food web data and militarized interstate dispute data.
The Researcher Social Network: a social network based on metadata of scientific publications
Scientific journals can capture a scholar’s research career. A researcher’s publication data often reflect their research interests and social relations. It has been demonstrated that scientist collaboration networks can be constructed based on co-authorship data from journal papers. The problem with such a network is that researchers are limited to their professional social network. This work proposes constructing a researcher’s social network based on data harvested from the metadata of scientific publications and from personal online profiles. We hypothesize that data such as publication keywords, personal interests, the themes of the conferences where papers are published, and the co-authors of the papers represent, either directly or indirectly, the authors’ research interests, and that by measuring the similarity between these data we are able to construct a researcher social network. Based on the four types of data mentioned above, social network graphs were plotted, studied and analyzed. These graphs were then rated by the researchers themselves. Based on this evaluation, we estimated the weight for each type of data, in order to blend all data together to construct one ideal researcher’s social network. Interestingly, our results showed that a graph based on publication keywords was more representative than one based on co-authorship. The findings from the evaluation were used to propose a dynamic social network data model.
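The weighted blending of the four data types described in this abstract can be illustrated with a minimal sketch. It assumes Jaccard similarity between sets and a simple weighted sum; the abstract does not name the similarity measure or the estimated weights, so `jaccard`, `blended_edges`, the weights, the threshold, and the sample profiles are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections treated as sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def blended_edges(profiles, weights, threshold):
    """Return {(u, v): score} for researcher pairs whose blended similarity
    reaches the threshold. The score is a weighted sum of per-data-type
    Jaccard similarities (an assumed blending rule, not the authors')."""
    edges = {}
    for u, v in combinations(sorted(profiles), 2):
        score = sum(
            w * jaccard(profiles[u].get(kind, ()), profiles[v].get(kind, ()))
            for kind, w in weights.items()
        )
        if score >= threshold:
            edges[(u, v)] = score
    return edges


# Hypothetical weights for the four data types named in the abstract.
weights = {"keywords": 0.4, "interests": 0.2, "conferences": 0.2, "coauthors": 0.2}

# Hypothetical researcher profiles.
profiles = {
    "alice": {"keywords": {"graphs", "privacy"}, "interests": {"chess"},
              "conferences": {"WWW"}, "coauthors": {"carol"}},
    "bob": {"keywords": {"graphs", "privacy"}, "interests": {"hiking"},
            "conferences": {"WWW"}, "coauthors": {"dave"}},
    "erin": {"keywords": {"genomics"}, "interests": {"chess"},
             "conferences": {"ISMB"}, "coauthors": {"frank"}},
}

edges = blended_edges(profiles, weights, threshold=0.3)
```

In this toy data, alice and bob share keywords and a conference but no co-authors, so they are linked even though a pure co-authorship graph would leave them unconnected.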
Social Network Data Management
With the increasing usage of online social networks and the semantic web's graph structured RDF framework, and the rising adoption of networks in various fields from biology to social science, there is a rapidly growing need for indexing, querying, and analyzing massive graph structured data. Facebook has amassed over 500 million users creating huge volumes of highly connected data. Governments have made RDF datasets containing billions of triples available to the public. In the life sciences, researchers have started to connect disparate data sets of research results into one giant network of valuable information. Clearly, networks are becoming increasingly popular and growing rapidly in size, requiring scalable solutions for network data management.
This thesis focuses on the following aspects of network data management. We present a hierarchical index structure for external memory storage of network data that aims to maximize data locality. We propose efficient algorithms to answer subgraph matching queries against network databases and discuss effective pruning strategies to improve performance. We show how adaptive cost models can speed up subgraph matching query answering by assigning budgets to index retrieval operations and adjusting the query plan while executing.
We develop a cloud oriented social network database, COSI, which handles massive network datasets too large for a single computer by partitioning the data across multiple machines and achieving high performance query answering through asynchronous parallelization and cluster-aware heuristics.
Tracking multiple standing queries against a social network database is much faster with our novel multi-view maintenance algorithm, which exploits common substructures between queries.
To capture uncertainty inherent in social network querying, we define probabilistic subgraph matching queries over deterministic graph data and propose algorithms to answer them efficiently.
Finally, we introduce a general relational machine learning framework and rule-based language, Probabilistic Soft Logic, to learn from and probabilistically reason about social network data and describe applications to information integration and information fusion.
Potential Networks, Contagious Communities, and Understanding Social Network Structure
In this paper we study how the network of agents adopting a particular
technology relates to the structure of the underlying network over which the
technology adoption spreads. We develop a model and show that the network of
agents adopting a particular technology may have characteristics that differ
significantly from the social network of agents over which the technology
spreads. For example, the network induced by a cascade may have a heavy-tailed
degree distribution even if the original network does not.
This provides evidence that online social networks created by technology
adoption over an underlying social network may look fundamentally different
from social networks and indicates that using data from many online social
networks may mislead us if we try to use it to directly infer the structure of
social networks. Our results provide an alternate explanation for certain
properties repeatedly observed in data sets, for example: heavy-tailed degree
distribution, network densification, shrinking diameter, and network community
profile. These properties could be caused by a sort of 'sampling bias' rather
than by attributes of the underlying social structure. By generating networks
using cascades over traditional network models that do not themselves contain
these properties, we can nevertheless reliably produce networks that contain
all these properties.
An opportunity for interesting future research is developing new methods that
correctly infer underlying network structure from data about a network that is
generated via a cascade spread over the underlying network.
Comment: To appear in Proceedings of the 22nd International World Wide Web
Conference (WWW 2013).
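The idea of generating a network by running a cascade over a traditional network model can be sketched as follows. This is only an illustration under stated assumptions: it uses an Erdős–Rényi graph as the underlying network and a simple independent-cascade-style spread, and it takes the "cascade network" to be the subgraph induced on adopters. The paper's actual model may differ, and the function names and parameters here are hypothetical.

```python
import random
from collections import deque


def erdos_renyi(n, p, rng):
    """Undirected G(n, p) random graph as an adjacency dict of sets."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def cascade(adj, seed, q, rng):
    """Spread adoption from `seed`: each new adopter gets one chance to
    convert each non-adopting neighbour, succeeding with probability q."""
    adopted = {seed}
    frontier = deque([seed])
    while frontier:
        u = frontier.popleft()
        for v in sorted(adj[u]):
            if v not in adopted and rng.random() < q:
                adopted.add(v)
                frontier.append(v)
    return adopted


def induced(adj, nodes):
    """Subgraph of `adj` restricted to `nodes` (assumed notion of the cascade network)."""
    return {u: adj[u] & nodes for u in nodes}


rng = random.Random(0)                       # fixed seed for repeatability
underlying = erdos_renyi(200, 0.03, rng)     # underlying social network
adopters = cascade(underlying, seed=0, q=0.3, rng=rng)
observed = induced(underlying, adopters)     # network visible through adoption

# Degree sequences of the two networks can then be compared directly.
underlying_degrees = sorted(len(nb) for nb in underlying.values())
observed_degrees = sorted(len(nb) for nb in observed.values())
```

Comparing `underlying_degrees` with `observed_degrees` over many runs is one way to see how the adopter network can differ from the network it spread over; this sketch makes no claim to reproduce the specific properties reported in the abstract.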