1,144 research outputs found

    Semantic Networks of Interests in Online NSSI Communities

    Persons who engage in non-suicidal self-injury (NSSI) often conceal their practices, which limits the examination and understanding of those who engage in NSSI. The goal of this research is to use public online social networks (namely LiveJournal, a major blogging network) to observe the NSSI population's communication in a naturally occurring setting. Specifically, LiveJournal users can publicly declare their interests. We collected the self-declared interests of 22,000 users who are members of or participate in 43 NSSI-related communities. We extracted a bimodal socio-semantic network of users and interests based on their similarity. The semantic subnetwork of interests contains NSSI terms (such as "self-injury" and "razors"), references to music performers (such as "Nine Inch Nails"), and general daily life and creativity related terms (such as "poetry" and "boys"). Assuming users are genuine in their declarations, the words reveal distinct patterns of interest and may signal keys to NSSI.
    Comment: 5 pages, 3 figures. Presented at Words and Networks: Language Use in Socio-Technical Networks (workshop at 2012 ACM Web Science Conference).

    Ringo: Interactive Graph Analytics on Big-Memory Machines

    We present Ringo, a system for analysis of large graphs. Graphs provide a way to represent and analyze systems of interacting objects (people, proteins, webpages) with edges between the objects denoting interactions (friendships, physical interactions, links). Mining graphs provides valuable insights about individual objects as well as the relationships among them. In building Ringo, we take advantage of the fact that machines with large memory and many cores are widely available and also relatively affordable. This allows us to build an easy-to-use interactive high-performance graph analytics system. Graphs also need to be built from input data, which often resides in the form of relational tables. Thus, Ringo provides rich functionality for manipulating raw input data tables into various kinds of graphs. Furthermore, Ringo also provides over 200 graph analytics functions that can then be applied to constructed graphs. We show that a single big-memory machine provides a very attractive platform for performing analytics on all but the largest graphs as it offers excellent performance and ease of use as compared to alternative approaches. With Ringo, we also demonstrate how to integrate graph analytics with an iterative process of trial-and-error data exploration and rapid experimentation, common in data mining workloads.
    Comment: 6 pages, 2 figures

    Line graphs as social networks

    Line graphs are clustered and assortative. They share these topological features with some social networks. We argue that this similarity reveals the cliquey character of social networks. In the model proposed here, a social network is the line graph of an initial network of families, communities, interest groups, school classes and small companies. These groups play the role of nodes, and individuals are represented by links between these nodes. The picture is supported by data on the LiveJournal network of about 8 x 10^6 people. In particular, sharp maxima in the observed degree dependence of the clustering coefficient C(k) are associated with cliques in the social network.
    Comment: 11 pages, 4 figures
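    The line-graph construction described in this abstract can be illustrated with a minimal sketch. It assumes an undirected network of groups given as a plain edge list; the helper names `line_graph` and `local_clustering` and the toy input are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def line_graph(edges):
    """Build the line graph of an undirected graph.

    Every original edge (an individual linking two groups) becomes a node;
    two such nodes are adjacent when the original edges share an endpoint.
    Returns a dict mapping each edge to the set of adjacent edges.
    """
    edges = [tuple(sorted(e)) for e in edges]
    incident = {}
    for e in edges:
        for v in e:
            incident.setdefault(v, []).append(e)
    adj = {e: set() for e in edges}
    # All edges meeting at one original node are pairwise adjacent (a clique).
    for group in incident.values():
        for a, b in combinations(group, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy input: three individuals who all belong to group "g0".
star = [("g0", "g1"), ("g0", "g2"), ("g0", "g3")]
lg = line_graph(star)
for person in sorted(lg):
    print(person, local_clustering(lg, person))
```

    In this toy case the three individuals share one group, so they form a clique in the line graph, which is the mechanism the abstract links to high clustering.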

    The role of reciprocation in social network formation, with an application to blogging

    This paper deals with the role of reciprocation in the formation of individuals' social networks, that is, to what extent initiating a relation brings about its reciprocation. Following the activity of a panel of bloggers over more than a year, we seek to establish whether bloggers are mainly involved in social networking or are part of the media industry. We adapt a standard capital investment model to study the effect of reciprocation on the building of social capital. Results of our analysis confirm that activity and reciprocation both play a role in the dynamics of social media.
    Keywords: Bloggers, Friendship, LiveJournal, Media, Panel Data, Reciprocation, Reciprocity, Social Capital, Social Networks

    NScale: Neighborhood-centric Large-Scale Graph Analytics in the Cloud

    There is an increasing interest in executing complex analyses over large graphs, many of which require processing a large number of multi-hop neighborhoods or subgraphs. Examples include ego network analysis, motif counting, personalized recommendations, and others. These tasks are not well served by existing vertex-centric graph processing frameworks, where user programs are only able to directly access the state of a single vertex. This paper introduces NSCALE, a novel end-to-end graph processing framework that enables the distributed execution of complex subgraph-centric analytics over large-scale graphs in the cloud. NSCALE enables users to write programs at the level of subgraphs rather than at the level of vertices. Unlike most previous graph processing frameworks, which apply the user program to the entire graph, NSCALE allows users to declaratively specify subgraphs of interest. Our framework includes a novel graph extraction and packing (GEP) module that utilizes a cost-based optimizer to partition and pack the subgraphs of interest into memory on as few machines as possible. The distributed execution engine then takes over and runs the user program in parallel, while respecting the scope of the various subgraphs. Our experimental results show orders-of-magnitude improvements in performance and drastic reductions in the cost of analytics compared to vertex-centric approaches.
    Comment: 26 pages, 15 figures, 5 tables

    EAGr: Supporting Continuous Ego-centric Aggregate Queries over Large Dynamic Graphs

    In this work, we present EAGr, a system for supporting large numbers of continuous neighborhood-based ("ego-centric") aggregate queries over large, highly dynamic, and rapidly evolving graphs. Examples of such queries include computation of personalized, tailored trends in social networks, anomaly/event detection in financial transaction networks, local search and alerts in spatio-temporal networks, to name a few. Key challenges in supporting such continuous queries include high update rates typically seen in these situations, large numbers of queries that need to be executed simultaneously, and stringent low latency requirements. We propose a flexible, general, and extensible in-memory framework for executing different types of ego-centric aggregate queries over large dynamic graphs with low latencies. Our framework is built around the notion of an aggregation overlay graph, a pre-compiled data structure that encodes the computations to be performed when an update/query is received. The overlay graph enables sharing of partial aggregates across multiple ego-centric queries (corresponding to the nodes in the graph), and also allows partial pre-computation of the aggregates to minimize the query latencies. We present several highly scalable techniques for constructing an overlay graph given an aggregation function, and also design incremental algorithms for handling structural changes to the underlying graph. We also present an optimal, polynomial-time algorithm for making the pre-computation decisions given an overlay graph, and evaluate an approach to incrementally adapt those decisions as the workload changes. Although our approach is naturally parallelizable, we focus on a single-machine deployment and show that our techniques can easily handle graphs of size up to 320 million nodes and edges, and achieve update/query throughputs of over 500K/s using a single, powerful machine.
    Comment: 18 pages, 1 table, 14 figures
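    To make the notion of a continuous ego-centric aggregate concrete, the sketch below maintains, for every node, the sum of its neighbours' values and pushes each update to the affected aggregates. This is only a naive fully pre-computed baseline under assumed names (`EgoSum`, `add_edge`, `update`, `query`); it does not implement EAGr's aggregation overlay graph or its sharing of partial aggregates.

```python
from collections import defaultdict


class EgoSum:
    """Naive ego-centric SUM: every query answer is kept up to date on write."""

    def __init__(self):
        self.nbrs = defaultdict(set)     # node -> set of neighbours
        self.value = defaultdict(float)  # node -> current value
        self.agg = defaultdict(float)    # node -> sum of neighbours' values

    def add_edge(self, u, v):
        """Structural change: link u and v and fold in each other's value."""
        if u == v or v in self.nbrs[u]:
            return
        self.nbrs[u].add(v)
        self.nbrs[v].add(u)
        self.agg[u] += self.value[v]
        self.agg[v] += self.value[u]

    def update(self, u, new_value):
        """Value update: push the change to every neighbour's aggregate."""
        delta = new_value - self.value[u]
        self.value[u] = new_value
        for w in self.nbrs[u]:
            self.agg[w] += delta

    def query(self, u):
        """Constant-time read of the pre-computed ego-centric sum."""
        return self.agg[u]


g = EgoSum()
g.add_edge("a", "b")
g.add_edge("a", "c")
g.update("b", 2.0)
g.update("c", 3.0)
print(g.query("a"))
```

    Pushing on every write keeps reads cheap but makes updates cost proportional to the node's degree; the abstract's pre-computation decisions concern exactly this trade-off between update and query work.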

    EmptyHeaded: A Relational Engine for Graph Processing

    There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail). High-level engines are easier to use but are orders of magnitude slower than the low-level graph engines. We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. At the core of EmptyHeaded's design is a new class of join algorithms that satisfy strong theoretical guarantees but have thus far not achieved performance comparable to that of specialized graph processing engines. To achieve high performance, EmptyHeaded introduces a new join engine architecture, including a novel query optimizer and data layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high-level approaches by up to three orders of magnitude on graph pattern queries, PageRank, and Single-Source Shortest Paths (SSSP) and is an order of magnitude faster than many low-level baselines. We validate that EmptyHeaded competes with the best-of-breed low-level engine (Galois), achieving comparable performance on PageRank and at most 3x worse performance on SSSP.

    PyTorch-BigGraph: A Large-scale Graph Embedding System

    Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to traditional multi-relation embedding systems that allow it to scale to graphs with billions of nodes and trillions of edges. PBG uses graph partitioning to train arbitrarily large embeddings on either a single machine or in a distributed environment. We demonstrate comparable performance with existing embedding systems on common benchmarks, while allowing for scaling to arbitrarily large graphs and parallelization on multiple machines. We train and evaluate embeddings on several large social network graphs as well as the full Freebase dataset, which contains over 100 million nodes and 2 billion edges.

    Being Rational or Aggressive? A Revisit to Dunbar's Number in Online Social Networks

    Recent years have witnessed the explosion of online social networks (OSNs). They provide powerful IT innovations for online social activities such as organizing contacts, publishing content, and sharing interests between friends who may never have met before. As more and more people become active users of online social networks, one may ponder questions such as: (1) Do OSNs indeed improve our sociability? (2) To what extent can we expand our offline social spectrum in OSNs? (3) Can we identify some interesting user behaviors in OSNs? Our work in this paper aims to answer these questions. To this end, we revisit the well-known Dunbar's number in online social networks. Our main research contributions are as follows. First, to the best of our knowledge, our work is the first to systematically validate the existence of the online Dunbar's number in the range of [200,300]. To reach this, we combine local-structure analysis and user-interaction analysis for extensive real-world OSNs. Second, we divide OSN users into two categories, rational and aggressive, and find that rational users intend to develop close and reciprocated relationships, whereas aggressive users have no consistent behaviors. Third, we build a simple model to capture the constraints of time and cognition that affect the evolution of online social networks. Finally, we show the potential use of our findings in viral marketing and privacy management in online social networks.

    Network Sampling: From Static to Streaming Graphs

    Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network sampling, by highlighting the different objectives, population and units of interest, and classes of network sampling methods. In addition, we propose a spectrum of computational models for network sampling methods, ranging from the traditionally studied model based on the assumption of a static domain to a more challenging model that is appropriate for streaming domains. We design a family of sampling methods based on the concept of graph induction that generalize across the full spectrum of computational models (from static to streaming) while efficiently preserving many of the topological properties of the input graphs. Furthermore, we demonstrate how traditional static sampling algorithms can be modified for graph streams for each of the three main classes of sampling methods: node, edge, and topology-based sampling. Our experimental results indicate that our proposed family of sampling methods more accurately preserves the underlying properties of the graph for both static and streaming graphs. Finally, we study the impact of network sampling algorithms on the parameter estimation and performance evaluation of relational classification algorithms.
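    As a rough illustration of node sampling combined with graph induction in the static setting, the sketch below draws a uniform random set of nodes and keeps every edge whose two endpoints were both drawn. It assumes an in-memory edge list; the function name `induced_node_sample` and its parameters are hypothetical, and this is the textbook induced-subgraph sampler rather than the specific family of methods proposed in the paper.

```python
import random


def induced_node_sample(edges, fraction, seed=0):
    """Uniform node sampling followed by graph induction.

    Returns (kept_nodes, induced_edges): the sampled node set and all input
    edges with both endpoints inside that set.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    nodes = sorted({v for e in edges for v in e})
    if not nodes:
        return set(), []
    k = max(1, round(fraction * len(nodes)))
    kept = set(rng.sample(nodes, k))
    # Induction step: keep an edge only if both endpoints were sampled.
    induced = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, induced


# Hypothetical toy input: a ring of 10 nodes.
ring = [(i, (i + 1) % 10) for i in range(10)]
kept, induced = induced_node_sample(ring, 0.5)
print(len(kept), len(induced))
```

    A streaming variant would have to decide on each edge as it arrives, without the full node list available up front, which is the harder setting the abstract addresses.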