    Learning and Inference in Massive Social Networks

    Researchers and practitioners are increasingly gaining access to data on explicit social networks. For example, telecommunications and technology firms record data on consumer networks (via phone calls, emails, voice-over-IP, instant messaging), and social-network portal sites such as MySpace, Friendster and Facebook record consumer-generated data on social networks. Inference for fraud detection [5, 3, 8], marketing [9], and other tasks can be improved with learned models that take social networks into account and with collective inference [12], which allows inferences about nodes in the network to affect each other. However, these social-network graphs can be huge, comprising millions to billions of nodes and one or two orders of magnitude more links. This paper studies the application of collective inference to improve prediction over a massive graph. Faced initially with a social network comprising hundreds of millions of nodes and a few billion edges, our goal is to produce an approximate consumer network that is orders of magnitude smaller but still facilitates improved performance via collective inference. We introduce a sampling technique designed to reduce the size of the network by many orders of magnitude while keeping the linkages that facilitate improved prediction via collective inference. In short, the sampling scheme operates as follows: (1) choose a set of nodes of interest; (2) then, in analogy to snowball sampling [14], grow local graphs around these nodes, adding their social networks, their neighbors’ social networks, and so on; (3) next, prune these local graphs of edges which are expected to contribute little to the collective inference; (4) finally, connect the local graphs together to form a graph with (hopefully) useful inference connectivity. We apply this sampling method to assess whether collective inference can improve learned targeted-marketing models for a social network of consumers of telecommunication services.
Prior work [9] has shown that including social-neighborhood information improves the learning of targeting models, in particular information on existing customers in the immediate social network of a potential target. However, the improvement was restricted to the “network neighbors”, those targets linked to a prior customer thought to be good candidates for the new service. Collective inference techniques may extend the predictive influence of existing customers beyond their immediate neighborhoods. For the present work, our motivating conjecture has been that this influence can improve prediction for consumers who are not strongly connected to existing customers. Our results show that this is indeed the case: collective inference on the approximate network enables significantly improved predictive performance for non-network-neighbor consumers, and for consumers who have few links to existing customers. In the rest of this extended abstract we motivate our approach, describe our sampling method, present results on applying our approach to a large real-world target marketing campaign in the telecommunications industry, and finally discuss our findings.
NYU, Stern School of Business, IOMS Department, Center for Digital Economy Researc
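
The four sampling steps described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `max_hops` limit, and the use of an edge-weight threshold (`min_weight`) as the pruning criterion are all assumptions made for the example, since the abstract does not say how edges "expected to contribute little" are identified. The graph is assumed to be an undirected, symmetric adjacency dictionary with no self-loops.

```python
from collections import deque


def grow_local_graph(adj, seed, max_hops):
    """Step 2: breadth-first growth of a local graph around one seed node.

    Returns the set of undirected edges touched while expanding nodes that
    are fewer than max_hops away from the seed.
    """
    edges = set()
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nbr in adj.get(node, {}):
            edges.add(frozenset((node, nbr)))
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, depth + 1))
    return edges


def edge_weight(adj, edge):
    """Look up the weight of an undirected edge (adjacency must be symmetric)."""
    u, v = tuple(edge)
    return adj[u][v]


def sample_network(adj, seeds, max_hops=2, min_weight=2):
    """Steps 1-4: grow, prune, and merge local graphs around the seeds."""
    sampled = set()
    for seed in seeds:  # step 1: the caller chooses the nodes of interest
        local = grow_local_graph(adj, seed, max_hops)
        # Step 3 (assumed criterion): drop edges lighter than min_weight.
        pruned = {e for e in local if edge_weight(adj, e) >= min_weight}
        # Step 4: local graphs connect wherever they share nodes or edges.
        sampled |= pruned
    return sampled


# Hypothetical toy network: weights stand in for, e.g., call volume.
toy = {
    "a": {"b": 5, "c": 1},
    "b": {"a": 5, "d": 4},
    "c": {"a": 1},
    "d": {"b": 4, "e": 3},
    "e": {"d": 3, "f": 2},
    "f": {"e": 2},
}
one_hop = sample_network(toy, ["a", "e"], max_hops=1)
two_hop = sample_network(toy, ["a", "e"], max_hops=2)
```

In this toy case, one hop yields two separate local graphs around `a` and `e`, while two hops pulls in the `b`-`d` edge, joining them into a single sampled network over which a collective inference procedure could then be run.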

    Human neuromaturation, juvenile extreme energy liability, and adult cognition/cooperation

    Human childhood and adolescence are the period in which adult cognitive competences (including those that create the unique cooperativeness of humans) are acquired. It is also a period when neural development puts a juvenile’s survival at risk due to the high vulnerability of the brain to energy shortage. The brain of a 4-year-old human uses ≈50% of the body's total energy expenditure (TEE) (cf. ≈12% in adults). This brain expensiveness is due to (1) the brain making up ≈6% of a 4-year-old's body compared to 2% in an adult, and (2) increased energy metabolism that is ≈100% greater in the gray matter of a child than in an adult (a result of the extra costs of synaptic neuromaturation). The high absolute number of neurons in the human brain requires, as part of learning, a prolonged neurodevelopment. This refines inter- and intra-area neural networks so they become structured with economical “small world” connectivity attributes (such as hub organization and high cross-brain differentiation/integration). Once acquired, this connectivity enables highly complex adult cognitive capacities. Humans evolved as hunter-gatherers. Contemporary hunter-gatherers (and likely Middle Paleolithic ones as well) pool high-energy foods in an egalitarian manner that reliably supports mothers and juveniles with high energy intake. This type of sharing, unique to humans, protects the immature brain against energy shortage. This cooperation that protects neuromaturation arises from adults having the capacity to communicate and evaluate social reputation, cognitive skills that exist as a result of extended neuromaturation. Human biology is therefore characterized by a presently overlooked bioenergetic-cognition loop (called here the “HEBE ring”) by which extended neuromaturation creates the cooperative abilities in adults that support juveniles through the potentially vulnerable period of the neurodevelopment needed to become such adults.

    Transforming Graph Representations for Statistical Relational Learning

    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

    Analysis of group evolution prediction in complex networks

    In a world in which acceptance by and identification with social communities are highly desired, the ability to predict the evolution of groups over time is a vital but very complex research problem. We therefore propose a new, adaptable, generic and multi-stage method for Group Evolution Prediction (GEP) in complex networks that facilitates reasoning about the future states of recently discovered groups. The precise GEP modularity enabled us to carry out extensive and versatile empirical studies on many real-world complex/social networks to analyze the impact of numerous setups and parameters, such as time window type and size, group detection method, evolution chain length, and prediction models. Additionally, many new predictive features reflecting the group state at a given time have been identified and tested. Other research problems, such as enriching learning evolution chains with external data, have been analyzed as well.

    Prediction, evolution and privacy in social and affiliation networks

    In the last few years, there has been a growing interest in studying online social and affiliation networks, leading to a new category of inference problems that consider the actor characteristics and their social environments. These problems have a variety of applications, from creating more effective marketing campaigns to designing better personalized services. Predictive statistical models allow learning hidden information automatically in these networks but also bring many privacy concerns. Three of the main challenges that I address in my thesis are understanding 1) how the complex observed and unobserved relationships among actors can help in building better behavior models and in designing more accurate predictive algorithms, 2) what processes drive network growth and link formation, and 3) what the implications of predictive algorithms are for the privacy of users who share content online. The majority of previous work on prediction, evolution and privacy in online social networks has concentrated on the single-mode networks which form around user-user links, such as friendship and email communication. However, single-mode networks often co-exist with two-mode affiliation networks in which users are linked to other entities, such as social groups, online content and events. We study the interplay between these two types of networks and show that analyzing these higher-order interactions can reveal dependencies that are difficult to extract from the pair-wise interactions alone. In particular, we present our contributions to the challenging problems of collective classification, link prediction, network evolution, anonymization and preserving privacy in social and affiliation networks. We evaluate our models on real-world data sets from well-known online social networks, such as Flickr, Facebook, Dogster and LiveJournal.

    Network Defense: Pruning, Grafting, and Closing to Prevent Leakage of Strategic Knowledge to Rivals

    We explore how firms protect themselves from the risks of knowledge spillover to indirectly connected rivals in a network of interorganizational ties. We argue that the safeguards to limit opportunistic behavior by directly linked firms in a dyad, which have been the focus of extant research, are insufficient to overcome extra-dyadic leakage risks. Instead, firms terminate or avoid ties that expose their knowledge to indirectly linked rivals (“pruning” and “grafting”) and embed themselves in dense networks (“closing”) to prevent strategic knowledge spillover. Through a longitudinal study of German board interlocks during 1990–2003, we find that firms are more likely to prune, graft, and close their networks as they accumulate strategic knowledge and as the firms to which they are interlocked increasingly generate indirect ties to competitors, even when controlling for dyadic safeguards discussed by prior research. We capture strategic knowledge by tracking firms’ experience in the former Warsaw Pact countries from immediately after the sudden fall of communism in 1990 until 2003. The study introduces indirect links to rivals as a source of knowledge spillover in networks, shows how firms deal with extra-dyadic risks, and provides a defensive explanation for the evolution of network composition and structure.