Learning and Inference in Massive Social Networks
Researchers and practitioners are increasingly gaining access
to data on explicit social networks. For example, telecommunications
and technology firms record data on consumer
networks (via phone calls, emails, voice-over-IP, instant messaging),
and social-network portal sites such as MySpace,
Friendster and Facebook record consumer-generated data
on social networks. Inference for fraud detection [5, 3, 8],
marketing [9], and other tasks can be improved with learned
models that take social networks into account and with collective
inference [12], which allows inferences about nodes
in the network to affect each other. However, these social-network
graphs can be huge, comprising millions to billions
of nodes and one or two orders of magnitude more links.
This paper studies the application of collective inference
to improve prediction over a massive graph. Faced initially
with a social network comprising hundreds of millions of
nodes and a few billion edges, our goal is to produce an
approximate consumer network that is orders of magnitude
smaller, but still facilitates improved performance via collective
inference. We introduce a sampling technique designed
to reduce the size of the network by many orders of magnitude,
but to keep linkages that facilitate improved prediction
via collective inference.
In short, the sampling scheme operates as follows: (1)
choose a set of nodes of interest; (2) then, in analogy to
snowball sampling [14], grow local graphs around these nodes,
adding their social networks, their neighbors’ social networks,
and so on; (3) next, prune these local graphs of edges
which are expected to contribute little to the collective inference;
(4) finally, connect the local graphs together to form
a graph with (hopefully) useful inference connectivity.
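
The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `snowball_sample`, the `hops` limit, and the edge-weight threshold `min_weight` used as the pruning rule are all assumptions, since the abstract does not say how edges "expected to contribute little" are identified. Local graphs are joined simply by keeping every surviving edge between sampled nodes.

```python
from collections import deque


def snowball_sample(adj, seeds, hops, min_weight):
    """Hypothetical sketch of the four-step sampling scheme.

    adj        : dict mapping node -> {neighbor: edge weight} (symmetric)
    seeds      : nodes of interest (step 1)
    hops       : how far to grow each local graph (assumed parameter)
    min_weight : assumed pruning rule; edges below this weight are dropped
    """
    # Steps 1-2: grow local graphs breadth-first around the seed nodes.
    depth = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] == hops:
            continue  # do not expand beyond the hop limit
        for nbr in adj.get(node, {}):
            if nbr not in depth:
                depth[nbr] = depth[node] + 1
                queue.append(nbr)

    # Step 3: prune weak edges. Step 4: keep all surviving edges between
    # sampled nodes, which connects local graphs that touch each other.
    edges = set()
    for u in depth:
        for v, w in adj.get(u, {}).items():
            if v in depth and w >= min_weight:
                edges.add(frozenset((u, v)))
    return set(depth), edges
```

For example, with `adj = {"a": {"b": 5, "c": 1}, "b": {"a": 5}, "c": {"a": 1}}`, calling `snowball_sample(adj, ["a"], 1, 2)` samples all three nodes but keeps only the a-b edge, because the a-c edge falls below the assumed threshold.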
We apply this sampling method to assess whether collective
inference can improve learned targeted-marketing models
for a social network of consumers of telecommunication
services. Prior work [9] has shown improvement to the learning
of targeting models by including social-neighborhood
information—in particular, information on existing customers
in the immediate social network of a potential target. However,
the improvement was restricted to the “network neighbors”,
those targets linked to a prior customer thought to
be good candidates for the new service. Collective inference
techniques may extend the predictive influence of existing
customers beyond their immediate neighborhoods. For the
present work, our motivating conjecture has been that this
influence can improve prediction for consumers who are not
strongly connected to existing customers. Our results show
that this is indeed the case: collective inference on the approximate
network enables significantly improved predictive
performance for non-network-neighbor consumers, and for
consumers who have few links to existing customers.
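
To illustrate how collective inference can carry the influence of existing customers beyond their immediate neighbors, here is a minimal sketch. It assumes a simple iterative scheme in which each unlabeled node's score becomes the weighted average of its neighbors' current scores; the abstract does not state which collective inference procedure was used, so the function `collective_inference`, the `prior` value, and the fixed iteration count are hypothetical choices.

```python
def collective_inference(adj, known, prior=0.5, iterations=10):
    """Hypothetical neighbor-averaging sketch of collective inference.

    adj   : dict mapping node -> {neighbor: edge weight}
    known : dict mapping node -> fixed score (e.g. 1.0 for existing customers)
    prior : assumed starting score for nodes with no known label
    """
    scores = {node: known.get(node, prior) for node in adj}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in adj.items():
            if node in known:
                updated[node] = known[node]  # known labels stay fixed
                continue
            total = sum(nbrs.values())
            if total == 0:
                updated[node] = scores[node]  # isolated node keeps its score
                continue
            # Weighted average of the neighbors' current estimates.
            updated[node] = sum(
                w * scores.get(n, prior) for n, w in nbrs.items()
            ) / total
        scores = updated
    return scores
```

On a chain a-b-c where only `a` is a known customer, the score of `c` rises above the prior even though `c` has no direct link to `a`, which is the kind of effect on non-network-neighbor consumers described above.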
In the rest of this extended abstract we motivate our approach,
describe our sampling method, present results on
applying our approach to a large real-world target marketing
campaign in the telecommunications industry, and finally
discuss our findings.
NYU, Stern School of Business, IOMS Department, Center for Digital Economy Researc
Human neuromaturation, juvenile extreme energy liability, and adult cognition/cooperation
Human childhood and adolescence are the period in which adult cognitive competences (including those that create the unique cooperativeness of humans) are acquired. It is also a period when neural development puts a juvenile’s survival at risk because the developing brain is highly vulnerable to energy shortage. The brain of a 4-year-old human uses ≈50% of the child's total energy expenditure (TEE) (cf. adult ≈12%). This brain expensiveness is due to (1) the brain making up ≈6% of a 4-year-old's body compared to 2% in an adult, and (2) increased energy metabolism that is ≈100% greater in the gray matter of a child than in an adult (a result of the extra costs of synaptic neuromaturation). The high absolute number of neurons in the human brain requires, as part of learning, a prolonged neurodevelopment. This refines inter- and intraarea neural networks so they become structured with economical “small world” connectivity attributes (such as hub organization and high cross-brain differentiation/integration). Once acquired, this connectivity enables highly complex adult cognitive capacities. Humans evolved as hunter-gatherers. Contemporary hunter-gatherers (and probably Middle Paleolithic ones as well) pool high-energy foods in an egalitarian manner that reliably supports mothers and juveniles with high energy intake. This type of sharing, which is unique to humans, protects the immature brain against energy shortage. This cooperation that protects neuromaturation arises from adults having the capacity to communicate and evaluate social reputation, cognitive skills that exist as a result of extended neuromaturation. Human biology is therefore characterized by a presently overlooked bioenergetic-cognition loop (called here the “HEBE ring”) by which extended neuromaturation creates the cooperative abilities in adults that support juveniles through the potentially vulnerable period of the neurodevelopment needed to become such adults.
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
Analysis of group evolution prediction in complex networks
In a world in which acceptance and identification with social
communities are highly desired, the ability to predict evolution of groups over
time appears to be a vital but very complex research problem. Therefore, we
propose a new, adaptable, generic and multi-stage method for Group Evolution
Prediction (GEP) in complex networks that facilitates reasoning about the
future states of the recently discovered groups. The precise GEP modularity
enabled us to carry out extensive and versatile empirical studies on many
real-world complex / social networks to analyze the impact of numerous setups
and parameters like time window type and size, group detection method,
evolution chain length, prediction models, etc. Additionally, many new
predictive features reflecting the group state at a given time have been
identified and tested. Some other research problems like enriching learning
evolution chains with external data have been analyzed as well.
Prediction, evolution and privacy in social and affiliation networks
In the last few years, there has been a growing interest in studying online social and affiliation networks, leading to a new category of inference problems that consider the actor characteristics and their social environments. These problems have a variety of applications, from creating more effective marketing campaigns to designing better personalized services. Predictive statistical models allow learning hidden information automatically in these networks but also bring many privacy concerns. Three of the main challenges that I address in my thesis are understanding 1) how the complex observed and unobserved relationships among actors can help in building better behavior models, and in designing more accurate predictive algorithms, 2) what are the processes that drive the network growth and link formation, and 3) what are the implications of predictive algorithms to the privacy of users who share content online.
The majority of previous work in prediction, evolution and privacy in online social networks has concentrated on the single-mode networks which form around user-user links, such as friendship and email communication. However, single-mode networks often co-exist with two-mode affiliation networks in which users are linked to other entities, such as social groups, online content and events. We study the interplay between these two types of networks and show that analyzing these higher-order interactions can reveal dependencies that are difficult to extract from the pair-wise interactions alone. In particular, we present our contributions to the challenging problems of collective classification, link prediction, network evolution, anonymization and preserving privacy in social and affiliation networks. We evaluate our models on real-world data sets from well-known online social networks, such as Flickr, Facebook, Dogster and LiveJournal.
Network Defense: Pruning, Grafting, and Closing to Prevent Leakage of Strategic Knowledge to Rivals
We explore how firms protect themselves from the risks of knowledge spillover to indirectly connected rivals in a network of interorganizational ties. We argue that the safeguards to limit opportunistic behavior by directly linked firms in a dyad, which have been the focus of extant research, are insufficient to overcome extra-dyadic leakage risks. Instead, firms terminate or avoid ties that expose their knowledge to indirectly linked rivals (“pruning” and “grafting”) and embed themselves in dense networks (“closing”) to prevent strategic knowledge spillover. Through a longitudinal study of German board interlocks during 1990–2003, we find that firms are more likely to prune, graft, and close their networks as they accumulate strategic knowledge and as the firms to which they are interlocked increasingly generate indirect ties to competitors, even when controlling for dyadic safeguards discussed by prior research. We capture strategic knowledge by tracking firms’ experience in the former Warsaw Pact countries from immediately after the sudden fall of communism in 1990 until 2003. The study introduces indirect links to rivals as a source of knowledge spillover in networks, shows how firms deal with extra-dyadic risks, and provides a defensive explanation for the evolution of network composition and structure.