Validating Network Value of Influencers by means of Explanations
Recently, there has been significant interest in social influence analysis.
One of the central problems in this area is the problem of identifying
influencers, such that by convincing these users to perform a certain action
(like buying a new product), a large number of other users get influenced to
follow the action. The client of such an application is a marketer who would
target these influencers for marketing a given new product, say by providing
free samples or discounts. It is natural that before committing resources for
targeting an influencer the marketer would be interested in validating the
influence (or network value) of influencers returned. This requires digging
deeper into such analytical questions as: who are their followers, on what
actions (or products) they are influential, etc. However, the current
approaches to identifying influencers largely work as a black box in this
respect. The goal of this paper is to open up the black box, address these
questions and provide informative and crisp explanations for validating the
network value of influencers.
We formulate the problem of providing explanations (called PROXI) as a
discrete optimization problem of feature selection. We show that PROXI is not
only NP-hard to solve exactly but also NP-hard to approximate within any
reasonable factor. Nevertheless, we show interesting properties of the
objective function and develop an intuitive greedy heuristic. We perform
detailed experimental analysis on two real-world datasets, Twitter and
Flixster, and show that our approach is useful in generating concise and
insightful explanations of the influence distribution of users and that our
greedy algorithm is effective and efficient with respect to several baselines.
Seeding with Costly Network Information
We study the task of selecting nodes in a social network of size , to
seed a diffusion with maximum expected spread size, under the independent
cascade model with cascade probability . Most of the previous work on this
problem (known as influence maximization) focuses on efficient algorithms to
approximate the optimal seed set with provable guarantees, given the knowledge
of the entire network. However, in practice, obtaining full knowledge of the
network is very costly. To address this gap, we first study the achievable
guarantees using influence samples. We provide an approximation
algorithm with a tight $(1-1/e)\,\mathrm{OPT} - \epsilon n$ guarantee, using
influence samples and show that this dependence on
is asymptotically optimal. We then propose a probing algorithm that queries
edges from the graph and uses them to find a seed set with the
same almost tight approximation guarantee. We also provide a matching (up to
logarithmic factors) lower-bound on the required number of edges. To address
the dependence of our probing algorithm on the independent cascade probability
, we show that it is impossible to maintain the same approximation
guarantees by controlling the discrepancy between the probing and seeding
cascade probabilities. Instead, we propose to down-sample the probed edges to
match the seeding cascade probability, provided that it does not exceed that of
probing. Finally, we test our algorithms on real world data to quantify the
trade-off between the cost of obtaining more refined network information and
the benefit of the added information for guiding improved seeding strategies.
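
The down-sampling step described in the abstract above can be illustrated with a minimal sketch. It assumes that each probed edge was revealed independently with probability p_probe, so that keeping it with probability p_seed / p_probe leaves it present with probability p_seed. The function names, the edge-list representation, and the reachability helper are hypothetical and are not taken from the paper.

```python
import random


def downsample_edges(probed_edges, p_probe, p_seed, rng):
    """Keep each probed edge independently with probability p_seed / p_probe.

    Assumption: every edge in probed_edges was revealed independently with
    probability p_probe, so after thinning it is present with probability p_seed.
    """
    if not 0 < p_seed <= p_probe <= 1:
        raise ValueError("requires 0 < p_seed <= p_probe <= 1")
    keep = p_seed / p_probe
    return [edge for edge in probed_edges if rng.random() < keep]


def cascade_reach(live_edges, seeds):
    """Return the set of nodes reachable from the seeds along directed live edges."""
    successors = {}
    for u, v in live_edges:
        successors.setdefault(u, []).append(v)
    reached = set(seeds)
    frontier = list(seeds)
    while frontier:
        u = frontier.pop()
        for v in successors.get(u, []):
            if v not in reached:
                reached.add(v)
                frontier.append(v)
    return reached


# Toy usage: thin the probed edges, then measure the spread from one seed.
probed = [(1, 2), (2, 3), (3, 4), (4, 5)]
thinned = downsample_edges(probed, p_probe=0.8, p_seed=0.4, rng=random.Random(0))
spread = len(cascade_reach(thinned, seeds=[1]))
```

The check on the two probabilities mirrors the condition stated in the abstract: thinning can only lower the effective cascade probability, so the seeding probability must not exceed the probing probability.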
Structure of Heterogeneous Networks
Heterogeneous networks play a key role in the evolution of communities and
the decisions individuals make. These networks link different types of
entities, for example, people and the events they attend. Network analysis
algorithms usually project such networks onto simple graphs composed of
entities of a single type. In the process, they conflate relations between
entities of different types and lose important structural information. We
develop a mathematical framework that can be used to compactly represent and
analyze heterogeneous networks that combine multiple entity and link types. We
generalize Bonacich centrality, which measures connectivity between nodes by
the number of paths between them, to heterogeneous networks and use this
measure to study network structure. Specifically, we extend the popular
modularity-maximization method for community detection to use this centrality
metric. We also rank nodes based on their connectivity to other nodes. One
advantage of this centrality metric is that it has a tunable parameter we can
use to set the length scale of interactions. Studying how rankings change
with this parameter allows us to identify important nodes in the network. We
apply the proposed method to analyze the structure of several heterogeneous
networks. We show that exploiting additional sources of evidence corresponding
to links between, as well as among, different entity types yields new insights
into network structure.
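
As a rough illustration of the tunable parameter mentioned in the abstract above, the sketch below computes the classical single-type Bonacich centrality as a truncated series, sum over k of beta^k A^(k+1) 1, where A is the adjacency matrix. This is the standard homogeneous form, not the heterogeneous generalization the paper develops. The function name and the truncation length max_len are assumptions, and the series converges only when |beta| is below the reciprocal of the largest eigenvalue of A.

```python
def bonacich_centrality(adj, beta, max_len=50):
    """Truncated series sum_{k=0}^{max_len-1} beta^k * A^(k+1) * 1.

    adj is a square adjacency matrix given as a list of lists. Small beta
    weights short walks most heavily; larger beta lets longer walks count.
    """
    n = len(adj)
    walks = [1.0] * n   # A^0 * 1: one zero-length walk per node
    scores = [0.0] * n
    weight = 1.0        # beta^k for the current walk length
    for _ in range(max_len):
        # Multiply by A once more: walks[i] now counts walks one step longer.
        walks = [sum(adj[i][j] * walks[j] for j in range(n)) for i in range(n)]
        scores = [s + weight * w for s, w in zip(scores, walks)]
        weight *= beta
    return scores


# Toy usage on the path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
local_scores = bonacich_centrality(path, beta=0.0)   # reduces to node degree
wider_scores = bonacich_centrality(path, beta=0.5)   # longer walks contribute
```

With beta set to 0 the score reduces to node degree; raising beta extends the length scale, which is the kind of parameter sweep the abstract describes for comparing rankings. Note that the series counts walks, which may revisit nodes.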
Submodular Inference of Diffusion Networks from Multiple Trees
Diffusion and propagation of information, influence and diseases take place
over increasingly larger networks. We observe when a node copies information,
makes a decision or becomes infected but networks are often hidden or
unobserved. Since networks are highly dynamic, changing and growing rapidly, we
only observe a relatively small set of cascades before a network changes
significantly. Scalable network inference based on a small cascade set is then
necessary for understanding the rapidly evolving dynamics that govern
diffusion. In this article, we develop a scalable approximation algorithm with
provable near-optimal performance based on submodular maximization, which
achieves high accuracy in such scenarios, solving an open problem first
introduced by Gomez-Rodriguez et al. (2010). Experiments on synthetic and real
diffusion data show that our algorithm in practice achieves an optimal
trade-off between accuracy and running time.
Comment: To appear in the 29th International Conference on Machine Learning
(ICML), 2012. Website:
http://www.stanford.edu/~manuelgr/network-inference-multitree
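
For readers unfamiliar with submodular maximization, the following is a minimal sketch of the generic greedy procedure for a monotone submodular objective under a cardinality budget, for which the classical guarantee is a (1 - 1/e) fraction of the optimum. It is not the algorithm from the paper above. The toy objective, in which each candidate edge "explains" a set of observed transmissions, and all names in the block are hypothetical.

```python
def greedy_max(ground_set, value, budget):
    """Greedily pick up to `budget` items, each time adding the best marginal gain."""
    chosen = []
    current = value(chosen)
    for _ in range(budget):
        best, best_gain = None, 0.0
        for item in ground_set:
            if item in chosen:
                continue
            gain = value(chosen + [item]) - current
            if gain > best_gain:
                best, best_gain = item, gain
        if best is None:   # no remaining item improves the objective
            break
        chosen.append(best)
        current += best_gain
    return chosen


# Hypothetical toy data: candidate edge -> ids of observed transmissions it explains.
explains = {
    ("a", "b"): {1, 2, 3},
    ("b", "c"): {3, 4},
    ("a", "c"): {4, 5, 6},
    ("c", "d"): {1},
}


def explained_count(edges):
    """Coverage objective: number of distinct transmissions explained by the edges."""
    covered = set()
    for edge in edges:
        covered |= explains[edge]
    return len(covered)


selected = greedy_max(list(explains), explained_count, budget=2)
```

Coverage objectives like this one are monotone and submodular, which is why the greedy rule is a natural baseline; the paper's actual objective is defined over cascades and is not reproduced here.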