Tracing the Use of Practices through Networks of Collaboration
An active line of research has used on-line data to study the ways in which
discrete units of information (including messages, photos, product
recommendations, group invitations) spread through social networks. There is
relatively little understanding, however, of how on-line data might help in
studying the diffusion of more complex practices: roughly, routines or
styles of work that are generally handed down from one person to another
through collaboration or mentorship. In this work, we propose a framework
together with a novel type of data analysis that seeks to study the spread of
such practices by tracking their syntactic signatures in large document
collections. Central to this framework is the notion of an "inheritance graph"
that represents how people pass the practice on to others through
collaboration. Our analysis of these inheritance graphs demonstrates that we
can trace a significant number of practices over long time-spans, and we show
that the structure of these graphs can help in predicting the longevity of
collaborations within a field, as well as the fitness of the practices
themselves.
Comment: To appear in Proceedings of ICWSM 2017; data at
https://github.com/CornellNLP/Macro
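As a rough illustration of the "inheritance graph" idea, the following minimal sketch builds directed edges from people who already used a practice to collaborators who first use it in a joint document. The abstract does not give the exact construction, so the rule used here (an edge from every earlier user to every first-time user on the same document) and the record layout `(year, authors, uses_practice)` are assumptions made only for this example.

```python
def inheritance_edges(documents):
    """Return a set of (earlier_user, new_adopter) edges.

    documents: iterable of (year, authors, uses_practice) tuples.
    This layout and the edge rule below are hypothetical, not the paper's.
    """
    known_users = set()   # people who have already used the practice
    edges = set()
    # Process documents in chronological order.
    for year, authors, uses_practice in sorted(documents, key=lambda d: d[0]):
        if not uses_practice:
            continue
        carriers = [a for a in authors if a in known_users]
        newcomers = [a for a in authors if a not in known_users]
        # Assumed rule: each newcomer inherits from each co-author
        # who had used the practice before this document.
        for adopter in newcomers:
            for carrier in carriers:
                edges.add((carrier, adopter))
        known_users.update(authors)
    return edges


toy_documents = [
    (2001, ["ann"], True),
    (2003, ["ann", "bob"], True),
    (2004, ["bob", "cy"], True),
    (2005, ["cy", "dee"], False),  # practice not used, so no edge to dee
]
toy_edges = inheritance_edges(toy_documents)
```

On the toy records, the practice passes from `ann` to `bob` and then from `bob` to `cy`; `dee` never appears in the graph because the shared document does not use the practice.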
Influence Maximization with Bandits
We consider the problem of influence maximization, that is,
maximizing the number of people that become aware of a product by finding the
'best' set of 'seed' users to expose the product to. Most prior work on this
topic assumes that we know the probability of each user influencing each other
user, or we have data that lets us estimate these influences. However, this
information is typically not initially available or is difficult to obtain. To
avoid this assumption, we adopt a combinatorial multi-armed bandit paradigm
that estimates the influence probabilities as we sequentially try different
seed sets. We establish bounds on the performance of this procedure under the
existing edge-level feedback as well as a novel and more realistic node-level
feedback. Beyond our theoretical results, we describe a practical
implementation and experimentally demonstrate its efficiency and effectiveness
on four real datasets.
Comment: 12 pages
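The loop described in this abstract (try a seed set, observe feedback, refine the influence estimates) can be sketched as follows. This is not the paper's algorithm: it assumes an independent-cascade diffusion model, uses edge-level feedback only, and swaps in a simple epsilon-greedy rule with optimistic initial estimates in place of the paper's bandit procedure. All function names and the toy graph are hypothetical.

```python
import random


def simulate_cascade(graph, probs, seeds, rng):
    """One independent-cascade run; returns (active nodes, edge-level feedback)."""
    active = set(seeds)
    frontier = list(seeds)
    feedback = []  # ((source, target), succeeded) for every attempted edge
    while frontier:
        node = frontier.pop()
        for nbr in graph.get(node, []):
            if nbr in active:
                continue
            success = rng.random() < probs[(node, nbr)]
            feedback.append(((node, nbr), success))
            if success:
                active.add(nbr)
                frontier.append(nbr)
    return active, feedback


def expected_spread(graph, probs, seeds, rng, runs=30):
    """Monte Carlo estimate of the number of nodes a seed set activates."""
    total = sum(len(simulate_cascade(graph, probs, seeds, rng)[0]) for _ in range(runs))
    return total / runs


def greedy_seeds(graph, probs, nodes, k, rng):
    """Pick k seeds one at a time by largest estimated spread."""
    seeds = []
    for _ in range(k):
        candidates = [v for v in nodes if v not in seeds]
        best = max(candidates, key=lambda v: expected_spread(graph, probs, seeds + [v], rng))
        seeds.append(best)
    return seeds


def learn_edge_probabilities(graph, true_probs, k, rounds, rng, epsilon=0.2):
    """Epsilon-greedy rounds: choose seeds, observe edges, update estimates."""
    nodes = sorted(set(graph) | {v for nbrs in graph.values() for v in nbrs})
    trials = {edge: 0 for edge in true_probs}
    successes = {edge: 0 for edge in true_probs}
    for _ in range(rounds):
        # Untried edges get an optimistic estimate of 1.0 to encourage exploration.
        estimate = {e: successes[e] / trials[e] if trials[e] else 1.0 for e in true_probs}
        if rng.random() < epsilon:
            seeds = rng.sample(nodes, k)          # explore
        else:
            seeds = greedy_seeds(graph, estimate, nodes, k, rng)  # exploit
        _, feedback = simulate_cascade(graph, true_probs, seeds, rng)
        for edge, success in feedback:
            trials[edge] += 1
            successes[edge] += int(success)
    return {e: successes[e] / trials[e] for e in true_probs if trials[e]}


# Toy instance with deterministic edges so the learned values are easy to check.
toy_graph = {0: [1], 1: [2], 2: []}
toy_probs = {(0, 1): 1.0, (1, 2): 0.0}
learned = learn_edge_probabilities(toy_graph, toy_probs, k=1, rounds=30, rng=random.Random(0))
```

Node-level feedback, which the abstract presents as the more realistic setting, would reveal only which nodes became active, so the per-edge counters above could not be updated directly.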
Virus Propagation in Multiple Profile Networks
Suppose we have a virus or one competing idea/product that propagates over a
multiple profile (e.g., social) network. Can we predict what proportion of the
network will actually get "infected" (e.g., spread the idea or buy the
competing product), when the nodes of the network appear to have different
sensitivity based on their profile? For example, if there are two profiles $A$
and $B$ in a network and the nodes of profile $A$ and profile $B$ are
susceptible to a highly spreading virus with probabilities $p_A$ and $p_B$
respectively, what percentage of both profiles will actually get infected from
the virus at the end? To reverse the question, what are the necessary
conditions so that a predefined percentage of the network is infected? We
assume that nodes of different profiles can infect one another and we prove
that under realistic conditions, apart from the weak profile (great
sensitivity), the stronger profile (low sensitivity) will get infected as well.
First, we focus on cliques with the goal to provide exact theoretical results
as well as to get some intuition as to how a virus affects such a multiple
profile network. Then, we move to the theoretical analysis of arbitrary
networks. We provide bounds on certain properties of the network based on the
probabilities of infection of each node in it when it reaches the steady state.
Finally, we provide extensive experimental results that verify our theoretical
results and at the same time provide more insight into the problem.
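To make the clique question concrete, here is a small simulation sketch. The abstract does not state the propagation model, so this example assumes one: each newly infected node gets a single chance to infect every still-healthy node in the clique, and the chance of success depends only on the target's profile. The profile labels `"weak"` and `"strong"` and the function name are hypothetical.

```python
import random


def infected_fraction_by_profile(profiles, susceptibility, patient_zero, rng):
    """Simulate one outbreak on a clique and report the infected share per profile.

    profiles: list giving each node's profile label (index = node id).
    susceptibility: dict mapping a profile label to its infection probability.
    Assumed model: every newly infected node makes one infection attempt
    on each healthy node; this is an illustration, not the paper's model.
    """
    infected = {patient_zero}
    frontier = [patient_zero]
    while frontier:
        frontier.pop()  # in a clique, every spreader reaches every other node
        for target in range(len(profiles)):
            if target in infected:
                continue
            if rng.random() < susceptibility[profiles[target]]:
                infected.add(target)
                frontier.append(target)
    totals, hits = {}, {}
    for node, label in enumerate(profiles):
        totals[label] = totals.get(label, 0) + 1
        hits[label] = hits.get(label, 0) + (node in infected)
    return {label: hits[label] / totals[label] for label in totals}


# Extreme case: weak nodes always catch the virus, strong nodes never do.
toy_profiles = ["weak"] * 3 + ["strong"] * 2
toy_result = infected_fraction_by_profile(
    toy_profiles, {"weak": 1.0, "strong": 0.0}, patient_zero=0, rng=random.Random(0)
)
```

With intermediate probabilities, repeated runs give an empirical estimate of the per-profile infected fraction, which is the quantity the abstract asks about.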
Validating Network Value of Influencers by means of Explanations
Recently, there has been significant interest in social influence analysis.
One of the central problems in this area is the problem of identifying
influencers, such that by convincing these users to perform a certain action
(like buying a new product), a large number of other users get influenced to
follow the action. The client of such an application is a marketer who would
target these influencers for marketing a given new product, say by providing
free samples or discounts. It is natural that before committing resources for
targeting an influencer the marketer would be interested in validating the
influence (or network value) of influencers returned. This requires digging
deeper into such analytical questions as: who are their followers, on what
actions (or products) they are influential, etc. However, the current
approaches to identifying influencers largely work as a black box in this
respect. The goal of this paper is to open up the black box, address these
questions and provide informative and crisp explanations for validating the
network value of influencers.
We formulate the problem of providing explanations (called PROXI) as a
discrete optimization problem of feature selection. We show that PROXI is not
only NP-hard to solve exactly, it is NP-hard to approximate within any
reasonable factor. Nevertheless, we show interesting properties of the
objective function and develop an intuitive greedy heuristic. We perform
detailed experimental analysis on two real-world datasets, Twitter and
Flixster, and show that our approach is useful in generating concise and
insightful explanations of the influence distribution of users and that our
greedy algorithm is effective and efficient with respect to several baselines.
Towards Profit Maximization for Online Social Network Providers
Online Social Networks (OSNs) attract billions of users who share information
and communicate, and viral marketing has emerged there as a new way to promote the
sales of products. An OSN provider is often hired by an advertiser to conduct
viral marketing campaigns. The OSN provider generates revenue from the
commission paid by the advertiser, which is determined by the spread of its
product information. Meanwhile, to propagate influence, the activities
performed by users, such as viewing video ads, normally induce diffusion cost to
the OSN provider. In this paper, we aim to find a seed set to optimize a new
profit metric that combines the benefit of influence spread with the cost of
influence propagation for the OSN provider. Under many diffusion models, our
profit metric is the difference between two submodular functions, which is
challenging to optimize as it is neither submodular nor monotone. We design a
general two-phase framework to select seeds for profit maximization and develop
several bounds to measure the quality of the seed set constructed. Experimental
results with real OSN datasets show that our approach can achieve high
approximation guarantees and significantly outperform the baseline algorithms,
including state-of-the-art influence maximization algorithms.
Comment: INFOCOM 2018 (Full version), 12 pages
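The profit metric in this abstract, benefit of spread minus cost of propagation, can be illustrated on a toy instance. The sketch below assumes a deterministic coverage model (each seed reaches a fixed set of users) and uses a plain greedy rule that stops when no seed adds profit. It is a stand-in for illustration, not the paper's two-phase framework, and all names and numbers are hypothetical.

```python
def covered_users(reach, seeds):
    """Users reached by a seed set in the assumed deterministic coverage model."""
    covered = set()
    for seed in seeds:
        covered |= reach[seed]
    return covered


def profit(reach, benefit, cost, seeds):
    """Benefit of the spread minus the diffusion cost it induces."""
    covered = covered_users(reach, seeds)
    total_benefit = sum(benefit[u] for u in covered)   # submodular in seeds
    total_cost = sum(cost[u] for u in covered)         # submodular in seeds
    return total_benefit - total_cost                  # difference: neither guaranteed


def greedy_profit_seeds(reach, benefit, cost):
    """Add the seed with the largest positive marginal profit until none remains."""
    seeds = []
    current = 0.0
    while True:
        best_seed, best_gain = None, 1e-12  # require a strictly positive gain
        for candidate in reach:
            if candidate in seeds:
                continue
            gain = profit(reach, benefit, cost, seeds + [candidate]) - current
            if gain > best_gain:
                best_seed, best_gain = candidate, gain
        if best_seed is None:
            return seeds
        seeds.append(best_seed)
        current += best_gain


# Toy data: reaching user "d" costs more than it earns.
toy_reach = {"a": {"a", "b"}, "b": {"b"}, "c": {"c", "d"}}
toy_benefit = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
toy_cost = {"a": 0.2, "b": 0.2, "c": 0.5, "d": 2.0}
chosen = greedy_profit_seeds(toy_reach, toy_benefit, toy_cost)
```

On the toy data, adding seed `c` to `a` lowers the profit, which shows the non-monotonicity the abstract mentions: a larger seed set is not always better once diffusion cost is counted.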