36 research outputs found

    Tracing the Use of Practices through Networks of Collaboration

    An active line of research has used on-line data to study how discrete units of information (including messages, photos, product recommendations, and group invitations) spread through social networks. There is relatively little understanding, however, of how on-line data might help in studying the diffusion of more complex practices: roughly, routines or styles of work that are generally handed down from one person to another through collaboration or mentorship. In this work, we propose a framework, together with a novel type of data analysis, that studies the spread of such practices by tracking their syntactic signatures in large document collections. Central to this framework is the notion of an "inheritance graph" that represents how people pass the practice on to others through collaboration. Our analysis of these inheritance graphs demonstrates that we can trace a significant number of practices over long time-spans, and we show that the structure of these graphs can help in predicting the longevity of collaborations within a field, as well as the fitness of the practices themselves. Comment: To appear in Proceedings of ICWSM 2017; data at https://github.com/CornellNLP/Macro

    Influence Maximization with Bandits

    We consider influence maximization: the problem of maximizing the number of people who become aware of a product by finding the 'best' set of 'seed' users to expose the product to. Most prior work on this topic assumes that we know the probability of each user influencing each other user, or that we have data that lets us estimate these influences. However, this information is typically not available initially or is difficult to obtain. To avoid this assumption, we adopt a combinatorial multi-armed bandit paradigm that estimates the influence probabilities as we sequentially try different seed sets. We establish bounds on the performance of this procedure under the existing edge-level feedback as well as a novel and more realistic node-level feedback. Beyond our theoretical results, we describe a practical implementation and experimentally demonstrate its efficiency and effectiveness on four real datasets. Comment: 12 pages
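
The bandit loop described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's algorithm: the independent-cascade diffusion model, the UCB-style optimistic estimate, the Monte Carlo greedy seed oracle, and all function names (`simulate_cascade`, `ucb_probs`, `greedy_seeds`, `run_bandit`) are hypothetical. Only edge-level feedback is shown.

```python
import math
import random
from collections import defaultdict


def simulate_cascade(edges, probs, seeds, rng):
    """One independent-cascade run; returns (active nodes, edge-level feedback)."""
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
    active = set(seeds)
    frontier = list(seeds)
    feedback = []  # (edge, succeeded) for each edge leaving a newly active node
    while frontier:
        nxt = []
        for u in frontier:
            for v in out[u]:
                success = rng.random() < probs[(u, v)]
                feedback.append(((u, v), success))
                if success and v not in active:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active, feedback


def ucb_probs(edges, successes, trials, round_no):
    """Optimistic per-edge estimates (assumed UCB form, capped at 1)."""
    est = {}
    for e in edges:
        n = trials[e]
        if n == 0:
            est[e] = 1.0  # untried edges get the most optimistic value
        else:
            bonus = math.sqrt(1.5 * math.log(round_no) / n)
            est[e] = min(1.0, successes[e] / n + bonus)
    return est


def greedy_seeds(nodes, edges, probs, k, rng, samples=50):
    """Pick k seeds greedily using Monte Carlo estimates of the spread."""
    seeds = []
    for _ in range(k):
        best, best_spread = None, -1.0
        for v in nodes:
            if v in seeds:
                continue
            total = sum(
                len(simulate_cascade(edges, probs, seeds + [v], rng)[0])
                for _ in range(samples)
            )
            if total / samples > best_spread:
                best, best_spread = v, total / samples
        seeds.append(best)
    return seeds


def run_bandit(nodes, edges, true_probs, k, rounds, seed=0):
    """Alternate between choosing seeds optimistically and learning from feedback."""
    rng = random.Random(seed)
    successes, trials = defaultdict(int), defaultdict(int)
    total_spread = 0
    for t in range(1, rounds + 1):
        optimistic = ucb_probs(edges, successes, trials, t)
        seeds = greedy_seeds(nodes, edges, optimistic, k, rng)
        active, feedback = simulate_cascade(edges, true_probs, seeds, rng)
        for e, ok in feedback:  # edge-level feedback updates the estimates
            trials[e] += 1
            successes[e] += int(ok)
        total_spread += len(active)
    return successes, trials, total_spread
```

Calling `run_bandit(nodes, edges, true_probs, k=1, rounds=100)` on a small graph returns the per-edge success and trial counts together with the cumulative spread. Node-level feedback, where only activations are observed rather than individual edge outcomes, would need a different update rule and is not sketched here.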

    Virus Propagation in Multiple Profile Networks

    Suppose we have a virus or a competing idea/product that propagates over a multiple profile (e.g., social) network. Can we predict what proportion of the network will actually get "infected" (e.g., spread the idea or buy the competing product) when the nodes of the network have different sensitivity based on their profile? For example, if there are two profiles $\mathcal{A}$ and $\mathcal{B}$ in a network, and the nodes of profile $\mathcal{A}$ and profile $\mathcal{B}$ are susceptible to a highly spreading virus with probabilities $\beta_{\mathcal{A}}$ and $\beta_{\mathcal{B}}$ respectively, what percentage of each profile will actually be infected by the virus at the end? To reverse the question, what are the necessary conditions for a predefined percentage of the network to be infected? We assume that nodes of different profiles can infect one another, and we prove that under realistic conditions, apart from the weak profile (great sensitivity), the stronger profile (low sensitivity) will get infected as well. First, we focus on cliques, with the goal of providing exact theoretical results and gaining some intuition as to how a virus affects such a multiple profile network. Then, we move to the theoretical analysis of arbitrary networks. We provide bounds on certain properties of the network based on the probability of infection of each node when the network reaches the steady state. Finally, we provide extensive experimental results that verify our theoretical results and provide more insight into the problem.
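
To make the two-profile clique setting concrete, here is a rough mean-field sketch. The abstract does not specify the epidemic model, so the following are all assumptions: a discrete-time SIS process, a hypothetical per-step recovery probability `delta`, independence between node states, and the function name `steady_state`. It is not the paper's analysis.

```python
def steady_state(n_a, n_b, beta_a, beta_b, delta, p0=0.1, steps=500):
    """Iterate assumed mean-field SIS dynamics on a clique with two profiles.

    p_a and p_b are the infection probabilities of a node of profile A and B.
    beta_a and beta_b are the per-contact infection probabilities; delta is a
    hypothetical recovery probability that does not appear in the abstract.
    """
    p_a = p_b = p0
    for _ in range(steps):
        # Probability that a node is not infected by any other clique member,
        # treating node states as independent (mean-field approximation).
        q_a = (1 - beta_a * p_a) ** (n_a - 1) * (1 - beta_a * p_b) ** n_b
        q_b = (1 - beta_b * p_a) ** n_a * (1 - beta_b * p_b) ** (n_b - 1)
        # Infected nodes stay infected with probability 1 - delta;
        # healthy nodes become infected with probability 1 - q.
        p_a, p_b = (
            p_a * (1 - delta) + (1 - p_a) * (1 - q_a),
            p_b * (1 - delta) + (1 - p_b) * (1 - q_b),
        )
    return p_a, p_b
```

For example, `steady_state(10, 10, 0.3, 0.05, 0.5)` gives a higher infection level for the more sensitive profile, while the less sensitive profile still settles at a nonzero level, which matches the qualitative claim in the abstract under these assumptions.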

    Validating Network Value of Influencers by means of Explanations

    Recently, there has been significant interest in social influence analysis. One of the central problems in this area is identifying influencers: users such that, by convincing them to perform a certain action (like buying a new product), a large number of other users are influenced to follow the action. The client of such an application is a marketer who would target these influencers when marketing a new product, say by providing free samples or discounts. Naturally, before committing resources to targeting an influencer, the marketer would want to validate the influence (or network value) of the influencers returned. This requires digging deeper into analytical questions such as who their followers are and on which actions (or products) they are influential. However, current approaches to identifying influencers largely work as a black box in this respect. The goal of this paper is to open up the black box, address these questions, and provide informative and crisp explanations for validating the network value of influencers. We formulate the problem of providing explanations (called PROXI) as a discrete optimization problem of feature selection. We show that PROXI is not only NP-hard to solve exactly but also NP-hard to approximate within any reasonable factor. Nevertheless, we show interesting properties of the objective function and develop an intuitive greedy heuristic. We perform a detailed experimental analysis on two real-world datasets, Twitter and Flixster, and show that our approach is useful in generating concise and insightful explanations of the influence distribution of users, and that our greedy algorithm is effective and efficient with respect to several baselines.

    Towards Profit Maximization for Online Social Network Providers

    Online Social Networks (OSNs) attract billions of users to share information and communicate, and viral marketing has emerged there as a new way to promote the sales of products. An OSN provider is often hired by an advertiser to conduct viral marketing campaigns. The OSN provider generates revenue from the commission paid by the advertiser, which is determined by the spread of its product information. Meanwhile, the activities that users perform to propagate influence, such as viewing video ads, normally induce a diffusion cost for the OSN provider. In this paper, we aim to find a seed set that optimizes a new profit metric combining the benefit of influence spread with the cost of influence propagation for the OSN provider. Under many diffusion models, our profit metric is the difference between two submodular functions, which is challenging to optimize as it is neither submodular nor monotone. We design a general two-phase framework to select seeds for profit maximization and develop several bounds to measure the quality of the seed set constructed. Experimental results with real OSN datasets show that our approach can achieve high approximation guarantees and significantly outperform the baseline algorithms, including state-of-the-art influence maximization algorithms. Comment: INFOCOM 2018 (Full version), 12 pages
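
A toy example can show why a profit of the form "benefit of spread minus cost of propagation" is not monotone in the seed set. The sketch below assumes deterministic spread (every reachable node is influenced), a flat per-node `benefit`, and per-node `cost` values; these choices and the names `reachable` and `profit` are hypothetical and much simpler than the diffusion models the abstract refers to.

```python
def reachable(graph, seeds):
    """Nodes reached from the seeds, assuming influence always propagates."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def profit(graph, seeds, benefit, cost):
    """Benefit of the spread minus the diffusion cost over the reached nodes."""
    reached = reachable(graph, seeds)
    return benefit * len(reached) - sum(cost[v] for v in reached)


# Two disconnected pairs; nodes c and d are expensive to reach.
graph = {"a": ["b"], "c": ["d"]}
cost = {"a": 0.5, "b": 0.5, "c": 2.0, "d": 2.0}

profit_a = profit(graph, {"a"}, benefit=1.0, cost=cost)        # 2 - 1 = 1.0
profit_ac = profit(graph, {"a", "c"}, benefit=1.0, cost=cost)  # 4 - 5 = -1.0
```

Adding the seed `c` lowers the profit from 1.0 to -1.0, so a larger seed set is not necessarily better. This is the property that rules out a direct application of standard greedy guarantees for monotone submodular objectives.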