
    Stability of Influence Maximization

    The present article serves as an erratum to our paper of the same title, which was presented and published in the KDD 2014 conference. In that article, we falsely claimed that the objective function defined in Section 1.4 is non-monotone submodular. We are deeply indebted to Debmalya Mandal, Jean Pouget-Abadie and Yaron Singer for bringing a counter-example to that claim to our attention. After becoming aware of the counter-example, we showed that the objective function is in fact NP-hard to approximate to within a factor of $O(n^{1-\epsilon})$ for any $\epsilon > 0$. To correct the record, the present article combines the problem motivation, models, and experimental results sections from the original incorrect article with the new hardness result. We ask readers to cite and use only this version (which will remain an unpublished note) instead of the incorrect conference version. Comment: Erratum of Paper "Stability of Influence Maximization" which was presented and published in the KDD1

    Seeding with Costly Network Information

    We study the task of selecting $k$ nodes in a social network of size $n$ to seed a diffusion with maximum expected spread size, under the independent cascade model with cascade probability $p$. Most previous work on this problem (known as influence maximization) focuses on efficient algorithms that approximate the optimal seed set with provable guarantees, given knowledge of the entire network. In practice, however, obtaining full knowledge of the network is very costly. To address this gap, we first study the guarantees achievable using $o(n)$ influence samples. We provide an approximation algorithm with a tight $(1-1/e)\,\mathrm{OPT} - \epsilon n$ guarantee, using $O_{\epsilon}(k^2 \log n)$ influence samples, and show that this dependence on $k$ is asymptotically optimal. We then propose a probing algorithm that queries $O_{\epsilon}(p n^2 \log^4 n + \sqrt{k p}\, n^{1.5} \log^{5.5} n + k n \log^{3.5} n)$ edges from the graph and uses them to find a seed set with the same almost tight approximation guarantee. We also provide a matching (up to logarithmic factors) lower bound on the required number of edges. To address the dependence of our probing algorithm on the independent cascade probability $p$, we show that it is impossible to maintain the same approximation guarantees by controlling the discrepancy between the probing and seeding cascade probabilities. Instead, we propose to down-sample the probed edges to match the seeding cascade probability, provided that it does not exceed that of probing. Finally, we test our algorithms on real-world data to quantify the trade-off between the cost of obtaining more refined network information and the benefit of the added information for guiding improved seeding strategies.
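    For orientation, the problem setting in this abstract can be illustrated with a minimal sketch. It simulates the independent cascade model and picks seeds with the classical full-knowledge greedy baseline, estimating spread by Monte Carlo. It is not the sample-based or edge-probing algorithm the abstract describes; the function names, the adjacency-dict graph format, and the default run count are assumptions made only for this illustration.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """Run one independent cascade and return the set of activated nodes.

    graph: dict mapping a node to a list of its out-neighbours (assumed format).
    Each newly activated node gets one chance to activate each inactive
    out-neighbour, succeeding independently with probability p.
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    total = 0
    for _ in range(runs):
        total += len(simulate_cascade(graph, seeds, p, rng))
    return total / runs


def greedy_seeds(graph, k, p, runs=200, seed=0):
    """Greedy baseline: repeatedly add the node with the best estimated spread.

    This assumes full knowledge of the graph, which is exactly the costly
    assumption the abstract sets out to relax.
    """
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        best_node, best_value = None, -1.0
        for v in sorted(nodes - set(chosen)):
            value = expected_spread(graph, chosen + [v], p, runs, rng)
            if value > best_value:
                best_node, best_value = v, value
        chosen.append(best_node)
    return chosen
```

    On the toy graph `{0: [1, 2, 3], 4: [5]}` with `p = 1.0`, every cascade is deterministic, so `greedy_seeds(g, 2, 1.0)` returns `[0, 4]`. For `p < 1` the result depends on the number of simulation runs and on the random seed.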

    Combining Traditional Marketing and Viral Marketing with Amphibious Influence Maximization

    In this paper, we propose the amphibious influence maximization (AIM) model, which combines traditional marketing via content providers and viral marketing to consumers in social networks in a single framework. In AIM, a set of content providers and consumers form a bipartite network while the consumers also form their own social network, and influence propagates from the content providers to consumers and among consumers in the social network following the independent cascade model. An advertiser needs to select a subset of seed content providers and a subset of seed consumers such that the influence from the seed providers, passing through the seed consumers, reaches a large number of consumers in the social network in expectation. We prove that the AIM problem is NP-hard to approximate to within any constant factor via a reduction from Feige's k-prover proof system for 3-SAT5. We also give evidence that even when the social network graph is trivial (i.e. has no edges), a polynomial-time constant-factor approximation for AIM is unlikely. However, when we assume that the weighted bi-adjacency matrix describing the influence of content providers on consumers has constant rank, an assumption commonly used in recommender systems, we provide a polynomial-time algorithm that achieves an approximation ratio of $(1-1/e-\epsilon)^3$ for any (polynomially small) $\epsilon > 0$. Our algorithmic results still hold for a more general model in which cascades in the social network follow a general monotone and submodular function. Comment: An extended abstract appeared in the Proceedings of the 16th ACM Conference on Economics and Computation (EC), 201

    Evaluating the role of community detection in improving influence maximization heuristics

    Both community detection and influence maximization are well-researched fields of network science. Here, we investigate how several popular community detection algorithms can be used as part of a heuristic approach to influence maximization. The heuristic is based on the community value, a node-based metric defined on the outputs of overlapping community detection algorithms. This metric is used to select nodes as high-influence candidates for expanding the set of influential nodes. Our aim in this paper is twofold. First, we evaluate the performance of eight frequently used overlapping community detection algorithms on this specific task to show how much improvement can be gained compared to the originally proposed method of Kempe et al. Second, selecting the community detection algorithm(s) with the best performance, we propose a variant of the influence maximization heuristic with significantly reduced runtime, at the cost of slightly reduced output quality. We use both artificial benchmarks and real-life networks to evaluate the performance of our approach.

    Network based data oriented methods for application driven problems

    Networks are amazing. Some kind of network can be found in almost every aspect of our lives, from sociological, financial and biological processes to the human body. Even entities that are not connected to each other in a natural sense can be connected based on real-life properties, creating a whole new way to express knowledge. A network as a structure not only raises interesting and complex mathematical questions, but also offers the possibility of extracting hidden and additional information from real-life data, which is one of the most valuable resources of this century. The various activities of society and their underlying processes produce a huge amount of data, which is available to us thanks to today's technological knowledge and tools. Nevertheless, data has no value without the knowledge it contains, so the main focus of the last decade has been to generate or extract information and knowledge from data. Consequently, data analytics and data science, as well as data-driven methodologies, have become leading research fields in both science and industry. In this dissertation, the author introduces efficient algorithms for application-oriented optimization and data analysis tasks built on network-science-based models. The main idea is to connect these problems through graph-based approaches, from virus modelling on an existing system, through understanding the spreading mechanism of an infection or influence and maximizing or minimizing its effect, to financial applications such as fraud detection or cost optimization in the case of employee rostering.