285 research outputs found

    Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular Functions

    The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: one seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible, in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize. Our main result is a $(1-1/e)^2$-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call locally-adaptive policies. These policies combine a non-adaptive global structure with local adaptive optimizations. This method enables the $(1-1/e)^2$-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies. We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most $k$ that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from Planted-Clique that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of the locally-adaptive policies we use in the main result.
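For readers unfamiliar with where a factor such as $1-1/e$ comes from, the following is a minimal sketch of the classical greedy algorithm for maximizing a coverage function (a monotone submodular function) under a cardinality constraint. It is not the locally-adaptive algorithm of the abstract above; it only illustrates the standard non-adaptive building block, which is known to achieve a $(1-1/e)$-approximation. The names `coverage`, `greedy_max_coverage`, and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def coverage(sets: Dict[str, Set[int]], chosen: List[str]) -> int:
    """Coverage objective: number of distinct items covered by the chosen sets."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick at most k sets, each time taking the largest marginal gain.

    Illustrative sketch of the classical greedy rule, not the paper's method.
    """
    chosen: List[str] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(sets))):
        best, best_gain = None, 0
        # Iterate in sorted order so that ties are broken deterministically.
        for name in sorted(sets):
            if name in chosen:
                continue
            gain = len(sets[name] - covered)  # marginal gain of adding this set
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining set adds anything new
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical toy instance: each "node" covers a set of users.
    toy = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
    picked = greedy_max_coverage(toy, k=2)
    print(picked, coverage(toy, picked))
```

The marginal-gain step is where submodularity matters: gains can only shrink as `covered` grows, which is the property the approximation guarantee relies on.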

    Combining Traditional Marketing and Viral Marketing with Amphibious Influence Maximization

    In this paper, we propose the amphibious influence maximization (AIM) model that combines traditional marketing via content providers and viral marketing to consumers in social networks in a single framework. In AIM, a set of content providers and consumers form a bipartite network while consumers also form their social network, and influence propagates from the content providers to consumers and among consumers in the social network following the independent cascade model. An advertiser needs to select a subset of seed content providers and a subset of seed consumers, such that the influence from the seed providers passing through the seed consumers could reach a large number of consumers in the social network in expectation. We prove that the AIM problem is NP-hard to approximate to within any constant factor via a reduction from Feige's k-prover proof system for 3-SAT5. We also give evidence that even when the social network graph is trivial (i.e. has no edges), a polynomial-time constant-factor approximation for AIM is unlikely. However, when we assume that the weighted bi-adjacency matrix that describes the influence of content providers on consumers is of constant rank, a common assumption often used in recommender systems, we provide a polynomial-time algorithm that achieves an approximation ratio of $(1-1/e-\epsilon)^3$ for any (polynomially small) $\epsilon > 0$. Our algorithmic results still hold for a more general model where cascades in the social network follow a general monotone and submodular function. Comment: An extended abstract appeared in the Proceedings of the 16th ACM Conference on Economics and Computation (EC), 201

    Resolution of ranking hierarchies in directed networks

    Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit. Comment: 27 pages, 9 figures
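To make the idea of a score that penalises hierarchy-violating links concrete, here is a minimal sketch that evaluates agony for a given ranking. It assumes the commonly used definition in which an edge u -> v is expected to point from a lower rank to a strictly higher rank and costs max(rank[u] - rank[v] + 1, 0); the abstract does not spell out the formula, so both this convention and the function name `agony` are assumptions. The sketch only scores a fixed ranking and does not search for the minimising ordered partition.

```python
from typing import Dict, Iterable, Tuple


def agony(edges: Iterable[Tuple[str, str]], rank: Dict[str, int]) -> int:
    """Total agony of a ranking under the assumed penalty.

    An edge u -> v that climbs the hierarchy (rank[u] < rank[v]) costs 0.
    An edge within a class costs 1, and an edge pointing downwards costs
    the rank difference plus 1, so stronger violations are penalised more.
    """
    return sum(max(rank[u] - rank[v] + 1, 0) for u, v in edges)


if __name__ == "__main__":
    # Hypothetical toy graphs: a chain (perfect hierarchy) and a directed cycle.
    chain = [("a", "b"), ("b", "c")]
    cycle = chain + [("c", "a")]
    ranking = {"a": 0, "b": 1, "c": 2}
    print(agony(chain, ranking), agony(cycle, ranking))
```

Under this convention a directed cycle cannot be ranked for free: either one edge points downwards, or several nodes share a class and each edge between them costs 1.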

    Relax, no need to round: integrality of clustering formulations

    We study exact recovery conditions for convex relaxations of point cloud clustering problems, focusing on two of the most common optimization problems for unsupervised clustering: $k$-means and $k$-median clustering. Motivations for focusing on convex relaxations are: (a) they come with a certificate of optimality, and (b) they are generic tools which are relatively parameter-free, not tailored to specific assumptions over the input. More precisely, we consider the distributional setting where there are $k$ clusters in $\mathbb{R}^m$ and data from each cluster consists of $n$ points sampled from a symmetric distribution within a ball of unit radius. We ask: what is the minimal separation distance between cluster centers needed for convex relaxations to exactly recover these $k$ clusters as the optimal integral solution? For the $k$-median linear programming relaxation we show a tight bound: exact recovery is obtained given arbitrarily small pairwise separation $\epsilon > 0$ between the balls. In other words, the pairwise center separation is $\Delta > 2+\epsilon$. Under the same distributional model, the $k$-means LP relaxation fails to recover such clusters at separation as large as $\Delta = 4$. Yet, if we enforce PSD constraints on the $k$-means LP, we get exact cluster recovery at center separation $\Delta > 2\sqrt{2}(1+\sqrt{1/m})$. In contrast, common heuristics such as Lloyd's algorithm (a.k.a. the $k$-means algorithm) can fail to recover clusters in this setting; even with arbitrarily large cluster separation, k-means++ with overseeding by any constant factor fails with high probability at exact cluster recovery. To complement the theoretical analysis, we provide an experimental study of the recovery guarantees for these various methods, and discuss several open problems which these experiments suggest. Comment: 30 pages, ITCS 201
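As a small aid for comparing the separation conditions quoted in the abstract above, the sketch below evaluates the stated bound $2\sqrt{2}(1+\sqrt{1/m})$ for the PSD-constrained $k$-means relaxation across a few dimensions $m$, next to the separation $\Delta = 4$ at which the $k$-means LP is reported to fail. The function name `kmeans_psd_separation` and the constant name are hypothetical; the code only evaluates the formula as written and makes no claim about the underlying proofs.

```python
import math

# Separation at which the abstract reports the k-means LP relaxation failing.
KMEANS_LP_FAILURE_SEPARATION = 4.0


def kmeans_psd_separation(m: int) -> float:
    """Evaluate 2*sqrt(2)*(1 + sqrt(1/m)), the bound quoted in the abstract."""
    if m < 1:
        raise ValueError("dimension m must be a positive integer")
    return 2.0 * math.sqrt(2.0) * (1.0 + math.sqrt(1.0 / m))


if __name__ == "__main__":
    for m in (1, 2, 4, 16, 100):
        bound = kmeans_psd_separation(m)
        below_lp_failure = bound < KMEANS_LP_FAILURE_SEPARATION
        print(f"m={m:4d}  bound={bound:.4f}  below 4: {below_lp_failure}")
```

The bound decreases as $m$ grows and tends to $2\sqrt{2}$, so in sufficiently high dimension it drops below the separation at which the plain LP is said to fail.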