272 research outputs found

    Sketch-based Influence Maximization and Computation: Scaling up with Guarantees

    Propagation of contagion through networks is a fundamental process. It is used to model the spread of information, influence, or a viral infection. Diffusion patterns can be specified by a probabilistic model, such as Independent Cascade (IC), or captured by a set of representative traces. Basic computational problems in the study of diffusion are influence queries (determining the potency of a specified seed set of nodes) and Influence Maximization (identifying the most influential seed set of a given size). Answering each influence query involves many edge traversals, and does not scale when there are many queries on very large graphs. The gold standard for Influence Maximization is the greedy algorithm, which iteratively adds to the seed set a node maximizing the marginal gain in influence. Greedy has a guaranteed approximation ratio of at least (1-1/e) and actually produces a sequence of nodes, with each prefix having approximation guarantee with respect to the same-size optimum. Since Greedy does not scale well beyond a few million edges, for larger inputs one must currently use either heuristics or alternative algorithms designed for a pre-specified small seed set size. We develop a novel sketch-based design for influence computation. Our greedy Sketch-based Influence Maximization (SKIM) algorithm scales to graphs with billions of edges, with one to two orders of magnitude speedup over the best greedy methods. It still has a guaranteed approximation ratio, and in practice its quality nearly matches that of exact greedy. We also present influence oracles, which use linear-time preprocessing to generate a small sketch for each node, allowing the influence of any seed set to be quickly answered from the sketches of its nodes. Comment: 10 pages, 5 figures. Appeared at the 23rd Conference on Information and Knowledge Management (CIKM 2014) in Shanghai, China
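The plain greedy baseline described in this abstract can be sketched as follows. This is a minimal illustration, not the SKIM algorithm: it estimates influence by Monte Carlo simulation of the Independent Cascade model with a single uniform edge probability `p`, and the names `simulate_ic`, `estimate_influence`, and `greedy_seeds` are hypothetical helpers introduced here.

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """Run one Independent Cascade simulation and return the number of active nodes.

    graph maps each node to a list of out-neighbors; every edge fires
    independently with probability p (a simplifying assumption).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_influence(graph, seeds, p, runs, rng):
    """Average the cascade size over `runs` independent simulations."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick up to k seeds, each time adding the node with the largest estimated marginal gain."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        base = estimate_influence(graph, chosen, p, runs, rng) if chosen else 0.0
        best, best_gain = None, float("-inf")
        for v in sorted(nodes - set(chosen)):
            gain = estimate_influence(graph, chosen + [v], p, runs, rng) - base
            if gain > best_gain:
                best, best_gain = v, gain
        chosen.append(best)
    return chosen
```

Every marginal-gain evaluation re-simulates the cascade, which is the many-edge-traversals cost the abstract says prevents this approach from scaling to very large graphs.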

    Complexity of Strong Implementability

    We consider the question of implementability of a social choice function in a classical setting where the preferences of finitely many selfish individuals with private information have to be aggregated towards a social choice. This is one of the central questions in mechanism design. If the concept of weak implementation is considered, the Revelation Principle states that one can restrict attention to truthful implementations and direct revelation mechanisms, which implies that implementability of a social choice function is easy to check. For the concept of strong implementation, however, the Revelation Principle becomes invalid, and the complexity of deciding whether a given social choice function is strongly implementable has been open so far. In this paper, we show by using methods from polyhedral theory that strong implementability of a social choice function can be decided in polynomial space and that each of the payments needed for strong implementation can always be chosen to be of polynomial encoding length. Moreover, we show that strong implementability of a social choice function involving only a single selfish individual can be decided in polynomial time via linear programming

    Streaming Algorithms for Submodular Function Maximization

    We consider the problem of maximizing a nonnegative submodular set function $f:2^{\mathcal{N}} \rightarrow \mathbb{R}^+$ subject to a $p$-matchoid constraint in the single-pass streaming setting. Previous work in this context has considered streaming algorithms for modular functions and monotone submodular functions. The main result is for submodular functions that are {\em non-monotone}. We describe deterministic and randomized algorithms that obtain a $\Omega(\frac{1}{p})$-approximation using $O(k \log k)$-space, where $k$ is an upper bound on the cardinality of the desired set. The model assumes value oracle access to $f$ and membership oracles for the matroids defining the $p$-matchoid constraint. Comment: 29 pages, 7 figures, extended abstract to appear in ICALP 201
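To make the notion of a nonnegative, non-monotone submodular function with value-oracle access concrete, here is a small sketch using a graph cut function, a standard example of this class. It does not implement the streaming algorithms from the abstract; the 4-cycle `EDGES` and the helpers `cut_value`, `is_submodular`, and `is_monotone` are assumptions made for illustration, and the checks are brute force over all subsets.

```python
from itertools import combinations

# Illustrative ground set: the vertices of a 4-cycle.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3)]
GROUND = frozenset(range(4))


def cut_value(s):
    """Value oracle: number of edges with exactly one endpoint in s."""
    s = set(s)
    return sum(1 for u, v in EDGES if (u in s) != (v in s))


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(f, ground):
    """Check diminishing returns: for A <= B and x outside B, gain at A >= gain at B."""
    all_sets = list(subsets(ground))
    for a in all_sets:
        for b in all_sets:
            if a <= b:
                for x in ground - b:
                    if f(a | {x}) - f(a) < f(b | {x}) - f(b):
                        return False
    return True


def is_monotone(f, ground):
    """Check that f never decreases when the set grows."""
    all_sets = list(subsets(ground))
    return all(f(a) <= f(b) for a in all_sets for b in all_sets if a <= b)
```

On this instance `is_submodular(cut_value, GROUND)` holds while `is_monotone(cut_value, GROUND)` does not, since the full vertex set cuts no edges.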

    On k-Column Sparse Packing Programs

    We consider the class of packing integer programs (PIPs) that are column sparse, i.e. there is a specified upper bound k on the number of constraints that each variable appears in. We give an (ek+o(k))-approximation algorithm for k-column sparse PIPs, improving on recent results of $k^2\cdot 2^k$ and $O(k^2)$. We also show that the integrality gap of our linear programming relaxation is at least 2k-1; it is known that k-column sparse PIPs are $\Omega(k/ \log k)$-hard to approximate. We also extend our result (at the loss of a small constant factor) to the more general case of maximizing a submodular objective over k-column sparse packing constraints. Comment: 19 pages, v3: additional detail

    An oil pipeline design problem

    Copyright © 2003 INFORMS. We consider a given set of offshore platforms and onshore wells producing known (or estimated) amounts of oil to be connected to a port. Connections may take place directly between platforms, well sites, and the port, or may go through connection points at given locations. The configuration of the network and sizes of pipes used must be chosen to minimize construction costs. This problem is expressed as a mixed-integer program, and solved both heuristically by Tabu Search and Variable Neighborhood Search methods and exactly by a branch-and-bound method. Two new types of valid inequalities are introduced. Tests are made with data from the South Gabon oil field and randomly generated problems. The work of the first author was supported by NSERC grant #OGP205041. The work of the second author was supported by FCAR (Fonds pour la Formation des Chercheurs et l’Aide à la Recherche) grant #95-ER-1048, and NSERC grant #GP0105574

    Solving Medium-Density Subset Sum Problems in Expected Polynomial Time: An Enumeration Approach

    The subset sum problem (SSP) can be briefly stated as: given a target integer $E$ and a set $A$ containing $n$ positive integers $a_j$, find a subset of $A$ summing to $E$. The \textit{density} $d$ of an SSP instance is defined by the ratio of $n$ to $m$, where $m$ is the logarithm of the largest integer within $A$. Based on the structural and statistical properties of subset sums, we present an improved enumeration scheme for SSP, and implement it as a complete and exact algorithm (EnumPlus). The algorithm always reduces an instance to an equivalent low-density one, and then solves it by enumeration. Through this approach, we show that it is possible to design a single algorithm that efficiently solves instances of arbitrary density in a uniform way. Furthermore, our algorithm has a considerable performance advantage over previous algorithms. Firstly, it extends the density range in which SSP can be solved in expected polynomial time. Specifically, it solves SSP in expected $O(n\log{n})$ time when density $d \geq c\cdot \sqrt{n}/\log{n}$, while the previously best density range is $d \geq c\cdot n/(\log{n})^{2}$. In addition, the overall expected time and space requirements in the average case are proven to be $O(n^5\log n)$ and $O(n^5)$ respectively. Secondly, in the worst case, it slightly improves the previously best time complexity of exact algorithms for SSP. Specifically, the worst-case time complexity of our algorithm is proved to be $O((n-6)2^{n/2}+n)$, while the previously best result is $O(n2^{n/2})$. Comment: 11 pages, 1 figure
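The definitions in this abstract can be illustrated with a short sketch. It computes the density $d = n/m$ and solves small instances with the classic meet-in-the-middle enumeration, whose running time is of the order $n2^{n/2}$ cited as the previous worst-case bound. It is not the EnumPlus algorithm; the function names are hypothetical, and taking the logarithm to base 2 is an assumption, since the abstract does not state the base.

```python
import math
from bisect import bisect_left


def density(a):
    """Density d = n / m, with m = log2 of the largest element (base 2 is assumed; needs max(a) > 1)."""
    return len(a) / math.log2(max(a))


def half_sums(items):
    """Return (sum, bitmask) for every subset of items."""
    sums = [(0, 0)]
    for i, x in enumerate(items):
        sums += [(s + x, mask | (1 << i)) for s, mask in sums]
    return sums


def subset_sum(a, target):
    """Meet-in-the-middle search: return a sublist of a summing to target, or None."""
    half = len(a) // 2
    left, right = a[:half], a[half:]
    right_sums = sorted(half_sums(right))
    keys = [s for s, _ in right_sums]
    for s, lmask in half_sums(left):
        # Binary-search the right half for the complement of this left sum.
        j = bisect_left(keys, target - s)
        if j < len(keys) and keys[j] == target - s:
            rmask = right_sums[j][1]
            picked_left = [x for i, x in enumerate(left) if (lmask >> i) & 1]
            picked_right = [x for i, x in enumerate(right) if (rmask >> i) & 1]
            return picked_left + picked_right
    return None
```

For example, `subset_sum([3, 34, 4, 12, 5, 2], 9)` returns some sublist whose elements add up to 9, and the same call with target 30 returns `None` because no subset reaches that sum.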

    On the equivalence of strong formulations for capacitated multi-level lot sizing problems with setup times

    Several mixed integer programming formulations have been proposed for modeling capacitated multi-level lot sizing problems with setup times. These formulations include the so-called facility location formulation, the shortest route formulation, and the inventory and lot sizing formulation with (l,S) inequalities. In this paper, we demonstrate the equivalence of these formulations when the integrality requirement is relaxed for any subset of binary setup decision variables. This equivalence has significant implications for decomposition-based methods, since the same optimal solution values are obtained no matter which formulation is used. In particular, we discuss the relax-and-fix method, a decomposition-based heuristic used for the efficient solution of hard lot sizing problems. Computational tests allow us to compare the effectiveness of different formulations using benchmark problems. The choice of formulation directly affects the required computational effort, and our results therefore provide guidelines on choosing an effective formulation during the development of heuristic-based solution procedures

    Discovering Valuable Items from Massive Data

    Suppose there is a large collection of items, each with an associated cost and an inherent utility that is revealed only once we commit to selecting it. Given a budget on the cumulative cost of the selected items, how can we pick a subset of maximal value? This task generalizes several important problems such as multi-arm bandits, active search and the knapsack problem. We present an algorithm, GP-Select, which utilizes prior knowledge about similarity between items, expressed as a kernel function. GP-Select uses Gaussian process prediction to balance exploration (estimating the unknown value of items) and exploitation (selecting items of high value). We extend GP-Select to be able to discover sets that simultaneously have high utility and are diverse. Our preference for diversity can be specified as an arbitrary monotone submodular function that quantifies the diminishing returns obtained when selecting similar items. Furthermore, we exploit the structure of the model updates to achieve an order of magnitude (up to 40X) speedup in our experiments without resorting to approximations. We provide strong guarantees on the performance of GP-Select and apply it to three real-world case studies of industrial relevance: (1) Refreshing a repository of prices in a Global Distribution System for the travel industry, (2) Identifying diverse, binding-affine peptides in a vaccine design task and (3) Maximizing clicks in a web-scale recommender system by recommending items to users

    A Unified Submodular Framework for Multimodal IC Trojan Detection

    This paper presents a unified formal framework for integrated circuits (IC) Trojan detection that can simultaneously employ multiple noninvasive measurement types. Hardware Trojans refer to modifications, alterations, or insertions to the original IC for adversarial purposes. The new framework formally defines the IC Trojan detection for each measurement type as an optimization problem and discusses the complexity. A formulation of the problem that is applicable to a large class of Trojan detection problems and is submodular is devised. Based on the objective function properties, an efficient Trojan detection method with strong approximation and optimality guarantees is introduced. Signal processing methods for calibrating the impact of inter-chip and intra-chip correlations are presented. We propose a number of methods for combining the detections of the different measurement types. Experimental evaluations on benchmark designs reveal the low overhead and effectiveness of the new Trojan detection framework and provide a comparison of different detection combining methods.