Sketch-based Influence Maximization and Computation: Scaling up with Guarantees
Propagation of contagion through networks is a fundamental process. It is
used to model the spread of information, influence, or a viral infection.
Diffusion patterns can be specified by a probabilistic model, such as
Independent Cascade (IC), or captured by a set of representative traces.
Basic computational problems in the study of diffusion are influence queries
(determining the potency of a specified seed set of nodes) and Influence
Maximization (identifying the most influential seed set of a given size).
Answering each influence query involves many edge traversals, and does not
scale when there are many queries on very large graphs. The gold standard for
Influence Maximization is the greedy algorithm, which iteratively adds to the
seed set a node maximizing the marginal gain in influence. Greedy has a
guaranteed approximation ratio of at least (1-1/e) and actually produces a
sequence of nodes, with each prefix having an approximation guarantee with respect
to the same-size optimum. Since Greedy does not scale well beyond a few million
edges, for larger inputs one must currently use either heuristics or
alternative algorithms designed for a pre-specified small seed set size.
We develop a novel sketch-based design for influence computation. Our greedy
Sketch-based Influence Maximization (SKIM) algorithm scales to graphs with
billions of edges, with one to two orders of magnitude speedup over the best
greedy methods. It still has a guaranteed approximation ratio, and in practice
its quality nearly matches that of exact greedy. We also present influence
oracles, which use linear-time preprocessing to generate a small sketch for
each node, allowing the influence of any seed set to be quickly answered from
the sketches of its nodes.
Comment: 10 pages, 5 figures. Appeared at the 23rd Conference on Information
and Knowledge Management (CIKM 2014) in Shanghai, China.
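To make the greedy baseline concrete, here is a minimal sketch of greedy
Influence Maximization under the Independent Cascade model, with influence
estimated by plain Monte Carlo simulation. The adjacency-list format, the
uniform edge probability p, and all names are illustrative assumptions; the
repeated simulations inside each influence query are exactly the cost that
sketch-based designs such as SKIM avoid.

import random

def simulate_ic(graph, seeds, p):
    """One Independent Cascade trial; returns the set of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        u = frontier.pop()
        for v in graph.get(u, []):
            if v not in active and random.random() < p:
                active.add(v)
                frontier.append(v)
    return active

def influence(graph, seeds, p, trials=200):
    """Monte Carlo estimate of the expected spread of a seed set."""
    return sum(len(simulate_ic(graph, seeds, p)) for _ in range(trials)) / trials

def greedy_im(graph, k, p=0.1):
    """Plain greedy: add the node with the largest estimated marginal gain.
    Exact greedy guarantees a (1-1/e) approximation for this monotone
    submodular objective; here each gain estimate re-runs the cascade."""
    seeds = []
    for _ in range(k):
        base = influence(graph, seeds, p)
        _, best = max((influence(graph, seeds + [v], p) - base, v)
                      for v in graph if v not in seeds)
        seeds.append(best)
    return seeds

# toy usage on a four-node graph
print(greedy_im({0: [1, 2], 1: [2, 3], 2: [3], 3: [0]}, k=2))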
Complexity of Strong Implementability
We consider the question of implementability of a social choice function in a
classical setting where the preferences of finitely many selfish individuals
with private information have to be aggregated towards a social choice. This is
one of the central questions in mechanism design. If the concept of weak
implementation is considered, the Revelation Principle states that one can
restrict attention to truthful implementations and direct revelation
mechanisms, which implies that implementability of a social choice function is
easy to check. For the concept of strong implementation, however, the
Revelation Principle becomes invalid, and the complexity of deciding whether a
given social choice function is strongly implementable has been open so far. In
this paper, we show by using methods from polyhedral theory that strong
implementability of a social choice function can be decided in polynomial space
and that each of the payments needed for strong implementation can always be
chosen to be of polynomial encoding length. Moreover, we show that strong
implementability of a social choice function involving only a single selfish
individual can be decided in polynomial time via linear programming.
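For intuition about the linear-programming flavor of the single-individual
case, the sketch below checks, over a finite type space, whether payments
exist that make truthful reporting optimal. This is the standard
incentive-compatibility feasibility LP, given as an illustration rather than
the paper's exact construction; the valuation/outcome encoding is an
assumption.

import numpy as np
from scipy.optimize import linprog

def truthful_payments(valuation, outcome):
    """Decide whether payments make truth-telling optimal for one agent.

    valuation[t][o]: the agent's value for outcome o under type t.
    outcome[t]:      outcome the social choice function picks on report t.
    Feasibility of p(t') - p(t) <= v(f(t), t) - v(f(t'), t) for all t, t'
    is decided by an LP with a zero objective (illustrative sketch only)."""
    n = len(outcome)
    A, b = [], []
    for t in range(n):
        for t2 in range(n):
            if t == t2:
                continue
            row = np.zeros(n)
            row[t2], row[t] = 1.0, -1.0
            A.append(row)
            b.append(valuation[t][outcome[t]] - valuation[t][outcome[t2]])
    res = linprog(c=np.zeros(n), A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(None, None)] * n)
    return res.x if res.success else None

# two types, two outcomes: f picks outcome 0 on type 0, outcome 1 on type 1
print(truthful_payments(valuation=[[2.0, 1.0], [0.0, 3.0]], outcome=[0, 1]))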
Streaming Algorithms for Submodular Function Maximization
We consider the problem of maximizing a nonnegative submodular set function
$f : 2^{\mathcal{N}} \to \mathbb{R}_+$ subject to a $p$-matchoid constraint
in the single-pass streaming setting. Previous work in this context has
considered streaming algorithms for modular functions and monotone submodular
functions. The main result is for submodular functions that are {\em
non-monotone}. We describe deterministic and randomized algorithms that
obtain an $\Omega(1/p)$-approximation using $O(k \log k)$-space, where $k$ is
an upper bound on the cardinality of the desired set. The model assumes value
oracle access to $f$ and membership oracles for the matroids defining the
$p$-matchoid constraint.
Comment: 29 pages, 7 figures, extended abstract to appear in ICALP 2015
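For intuition about the single-pass model, the following much simpler special
case treats monotone submodular maximization under a cardinality constraint
with a threshold ("sieve") rule, assuming an estimate of the optimum value is
available; the full algorithm runs parallel copies over a geometric grid of
guesses. This is an illustrative baseline, not the paper's p-matchoid
algorithm.

def sieve_stream(stream, f, k, opt_guess):
    """Keep an arriving element iff its marginal gain clears the residual
    threshold; f is a set-function value oracle, as in the oracle model."""
    chosen = []
    for e in stream:
        if len(chosen) == k:
            break
        gain = f(chosen + [e]) - f(chosen)
        if gain >= (opt_guess / 2.0 - f(chosen)) / (k - len(chosen)):
            chosen.append(e)
    return chosen

# toy usage: a coverage objective over invented item "footprints"
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d", "e"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)))
print(sieve_stream(iter([1, 2, 3, 4]), cover, k=2, opt_guess=4))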
On k-Column Sparse Packing Programs
We consider the class of packing integer programs (PIPs) that are column
sparse, i.e. there is a specified upper bound k on the number of constraints
that each variable appears in. We give an (ek+o(k))-approximation algorithm
for k-column sparse PIPs, improving on previously known approximation
results. We also show that the integrality gap of our linear programming
relaxation is at least 2k-1; it is known that k-column sparse PIPs are
$\Omega(k/\log k)$-hard to approximate. We also extend our result (at the
loss of a small constant factor) to the more general case of maximizing a
submodular objective over k-column sparse packing constraints.
Comment: 19 pages, v3: additional details
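To illustrate the LP-relaxation setting, the sketch below shows the generic
"round and alter" template often applied to column-sparse packing: scale down
a fractional LP solution, round it randomly, then repair violated
constraints. The scaling constant and the repair rule are illustrative
assumptions, not the paper's analysis.

import random

def round_and_alter(A, b, x_lp, scale=2.0):
    """Pick item j with probability x_lp[j]/scale, then drop items until
    every packing constraint A x <= b holds. In k-column sparse PIPs each
    item appears in at most k rows, which is what bounds the loss of a
    repair step like this one (illustrative template only)."""
    picked = [j for j, xj in enumerate(x_lp) if random.random() < xj / scale]
    load = [sum(A[i][j] for j in picked) for i in range(len(b))]
    for j in list(picked):  # repair pass: drop any item in a violated row
        if any(A[i][j] > 0 and load[i] > b[i] for i in range(len(b))):
            picked.remove(j)
            for i in range(len(b)):
                load[i] -= A[i][j]
    return picked

# toy instance: two constraints, three items, each column at most 2-sparse
A = [[1, 1, 0],
     [0, 1, 1]]
print(round_and_alter(A, b=[1, 1], x_lp=[0.5, 0.5, 0.5]))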
An oil pipeline design problem
We consider a given set of offshore platforms and onshore wells producing known (or estimated) amounts of oil to be connected to a port. Connections may take place directly between platforms, well sites, and the port, or may go through connection points at given locations. The configuration of the network and sizes of pipes used must be chosen to minimize construction costs. This problem is expressed as a mixed-integer program, and solved both heuristically by Tabu Search and Variable Neighborhood Search methods and exactly by a branch-and-bound method. Two new types of valid inequalities are introduced. Tests are made with data from the South Gabon oil field and randomly generated problems.
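To give the flavor of such a model, here is a toy pipe-sizing MIP with
invented data, written with the pulp library: binary variables choose at most
one pipe diameter per candidate connection, and flow conservation routes
production to the port. The paper's actual formulation, valid inequalities,
and data are not reproduced here.

from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

# invented toy data: candidate arcs, pipe options (capacity, cost), supplies
arcs = [("well", "hub"), ("hub", "port"), ("well", "port")]
pipes = {a: [(5, 3.0), (10, 5.0)] for a in arcs}
supply = {"well": 8, "hub": 0, "port": -8}  # the port absorbs all oil

prob = LpProblem("pipeline_design", LpMinimize)
flow = {a: LpVariable(f"f_{a[0]}_{a[1]}", lowBound=0) for a in arcs}
pick = {(a, i): LpVariable(f"y_{a[0]}_{a[1]}_{i}", cat="Binary")
        for a in arcs for i in range(len(pipes[a]))}

prob += lpSum(pipes[a][i][1] * pick[a, i] for (a, i) in pick)  # build cost
for a in arcs:
    # at most one diameter per arc; flow limited by installed capacity
    prob += lpSum(pick[a, i] for i in range(len(pipes[a]))) <= 1
    prob += flow[a] <= lpSum(pipes[a][i][0] * pick[a, i]
                             for i in range(len(pipes[a])))
for v in supply:  # conservation: outflow minus inflow equals supply
    prob += (lpSum(flow[a] for a in arcs if a[0] == v)
             - lpSum(flow[a] for a in arcs if a[1] == v)) == supply[v]

prob.solve()
print(value(prob.objective), {a: value(flow[a]) for a in arcs})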
Solving Medium-Density Subset Sum Problems in Expected Polynomial Time: An Enumeration Approach
The subset sum problem (SSP) can be briefly stated as: given a target integer
$E$ and a set $A$ containing $n$ positive integers, find a subset of $A$
summing to $E$. The \textit{density} $d$ of an SSP instance is defined by the
ratio of $n$ to $m$, where $m$ is the logarithm of the largest integer within
$A$. Based on the structural and statistical properties of subset sums, we
present an improved enumeration scheme for SSP, and implement it as a
complete and exact algorithm (EnumPlus). The algorithm always equivalently
reduces an instance to be low-density, and then solves it by enumeration.
Through this approach, we show that it is possible to design a single
algorithm that efficiently solves instances of arbitrary density in a uniform
way. Furthermore, our algorithm has a considerable performance advantage over
previous algorithms. Firstly, it extends the density scope in which SSP can
be solved in expected polynomial time. Specifically, it solves SSP in
expected time $O(n\log n)$ when the density $d \geq c\sqrt{n}/\log n$, while
the previously best density scope is $d \geq c\, n/(\log n)^{2}$. In
addition, the overall expected time and space requirements in the average
case are proven to be $O(n^{5}\log n)$ and $O(n^{5})$ respectively. Secondly,
in the worst case, it slightly improves the previously best time complexity
of exact algorithms for SSP. Specifically, the worst-case time complexity of
our algorithm is proved to be $O((n-6)2^{n/2}+n)$, while the previously best
result is $O(n\,2^{n/2})$.
Comment: 11 pages, 1 figure
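For contrast with EnumPlus, the classic two-list (meet-in-the-middle)
enumeration behind the $O(2^{n/2})$-type worst-case bounds fits in a few
lines; this is the standard baseline, not the paper's algorithm.

from itertools import combinations

def subset_sum_mitm(a, target):
    """Meet-in-the-middle: tabulate every subset sum of each half, then
    look up the complement of each left-half sum among the right-half sums.
    Runs in O(2^(n/2)) time and space up to polynomial factors."""
    half = len(a) // 2

    def all_sums(items):
        table = {0: ()}
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                table.setdefault(sum(combo), combo)
        return table

    left, right = all_sums(a[:half]), all_sums(a[half:])
    for s, combo in left.items():
        if target - s in right:
            return combo + right[target - s]
    return None

print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 9))  # e.g. (4, 5)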
On the equivalence of strong formulations for capacitated multi-level lot sizing problems with setup times
Several mixed integer programming formulations have been proposed for modeling capacitated multi-level lot sizing problems with setup times. These formulations include the so-called facility location formulation, the shortest route formulation, and the inventory and lot sizing formulation with (l,S) inequalities. In this paper, we demonstrate the equivalence of these formulations when the integrality requirement is relaxed for any subset of binary setup decision variables. This equivalence has significant implications for decomposition-based methods, since the same optimal solution values are obtained regardless of which formulation is used. In particular, we discuss the relax-and-fix method, a decomposition-based heuristic used for the efficient solution of hard lot sizing problems. Computational tests allow us to compare the effectiveness of different formulations using benchmark problems. The choice of formulation directly affects the required computational effort, and our results therefore provide guidelines on choosing an effective formulation during the development of heuristic-based solution procedures.
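Because the formulations are interchangeable inside decomposition schemes, a
generic relax-and-fix skeleton can be written against any of them. The sketch
below is illustrative, uses the pulp library, and takes the stage partition
of the binary setup variables (e.g., grouped by time period) as an assumed
input.

from pulp import LpContinuous, LpBinary, value

def relax_and_fix(prob, stages):
    """Relax-and-fix: solve a sequence of restricted MIPs instead of the
    full problem. At stage t only the t-th group of setup variables is
    binary; later groups are LP-relaxed and earlier groups stay fixed.
    The equivalence result above means the stage subproblems yield the
    same bounds no matter which formulation `prob` encodes."""
    for group in stages:               # start with every group relaxed
        for v in group:
            v.cat = LpContinuous
    for group in stages:
        for v in group:                # restore integrality for this stage
            v.cat = LpBinary
        prob.solve()
        for v in group:                # fix this stage's setup decisions
            v.lowBound = v.upBound = round(value(v))
    return prob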
Discovering Valuable Items from Massive Data
Suppose there is a large collection of items, each with an associated cost
and an inherent utility that is revealed only once we commit to selecting it.
Given a budget on the cumulative cost of the selected items, how can we pick a
subset of maximal value? This task generalizes several important problems such
as multi-armed bandits, active search and the knapsack problem. We present an
algorithm, GP-Select, which utilizes prior knowledge about similarity between
items, expressed as a kernel function. GP-Select uses Gaussian process
prediction to balance exploration (estimating the unknown value of items) and
exploitation (selecting items of high value). We extend GP-Select to be able to
discover sets that simultaneously have high utility and are diverse. Our
preference for diversity can be specified as an arbitrary monotone submodular
function that quantifies the diminishing returns obtained when selecting
similar items. Furthermore, we exploit the structure of the model updates to
achieve an order of magnitude (up to 40X) speedup in our experiments without
resorting to approximations. We provide strong guarantees on the performance of
GP-Select and apply it to three real-world case studies of industrial
relevance: (1) Refreshing a repository of prices in a Global Distribution
System for the travel industry, (2) Identifying diverse, binding-affine
peptides in a vaccine design task and (3) Maximizing clicks in a web-scale
recommender system by recommending items to users.
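A minimal sketch of the selection loop, illustrating the
exploration/exploitation rule with scikit-learn's Gaussian process regressor;
the RBF kernel, the beta weight, and the utility oracle are stand-in
assumptions, not the authors' implementation.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def gp_select(features, costs, budget, utility, beta=2.0):
    """Pick the affordable item maximizing the optimistic score
    mu + beta*sd, observe its (previously hidden) utility, refit the GP
    posterior on everything seen so far, and repeat until the budget
    is exhausted."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
    chosen, X, y, spent = [], [], [], 0.0
    while True:
        pool = [i for i in range(len(features))
                if i not in chosen and spent + costs[i] <= budget]
        if not pool:
            return chosen
        if X:  # posterior mean/std once at least one utility is observed
            mu, sd = gp.predict(features[pool], return_std=True)
        else:
            mu, sd = np.zeros(len(pool)), np.ones(len(pool))
        best = pool[int(np.argmax(mu + beta * sd))]
        chosen.append(best)
        spent += costs[best]
        X.append(features[best])
        y.append(utility(best))  # utility revealed only after selection
        gp.fit(np.array(X), np.array(y))

# toy usage with a hypothetical utility oracle (for illustration only)
rng = np.random.default_rng(0)
items = rng.normal(size=(20, 3))
print(gp_select(items, costs=np.ones(20), budget=5.0,
                utility=lambda i: float(items[i].sum())))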
PolyEDA: Combining Estimation of Distribution Algorithms and Linear Inequality Constraints
Keywords: algorithm, linearity, inequality
A Unified Submodular Framework for Multimodal IC Trojan Detection
This paper presents a unified formal framework for integrated circuit (IC) Trojan detection that can simultaneously employ multiple noninvasive measurement types. Hardware Trojans refer to modifications, alterations, or insertions to the original IC for adversarial purposes. The new framework formally defines the IC Trojan detection for each measurement type as an optimization problem and discusses its complexity. A formulation of the problem that is applicable to a large class of Trojan detection problems and is submodular is devised. Based on the objective function properties, an efficient Trojan detection method with strong approximation and optimality guarantees is introduced. Signal processing methods for calibrating the impact of inter-chip and intra-chip correlations are presented. We propose a number of methods for combining the detections of the different measurement types. Experimental evaluations on benchmark designs reveal the low overhead and effectiveness of the new Trojan detection framework and provide a comparison of the different detection combining methods.
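Guarantees of this kind typically come from greedy submodular maximization;
the following lazy (CELF-style) variant, sketched for a generic monotone
value oracle on invented toy data, shows how submodularity is exploited to
skip re-evaluations.

import heapq

def lazy_greedy(ground_set, f, k):
    """Lazy (CELF-style) greedy for monotone submodular maximization.
    Submodularity means cached marginal gains only shrink over time, so a
    stale bound that still tops the heap after re-evaluation must belong
    to the true best next element."""
    S, fS = [], f([])
    heap = [(-(f([e]) - fS), e) for e in ground_set]
    heapq.heapify(heap)
    while heap and len(S) < k:
        neg_gain, e = heapq.heappop(heap)
        gain = f(S + [e]) - fS          # re-evaluate the stale bound
        if not heap or gain >= -heap[0][0]:
            S.append(e)                 # certified best marginal gain
            fS += gain
        else:
            heapq.heappush(heap, (-gain, e))
    return S

# toy usage: coverage of measurement "signatures" (invented data)
sigs = {1: {"p1", "p2"}, 2: {"p2"}, 3: {"p3", "p4"}}
f = lambda S: len(set().union(*(sigs[e] for e in S)))
print(lazy_greedy(list(sigs), f, 2))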