Sketch-based Influence Maximization and Computation: Scaling up with Guarantees
Propagation of contagion through networks is a fundamental process. It is
used to model the spread of information, influence, or a viral infection.
Diffusion patterns can be specified by a probabilistic model, such as
Independent Cascade (IC), or captured by a set of representative traces.
Basic computational problems in the study of diffusion are influence queries
(determining the potency of a specified seed set of nodes) and Influence
Maximization (identifying the most influential seed set of a given size).
Answering each influence query involves many edge traversals, and does not
scale when there are many queries on very large graphs. The gold standard for
Influence Maximization is the greedy algorithm, which iteratively adds to the
seed set a node maximizing the marginal gain in influence. Greedy has a
guaranteed approximation ratio of at least (1-1/e) and actually produces a
sequence of nodes, with each prefix having an approximation guarantee with respect
to the same-size optimum. Since Greedy does not scale well beyond a few million
edges, for larger inputs one must currently use either heuristics or
alternative algorithms designed for a pre-specified small seed set size.
We develop a novel sketch-based design for influence computation. Our greedy
Sketch-based Influence Maximization (SKIM) algorithm scales to graphs with
billions of edges, with one to two orders of magnitude speedup over the best
greedy methods. It still has a guaranteed approximation ratio, and in practice
its quality nearly matches that of exact greedy. We also present influence
oracles, which use linear-time preprocessing to generate a small sketch for
each node, allowing the influence of any seed set to be quickly answered from
the sketches of its nodes.
Comment: 10 pages, 5 figures. Appeared at the 23rd Conference on Information
and Knowledge Management (CIKM 2014) in Shanghai, China.
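To make the greedy baseline concrete, here is a minimal sketch of greedy
Influence Maximization under the Independent Cascade model, with influence
estimated by plain Monte Carlo simulation. The adjacency-list format, the
uniform edge probability p, and all names are illustrative assumptions; the
repeated simulations inside each influence query are exactly the cost that
sketch-based designs such as SKIM avoid.

import random

def simulate_ic(graph, seeds, p):
    """One Independent Cascade trial; returns the set of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        u = frontier.pop()
        for v in graph.get(u, []):
            if v not in active and random.random() < p:
                active.add(v)
                frontier.append(v)
    return active

def influence(graph, seeds, p, trials=200):
    """Monte Carlo estimate of the expected spread of a seed set."""
    return sum(len(simulate_ic(graph, seeds, p)) for _ in range(trials)) / trials

def greedy_im(graph, k, p=0.1):
    """Plain greedy: add the node with the largest estimated marginal gain.
    Exact greedy guarantees a (1-1/e) approximation for this monotone
    submodular objective; here each gain estimate re-runs the cascade."""
    seeds = []
    for _ in range(k):
        base = influence(graph, seeds, p)
        _, best = max((influence(graph, seeds + [v], p) - base, v)
                      for v in graph if v not in seeds)
        seeds.append(best)
    return seeds

# toy usage on a four-node graph
print(greedy_im({0: [1, 2], 1: [2, 3], 2: [3], 3: [0]}, k=2))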
Complexity of Strong Implementability
We consider the question of implementability of a social choice function in a
classical setting where the preferences of finitely many selfish individuals
with private information have to be aggregated towards a social choice. This is
one of the central questions in mechanism design. If the concept of weak
implementation is considered, the Revelation Principle states that one can
restrict attention to truthful implementations and direct revelation
mechanisms, which implies that implementability of a social choice function is
easy to check. For the concept of strong implementation, however, the
Revelation Principle becomes invalid, and the complexity of deciding whether a
given social choice function is strongly implementable has been open so far. In
this paper, we show by using methods from polyhedral theory that strong
implementability of a social choice function can be decided in polynomial space
and that each of the payments needed for strong implementation can always be
chosen to be of polynomial encoding length. Moreover, we show that strong
implementability of a social choice function involving only a single selfish
individual can be decided in polynomial time via linear programming.
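For intuition about the linear-programming flavor of the single-individual
case, the sketch below checks, over a finite type space, whether payments
exist that make truthful reporting optimal. This is the standard
incentive-compatibility feasibility LP, given as an illustration rather than
the paper's exact construction; the valuation/outcome encoding is an
assumption.

import numpy as np
from scipy.optimize import linprog

def truthful_payments(valuation, outcome):
    """Decide whether payments make truth-telling optimal for one agent.

    valuation[t][o]: the agent's value for outcome o under type t.
    outcome[t]:      outcome the social choice function picks on report t.
    Feasibility of p(t') - p(t) <= v(f(t), t) - v(f(t'), t) for all t, t'
    is decided by an LP with a zero objective (illustrative sketch only)."""
    n = len(outcome)
    A, b = [], []
    for t in range(n):
        for t2 in range(n):
            if t == t2:
                continue
            row = np.zeros(n)
            row[t2], row[t] = 1.0, -1.0
            A.append(row)
            b.append(valuation[t][outcome[t]] - valuation[t][outcome[t2]])
    res = linprog(c=np.zeros(n), A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(None, None)] * n)
    return res.x if res.success else None

# two types, two outcomes: f picks outcome 0 on type 0, outcome 1 on type 1
print(truthful_payments(valuation=[[2.0, 1.0], [0.0, 3.0]], outcome=[0, 1]))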
Streaming Algorithms for Submodular Function Maximization
We consider the problem of maximizing a nonnegative submodular set function
$f : 2^{\mathcal{N}} \to \mathbb{R}_+$ subject to a $p$-matchoid constraint
in the single-pass streaming setting. Previous work in this context has
considered streaming algorithms for modular functions and monotone submodular
functions. The main result is for submodular functions that are {\em
non-monotone}. We describe deterministic and randomized algorithms that
obtain an $\Omega(1/p)$-approximation using $O(k \log k)$-space, where $k$ is
an upper bound on the cardinality of the desired set. The model assumes value
oracle access to $f$ and membership oracles for the matroids defining the
$p$-matchoid constraint.
Comment: 29 pages, 7 figures, extended abstract to appear in ICALP 2015
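For intuition about the single-pass model, the following much simpler special
case treats monotone submodular maximization under a cardinality constraint
with a threshold ("sieve") rule, assuming an estimate of the optimum value is
available; the full algorithm runs parallel copies over a geometric grid of
guesses. This is an illustrative baseline, not the paper's p-matchoid
algorithm.

def sieve_stream(stream, f, k, opt_guess):
    """Keep an arriving element iff its marginal gain clears the residual
    threshold; f is a set-function value oracle, as in the oracle model."""
    chosen = []
    for e in stream:
        if len(chosen) == k:
            break
        gain = f(chosen + [e]) - f(chosen)
        if gain >= (opt_guess / 2.0 - f(chosen)) / (k - len(chosen)):
            chosen.append(e)
    return chosen

# toy usage: a coverage objective over invented item "footprints"
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d", "e"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)))
print(sieve_stream(iter([1, 2, 3, 4]), cover, k=2, opt_guess=4))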
On k-Column Sparse Packing Programs
We consider the class of packing integer programs (PIPs) that are column
sparse, i.e. there is a specified upper bound k on the number of constraints
that each variable appears in. We give an (ek+o(k))-approximation algorithm
for k-column sparse PIPs, improving on previously known approximation
results. We also show that the integrality gap of our linear programming
relaxation is at least 2k-1; it is known that k-column sparse PIPs are
$\Omega(k/\log k)$-hard to approximate. We also extend our result (at the
loss of a small constant factor) to the more general case of maximizing a
submodular objective over k-column sparse packing constraints.
Comment: 19 pages, v3: additional details
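To illustrate the LP-relaxation setting, the sketch below shows the generic
"round and alter" template often applied to column-sparse packing: scale down
a fractional LP solution, round it randomly, then repair violated
constraints. The scaling constant and the repair rule are illustrative
assumptions, not the paper's analysis.

import random

def round_and_alter(A, b, x_lp, scale=2.0):
    """Pick item j with probability x_lp[j]/scale, then drop items until
    every packing constraint A x <= b holds. In k-column sparse PIPs each
    item appears in at most k rows, which is what bounds the loss of a
    repair step like this one (illustrative template only)."""
    picked = [j for j, xj in enumerate(x_lp) if random.random() < xj / scale]
    load = [sum(A[i][j] for j in picked) for i in range(len(b))]
    for j in list(picked):  # repair pass: drop any item in a violated row
        if any(A[i][j] > 0 and load[i] > b[i] for i in range(len(b))):
            picked.remove(j)
            for i in range(len(b)):
                load[i] -= A[i][j]
    return picked

# toy instance: two constraints, three items, each column at most 2-sparse
A = [[1, 1, 0],
     [0, 1, 1]]
print(round_and_alter(A, b=[1, 1], x_lp=[0.5, 0.5, 0.5]))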
An oil pipeline design problem
We consider a given set of offshore platforms and onshore wells producing known (or estimated) amounts of oil to be connected to a port. Connections may take place directly between platforms, well sites, and the port, or may go through connection points at given locations. The configuration of the network and sizes of pipes used must be chosen to minimize construction costs. This problem is expressed as a mixed-integer program, and solved both heuristically by Tabu Search and Variable Neighborhood Search methods and exactly by a branch-and-bound method. Two new types of valid inequalities are introduced. Tests are made with data from the South Gabon oil field and randomly generated problems.
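To give the flavor of such a model, here is a toy pipe-sizing MIP with
invented data, written with the pulp library: binary variables choose at most
one pipe diameter per candidate connection, and flow conservation routes
production to the port. The paper's actual formulation, valid inequalities,
and data are not reproduced here.

from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

# invented toy data: candidate arcs, pipe options (capacity, cost), supplies
arcs = [("well", "hub"), ("hub", "port"), ("well", "port")]
pipes = {a: [(5, 3.0), (10, 5.0)] for a in arcs}
supply = {"well": 8, "hub": 0, "port": -8}  # the port absorbs all oil

prob = LpProblem("pipeline_design", LpMinimize)
flow = {a: LpVariable(f"f_{a[0]}_{a[1]}", lowBound=0) for a in arcs}
pick = {(a, i): LpVariable(f"y_{a[0]}_{a[1]}_{i}", cat="Binary")
        for a in arcs for i in range(len(pipes[a]))}

prob += lpSum(pipes[a][i][1] * pick[a, i] for (a, i) in pick)  # build cost
for a in arcs:
    # at most one diameter per arc; flow limited by installed capacity
    prob += lpSum(pick[a, i] for i in range(len(pipes[a]))) <= 1
    prob += flow[a] <= lpSum(pipes[a][i][0] * pick[a, i]
                             for i in range(len(pipes[a])))
for v in supply:  # conservation: outflow minus inflow equals supply
    prob += (lpSum(flow[a] for a in arcs if a[0] == v)
             - lpSum(flow[a] for a in arcs if a[1] == v)) == supply[v]

prob.solve()
print(value(prob.objective), {a: value(flow[a]) for a in arcs})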
Solving Medium-Density Subset Sum Problems in Expected Polynomial Time: An Enumeration Approach
The subset sum problem (SSP) can be briefly stated as: given a target integer
$E$ and a set $A$ containing $n$ positive integers, find a subset of $A$
summing to $E$. The \textit{density} $d$ of an SSP instance is defined by the
ratio of $n$ to $m$, where $m$ is the logarithm of the largest integer within
$A$. Based on the structural and statistical properties of subset sums, we
present an improved enumeration scheme for SSP, and implement it as a
complete and exact algorithm (EnumPlus). The algorithm always equivalently
reduces an instance to be low-density, and then solves it by enumeration.
Through this approach, we show that it is possible to design a single
algorithm that efficiently solves instances of arbitrary density in a uniform
way. Furthermore, our algorithm has a considerable performance advantage over
previous algorithms. Firstly, it extends the density scope in which SSP can
be solved in expected polynomial time. Specifically, it solves SSP in
expected time $O(n\log n)$ when the density $d \geq c\sqrt{n}/\log n$, while
the previously best density scope is $d \geq c\, n/(\log n)^{2}$. In
addition, the overall expected time and space requirements in the average
case are proven to be $O(n^{5}\log n)$ and $O(n^{5})$ respectively. Secondly,
in the worst case, it slightly improves the previously best time complexity
of exact algorithms for SSP. Specifically, the worst-case time complexity of
our algorithm is proved to be $O((n-6)2^{n/2}+n)$, while the previously best
result is $O(n\,2^{n/2})$.
Comment: 11 pages, 1 figure
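For contrast with EnumPlus, the classic two-list (meet-in-the-middle)
enumeration behind the $O(2^{n/2})$-type worst-case bounds fits in a few
lines; this is the standard baseline, not the paper's algorithm.

from itertools import combinations

def subset_sum_mitm(a, target):
    """Meet-in-the-middle: tabulate every subset sum of each half, then
    look up the complement of each left-half sum among the right-half sums.
    Runs in O(2^(n/2)) time and space up to polynomial factors."""
    half = len(a) // 2

    def all_sums(items):
        table = {0: ()}
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                table.setdefault(sum(combo), combo)
        return table

    left, right = all_sums(a[:half]), all_sums(a[half:])
    for s, combo in left.items():
        if target - s in right:
            return combo + right[target - s]
    return None

print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 9))  # e.g. (4, 5)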
On the equivalence of strong formulations for capacitated multi-level lot sizing problems with setup times
Several mixed integer programming formulations have been proposed for modeling capacitated multi-level lot sizing problems with setup times. These formulations include the so-called facility location formulation, the shortest route formulation, and the inventory and lot sizing formulation with (l,S) inequalities. In this paper, we demonstrate the equivalence of these formulations when the integrality requirement is relaxed for any subset of binary setup decision variables. This equivalence has significant implications for decomposition-based methods, since the same optimal solution values are obtained regardless of which formulation is used. In particular, we discuss the relax-and-fix method, a decomposition-based heuristic used for the efficient solution of hard lot sizing problems. Computational tests allow us to compare the effectiveness of different formulations using benchmark problems. The choice of formulation directly affects the required computational effort, and our results therefore provide guidelines on choosing an effective formulation during the development of heuristic-based solution procedures.
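Because the formulations are interchangeable inside decomposition schemes, a
generic relax-and-fix skeleton can be written against any of them. The sketch
below is illustrative, uses the pulp library, and takes the stage partition
of the binary setup variables (e.g., grouped by time period) as an assumed
input.

from pulp import LpContinuous, LpBinary, value

def relax_and_fix(prob, stages):
    """Relax-and-fix: solve a sequence of restricted MIPs instead of the
    full problem. At stage t only the t-th group of setup variables is
    binary; later groups are LP-relaxed and earlier groups stay fixed.
    The equivalence result above means the stage subproblems yield the
    same bounds no matter which formulation `prob` encodes."""
    for group in stages:               # start with every group relaxed
        for v in group:
            v.cat = LpContinuous
    for group in stages:
        for v in group:                # restore integrality for this stage
            v.cat = LpBinary
        prob.solve()
        for v in group:                # fix this stage's setup decisions
            v.lowBound = v.upBound = round(value(v))
    return prob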
Discovering Valuable Items from Massive Data
Suppose there is a large collection of items, each with an associated cost
and an inherent utility that is revealed only once we commit to selecting it.
Given a budget on the cumulative cost of the selected items, how can we pick a
subset of maximal value? This task generalizes several important problems such
as multi-armed bandits, active search and the knapsack problem. We present an
algorithm, GP-Select, which utilizes prior knowledge about similarity between
items, expressed as a kernel function. GP-Select uses Gaussian process
prediction to balance exploration (estimating the unknown value of items) and
exploitation (selecting items of high value). We extend GP-Select to be able to
discover sets that simultaneously have high utility and are diverse. Our
preference for diversity can be specified as an arbitrary monotone submodular
function that quantifies the diminishing returns obtained when selecting
similar items. Furthermore, we exploit the structure of the model updates to
achieve an order of magnitude (up to 40X) speedup in our experiments without
resorting to approximations. We provide strong guarantees on the performance of
GP-Select and apply it to three real-world case studies of industrial
relevance: (1) Refreshing a repository of prices in a Global Distribution
System for the travel industry, (2) Identifying diverse, binding-affine
peptides in a vaccine design task and (3) Maximizing clicks in a web-scale
recommender system by recommending items to users.
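A minimal sketch of the selection loop, illustrating the
exploration/exploitation rule with scikit-learn's Gaussian process regressor;
the RBF kernel, the beta weight, and the utility oracle are stand-in
assumptions, not the authors' implementation.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def gp_select(features, costs, budget, utility, beta=2.0):
    """Pick the affordable item maximizing the optimistic score
    mu + beta*sd, observe its (previously hidden) utility, refit the GP
    posterior on everything seen so far, and repeat until the budget
    is exhausted."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
    chosen, X, y, spent = [], [], [], 0.0
    while True:
        pool = [i for i in range(len(features))
                if i not in chosen and spent + costs[i] <= budget]
        if not pool:
            return chosen
        if X:  # posterior mean/std once at least one utility is observed
            mu, sd = gp.predict(features[pool], return_std=True)
        else:
            mu, sd = np.zeros(len(pool)), np.ones(len(pool))
        best = pool[int(np.argmax(mu + beta * sd))]
        chosen.append(best)
        spent += costs[best]
        X.append(features[best])
        y.append(utility(best))  # utility revealed only after selection
        gp.fit(np.array(X), np.array(y))

# toy usage with a hypothetical utility oracle (for illustration only)
rng = np.random.default_rng(0)
items = rng.normal(size=(20, 3))
print(gp_select(items, costs=np.ones(20), budget=5.0,
                utility=lambda i: float(items[i].sum())))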
PolyEDA: Combining Estimation of Distribution Algorithms and Linear Inequality Constraints
Keywords: algorithm, linearity, inequality
A Unified Submodular Framework for Multimodal IC Trojan Detection
This paper presents a unified formal framework for integrated circuit (IC) Trojan detection that can simultaneously employ multiple noninvasive measurement types. Hardware Trojans refer to modifications, alterations, or insertions to the original IC for adversarial purposes. The new framework formally defines the IC Trojan detection for each measurement type as an optimization problem and discusses its complexity. A formulation of the problem that is applicable to a large class of Trojan detection problems and is submodular is devised. Based on the objective function properties, an efficient Trojan detection method with strong approximation and optimality guarantees is introduced. Signal processing methods for calibrating the impact of inter-chip and intra-chip correlations are presented. We propose a number of methods for combining the detections of the different measurement types. Experimental evaluations on benchmark designs reveal the low overhead and effectiveness of the new Trojan detection framework and provide a comparison of the different detection combining methods.
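Guarantees of this kind typically come from greedy submodular maximization;
the following lazy (CELF-style) variant, sketched for a generic monotone
value oracle on invented toy data, shows how submodularity is exploited to
skip re-evaluations.

import heapq

def lazy_greedy(ground_set, f, k):
    """Lazy (CELF-style) greedy for monotone submodular maximization.
    Submodularity means cached marginal gains only shrink over time, so a
    stale bound that still tops the heap after re-evaluation must belong
    to the true best next element."""
    S, fS = [], f([])
    heap = [(-(f([e]) - fS), e) for e in ground_set]
    heapq.heapify(heap)
    while heap and len(S) < k:
        neg_gain, e = heapq.heappop(heap)
        gain = f(S + [e]) - fS          # re-evaluate the stale bound
        if not heap or gain >= -heap[0][0]:
            S.append(e)                 # certified best marginal gain
            fS += gain
        else:
            heapq.heappush(heap, (-gain, e))
    return S

# toy usage: coverage of measurement "signatures" (invented data)
sigs = {1: {"p1", "p2"}, 2: {"p2"}, 3: {"p3", "p4"}}
f = lambda S: len(set().union(*(sigs[e] for e in S)))
print(lazy_greedy(list(sigs), f, 2))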