57 research outputs found
A New Framework for Distributed Submodular Maximization
A wide variety of problems in machine learning, including exemplar
clustering, document summarization, and sensor placement, can be cast as
constrained submodular maximization problems. Much recent effort has been devoted to developing distributed algorithms for these problems. However, these results suffer from a high number of rounds, suboptimal approximation ratios, or both. We develop a framework for bringing existing algorithms in the sequential setting to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds. Our techniques also give a fast sequential algorithm for non-monotone maximization subject to a matroid constraint.
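The sequential baseline that such distributed frameworks typically build on is the classic greedy algorithm for monotone submodular maximization under a cardinality constraint. A minimal sketch, assuming a toy coverage objective (the function and data below are illustrative, not from the paper):

```python
def greedy_max(f, ground, k):
    """Classic greedy for monotone submodular maximization with |S| <= k."""
    S = set()
    for _ in range(k):
        best, best_gain = None, 0.0
        for e in ground - S:
            gain = f(S | {e}) - f(S)   # marginal gain of adding e
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:               # no positive marginal gain remains
            break
        S.add(best)
    return S

# Example: maximum coverage, a canonical monotone submodular objective.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)) if S else set())
print(greedy_max(cover, set(sets), 2))  # selects items 3 and 1, covering all 5 elements
```

This greedy achieves a 1 - 1/e approximation for monotone objectives; the paper's contribution is running such procedures in a constant number of MapReduce rounds.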
Enhancing massive MIMO: A new approach for Uplink training based on heterogeneous coherence time
Massive multiple-input multiple-output (MIMO) is one of the key technologies
in future generation networks. Owing to their considerable spectral and energy
efficiency gains, massive MIMO systems provide the needed performance to cope
with the ever-increasing wireless capacity demand. Nevertheless, the number of
scheduled users stays limited in massive MIMO both in time division duplexing
(TDD) and frequency division duplexing (FDD) systems. This is due to the
limited coherence time, in TDD systems, and to limited feedback capacity, in
FDD mode. In current systems, the time slot duration in TDD mode is the same
for all users. This is a suboptimal approach since users are subject to
heterogeneous Doppler spreads and, consequently, different coherence times. In
this paper, we investigate a massive MIMO system operating in TDD mode in which the frequency of uplink training differs among users based on their
actual channel coherence times. We argue that optimizing uplink training by
exploiting this diversity can lead to considerable spectral efficiency gain. We
then provide a user scheduling algorithm that exploits coherence-interval-based grouping in order to maximize the achievable weighted sum rate.
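As a rough illustration of coherence-time-based grouping (not the paper's scheduling algorithm), one can assign each user a training period derived from its Doppler spread. The 1/(2 f_D) coherence-time rule of thumb, the carrier frequency, and the power-of-two periods below are all assumptions:

```python
# Sketch: assign slower users longer uplink-training periods.
C = 3e8          # speed of light (m/s)
FC = 3.5e9       # assumed carrier frequency (Hz)

def coherence_time(speed_mps):
    f_d = speed_mps * FC / C       # maximum Doppler shift
    return 1.0 / (2.0 * f_d)       # rule-of-thumb coherence time (s)

def group_by_coherence(user_speeds, slot_s=1e-3):
    """Give each user the largest power-of-two training period (in slots)
    that still fits inside its channel coherence time."""
    groups = {}
    for uid, v in user_speeds.items():
        period = 1
        while period * 2 * slot_s <= coherence_time(v):
            period *= 2
        groups.setdefault(period, []).append(uid)
    return groups

# A pedestrian's channel stays coherent far longer than a vehicle's,
# so the pedestrian is retrained much less often.
print(group_by_coherence({"pedestrian": 1.0, "vehicle": 30.0}))
```

Grouping users by training period in this way frees slots that a uniform retraining schedule would waste on slow-moving users.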
LQG Control and Sensing Co-Design
We investigate a Linear-Quadratic-Gaussian (LQG) control and sensing
co-design problem, where one jointly designs sensing and control policies. We
focus on the realistic case where the sensing design is selected among a finite
set of available sensors, where each sensor is associated with a different cost
(e.g., power consumption). We consider two dual problem instances:
sensing-constrained LQG control, where one maximizes control performance
subject to a sensor cost budget, and minimum-sensing LQG control, where one
minimizes sensor cost subject to performance constraints. We prove that no polynomial-time algorithm can guarantee a constant approximation factor from the optimum across all problem instances. Nonetheless, we present the first polynomial-time algorithms with per-instance suboptimality guarantees. To this
end, we leverage a separation principle that partially decouples the design of
sensing and control. Then, we frame LQG co-design as the optimization of
approximately supermodular set functions; we develop novel algorithms to solve
the problems; and we prove original results on the performance of the
algorithms, and establish connections between their suboptimality and
control-theoretic quantities. We conclude the paper by discussing two
applications, namely, sensing-constrained formation control and
resource-constrained robot navigation.
Comment: Accepted to IEEE TAC. Includes contributions to submodular function optimization literature, and extends conference paper arXiv:1709.0882
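For intuition on the sensing-constrained side, a cost-benefit greedy over sensors can be sketched on a toy estimation problem (independent scalar states with additive measurement information). The sensor model, names, and numbers are illustrative assumptions; the paper's algorithms instead optimize the full LQG cost:

```python
def posterior_error(selected, prior_var, sensors):
    """Sum of posterior variances of independent scalar states after
    fusing the information of the selected sensors."""
    info = {s: 1.0 / v for s, v in prior_var.items()}   # prior information
    for name in selected:
        info[sensors[name]["state"]] += 1.0 / sensors[name]["noise"]
    return sum(1.0 / i for i in info.values())

def greedy_under_budget(prior_var, sensors, budget):
    """Repeatedly buy the affordable sensor with the best
    error-reduction-per-cost ratio."""
    chosen, spent = set(), 0.0
    while True:
        err = posterior_error(chosen, prior_var, sensors)
        best, best_ratio = None, 0.0
        for name, s in sensors.items():
            if name in chosen or spent + s["cost"] > budget:
                continue
            gain = err - posterior_error(chosen | {name}, prior_var, sensors)
            if gain / s["cost"] > best_ratio:
                best, best_ratio = name, gain / s["cost"]
        if best is None:
            return chosen
        chosen.add(best)
        spent += sensors[best]["cost"]
```

The approximate supermodularity the paper establishes is what lets guarantees of this greedy flavor carry over to the true control-theoretic objective.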
Submodular Secretary Problem with Shortlists under General Constraints
In the submodular k-secretary problem, the goal is to select k items from a randomly ordered input so as to maximize the expected value of a given monotone submodular function on the set of selected items. In this paper, we introduce a relaxation of this problem, which we refer to as the submodular k-secretary problem with shortlists. In the proposed setting, the algorithm is allowed to choose more than k items as part of a shortlist. Then, after seeing the entire input, the algorithm can choose a subset of size k from the bigger set of items in the shortlist. We are interested in understanding to what extent this relaxation can improve the achievable competitive ratio for the submodular k-secretary problem. In particular, using an O(k) shortlist, can an online algorithm achieve a competitive ratio close to the best achievable online approximation factor for this problem? We answer this question affirmatively by giving a polynomial time algorithm that achieves a 1 - 1/e - epsilon - O(k^{-1}) competitive ratio for any constant epsilon > 0, using a shortlist of size eta_epsilon(k) = O(k). Also, for the special case of m-submodular functions, we demonstrate an algorithm that achieves a 1 - epsilon competitive ratio for any constant epsilon > 0, using an O(1) shortlist. Finally, we show that our algorithm can be implemented in the streaming setting using a memory buffer of size eta_epsilon(k) = O(k) to achieve a 1 - 1/e - epsilon - O(k^{-1}) approximation for submodular function maximization in the random order streaming model. This substantially improves upon the previously best known approximation factor of 1/2 + 8*10^{-14} [Norouzi-Fard et al. 2018], which used a memory buffer of size O(k log k).
We further generalize our results to the case of matroid constraints. We design an algorithm that achieves a (1/2)(1 - 1/e^2 - epsilon - O(1/k)) competitive ratio for any constant epsilon > 0, using a shortlist of size O(k). This is especially surprising considering that the best known competitive ratio for the matroid secretary problem is O(log log k). An important application of our algorithm is the random order streaming of submodular functions. We show that our algorithm can be implemented in the streaming setting using O(k) memory, achieving a (1/2)(1 - 1/e^2 - epsilon - O(1/k)) approximation. The previously best known approximation ratio for streaming submodular maximization under a matroid constraint is 0.25 (in adversarial order), due to [Feldman et al.], [Chekuri et al.], and [Chakrabarti et al.]. Moreover, we generalize our results to the case of p-matchoid constraints and give a (1/(p+1))(1 - 1/e^{p+1} - epsilon - O(1/k)) approximation using O(k) memory, which asymptotically approaches the best known offline guarantee of 1/(p+1) [Nemhauser et al.]. Finally, we empirically evaluate our results on real-world data sets such as YouTube videos and Twitter streams.
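For context, a one-pass thresholding rule in the spirit of memory-bounded streaming submodular maximization can be sketched as follows; this is a generic baseline with an assumed threshold tau, not the shortlist algorithm of the paper:

```python
def threshold_stream(stream, f, k, tau):
    """One pass over the stream: keep an item iff its marginal gain
    under f clears tau, holding at most k items in memory."""
    S = []
    for e in stream:
        if len(S) == k:
            break
        if f(S + [e]) - f(S) >= tau:
            S.append(e)
    return S

# Toy coverage objective over a stream of item ids.
sets = {1: {"a"}, 2: {"a", "b"}, 3: {"c"}, 4: {"d", "e", "f"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)) if S else set())
print(threshold_stream([1, 2, 3, 4], cover, 2, 2))  # -> [2, 4]
```

The shortlist idea relaxes exactly this kind of rule: by provisionally keeping O(k) items instead of committing to k, the final subset can be chosen offline from the shortlist.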
Scheduling to minimize power consumption using submodular functions
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 59-64).
We develop logarithmic approximation algorithms for extremely general formulations of multiprocessor multi-interval offline task scheduling to minimize power usage. Here each processor has an arbitrary specified power consumption to be turned on for each possible time interval, and each job has a specified list of time interval/processor pairs during which it could be scheduled. (A processor need not be in use for an entire interval it is turned on.) If there is a feasible schedule, our algorithm finds a feasible schedule with total power usage within an O(log n) factor of optimal, where n is the number of jobs. (Even in a simple setting with one processor, the problem is Set-Cover hard.) If not all jobs can be scheduled and each job has a specified value, then our algorithm finds a schedule of value at least (1 - epsilon)Z and power usage within an O(log(1/epsilon)) factor of the optimal schedule of value at least Z, for any specified Z and epsilon > 0. At the foundation of our work is a general framework for logarithmic approximation to maximizing any submodular function subject to budget constraints. We also introduce the online version of this scheduling problem and show its relation to the classical secretary problem. In order to obtain constant competitive algorithms for this online version, we study the secretary problem with a submodular utility function. We present several constant competitive algorithms for the secretary problem with different kinds of utility functions.
by Morteza Zadimoghaddam. S.M.
Test Score Algorithms for Budgeted Stochastic Utility Maximization
Motivated by recent developments in designing algorithms based on individual
item scores for solving utility maximization problems, we study the framework
of using test scores, defined as a statistic of observed individual item
performance data, for solving the budgeted stochastic utility maximization
problem. We extend an existing scoring mechanism, namely the replication test
scores, to incorporate heterogeneous item costs as well as item values. We show
that a natural greedy algorithm that selects items solely based on their
replication test scores outputs solutions within a constant factor of the
optimum for a broad class of utility functions. Our algorithms and
approximation guarantees assume that test scores are noisy estimates of certain
expected values with respect to marginal distributions of individual item
values, thus making our algorithms practical and extending previous work that
assumes noiseless estimates. Moreover, we show how our algorithm can be adapted
to the setting where items arrive in a streaming fashion while maintaining the
same approximation guarantee. We present numerical results, using synthetic
data and data sets from the Academia.StackExchange Q&A forum, which show that our test score algorithm can be competitive with, and in some cases outperform, a benchmark algorithm that requires access to a value oracle to evaluate function values.
Balancing Relevance and Diversity in Online Bipartite Matching via Submodularity
In bipartite matching problems, vertices on one side of a bipartite graph are
paired with those on the other. In its online variant, one side of the graph is
available offline, while the vertices on the other side arrive online. When a
vertex arrives, an irrevocable and immediate decision should be made by the
algorithm; either match it to an available vertex or drop it. Examples of such
problems include matching workers to firms, advertisers to keywords, organs to
patients, and so on. Much of the literature focuses on maximizing the total
relevance---modeled via total weight---of the matching. However, in many
real-world problems, it is also important to consider contributions of
diversity: hiring a diverse pool of candidates, displaying a relevant but
diverse set of ads, and so on. In this paper, we propose the Online Submodular
Bipartite Matching (\osbm) problem, where the goal is to maximize a submodular
function over the set of matched edges. This objective is general enough to
capture the notion of both diversity (\emph{e.g.,} a weighted coverage
function) and relevance (\emph{e.g.,} the traditional linear function)---as
well as many other natural objective functions occurring in practice
(\emph{e.g.,} limited total budget in advertising settings). We propose novel
algorithms that have provable guarantees and are essentially optimal when
restricted to various special cases. We also run experiments on real-world and
synthetic datasets to validate our algorithms.
Comment: To appear in AAAI 201
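A natural greedy baseline for this setting matches each arriving vertex to the free offline vertex with the largest marginal gain in the submodular objective. The sketch below (with an assumed topic-coverage objective in the test) illustrates the mechanics, not the paper's randomized algorithms:

```python
def online_greedy(offline, arrivals, f):
    """Match each arriving online vertex to the free offline vertex with
    the largest marginal gain in f; drop it if no edge has positive gain."""
    matched, free = [], set(offline)
    for v in arrivals:
        best, best_gain = None, 0.0
        for u in free:
            gain = f(matched + [(u, v)]) - f(matched)
            if gain > best_gain:
                best, best_gain = u, gain
        if best is not None:          # otherwise v is dropped irrevocably
            matched.append((best, v))
            free.discard(best)
    return matched
```

When f is a coverage-style function, this rule automatically trades relevance against diversity: an edge that only repeats already-covered topics contributes no marginal gain and is passed over.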
Approximate Inference for Determinantal Point Processes
In this thesis we explore a probabilistic model that is well-suited to a variety of subset selection tasks: the determinantal point process (DPP). DPPs were originally developed in the physics community to describe the repulsive interactions of fermions. More recently, they have been applied to machine learning problems such as search diversification and document summarization, which can be cast as subset selection tasks. A challenge, however, is scaling such DPP-based methods to the size of the datasets of interest to this community, and developing approximations for DPP inference tasks whose exact computation is prohibitively expensive.
A DPP defines a probability distribution over all subsets of a ground set of items. Consider the inference tasks common to probabilistic models, which include normalizing, marginalizing, conditioning, sampling, estimating the mode, and maximizing likelihood. For DPPs, exactly computing the quantities necessary for the first four of these tasks requires time cubic in the number of items or features of the items. In this thesis, we propose a means of making these four tasks tractable even in the realm where the number of items and the number of features is large. Specifically, we analyze the impact of randomly projecting the features down to a lower-dimensional space and show that the variational distance between the resulting DPP and the original is bounded. In addition to expanding the circumstances in which these first four tasks are tractable, we also tackle the other two tasks, the first of which is known to be NP-hard (with no PTAS) and the second of which is conjectured to be NP-hard. For mode estimation, we build on submodular maximization techniques to develop an algorithm with a multiplicative approximation guarantee. For likelihood maximization, we exploit the generative process associated with DPP sampling to derive an expectation-maximization (EM) algorithm. We experimentally verify the practicality of all the techniques that we develop, testing them on applications such as news and research summarization, political candidate comparison, and product recommendation.
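The mode-estimation step can be illustrated with the standard greedy that repeatedly adds the item maximizing log det of the kernel submatrix (the unnormalized log-probability under the DPP); the toy kernel below is an assumption chosen to show the repulsion effect:

```python
import numpy as np

def greedy_dpp_mode(L, k):
    """Greedily grow S to maximize log det(L_S), the unnormalized
    log-probability of S under a DPP with kernel L."""
    S = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in range(L.shape[0]):
            if i in S:
                continue
            idx = S + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:
            break
        S.append(best)
    return S

# Items 0 and 1 are near-duplicates (similarity 0.9), so the greedy
# prefers the diverse pair {0, 2} over the redundant pair {0, 1}.
L = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
print(greedy_dpp_mode(L, 2))  # -> [0, 2]
```

Because high similarity shrinks the determinant, this greedy naturally selects diverse subsets, which is the behavior the thesis's approximation guarantees formalize.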
Data Stream Algorithms for Large Graphs and High Dimensional Data
In contrast to the traditional random access memory computational model where the entire input is available in the working memory, the data stream model only provides sequential access to the input. The data stream model is a natural framework to handle large and dynamic data. In this model, we focus on designing algorithms that use sublinear memory and a small number of passes over the stream. Other desirable properties include fast update time, query time, and post processing time.
In this dissertation, we consider different problems in graph theory, combinatorial optimization, and high dimensional data processing.
The first part of this dissertation focuses on algorithms for graph theory and combinatorial optimization. We present new results for the problems of finding the densest subgraph, counting the number of triangles, finding max cut with bounded components, and finding the maximum set coverage.
The second part of this dissertation considers problems in high dimensional data streams. In this setting, each stream item consists of multiple coordinates corresponding to different attributes. We consider the problem of testing or learning about the relationships among the attributes, and the problem of finding heavy hitters in subsets of attributes.
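As a single-attribute building block for the heavy-hitters problem, the classic Misra-Gries summary finds every item of frequency above n/k in one pass using at most k - 1 counters; the dissertation's subset-of-attributes setting goes beyond this sketch:

```python
def misra_gries(stream, k):
    """One-pass heavy-hitters summary using at most k - 1 counters;
    every item with frequency > n/k is guaranteed to survive."""
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:                          # decrement all; drop zeroed keys
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# 'a' occurs 4 times in a stream of length 9, above the 9/3 = 3 threshold.
print(misra_gries("abcabcaax", 3))  # -> {'a': 2, 'x': 1}
```

Memory is O(k) regardless of stream length, which is exactly the sublinear-space regime the data stream model targets.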