57 research outputs found
A New Framework for Distributed Submodular Maximization
A wide variety of problems in machine learning, including exemplar
clustering, document summarization, and sensor placement, can be cast as
constrained submodular maximization problems. Much recent effort has been devoted to developing distributed algorithms for these problems. However, these results suffer from a high number of rounds, suboptimal approximation ratios, or both. We develop a framework for bringing existing algorithms in the sequential setting to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds. Our techniques also give a fast sequential algorithm for non-monotone maximization subject to a matroid constraint.
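The sequential baseline that such distributed frameworks typically build on is the classic greedy algorithm for monotone submodular maximization under a cardinality constraint. A minimal sketch, assuming a toy coverage objective (the function and data below are illustrative, not from the paper):

```python
def greedy_max(f, ground, k):
    """Classic greedy for monotone submodular maximization with |S| <= k."""
    S = set()
    for _ in range(k):
        best, best_gain = None, 0.0
        for e in ground - S:
            gain = f(S | {e}) - f(S)   # marginal gain of adding e
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:               # no positive marginal gain remains
            break
        S.add(best)
    return S

# Example: maximum coverage, a canonical monotone submodular objective.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)) if S else set())
print(greedy_max(cover, set(sets), 2))  # selects items 3 and 1, covering all 5 elements
```

This greedy achieves a 1 - 1/e approximation for monotone objectives; the paper's contribution is running such procedures in a constant number of MapReduce rounds.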
Enhancing massive MIMO: A new approach for Uplink training based on heterogeneous coherence time
Massive multiple-input multiple-output (MIMO) is one of the key technologies
in future generation networks. Owing to their considerable spectral and energy
efficiency gains, massive MIMO systems provide the needed performance to cope
with the ever-increasing wireless capacity demand. Nevertheless, the number of
scheduled users stays limited in massive MIMO both in time division duplexing
(TDD) and frequency division duplexing (FDD) systems. This is due to the
limited coherence time, in TDD systems, and to limited feedback capacity, in
FDD mode. In current systems, the time slot duration in TDD mode is the same
for all users. This is a suboptimal approach since users are subject to
heterogeneous Doppler spreads and, consequently, different coherence times. In
this paper, we investigate a massive MIMO system operating in TDD mode in which the frequency of uplink training differs among users based on their
actual channel coherence times. We argue that optimizing uplink training by
exploiting this diversity can lead to considerable spectral efficiency gain. We
then provide a user scheduling algorithm that exploits coherence-interval-based grouping in order to maximize the achievable weighted sum rate.
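As a rough illustration of coherence-time-based grouping (not the paper's scheduling algorithm), one can assign each user a training period derived from its Doppler spread. The 1/(2 f_D) coherence-time rule of thumb, the carrier frequency, and the power-of-two periods below are all assumptions:

```python
# Sketch: assign slower users longer uplink-training periods.
C = 3e8          # speed of light (m/s)
FC = 3.5e9       # assumed carrier frequency (Hz)

def coherence_time(speed_mps):
    f_d = speed_mps * FC / C       # maximum Doppler shift
    return 1.0 / (2.0 * f_d)       # rule-of-thumb coherence time (s)

def group_by_coherence(user_speeds, slot_s=1e-3):
    """Give each user the largest power-of-two training period (in slots)
    that still fits inside its channel coherence time."""
    groups = {}
    for uid, v in user_speeds.items():
        period = 1
        while period * 2 * slot_s <= coherence_time(v):
            period *= 2
        groups.setdefault(period, []).append(uid)
    return groups

# A pedestrian's channel stays coherent far longer than a vehicle's,
# so the pedestrian is retrained much less often.
print(group_by_coherence({"pedestrian": 1.0, "vehicle": 30.0}))
```

Grouping users by training period in this way frees slots that a uniform retraining schedule would waste on slow-moving users.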
LQG Control and Sensing Co-Design
We investigate a Linear-Quadratic-Gaussian (LQG) control and sensing
co-design problem, where one jointly designs sensing and control policies. We
focus on the realistic case where the sensing design is selected among a finite
set of available sensors, where each sensor is associated with a different cost
(e.g., power consumption). We consider two dual problem instances:
sensing-constrained LQG control, where one maximizes control performance
subject to a sensor cost budget, and minimum-sensing LQG control, where one
minimizes sensor cost subject to performance constraints. We prove that no polynomial-time algorithm can guarantee a constant approximation factor from the optimum across all problem instances. Nonetheless, we present the first polynomial-time algorithms with per-instance suboptimality guarantees. To this
end, we leverage a separation principle that partially decouples the design of
sensing and control. Then, we frame LQG co-design as the optimization of
approximately supermodular set functions; we develop novel algorithms to solve
the problems; and we prove original results on the performance of the
algorithms, and establish connections between their suboptimality and
control-theoretic quantities. We conclude the paper by discussing two
applications, namely, sensing-constrained formation control and
resource-constrained robot navigation.
Comment: Accepted to IEEE TAC. Includes contributions to submodular function optimization literature, and extends conference paper arXiv:1709.0882
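For intuition on the sensing-constrained side, a cost-benefit greedy over sensors can be sketched on a toy estimation problem (independent scalar states with additive measurement information). The sensor model, names, and numbers are illustrative assumptions; the paper's algorithms instead optimize the full LQG cost:

```python
def posterior_error(selected, prior_var, sensors):
    """Sum of posterior variances of independent scalar states after
    fusing the information of the selected sensors."""
    info = {s: 1.0 / v for s, v in prior_var.items()}   # prior information
    for name in selected:
        info[sensors[name]["state"]] += 1.0 / sensors[name]["noise"]
    return sum(1.0 / i for i in info.values())

def greedy_under_budget(prior_var, sensors, budget):
    """Repeatedly buy the affordable sensor with the best
    error-reduction-per-cost ratio."""
    chosen, spent = set(), 0.0
    while True:
        err = posterior_error(chosen, prior_var, sensors)
        best, best_ratio = None, 0.0
        for name, s in sensors.items():
            if name in chosen or spent + s["cost"] > budget:
                continue
            gain = err - posterior_error(chosen | {name}, prior_var, sensors)
            if gain / s["cost"] > best_ratio:
                best, best_ratio = name, gain / s["cost"]
        if best is None:
            return chosen
        chosen.add(best)
        spent += sensors[best]["cost"]
```

The approximate supermodularity the paper establishes is what lets guarantees of this greedy flavor carry over to the true control-theoretic objective.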
Submodular Secretary Problem with Shortlists under General Constraints
In the submodular k-secretary problem, the goal is to select k items from a randomly ordered input so as to maximize the expected value of a given monotone submodular function on the set of selected items. In this paper, we introduce a relaxation of this problem, which we refer to as the submodular k-secretary problem with shortlists. In the proposed setting, the algorithm is allowed to choose more than k items as part of a shortlist. Then, after seeing the entire input, the algorithm can choose a subset of size k from the bigger set of items in the shortlist. We are interested in understanding to what extent this relaxation can improve the achievable competitive ratio for the submodular k-secretary problem. In particular, using an O(k) shortlist, can an online algorithm achieve a competitive ratio close to the best achievable online approximation factor for this problem? We answer this question affirmatively by giving a polynomial time algorithm that achieves a 1 - 1/e - epsilon - O(k^{-1}) competitive ratio for any constant epsilon > 0, using a shortlist of size eta_epsilon(k) = O(k). Also, for the special case of m-submodular functions, we demonstrate an algorithm that achieves a 1 - epsilon competitive ratio for any constant epsilon > 0, using an O(1) shortlist. Finally, we show that our algorithm can be implemented in the streaming setting using a memory buffer of size eta_epsilon(k) = O(k) to achieve a 1 - 1/e - epsilon - O(k^{-1}) approximation for submodular function maximization in the random order streaming model. This substantially improves upon the previously best known approximation factor of 1/2 + 8*10^{-14} [Norouzi-Fard et al. 2018], which used a memory buffer of size O(k log k).
We further generalize our results to the case of matroid constraints. We design an algorithm that achieves a (1/2)(1 - 1/e^2 - epsilon - O(1/k)) competitive ratio for any constant epsilon > 0, using a shortlist of size O(k). This is especially surprising considering that the best known competitive ratio for the matroid secretary problem is O(log log k). An important application of our algorithm is the random order streaming of submodular functions. We show that our algorithm can be implemented in the streaming setting using O(k) memory, achieving a (1/2)(1 - 1/e^2 - epsilon - O(1/k)) approximation. The previously best known approximation ratio for streaming submodular maximization under a matroid constraint is 0.25 (in adversarial order), due to [Feldman et al.], [Chekuri et al.], and [Chakrabarti et al.]. Moreover, we generalize our results to the case of p-matchoid constraints and give a (1/(p+1))(1 - 1/e^{p+1} - epsilon - O(1/k)) approximation using O(k) memory, which asymptotically approaches the best known offline guarantee of 1/(p+1) [Nemhauser et al.]. Finally, we empirically evaluate our results on real-world data sets such as YouTube videos and Twitter streams.
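For context, a one-pass thresholding rule in the spirit of memory-bounded streaming submodular maximization can be sketched as follows; this is a generic baseline with an assumed threshold tau, not the shortlist algorithm of the paper:

```python
def threshold_stream(stream, f, k, tau):
    """One pass over the stream: keep an item iff its marginal gain
    under f clears tau, holding at most k items in memory."""
    S = []
    for e in stream:
        if len(S) == k:
            break
        if f(S + [e]) - f(S) >= tau:
            S.append(e)
    return S

# Toy coverage objective over a stream of item ids.
sets = {1: {"a"}, 2: {"a", "b"}, 3: {"c"}, 4: {"d", "e", "f"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)) if S else set())
print(threshold_stream([1, 2, 3, 4], cover, 2, 2))  # -> [2, 4]
```

The shortlist idea relaxes exactly this kind of rule: by provisionally keeping O(k) items instead of committing to k, the final subset can be chosen offline from the shortlist.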
Scheduling to minimize power consumption using submodular functions
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 59-64).
We develop logarithmic approximation algorithms for extremely general formulations of multiprocessor multi-interval offline task scheduling to minimize power usage. Here each processor has an arbitrary specified power consumption to be turned on for each possible time interval, and each job has a specified list of time interval/processor pairs during which it could be scheduled. (A processor need not be in use for an entire interval it is turned on.) If there is a feasible schedule, our algorithm finds a feasible schedule with total power usage within an O(log n) factor of optimal, where n is the number of jobs. (Even in a simple setting with one processor, the problem is Set-Cover hard.) If not all jobs can be scheduled and each job has a specified value, then our algorithm finds a schedule of value at least (1 - epsilon)Z and power usage within an O(log(1/epsilon)) factor of the optimal schedule of value at least Z, for any specified Z and epsilon > 0. At the foundation of our work is a general framework for logarithmic approximation to maximizing any submodular function subject to budget constraints. We also introduce the online version of this scheduling problem and show its relation to the classical secretary problem. In order to obtain constant competitive algorithms for this online version, we study the secretary problem with a submodular utility function. We present several constant competitive algorithms for the secretary problem with different kinds of utility functions.
by Morteza Zadimoghaddam. S.M.
Test Score Algorithms for Budgeted Stochastic Utility Maximization
Motivated by recent developments in designing algorithms based on individual
item scores for solving utility maximization problems, we study the framework
of using test scores, defined as a statistic of observed individual item
performance data, for solving the budgeted stochastic utility maximization
problem. We extend an existing scoring mechanism, namely the replication test
scores, to incorporate heterogeneous item costs as well as item values. We show
that a natural greedy algorithm that selects items solely based on their
replication test scores outputs solutions within a constant factor of the
optimum for a broad class of utility functions. Our algorithms and
approximation guarantees assume that test scores are noisy estimates of certain
expected values with respect to marginal distributions of individual item
values, thus making our algorithms practical and extending previous work that
assumes noiseless estimates. Moreover, we show how our algorithm can be adapted
to the setting where items arrive in a streaming fashion while maintaining the
same approximation guarantee. We present numerical results, using synthetic
data and data sets from the Academia.StackExchange Q&A forum, which show that our test score algorithm can be competitive with, and in some cases outperform, a benchmark algorithm that requires access to a value oracle to evaluate function values.
Balancing Relevance and Diversity in Online Bipartite Matching via Submodularity
In bipartite matching problems, vertices on one side of a bipartite graph are
paired with those on the other. In its online variant, one side of the graph is
available offline, while the vertices on the other side arrive online. When a
vertex arrives, an irrevocable and immediate decision should be made by the
algorithm; either match it to an available vertex or drop it. Examples of such
problems include matching workers to firms, advertisers to keywords, organs to
patients, and so on. Much of the literature focuses on maximizing the total
relevance---modeled via total weight---of the matching. However, in many
real-world problems, it is also important to consider contributions of
diversity: hiring a diverse pool of candidates, displaying a relevant but
diverse set of ads, and so on. In this paper, we propose the Online Submodular
Bipartite Matching (\osbm) problem, where the goal is to maximize a submodular
function over the set of matched edges. This objective is general enough to
capture the notion of both diversity (\emph{e.g.,} a weighted coverage
function) and relevance (\emph{e.g.,} the traditional linear function)---as
well as many other natural objective functions occurring in practice
(\emph{e.g.,} limited total budget in advertising settings). We propose novel
algorithms that have provable guarantees and are essentially optimal when
restricted to various special cases. We also run experiments on real-world and
synthetic datasets to validate our algorithms.
Comment: To appear in AAAI 201
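A natural greedy baseline for this setting matches each arriving vertex to the free offline vertex with the largest marginal gain in the submodular objective. The sketch below (with an assumed topic-coverage objective in the test) illustrates the mechanics, not the paper's randomized algorithms:

```python
def online_greedy(offline, arrivals, f):
    """Match each arriving online vertex to the free offline vertex with
    the largest marginal gain in f; drop it if no edge has positive gain."""
    matched, free = [], set(offline)
    for v in arrivals:
        best, best_gain = None, 0.0
        for u in free:
            gain = f(matched + [(u, v)]) - f(matched)
            if gain > best_gain:
                best, best_gain = u, gain
        if best is not None:          # otherwise v is dropped irrevocably
            matched.append((best, v))
            free.discard(best)
    return matched
```

When f is a coverage-style function, this rule automatically trades relevance against diversity: an edge that only repeats already-covered topics contributes no marginal gain and is passed over.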
Approximate Inference for Determinantal Point Processes
In this thesis we explore a probabilistic model that is well-suited to a variety of subset selection tasks: the determinantal point process (DPP). DPPs were originally developed in the physics community to describe the repulsive interactions of fermions. More recently, they have been applied to machine learning problems such as search diversification and document summarization, which can be cast as subset selection tasks. A challenge, however, is scaling such DPP-based methods to the size of the datasets of interest to this community, and developing approximations for DPP inference tasks whose exact computation is prohibitively expensive.
A DPP defines a probability distribution over all subsets of a ground set of items. Consider the inference tasks common to probabilistic models, which include normalizing, marginalizing, conditioning, sampling, estimating the mode, and maximizing likelihood. For DPPs, exactly computing the quantities necessary for the first four of these tasks requires time cubic in the number of items or features of the items. In this thesis, we propose a means of making these four tasks tractable even in the realm where the number of items and the number of features is large. Specifically, we analyze the impact of randomly projecting the features down to a lower-dimensional space and show that the variational distance between the resulting DPP and the original is bounded. In addition to expanding the circumstances in which these first four tasks are tractable, we also tackle the other two tasks, the first of which is known to be NP-hard (with no PTAS) and the second of which is conjectured to be NP-hard. For mode estimation, we build on submodular maximization techniques to develop an algorithm with a multiplicative approximation guarantee. For likelihood maximization, we exploit the generative process associated with DPP sampling to derive an expectation-maximization (EM) algorithm. We experimentally verify the practicality of all the techniques that we develop, testing them on applications such as news and research summarization, political candidate comparison, and product recommendation.
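The mode-estimation step can be illustrated with the standard greedy that repeatedly adds the item maximizing log det of the kernel submatrix (the unnormalized log-probability under the DPP); the toy kernel below is an assumption chosen to show the repulsion effect:

```python
import numpy as np

def greedy_dpp_mode(L, k):
    """Greedily grow S to maximize log det(L_S), the unnormalized
    log-probability of S under a DPP with kernel L."""
    S = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in range(L.shape[0]):
            if i in S:
                continue
            idx = S + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:
            break
        S.append(best)
    return S

# Items 0 and 1 are near-duplicates (similarity 0.9), so the greedy
# prefers the diverse pair {0, 2} over the redundant pair {0, 1}.
L = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
print(greedy_dpp_mode(L, 2))  # -> [0, 2]
```

Because high similarity shrinks the determinant, this greedy naturally selects diverse subsets, which is the behavior the thesis's approximation guarantees formalize.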
Data Stream Algorithms for Large Graphs and High Dimensional Data
In contrast to the traditional random access memory computational model where the entire input is available in the working memory, the data stream model only provides sequential access to the input. The data stream model is a natural framework to handle large and dynamic data. In this model, we focus on designing algorithms that use sublinear memory and a small number of passes over the stream. Other desirable properties include fast update time, query time, and post processing time.
In this dissertation, we consider different problems in graph theory, combinatorial optimization, and high dimensional data processing.
The first part of this dissertation focuses on algorithms for graph theory and combinatorial optimization. We present new results for the problems of finding the densest subgraph, counting the number of triangles, finding max cut with bounded components, and finding the maximum set coverage.
The second part of this dissertation considers problems in high dimensional data streams. In this setting, each stream item consists of multiple coordinates corresponding to different attributes. We consider the problem of testing or learning about the relationships among the attributes, and the problem of finding heavy hitters in subsets of attributes.
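As a single-attribute building block for the heavy-hitters problem, the classic Misra-Gries summary finds every item of frequency above n/k in one pass using at most k - 1 counters; the dissertation's subset-of-attributes setting goes beyond this sketch:

```python
def misra_gries(stream, k):
    """One-pass heavy-hitters summary using at most k - 1 counters;
    every item with frequency > n/k is guaranteed to survive."""
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:                          # decrement all; drop zeroed keys
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# 'a' occurs 4 times in a stream of length 9, above the 9/3 = 3 threshold.
print(misra_gries("abcabcaax", 3))  # -> {'a': 2, 'x': 1}
```

Memory is O(k) regardless of stream length, which is exactly the sublinear-space regime the data stream model targets.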