    Subdeterminant Maximization via Nonconvex Relaxations and Anti-concentration

    Several fundamental problems that arise in optimization and computer science can be cast as follows: Given vectors $v_1,\ldots,v_m \in \mathbb{R}^d$ and a constraint family ${\cal B}\subseteq 2^{[m]}$, find a set $S \in {\cal B}$ that maximizes the squared volume of the simplex spanned by the vectors in $S$. A motivating example is the data-summarization problem in machine learning where one is given a collection of vectors that represent data such as documents or images. The volume of a set of vectors is used as a measure of their diversity, and partition or matroid constraints over $[m]$ are imposed in order to ensure resource or fairness constraints. Recently, Nikolov and Singh presented a convex program and showed how it can be used to estimate the value of the most diverse set when ${\cal B}$ corresponds to a partition matroid. This result was recently extended to regular matroids in works of Straszak and Vishnoi, and Anari and Oveis Gharan. The question of whether these estimation algorithms can be converted into the more useful approximation algorithms -- that also output a set -- remained open. The main contribution of this paper is to give the first approximation algorithms for both partition and regular matroids. We present novel formulations for the subdeterminant maximization problem for these matroids; this reduces them to the problem of finding a point that maximizes the absolute value of a nonconvex function over a Cartesian product of probability simplices. The technical core of our results is a new anti-concentration inequality for dependent random variables that allows us to relate the optimal value of these nonconvex functions to their value at a random point. Unlike prior work on the constrained subdeterminant maximization problem, our proofs do not rely on real-stability or convexity and could be of independent interest both in algorithms and complexity. Comment: in FOCS 201
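
To make the objective concrete, here is a minimal brute-force sketch, not the paper's algorithm. It assumes a partition matroid in which exactly one vector is chosen from each part, and it scores a set by the determinant of its Gram matrix. That determinant is the squared volume of the parallelepiped spanned by the chosen vectors, which differs from the squared simplex volume only by a constant factor when the set size is fixed. The names `det`, `gram_det`, and `best_partition_set` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def gram_det(vectors, subset):
    """Determinant of the Gram matrix of the vectors indexed by `subset`."""
    gram = [
        [sum(x * y for x, y in zip(vectors[i], vectors[j])) for j in subset]
        for i in subset
    ]
    return det(gram)


def best_partition_set(vectors, parts):
    """Try every choice of one index per part (assumed capacity 1 per part)."""
    return max(product(*parts), key=lambda s: gram_det(vectors, s))
```

For example, with `vectors = [(1, 0), (1, 1), (0, 2), (3, 0)]` and `parts = [[0, 1], [2, 3]]`, the search compares four candidate sets. The enumeration is exponential in the number of parts, which is why estimation and approximation algorithms are of interest.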

    Constrained Submodular Maximization: Beyond 1/e

    In this work, we present a new algorithm for maximizing a non-monotone submodular function subject to a general constraint. Our algorithm finds an approximate fractional solution for maximizing the multilinear extension of the function over a down-closed polytope. The approximation guarantee is 0.372 and it is the first improvement over the 1/e approximation achieved by the unified Continuous Greedy algorithm [Feldman et al., FOCS 2011].
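
As an illustration of the object being optimized, the sketch below evaluates the multilinear extension $F(x) = \sum_S f(S) \prod_{i \in S} x_i \prod_{i \notin S} (1 - x_i)$ exactly by enumerating all subsets. This is the standard definition rather than anything specific to the algorithm above, it is only practical for tiny ground sets, and the function name `multilinear_extension` and the single-edge cut example are assumptions made for illustration.

```python
from itertools import combinations


def multilinear_extension(f, x):
    """Exact F(x): expected value of f on a random set that contains
    each element i independently with probability x[i]."""
    items = sorted(x)
    total = 0.0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            chosen = frozenset(subset)
            prob = 1.0
            for i in items:
                prob *= x[i] if i in chosen else 1 - x[i]
            total += prob * f(chosen)
    return total


def single_edge_cut(s):
    """Cut function of a graph with one edge {0, 1}: non-monotone and submodular."""
    return 1 if len(s & {0, 1}) == 1 else 0
```

Calling `multilinear_extension(single_edge_cut, {0: 0.5, 1: 0.5})` evaluates the extension at a fractional point, while an integral point such as `{0: 1.0, 1: 0.0}` recovers the value of the set function itself.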

    A New Framework for Distributed Submodular Maximization

    A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. A lot of recent effort has been devoted to developing distributed algorithms for these problems. However, these results suffer from a high number of rounds, suboptimal approximation ratios, or both. We develop a framework for bringing existing algorithms in the sequential setting to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds. Our techniques also give a fast sequential algorithm for non-monotone maximization subject to a matroid constraint.

    Algorithms and Hardness for Robust Subspace Recovery

    We consider a fundamental problem in unsupervised learning called \emph{subspace recovery}: given a collection of $m$ points in $\mathbb{R}^n$, if many but not necessarily all of these points are contained in a $d$-dimensional subspace $T$, can we find it? The points contained in $T$ are called {\em inliers} and the remaining points are {\em outliers}. This problem has received considerable attention in computer science and in statistics. Yet efficient algorithms from computer science are not robust to {\em adversarial} outliers, and the estimators from robust statistics are hard to compute in high dimensions. Are there algorithms for subspace recovery that are both robust to outliers and efficient? We give an algorithm that finds $T$ when it contains more than a $\frac{d}{n}$ fraction of the points. Hence, for say $d = n/2$ this estimator is both easy to compute and well-behaved when there is a constant fraction of outliers. We prove that it is Small Set Expansion hard to find $T$ when the fraction of errors is any larger, thus giving evidence that our estimator is an {\em optimal} compromise between efficiency and robustness. As it turns out, this basic problem has a surprising number of connections to other areas including small set expansion, matroid theory and functional analysis that we make use of here. Comment: Appeared in Proceedings of COLT 201

    Near-Optimal Sensor Scheduling for Batch State Estimation: Complexity, Algorithms, and Limits

    In this paper, we focus on batch state estimation for linear systems. This problem is important in applications such as environmental field estimation, robotic navigation, and target tracking. Its difficulty lies in the fact that limited operational resources among the sensors, e.g., shared communication bandwidth or battery power, constrain the number of sensors that can be active at each measurement step. As a result, sensor scheduling algorithms must be employed. However, current sensor scheduling algorithms for batch state estimation scale poorly with the system size and the time horizon. In addition, current sensor scheduling algorithms for Kalman filtering, although they scale better, provide no performance guarantees or approximation bounds for the minimization of the batch state estimation error. In this paper, one of our main contributions is to provide an algorithm that enjoys both the estimation accuracy of the batch state scheduling algorithms and the low time complexity of the Kalman filtering scheduling algorithms. In particular: 1) our algorithm is near-optimal: it achieves a solution up to a multiplicative factor 1/2 from the optimal solution, and this factor is close to the best approximation factor 1/e one can achieve in polynomial time for this problem; 2) our algorithm has (polynomial) time complexity that is not only lower than that of the current algorithms for batch state estimation; it is also lower than, or similar to, that of the current algorithms for Kalman filtering. We achieve these results by proving two properties for our batch state estimation error metric, which quantifies the square error of the minimum variance linear estimator of the batch state vector: a) it is supermodular in the choice of the sensors; b) it has a sparsity pattern (it involves matrices that are block tri-diagonal) that facilitates its evaluation at each sensor set. Comment: Correction of typos in proof

    Optimal Approximation for Submodular and Supermodular Optimization with Bounded Curvature

    We design new approximation algorithms for the problems of optimizing submodular and supermodular functions subject to a single matroid constraint. Specifically, we consider the case in which we wish to maximize a monotone increasing submodular function or minimize a monotone decreasing supermodular function with a bounded total curvature c. Intuitively, the parameter c represents how nonlinear a function f is: when c = 0, f is linear, while for c = 1, f may be an arbitrary monotone increasing submodular function. For the case of submodular maximization with total curvature c, we obtain a (1 − c/e)-approximation, the first improvement over the greedy algorithm of Conforti and Cornuéjols from 1984, which holds for a cardinality constraint, as well as a recent analogous result for an arbitrary matroid constraint. Our approach is based on modifications of the continuous greedy algorithm and nonoblivious local search, and allows us to approximately maximize the sum of a nonnegative, monotone increasing submodular function and a (possibly negative) linear function. We show how to reduce both submodular maximization and supermodular minimization to this general problem when the objective function has bounded total curvature. We prove that the approximation results we obtain are the best possible in the value oracle model, even in the case of a cardinality constraint. We define an extension of the notion of curvature to general monotone set functions and show a (1 − c)-approximation for maximization and a 1/(1 − c)-approximation for minimization cases. Finally, we give two concrete applications of our results in the settings of maximum entropy sampling and the column-subset selection problem.
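
The abstract does not spell out the formula for c, so the sketch below assumes the standard Conforti and Cornuéjols definition of total curvature for a monotone submodular f on ground set V: c = 1 − min over j of [f(V) − f(V \ {j})] / f({j}), taken over elements with f({j}) > 0. The function names and the small coverage instance are hypothetical and only illustrate how c interpolates between the linear case and the general case.

```python
def total_curvature(f, ground):
    """Total curvature of a monotone submodular set function f
    (assumed definition: 1 - min_j [f(V) - f(V - {j})] / f({j}))."""
    ground = frozenset(ground)
    full_value = f(ground)
    ratios = []
    for j in ground:
        single = f(frozenset([j]))
        if single > 0:  # skip elements that contribute nothing on their own
            ratios.append((full_value - f(ground - {j})) / single)
    return 1 - min(ratios) if ratios else 0


# Hypothetical coverage instance: element j covers the items in COVERS[j].
COVERS = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"d"}}


def coverage(s):
    """Number of distinct items covered by the chosen elements."""
    covered = set()
    for j in s:
        covered |= COVERS[j]
    return len(covered)


def linear(s):
    """A linear (modular) function: its curvature should be 0."""
    weights = {0: 2, 1: 3, 2: 5}
    return sum(weights[j] for j in s)
```

Evaluating `total_curvature(coverage, COVERS)` and `total_curvature(linear, [0, 1, 2])` contrasts a function with overlapping coverage against a linear one; plugging the resulting c into 1 − c/e gives the guarantee quoted above for that instance.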
