
    Differentially Private Empirical Risk Minimization with Sparsity-Inducing Norms

    Differentially private learning is concerned with prediction quality while limiting the privacy impact on individuals whose information is contained in the data. We consider differentially private risk minimization problems with regularizers that induce structured sparsity; these regularizers are convex but often non-differentiable. We analyze standard differentially private algorithms, such as output perturbation, Frank-Wolfe and objective perturbation. Output perturbation is a differentially private algorithm that is known to perform well for minimizing risks that are strongly convex, and previous works have derived excess risk bounds for it that are independent of the dimensionality. In this paper, we assume a particular class of convex but non-smooth regularizers that induce structured sparsity, together with loss functions for generalized linear models. We also consider differentially private Frank-Wolfe algorithms that optimize the dual of the risk minimization problem. We derive excess risk bounds for both of these algorithms; both bounds depend on the Gaussian width of the unit ball of the dual norm. We also show that objective perturbation of the risk minimization problem is equivalent to output perturbation of a dual optimization problem. This is the first work to analyze the dual optimization problems of risk minimization in the context of differential privacy.
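
    The output perturbation primitive mentioned above is simple to state: solve the strongly convex regularized ERM problem, then release the minimizer plus noise calibrated to its L2 sensitivity. The following minimal sketch (with a logistic loss and the Gaussian mechanism) illustrates the generic primitive rather than the paper's sparsity-inducing setting; the function names and constants are illustrative assumptions.

    import numpy as np

    def logistic_erm(X, y, lam, steps=2000, lr=0.1):
        """Gradient descent on (1/n) sum_i log(1 + exp(-y_i w.x_i)) + (lam/2)||w||^2."""
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(steps):
            margins = y * (X @ w)
            grad = -(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0) + lam * w
            w -= lr * grad
        return w

    def output_perturbation(X, y, lam, eps, delta):
        w = logistic_erm(X, y, lam)
        # For a 1-Lipschitz loss and a lam-strongly convex objective, the L2
        # sensitivity of the minimizer is 2 / (n * lam).
        sensitivity = 2.0 / (X.shape[0] * lam)
        # Gaussian mechanism (valid for eps <= 1): noise scaled to sensitivity.
        sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / eps
        return w + np.random.normal(0.0, sigma, size=w.shape)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1))[:, None]   # enforce ||x_i|| <= 1
    y = np.sign(X @ rng.normal(size=10) + 0.1 * rng.normal(size=500))
    w_priv = output_perturbation(X, y, lam=0.1, eps=1.0, delta=1e-5)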

    Active-set Methods for Submodular Minimization Problems

    We consider the submodular function minimization (SFM) problem and quadratic minimization problems regularized by the Lovász extension of a submodular function. These optimization problems are intimately related; for example, min-cut problems and total variation denoising problems, where the cut function is submodular and its Lovász extension is given by the associated total variation. When a quadratic loss is regularized by the total variation of a cut function, it thus becomes a total variation denoising problem, and we use the same terminology in this paper for “general” submodular functions. We propose a new active-set algorithm for total variation denoising under the assumption of an oracle that solves the corresponding SFM problem. This can be seen as a local descent algorithm over ordered partitions with explicit convergence guarantees. It is more flexible than existing algorithms, with the ability to warm-restart from the solution of a closely related problem. Further, we consider the case where the submodular function can be decomposed into the sum of two submodular functions F1 and F2, and assume SFM oracles for these two functions. We propose a new active-set algorithm for total variation denoising (and hence for SFM, by thresholding the solution at zero). This algorithm also performs local descent over ordered partitions, and its ability to warm start considerably improves its performance. In our experiments, we compare the proposed algorithms with state-of-the-art algorithms, showing that they reduce the number of calls to the SFM oracles.
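
    As background for the objects above, the Lovász extension of a submodular function F can be evaluated with the greedy algorithm: sort the coordinates of w in decreasing order and accumulate the marginal gains of F along the resulting chain of level sets. The sketch below (illustrative background, not the paper's active-set method) also checks the stated fact that for an undirected cut function the Lovász extension equals the total variation.

    import numpy as np

    def lovasz_extension(F, w):
        """Greedy evaluation: sort coordinates of w in decreasing order and
        accumulate marginal gains of F along the chain of level sets."""
        order = np.argsort(-w)
        S, value, prev = [], 0.0, 0.0
        for j in order:
            S.append(j)
            FS = F(frozenset(S))
            value += w[j] * (FS - prev)
            prev = FS
        return value

    # Undirected cut function of a small weighted graph (a submodular function).
    edges = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 0.5}
    def cut(S):
        return sum(a for (i, j), a in edges.items() if (i in S) != (j in S))

    w = np.array([0.3, -1.2, 0.7])
    tv = sum(a * abs(w[i] - w[j]) for (i, j), a in edges.items())
    assert np.isclose(lovasz_extension(cut, w), tv)  # Lovász extension of a cut = TV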

    The Graph Cut Kernel for Ranked Data

    Many algorithms for ranked data become computationally intractable as the number of objects grows, due to the complex geometric structure induced by rankings. An additional challenge is posed by partial rankings, i.e. rankings in which the preference is only known for a subset of all objects. For these reasons, state-of-the-art methods cannot scale to real-world applications such as recommender systems. We address this challenge by exploiting the geometric structure of ranked data, together with additional available information about the objects, to derive a kernel for ranking based on the graph cut function. The graph cut kernel combines the efficiency of submodular optimization with the theoretical properties of kernel-based methods.

    Sliced Multi-Marginal Optimal Transport

    Multi-marginal optimal transport enables one to compare multiple probability measures, which increasingly finds application in multi-task learning problems. One practical limitation of multi-marginal transport is its computational scalability in the number of measures, samples and dimensions. In this work, we propose a multi-marginal optimal transport paradigm based on random one-dimensional projections, whose (generalized) distance we term the sliced multi-marginal Wasserstein distance. To construct this distance, we introduce a characterization of the one-dimensional multi-marginal Kantorovich problem and use it to highlight a number of properties of the sliced multi-marginal Wasserstein distance. In particular, we show that (i) the sliced multi-marginal Wasserstein distance is a (generalized) metric that induces the same topology as the standard Wasserstein distance, (ii) it admits a dimension-free sample complexity, and (iii) it is tightly connected with the problem of barycentric averaging under the sliced-Wasserstein metric. We conclude by illustrating the sliced multi-marginal Wasserstein distance on multi-task density estimation and multi-dynamics reinforcement learning problems.
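
    A sliced quantity of this kind can be estimated by straightforward Monte Carlo: project every measure onto random one-dimensional directions, couple the projections comonotonically by sorting, and average a transport cost over directions. The sketch below uses a pairwise squared cost between the coupled quantiles; it illustrates the slicing mechanism under assumed conventions, not the paper's exact definition.

    import numpy as np

    def sliced_mmw(measures, n_proj=200, seed=0):
        """Average pairwise squared cost of comonotonically coupled projections."""
        rng = np.random.default_rng(seed)
        d = measures[0].shape[1]
        total = 0.0
        for _ in range(n_proj):
            theta = rng.normal(size=d)
            theta /= np.linalg.norm(theta)
            # In 1D, sorting each marginal yields the comonotone coupling,
            # which is optimal for convex pairwise costs.
            q = np.stack([np.sort(m @ theta) for m in measures])   # (P, n)
            diffs = q[:, None, :] - q[None, :, :]                  # (P, P, n)
            # Mean over all ordered pairs (the i = j terms are zero) and samples.
            total += (diffs ** 2).mean()
        return total / n_proj

    rng = np.random.default_rng(1)
    measures = [rng.normal(loc=m, scale=1.0, size=(300, 5)) for m in (0.0, 0.5, 1.0)]
    print(sliced_mmw(measures))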

    Maximizing submodular functions using probabilistic graphical models

    We consider the problem of maximizing submodular functions; while this problem is known to be NP-hard, several numerically efficient local search techniques with approximation guarantees are available. In this paper, we propose a novel convex relaxation which is based on the relationship between submodular functions, entropies and probabilistic graphical models. In a graphical model, the entropy of the joint distribution decomposes as a sum of marginal entropies of subsets of variables; moreover, for any distribution, the entropy of the closest distribution factorizing in the graphical model provides an upper bound on the entropy. For directed graphical models, this last property turns out to be a direct consequence of the submodularity of the entropy function, and it allows the generalization of graphical-model-based upper bounds to arbitrary submodular functions. These upper bounds may then be jointly maximized with respect to a set and minimized with respect to the graph, leading to a convex variational inference scheme for maximizing submodular functions, based on outer approximations of the marginal polytope and maximum-likelihood bounded-treewidth structures. By considering graphs of increasing treewidth, we may then explore the trade-off between computational complexity and tightness of the relaxation. We also present extensions to constrained problems and to maximizing the difference of submodular functions, which together cover all possible set functions.
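
    For reference, the classical local-search baseline alluded to above is short to state: repeatedly add or remove single elements while the objective strictly improves. The sketch below is plain hill-climbing on a small coverage function (coverage functions are submodular); it illustrates that baseline, not the paper's convex variational relaxation.

    def local_search_max(F, ground_set):
        """Flip single elements in or out while the objective strictly improves."""
        S = set()
        improved = True
        while improved:
            improved = False
            for e in ground_set:
                T = S ^ {e}          # add e if absent, remove it if present
                if F(T) > F(S):
                    S, improved = T, True
        return S

    # Example: a small coverage function (coverage functions are submodular).
    sets = {0: {1, 2}, 1: {2, 3, 4}, 2: {4, 5}, 3: {1, 5}}
    F = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
    print(local_search_max(F, sets.keys()))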

    Convex Relaxations for Learning Bounded-Treewidth Decomposable Graphs

    We consider the problem of learning the structure of undirected graphical models with bounded treewidth, within the maximum likelihood framework. This is an NP-hard problem and most approaches consider local search techniques. In this paper, we pose it as a combinatorial optimization problem, which is then relaxed to a convex optimization problem that involves searching over the forest and hyperforest polytopes with special structures. A supergradient method is used to solve the dual problem, with a run-time complexity of O(k^3 n^(k+2) log n) per iteration, where n is the number of variables and k is a bound on the treewidth. We compare our approach to state-of-the-art methods on synthetic datasets and classical benchmarks, showing the gains of the novel convex approach.
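
    For intuition about the underlying maximum-likelihood problem, the treewidth-1 special case is classical: the maximum-likelihood tree is the Chow-Liu tree, i.e. a maximum-weight spanning tree on pairwise mutual information. The sketch below implements this special case for binary data with plug-in estimates; the paper's convex relaxation targets general treewidth k.

    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree

    def chow_liu_edges(X):
        """X: (n_samples, n_vars) binary array; returns the Chow-Liu tree edges."""
        n, d = X.shape
        mi = np.zeros((d, d))
        for i in range(d):
            for j in range(i + 1, d):
                for a in (0, 1):
                    for b in (0, 1):
                        pij = np.mean((X[:, i] == a) & (X[:, j] == b))
                        pi, pj = np.mean(X[:, i] == a), np.mean(X[:, j] == b)
                        if pij > 0:
                            mi[i, j] += pij * np.log(pij / (pi * pj))
        # Maximum-weight spanning tree = minimum spanning tree on negated weights.
        mst = minimum_spanning_tree(-mi)
        return list(zip(*mst.nonzero()))

    rng = np.random.default_rng(0)
    x0 = rng.integers(0, 2, 1000)
    x1 = x0 ^ (rng.random(1000) < 0.1)      # noisy copy of x0
    x2 = x1 ^ (rng.random(1000) < 0.1)      # noisy copy of x1
    X = np.stack([x0, x1, x2], axis=1)
    print(chow_liu_edges(X))                # expect the chain edges (0, 1), (1, 2)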

    Learning to Segment Document Images

    A hierarchical framework that poses document segmentation as an optimization problem is proposed. Unlike traditional document segmentation algorithms, the model incorporates the dependencies between the various levels of the hierarchy.