
    VecHGrad for Solving Accurately Complex Tensor Decomposition

    Tensor decompositions, a collection of factorization techniques for multidimensional arrays, are among the most general and powerful tools for scientific analysis. However, because of their increasing size, today's data sets require more complex tensor decompositions, involving factorization with multiple matrices and diagonal tensors such as DEDICOM or PARATUCK2. Traditional resolution algorithms, such as Stochastic Gradient Descent (SGD), Non-linear Conjugate Gradient descent (NCG) or Alternating Least Squares (ALS), cannot easily be applied to complex tensor decompositions, or often lead to poor accuracy at convergence. We propose a new resolution algorithm, called VecHGrad, for accurate and efficient stochastic resolution over all existing tensor decompositions, designed specifically for complex decompositions. VecHGrad relies on the gradient, the Hessian-vector product and an adaptive line search to ensure convergence during optimization. Our experiments on five real-world data sets against state-of-the-art deep learning gradient optimization models show that VecHGrad converges considerably faster because of its superior theoretical convergence rate per step; VecHGrad is therefore also relevant as a deep learning optimizer. The experiments are performed for various tensor decompositions, including CP, DEDICOM and PARATUCK2. Although it involves a slightly more complex update rule, VecHGrad's runtime is similar in practice to that of gradient methods such as SGD, Adam or RMSProp.
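
    The abstract does not say how the Hessian-vector product is formed; a common matrix-free approach approximates it with a central finite difference of the gradient. The sketch below uses that trick on a stand-in quadratic loss (the function and names are illustrative, not the paper's code).

```python
import numpy as np

def hvp(grad, x, v, eps=1e-6):
    """Matrix-free Hessian-vector product: H(x) @ v is approximated by
    a central finite difference of the gradient,
    (g(x + eps*v) - g(x - eps*v)) / (2*eps), so the Hessian is never formed."""
    return (grad(x + eps * v) - grad(x - eps * v)) / (2 * eps)

# Stand-in quadratic loss f(x) = 0.5 * ||A x - b||^2, whose exact
# Hessian is A.T @ A, used to check the approximation.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
grad = lambda x: A.T @ (A @ x - b)

x, v = rng.standard_normal(5), rng.standard_normal(5)
print(np.allclose(hvp(grad, x, v), A.T @ (A @ v), atol=1e-4))  # True
```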

    A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization

    Linear optimization is often algorithmically simpler than non-linear convex optimization. Linear optimization over matroid polytopes, matching polytopes and path polytopes are examples of problems for which we have simple and efficient combinatorial algorithms, but whose non-linear convex counterparts are harder and admit significantly less efficient algorithms. This motivates the computational model of convex optimization, including the offline, online and stochastic settings, using a linear optimization oracle. In this computational model we give several new results that improve over the previous state of the art. Our main result is a novel conditional gradient algorithm for smooth and strongly convex optimization over polyhedral sets that performs only a single linear optimization step over the domain at each iteration and enjoys a linear convergence rate. This gives an exponential improvement in convergence rate over previous results. Based on this new conditional gradient algorithm, we give the first algorithms for online convex optimization over polyhedral sets that perform only a single linear optimization step over the domain while having optimal regret guarantees, answering an open question of Kalai and Vempala, and of Hazan and Kale. Our online algorithms also imply conditional gradient algorithms for non-smooth and stochastic convex optimization with the same convergence rates as projected (sub)gradient methods.
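
    As a hedged illustration of the single linear-optimization step per iteration, here is a minimal Frank-Wolfe (conditional gradient) loop over the probability simplex, where the linear oracle reduces to picking the vertex with the smallest gradient coordinate; the quadratic objective is an arbitrary stand-in, not the paper's setting.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, iters=100):
    """Conditional gradient: each iteration calls only a linear
    optimization oracle over the simplex (argmin_s <grad, s>)."""
    x = x0.copy()
    for t in range(iters):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0        # linear oracle: best simplex vertex
        gamma = 2.0 / (t + 2.0)      # classic step size
        x = (1 - gamma) * x + gamma * s
    return x

# Minimize ||x - c||^2 over the simplex (c is an arbitrary target).
c = np.array([0.1, 0.7, 0.2])
x = frank_wolfe_simplex(lambda x: 2 * (x - c), np.ones(3) / 3)
print(x)  # close to c, which already lies in the simplex
```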

    Decomposition algorithms for globally solving mathematical programs with affine equilibrium constraints

    A mathematical programming problem with affine equilibrium constraints (AMPEC) is a bilevel programming problem whose lower-level problem is a parametric affine variational inequality. We formulate some classes of bilevel programming in the form of an MPEC. We then use a regularization technique to formulate the resulting problem as a mathematical program with an additional constraint defined by the difference of two convex functions (a DC function). A main feature of this DC decomposition is that the second component depends only on the parameter of the lower-level problem. This property allows us to develop branch-and-bound algorithms for globally solving AMPEC in which the adaptive rectangular bisection takes place only in the parameter space. As an example, we use the proposed algorithm to solve a bilevel Nash-Cournot equilibrium market model. Computational results show the efficiency of the proposed algorithm.
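
    Hedging on notation (the paper's symbols are not shown in the abstract), the DC structure it exploits can be sketched generically: the regularized constraint splits into a difference of convex functions whose second part depends only on the lower-level parameter, so branch-and-bound bisection can be confined to that parameter's space.

```latex
% Generic DC-decomposition sketch; the symbols g, h, x, y are assumed
% here, not taken from the paper.
\[
  \varphi(x, y) \;=\; \underbrace{g(x, y)}_{\text{convex}}
  \;-\; \underbrace{h(y)}_{\text{convex, depends only on the parameter } y}
\]
```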

    Fast and Globally Optimal Rigid Registration of 3D Point Sets by Transformation Decomposition

    The rigid registration of two 3D point sets is a fundamental problem in computer vision. The current trend is to solve this problem globally using the branch-and-bound (BnB) optimization framework. However, the existing global methods are slow for two main reasons: the computational complexity of BnB is exponential in the problem dimensionality (which is six for 3D rigid registration), and the bound evaluation used in BnB is inefficient. In this paper, we propose two techniques to address these problems. First, we introduce the idea of translation invariant vectors, which allows us to decompose the search for a 6D rigid transformation into a search for a 3D rotation followed by a search for a 3D translation, each solved by a separate BnB algorithm. This transformation decomposition reduces the problem dimensionality of the BnB algorithms and substantially improves their efficiency. Second, we propose a new data structure, named 3D Integral Volume, to accelerate the bound evaluation in both BnB algorithms. By combining these two techniques, we implement an efficient algorithm for rigid registration of 3D point sets. Extensive experiments on both synthetic and real data show that the proposed algorithm is three orders of magnitude faster than the existing state-of-the-art global methods.
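
    To make the translation-invariant-vector idea concrete, here is a hedged numpy sketch: differences of point pairs cancel any common translation, so they constrain the rotation alone. The consecutive-pair scheme below is illustrative, not necessarily the paper's pairing.

```python
import numpy as np

def translation_invariant_vectors(points):
    """Differences between consecutive points: if q_i = R p_i + t,
    then q_{i+1} - q_i = R (p_{i+1} - p_i), so t cancels out."""
    return np.diff(points, axis=0)

rng = np.random.default_rng(1)
P = rng.standard_normal((10, 3))
# Arbitrary rotation (about z) and translation.
a = 0.7
R = np.array([[np.cos(a), -np.sin(a), 0],
              [np.sin(a),  np.cos(a), 0],
              [0,          0,         1]])
t = np.array([5.0, -2.0, 3.0])
Q = P @ R.T + t

# TIVs of Q depend only on R, so the rotation can be searched first.
print(np.allclose(translation_invariant_vectors(Q),
                  translation_invariant_vectors(P) @ R.T))  # True
```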

    Efficient and Provable Multi-Query Optimization

    Complex queries for massive data analysis jobs have become increasingly commonplace. Many such queries contain common subexpressions, either within a single query or among multiple queries submitted as a batch. Conventional query optimizers do not exploit these subexpressions and produce suboptimal plans. The problem of multi-query optimization (MQO) is to generate an optimal combined evaluation plan by computing common subexpressions once and reusing them. Exhaustive algorithms for MQO explore an O(n^n) search space; thus, this problem has primarily been tackled using various heuristic algorithms, without any theoretical guarantees on the quality of their solutions. In this paper, instead of the conventional cost minimization problem, we treat the problem as maximizing a linear transformation of the cost function. We propose a greedy algorithm for this transformed formulation of the problem, which, under weak, intuitive assumptions, provides an approximation factor guarantee for this formulation. We go on to show that this factor is optimal unless P = NP. Another noteworthy point about our algorithm is that it can easily be incorporated into existing transformation-based optimizers. Finally, we propose optimizations that can be used to improve the efficiency of our algorithm.
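
    The paper's exact transformed objective is not reproduced in the abstract; as a hedged toy sketch of the greedy flavor, the loop below repeatedly materializes the common subexpression with the largest net saving. The cost model, names and numbers are made up for illustration.

```python
def greedy_materialize(subexprs, queries):
    """Toy greedy: subexprs maps name -> materialization cost,
    queries maps query -> {subexpr: per-use saving}. Repeatedly pick
    the subexpression with the best net saving until none remains."""
    chosen = set()
    while True:
        best, best_gain = None, 0.0
        for s, cost in subexprs.items():
            if s in chosen:
                continue
            saving = sum(q.get(s, 0.0) for q in queries.values())
            if saving - cost > best_gain:
                best, best_gain = s, saving - cost
        if best is None:
            return chosen
        chosen.add(best)

subexprs = {"scan_A_join_B": 10.0, "agg_C": 4.0}
queries = {"q1": {"scan_A_join_B": 7.0, "agg_C": 1.0},
           "q2": {"scan_A_join_B": 6.0, "agg_C": 2.0}}
print(greedy_materialize(subexprs, queries))  # {'scan_A_join_B'}
```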

    A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions Using Sequential Quadratic Programming

    Maximum likelihood estimation of mixture proportions has a long history, and continues to play an important role in modern statistics, including in the development of nonparametric empirical Bayes methods. The maximum likelihood problem has traditionally been solved using the expectation maximization (EM) algorithm, but recent work by Koenker & Mizera shows that modern convex optimization techniques, in particular interior point methods, are substantially faster and more accurate than EM. Here, we develop a new solution based on sequential quadratic programming (SQP). It is substantially faster than the interior point method, and just as accurate. Our approach combines several ideas: first, it solves a reformulation of the original problem; second, it uses an SQP approach to make the best use of the expensive gradient and Hessian computations; third, the SQP iterations are implemented using an active set method to exploit the sparse nature of the quadratic subproblems; fourth, it uses accurate low-rank approximations for more efficient gradient and Hessian computations. We illustrate the benefits of our approach in experiments on synthetic data sets as well as a large genetic association data set. On large data sets (n ≈ 10^6 observations, m ≈ 10^3 mixture components), our implementation achieves at least a 100-fold reduction in runtime compared with a state-of-the-art interior point solver. Our methods are implemented in Julia, and in an R package available on CRAN (see https://CRAN.R-project.org/package=mixsqp).
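
    For reference, the classical EM baseline that the SQP method is compared against has a one-line update; the sketch below, with a random likelihood matrix as a stand-in, is a hedged illustration of that baseline, not the mixsqp implementation.

```python
import numpy as np

def em_mixture_proportions(L, iters=200):
    """EM for max_x sum_i log((L @ x)_i) over the simplex, where
    L[i, k] is the likelihood of observation i under component k.
    Update: x_k <- x_k * mean_i L[i, k] / (L @ x)_i."""
    n, m = L.shape
    x = np.full(m, 1.0 / m)
    for _ in range(iters):
        x *= (L / (L @ x)[:, None]).mean(axis=0)
    return x

rng = np.random.default_rng(2)
L = rng.random((1000, 5))
x = em_mixture_proportions(L)
print(x, x.sum())  # estimated proportions; they stay on the simplex
```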

    Optimizing Adiabatic Quantum Program Compilation using a Graph-Theoretic Framework

    Adiabatic quantum computing has evolved in recent years from a theoretical field into an immensely practical area, a change partially sparked by D-Wave Systems' quantum annealing hardware. These multimillion-dollar quantum annealers offer the potential to solve optimization problems millions of times faster than classical heuristics, prompting researchers at Google, NASA and Lockheed Martin to study how these computers can be applied to complex real-world problems such as NASA rover missions. Unfortunately, compiling (embedding) an optimization problem into the annealing hardware is itself a difficult optimization problem and a major bottleneck currently preventing widespread adoption. Additionally, while finding a single embedding is difficult, no generalized method is known for tuning embeddings to use minimal hardware resources. To address these barriers, we introduce a graph-theoretic framework for developing structured embedding algorithms. Using this framework, we introduce a biclique virtual hardware layer to provide a simplified interface to the physical hardware. Additionally, we exploit bipartite structure in quantum programs using odd cycle transversal (OCT) decompositions. By coupling an OCT-based embedding algorithm with new, generalized reduction methods, we develop a new baseline for embedding a wide range of optimization problems into fault-free D-Wave annealing hardware. To encourage the reuse and extension of these techniques, we provide an implementation of the framework and embedding algorithms.
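
    To illustrate the odd cycle transversal notion the framework builds on, the brute-force sketch below (networkx assumed; exponential, so tiny graphs only) finds a smallest vertex set whose removal leaves a graph bipartite. It demonstrates the definition, not the paper's embedding algorithm.

```python
from itertools import combinations
import networkx as nx

def smallest_oct(G):
    """Brute-force odd cycle transversal: smallest vertex set whose
    removal makes G bipartite. Exponential; tiny graphs only."""
    nodes = list(G.nodes)
    for k in range(len(nodes) + 1):
        for S in combinations(nodes, k):
            H = G.copy()
            H.remove_nodes_from(S)
            if nx.is_bipartite(H):
                return set(S)

G = nx.cycle_graph(5)          # odd cycle: not bipartite
print(nx.is_bipartite(G))      # False
print(smallest_oct(G))         # removing any single vertex suffices
```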

    Tensor p-shrinkage nuclear norm for low-rank tensor completion

    In this paper, a new definition of the tensor p-shrinkage nuclear norm (p-TNN) is proposed, based on the tensor singular value decomposition (t-SVD). In particular, it can be proved that p-TNN is a better approximation of the tensor average rank than the tensor nuclear norm when p < 1. Therefore, by employing the p-shrinkage nuclear norm, a novel low-rank tensor completion (LRTC) model is proposed to estimate a tensor from its partial observations. Statistically, an upper bound on the recovery error is provided for the LRTC model. Furthermore, an efficient algorithm, accelerated by an adaptive momentum scheme, is developed to solve the resulting nonconvex optimization problem. The algorithm is further guaranteed to enjoy a global convergence rate under a smoothness assumption. Numerical experiments conducted on both synthetic and real-world data sets verify our results and demonstrate the superiority of p-TNN in LRTC problems over several state-of-the-art methods.
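
    The abstract does not restate the scalar p-shrinkage operator; one form commonly used in the p-shrinkage literature, which we assume here is what gets applied to the t-SVD singular values, is the following.

```latex
% Scalar p-shrinkage (Chartrand-style; an assumption, not quoted from
% the paper), applied entrywise to singular values. For p = 1 it
% reduces to the usual soft thresholding max(|t| - \lambda, 0).
\[
  \operatorname{shrink}_p(t, \lambda)
  = \operatorname{sign}(t)\,
    \max\!\bigl(|t| - \lambda^{2-p} |t|^{p-1},\; 0\bigr)
\]
```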

    An Iterative Reweighted Method for Tucker Decomposition of Incomplete Multiway Tensors

    We consider the problem of low-rank decomposition of incomplete multiway tensors. Since many real-world data sets lie on an intrinsically low-dimensional subspace, tensor low-rank decomposition with missing entries has applications in many data analysis problems, such as recommender systems and image inpainting. In this paper, we focus on Tucker decomposition, which represents an Nth-order tensor in terms of N factor matrices and a core tensor via multilinear operations. To exploit the underlying multilinear low-rank structure in high-dimensional datasets, we propose a group-based log-sum penalty functional to impose structural sparsity on the core tensor, which leads to a compact representation with the smallest possible core tensor. The method for Tucker decomposition is developed by iteratively minimizing a surrogate function that majorizes the original objective function, resulting in an iterative reweighted process. In addition, to reduce the computational complexity, an over-relaxed monotone fast iterative shrinkage-thresholding technique is adapted and embedded in the iterative reweighted process. The proposed method is able to determine the model complexity (i.e., the multilinear rank) automatically. Simulation results show that the proposed algorithm offers competitive performance compared with other existing algorithms.
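
    The iterative reweighting falls out of the standard majorization of the log-sum penalty by its tangent; the notation below is generic (scalar case, not the paper's group-based form).

```latex
% Majorization of the log-sum penalty: concavity of log gives the
% tangent bound at the current iterate x^{(k)},
\[
  \log(x + \epsilon) \;\le\; \log\bigl(x^{(k)} + \epsilon\bigr)
  + \frac{x - x^{(k)}}{x^{(k)} + \epsilon},
\]
% so each MM iteration minimizes a weighted surrogate with weights
\[
  w^{(k)} \;=\; \frac{1}{x^{(k)} + \epsilon},
\]
% which is exactly the iterative reweighting.
```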

    Parallel Nonnegative CP Decomposition of Dense Tensors

    The CP tensor decomposition is a low-rank approximation of a tensor. We present a distributed-memory parallel algorithm and implementation of an alternating optimization method for computing a CP decomposition of dense tensor data that can enforce nonnegativity of the computed low-rank factors. The principal task is to parallelize the matricized-tensor times Khatri-Rao product (MTTKRP) bottleneck subcomputation. The algorithm is computationally efficient, using dimension trees to avoid redundant computation across MTTKRPs within the alternating method. Our approach is also communication-efficient, using a data distribution and parallel algorithm across a multidimensional processor grid that can be tuned to minimize communication. We benchmark our software on synthetic data as well as hyperspectral image and neuroscience dynamic functional connectivity data, demonstrating that our algorithm scales well to hundreds of nodes (up to 4096 cores) and is faster and more general than the currently available parallel software.
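
    As a hedged single-node illustration of the MTTKRP bottleneck (the paper's contribution is its parallelization, which this sketch does not attempt), here is the mode-1 computation for a 3-way tensor via einsum, checked against an explicit Khatri-Rao product.

```python
import numpy as np

def mttkrp_mode1(X, B, C):
    """Mode-1 MTTKRP for a 3-way tensor: equivalent to multiplying the
    mode-1 unfolding of X by the Khatri-Rao product of the other two
    factors, computed with einsum so that product is never formed."""
    return np.einsum('ijk,jr,kr->ir', X, B, C)

I, J, K, R = 4, 5, 6, 3
rng = np.random.default_rng(3)
X = rng.standard_normal((I, J, K))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# Reference: explicit unfolding times an explicit Khatri-Rao product
# (row order of the Khatri-Rao product matches numpy's reshape order).
KR = np.einsum('jr,kr->jkr', B, C).reshape(J * K, R)
print(np.allclose(mttkrp_mode1(X, B, C), X.reshape(I, J * K) @ KR))  # True
```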