    Top Rank Optimization in Linear Time

    Bipartite ranking aims to learn a real-valued ranking function that orders positive instances before negative instances. Recent efforts in bipartite ranking have focused on optimizing ranking accuracy at the top of the ranked list. Most existing approaches either optimize task-specific metrics or extend the ranking loss by placing greater emphasis on the error associated with the top-ranked instances, leading to a high computational cost that is super-linear in the number of training instances. We propose a highly efficient approach, named TopPush, for optimizing accuracy at the top whose computational complexity is linear in the number of training instances. We present a novel analysis that bounds the generalization error for the top-ranked instances of the proposed approach. An empirical study shows that the proposed approach is highly competitive with state-of-the-art approaches and is 10-100 times faster.
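
    As a rough illustration of why accuracy at the top can be optimized in linear time, below is a minimal sketch (not the authors' implementation; the logistic surrogate and all names are assumptions) of a top-push-style loss: each positive instance is compared only against the single highest-scoring negative, so one pass over the negatives suffices per evaluation.

    import numpy as np

    def top_push_loss(w, X_pos, X_neg):
        """Top-push-style surrogate: penalize positives that score below
        the highest-scoring negative. A single max over the negatives
        keeps each evaluation linear in the number of instances."""
        top_neg = np.max(X_neg @ w)                   # score of the top-ranked negative
        margins = X_pos @ w - top_neg                 # per-positive margin against it
        return np.mean(np.logaddexp(0.0, -margins))   # logistic surrogate log(1 + e^-m)

    With X_pos and X_neg stored as (n, d) arrays, one evaluation costs O((n_pos + n_neg) * d), in contrast to pairwise ranking losses that compare every positive against every negative.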

    Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls

    We propose a rank-$k$ variant of the classical Frank-Wolfe algorithm to solve convex optimization over a trace-norm ball. Our algorithm replaces the top singular-vector computation ($1$-SVD) in Frank-Wolfe with a top-$k$ singular-vector computation ($k$-SVD), which can be done by repeatedly applying $1$-SVD $k$ times. Alternatively, our algorithm can be viewed as a rank-$k$ restricted version of projected gradient descent. We show that our algorithm has a linear convergence rate when the objective function is smooth and strongly convex, and the optimal solution has rank at most $k$. This improves the convergence rate and the total time complexity of the Frank-Wolfe method and its variants.
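
    For intuition, here is a minimal sketch of the "rank-$k$ restricted projected gradient descent" view mentioned above, not the paper's exact update; scipy's svds stands in for a practical $k$-SVD routine, and the step size eta, the radius theta, and the helper project_l1_ball are assumptions.

    import numpy as np
    from scipy.sparse.linalg import svds

    def project_l1_ball(s, theta):
        """Euclidean projection of nonnegative values s onto
        {s >= 0, sum(s) <= theta} (standard simplex-projection routine)."""
        if s.sum() <= theta:
            return s
        u = np.sort(s)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - theta)[0][-1]
        tau = (css[rho] - theta) / (rho + 1.0)
        return np.maximum(s - tau, 0.0)

    def rank_k_step(X, grad, eta, theta, k):
        """One rank-k restricted projected-gradient step over the trace-norm
        ball {X : ||X||_* <= theta}: take a gradient step, keep only the
        top-k singular triplets (a k-SVD), then project the retained
        singular values back into the ball."""
        U, s, Vt = svds(X - eta * grad, k=k)   # top-k SVD of the gradient step
        s = project_l1_ball(s, theta)          # shrink singular values into the ball
        return (U * s) @ Vt                    # rank-<=k iterate inside the ball

    The point of the $k$-SVD is that the iterate never needs more than $k$ singular triplets, so each step avoids a full SVD; per the abstract, the linear convergence guarantee applies when the optimum has rank at most $k$.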

    Faster Eigenvector Computation via Shift-and-Invert Preconditioning

    We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$, i.e., computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$.

    Offline eigenvector estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^T A$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde O\big(\big[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\big] \log 1/\epsilon\big)$ or $\tilde O\big(\frac{\mathrm{nnz}(A)^{3/4} (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}} \log 1/\epsilon\big)$, where $\mathrm{nnz}(A)$ is the number of nonzeros in $A$, $\mathrm{sr}(A)$ is the stable rank, and $\mathrm{gap}$ is the relative eigengap. By separating the $\mathrm{gap}$ dependence from the $\mathrm{nnz}(A)$ term, our first runtime improves upon the classical power and Lanczos methods. It also improves upon prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on $\mathrm{sr}(A)$ and $\epsilon$. Our second running time improves upon these further when $\mathrm{nnz}(A) \le \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}$.

    Online eigenvector estimation: Given a distribution $D$ with covariance matrix $\Sigma$ and a vector $x_0$ that is an $O(\mathrm{gap})$-approximate top eigenvector of $\Sigma$, we show how to refine it to an $\epsilon$-approximation using $O\big(\frac{\mathrm{var}(D)}{\mathrm{gap} \cdot \epsilon}\big)$ samples from $D$, where $\mathrm{var}(D)$ is a natural notion of variance. Combining our algorithm with previous work to initialize $x_0$, we obtain improved sample complexity and runtime results under a variety of assumptions on $D$.

    We achieve our results using a general framework that we believe is of independent interest: a robust analysis of the classic shift-and-invert preconditioning method, which reduces eigenvector computation to approximately solving a sequence of linear systems, combined with fast stochastic variance reduced gradient (SVRG) based system solvers.
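
    The core reduction is simple to state: if the shift $\lambda$ is slightly larger than $\lambda_1(\Sigma)$, then $(\lambda I - \Sigma)^{-1}$ has a large relative eigengap, so a few inverse power iterations converge quickly, and each iteration needs only an approximate linear-system solve. A minimal sketch follows, with the shift assumed given and a dense solve standing in for the paper's SVRG-based solver; all names are assumptions.

    import numpy as np

    def shift_invert_top_eigvec(Sigma, shift, x0, iters=20):
        """Inverse power iteration on (shift*I - Sigma). Multiplying by the
        inverse amplifies the top eigendirection of Sigma much faster than
        plain power iteration, at the price of one linear solve per step."""
        M = shift * np.eye(Sigma.shape[0]) - Sigma
        x = x0 / np.linalg.norm(x0)
        for _ in range(iters):
            x = np.linalg.solve(M, x)   # the paper replaces this exact solve
            x /= np.linalg.norm(x)      # with a fast stochastic (SVRG) solver
        return x

    Because the per-iteration work reduces to solving systems against a fixed, well-conditioned matrix, those solves can be delegated to fast stochastic solvers, which is where the $\mathrm{nnz}(A)$-type running times come from.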

    A Spectral Learning Approach to Range-Only SLAM

    We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiple-hypothesis tracking (MHT) methods for range-only SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. Compared with popular extended Kalman filter (EKF) or extended information filter (EIF) approaches, and many MHT ones, our approach does not need to linearize a transition or measurement model; such linearizations can cause severe errors in EKFs and EIFs, and to a lesser extent in MHT, particularly for the highly non-Gaussian posteriors encountered in range-only SLAM. We provide a theoretical analysis of our method, including finite-sample error bounds. Finally, we demonstrate on a real-world robotic SLAM problem that our algorithm is not only theoretically justified, but works well in practice: in a comparison of multiple methods, the lowest errors come from a combination of our algorithm with batch optimization, but our method alone produces nearly as good a result at far lower computational cost.
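
    To give a flavor of the spectral idea, heavily simplified and with all names assumed: in 2-D, the matrix of squared robot-to-landmark ranges factors as a product of rank at most 4, since $\|p_i - l_j\|^2 = \|p_i\|^2 - 2 p_i^T l_j + \|l_j\|^2$, so a truncated SVD recovers pose-side and landmark-side factors simultaneously, with no local optima. A toy sketch:

    import numpy as np

    def factor_squared_ranges(D2, rank=4):
        """Factor a (num_poses x num_landmarks) matrix of squared ranges.
        In 2-D, D2[i, j] = ||p_i||^2 - 2 p_i . l_j + ||l_j||^2, so D2 has
        rank <= 4; a truncated SVD recovers the two factors up to an
        unknown 4x4 linear transform."""
        U, s, Vt = np.linalg.svd(D2, full_matrices=False)
        A = U[:, :rank] * np.sqrt(s[:rank])              # pose-side factor
        B = (np.sqrt(s[:rank])[:, None] * Vt[:rank]).T   # landmark-side factor
        return A, B                                      # D2 ~= A @ B.T

    Turning these factors into metric coordinates still requires resolving the 4x4 linear ambiguity with known constraints, and the full algorithm additionally handles noise and robot motion; the sketch only shows why an SVD exposes the geometry at all.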