2,817 research outputs found

    Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery

    Full text link
    We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence \cO(1/\epsilon), where ϵ\epsilon is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package \texttt{camel} implementing the proposed method is available on the Comprehensive R Archive Network \url{http://cran.r-project.org/web/packages/camel/}.Comment: Journal of Machine Learning Research, 201

    A fast semi-direct least squares algorithm for hierarchically block separable matrices

    Full text link
    We present a fast algorithm for linear least squares problems governed by hierarchically block separable (HBS) matrices. Such matrices are generally dense but data-sparse and can describe many important operators including those derived from asymptotically smooth radial kernels that are not too oscillatory. The algorithm is based on a recursive skeletonization procedure that exposes this sparsity and solves the dense least squares problem as a larger, equality-constrained, sparse one. It relies on a sparse QR factorization coupled with iterative weighted least squares methods. In essence, our scheme consists of a direct component, comprised of matrix compression and factorization, followed by an iterative component to enforce certain equality constraints. At most two iterations are typically required for problems that are not too ill-conditioned. For an M×NM \times N HBS matrix with MNM \geq N having bounded off-diagonal block rank, the algorithm has optimal O(M+N)\mathcal{O} (M + N) complexity. If the rank increases with the spatial dimension as is common for operators that are singular at the origin, then this becomes O(M+N)\mathcal{O} (M + N) in 1D, O(M+N3/2)\mathcal{O} (M + N^{3/2}) in 2D, and O(M+N2)\mathcal{O} (M + N^{2}) in 3D. We illustrate the performance of the method on both over- and underdetermined systems in a variety of settings, with an emphasis on radial basis function approximation and efficient updating and downdating.Comment: 24 pages, 8 figures, 6 tables; to appear in SIAM J. Matrix Anal. App

    Elastic net prefiltering for two class classification

    No full text
    A two-stage linear-in-the-parameter model construction algorithm is proposed aimed at noisy two-class classification problems. The purpose of the first stage is to produce a prefiltered signal that is used as the desired output for the second stage which constructs a sparse linear-in-the-parameter classifier. The prefiltering stage is a two-level process aimed at maximizing a model’s generalization capability, in which a new elastic-net model identification algorithm using singular value decomposition is employed at the lower level, and then, two regularization parameters are optimized using a particle-swarm-optimization algorithm at the upper level by minimizing the leave-one-out (LOO) misclassification rate. It is shown that the LOO misclassification rate based on the resultant prefiltered signal can be analytically computed without splitting the data set, and the associated computational cost is minimal due to orthogonality. The second stage of sparse classifier construction is based on orthogonal forward regression with the D-optimality algorithm. Extensive simulations of this approach for noisy data sets illustrate the competitiveness of this approach to classification of noisy data problems
    corecore