Efficient computation of condition estimates for linear least squares problems
Linear least squares (LLS) is a classical linear algebra problem in scientific computing, arising for instance in many parameter estimation problems. In addition to efficiently computing LLS solutions, an important issue is to assess the numerical quality of the computed solution. The notion of conditioning provides a theoretical framework for measuring the numerical sensitivity of a problem solution to perturbations in its data. We recall some results for least squares conditioning and derive a statistical estimate for the conditioning of an LLS solution. We present numerical experiments comparing exact values and statistical estimates. We also report performance results for new routines built on top of the multicore-GPU library MAGMA. This set of routines is based on an efficient computation of the variance-covariance matrix, for which, to our knowledge, there is no implementation in the current public-domain libraries LAPACK and ScaLAPACK.
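The quantities in this abstract can be illustrated with a minimal NumPy sketch (the matrix sizes and noise level below are illustrative, and this is a plain CPU computation, not the MAGMA routines the abstract describes): the LLS solution is computed from a QR factorization, and the variance-covariance matrix C = (A^T A)^{-1} is then obtained cheaply from the R factor.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 5  # illustrative problem size
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + 0.01 * rng.standard_normal(m)

# Solve the LLS problem min ||Ax - b||_2 via a QR factorization
Q, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q.T @ b)

# Variance-covariance matrix (up to the noise variance sigma^2):
# C = (A^T A)^{-1} = R^{-1} R^{-T}, reusing the R factor
Rinv = np.linalg.inv(R)
C = Rinv @ Rinv.T

# Componentwise sensitivity to perturbations of b: sqrt of diag(C)
component_cond = np.sqrt(np.diag(C))
```

Reusing R avoids forming A^T A explicitly, which would square the condition number of the data matrix.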
Computing the Conditioning of the Components of a Linear Least Squares Solution
In this paper, we address the accuracy of the results for the overdetermined
full rank linear least squares problem. We recall theoretical results obtained
in Arioli, Baboulin and Gratton, SIMAX 29(2):413--433, 2007, on conditioning of
the least squares solution and the components of the solution when the matrix
perturbations are measured in Frobenius or spectral norms. Then we define
computable estimates for these condition numbers and we interpret them in terms
of statistical quantities. In particular, we show that, in the classical linear
statistical model, the ratio of the variance of one component of the solution
to the variance of the right-hand side is exactly the condition number of that
solution component when perturbations of the right-hand side are considered. We
also provide code fragments using LAPACK routines to compute the
variance-covariance matrix and the least squares conditioning, and we give the
corresponding computational cost. Finally, we present a small historical
numerical example, used by Laplace in the Théorie Analytique des Probabilités
(1820) to compute the mass of Jupiter, and experiments from the space industry
with real physical data.
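The statistical interpretation above can be checked numerically with a small Monte Carlo sketch (the problem size, noise level, and sample count are illustrative, and NumPy stands in for the LAPACK code fragments the abstract mentions): under the classical model with Var(b) = sigma^2 I, the variance of each solution component divided by sigma^2 should match the corresponding diagonal entry of C = (A^T A)^{-1}.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 4
A = rng.standard_normal((m, n))
b0 = A @ rng.standard_normal(n)
sigma = 1e-3  # assumed standard deviation of right-hand-side noise

# Exact quantity: the diagonal of C = (A^T A)^{-1}
C = np.linalg.inv(A.T @ A)
Apinv = np.linalg.pinv(A)

# Monte Carlo: perturb the right-hand side and re-solve repeatedly
X = np.array([Apinv @ (b0 + sigma * rng.standard_normal(m))
              for _ in range(2000)])

# Ratio of each solution-component variance to the RHS variance ~ diag(C)
ratio = X.var(axis=0) / sigma**2
```

With 2000 samples the empirical ratios agree with diag(C) to within a few percent.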
Bayesian Restricted Likelihood Methods: Conditioning on Insufficient Statistics in Bayesian Regression
Bayesian methods have proven themselves to be successful across a wide range
of scientific problems and have many well-documented advantages over competing
methods. However, these methods run into difficulties for two major and
prevalent classes of problems: handling data sets with outliers and dealing
with model misspecification. We outline the drawbacks of previous solutions to
both of these problems and propose a new method as an alternative. When working
with the new method, the data is summarized through a set of insufficient
statistics, targeting inferential quantities of interest, and the prior
distribution is updated with the summary statistics rather than the complete
data. By careful choice of conditioning statistics, we retain the main benefits
of Bayesian methods while reducing the sensitivity of the analysis to features
of the data not captured by the conditioning statistics. For reducing
sensitivity to outliers, classical robust estimators (e.g., M-estimators) are
natural choices for conditioning statistics. A major contribution of this work
is the development of a data-augmented Markov chain Monte Carlo (MCMC)
algorithm for the linear model and a large class of summary statistics. We
demonstrate the method on simulated and real data sets containing outliers and
subject to model misspecification. Success is manifested in better predictive
performance for data points of interest compared to competing methods.
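The kind of conditioning statistic the abstract has in mind can be sketched concretely (this is only an illustration of an M-estimator as a robust summary, not the paper's data-augmented MCMC; the function name, tuning constant, and sample data are assumptions): a Huber location estimate downweights gross outliers, so conditioning on it rather than the full data reduces sensitivity to them.

```python
import numpy as np

def huber_location(y, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means.

    The scale is fixed at the normalized MAD of the data; observations
    with standardized residual |r| > k receive weight k/|r| < 1.
    """
    mu = np.median(y)
    s = np.median(np.abs(y - mu)) / 0.6745  # robust scale (normalized MAD)
    for _ in range(max_iter):
        r = (y - mu) / s
        w = np.where(np.abs(r) <= k, 1.0, k / np.abs(r))
        mu_new = np.sum(w * y) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

# A sample with one gross outlier: the M-estimate stays near the bulk,
# while the sample mean is dragged toward the outlier
y = np.array([0.10, -0.20, 0.05, 0.00, 0.15, 100.0])
mu_hat = huber_location(y)
```

In the restricted-likelihood setting, a statistic like `mu_hat` (rather than the raw data) is what the prior would be updated with.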
A Sensitivity Matrix Methodology for Inverse Problem Formulation
We propose an algorithm to select parameter subset combinations that can be estimated using an ordinary least-squares (OLS) inverse problem formulation with a given data set. First, the algorithm selects the parameter combinations that correspond to sensitivity matrices with full rank. Second, the algorithm quantifies uncertainty by using the inverse of the Fisher Information Matrix. Nominal parameter values are used to construct synthetic data sets and to explore the effects of removing certain parameters from those to be estimated by OLS. We quantify these effects with a score for the parameter vector, defined as the norm of the vector of componentwise standard errors divided by the corresponding estimates. In some cases the method reduces the standard error for a parameter to less than 1% of the estimate.
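The two steps above can be sketched for a toy model (the exponential-decay model, nominal parameter values, time grid, and noise level here are all assumptions for illustration, not from the paper): build the sensitivity matrix column by column, check its rank, then form the Fisher Information Matrix and the selection score from the resulting standard errors.

```python
import numpy as np

# Hypothetical model: f(t, theta) = theta1 * exp(-theta2 * t)
t = np.linspace(0.0, 5.0, 50)
theta = np.array([2.0, 0.8])  # nominal parameter values
sigma = 0.05                  # assumed observation-noise level

# Sensitivity matrix: column j is df/dtheta_j at the nominal parameters
chi = np.column_stack([
    np.exp(-theta[1] * t),                  # df/dtheta1
    -theta[0] * t * np.exp(-theta[1] * t),  # df/dtheta2
])

# Step 1: full-rank check of the sensitivity matrix
rank = np.linalg.matrix_rank(chi)

# Step 2: Fisher Information Matrix and asymptotic standard errors
F = chi.T @ chi / sigma**2
se = np.sqrt(np.diag(np.linalg.inv(F)))

# Selection score: norm of the coefficients of variation se_j / theta_j
score = np.linalg.norm(se / theta)
```

A large score flags a parameter combination that the data cannot estimate reliably, suggesting some parameters be fixed at nominal values instead.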
Detecting and Assessing the Problems Caused by Multi-Collinearity: A Use of the Singular-Value Decomposition
This paper presents a means for detecting the presence of multicollinearity and for assessing the damage that such collinearity may do to estimated coefficients in the standard linear regression model. The means of analysis is the singular value decomposition, a numerical-analytic device that directly exposes both the conditioning of the data matrix X and the linear dependencies that may exist among its columns. The same information is employed in the second part of the paper to determine the extent to which each regression coefficient is adversely affected by each linear relation among the columns of X that leads to its ill-conditioning.
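The SVD-based diagnostics can be sketched in a few lines of NumPy (the synthetic data and thresholds below are illustrative assumptions): condition indices reveal how ill-conditioned X is, and variance-decomposition proportions tie each coefficient's variance to the near-dependencies responsible for it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x1 = rng.standard_normal(n)
x2 = x1 + 1e-4 * rng.standard_normal(n)  # nearly collinear with x1
x3 = rng.standard_normal(n)
X = np.column_stack([x1, x2, x3])
# Scale columns to unit length before computing condition indices
X = X / np.linalg.norm(X, axis=0)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Condition indices: largest singular value over each singular value
cond_indices = s[0] / s

# Variance-decomposition proportions: pi[k, j] is the share of
# var(beta_k) attributable to singular value s_j (rows sum to 1)
phi = (Vt.T**2) / s**2  # phi[k, j] = v_kj^2 / s_j^2
pi = phi / phi.sum(axis=1, keepdims=True)
```

A large condition index whose column of proportions dominates two or more coefficients signals a damaging near-dependency, here the one between the first two columns.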