
    Nonlinear Parameterization in Bi-Criteria Sample Balancing

    Sample balancing is widely used in applied research to adjust sample data for better correspondence to Census statistics. The classic Deming-Stephan iterative proportional approach finds the weights of observations by fitting the cross-tables of sample counts to known margins. This work considers a bi-criteria objective for finding weights with the maximum possible effective base size. The approach is presented as a ridge regression with an exponential nonlinear parameterization that produces nonnegative weights for sample balancing.
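
    As a minimal sketch of the classic Deming-Stephan step (not the paper's bi-criteria extension), the following Python code rakes a 2x2 table of sample counts to known margins by alternating row and column rescaling; the counts and margins are invented for illustration.

```python
import numpy as np

def ipf(table, row_margins, col_margins, max_iter=1000, tol=1e-10):
    """Deming-Stephan iterative proportional fitting: rescale rows and
    columns in turn until the table reproduces the target margins."""
    t = table.astype(float).copy()
    for _ in range(max_iter):
        t *= (row_margins / t.sum(axis=1))[:, None]   # match row sums
        t *= (col_margins / t.sum(axis=0))[None, :]   # match column sums
        if np.allclose(t.sum(axis=1), row_margins, atol=tol):
            break
    return t

sample = np.array([[30.0, 20.0],
                   [10.0, 40.0]])          # observed sample counts
row_m = np.array([60.0, 40.0])             # known Census row margins
col_m = np.array([50.0, 50.0])             # known Census column margins
fitted = ipf(sample, row_m, col_m)
weights = fitted / sample                  # per-cell balancing weights
```

    The per-cell ratios `fitted / sample` are the observation weights the classic procedure assigns; the bi-criteria variant additionally trades fit against the effective base size of these weights.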

    Regression Split by Levels of the Dependent Variable

    Multiple regression coefficients split by the levels of the dependent variable are examined. The decomposition of the coefficients can be defined by points on the ordinal scale or by levels in the numerical response using the Gifi system of binary variables. This approach permits consideration of specific values of the coefficients at each layer of the response variable. Numerical results illustrate how to identify levels of interpretable regression coefficients.
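
    A hypothetical sketch of the idea on synthetic data: the ordinal response is expanded into one Gifi-style binary indicator per level, and a least-squares fit per indicator yields level-specific coefficients for every predictor. The paper's exact decomposition may differ.

```python
import numpy as np

# Synthetic ordinal response with three levels driven by two predictors.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
latent = X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n)
y = np.digitize(latent, bins=[-0.5, 0.5])           # ordinal levels 0, 1, 2

levels = np.unique(y)
G = (y[:, None] == levels[None, :]).astype(float)   # Gifi binary indicators
B = np.linalg.lstsq(X, G, rcond=None)[0]            # one coefficient column per level
```

    Each column of `B` holds the coefficients for one response level; because the indicators sum to one across levels, the per-predictor coefficients sum to the trivial fit of a constant.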

    Optimal Lp-Metric for Minimizing Powered Deviations in Regression

    Minimization by least squares or by least absolute deviations is a well-known criterion in regression modeling. In this work, the criterion of the generalized mean of powered deviations is suggested. If the parameter of the generalized mean equals one or two, the fitting corresponds to the least absolute or the least squared deviations, respectively. Varying the power parameter yields an optimum value for the objective with the minimum possible residual error. Estimation of the most favorable value of the generalized mean parameter shows that it hardly depends on the data: the optimal power consistently turns out to be close to 1.7, so these powered deviations should be used for a better regression fit.
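
    The powered-deviation criterion can be sketched on synthetic heavy-tailed data as minimizing sum(|y - Xb|^p): p = 2 is least squares, p = 1 is least absolute deviations, and the paper reports the optimum near p = 1.7. Nelder-Mead is used here only for convenience, not as the paper's estimation method.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 + 0.5 * x + rng.standard_t(df=3, size=100)   # heavy-tailed noise
X = np.column_stack([np.ones_like(x), x])

def lp_fit(X, y, p):
    """Minimize the sum of p-powered absolute deviations, starting at OLS."""
    obj = lambda b: np.sum(np.abs(y - X @ b) ** p)
    b0 = np.linalg.lstsq(X, y, rcond=None)[0]
    return minimize(obj, b0, method="Nelder-Mead").x

b_17 = lp_fit(X, y, 1.7)     # coefficients under the p = 1.7 criterion
```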

    Entropy Criterion in Logistic Regression and Shapley Value of Predictors

    The entropy criterion is used for constructing a binary response regression model with a logistic link. This approach yields a logistic model with coefficients proportional to the coefficients of linear regression. Based on this property, Shapley value estimation of the predictors’ contributions is applied to obtain robust coefficients of the linear aggregate adjusted to the logistic model. This procedure produces a logistic regression with interpretable coefficients robust to multicollinearity. Numerical results demonstrate theoretical and practical advantages of the entropy-logistic regression.
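
    A sketch of the Shapley-value decomposition of R-squared among predictors, which underlies the robust coefficient estimation: each predictor's contribution is its marginal gain in R-squared averaged over all subsets of the other predictors. This illustrates the general technique on synthetic collinear data; the entropy-logistic adjustment itself is not reproduced here.

```python
import numpy as np
from itertools import combinations
from math import factorial

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)
X = np.column_stack([z + rng.normal(size=n),      # two collinear predictors
                     z + rng.normal(size=n),
                     rng.normal(size=n)])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)

def r2(cols):
    """In-sample R^2 of OLS on the given predictor subset."""
    if not cols:
        return 0.0
    A = np.column_stack([np.ones(n), X[:, list(cols)]])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - resid @ resid / tss

p = X.shape[1]
shapley = np.zeros(p)
for j in range(p):
    others = [k for k in range(p) if k != j]
    for size in range(p):
        for S in combinations(others, size):
            w = factorial(size) * factorial(p - 1 - size) / factorial(p)
            shapley[j] += w * (r2(S + (j,)) - r2(S))
```

    By the Shapley efficiency property, the contributions sum exactly to the full-model R-squared, which makes them convenient net effects under multicollinearity.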

    Priorities in Thurstone Scaling and Steady-State Probabilities in Markov Stochastic Modeling

    Thurstone scaling is widely used in marketing and advertising research, where various methods of applied psychology are utilized. This article considers several analytical tools useful for positioning a set of items on a Thurstone scale via regression modeling and Markov stochastic processing in the form of the Chapman-Kolmogorov equations. These approaches produce interval and ratio scales of preferences and enrich the possibilities of paired comparison estimation applied to practical problems of prioritization and probability-of-choice modeling.
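
    One standard way (Thurstone Case V) to turn paired-comparison proportions into interval-scale priorities can be sketched as follows; the proportions are invented for illustration, and the paper's regression and Markov formulations go beyond this basic construction.

```python
import numpy as np
from scipy.stats import norm

# P[i, j] is the share of respondents preferring item i over item j.
P = np.array([[0.50, 0.70, 0.85],
              [0.30, 0.50, 0.65],
              [0.15, 0.35, 0.50]])

Z = norm.ppf(P)                 # unit-normal deviates of the proportions
scale = Z.mean(axis=1)          # row means give the item scale values
scale = scale - scale.min()     # anchor the lowest item at zero
```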

    How Good is Best? Multivariate Case of Ehrenberg-Weisberg Analysis of Residual Errors in Competing Regressions

    A.S.C. Ehrenberg first noticed, and S. Weisberg then formalized, a property of pairwise regression: its quality stays at almost the same level of precision while the coefficients of the model vary over a wide span of values. This paper generalizes the estimates of the percent change in the residual standard deviation to the case of competing multiple regressions. It shows that, in contrast to the simple pairwise model, the coefficients of multiple regression can be changed over a wider range of values, including coefficients of opposite signs. Consideration of these features facilitates a better understanding of the properties of regression and opens the possibility of modifying the obtained regression coefficients into meaningful and interpretable values using additional criteria. Several competing modifications of linear regression with interpretable coefficients are described and compared in the Ehrenberg-Weisberg approach.
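
    The effect can be sketched on synthetic data: with two highly correlated predictors, shifting a substantial amount of weight from one coefficient to the other raises the residual standard deviation only slightly. The data and the size of the shift are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
z = rng.normal(size=n)
X = np.column_stack([np.ones(n),
                     z + 0.1 * rng.normal(size=n),   # two nearly
                     z + 0.1 * rng.normal(size=n)])  # collinear predictors
y = X @ np.array([1.0, 2.0, 1.0]) + rng.normal(size=n)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

def resid_sd(b):
    r = y - X @ b
    return np.sqrt(r @ r / n)

# Move 0.5 from one collinear coefficient to the other:
b_alt = b_ols + np.array([0.0, 0.5, -0.5])
pct_change = 100.0 * (resid_sd(b_alt) / resid_sd(b_ols) - 1.0)
```

    Although the two slope coefficients change by 0.5 each, the residual standard deviation grows by well under a few percent, which is the Ehrenberg-Weisberg observation in miniature.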

    Generalized Singular Value Decomposition with Additive Components

    The singular value decomposition (SVD) technique is extended to incorporate additive components in the approximation of a rectangular matrix by outer products of vectors. While the dual vectors of the regular SVD can be expressed one via a linear transformation of the other, the modified SVD corresponds to a general linear transformation with an additive part. The resulting method can be related to the family of principal component and correspondence analyses, and can be reduced to an eigenproblem for a specific transformation of the data matrix. This technique is applied to constructing dual eigenvectors for data visualization in a two-dimensional space.
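
    A related illustration (not the paper's exact construction): a matrix built from a constant additive term plus one outer product has rank 2, so a plain rank-1 SVD cannot fit it well, while removing the additive structure by double centering, as in the correspondence-analysis family, leaves essentially one outer product that a rank-1 SVD fits almost perfectly. The vectors and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
u = np.array([2.0, 1.0, 0.0, -1.0, -2.0, 1.0, 0.0, -1.0])  # sums to 0
v = np.array([1.0, 0.5, 0.0, -0.5, -1.0])                  # sums to 0
A = 10.0 + np.outer(u, v) + 0.01 * rng.normal(size=(8, 5))

def rank1_error(M):
    """Frobenius error of the best rank-1 approximation of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return np.linalg.norm(M - s[0] * np.outer(U[:, 0], Vt[0]))

# Double centering removes the constant (additive) part exactly:
C = A - A.mean(axis=1, keepdims=True) - A.mean(axis=0, keepdims=True) + A.mean()

err_plain = rank1_error(A)      # stuck near ||u||*||v||, the 2nd singular value
err_additive = rank1_error(C)   # only the small noise remains
```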

    Regressions Regularized by Correlations

    The regularization of multiple regression by proportionality to the correlations of the predictors with the dependent variable is applied to the least squares objective and the normal equations to relax the exact equalities and obtain a robust solution. This technique produces models that are not prone to multicollinearity and is very useful in practical applications.
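
    A simplified sketch of the proportionality idea (not the paper's exact derivation): constrain the standardized coefficients to be proportional to the pair correlations r_j = corr(x_j, y), so b = k*r, and fit the single scale k by least squares. Synthetic data with two collinear predictors.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
z = rng.normal(size=n)
X = np.column_stack([z + 0.3 * rng.normal(size=n),
                     z + 0.3 * rng.normal(size=n)])
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
ys = (y - y.mean()) / y.std()               # and the response
r = Xs.T @ ys / n                           # pair correlations with y
C = Xs.T @ Xs / n                           # predictor correlation matrix
k = (r @ r) / (r @ C @ r)                   # least-squares scale for b = k*r
b_prop = k * r                              # coefficients share signs with r
```

    Because the coefficients inherit the signs of the correlations, each predictor's direction of impact stays interpretable even under strong multicollinearity.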

    Multiple Regression in Pair Correlation Solution

    The behavior of the coefficients of ordinary least squares (OLS) regression is compared with the coefficients regularized by the one-parameter ridge (Ridge-1) and two-parameter ridge (Ridge-2) regressions. The ridge models are not prone to multicollinearity. The fit quality of Ridge-2 does not decrease as the profile parameter increases, and the Ridge-2 model converges to a solution proportional to the coefficients of pair correlation between the dependent variable and the predictors. The Correlation-Regression (CORE) model suggests meaningful coefficients and net effects for the individual impact of the predictors, high-quality model fit, and convenient analysis and interpretation of the regression. Simulation with three correlations shows in which areas the OLS regression coefficients have the same signs as the pair correlations, and where the signs are opposite. The CORE technique should be used to keep the expected direction of the predictor’s impact on the dependent variable.
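
    The limiting behavior can be illustrated with an ordinary one-parameter ridge on synthetic standardized data (the paper's two-parameter profile differs): as the ridge parameter grows, the regularized coefficient vector aligns with the vector of pair correlations, the direction the CORE solution uses.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
z = rng.normal(size=n)
X = np.column_stack([z + 0.2 * rng.normal(size=n),
                     z + 0.2 * rng.normal(size=n),
                     rng.normal(size=n)])
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = X @ np.array([1.0, -0.2, 0.5]) + rng.normal(size=n)
y = (y - y.mean()) / y.std()

r = X.T @ y / n                       # pair correlations with y
C = X.T @ X / n                       # predictor correlation matrix

def ridge(lam):
    return np.linalg.solve(C + lam * np.eye(3), r)

cosine = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
align_small = cosine(ridge(0.01), r)  # near-OLS direction, far from r
align_large = cosine(ridge(1e6), r)   # approaches the direction of r
```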

    Regression Modeling and Prediction by Individual Observations versus Frequency

    A regression model built on a dataset can sometimes demonstrate a low quality of fit and poor predictions of individual observations. However, using the frequencies of possible combinations of the predictors and the outcome, the same models with the same parameters may yield a high quality of fit and precise predictions for the frequencies of the outcome occurrence. Linear and logistic regressions are used to give an explicit exposition of the results of regression modeling and prediction.
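
    A sketch of the individual-versus-frequency equivalence for logistic regression on synthetic data with one binary predictor: Newton's method on the individual 0/1 records and on the distinct predictor patterns weighted by their frequencies reaches the same coefficients, even though the individual-level fit looks noisy while the frequency-level fit is precise.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
x = rng.integers(0, 2, size=n).astype(float)
prob = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x)))
yb = (rng.uniform(size=n) < prob).astype(float)   # individual 0/1 outcomes

def logit_fit(X, y, w):
    """Weighted Newton-Raphson for logistic regression; y may be a proportion."""
    b = np.zeros(X.shape[1])
    for _ in range(25):
        mu = 1.0 / (1.0 + np.exp(-X @ b))
        W = w * mu * (1.0 - mu)
        b = b + np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (w * (y - mu)))
    return b

# Fit on the n individual observations:
X_ind = np.column_stack([np.ones(n), x])
b_ind = logit_fit(X_ind, yb, np.ones(n))

# Fit on the two predictor patterns with their outcome frequencies:
patterns = np.array([[1.0, 0.0], [1.0, 1.0]])
counts = np.array([(x == 0).sum(), (x == 1).sum()], dtype=float)
props = np.array([yb[x == 0].mean(), yb[x == 1].mean()])
b_grp = logit_fit(patterns, props, counts)
```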