320,781 research outputs found
A mixed approach and a distribution free multiple imputation technique for the estimation of multivariate probit models with missing values
In the present paper a mixed generalized estimating/pseudo-score equations (GEPSE) approach together with a distribution free multiple imputation technique is proposed for the estimation of regression and correlation structure parameters of multivariate probit models with missing values for an ordered categorical time invariant variable. Furthermore, a generalization of the squared trace correlation (R_T^2) for multivariate probit models, denoted as pseudo R_T^2, is proposed. A simulation study was conducted, simulating a probit model with an equicorrelation structure in the errors of an underlying regression model and using two different missing mechanisms. For a low 'true' correlation the differences between the GEPSE, a generalized estimating equations (GEE) and a maximum likelihood (ML) estimator were negligible. For a high 'true' correlation the GEPSE estimator turned out to be more efficient than the GEE and very efficient relative to the ML estimator. Furthermore, the pseudo R_T^2 was close to R_T^2 of the underlying linear model. The mixed approach is illustrated using a psychiatric data set of depressive inpatients. The results of this analysis suggest that the depression score at discharge from a psychiatric hospital and the occurrence of stressful life events seem to increase the probability of having an episode of major depression within a one-year interval after discharge. Furthermore, the correlation structure points to short-time effects on having or not having a depressive episode, not accounted for in the systematic part of the regression model.
Analysis of overhead cost behavior: Case study on decision-making approach
Cost management is one of the most significant issues in company performance and financial management, and one that any enterprise has to address both in periods of declining sales revenue and in periods of growth. In this study we designed and tested several regression models that could be suitable for cost behavior prediction and subsequent decision-making based on these predictions. We used multiple linear regression models with point estimates and with interval estimates of the model parameters. The regression models of cost behavior and their reliability were compared with respect to the quality of the collected data, for both the basic and the adjusted data. The overheads were divided into several groups of relevant costs, and their dependence on factors other than production volume alone was examined using the correlation matrix. From the results of the transformed model we believe that asymmetric cost behavior is affected by asymmetric behavior of the chosen factors. The final model was intended to represent the change in costs under a time shift of about one month. This model can be used for examining costs over a short time shift (e.g., months), and thus it is possible to demonstrate the asymmetric cost behavior called “sticky costs”. We used the model adjusted in accordance with Anderson et al. (2003), and we kept the model clearly transformed and assembled so that only those variables remained that had a statistically significant effect on the dependent variable. The limitations of these models were also defined. Finally, graphical analyses of deviations were performed to find similarities in costs across cost centres and across the examined periods. © Foundation of International Studies, 2017 © CSR, 2017
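The abstract does not give the adjusted specification, so the following is only a minimal sketch of the baseline log-change regression commonly attributed to Anderson et al. (2003): the log change in cost is regressed on the log change in revenue plus an interaction with a revenue-decrease dummy. The function name `fit_sticky_cost_model` and the synthetic series are illustrative assumptions, not the authors' data or code.

```python
import numpy as np

def fit_sticky_cost_model(cost, revenue):
    """OLS fit of d_cost = b0 + b1 * d_rev + b2 * decrease * d_rev.

    d_cost and d_rev are period-on-period log changes; decrease is 1
    when revenue fell in that period. A negative b2 indicates sticky
    costs: costs fall less when revenue drops than they rise when it grows.
    """
    cost = np.asarray(cost, dtype=float)
    revenue = np.asarray(revenue, dtype=float)
    d_cost = np.log(cost[1:] / cost[:-1])
    d_rev = np.log(revenue[1:] / revenue[:-1])
    decrease = (d_rev < 0).astype(float)
    design = np.column_stack([np.ones_like(d_rev), d_rev, decrease * d_rev])
    beta, *_ = np.linalg.lstsq(design, d_cost, rcond=None)
    return beta

# Synthetic monthly series (assumed values, noise-free for clarity):
# costs move 0.8% per 1% revenue increase but only 0.5% per 1% decrease.
rng = np.random.default_rng(0)
true_d_rev = rng.normal(0.0, 0.1, size=60)
true_d_cost = 0.01 + 0.8 * true_d_rev - 0.3 * np.minimum(true_d_rev, 0.0)
revenue = 100.0 * np.exp(np.concatenate([[0.0], np.cumsum(true_d_rev)]))
cost = 50.0 * np.exp(np.concatenate([[0.0], np.cumsum(true_d_cost)]))

b0, b1, b2 = fit_sticky_cost_model(cost, revenue)
```

In this sketch the response to a revenue decrease is `b1 + b2`, so the sign of `b2` is what carries the asymmetry.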
Multiple Imputation Using Gaussian Copulas
Missing observations are pervasive throughout empirical research, especially
in the social sciences. Despite multiple approaches to dealing adequately with
missing data, many scholars still fail to address this vital issue. In this
paper, we present a simple-to-use method for generating multiple imputations
using a Gaussian copula. The Gaussian copula for multiple imputation (Hoff,
2007) allows scholars to attain estimation results that have good coverage and
small bias. The use of copulas to model the dependence among variables will
enable researchers to construct valid joint distributions of the data, even
without knowledge of the actual underlying marginal distributions. Multiple
imputations are then generated by drawing observations from the resulting
posterior joint distribution and replacing the missing values. Using simulated
and observational data from published social science research, we compare
imputation via Gaussian copulas with two other widely used imputation methods:
MICE and Amelia II. Our results suggest that the Gaussian copula approach has a
slightly smaller bias, higher coverage rates, and narrower confidence intervals
compared to the other methods. This is especially true when the variables with
missing data are not normally distributed. These results, combined with
theoretical guarantees and ease of use, suggest that the approach examined
provides an attractive alternative for applied researchers undertaking multiple
imputations.
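The idea of separating the dependence structure from the marginals can be illustrated with a much simplified sketch. It is not the Bayesian sampler of Hoff (2007) used in the paper: here the copula correlation is a plug-in estimate from complete rows, so parameter uncertainty is ignored, and the marginals are handled through ranks and empirical quantiles. The names `normal_scores` and `copula_impute` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal CDF and its inverse

def normal_scores(column):
    """Rank-based normal scores for observed entries; NaNs stay NaN."""
    scores = np.full(column.shape, np.nan)
    observed = ~np.isnan(column)
    n_obs = int(observed.sum())
    ranks = column[observed].argsort().argsort() + 1  # ranks 1..n_obs
    scores[observed] = [STD_NORMAL.inv_cdf(float(r) / (n_obs + 1)) for r in ranks]
    return scores

def copula_impute(data, n_imputations=5, seed=0):
    """Return a list of completed copies of `data` (NaN marks missing)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    z = np.column_stack([normal_scores(data[:, j]) for j in range(data.shape[1])])
    complete = ~np.isnan(z).any(axis=1)
    corr = np.corrcoef(z[complete], rowvar=False)  # plug-in, complete rows only
    completed = []
    for _ in range(n_imputations):
        filled = data.copy()
        for i in np.flatnonzero(~complete):
            miss = np.flatnonzero(np.isnan(z[i]))
            obs = np.flatnonzero(~np.isnan(z[i]))
            s_mm = corr[np.ix_(miss, miss)]
            if obs.size:
                # Conditional normal distribution of missing given observed scores.
                s_mo = corr[np.ix_(miss, obs)]
                s_oo = corr[np.ix_(obs, obs)]
                mean = s_mo @ np.linalg.solve(s_oo, z[i, obs])
                cov = s_mm - s_mo @ np.linalg.solve(s_oo, s_mo.T)
            else:
                mean, cov = np.zeros(miss.size), s_mm
            draw = rng.multivariate_normal(mean, cov)
            for j, z_new in zip(miss, draw):
                # Map the latent draw back through the empirical marginal.
                column = data[:, j]
                u = STD_NORMAL.cdf(float(z_new))
                filled[i, j] = np.quantile(column[~np.isnan(column)], u)
        completed.append(filled)
    return completed

# Synthetic example with a skewed marginal (assumed data).
rng = np.random.default_rng(1)
x = rng.exponential(size=200)
y = x + rng.normal(scale=0.2, size=200)
data = np.column_stack([x, y])
data[:40, 1] = np.nan
imputed = copula_impute(data, n_imputations=3)
```

Because imputed values are empirical quantiles of the observed column, they always stay within the observed range, which is one reason a full treatment models the marginals and the parameter uncertainty more carefully.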
Real-time localised forecasting of the Madden-Julian Oscillation using neural network models
Existing statistical forecast models of the Madden-Julian Oscillation (MJO) are generally of very low order and predict the evolution of a small number (typically two) of principal components (PCs). While such models are skilful up to a lead time of 25 days, by design they only predict the very largest-scale features of the MJO. Here we present a higher-order MJO statistical forecast model that is able to predict MJO variability on smaller, more localised scales, which will be of more direct benefit to national weather agencies and regional government planning. The model is based on daily outgoing long-wave radiation (OLR) data that are intraseasonally filtered using a recently developed technique of empirical mode decomposition that can be used in real time. A standard truncated PC analysis is then used to isolate the maximum amount of variance in a finite number of modes. The evolution of these modes is then forecast using a neural network model, which does not suffer from the parametrisation problems of other statistical forecast techniques when applied to a higher number of modes. Compared to a standard 2-PC model, the higher-order PC model showed improved skill over the whole MJO domain, with substantial improvements over the western Pacific, Arabian Sea, Bay of Bengal, South China Sea and Philippine Sea.
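The truncation step can be sketched as follows, assuming a filtered field stored as a (time, space) array. The forecast step here is a plain least-squares linear map, used only as a stand-in for the neural network described in the abstract; `truncated_pca`, `fit_linear_step` and the synthetic propagating wave are illustrative assumptions.

```python
import numpy as np

def truncated_pca(field, n_modes):
    """PCA of a (time, space) field via SVD, keeping the leading modes.

    Returns the PC time series (time, n_modes), the spatial patterns
    (n_modes, space) and the fraction of variance they explain.
    """
    anomalies = field - field.mean(axis=0)
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]
    patterns = vt[:n_modes]
    explained = (s[:n_modes] ** 2).sum() / (s ** 2).sum()
    return pcs, patterns, explained

def fit_linear_step(pcs, lead=1):
    """Least-squares map from PCs at time t to PCs at time t + lead.

    A linear stand-in for the neural network forecast model.
    """
    coef, *_ = np.linalg.lstsq(pcs[:-lead], pcs[lead:], rcond=None)
    return coef

# Synthetic eastward-propagating wave: period 50 steps, one wavelength
# across 40 grid points (assumed values, not OLR data).
t = np.arange(200)[:, None]
x = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)[None, :]
field = np.cos(x - 2.0 * np.pi * t / 50.0)

pcs, patterns, explained = truncated_pca(field, n_modes=2)
coef = fit_linear_step(pcs, lead=1)
one_step_error = np.abs(pcs[:-1] @ coef - pcs[1:]).max()
```

A single propagating wave needs exactly two modes, which is why the two-PC truncation captures it; more localised variability is what would require the higher-order truncation the abstract describes.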
Multiple imputation for continuous variables using a Bayesian principal component analysis
We propose a multiple imputation method based on principal component analysis
(PCA) to deal with incomplete continuous data. To reflect the uncertainty of
the parameters from one imputation to the next, we use a Bayesian treatment of
the PCA model. Using a simulation study and real data sets, the method is
compared to two classical approaches: multiple imputation based on joint
modelling and on fully conditional modelling. Contrary to the others, the
proposed method can be easily used on data sets where the number of individuals
is less than the number of variables and when the variables are highly
correlated. In addition, it provides unbiased point estimates of quantities of
interest, such as an expectation, a regression coefficient or a correlation
coefficient, with a smaller mean squared error. Furthermore, the widths of the
confidence intervals built for the quantities of interest are often smaller
whilst ensuring a valid coverage. Comment: 16 pages
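Whatever method generates the imputations, the per-data-set estimates are usually combined with Rubin's rules, which is where the confidence interval widths come from. The abstract does not name the pooling method, so the sketch below is an assumption about standard practice; `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules.

    Total variance = within-imputation variance
                     + (1 + 1/m) * between-imputation variance.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, total

# Estimates of one regression coefficient from three imputed data sets
# (made-up numbers for illustration).
pooled_estimate, total_variance = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
```

The between-imputation term is what reflects the uncertainty due to the missing values, so an imputation method that understates it yields intervals that are too narrow.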