9,939 research outputs found

    On the estimation of normal copula discrete regression models using the continuous extension and simulated likelihood

    The continuous extension of a discrete random variable is amongst the computational methods used for estimation of multivariate normal copula-based models with discrete margins. Its advantage is that the likelihood can be derived conveniently under the theory for copula models with continuous margins, but there has not been a clear analysis of the adequacy of this method. We investigate the asymptotic and small-sample efficiency of two variants of the method for estimating the multivariate normal copula with univariate binary, Poisson, and negative binomial regressions, and show that they lead to biased estimates for the latent correlations and for the univariate marginal parameters that are not regression coefficients. We implement a maximum simulated likelihood method, which is based on evaluating the multidimensional integrals of the likelihood with randomized quasi-Monte Carlo methods. Asymptotic and small-sample efficiency calculations show that our method is nearly as efficient as maximum likelihood for fully specified multivariate normal copula-based models. An illustrative example is given to show the use of our simulated likelihood method.
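    The continuous extension the abstract refers to can be sketched in a few lines: a discrete variable Y is jittered to Y* = Y - U with U ~ Uniform(0, 1), which makes Y* continuous while rounding up recovers Y exactly. This is a minimal illustration of the transformation only, not of the copula estimation itself; the Poisson margin below is an assumed example.

```python
import numpy as np

rng = np.random.default_rng(0)

def continuous_extension(y, rng):
    """Jitter a discrete variable Y to Y* = Y - U, U ~ Uniform(0, 1).

    Y* is continuous, so likelihood theory for copula models with
    continuous margins can be applied; np.ceil(Y*) recovers Y exactly.
    """
    u = rng.uniform(size=len(y))
    return y - u

# Example with an assumed Poisson margin
y = rng.poisson(lam=3.0, size=1000)
y_star = continuous_extension(y, rng)
recovered = np.ceil(y_star).astype(int)
```

Because each Y* lies in (Y - 1, Y], the jitter is information-free for the margin, which is what lets the continuous-margin machinery be reused.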

    Penalized Likelihood and Bayesian Function Selection in Regression Models

    Challenging research in various fields has driven a wide range of methodological advances in variable selection for regression models with high-dimensional predictors. In comparison, selection of nonlinear functions in models with additive predictors has been considered only more recently. Several competing suggestions have been developed at about the same time and often do not refer to each other. This article provides a state-of-the-art review on function selection, focusing on penalized likelihood and Bayesian concepts, relating various approaches to each other in a unified framework. In an empirical comparison, also including boosting, we evaluate several methods through applications to simulated and real data, thereby providing some guidance on their performance in practice.

    MIDAS: A SAS Macro for Multiple Imputation Using Distance-Aided Selection of Donors

    In this paper we describe MIDAS: a SAS macro for multiple imputation using distance-aided selection of donors, which implements an iterative predictive mean matching hot-deck for imputing missing data. This is a flexible multiple imputation approach that can handle data in a variety of formats: continuous, ordinal, and scaled. Because the imputation models are implicit, it is not necessary to specify a parametric distribution for each variable to be imputed. MIDAS also allows the user to address the sensitivity of their inferences to different assumptions concerning the missing data mechanism. An example using MIDAS to impute missing data is presented and MIDAS is compared to existing missing data software.
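    The core of a predictive mean matching hot-deck can be sketched as follows. This is a hypothetical, much-simplified single-variable analogue of what the MIDAS macro iterates, not the macro itself: fit a regression on the observed cases, predict for all cases, and impute each missing value with the observed value of a donor drawn from the k cases closest in predicted mean.

```python
import numpy as np

rng = np.random.default_rng(1)

def pmm_impute(x, y, missing, k=5, rng=rng):
    """One round of predictive mean matching (simplified sketch).

    Fit a linear regression on observed cases, predict for everyone,
    then impute each missing y with the observed y of a donor sampled
    from the k observed cases with the closest predicted means.
    """
    obs = ~missing
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    pred = X @ beta
    y_imp = y.copy()
    obs_idx = np.flatnonzero(obs)
    for i in np.flatnonzero(missing):
        d = np.abs(pred[obs_idx] - pred[i])           # distance in predicted mean
        donors = obs_idx[np.argsort(d)[:k]]           # k nearest observed cases
        y_imp[i] = y[rng.choice(donors)]              # hot-deck draw from donors
    return y_imp

# Assumed toy data with ~20% of y missing
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)
missing = rng.random(200) < 0.2
y_obs = y.copy()
y_obs[missing] = np.nan
y_imp = pmm_impute(x, y_obs, missing)
```

Because every imputed value is an actually observed value, the method never produces impossible values, which is why no parametric distribution per variable is needed.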

    Normal-Mixture-of-Inverse-Gamma Priors for Bayesian Regularization and Model Selection in Structured Additive Regression Models

    In regression models with many potential predictors, choosing an appropriate subset of covariates and their interactions at the same time as determining whether linear or more flexible functional forms are required is a challenging and important task. We propose a spike-and-slab prior structure in order to include or exclude single coefficients as well as blocks of coefficients associated with factor variables, random effects or basis expansions of smooth functions. Structured additive models with this prior structure are estimated with Markov chain Monte Carlo using a redundant multiplicative parameter expansion. We discuss shrinkage properties of the novel prior induced by the redundant parameterization, investigate its sensitivity to hyperparameter settings and compare performance of the proposed method in terms of model selection, sparsity recovery, and estimation error for Gaussian, binomial and Poisson responses on real and simulated data sets with that of component-wise boosting and other approaches.
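    The normal-mixture-of-inverse-gamma (NMIG) construction behind such spike-and-slab priors can be illustrated by drawing from the prior. All hyperparameter values below (inclusion probability, spike scale, inverse-gamma shape and scale) are illustrative assumptions, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def nmig_prior_draw(p, w=0.5, v0=0.005, a=5.0, b=25.0, rng=rng):
    """Draw p coefficients from an NMIG spike-and-slab prior (sketch).

    gamma_j selects the spike (scale v0 << 1) or the slab (scale 1),
    tau2_j ~ InvGamma(a, b), and beta_j ~ N(0, gamma_j * tau2_j).
    Spike draws are shrunk hard toward zero; slab draws are not.
    """
    gamma = np.where(rng.random(p) < w, 1.0, v0)    # slab with prob. w
    tau2 = 1.0 / rng.gamma(a, 1.0 / b, size=p)      # inverse-gamma variances
    beta = rng.normal(0.0, np.sqrt(gamma * tau2))   # conditionally normal
    return beta, gamma

beta, gamma = nmig_prior_draw(10_000)
```

The indicator gamma is what the MCMC sampler updates to include or exclude a coefficient (or, blockwise, a whole basis expansion), while the inverse-gamma layer keeps the marginal prior heavy-tailed.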

    Phylogenetic Machine Learning Methods and Application to Mammal Dental Traits and Bioclimatic Variables

    Standard machine learning procedures are based on the assumption that training and testing data are sampled independently from identical distributions. Comparative data of traits in biological species break this assumption: data instances are related by ancestry relationships, that is, by phylogeny. In this study, new machine learning procedures are presented that take phylogenetic information into account when fitting predictive models. Phylogenetic statistics for classification accuracy and error are proposed based on the concept of effective sample size. Versions of perceptron training and KNN classification are built on these metrics. Procedures for regularised PGLS regression, phylogenetic KNN regression, neural network regression and regression trees are presented. Properties of phylogenetic perceptron training and KNN regression are studied with synthetic data. Experiments demonstrate that phylogenetic perceptron training improves robustness when the phylogeny is unbalanced. Regularised PGLS and KNN regression are applied to mammal dental traits and environments, both to test the algorithms and to gain insight into the relationship between mammal teeth and the environment.
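    The simplest of these ideas, KNN prediction driven by phylogeny, can be sketched directly: neighbours are chosen by phylogenetic (e.g. cophenetic) distance on the tree rather than by distance in trait space. This is a hypothetical minimal sketch of the idea; the paper's procedures additionally weight by effective sample size.

```python
import numpy as np

def phylo_knn_predict(D, y, i, k=3):
    """Predict the trait of species i as the mean trait of its k
    phylogenetically closest relatives.

    D : (n, n) matrix of pairwise phylogenetic distances
    y : trait values for the n species
    i : index of the species whose trait is predicted
    """
    d = D[i].astype(float).copy()
    d[i] = np.inf                      # exclude the target species itself
    nn = np.argsort(d)[:k]             # k nearest neighbours on the tree
    return y[nn].mean()

# Assumed toy phylogeny: species 0-2 form a close clade, species 3 is distant
D = np.array([[0, 1, 1, 10],
              [1, 0, 1, 10],
              [1, 1, 0, 10],
              [10, 10, 10, 0]])
y = np.array([0.0, 2.0, 4.0, 100.0])
pred = phylo_knn_predict(D, y, 0, k=2)
```

Because the distant species 3 is never among the two nearest neighbours of species 0, its extreme trait value does not contaminate the prediction, which is exactly the robustness argument for phylogeny-aware neighbourhoods.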

    Variable Selection and Model Averaging in Semiparametric Overdispersed Generalized Linear Models

    We express the mean and variance terms in a double exponential regression model as additive functions of the predictors and use Bayesian variable selection to determine which predictors enter the model, and whether they enter linearly or flexibly. When the variance term is null we obtain a generalized additive model, which becomes a generalized linear model if the predictors enter the mean linearly. The model is estimated using Markov chain Monte Carlo simulation and the methodology is illustrated using real and simulated data sets.

    Nonlinear Suppression of Range Ambiguity in Pulse Doppler Radar

    Coherent pulse train processing is most commonly used in airborne pulse Doppler radar, achieving adequate transmitter/receiver isolation and excellent resolution properties while inherently inducing ambiguities in Doppler and range. Nonlinear suppression (NLS), first introduced by Palermo in 1962 using two conjugate LFM pulses, primarily aims to reduce range ambiguity, given a waveform that is nominally unambiguous in Doppler, by using interpulse and intrapulse coding (pulse compression) to discriminate received ambiguous pulse responses. By introducing a nonlinear operation on compressed (undesired) pulse responses within individual channels, ambiguous energy levels are reduced in the channel outputs. This research expands the NLS concept using discrete coding and processing. A general theory is developed showing how NLS accomplishes ambiguity surface volume removal without requiring orthogonal coding. Useful NLS code sets are generated using combinatorial simulated annealing optimization techniques; a general algorithm is developed to extend family size, code length, and number of phases (polyphase coding). An adaptive reserved code thresholding scheme is introduced to efficiently and effectively track the matched filter response of a target field over a wide dynamic range, such as is normally experienced in airborne radar systems. An evaluation model for characterizing NLS clutter suppression performance is developed; performance is characterized using measured clutter data, with analysis indicating the proposed technique performs relatively well even when large clutter cells exist.
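    The intrapulse coding and matched filtering that NLS builds on can be illustrated with a classic phase code. The sketch below compresses a length-13 Barker-coded pulse by correlating the echo with the code: the energy concentrates into a narrow mainlobe with sidelobes no larger than one code chip. This illustrates pulse compression only, not the nonlinear channel processing itself; the echo delay is an assumed example.

```python
import numpy as np

# Length-13 Barker code: binary phase code with autocorrelation
# sidelobes of magnitude at most 1 (mainlobe = code length = 13).
barker13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1.0])

# Noise-free echo: one target return at an assumed delay of 20 samples
echo = np.zeros(60)
echo[20:33] = barker13

# Matched filter = correlation of the echo with the transmitted code
compressed = np.correlate(echo, barker13, mode="full")
peak = compressed.max()    # mainlobe height equals the code length, 13
```

It is this sharp mainlobe-to-sidelobe contrast that lets coded responses from different pulses be discriminated before the nonlinear operation suppresses the undesired (range-ambiguous) ones.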