47,372 research outputs found

    Fast stable direct fitting and smoothness selection for Generalized Additive Models

    Get PDF
    Existing computationally efficient methods for penalized likelihood GAM fitting employ iterative smoothness selection on working linear models (or working mixed models). Such schemes fail to converge for a non-negligible proportion of models, with failure being particularly frequent in the presence of concurvity. If smoothness selection is performed by optimizing `whole model' criteria these problems disappear, but until now attempts to do this have employed finite difference based optimization schemes which are computationally inefficient, and can suffer from false convergence. This paper develops the first computationally efficient method for direct GAM smoothness selection. It is highly stable, but by careful structuring achieves a computational efficiency that leads, in simulations, to lower mean computation times than the schemes based on working-model smoothness selection. The method also offers a reliable way of fitting generalized additive mixed models

    Testing the consistency of wildlife data types before combining them: the case of camera traps and telemetry.

    Get PDF
    Wildlife data gathered by different monitoring techniques are often combined to estimate animal density. However, methods to check whether different types of data provide consistent information (i.e., can information from one data type be used to predict responses in the other?) before combining them are lacking. We used generalized linear models and generalized linear mixed-effects models to relate camera trap probabilities for marked animals to independent space use from telemetry relocations using 2 years of data for fishers (Pekania pennanti) as a case study. We evaluated (1) camera trap efficacy by estimating how camera detection probabilities are related to nearby telemetry relocations and (2) whether home range utilization density estimated from telemetry data adequately predicts camera detection probabilities, which would indicate consistency of the two data types. The number of telemetry relocations within 250 and 500 m from camera traps predicted detection probability well. For the same number of relocations, females were more likely to be detected during the first year. During the second year, all fishers were more likely to be detected during the fall/winter season. Models predicting camera detection probability and photo counts solely from telemetry utilization density had the best or nearly best Akaike Information Criterion (AIC), suggesting that telemetry and camera traps provide consistent information on space use. Given the same utilization density, males were more likely to be photo-captured due to larger home ranges and higher movement rates. Although methods that combine data types (spatially explicit capture-recapture) make simple assumptions about home range shapes, it is reasonable to conclude that in our case, camera trap data do reflect space use in a manner consistent with telemetry data. However, differences between the 2 years of data suggest that camera efficacy is not fully consistent across ecological conditions and make the case for integrating other sources of space-use data

    A Penalty Approach to Differential Item Functioning in Rasch Models

    Get PDF
    A new diagnostic tool for the identification of differential item functioning (DIF) is proposed. Classical approaches to DIF allow to consider only few subpopulations like ethnic groups when investigating if the solution of items depends on the membership to a subpopulation. We propose an explicit model for differential item functioning that includes a set of variables, containing metric as well as categorical components, as potential candidates for inducing DIF. The ability to include a set of covariates entails that the model contains a large number of parameters. Regularized estimators, in particular penalized maximum likelihood estimators, are used to solve the estimation problem and to identify the items that induce DIF. It is shown that the method is able to detect items with DIF. Simulations and two applications demonstrate the applicability of the method

    Global and local distance-based generalized linear models

    Get PDF
    This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models first to the generalized linear model framework. Then, a nonparametric version of these models is proposed by means of local fitting. Distances between individuals are the only predictor information needed to fit these models. Therefore, they are applicable, among others, to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. An implementation is provided by the R package dbstats, which also implements other distance-based prediction methods. Supplementary material for this article is available online, which reproduces all the results of this article.Peer ReviewedPostprint (author's final draft

    Bayesian Deep Net GLM and GLMM

    Full text link
    Deep feedforward neural networks (DFNNs) are a powerful tool for functional approximation. We describe flexible versions of generalized linear and generalized linear mixed models incorporating basis functions formed by a DFNN. The consideration of neural networks with random effects is not widely used in the literature, perhaps because of the computational challenges of incorporating subject specific parameters into already complex models. Efficient computational methods for high-dimensional Bayesian inference are developed using Gaussian variational approximation, with a parsimonious but flexible factor parametrization of the covariance matrix. We implement natural gradient methods for the optimization, exploiting the factor structure of the variational covariance matrix in computation of the natural gradient. Our flexible DFNN models and Bayesian inference approach lead to a regression and classification method that has a high prediction accuracy, and is able to quantify the prediction uncertainty in a principled and convenient way. We also describe how to perform variable selection in our deep learning method. The proposed methods are illustrated in a wide range of simulated and real-data examples, and the results compare favourably to a state of the art flexible regression and classification method in the statistical literature, the Bayesian additive regression trees (BART) method. User-friendly software packages in Matlab, R and Python implementing the proposed methods are available at https://github.com/VBayesLabComment: 35 pages, 7 figure, 10 table

    Design Issues for Generalized Linear Models: A Review

    Full text link
    Generalized linear models (GLMs) have been used quite effectively in the modeling of a mean response under nonstandard conditions, where discrete as well as continuous data distributions can be accommodated. The choice of design for a GLM is a very important task in the development and building of an adequate model. However, one major problem that handicaps the construction of a GLM design is its dependence on the unknown parameters of the fitted model. Several approaches have been proposed in the past 25 years to solve this problem. These approaches, however, have provided only partial solutions that apply in only some special cases, and the problem, in general, remains largely unresolved. The purpose of this article is to focus attention on the aforementioned dependence problem. We provide a survey of various existing techniques dealing with the dependence problem. This survey includes discussions concerning locally optimal designs, sequential designs, Bayesian designs and the quantile dispersion graph approach for comparing designs for GLMs.Comment: Published at http://dx.doi.org/10.1214/088342306000000105 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org
    corecore