    Scalable visualisation methods for modern Generalized Additive Models

    In the last two decades the growth of computational resources has made it possible to handle Generalized Additive Models (GAMs) that formerly were too costly for serious applications. However, the growth in model complexity has not been matched by improved visualisations for model development and results presentation. Motivated by an industrial application in electricity load forecasting, we identify the areas where the lack of modern visualisation tools for GAMs is particularly severe, and we address the shortcomings of existing methods by proposing a set of visual tools that a) are fast enough for interactive use, b) exploit the additive structure of GAMs, c) scale to large data sets and d) can be used in conjunction with a wide range of response distributions. All the new visual methods proposed in this work are implemented in the mgcViz R package, which can be found on the Comprehensive R Archive Network (CRAN).

    Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data

    Since most analysis software for genome-wide association studies (GWAS) currently exploits only unrelated individuals, there is a need for efficient applications that can handle general pedigree data or mixtures of both population and pedigree data. Even data sets thought to consist of only unrelated individuals may include cryptic relationships that can lead to false positives if not discovered and controlled for. In addition, family designs possess compelling advantages. They are better equipped to detect rare variants, control for population stratification, and facilitate the study of parent-of-origin effects. Pedigrees selected for extreme trait values often segregate a single gene with strong effect. Finally, many pedigrees are available as an important legacy from the era of linkage analysis. Unfortunately, pedigree likelihoods are notoriously hard to compute. In this paper we re-examine the computational bottlenecks and implement ultra-fast pedigree-based GWAS analysis. Kinship coefficients can either be based on explicitly provided pedigrees or automatically estimated from dense markers. Our strategy (a) works for random sample data, pedigree data, or a mix of both; (b) entails no loss of power; (c) allows for any number of covariate adjustments, including correction for population stratification; (d) allows for testing SNPs under additive, dominant, and recessive models; and (e) accommodates both univariate and multivariate quantitative traits. On a typical personal computer (6 CPU cores at 2.67 GHz), analyzing a univariate HDL (high-density lipoprotein) trait from the San Antonio Family Heart Study (935,392 SNPs on 1357 individuals in 124 pedigrees) takes less than 2 minutes and 1.5 GB of memory. Complete multivariate QTL analysis of the three time-points of the longitudinal HDL multivariate trait takes less than 5 minutes and 1.5 GB of memory.
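
The abstract mentions that kinship coefficients can be estimated from dense markers but does not say which estimator is used. As a rough illustration only, the sketch below uses a common standardised-genotype (genomic relationship) estimator, halved to give kinship; this is an assumed stand-in, not necessarily the paper's method. The function name `estimate_kinship` and the 0/1/2 genotype coding are hypothetical choices made for this example.

```python
def estimate_kinship(genotypes):
    """Estimate pairwise kinship from SNP genotypes.

    genotypes: list of per-individual lists of allele counts (0, 1 or 2),
    one entry per SNP. This is an illustrative estimator (assumption),
    not the one used in the paper.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Sample allele frequency of each SNP.
    freqs = [sum(g[k] for g in genotypes) / (2 * n) for k in range(m)]

    # Monomorphic SNPs carry no information and would divide by zero.
    used = [k for k in range(m) if 0.0 < freqs[k] < 1.0]
    if not used:
        raise ValueError("no polymorphic SNPs available")

    kin = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            total = 0.0
            for k in used:
                p = freqs[k]
                # Product of centred genotypes, scaled by the SNP variance.
                total += ((genotypes[i][k] - 2 * p)
                          * (genotypes[j][k] - 2 * p)
                          / (2 * p * (1 - p)))
            # The genomic relationship is total / len(used);
            # kinship is half of that.
            kin[i][j] = kin[j][i] = total / (2 * len(used))
    return kin


if __name__ == "__main__":
    # Toy data: individuals 0 and 1 share identical genotypes.
    toy = [
        [0, 1, 2, 1],
        [0, 1, 2, 1],
        [2, 1, 0, 1],
        [1, 2, 1, 0],
    ]
    for row in estimate_kinship(toy):
        print(["%.3f" % value for value in row])
```

With real data the allele frequencies would normally come from a much larger sample, and the double loop would be replaced by a matrix product; the plain-Python version is kept here only for readability.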

    A goodness-of-fit test for parametric and semi-parametric models in multiresponse regression

    We propose an empirical likelihood test that is able to test the goodness of fit of a class of parametric and semi-parametric multiresponse regression models. The class includes as special cases fully parametric models; semi-parametric models, like the multiindex and the partially linear models; and models with shape constraints. Another feature of the test is that it allows both the response variable and the covariate to be multivariate, which means that multiple regression curves can be tested simultaneously. The test also allows the presence of infinite-dimensional nuisance functions in the model to be tested. It is shown that the empirical likelihood test statistic is asymptotically normally distributed under certain mild conditions and permits a wild bootstrap calibration. Despite the large size of the class of models to be considered, the empirical likelihood test enjoys good power properties against departures from a hypothesized model within the class. Comment: Published at http://dx.doi.org/10.3150/09-BEJ208 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
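
To make the idea of wild bootstrap calibration concrete, here is a minimal sketch under stated assumptions. It does not implement the empirical likelihood statistic of the paper; instead it uses a simple cumulative-residual statistic for a straight-line null model, with Rademacher (plus or minus one) multipliers. All function names (`ols_fit`, `cusum_stat`, `wild_bootstrap_pvalue`) are hypothetical and exist only for this example.

```python
import random


def ols_fit(x, y):
    """Least-squares intercept and slope for the null model y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def cusum_stat(x, y):
    """Largest absolute cumulative sum of residuals, ordered by x.

    A stand-in goodness-of-fit statistic (assumption), not the
    empirical likelihood statistic from the abstract.
    """
    a, b = ols_fit(x, y)
    order = sorted(range(len(x)), key=lambda i: x[i])
    running = 0.0
    worst = 0.0
    for i in order:
        running += y[i] - (a + b * x[i])
        worst = max(worst, abs(running))
    return worst / len(x) ** 0.5


def wild_bootstrap_pvalue(x, y, n_boot=499, seed=0):
    """Calibrate cusum_stat by wild bootstrap with Rademacher multipliers."""
    rng = random.Random(seed)
    a, b = ols_fit(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    observed = cusum_stat(x, y)

    exceed = 0
    for _ in range(n_boot):
        # Flip the sign of each residual independently; this keeps each
        # residual's magnitude, so heteroscedasticity is preserved.
        y_star = [f + e * rng.choice((-1.0, 1.0)) for f, e in zip(fitted, resid)]
        if cusum_stat(x, y_star) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    xs = [i / 49 for i in range(50)]
    ys = [xi ** 2 for xi in xs]  # curved truth, so the linear null is wrong
    print(wild_bootstrap_pvalue(xs, ys))
```

The design choice worth noting is that the bootstrap responses are generated from the fitted null model, so the resampled statistics approximate the null distribution even when the observed data depart from it.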