Search CORE

28,896 research outputs found

Recommended from our members

Relationships between estimated autozygosity and complex traits in the UK Biobank

Author: Evans Luke M
Johnson Emma C
Keller Matthew C
Publication venue: Digital Commons@Becker
Publication date: 01/01/2018
Field of study

<div><p>Inbreeding increases the risk of certain Mendelian disorders in humans but may also reduce fitness through its effects on complex traits and diseases. Such inbreeding depression is thought to occur due to increased homozygosity at causal variants that are recessive with respect to fitness. Until recently it has been difficult to amass large enough sample sizes to investigate the effects of inbreeding depression on complex traits using genome-wide single nucleotide polymorphism (SNP) data in population-based samples. Further, it is difficult to infer causation in analyses that relate degree of inbreeding to complex traits because confounding variables (e.g., education) may influence both the likelihood for parents to outbreed and offspring trait values. The present study used runs of homozygosity in genome-wide SNP data in up to 400,000 individuals in the UK Biobank to estimate the proportion of the autosome that exists in autozygous tracts—stretches of the genome which are identical due to a shared common ancestor. After multiple testing corrections and controlling for possible sociodemographic confounders, we found significant relationships in the predicted direction between estimated autozygosity and three of the 26 traits we investigated: age at first sexual intercourse, fluid intelligence, and forced expiratory volume in 1 second. Our findings corroborate those of several published studies. These results may imply that these traits have been associated with Darwinian fitness over evolutionary time. However, some of the autozygosity-trait relationships were attenuated after controlling for background sociodemographic characteristics, suggesting that alternative explanations for these associations have not been eliminated. Care needs to be taken in the design and interpretation of ROH studies in order to glean reliable information about the genetic architecture and evolutionary history of complex traits.</p></div

CU Scholar Institutional Repository

Crossref

Directory of Open Access Journals

Digital Commons@Becker

FigShare

Regularization Paths for Generalized Linear Models via Coordinate Descent

Author: Jerome H. Friedman
Rob Tibshirani
Trevor Hastie
Publication venue
Publication date
Field of study

We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multi- nomial regression problems while the penalties include Ã¢ÂÂ_1 (the lasso), Ã¢ÂÂ_2 (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.

Research Papers in Economics

Bayesian Item Response Modeling in R with brms and Stan

Author: Bürkner Paul-Christian
Publication venue
Publication date: 01/02/2020
Field of study

Item Response Theory (IRT) is widely applied in the human sciences to model persons' responses on a set of items measuring one or more latent constructs. While several R packages have been developed that implement IRT models, they tend to be restricted to respective prespecified classes of models. Further, most implementations are frequentist while the availability of Bayesian methods remains comparably limited. We demonstrate how to use the R package brms together with the probabilistic programming language Stan to specify and fit a wide range of Bayesian IRT models using flexible and intuitive multilevel formula syntax. Further, item and person parameters can be related in both a linear or non-linear manner. Various distributions for categorical, ordinal, and continuous responses are supported. Users may even define their own custom response distribution for use in the presented framework. Common IRT model classes that can be specified natively in the presented framework include 1PL and 2PL logistic models optionally also containing guessing parameters, graded response and partial credit ordinal models, as well as drift diffusion models of response times coupled with binary decisions. Posterior distributions of item and person parameters can be conveniently extracted and post-processed. Model fit can be evaluated and compared using Bayes factors and efficient cross-validation procedures.Comment: 54 pages, 16 figures, 3 table

arXiv.org e-Print Archive

Journal of Statistical Software

High-dimensional estimation with geometric constraints

Author: Plan Yaniv
Vershynin Roman
Yudovina Elena
Publication venue
Publication date: 18/05/2016
Field of study

Consider measuring an n-dimensional vector x through the inner product with several measurement vectors, a_1, a_2, ..., a_m. It is common in both signal processing and statistics to assume the linear response model y_i = + e_i, where e_i is a noise term. However, in practice the precise relationship between the signal x and the observations y_i may not follow the linear model, and in some cases it may not even be known. To address this challenge, in this paper we propose a general model where it is only assumed that each observation y_i may depend on a_i only through . We do not assume that the dependence is known. This is a form of the semiparametric single index model, and it includes the linear model as well as many forms of the generalized linear model as special cases. We further assume that the signal x has some structure, and we formulate this as a general assumption that x belongs to some known (but arbitrary) feasible set K. We carefully detail the benefit of using the signal structure to improve estimation. The theory is based on the mean width of K, a geometric parameter which can be used to understand its effective dimension in estimation problems. We determine a simple, efficient two-step procedure for estimating the signal based on this model -- a linear estimation followed by metric projection onto K. We give general conditions under which the estimator is minimax optimal up to a constant. This leads to the intriguing conclusion that in the high noise regime, an unknown non-linearity in the observations does not significantly reduce one's ability to determine the signal, even when the non-linearity may be non-invertible. Our results may be specialized to understand the effect of non-linearities in compressed sensing.Comment: This version incorporates minor revisions suggested by referee

arXiv.org e-Print Archive

CiteSeerX

Point process-based modeling of multiple debris flow landslides using INLA: an application to the 2009 Messina disaster

Author: Huser Raphael
Lombardo Luigi
Opitz Thomas
Publication venue
Publication date: 10/08/2017
Field of study

We develop a stochastic modeling approach based on spatial point processes of log-Gaussian Cox type for a collection of around 5000 landslide events provoked by a precipitation trigger in Sicily, Italy. Through the embedding into a hierarchical Bayesian estimation framework, we can use the Integrated Nested Laplace Approximation methodology to make inference and obtain the posterior estimates. Several mapping units are useful to partition a given study area in landslide prediction studies. These units hierarchically subdivide the geographic space from the highest grid-based resolution to the stronger morphodynamic-oriented slope units. Here we integrate both mapping units into a single hierarchical model, by treating the landslide triggering locations as a random point pattern. This approach diverges fundamentally from the unanimously used presence-absence structure for areal units since we focus on modeling the expected landslide count jointly within the two mapping units. Predicting this landslide intensity provides more detailed and complete information as compared to the classically used susceptibility mapping approach based on relative probabilities. To illustrate the model's versatility, we compute absolute probability maps of landslide occurrences and check its predictive power over space. While the landslide community typically produces spatial predictive models for landslides only in the sense that covariates are spatially distributed, no actual spatial dependence has been explicitly integrated so far for landslide susceptibility. Our novel approach features a spatial latent effect defined at the slope unit level, allowing us to assess the spatial influence that remains unexplained by the covariates in the model

arXiv.org e-Print Archive

HAL Descartes

Fused kernel-spline smoothing for repeatedly measured outcomes in a generalized partially linear model with functional single index

Author: Jiang Fei
Ma Yanyuan
Wang Yuanjia
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2015
Field of study

We propose a generalized partially linear functional single index risk score model for repeatedly measured outcomes where the index itself is a function of time. We fuse the nonparametric kernel method and regression spline method, and modify the generalized estimating equation to facilitate estimation and inference. We use local smoothing kernel to estimate the unspecified coefficient functions of time, and use B-splines to estimate the unspecified function of the single index component. The covariance structure is taken into account via a working model, which provides valid estimation and inference procedure whether or not it captures the true covariance. The estimation method is applicable to both continuous and discrete outcomes. We derive large sample properties of the estimation procedure and show a different convergence rate for each component of the model. The asymptotic properties when the kernel and regression spline methods are combined in a nested fashion has not been studied prior to this work, even in the independent data case.Comment: Published at http://dx.doi.org/10.1214/15-AOS1330 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

PubMed Central

HKU Scholars Hub