29,231 research outputs found

Penalized Likelihood and Bayesian Function Selection in Regression Models

Author: Fahrmeir Ludwig
Kneib Thomas
Scheipl Fabian
Publication date: 04/03/2013

Challenging research in various fields has driven a wide range of methodological advances in variable selection for regression models with high-dimensional predictors. In comparison, selection of nonlinear functions in models with additive predictors has been considered only more recently. Several competing suggestions have been developed at about the same time and often do not refer to each other. This article provides a state-of-the-art review on function selection, focusing on penalized likelihood and Bayesian concepts, relating various approaches to each other in a unified framework. In an empirical comparison, also including boosting, we evaluate several methods through applications to simulated and real data, thereby providing some guidance on their performance in practice.

arXiv.org e-Print Archive

CiteSeerX

Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models

Author: Fabian Scheipl
Fahrmeir L.
Hothorn T.
Lewis B.
Ludwig Fahrmeir
Polson N.
Sabanés Bové D.
Scheipl F.
Thomas Kneib
Publication venue: 'Informa UK Limited'
Publication date: 02/12/2011

Structured additive regression provides a general framework for complex Gaussian and non-Gaussian regression models, with predictors comprising arbitrary combinations of nonlinear functions and surfaces, spatial effects, varying coefficients, random effects and further regression terms. The large flexibility of structured additive regression makes function selection a challenging and important task, aiming at (1) selecting the relevant covariates, (2) choosing an appropriate and parsimonious representation of the impact of covariates on the predictor and (3) determining the required interactions. We propose a spike-and-slab prior structure for function selection that allows single coefficients, as well as blocks of coefficients representing specific model terms, to be included or excluded. A novel multiplicative parameter expansion is required to obtain good mixing and convergence properties in a Markov chain Monte Carlo simulation approach and is shown to induce desirable shrinkage properties. In simulation studies and with (real) benchmark classification data, we investigate sensitivity to hyperparameter settings and compare performance to competitors. The flexibility and applicability of our approach are demonstrated in an additive piecewise exponential model with time-varying effects for right-censored survival times of intensive care patients with sepsis. Geoadditive and additive mixed logit model applications are discussed in an extensive appendix.

arXiv.org e-Print Archive

Crossref

Variable selection for BART: An application to gene regulation

Author: Bleich Justin
George Edward I.
Jensen Shane T.
Kapelner Adam
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

We consider the task of discovering gene regulatory networks, which are defined as sets of genes and the corresponding transcription factors which regulate their expression levels. This can be viewed as a variable selection problem, potentially with high dimensionality. Variable selection is especially challenging in high-dimensional settings, where it is difficult to detect subtle individual effects and interactions between predictors. Bayesian Additive Regression Trees [BART, Ann. Appl. Stat. 4 (2010) 266-298] provides a novel nonparametric alternative to parametric regression approaches, such as the lasso or stepwise regression, especially when the number of relevant predictors is sparse relative to the total number of available predictors and the fundamental relationships are nonlinear. We develop a principled permutation-based inferential approach for determining when the effect of a selected predictor is likely to be real. Going further, we adapt the BART procedure to incorporate informed prior information about variable importance. We present simulations demonstrating that our method compares favorably to existing parametric and nonparametric procedures in a variety of data settings. To demonstrate the potential of our approach in a biological context, we apply it to the task of inferring the gene regulatory network in yeast (Saccharomyces cerevisiae). We find that our BART-based procedure is best able to recover the subset of covariates with the largest signal compared to other variable selection methods. The methods developed in this work are readily available in the R package bartMachine. Comment: Published at http://dx.doi.org/10.1214/14-AOAS755 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

The advantage of being slow: the quasi-neutral contact process

Author: de Oliveira Marcelo Martins
Dickman Ronald
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 21/07/2017

According to the competitive exclusion principle, in a finite ecosystem, extinction occurs naturally when two or more species compete for the same resources. An important question that arises is: when coexistence is not possible, which mechanisms confer an advantage to a given species against the other(s)? In general, it is expected that the species with the higher reproductive/death ratio will win the competition, but other mechanisms, such as asymmetry in interspecific competition or unequal diffusion rates, have been found to change this scenario dramatically. In this work, we examine competitive advantage in the context of quasi-neutral population models, including stochastic models with spatial structure as well as macroscopic (mean-field) descriptions. We employ a two-species contact process in which the "biological clock" of one species is a factor of $\alpha$ slower than that of the other species. Our results provide new insights into how stochasticity and competition interact to determine extinction in finite spatial systems. We find that a species with a slower biological clock has an advantage if resources are limited, winning the competition against a species with a faster clock, in relatively small systems. Periodic or stochastic environmental variations also favor the slower species, even in much larger systems. Comment: Reviewed extended versio

arXiv.org e-Print Archive

Directory of Open Access Journals

Speech Enhancement Using An MMSE Spectral Amplitude Estimator Based On A Modulation Domain Kalman Filter With A Gamma Prior

Author: Brookes D
Wang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/12/2015

In this paper, we propose a minimum mean square error spectral estimator for clean speech spectral amplitudes that uses a Kalman filter to model the temporal dynamics of the spectral amplitudes in the modulation domain. Using a two-parameter Gamma distribution to model the prior distribution of the speech spectral amplitudes, we derive closed form expressions for the posterior mean and variance of the spectral amplitudes as well as for the associated update step of the Kalman filter. The performance of the proposed algorithm is evaluated on the TIMIT core test set using the perceptual evaluation of speech quality (PESQ) measure and segmental SNR measure and is shown to give a consistent improvement over a wide range of SNRs when compared to competitive algorithms.

Spiral - Imperial College Digital Repository