On the estimation of variance parameters in non-standard generalised linear mixed models: Application to penalised smoothing
We present a novel method for the estimation of variance parameters in generalised linear mixed models. The method has its roots in Harville's (1977) work, but it is able to deal with models that have a precision matrix for the random-effect vector that is linear in the inverse of the variance parameters (i.e., the precision parameters). We call the method SOP (Separation of Overlapping Precision matrices). SOP is based on applying the method of successive approximations to easy-to-compute estimate updates of the variance parameters. These estimate updates have an appealing form: they are the ratio of a (weighted) sum of squares to a quantity related to effective degrees of freedom. We provide the necessary and sufficient conditions for these estimates to be strictly positive. An important application field of SOP is penalised regression estimation of models where multiple quadratic penalties act on the same regression coefficients. We discuss in detail two of those models: penalised splines for locally adaptive smoothness and for hierarchical curve data. Several data examples in these settings are presented.
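To give a feel for the kind of update the abstract describes (a sum of squares divided by an effective-degrees-of-freedom quantity, iterated by successive approximation), here is a minimal sketch for the simplest possible case: a Gaussian model with one random-effect variance. The model, function names, and implementation details are our own illustrative assumptions, not the paper's SOP algorithm.

```python
import numpy as np

def variance_updates(y, Z, n_iter=100, tol=1e-8):
    """Fixed-point updates for the two variance parameters of the simple
    Gaussian model y = Z b + e, with b ~ N(0, s2b I) and e ~ N(0, s2e I).
    Each update is a (weighted) sum of squares over an effective-degrees-
    of-freedom quantity, mirroring the form described in the abstract."""
    n, q = Z.shape
    ZtZ = Z.T @ Z
    Zty = Z.T @ y
    s2b, s2e = 1.0, 1.0                               # starting values
    for _ in range(n_iter):
        C = ZtZ / s2e + np.eye(q) / s2b               # precision of b given y
        b_hat = np.linalg.solve(C, Zty / s2e)         # BLUP of b
        ED = np.trace(np.linalg.solve(C, ZtZ / s2e))  # effective dof of b
        s2b_new = float(b_hat @ b_hat) / ED
        resid = y - Z @ b_hat
        s2e_new = float(resid @ resid) / (n - ED)
        if abs(s2b_new - s2b) + abs(s2e_new - s2e) < tol:
            s2b, s2e = s2b_new, s2e_new
            break
        s2b, s2e = s2b_new, s2e_new
    return s2b, s2e, ED
```

Both updates are guaranteed non-negative by construction, which is the practical appeal of this estimate form; the paper's contribution is extending such updates to precision matrices built from several overlapping components.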
This research was supported by the Basque Government through the BERC 2018-2021 program and by the Spanish Ministry of Economy and Competitiveness (MINECO) through BCAM Severo Ochoa excellence accreditation SEV-2013-0323 and through projects MTM2017-82379-R (funded by AEI/FEDER, UE, acronym 'AFTERAM'), MTM2014-52184-P and MTM2014-55966-P. The MRI/DTI data were collected at Johns Hopkins University and the Kennedy Krieger Institute. We are grateful to Pedro Caro and Iain Currie for useful discussions, to Martin Boer and Cajo ter Braak for their detailed reading of the paper and many suggestions, to Bas Engel for sharing his knowledge, and to the two peer referees for their constructive comments on the paper.
Penalising model component complexity: A principled, practical approach to constructing priors
In this paper, we introduce a new concept for constructing prior distributions. We exploit the natural nested structure inherent to many model components, which defines the model component to be a flexible extension of a base model. Proper priors are defined to penalise the complexity induced by deviating from the simpler base model and are formulated after the user specifies a scaling parameter for that model component, both in the univariate and the multivariate case. These priors are invariant to reparameterisations, have a natural connection to Jeffreys' priors, are designed to support Occam's razor and seem to have excellent robustness properties, all of which are highly desirable and allow us to use this approach to define default prior distributions. Through examples and theoretical results, we demonstrate the appropriateness of this approach and how it can be applied in various situations
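In outline, the construction penalises a distance from the base model; the sketch below uses notation standard in the penalised-complexity prior literature (base model at $\xi = 0$, user-defined scaling via an interpretable tail statement), which is not spelled out in this summary.

```latex
% Distance of the flexible model f(x | \xi) from the base model f(x | \xi = 0):
d(\xi) = \sqrt{2\,\mathrm{KLD}\left( f(x \mid \xi) \,\|\, f(x \mid \xi = 0) \right)}
% A constant-rate (exponential) penalty on that distance, mapped back to \xi
% by change of variables, which is what makes the prior reparameterisation
% invariant:
\pi(\xi) = \lambda\, e^{-\lambda d(\xi)} \left| \frac{\partial d(\xi)}{\partial \xi} \right|
% The rate \lambda is fixed through a user-defined scaling statement on an
% interpretable transformation Q(\xi):
\Pr\{ Q(\xi) > U \} = \alpha
```

The mode of the penalty sits at the base model, which is how these priors encode Occam's razor: complexity is allowed but must be supported by the data.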
Hierarchical array priors for ANOVA decompositions of cross-classified data
ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays that can adapt to the presence of such interactions. These prior distributions are based on a type of array-variate normal distribution, for which a covariance matrix for each factor is estimated. This prior is able to adapt to potential similarities among the levels of a factor and incorporate any such information into the estimation of the effects in which the factor appears. In the presence of such similarities, this prior is able to borrow information from well-estimated main effects and lower-order interactions to assist in the estimation of higher-order terms for which data information is limited. Published in the Annals of Applied Statistics (http://dx.doi.org/10.1214/13-AOAS685) by the Institute of Mathematical Statistics (http://www.imstat.org/aoas/).
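The array-variate normal at the heart of such priors can be illustrated in the two-factor (matrix) case, where the covariance separates into one matrix per factor. This is a generic sketch of sampling from a separable-covariance normal, with names of our own choosing, not the article's implementation.

```python
import numpy as np

def sample_matrix_normal(Sigma_row, Sigma_col, rng):
    """Draw a matrix B of two-factor interaction effects with separable
    (Kronecker) covariance: vec(B) ~ N(0, Sigma_col kron Sigma_row),
    where vec stacks columns.  Each factor's covariance matrix encodes
    similarity among its levels, so levels that behave alike in main
    effects can be shrunk together in their interactions."""
    L_row = np.linalg.cholesky(Sigma_row)
    L_col = np.linalg.cholesky(Sigma_col)
    Z = rng.standard_normal((Sigma_row.shape[0], Sigma_col.shape[0]))
    # B = L_row Z L_col', so cov(vec B) = (L_col kron L_row)(L_col kron L_row)'
    #                                   = Sigma_col kron Sigma_row.
    return L_row @ Z @ L_col.T
```

Estimating `Sigma_row` and `Sigma_col` from the data, rather than fixing them, is what lets the hierarchical prior learn which factor levels are similar and share strength across orders of effects.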
Reconstruction and Scalable Detection and Tracking of 3D Objects
The task of detecting objects in images is essential for autonomous systems to categorize, comprehend and eventually navigate or manipulate their environment. Since many applications demand not only the detection of objects but also the estimation of their exact poses, 3D CAD models can prove helpful, since they provide means for feature extraction and hypothesis refinement. This work therefore explores two paths: firstly, we look into methods to create richly textured and geometrically accurate models of real-life objects. Using these reconstructions as a basis, we investigate how to improve 3D object detection and pose estimation, focusing especially on scalability, i.e. the problem of dealing with multiple objects simultaneously
Mechanistic neutral models show that sampling biases drive the apparent explosion of early tetrapod diversity
Estimates of deep-time biodiversity typically rely on statistical methods to mitigate the impacts of sampling biases in the fossil record. However, these methods are limited by the spatial and temporal scale of the underlying data. Here we use a spatially explicit mechanistic model, based on neutral theory, to test hypotheses of early tetrapod diversity change during the late Carboniferous and early Permian, critical intervals for the diversification of vertebrate life on land. Our simulations suggest that apparent increases in early tetrapod diversity were not driven by local endemism following the 'Carboniferous rainforest collapse'. Instead, changes in face-value diversity can be explained by variation in sampling intensity through time. Our results further demonstrate the importance of accounting for sampling biases in analyses of the fossil record and highlight the vast potential of mechanistic models, including neutral models, for testing hypotheses in palaeobiology
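The paper's sampling argument can be reproduced in miniature with a deliberately crude, non-spatial neutral model (far simpler than the spatially explicit model the authors use; all parameter values and names below are illustrative): hold the simulated community fixed and vary only how many individuals are sampled.

```python
import numpy as np

def neutral_drift(J=2000, S0=50, nu=0.001, steps=20000, rng=None):
    """Minimal zero-sum neutral community: at each step one individual dies
    and is replaced either by a new species (point speciation, prob. nu) or
    by the offspring of another randomly chosen individual."""
    if rng is None:
        rng = np.random.default_rng(0)
    community = rng.integers(0, S0, size=J)  # species label per individual
    next_label = S0
    for _ in range(steps):
        i = rng.integers(J)
        if rng.random() < nu:
            community[i] = next_label        # speciation event
            next_label += 1
        else:
            community[i] = community[rng.integers(J)]  # neutral replacement
    return community

def sampled_richness(community, n_sampled, rng=None):
    """Face-value diversity: species count in a random subsample, standing
    in for the fossil record's incomplete sampling."""
    if rng is None:
        rng = np.random.default_rng(1)
    sample = rng.choice(community, size=n_sampled, replace=False)
    return len(np.unique(sample))
```

Because apparent richness rises with `n_sampled` even though the underlying community is unchanged, variation in sampling intensity alone can manufacture an apparent diversity trend, which is the null hypothesis the paper tests against the early tetrapod record.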
Bayesian Structural Causal Inference with Probabilistic Programming
Reasoning about causal relationships is central to the human experience. This evokes a natural question in our pursuit of human-like artificial intelligence: how might we imbue intelligent systems with similar causal reasoning capabilities? Better yet, how might we imbue intelligent systems with the ability to learn cause and effect relationships from observation and experimentation? Unfortunately, reasoning about cause and effect requires more than just data: it also requires partial knowledge about data generating mechanisms. Given this need, our task then as computational scientists is to design data structures for representing partial causal knowledge, and algorithms for updating that knowledge in light of observations and experiments. In this dissertation, I explore the Bayesian structural approach to causal inference, in which probability distributions over structural causal models are one such data structure, and probabilistic inference in multi-world transformations of those models is the corresponding algorithmic task. Specifically, I demonstrate that this approach has two distinct advantages over the dominant computational paradigm of causal graphical models: (i) it expands the breadth of compatible assumptions; and (ii) it seamlessly integrates with modern Bayesian modeling and inference technologies to facilitate quantification of uncertainty about causal structure and the effects of interventions.
Specifically, doing so allows the emerging and powerful technology of probabilistic programming to be brought to bear on a large and diverse set of causal inference problems. In Chapter 3, I present an example-driven pedagogical introduction to the Bayesian structural approach to causal inference, demonstrating how priors over structural causal models induce joint distributions over observed and latent counterfactual random variables, and how the resulting posterior distributions capture common motifs in causal inference. In particular, I show how various assumptions about latent confounding influence our ability to estimate causal effects from data and I provide examples of common observational and quasi-experimental designs expressed as probabilistic programs. In Chapter 4, I present an advanced application of the Bayesian structural approach for modeling hierarchical relational dependencies with latent confounders, and how to combine such assumptions with flexible Gaussian process models. In Chapter 5, I present a prototype software implementation for causal inference using probabilistic programming, accommodating a broad class of multi-source observational and experimental data. Finally, in Chapter 6, I present Simulation-Based Identifiability, a gradient-based optimization method for determining if any differentiable and bounded prior over structural causal models converges to a unique causal conclusion asymptotically
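The observational/interventional gap that motivates this machinery can be shown with a toy structural causal model written as a plain Monte Carlo program. The model and all numbers below are invented for illustration; this is not the dissertation's software or notation.

```python
import numpy as np

def confounded_scm(n_samples=200_000, seed=0):
    """Toy SCM with a latent confounder U:
         U := Bernoulli(0.5)
         X := Bernoulli(0.2 + 0.6 U)
         Y := Bernoulli(0.1 + 0.3 X + 0.4 U)
    Returns Monte Carlo estimates of (E[Y | X = 1], E[Y | do(X = 1)]).
    The interventional quantity comes from mutilating the program: X is
    set to 1 regardless of U, so only the direct effect of X remains."""
    rng = np.random.default_rng(seed)
    U = rng.random(n_samples) < 0.5
    X = rng.random(n_samples) < 0.2 + 0.6 * U
    Y = rng.random(n_samples) < 0.1 + 0.3 * X + 0.4 * U
    observational = Y[X].mean()                      # E[Y | X = 1]
    Y_do = rng.random(n_samples) < 0.1 + 0.3 + 0.4 * U   # X forced to 1
    interventional = Y_do.mean()                     # E[Y | do(X = 1)]
    return observational, interventional
```

Conditioning on X = 1 also shifts belief about the confounder U (here E[Y | X = 1] = 0.72), while intervening does not (E[Y | do(X = 1)] = 0.6); representing the model as a program makes the mutilation, and Bayesian uncertainty over the program itself, mechanical.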
Applying the Free-Energy Principle to Complex Adaptive Systems
The free energy principle is a mathematical theory of the behaviour of self-organising systems that originally gained prominence as a unified model of the brain. Since then, the theory has been applied to a plethora of biological phenomena, extending from single-celled and multicellular organisms through to niche construction and human culture, and even the emergence of life itself. The free energy principle tells us that perception and action operate synergistically to minimize an organism's exposure to surprising biological states, which are more likely to lead to decay. A key corollary of this hypothesis is active inference: the idea that all behavior involves the selective sampling of sensory data so that we experience what we expect to (in order to avoid surprises). Simply put, we act upon the world to fulfill our expectations. It is now widely recognized that the implications of the free energy principle for our understanding of the human mind and behavior are far-reaching and profound. To date, however, its capacity to extend beyond our brain, to more generally explain living and other complex adaptive systems, has only just begun to be explored. The aim of this collection is to showcase the breadth of the free energy principle as a unified theory of complex adaptive systems, whether conscious, social, living, or not
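For readers wanting the formal object behind "minimizing exposure to surprising states": the quantity minimized is the variational free energy, which upper-bounds surprise (negative log evidence). The notation below is the standard one from the variational inference literature, not defined in this summary.

```latex
% Generative model p(o, s) over observations o and hidden states s;
% q(s) is the organism's recognition (approximate posterior) density:
F[q] = \mathbb{E}_{q(s)}\left[ \ln q(s) - \ln p(o, s) \right]
     = \mathrm{D}_{\mathrm{KL}}\left[ q(s) \,\|\, p(s \mid o) \right] - \ln p(o)
% Since the KL divergence is non-negative, F upper-bounds surprise:
F[q] \ge -\ln p(o)
```

Perception corresponds to minimizing $F$ over $q$ (improving the posterior approximation), while action minimizes $F$ by changing $o$ itself, sampling the observations the organism expects.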
Using functional annotation to characterize genome-wide association results
Genome-wide association studies (GWAS) have successfully identified thousands of variants robustly associated with hundreds of complex traits, but the biological mechanisms driving these results remain elusive. Functional annotation, describing the roles of known genes and regulatory elements, provides additional information about associated variants. This dissertation explores the potential of these annotations to explain the biology behind observed GWAS results.
The first project develops a random-effects approach to genetic fine mapping of trait-associated loci. Functional annotation and estimates of the enrichment of genetic effects in each annotation category are integrated with linkage disequilibrium (LD) within each locus and GWAS summary statistics to prioritize variants with plausible functionality. Applications of this method to simulated and real data show good performance in a wider range of scenarios relative to previous approaches. The second project focuses on the estimation of enrichment by annotation categories. I derive the distribution of GWAS summary statistics as a function of annotations and LD structure and perform maximum likelihood estimation of enrichment coefficients in two simulated scenarios. The resulting estimates are less variable than previous methods, but the asymptotic theory of standard errors is often not applicable due to non-convexity of the likelihood function. In the third project, I investigate the problem of selecting an optimal set of tissue-specific annotations with greatest relevance to a trait of interest. I consider three selection criteria defined in terms of the mutual information between functional annotations and GWAS summary statistics. These algorithms correctly identify enriched categories in simulated data, but in the application to a GWAS of BMI the penalty for redundant features outweighs the modest relationships with the outcome, yielding empty selected feature sets, due to the weaker overall association and high similarity between tissue-specific regulatory features.
All three projects require little in the way of prior hypotheses regarding the mechanism of genetic effects. These data-driven approaches have the potential to illuminate unanticipated biological relationships, but are also limited by the high dimensionality of the data relative to the moderate strength of the signals under investigation. These approaches advance the set of tools available to researchers to draw biological insights from GWAS results