Bayesian Sparse Factor Analysis of Genetic Covariance Matrices
Quantitative genetic studies that model complex, multivariate phenotypes are
important for both evolutionary prediction and artificial selection. For
example, changes in gene expression can provide insight into developmental and
physiological mechanisms that link genotype and phenotype. However, classical
analytical techniques are poorly suited to quantitative genetic studies of gene
expression where the number of traits assayed per individual can reach many
thousand. Here, we derive a Bayesian genetic sparse factor model for estimating
the genetic covariance matrix (G-matrix) of high-dimensional traits, such as
gene expression, in a mixed effects model. The key idea of our model is that we
need only consider G-matrices that are biologically plausible. An organism's
entire phenotype is the result of processes that are modular and have limited
complexity. This implies that the G-matrix will be highly structured. In
particular, we assume that a limited number of intermediate traits (or factors,
e.g., variations in development or physiology) control the variation in the
high-dimensional phenotype, and that each of these intermediate traits is
sparse -- affecting only a few observed traits. The advantages of this approach
are two-fold. First, sparse factors are interpretable and provide biological
insight into mechanisms underlying the genetic architecture. Second, enforcing
sparsity helps prevent sampling errors from swamping out the true signal in
high-dimensional data. We demonstrate the advantages of our model on simulated
data and in an analysis of a published Drosophila melanogaster gene expression
data set. Comment: 35 pages, 7 figures
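As a rough illustration of the structure described in the abstract above, the sketch below builds a genetic covariance matrix from a small number of sparse factors plus trait-specific variances. The decomposition G = L L^T + diag(specific variances), the loading values, and the names LOADINGS, SPECIFIC_VAR, and genetic_covariance are assumptions chosen for this toy example; they are not taken from the paper and need not match its parameterization or priors.

```python
# Hypothetical toy example: 6 observed traits, 2 latent factors.
# Each row of LOADINGS is one trait; each column is one factor.
# Most entries are zero, so each factor affects only a few traits.
LOADINGS = [
    [0.9, 0.0],
    [0.7, 0.0],
    [0.5, 0.0],
    [0.0, 0.8],
    [0.0, 0.6],
    [0.0, 0.0],
]

# Assumed trait-specific genetic variances not explained by the factors.
SPECIFIC_VAR = [0.1, 0.1, 0.1, 0.1, 0.1, 0.2]


def genetic_covariance(loadings, specific_var):
    """Return G = L L^T + diag(specific_var) as a list of lists."""
    p = len(loadings)
    g = [
        [sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(p)]
        for i in range(p)
    ]
    for i in range(p):
        g[i][i] += specific_var[i]
    return g


G = genetic_covariance(LOADINGS, SPECIFIC_VAR)

# Traits that share no factor have zero genetic covariance,
# so G inherits a block structure from the sparse loadings.
for row in G:
    print(" ".join(f"{value:5.2f}" for value in row))
```

In this toy setting a 6 x 6 matrix is described by 12 loadings, most of them zero, plus 6 variances, which is the kind of reduction in free parameters that makes high-dimensional estimation tractable.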
A Latent Variable Approach to Multivariate Quantitative Trait Loci
A novel approach based on latent variable modelling is presented for the analysis of multivariate quantitative and qualitative trait loci. The approach is general in the sense that it enables the joint analysis of many kinds of quantitative and qualitative traits (including count data and censored traits) in a single modelling framework. In the framework, the observations are modelled as functions of latent variables, which are then affected by quantitative trait loci. Separating the analysis in this way means that measurement errors in the phenotypic observations can be included easily in the model, providing robust inferences. The performance of the method is illustrated using two real multivariate datasets, from barley and Scots pine.
Job Satisfaction as a Reflection of Disposition: A Multiple Source Causal Analysis
Dispositional sources of job satisfaction have been the subject of recent research in the organizational sciences. Problems in much of this research, which limit the conclusions one can draw from the results, are discussed. This study makes a distinction between affective disposition, defined as the tendency to respond generally to the environment in an affect-based manner, and subjective well-being, the level of overall happiness and satisfaction an individual has with his or her life. Affective disposition was hypothesized to lead to subjective well-being, and subjective well-being and job satisfaction were hypothesized to be mutually causative. A causal model was tested employing two different sources of data: self-reports and "significant other" evaluations. This biangulation of sources of data and estimation of nonrecursive relationships removes some problems often assumed to plague results based on single source data. Results indicated support for the overall hypothesized causal model and supported a dispositional influence on job attitudes. The influences are more complex than past research has suggested.
Dissecting high-dimensional phenotypes with Bayesian sparse factor analysis of genetic covariance matrices.
Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousand. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse - affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set
A bi-dimensional finite mixture model for longitudinal data subject to dropout
In longitudinal studies, subjects may be lost to follow-up, or miss some of
the planned visits, leading to incomplete response sequences. When the
probability of non-response, conditional on the available covariates and the
observed responses, still depends on unobserved outcomes, the dropout mechanism
is said to be non-ignorable. A common objective is to build a reliable
association structure to account for dependence between the longitudinal and
the dropout processes. Starting from the existing literature, we introduce a
random coefficient based dropout model where the association between outcomes
is modeled through discrete latent effects. These effects are outcome-specific
and account for heterogeneity in the univariate profiles. Dependence between
profiles is introduced by using a bi-dimensional representation for the
corresponding distribution. In this way, we define a flexible latent class
structure that efficiently describes both dependence within the two
margins of interest and dependence between them. By using this representation
we show that, unlike standard (unidimensional) finite mixture models, the
non-ignorable dropout model properly nests its ignorable counterpart. We detail the
proposed modeling approach by analyzing data from a longitudinal study on the
dynamics of cognitive functioning in the elderly. Further, the effects of
assumptions about non-ignorability of the dropout process on model parameter
estimates are (locally) investigated using the index of (local) sensitivity to
non-ignorability.
A biomarker based on gene expression indicates plant water status in controlled and natural environments
Plant or soil water status is required in many scientific fields to
understand plant responses to drought. Because the transcriptomic response to
abiotic conditions, such as water deficit, reflects plant water status, genomic
tools could be used to develop a new type of molecular biomarker. Using the
sunflower (Helianthus annuus L.) as a model species to study the transcriptomic
response to water deficit both in greenhouse and field conditions, we
specifically identified three genes that showed an expression pattern highly
correlated to plant water status as estimated by the pre-dawn leaf water
potential, fraction of transpirable soil water, soil water content or fraction
of total soil water in controlled conditions. We developed a generalized linear
model to estimate these classical water status indicators from the expression
levels of the three selected genes under controlled conditions. This estimation
was independent of the four tested genotypes and the stage (pre- or
post-flowering) of the plant. We further validated this gene expression
biomarker under field conditions for four genotypes in three different trials,
over a large range of water status, and we were able to correct their
expression values for a large diurnal sampling period. Comment: Plant, Cell & Environment, 201
Analysis of measurement and simulation errors in structural system identification by observability techniques
This is the peer reviewed version of the following article: [Lei, J., Lozano-Galant, J. A., Nogal, M., Xu, D., and Turmo, J. (2017) Analysis of measurement and simulation errors in structural system identification by observability techniques. Struct. Control Health Monit., 24: . doi: 10.1002/stc.1923.], which has been published in final form at http://onlinelibrary.wiley.com/wol1/doi/10.1002/stc.1923/full. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving. During the process of structural system identification, errors are unavoidable. This paper analyzes the effects of measurement and simulation errors in structural system identification based on observability techniques. To illustrate the symbolic approach of this method, a simply supported beam is analyzed step-by-step. This analysis provides, for the very first time in the literature, the parametric equations of the estimated parameters. The effects of several factors, such as errors in a particular measurement or in the whole measurement set, load location, measurement location or sign of the errors, on the accuracy of the identification results are also investigated. It is found that error in a particular measurement increases the errors of individual estimations, and this effect can be significantly mitigated by introducing random errors in the whole measurement set. The propagation of simulation errors when using observability techniques is illustrated by two structures with different measurement sets and loading cases. A fluctuation of the observed parameters around the real values is proved to be a characteristic of this method. Also, it is suggested that a sufficient combination of different load cases should be utilized to avoid inaccurate estimation at the location of low curvature zones. Peer Reviewed. Postprint (author's final draft)
Overcoming Problems in the Measurement of Biological Complexity
In a genetic algorithm, fluctuations of the entropy of a genome over time are
interpreted as fluctuations of the information that the genome's organism is
storing about its environment, which is reflected in more complex organisms.
The computation of this entropy presents technical problems due to the small
population sizes used in practice. In this work we propose and test an
alternative way of measuring the entropy variation in a population by means of
algorithmic information theory, where the entropy variation between two
generational steps is the Kolmogorov complexity of the first step conditioned
on the second one. As an example application of this technique, we report
experimental differences in entropy evolution between systems in which sexual
reproduction is present or absent. Comment: 4 pages, 5 figures
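Kolmogorov complexity is not computable, so any practical use of the measure described above needs an approximation. The abstract does not say which one the authors use. The sketch below assumes a common stand-in, the size of a zlib-compressed byte string, and estimates the complexity of one generation conditioned on another as C(second + first) - C(second). The function names and the random toy "generations" are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input, a crude complexity proxy."""
    return len(zlib.compress(data, 9))


def conditional_complexity(first: bytes, second: bytes) -> int:
    """Approximate K(first | second) as C(second + first) - C(second)."""
    return compressed_size(second + first) - compressed_size(second)


rng = random.Random(0)  # fixed seed so the toy data are reproducible
gen_a = bytes(rng.randrange(256) for _ in range(2000))
gen_b = bytes(rng.randrange(256) for _ in range(2000))  # unrelated to gen_a

# Conditioning a generation on an identical copy should cost far fewer
# bytes than conditioning it on unrelated data.
same = conditional_complexity(gen_a, gen_a)
unrelated = conditional_complexity(gen_a, gen_b)
print("given identical generation:", same)
print("given unrelated generation:", unrelated)
```

A general-purpose compressor only gives an upper bound on the true complexity, so values obtained this way are best compared with one another and not read as absolute quantities.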
