Populations in statistical genetic modelling and inference
What is a population? This review considers how a population may be defined
in terms of understanding the structure of the underlying genetics of the
individuals involved. The main approach is to consider statistically
identifiable groups of randomly mating individuals, which is well defined in
theory for any type of (sexual) organism. We discuss generative models using
drift, admixture and spatial structure, and the ancestral recombination graph.
These are contrasted with statistical models for inference, principal component
analysis and other 'non-parametric' methods. The relationships between these
approaches are explored with both simulated and real-data examples. The
state-of-the-art practical software tools are discussed and contrasted. We
conclude that populations are a useful theoretical construct that can be well
defined in theory and often approximately exist in practice.
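As a rough illustration of the principal component analysis mentioned in this abstract, the following sketch computes each individual's score on the leading principal component of a small genotype matrix. It is not taken from the review: the function name `first_pc_scores`, the 0/1/2 allele-count coding, and the use of plain power iteration are all assumptions made for illustration.

```python
import math

def first_pc_scores(genotypes, iterations=200):
    """Unit-normalised scores of each individual on the leading PC.

    genotypes: list of rows (one per individual) of 0/1/2 allele counts.
    Assumption: power iteration on the individual-by-individual Gram
    matrix of column-centred genotypes; the sign of the result is arbitrary.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Centre each SNP column on its mean allele count.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # n x n Gram matrix K = X X^T / m.
    k = [[sum(a * b for a, b in zip(x[i], x[j])) / m for j in range(n)]
         for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        w = [sum(k[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# Hypothetical toy data: two groups of three individuals, four SNPs.
toy = [
    [0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 2, 1],
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],
]
scores = first_pc_scores(toy)
```

In this toy data the two groups should receive scores of opposite sign, which is the sense in which PCA can reveal statistically identifiable groups without an explicit generative model.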
Methods and Algorithms for Inference Problems in Population Genetics
Inference of population history is a central problem of population genetics. The advent of large genetic data sets brings not only opportunities to develop more accurate inference methods, but also computational challenges. We therefore aim to develop accurate methods and fast algorithms for problems in population genetics.
Inference of admixture proportions is a classical statistical problem. We focus in particular on ancestry inference for an individual's ancestors. Standard methods implicitly assume that both parents of an individual have the same admixture fraction, but this is rarely the case in real data. We develop a Hidden Markov Model (HMM) framework for estimating the admixture proportions of the immediate ancestors of an individual, i.e. a type of decomposition of an individual's admixture proportions into further subsets of ancestral proportions in the ancestors. Based on a genealogical model for admixture tracts, we develop an efficient algorithm for computing the sampling probability of the genome from a single individual, as a function of the admixture proportions of the ancestors of this individual. We show that the distribution and lengths of admixture tracts in a genome contain information about the admixture proportions of the ancestors of an individual. This allows us to perform probabilistic inference of the admixture proportions of ancestors using only the genome of an extant individual.
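To make the idea of a sampling probability "as a function of the admixture proportions" concrete, here is a minimal sketch of the forward algorithm for a two-ancestry HMM along a haploid genome. It is a simplified stand-in, not the genealogical model of the thesis: the function name, the two hypothetical source populations A and B, the constant per-marker `switch` probability, and the allele-frequency emissions are all assumptions.

```python
import math

def forward_log_likelihood(alleles, freqs, alpha, switch):
    """Log-likelihood of a haploid allele sequence under a two-ancestry HMM.

    alleles: list of 0/1 observed alleles, one per marker.
    freqs:   list of (p_A, p_B): frequency of allele 1 in each hypothetical
             source population at that marker.
    alpha:   admixture proportion from population A.
    switch:  probability that ancestry is redrawn between adjacent markers
             (assumed constant; a real model would use genetic distance).
    """
    prior = (alpha, 1.0 - alpha)

    def emit(state, i):
        p = freqs[i][state]
        return p if alleles[i] == 1 else 1.0 - p

    # Initialise with the prior times the first emission, then rescale.
    f = [prior[s] * emit(s, 0) for s in (0, 1)]
    scale = sum(f)
    log_lik = math.log(scale)
    f = [v / scale for v in f]
    for i in range(1, len(alleles)):
        total = f[0] + f[1]
        nxt = []
        for s in (0, 1):
            # Either stay on the current tract or redraw ancestry from the prior.
            trans = (1.0 - switch) * f[s] + switch * prior[s] * total
            nxt.append(trans * emit(s, i))
        scale = sum(nxt)
        log_lik += math.log(scale)
        f = [v / scale for v in nxt]
    return log_lik

# Hypothetical usage: compare two candidate admixture proportions.
obs = [1] * 10
af = [(0.9, 0.1)] * 10
ll_high = forward_log_likelihood(obs, af, alpha=0.9, switch=0.1)
ll_low = forward_log_likelihood(obs, af, alpha=0.1, switch=0.1)
```

Evaluating the function over a grid of `alpha` values gives a likelihood curve for the admixture proportion; the framework described above extends this kind of calculation to the proportions of the individual's ancestors.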
To better understand populations, we further study the species delimitation problem, i.e. the problem of determining the boundary between population and species. We propose a classification-based method to assign a set of populations to a number of species. Our new method uses summary statistics generated from genetic data to classify pairs of populations as either 'same species' or 'different species'. We show that machine learning can be used for species delimitation and scales to large genomic data. It can also outperform Bayesian approaches, especially when gene flow is involved in the evolutionary process.
Medical Statistics - Current Developments in Statistical Methodology for Genetic Architecture of Complex Diseases
[no abstract available]
Using numerical plant models and phenotypic correlation space to design achievable ideotypes
Numerical plant models can predict the effect of plant trait modifications
resulting from genetic variation on plant performance, by simulating
physiological processes and their interaction with the environment.
Optimization methods complement those models to design ideotypes, i.e. ideal
values of a set of plant traits resulting in optimal adaptation for given
combinations of environment and management, mainly through the maximization of
a performance criterion (e.g. yield, light interception). As use of simulation
models gains momentum in plant breeding, numerical experiments must be
carefully engineered to provide accurate and attainable results, rooting them
in biological reality. Here, we propose a multi-objective optimization
formulation that includes a metric of performance, returned by the numerical
model, and a metric of feasibility, accounting for correlations between traits
based on field observations. We applied this approach to two contrasting
models: a process-based crop model of sunflower and a functional-structural
plant model of apple trees. In both cases, the method successfully
characterized key plant traits and identified a continuum of optimal solutions,
ranging from the most feasible to the most efficient. The present study thus
provides successful proof of concept for this enhanced modeling approach, which
identified paths for desirable trait modification, including direction and
intensity.
Comment: 25 pages, 5 figures, 2017, Plant, Cell and Environment
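The two-objective formulation described in this abstract can be sketched as follows. The abstract does not specify the feasibility metric, so this example assumes a Mahalanobis distance from the observed trait distribution (restricted to two traits for brevity); the function names and the candidate values are hypothetical.

```python
import math

def mahalanobis_2d(x, mean, cov):
    """Distance of a trait pair x from the observed trait cloud (2 traits).

    Assumption: feasibility is measured against the mean and covariance
    of field-observed traits; larger distance means less feasible.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is (name, performance, infeasibility); performance is
    maximised and infeasibility is minimised.
    """
    front = []
    for name, perf, dist in candidates:
        dominated = any(
            p >= perf and d <= dist and (p > perf or d < dist)
            for _, p, d in candidates
        )
        if not dominated:
            front.append((name, perf, dist))
    # Order from the most feasible to the most efficient solution.
    return sorted(front, key=lambda cand: cand[2])

# Hypothetical candidates: (label, simulated performance, infeasibility).
ideotypes = [("a", 1.0, 0.5), ("b", 2.0, 1.0), ("c", 1.5, 2.0), ("d", 3.0, 3.0)]
front = pareto_front(ideotypes)
```

Here candidate "c" is dropped because "b" performs better and is also more feasible, and the remaining candidates form the continuum from most feasible to most efficient that the abstract refers to.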