299 research outputs found
Determination of Nonlinear Genetic Architecture using Compressed Sensing
We introduce a statistical method that can reconstruct nonlinear genetic
models (i.e., including epistasis, or gene-gene interactions) from
phenotype-genotype (GWAS) data. The computational and data resource
requirements are similar to those necessary for reconstruction of linear
genetic models (or identification of gene-trait associations), assuming a
condition of generalized sparsity, which limits the total number of gene-gene
interactions. An example of a sparse nonlinear model is one in which a typical
locus interacts with several or even many others, but only a small subset of
all possible interactions exist. It seems plausible that most genetic
architectures fall in this category. Our method uses a generalization of
compressed sensing (L1-penalized regression) applied to nonlinear functions of
the sensing matrix. We give theoretical arguments suggesting that the method is
nearly optimal in performance, and demonstrate its effectiveness on broad
classes of nonlinear genetic models using both real and simulated human
genomes.
Comment: 20 pages, 8 figures. arXiv admin note: text overlap with
arXiv:1408.342
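As a rough illustration of the idea in the abstract above (L1-penalized regression applied to nonlinear functions of the genotype matrix), the sketch below expands toy genotypes into linear terms plus all pairwise products and fits a lasso by coordinate descent. It is a minimal sketch, not the authors' method or code: the helper names, the simulated data, the effect sizes, and the penalty value are all assumptions, and the toy uses more samples than features, so it shows only the penalized fit and not the undersampled regime that compressed sensing targets.

```python
import itertools
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def expand_features(genotypes):
    """Return feature names and rows holding linear terms plus pairwise products."""
    n_loci = len(genotypes[0])
    pairs = list(itertools.combinations(range(n_loci), 2))
    names = [(i,) for i in range(n_loci)] + pairs
    rows = [list(g) + [g[i] * g[j] for i, j in pairs] for g in genotypes]
    return names, rows


def standardize(rows):
    """Center each column and scale it to unit variance; returns a list of columns."""
    n = len(rows)
    columns = []
    for col in zip(*rows):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        scale = var ** 0.5 or 1.0  # leave constant columns at zero
        columns.append([(v - mean) / scale for v in col])
    return columns


def lasso_coordinate_descent(columns, y, penalty, sweeps=200):
    """Minimize (1/2n)||y - Xb||^2 + penalty * ||b||_1 for standardized columns."""
    n = len(y)
    mean_y = sum(y) / n
    residual = [v - mean_y for v in y]
    beta = [0.0] * len(columns)
    for _ in range(sweeps):
        for k, col in enumerate(columns):
            rho = sum(c * r for c, r in zip(col, residual)) / n + beta[k]
            new = soft_threshold(rho, penalty)
            delta = new - beta[k]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[k] = new
    return beta


# Hypothetical toy data: 6 loci coded 0/1/2, one linear effect, two interactions.
rng = random.Random(0)
n_samples, n_loci = 150, 6
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_loci)] for _ in range(n_samples)]
phenotype = [
    1.0 * g[0] + 1.5 * g[1] * g[2] - 1.0 * g[3] * g[4] + rng.gauss(0, 0.1)
    for g in genotypes
]

names, rows = expand_features(genotypes)
beta = lasso_coordinate_descent(standardize(rows), phenotype, penalty=0.05)
selected = {name: round(b, 2) for name, b in zip(names, beta) if b != 0.0}
print(selected)  # features kept by the L1 penalty, keyed by locus index tuple
```

Single-index keys such as `(0,)` denote linear terms and two-index keys such as `(1, 2)` denote interaction terms; coefficients are on the standardized feature scale.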
Quasispecies evolution in general mean-field landscapes
I consider a class of fitness landscapes, in which the fitness is a function
of a finite number of phenotypic "traits", which are themselves linear
functions of the genotype. I show that the stationary trait distribution in
such a landscape can be explicitly evaluated in a suitably defined
"thermodynamic limit", which is a combination of infinite-genome and strong
selection limits. These considerations can be applied in particular to identify
relevant features of the evolution of promoter binding sites, in spite of the
shortness of the corresponding sequences.
Comment: 6 pages, 2 figures, Europhysics Letters style (included). Finite-size
scaling analysis sketched. To appear in Europhysics Letter
An investigation into the use of genetic algorithms for shape recognition
The use of the genetic algorithm for shape recognition has been investigated in relation to features along a shape boundary contour. Various methods for encoding chromosomes were investigated, the most successful of which led to the development of a new technique to input normalised 'perceptually important point' features from the contour into a genetic algorithm. Chromosomes evolve with genes defining various ways of 'observing' different parts of the contour. The normalisation process provides the capability for multi-scale spatial frequency filtering and fine/coarse resolution of the contour features. A standard genetic algorithm was chosen for this investigation because its performance can be analysed by applying schema analysis to the genes. A new method for measurement of gene diversity has been developed. It is shown that this diversity measure can be used to direct the genetic algorithm parameters to evolve a number of 'good' chromosomes. In this way a variety of sections along the contour can be observed. A new and effective recognition technique has been developed which makes use of these 'good' chromosomes and the same fitness calculation as used in the genetic algorithm. Correct recognition can be achieved by selecting chromosomes and adjusting two thresholds, the values of which are found not to be critical. Difficulties associated with the calculation of a shape's fitness were analysed and the structure of the genes in the chromosome investigated using schema and epistatic analysis. It was shown that the behaviour of the genetic algorithm is compatible with the schema theorem of J. H. Holland. Reasons are given to explain the minimum value for the mutation probability that is required for the evolution of a number of 'good' chromosomes. Suggestions for future research are made and, in particular, it is recommended that the convergence properties of the standard genetic algorithm be investigated.
Predicting functional associations from metabolism using bi-partite network algorithms
Background: Metabolic reconstructions contain detailed information about metabolic enzymes and their reactants and products. These networks can be used to infer functional associations between metabolic enzymes. Many methods are based on the number of metabolites shared by two enzymes, or the shortest path between two enzymes. Metabolite sharing can miss associations between non-consecutive enzymes in a serial pathway, and shortest-path algorithms are sensitive to high-degree metabolites such as water and ATP that create connections between enzymes with little functional similarity.
Results: We present new, fast methods to infer functional associations in metabolic networks. A local method, the degree-corrected Poisson score, is based only on the metabolites shared by two enzymes, but uses the known metabolite degree distribution. A global method, based on graph diffusion kernels, predicts associations between enzymes that do not share metabolites. Both methods are robust to high-degree metabolites. They outperform previous methods in predicting shared Gene Ontology (GO) annotations and in predicting experimentally observed synthetic lethal genetic interactions. Including cellular compartment information improves GO annotation predictions but degrades synthetic lethal interaction prediction. These new methods perform nearly as well as computationally demanding methods based on flux balance analysis.
Conclusions: We present fast, accurate methods to predict functional associations from metabolic networks. Biological significance is demonstrated by identifying enzymes whose strong metabolic correlations are missed by conventional annotations in GO, most often enzymes involved in transport vs. synthesis of the same metabolite or other enzyme pairs that share a metabolite but are separated by conventional pathway boundaries. More generally, the methods described here may be valuable for analyzing other types of networks with long-tailed degree distributions and high-degree hubs.
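The problem with raw metabolite sharing that the abstract above describes can be seen on a tiny enzyme-metabolite bipartite graph. The sketch below compares a plain shared-metabolite count with an inverse-degree weighting. The weighting is a simple stand-in chosen for illustration and is not the paper's degree-corrected Poisson score, whose definition the abstract does not give; the enzyme set is a hand-picked toy, and the function names are hypothetical.

```python
from collections import Counter

# Toy bipartite network: each enzyme maps to the metabolites it touches.
enzyme_metabolites = {
    "hexokinase": {"glucose", "ATP", "G6P", "ADP"},
    "G6P_isomerase": {"G6P", "F6P"},
    "phosphofructokinase": {"F6P", "ATP", "F16BP", "ADP"},
    "pyruvate_kinase": {"PEP", "pyruvate", "ADP", "ATP"},
    "adenylate_kinase": {"ATP", "ADP", "AMP"},
    "glycerol_kinase": {"glycerol", "ATP", "glycerol_3_phosphate", "ADP"},
}

# Degree of a metabolite = number of enzymes that use it (ATP and ADP are hubs).
metabolite_degree = Counter(
    m for metabolites in enzyme_metabolites.values() for m in metabolites
)


def shared_count(a, b):
    """Number of metabolites the two enzymes have in common."""
    return len(enzyme_metabolites[a] & enzyme_metabolites[b])


def weighted_score(a, b):
    """Shared metabolites, each down-weighted by 1/degree (illustrative only)."""
    shared = enzyme_metabolites[a] & enzyme_metabolites[b]
    return sum(1.0 / metabolite_degree[m] for m in shared)


for pair in [("hexokinase", "G6P_isomerase"), ("hexokinase", "pyruvate_kinase")]:
    print(pair, shared_count(*pair), round(weighted_score(*pair), 2))
```

In this toy, the consecutive pair (hexokinase, G6P_isomerase) shares only G6P, while hexokinase and pyruvate_kinase share the hubs ATP and ADP; the raw count favors the hub-linked pair, and the degree weighting reverses that ordering.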
Leveraging Benchmarking Data for Informed One-Shot Dynamic Algorithm Selection
A key challenge in the application of evolutionary algorithms in practice is
the selection of an algorithm instance that best suits the problem at hand.
What complicates this decision further is that different algorithms may be best
suited for different stages of the optimization process. Dynamic algorithm
selection and configuration are therefore well-researched topics in
evolutionary computation. However, while hyper-heuristics and parameter control
studies typically assume a setting in which the algorithm needs to be chosen
while running the algorithms, without prior information, AutoML approaches such
as hyper-parameter tuning and automated algorithm configuration assume the
possibility of evaluating different configurations before making a final
recommendation. In practice, however, we are often in a middle ground between
these two settings, where we need to decide on the algorithm instance before
the run ("one-shot" setting), but where we have (possibly lots of) data
available on which we can base an informed decision.
We analyze in this work how such prior performance data can be used to infer
informed dynamic algorithm selection schemes for the solution of pseudo-Boolean
optimization problems. Our specific use-case considers a family of genetic
algorithms.
Comment: Submitted for review to GECCO'2
Efficient network-guided multi-locus association mapping with graph cuts
As an increasing number of genome-wide association studies reveal the
limitations of attempting to explain phenotypic heritability by single genetic
loci, there is growing interest in associating complex phenotypes with sets of
genetic loci. While several methods for multi-locus mapping have been proposed,
it is often unclear how to relate the detected loci to the growing knowledge
about gene pathways and networks. The few methods that take biological pathways
or networks into account are either restricted to investigating a limited
number of predetermined sets of loci, or do not scale to genome-wide settings.
We present SConES, a new efficient method to discover sets of genetic loci
that are maximally associated with a phenotype, while being connected in an
underlying network. Our approach is based on a minimum cut reformulation of the
problem of selecting features under sparsity and connectivity constraints that
can be solved exactly and rapidly.
SConES outperforms state-of-the-art competitors in terms of runtime, scales
to hundreds of thousands of genetic loci, and exhibits higher power in
detecting causal SNPs in simulation studies than existing methods. On flowering
time phenotypes and genotypes from Arabidopsis thaliana, SConES detects loci
that enable accurate phenotype prediction and that are supported by the
literature.
Matlab code for SConES is available at
http://webdav.tuebingen.mpg.de/u/karsten/Forschung/scones/
Comment: 20 pages, 6 figures, accepted at ISMB (International Conference on
Intelligent Systems for Molecular Biology) 201
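To make the minimum cut reformulation mentioned in the abstract above concrete, the sketch below selects loci by maximizing an objective of the form sum(score_i - eta over selected loci) - lam * (total weight of network edges cut between selected and unselected loci), solved as an s-t minimum cut with a small Edmonds-Karp routine. The abstract does not spell out the objective, so this formula, the parameter names `eta` and `lam`, the function names, and the toy scores are all assumptions; this is not the SConES Matlab code.

```python
from collections import deque


def min_cut_source_side(capacity, source, sink):
    """Run Edmonds-Karp max flow; return nodes still reachable from source."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    residual.setdefault(source, {})
    residual.setdefault(sink, {})
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent) - {source}
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= flow
            residual[v][u] += flow


def select_loci(scores, edges, eta, lam):
    """Build the s-t graph for the assumed objective and return selected loci."""
    capacity = {"s": {}, "t": {}}  # "s" and "t" must not be used as locus names
    for locus, score in scores.items():
        capacity.setdefault(locus, {})
        if score > eta:
            capacity["s"][locus] = score - eta  # cost of leaving a strong locus out
        elif score < eta:
            capacity[locus]["t"] = eta - score  # cost of including a weak locus
    for (i, j), weight in edges.items():
        capacity[i][j] = capacity[i].get(j, 0) + lam * weight
        capacity[j][i] = capacity[j].get(i, 0) + lam * weight
    return min_cut_source_side(capacity, "s", "t")


# Hypothetical association scores on a five-locus chain network.
scores = {"snp1": 5, "snp2": 1, "snp3": 5, "snp4": 0, "snp5": 0}
edges = {("snp1", "snp2"): 1, ("snp2", "snp3"): 1,
         ("snp3", "snp4"): 1, ("snp4", "snp5"): 1}
selected = select_loci(scores, edges, eta=2, lam=2)
print(sorted(selected))
```

In this toy, snp2 has a weak score of its own but sits between two strongly associated neighbours, so the connectivity term pulls it into the selected set along with snp1 and snp3.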
An exploration of evolutionary computation applied to frequency modulation audio synthesis parameter optimisation
With the ever-increasing complexity of sound synthesisers, there is a growing demand for automated parameter estimation and sound space navigation techniques. This thesis explores the potential for evolutionary computation to automatically map known sound qualities onto the parameters of frequency modulation synthesis. Within this exploration are original contributions in the domain of synthesis parameter estimation and, within the developed system, evolutionary computation, in the form of the evolutionary algorithms that drive the underlying optimisation process. Based upon the requirement for the parameter estimation system to deliver multiple search space solutions, existing evolutionary algorithmic architectures are augmented to enable niching, while maintaining the strengths of the original algorithms. Two novel evolutionary algorithms are proposed in which cluster analysis is used to identify and maintain species within the evolving populations. A conventional evolution strategy and cooperative coevolution strategy are defined, with cluster-orientated operators that enable the simultaneous optimisation of multiple search space solutions at distinct optima. A test methodology is developed that enables components of the synthesis matching problem to be identified and isolated, enabling the performance of different optimisation techniques to be compared quantitatively. A system is consequently developed that evolves sound matches using conventional frequency modulation synthesis models, and the effectiveness of different evolutionary algorithms is assessed and compared in application to both static and time-varying sound matching problems. Performance of the system is then evaluated by interview with expert listeners. The thesis is closed with a reflection on the algorithms and systems which have been developed, discussing possibilities for the future of automated synthesis parameter estimation techniques, and how they might be employed.