A sparse conditional Gaussian graphical model for analysis of genetical genomics data
Genetical genomics experiments are now routinely conducted to measure
both the genetic markers and gene expression data on the same subjects. The
gene expression levels are often treated as quantitative traits and are subject
to standard genetic analysis in order to identify the gene expression
quantitative loci (eQTL). However, the genetic architecture for many gene
expressions may be complex, and poorly estimated genetic architecture may
compromise the inferences of the dependency structures of the genes at the
transcriptional level. In this paper we introduce a sparse conditional Gaussian
graphical model for studying the conditional independence relationships among a
set of gene expressions adjusting for possible genetic effects where the gene
expressions are modeled with seemingly unrelated regressions. We present an
efficient coordinate descent algorithm to obtain the penalized estimation of
both the regression coefficients and the sparse concentration matrix. The
corresponding graph can be used to determine the conditional independence among
a group of genes while adjusting for shared genetic effects. Simulation
experiments and asymptotic convergence rates and sparsistency are used to
justify our proposed methods. By sparsistency, we mean the property that all
parameters that are zero are actually estimated as zero with probability
tending to one. We apply our methods to the analysis of a yeast eQTL data set
and demonstrate that the conditional Gaussian graphical model leads to a more
interpretable gene network than a standard Gaussian graphical model based on
gene expression data alone.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS494 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
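The penalized estimation described in this abstract can be sketched as follows. The notation is an assumption made for illustration (the listing gives no symbols): y_i is the vector of gene expressions, x_i the vector of genetic markers, and lambda and rho are hypothetical tuning parameters for the two L1 penalties.

```latex
% Assumed notation: y_i in R^p (expressions), x_i in R^q (markers),
% \Gamma is the p x q matrix of regression coefficients,
% \Theta is the p x p concentration (inverse covariance) matrix.
y_i \mid x_i \sim \mathcal{N}\!\left(\Gamma x_i,\; \Theta^{-1}\right), \qquad i = 1, \dots, n,
\\
(\hat{\Gamma}, \hat{\Theta}) = \arg\min_{\Gamma,\; \Theta \succ 0}
\Big\{ -\log\det\Theta + \operatorname{tr}\!\big(S_{\Gamma}\,\Theta\big)
+ \lambda \sum_{s \neq t} |\theta_{st}| + \rho \sum_{j,k} |\gamma_{jk}| \Big\},
\\
S_{\Gamma} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \Gamma x_i)(y_i - \Gamma x_i)^{\top}.
```

Under this formulation, a zero off-diagonal entry of the concentration matrix corresponds to a missing edge in the graph, that is, two genes that are conditionally independent given the other genes and the markers.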
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning in heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology and for its applicability to real-world
biomedical knowledge discovery.
Comment: 10 pages
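The role of an edge-type transition matrix in generating walks for node embedding can be illustrated with a minimal sketch. Everything below is assumed for illustration: the toy graph, the matrix values (which the abstract says are trained by Expectation-Maximization, not shown here), and the function name `edge_type_walk`. Any further walk biases the published model may use are omitted.

```python
import random

# Hypothetical heterogeneous graph: node -> list of (neighbor, edge_type).
GRAPH = {
    "geneA": [("drugX", 0), ("geneB", 1)],
    "geneB": [("geneA", 1), ("diseaseY", 2)],
    "drugX": [("geneA", 0), ("diseaseY", 2)],
    "diseaseY": [("geneB", 2), ("drugX", 2)],
}

# Assumed edge-type transition matrix: M[prev_type][next_type] is the weight
# of following an edge of next_type right after an edge of prev_type.
M = [
    [0.2, 0.5, 0.3],
    [0.4, 0.2, 0.4],
    [0.3, 0.3, 0.4],
]

def edge_type_walk(graph, matrix, start, length, rng):
    """Random walk whose next edge is weighted by the previous edge's type."""
    walk = [start]
    prev_type = None
    while len(walk) < length:
        edges = graph[walk[-1]]
        if not edges:
            break
        if prev_type is None:
            # First step: no previous edge type, so choose uniformly.
            weights = [1.0] * len(edges)
        else:
            weights = [matrix[prev_type][etype] for _, etype in edges]
        nxt, prev_type = rng.choices(edges, weights=weights, k=1)[0]
        walk.append(nxt)
    return walk

walk = edge_type_walk(GRAPH, M, "geneA", 6, random.Random(0))
print(walk)
```

Walks produced this way would then be fed to a skip-gram style model trained with stochastic gradient descent to obtain the node embeddings.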
A Unifying View on Recombination Spaces and Abstract Convex Evolutionary Search
This is the author accepted manuscript. The final version is available from Springer via the DOI in this record.
Proceedings of EvoCOP 2019 - 19th European Conference on Evolutionary Computation, 24-26 April 2019, Leipzig, Germany.
Previous work proposed to unify an algebraic theory of fitness landscapes and a geometric framework of evolutionary algorithms (EAs). One of the main goals behind this unification is to develop an analytical method that verifies whether a problem's landscape belongs to certain classes of abstract convex landscapes, on which certain recombination-based EAs (without mutation) have polynomial runtime. This paper advances this unification by showing that: (a) crossovers can be formally classified according to geometric or algebraic axiomatic properties; and (b) the population behaviour induced by certain crossovers in recombination-based EAs can be formalised in the geometric and algebraic theories. These results contribute significantly to the basis of an integrated geometric-algebraic framework with which to analyse recombination spaces and recombination-based EAs.
Population-based incremental learning with associative memory for dynamic environments
Copyright © 2007 IEEE. Reprinted from IEEE Transactions on Evolutionary Computation.
In recent years there has been growing interest in studying evolutionary algorithms (EAs) for dynamic optimization problems (DOPs) due to their importance in real-world applications. Several approaches, such as the memory and multiple population schemes, have been developed for EAs to address dynamic problems. This paper investigates the application of the memory scheme to population-based incremental learning (PBIL) algorithms, a class of EAs, for DOPs. A PBIL-specific associative memory scheme, which stores the best solutions together with the corresponding environmental information, is investigated as a way to improve the adaptability of PBILs in dynamic environments. The interactions between the memory scheme and the random immigrants, multi-population, and restart schemes for PBILs in dynamic environments are also investigated. In order to better test the performance of memory schemes for PBILs and other EAs in dynamic environments, this paper also proposes a dynamic environment generator that can systematically generate dynamic environments of different difficulty with respect to memory schemes. Using this generator, a series of dynamic environments is generated and experiments are carried out to compare the performance of the investigated algorithms. The experimental results show that the proposed memory scheme is efficient for PBILs in dynamic environments, and also indicate that different interactions exist between the memory scheme and the random immigrants and multi-population schemes for PBILs in different dynamic environments.
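A minimal sketch of the idea may help: PBIL maintains a probability vector, samples a population from it, and shifts the vector toward the best sample. The associative memory below stores (best sample, probability vector) pairs and, after an environment change, recalls the vector whose stored sample scores best in the new environment. The bit-matching fitness function, the parameter values, and all function names are assumptions for illustration and are not taken from the paper.

```python
import random

def fitness(bits, target):
    """Toy fitness (assumed): number of bits matching the environment's target."""
    return sum(b == t for b, t in zip(bits, target))

def sample(p, rng):
    """Draw one bit string from probability vector p."""
    return [1 if rng.random() < pi else 0 for pi in p]

def pbil_step(p, target, rng, pop_size=30, lr=0.1):
    """One PBIL generation: sample, pick the best, move p toward it."""
    pop = [sample(p, rng) for _ in range(pop_size)]
    best = max(pop, key=lambda s: fitness(s, target))
    new_p = [(1 - lr) * pi + lr * b for pi, b in zip(p, best)]
    return new_p, best

def recall(memory, current_best, target):
    """Return the stored probability vector whose sample is fittest in the
    new environment, if that sample beats current_best; otherwise None."""
    if not memory:
        return None
    stored_sample, stored_p = max(memory, key=lambda e: fitness(e[0], target))
    if fitness(stored_sample, target) > fitness(current_best, target):
        return list(stored_p)
    return None

rng = random.Random(0)
n = 16
env_a, env_b = [1] * n, [0] * n   # two hypothetical environments
p = [0.5] * n
memory = []

for _ in range(60):               # adapt to environment A
    p, best = pbil_step(p, env_a, rng)
memory.append((best, list(p)))    # store sample with its probability vector

for _ in range(60):               # environment switches to B
    p, best = pbil_step(p, env_b, rng)

restored = recall(memory, best, env_a)   # environment returns to A
if restored is not None:
    p = restored
```

Storing the probability vector alongside the sample is what makes the memory associative in this sketch: recalling it restores the search distribution, not just one solution.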
Multitasking Evolutionary Algorithm Based on Adaptive Seed Transfer for Combinatorial Problem
Evolutionary computing (EC) is widely used for combinatorial
optimization problems (COPs). Traditional EC methods can only solve a single
task in a single run, while real-life scenarios often need to solve multiple
COPs simultaneously. In recent years, evolutionary multitasking optimization
(EMTO) has become an emerging topic in the EC community, and many methods have
been designed to deal with multiple COPs concurrently by exchanging
knowledge. However, many-task optimization, cross-domain knowledge transfer,
and negative transfer are still significant challenges in this field. A new
evolutionary multitasking algorithm based on adaptive seed transfer (MTEA-AST)
is developed for multitasking COPs in this work. First, a dimension unification
strategy is proposed to unify the dimensions of different tasks. Then, an
adaptive task selection strategy is designed to capture the similarity between
the target task and other online optimization tasks. The calculated similarity
is exploited to select suitable source tasks for the target one and determine
the transfer strength. Next, a task transfer strategy is established to select
seeds from source tasks and correct unsuitable knowledge in seeds to suppress
negative transfer. Finally, the experimental results indicate that MTEA-AST can
adaptively transfer knowledge in both same-domain and cross-domain many-task
environments, and the proposed method performs competitively against
other state-of-the-art EMTOs in experiments consisting of four COPs.
Unifying a Geometric Framework of Evolutionary Algorithms and Elementary Landscapes Theory
Evolutionary algorithms (EAs) are randomised general-purpose strategies, inspired by natural evolution, often used for finding (near) optimal solutions to problems in combinatorial optimisation. Over the last 50 years, many theoretical approaches in evolutionary computation have been developed to analyse the performance of EAs, design EAs or measure problem difficulty via fitness landscape analysis. An open challenge is to formally explain why a general class of EAs performs better, or worse, than others on a class of combinatorial problems across representations. However, the lack of a general unified theory of EAs and fitness landscapes, across problems and representations, makes it harder to characterise pairs of general classes of EAs and combinatorial problems where good performance can be provably guaranteed. This thesis explores a unification between a geometric framework of EAs and elementary landscapes theory, tied neither to a specific representation nor to a specific problem, with complementary strengths in the analysis of population-based EAs and combinatorial landscapes. This unification is organised around three essential aspects: the search space structure induced by crossovers, the search behaviour of population-based EAs and the structure of fitness landscapes. First, this thesis builds a crossover classification to systematically compare crossovers in the geometric framework and elementary landscapes theory, revealing a shared general subclass of crossovers: geometric recombination P-structures, which cover well-known crossovers. The crossover classification is then extended to a general framework for axiomatically analysing the population behaviour induced by crossover classes on associated EAs. This shows that all EAs in the shared general class that use geometric recombination P-structures, but no mutation, always perform the same abstract form of convex evolutionary search. Finally, this thesis characterises a class of globally convex combinatorial landscapes shared by the geometric framework and elementary landscapes theory: abstract convex elementary landscapes. It is formally explained why geometric recombination P-structure EAs can be expected to outperform random search on abstract convex elementary landscapes related to low-order graph Laplacian eigenvalues. Altogether, this thesis paves the way towards a general unified theory of EAs and combinatorial fitness landscapes.
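The notion of convex search by recombination without mutation can be made concrete in one familiar special case. The sketch below assumes binary strings under Hamming distance with uniform crossover, which is only one instance of the abstract setting the thesis describes. It checks that each offspring lies on a shortest path between its parents, and that positions on which the whole initial population agrees never change.

```python
import random

def uniform_crossover(a, b, rng):
    """Each offspring bit is copied from one of the two parents at random."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def hamming(a, b):
    """Number of positions at which two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(1)
population = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 1],
]
length = len(population[0])

# Positions where every initial individual carries the same bit value.
fixed = {
    i: population[0][i]
    for i in range(length)
    if all(ind[i] == population[0][i] for ind in population)
}

for generation in range(20):
    new_population = []
    for k in range(len(population)):
        a, b = rng.sample(population, 2)
        child = uniform_crossover(a, b, rng)
        # Geometric property: the child lies in the Hamming segment [a, b].
        assert hamming(a, child) + hamming(child, b) == hamming(a, b)
        new_population.append(child)
    population = new_population

# Without mutation, the population never leaves the convex hull of the
# initial population: agreed positions keep their initial values.
stays_in_hull = all(ind[i] == v for ind in population for i, v in fixed.items())
```

No selection is applied here, so the sketch isolates the effect of recombination alone.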
Investigation into the applications of genetic algorithms to control engineering
Bibliography: pages 117-120.
This thesis report presents the results of a study carried out to determine possible uses of genetic algorithms for problems in control engineering. The thesis reviews the literature on genetics and genetic algorithms and applies the algorithms to the problems of system parameter identification and PID controller tuning. More specifically, the study had the following objectives: to investigate possible uses of genetic algorithms for the tasks of system identification and PID controller tuning; to carry out an in-depth comparison of the proposed uses with orthodox traditional engineering thinking, which is based on mathematical optimisation and empirical studies; and to draw conclusions and present the findings in the form of a thesis. Genetic algorithms are a class of artificial intelligence methods inspired by the Darwinian principles of natural selection and survival of the fittest. The algorithm encodes potential solutions into chromosome-like data structures that are evolved using genetic operators to determine the optimal solution of the problem. Fundamentally, the evolutionary nature of the algorithm is introduced through the operators called crossover and mutation. Crossover takes two strings, selects a crossing point randomly and swaps the segments of the strings on either side of the crossover point to create two new individuals. Three variations of crossover were considered in this thesis: single point crossover, two point crossover and uniform crossover. It was important that these be given careful consideration, since much of the outcome of the algorithm is influenced by both the choice of operators and the rate at which they are applied.
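The three crossover variations named in this abstract can be sketched as follows. This is a generic illustration on lists of bits, not the thesis's own implementation, and the function names are chosen here for clarity.

```python
import random

def single_point(a, b, rng):
    """Swap the tails of two parents after one random cut point."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point(a, b, rng):
    """Swap the middle segment between two random cut points.
    Requires parents of length 3 or more."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform(a, b, rng):
    """Swap each position independently with probability 0.5."""
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return child1, child2

rng = random.Random(42)
parent_a = [0] * 8
parent_b = [1] * 8
for operator in (single_point, two_point, uniform):
    c1, c2 = operator(parent_a, parent_b, rng)
    print(operator.__name__, c1, c2)
```

In each case the two children together carry exactly the genes of the two parents at every position; the operators differ only in how the swapped positions are chosen.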
Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires
The adaptive immune system recognizes antigens via an immense array of
antigen-binding antibodies and T-cell receptors, the immune repertoire. The
interrogation of immune repertoires is of high relevance for understanding the
adaptive immune response in disease and infection (e.g., autoimmunity, cancer,
HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the
quantitative and molecular-level profiling of immune repertoires thereby
revealing the high-dimensional complexity of the immune receptor sequence
landscape. Several methods for the computational and statistical analysis of
large-scale AIRR-seq data have been developed to resolve immune repertoire
complexity in order to understand the dynamics of adaptive immunity. Here, we
review the current research on (i) diversity, (ii) clustering and network,
(iii) phylogenetic and (iv) machine learning methods applied to dissect,
quantify and compare the architecture, evolution, and specificity of immune
repertoires. We summarize outstanding questions in computational immunology and
propose future directions for systems immunology towards coupling AIRR-seq with
the computational discovery of immunotherapeutics, vaccines, and
immunodiagnostics.
Comment: 27 pages, 2 figures
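As one concrete example of a diversity measure for a repertoire, the sketch below computes Hill numbers from clonotype counts. The choice of Hill numbers, the function name, and the toy counts are assumptions for illustration; the abstract names diversity methods as a category without specifying an index.

```python
import math

def hill_number(counts, q):
    """Hill diversity of order q for a list of clonotype counts.
    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit of the general formula as q approaches 1.
        return math.exp(-sum(f * math.log(f) for f in freqs))
    return sum(f ** q for f in freqs) ** (1.0 / (1.0 - q))

even_repertoire = [5, 5, 5, 5]      # four equally abundant clones
skewed_repertoire = [97, 1, 1, 1]   # one dominant expanded clone

for q in (0, 1, 2):
    print(q, hill_number(even_repertoire, q), hill_number(skewed_repertoire, q))
```

For a perfectly even repertoire every order returns the number of clones, while a clonally expanded repertoire scores lower as q increases, because higher orders weight abundant clones more heavily.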