Constructing level-2 phylogenetic networks from triplets
Jansson and Sung showed that, given a dense set of input triplets T
(representing hypotheses about the local evolutionary relationships of triplets
of species), it is possible to determine in polynomial time whether there
exists a level-1 network consistent with T, and if so to construct such a
network. They also showed that, unlike in the case of trees (i.e. level-0
networks), the problem becomes NP-hard when the input is non-dense. Here we
extend this work by showing that, when the set of input triplets is
dense, the problem remains polynomial-time solvable for the construction of
level-2 networks. This shows that, assuming density, it is tractable to
construct plausible evolutionary histories from input triplets even when such
histories are heavily non-treelike. This further strengthens the case for the
use of triplet-based methods in the construction of phylogenetic networks. We
also show that, in the non-dense case, the level-2 problem remains NP-hard.
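As a minimal illustration of the density condition this abstract relies on: a triplet set is usually called dense when every three taxa are covered by at least one triplet. The sketch below assumes a hypothetical encoding in which the tuple (a, b, c) stands for the triplet ab|c. The function name is illustrative and does not come from the paper.

```python
from itertools import combinations

def is_dense(triplets, taxa):
    """Return True if every 3-subset of taxa is covered by at least one triplet.

    Assumed encoding (not from the paper): (a, b, c) means the triplet ab|c.
    """
    # Only the set of three taxa matters for density, not the triplet's shape.
    covered = {frozenset(t) for t in triplets}
    return all(frozenset(s) in covered for s in combinations(taxa, 3))

taxa = ["a", "b", "c", "d"]
dense = [("a", "b", "c"), ("a", "b", "d"), ("c", "d", "a"), ("c", "d", "b")]
print(is_dense(dense, taxa))      # True: all four 3-subsets are covered
print(is_dense(dense[:3], taxa))  # False: nothing covers {b, c, d}
```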
Uniqueness, intractability and exact algorithms: reflections on level-k phylogenetic networks
Phylogenetic networks provide a way to describe and visualize evolutionary
histories that have undergone so-called reticulate evolutionary events such as
recombination, hybridization or horizontal gene transfer. The level k of a
network determines how non-treelike the evolution can be, with level-0 networks
being trees. We study the problem of constructing level-k phylogenetic networks
from triplets, i.e. phylogenetic trees for three leaves (taxa). We give, for
each k, a level-k network that is uniquely defined by its triplets. We
demonstrate the applicability of this result by using it to prove that (1) for
all k of at least one it is NP-hard to construct a level-k network consistent
with all input triplets, and (2) for all k it is NP-hard to construct a level-k
network consistent with a maximum number of input triplets, even when the input
is dense. As a response to this intractability we give an exact algorithm for
constructing level-1 networks consistent with a maximum number of input
triplets.
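To make the notion of consistency concrete for the level-0 case (trees), the following sketch checks whether a rooted tree displays a triplet ab|c, using the standard criterion that some vertex must have a and b below it but not c. The nested-tuple tree encoding and the function names are assumptions made for illustration. The sketch does not handle networks of level 1 or higher.

```python
def leaf_sets(tree):
    """Return (leaves of tree, leaf sets of every vertex) for a nested-tuple tree.

    Assumed encoding: a leaf is any non-tuple label, an internal vertex is a
    tuple of its children, e.g. (("a", "b"), ("c", "d")).
    """
    if not isinstance(tree, tuple):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = leaf_sets(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found

def is_consistent(tree, triplet):
    """True if the tree displays ab|c for triplet = (a, b, c), with a, b, c distinct."""
    a, b, c = triplet
    leaves, found = leaf_sets(tree)
    if not {a, b, c} <= leaves:
        return False
    # ab|c holds exactly when some vertex lies above a and b but not above c.
    return any(a in s and b in s and c not in s for s in found)

def count_consistent(tree, triplets):
    """Number of input triplets the tree is consistent with."""
    return sum(is_consistent(tree, t) for t in triplets)

tree = (("a", "b"), ("c", "d"))
triplets = [("a", "b", "c"), ("a", "c", "b"), ("c", "d", "a")]
print(count_consistent(tree, triplets))  # 2: ab|c and cd|a hold, ac|b does not
```

Counting consistent triplets in this way corresponds to the objective of the maximisation variant, where no tree or network may fit every input triplet.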
GENESIM: genetic extraction of a single, interpretable model
Models obtained by decision tree induction techniques excel in being
interpretable. However, they can be prone to overfitting, which results in a low
predictive performance. Ensemble techniques are able to achieve a higher
accuracy. However, this comes at a cost of losing interpretability of the
resulting model. This makes ensemble techniques impractical in applications
where decision support, instead of decision making, is crucial.
To bridge this gap, we present the GENESIM algorithm that transforms an
ensemble of decision trees to a single decision tree with an enhanced
predictive performance by using a genetic algorithm. We compared GENESIM to
prevalent decision tree induction and ensemble techniques using twelve publicly
available data sets. The results show that GENESIM achieves a better predictive
performance on most of these data sets than decision tree induction techniques
and a predictive performance in the same order of magnitude as the ensemble
techniques. Moreover, the resulting model of GENESIM has a very low complexity,
making it very interpretable, in contrast to ensemble techniques. Comment: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in
Complex System
A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks
Recently much attention has been devoted to the construction of phylogenetic
networks which generalize phylogenetic trees in order to accommodate complex
evolutionary processes. Here we present an efficient, practical algorithm for
reconstructing level-1 phylogenetic networks - a type of network slightly more
general than a phylogenetic tree - from triplets. Our algorithm has been made
publicly available as the program LEV1ATHAN. It combines ideas from several
known theoretical algorithms for phylogenetic tree and network reconstruction
with two novel subroutines, namely an exponential-time exact algorithm and a
greedy algorithm, both of which are of independent theoretical interest. Most
importantly, LEV1ATHAN runs in polynomial time and always constructs a level-1
network. If the data is consistent with a phylogenetic tree, then the algorithm
constructs such a tree. Moreover, if the input triplet set is dense and, in
addition, is fully consistent with some level-1 network, it will find such a
network. The potential of LEV1ATHAN is explored by means of an extensive
simulation study and a biological data set. One of our conclusions is that
LEV1ATHAN is able to construct networks consistent with a high percentage of
input triplets, even when these input triplets are affected by a low to
moderate level of noise.
Application of multiobjective genetic programming to the design of robot failure recognition systems
We present an evolutionary approach using multiobjective genetic programming (MOGP) to derive optimal feature extraction preprocessing stages for robot failure detection. This data-driven machine learning method is compared both with conventional (nonevolutionary) classifiers and a set of domain-dependent feature extraction methods. We conclude MOGP is an effective and practical design method for failure recognition systems with enhanced recognition accuracy over conventional classifiers, independent of domain knowledge.
Phylogenetic Networks Do not Need to Be Complex: Using Fewer Reticulations to Represent Conflicting Clusters
Phylogenetic trees are widely used to display estimates of how groups of
species evolved. Each phylogenetic tree can be seen as a collection of
clusters, subgroups of the species that evolved from a common ancestor. When
phylogenetic trees are obtained for several data sets (e.g. for different
genes), their clusters often contradict each other. Consequently, the set of
all clusters of such a data set cannot be combined into a single phylogenetic
tree. Phylogenetic networks are a generalization of phylogenetic trees that can
be used to display more complex evolutionary histories, including reticulate
events such as hybridizations, recombinations and horizontal gene transfers.
Here we present the new CASS algorithm that can combine any set of clusters
into a phylogenetic network. We show that the networks constructed by CASS are
usually simpler than networks constructed by other available methods. Moreover,
we show that CASS is guaranteed to produce a network with at most two
reticulations per biconnected component, whenever such a network exists. We
have implemented CASS and integrated it in the freely available Dendroscope
software.
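The conflict between clusters described in this abstract can be sketched with the standard compatibility test: two clusters fit in the same rooted tree when they are disjoint or one contains the other, and a set of clusters fits on a single tree when all pairs are compatible. The function names below are hypothetical, and the sketch is not the CASS algorithm itself. It only detects the conflicts that make a network necessary.

```python
from itertools import combinations

def compatible(c1, c2):
    """Two clusters are compatible if they are disjoint or nested."""
    s1, s2 = set(c1), set(c2)
    return s1.isdisjoint(s2) or s1 <= s2 or s2 <= s1

def conflicting_pairs(clusters):
    """Return all pairs of clusters that cannot appear in one rooted tree."""
    return [(a, b) for a, b in combinations(clusters, 2) if not compatible(a, b)]

# Clusters collected from two hypothetical gene trees on taxa a, b, c.
clusters = [("a", "b"), ("b", "c"), ("a", "b", "c")]
print(conflicting_pairs(clusters))  # [(('a', 'b'), ('b', 'c'))]
```

An empty result would mean the clusters can be combined into a single phylogenetic tree. Any reported pair indicates that at least one reticulation is needed to represent all clusters.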
Relatedness Measures to Aid the Transfer of Building Blocks among Multiple Tasks
Multitask Learning is a learning paradigm that deals with multiple different
tasks in parallel and transfers knowledge among them. XOF, a Learning
Classifier System using tree-based programs to encode building blocks
(meta-features), constructs and collects features with rich discriminative
information for classification tasks in an observed list. This paper seeks to
facilitate the automation of feature transfer between tasks by utilising
the observed list. We hypothesise that the best discriminative features of a
classification task carry its characteristics. Therefore, the relatedness
between any two tasks can be estimated by comparing their most appropriate
patterns. We propose a multiple-XOF system, called mXOF, that can dynamically
adapt feature transfer among XOFs. This system utilises the observed list to
estimate the task relatedness. This method enables the automation of
transferring features. In terms of knowledge discovery, the resemblance
estimation provides insightful relations among multiple data. We tested
mXOF on various scenarios, e.g. representative Hierarchical Boolean problems,
classification of distinct classes in the UCI Zoo dataset, and unrelated tasks,
to validate its abilities of automatic knowledge-transfer and estimating task
relatedness. Results show that mXOF can estimate the relatedness between
multiple tasks reasonably well and thereby aid learning performance through
dynamic feature transfer. Comment: accepted by The Genetic and Evolutionary
Computation Conference (GECCO 2020)