5,262 research outputs found
Towards the Evolution of Multi-Layered Neural Networks: A Dynamic Structured Grammatical Evolution Approach
Current grammar-based NeuroEvolution approaches have several shortcomings. On
the one hand, they do not allow the generation of Artificial Neural Networks
(ANNs composed of more than one hidden-layer. On the other, there is no way to
evolve networks with more than one output neuron. To properly evolve ANNs with
more than one hidden-layer and multiple output nodes there is the need to know
the number of neurons available in previous layers. In this paper we introduce
Dynamic Structured Grammatical Evolution (DSGE): a new genotypic representation
that overcomes the aforementioned limitations. By enabling the creation of
dynamic rules that specify the connection possibilities of each neuron, the
methodology enables the evolution of multi-layered ANNs with more than one
output neuron. Results in different classification problems show that DSGE
evolves effective single and multi-layered ANNs, with a varying number of
output neurons
Interpretable Categorization of Heterogeneous Time Series Data
Understanding heterogeneous multivariate time series data is important in
many applications ranging from smart homes to aviation. Learning models of
heterogeneous multivariate time series that are also human-interpretable is
challenging and not adequately addressed by the existing literature. We propose
grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs
extend decision trees with a grammar framework. Logical expressions derived
from a context-free grammar are used for branching in place of simple
thresholds on attributes. The added expressivity enables support for a wide
range of data types while retaining the interpretability of decision trees. In
particular, when a grammar based on temporal logic is used, we show that GBDTs
can be used for the interpretable classi cation of high-dimensional and
heterogeneous time series data. Furthermore, we show how GBDTs can also be used
for categorization, which is a combination of clustering and generating
interpretable explanations for each cluster. We apply GBDTs to analyze the
classic Australian Sign Language dataset as well as data on near mid-air
collisions (NMACs). The NMAC data comes from aircraft simulations used in the
development of the next-generation Airborne Collision Avoidance System (ACAS
X).Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data
Mining (SDM) 201
Search based software engineering: Trends, techniques and applications
© ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version is available from the link below.In the past five years there has been a dramatic increase in work on Search-Based Software Engineering (SBSE), an approach to Software Engineering (SE) in which Search-Based Optimization (SBO) algorithms are used to address problems in SE. SBSE has been applied to problems throughout the SE lifecycle, from requirements and project planning to maintenance and reengineering. The approach is attractive because it offers a suite of adaptive automated and semiautomated solutions in situations typified by large complex problem spaces with multiple competing and conflicting objectives.
This article provides a review and classification of literature on SBSE. The work identifies research trends and relationships between the techniques applied and the applications to which they have been applied and highlights gaps in the literature and avenues for further research.EPSRC and E
The Market Fraction Hypothesis under different GP algorithms
In a previous work, inspired by observations made in many agent-based financial models, we formulated and presented the Market Fraction Hypothesis, which basically predicts a short duration for any dominant type of agents, but then a uniform distribution over all types in the long run. We then proposed a two-step approach, a rule-inference step and a rule-clustering step, to testing this hypothesis. We employed genetic programming as the rule inference engine, and applied self-organizing maps to cluster the inferred rules. We then ran tests for 10 international markets and provided a general examination of the plausibility of the hypothesis. However, because of the fact that the tests took place under a GP system, it could be argued that these results are dependent on the nature of the GP algorithm. This chapter thus serves as an extension to our previous work. We test the Market Fraction Hypothesis under two new different GP algorithms, in order to prove that the previous results are rigorous and are not sensitive to the choice of GP. We thus test again the hypothesis under the same 10 empirical datasets that were used in our previous experiments. Our work shows that certain parts of the hypothesis are indeed sensitive on the algorithm. Nevertheless, this sensitivity does not apply to all aspects of our tests. This therefore allows us to conclude that our previously derived results are rigorous and can thus be generalized
FlashProfile: A Framework for Synthesizing Data Profiles
We address the problem of learning a syntactic profile for a collection of
strings, i.e. a set of regex-like patterns that succinctly describe the
syntactic variations in the strings. Real-world datasets, typically curated
from multiple sources, often contain data in various syntactic formats. Thus,
any data processing task is preceded by the critical step of data format
identification. However, manual inspection of data to identify the different
formats is infeasible in standard big-data scenarios.
Prior techniques are restricted to a small set of pre-defined patterns (e.g.
digits, letters, words, etc.), and provide no control over granularity of
profiles. We define syntactic profiling as a problem of clustering strings
based on syntactic similarity, followed by identifying patterns that succinctly
describe each cluster. We present a technique for synthesizing such profiles
over a given language of patterns, that also allows for interactive refinement
by requesting a desired number of clusters.
Using a state-of-the-art inductive synthesis framework, PROSE, we have
implemented our technique as FlashProfile. Across tasks over large
real datasets, we observe a median profiling time of only s.
Furthermore, we show that access to syntactic profiles may allow for more
accurate synthesis of programs, i.e. using fewer examples, in
programming-by-example (PBE) workflows such as FlashFill.Comment: 28 pages, SPLASH (OOPSLA) 201
A System for Accessible Artificial Intelligence
While artificial intelligence (AI) has become widespread, many commercial AI
systems are not yet accessible to individual researchers nor the general public
due to the deep knowledge of the systems required to use them. We believe that
AI has matured to the point where it should be an accessible technology for
everyone. We present an ongoing project whose ultimate goal is to deliver an
open source, user-friendly AI system that is specialized for machine learning
analysis of complex data in the biomedical and health care domains. We discuss
how genetic programming can aid in this endeavor, and highlight specific
examples where genetic programming has automated machine learning analyses in
previous projects.Comment: 14 pages, 5 figures, submitted to Genetic Programming Theory and
Practice 2017 worksho
- …