Search CORE

4,619 research outputs found

A Probabilistic Linear Genetic Programming with Stochastic Context-Free Grammar for solving Symbolic Regression problems

Author: Bosman P. A. N.
Poli R.
Shan Y.
Wong P. K.
Yanai K.
Publication date: 03/04/2017

Traditional Linear Genetic Programming (LGP) algorithms rely only on the selection mechanism to guide the search. Genetic operators combine or mutate random portions of individuals without knowing whether the result will be a fitter individual. Probabilistic Model Building Genetic Programming (PMB-GP) methods were proposed to overcome this issue through a probability model that captures the structure of fit individuals and is used to sample new ones. This work proposes combining LGP with a Stochastic Context-Free Grammar (SCFG) whose probability distribution is updated according to the selected individuals. We propose a method for adapting the grammar to the linear representation of LGP. Tests of the proposed probabilistic method and of two hybrid approaches on several symbolic regression benchmark problems show results that are statistically better than those obtained by traditional LGP. Comment: Genetic and Evolutionary Computation Conference (GECCO) 2017, Berlin, Germany
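The idea in the abstract above, sampling individuals from a grammar and shifting production probabilities toward the rules used by selected individuals, can be illustrated with a minimal sketch. This is not the paper's method: the toy grammar, the function names (`uniform_probs`, `sample`, `update`), the depth limit, and the frequency-blending update with learning rate `lr` are all assumptions made for illustration, and the sketch uses tree-shaped derivations rather than the linear LGP representation.

```python
import random

# Hypothetical toy grammar for arithmetic expressions (not from the paper).
GRAMMAR = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("x",), ("1",)],
}


def uniform_probs(grammar):
    """Start with equal probability for every production of a nonterminal."""
    return {nt: [1.0 / len(rules)] * len(rules) for nt, rules in grammar.items()}


def sample(grammar, probs, rng, symbol="E", depth=0, max_depth=4):
    """Expand `symbol`; return (tokens, [(nonterminal, rule_index), ...])."""
    if symbol not in grammar:  # terminal symbol: emit it unchanged
        return [symbol], []
    rules = grammar[symbol]
    weights = list(probs[symbol])
    if depth >= max_depth:
        # At the depth limit, allow only productions made of terminals.
        weights = [w if all(s not in grammar for s in rule) else 0.0
                   for w, rule in zip(weights, rules)]
    idx = rng.choices(range(len(rules)), weights=weights)[0]
    tokens, used = [], [(symbol, idx)]
    for child in rules[idx]:
        child_tokens, child_used = sample(grammar, probs, rng, child,
                                          depth + 1, max_depth)
        tokens += child_tokens
        used += child_used
    return tokens, used


def update(grammar, probs, selected, lr=0.5):
    """Blend old probabilities with rule frequencies in selected derivations."""
    new_probs = {}
    for nt, rules in grammar.items():
        counts = [0] * len(rules)
        for derivation in selected:
            for used_nt, idx in derivation:
                if used_nt == nt:
                    counts[idx] += 1
        total = sum(counts)
        if total == 0:  # nonterminal never used: keep its distribution
            new_probs[nt] = list(probs[nt])
        else:
            new_probs[nt] = [(1 - lr) * p + lr * c / total
                             for p, c in zip(probs[nt], counts)]
    return new_probs
```

In a full loop one would sample a population, evaluate fitness, pass the derivations of the best individuals to `update`, and repeat. Keeping `lr` below 1 means no production probability drops to exactly zero, which the depth-limit fallback in `sample` relies on.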

arXiv.org e-Print Archive

Crossref

Grammar Variational Autoencoder

Author: Hernández-Lobato José Miguel
Kusner Matt J.
Paige Brooks
Publication date: 06/03/2017

Deep generative models have been wildly successful at learning coherent latent representations for continuous data such as video and audio. However, generative modeling of discrete data such as arithmetic expressions and molecular structures still poses significant challenges. Crucially, state-of-the-art methods often produce outputs that are not valid. We make the key observation that frequently, discrete data can be represented as a parse tree from a context-free grammar. We propose a variational autoencoder which encodes and decodes directly to and from these parse trees, ensuring the generated outputs are always valid. Surprisingly, we show that not only does our model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs. We demonstrate the effectiveness of our learned models by showing their improved performance in Bayesian optimization for symbolic regression and molecular synthesis.

arXiv.org e-Print Archive

UCL Discovery

Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar

Author: Cho Kyunghyun
Drori Iddo
Freire Juliana
Krishnamurthy Yamuna
Lourenco Raoni
Rampin Remi
Silva Claudio
Publication date: 01/01/2019

Automatic machine learning is an important problem at the forefront of machine learning. The strongest AutoML systems are based on neural networks, evolutionary algorithms, and Bayesian optimization. Recently, AlphaD3M reached state-of-the-art results with an order of magnitude speedup using reinforcement learning with self-play. In this work we extend AlphaD3M by using a pipeline grammar and a pre-trained model which generalizes from many different datasets and similar tasks. Our results demonstrate improved performance compared with our earlier work and existing methods on AutoML benchmark datasets for classification and regression tasks. In the spirit of reproducible research we make our data, models, and code publicly available. Comment: ICML Workshop on Automated Machine Learning

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Towards a Comprehensible and Accurate Credit Management Model: Application of four Computational Intelligence Methodologies

Author: Ampazis Nikolaos
Dounias Georgios
Tsakonas Athanasios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2006

The paper presents methods for classifying applicants into different categories of credit risk using four computational intelligence techniques. The methodologies selected for the rule-based categorization task are (1) feedforward neural networks trained with second-order methods, (2) inductive machine learning, (3) hierarchical decision trees produced by grammar-guided genetic programming, and (4) fuzzy rule-based systems produced by grammar-guided genetic programming. The data used are both numerical and linguistic in nature and represent a real-world problem: deciding whether a loan should be granted, given the financial details of customers applying for that loan at a specific private EU bank. We examine the proposed classification models with a sample of enterprises that applied for a loan, each of which is described by financial decision variables (ratios) and classified into one of four predetermined classes. Attention is given to the comprehensibility and ease of use of the acquired decision models. Results show that the proposed methods can make the classification task easier and, in some cases, may significantly reduce the amount of required credit data. We consider that these methodologies may also allow the extraction of a comprehensible credit management model or even the incorporation of a related decision support system in banking.

Crossref

Bournemouth University Research Online

Genetic algorithms with DNN-based trainable crossover as an example of partial specialization of general search

Author: A Graves
A Potapov
B Goertzel
E Özkural
RJ Solomonoff
V Batishcheva
Y Futamura
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/07/2018

Universal induction relies on some general search procedure that is doomed to be inefficient. One possibility to achieve both generality and efficiency is to specialize this procedure w.r.t. any given narrow task. However, complete specialization, which implies a direct mapping from the task parameters to solutions (discriminative models) without search, is not always possible. In this paper, partial specialization of general search is considered in the form of genetic algorithms (GAs) with a specialized crossover operator. We perform a feasibility study of this idea, implementing such an operator in the form of a deep feedforward neural network. GAs with trainable crossover operators are compared with the result of complete specialization, which is also represented as a deep neural network. Experimental results show that specialized GAs can be more efficient than both general GAs and discriminative models. Comment: AGI 2017 proceedings. The final publication is available at link.springer.co

arXiv.org e-Print Archive

Crossref

Parameter Learning of Logic Programs for Symbolic-Statistical Modeling

Author: Kameya Y.
Sato T.
Publication venue: 'AI Access Foundation'
Publication date: 09/06/2011

We propose a logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e. definite clause programs containing probabilistic facts with a parameterized distribution. It extends the traditional least Herbrand model semantics in logic programming to distribution semantics, possible world semantics with a probability distribution which is unconditionally applicable to arbitrary logic programs including ones for HMMs, PCFGs and Bayesian networks. We also propose a new EM algorithm, the graphical EM algorithm, that runs for a class of parameterized logic programs representing sequential decision processes where each decision is exclusive and independent. It runs on a new data structure called support graphs describing the logical relationship between observations and their explanations, and learns parameters by computing inside and outside probability generalized for logic programs. The complexity analysis shows that when combined with OLDT search for all explanations for observations, the graphical EM algorithm, despite its generality, has the same time complexity as existing EM algorithms, i.e. the Baum-Welch algorithm for HMMs, the Inside-Outside algorithm for PCFGs, and the one for singly connected Bayesian networks that have been developed independently in each research field. Learning experiments with PCFGs using two corpora of moderate size indicate that the graphical EM algorithm can significantly outperform the Inside-Outside algorithm.
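For readers unfamiliar with the inside probability mentioned in the abstract above, the following minimal sketch computes it for a PCFG in Chomsky normal form using the classical dynamic program. It is not the graphical EM algorithm and does not use support graphs; the toy grammar, the `BINARY`/`LEXICAL` tables, and the function name `inside_probability` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (not from the paper).
BINARY = {("S", "S", "S"): 0.3}   # S -> S S with probability 0.3
LEXICAL = {("S", "a"): 0.7}       # S -> a   with probability 0.7


def inside_probability(words, binary, lexical, start="S"):
    """Return P(start =>* words), summed over all parse trees."""
    n = len(words)
    # beta[(i, j, A)] = probability that nonterminal A derives words[i:j]
    beta = defaultdict(float)
    for i, word in enumerate(words):
        for (lhs, terminal), p in lexical.items():
            if terminal == word:
                beta[(i, i + 1, lhs)] += p
    for span in range(2, n + 1):              # widths of increasing size
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):         # split point between children
                for (lhs, left, right), p in binary.items():
                    beta[(i, j, lhs)] += (
                        p * beta[(i, k, left)] * beta[(k, j, right)]
                    )
    return beta[(0, n, start)]
```

With this toy grammar, the string `a a a` has two parse trees, and the table sums the probability of both. The Inside-Outside algorithm pairs these inside values with analogous outside values to obtain expected rule counts for EM.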

arXiv.org e-Print Archive

CiteSeerX

Crossref

Programming with Annotated Grammar Estimation

Author: Yoshihiko Hasegawa
Publication venue: 'IntechOpen'
Publication date: 18/10/2012

IntechOpen