
2,468 research outputs found

Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

Author: McKeown Kathleen R.
Siegel Eric V.
Publication date: 01/01/1996

This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use for this task.
Comment: postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers ed

arXiv.org e-Print Archive

CiteSeerX

Columbia University Academic Commons

Preface

Author: Abraham Ajith
Jain Lakhmi
van der Zwaag B.J.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2004

University of Twente Research Information

Adaptive Operator Mechanism for Genetic Programming

Author: 김민혁
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/08/2013

Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2013. Robert Ian McKay.

Genetic programming (GP) is an effective evolutionary algorithm for many problems, and is especially suited to model learning. GP has many parameters, usually set by the user according to the problem, and its performance is sensitive to their values. Parameter setting has therefore been a major focus of study in evolutionary computation. However, there is still no general guideline for choosing efficient settings, and the usual method is trial and error. The method used in this thesis, the adaptive operator mechanism, replaces the user's action in setting the application rates of genetic operators: it autonomously controls the genetic operators during a run.

This thesis extends the adaptive operator mechanism to genetic programming, applying existing adaptive operator algorithms and developing them for TAG3P, a grammar-guided GP that supports a wide variety of useful genetic operators. Existing adaptive operator selection algorithms are successfully applied to TAG3P; their performance is competitive with systems without an adaptive operator mechanism. However, they showed some drawbacks, which we discuss. To overcome them, we suggest three variants of operator selection, which performed somewhat better.

We have also investigated the evaluation of operator impact in the adaptive operator mechanism, which measures the impact of operator applications on the improvement of the solution. Because the impact guides operator rates, its evaluation is very important. There are two issues in evaluating operator impact: the resource and the method. In principle, all history information of a run can be used as a resource, but the fitness value, which is directly related to the improvement of the solution, is usually used. Using a variety of problems, we used two kinds of resources in this thesis: accuracy and structure. Even with the same resources, the evaluated impacts differ between methods. We suggest several methods for evaluating operator impact; although they require only small changes, they have a large effect on performance.

Finally, we verified the adaptive operator mechanism by applying it to a real-world application: modeling algal blooms in the Nakdong River. The objective of this application is a model that describes and predicts the ecosystem of the Nakdong River. We verified it with two studies: fitting the parameters of an expert-derived model for the Nakdong River with a GA, and modeling by extending the expert-derived model with TAG3P.

Table of contents:
1 Introduction
1.1 Background and Motivation
1.2 Our Approach and Its Contributions
1.3 Outline
2 Related Works
2.1 Evolutionary Algorithms
2.1.1 Genetic Algorithm
2.1.2 Genetic Programming
2.1.3 Tree Adjoining Grammar based Genetic Programming
3 Adaptive Mechanism and Adaptive Operator Selection
3.1 Adaptive Mechanism
3.2 Adaptive Operator Selection
3.2.1 Operator Selection
3.2.2 Evaluation of Operator Impact
3.3 Algorithms of Adaptive Operator Selection
3.3.1 Probability Matching
3.3.2 Adaptive Pursuit
3.3.3 Multi-Armed Bandits
4 Preliminary Experiment for Adaptive Operator Mechanism
4.1 Test Problems
4.2 Experimental Design
4.2.1 Search Space
4.2.2 General Parameter Settings
4.3 Results and Discussion
5 Operator Selection
5.1 Operator Selection Algorithms for GP
5.1.1 Powered Probability Matching
5.1.2 Adaptive Probability Matching
5.1.3 Recursive Adaptive Pursuit
5.2 Experiments and Results
5.2.1 Test Problems
5.2.2 Experimental Design
5.2.3 Results and Discussion
6 Evaluation of Operator Impact
6.1 Rates for the Amount of Individual Usage
6.1.1 Definition of Rates for the Amount of Individual Usage
6.1.2 Results and Discussion
6.2 Ratio for the Improvement of Fitness
6.2.1 Pairs and Group
6.2.2 Ratio and Children Fitness
6.2.3 Experimental Design
6.2.4 Result and Discussion
6.3 Ranking Point
6.3.1 Definition of Ranking Point
6.3.2 Experimental Design
6.3.3 Result and Discussion
6.4 Pre-Search Structure
6.4.1 Definition of Pre-Search Structure
6.4.2 Preliminary Experiment for Sampling
6.4.3 Experimental Design
6.4.4 Result and Discussion
7 Application: Nakdong River Modeling
7.1 Problem Description
7.1.1 Outline
7.1.2 Data Description
7.1.3 Model Description
7.1.4 Methods
7.2 Results
7.2.1 Parameter Optimization
7.2.2 Modeling
7.3 Summary
8 Conclusion
8.1 Summary
8.2 Future Works
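
The record above lists Probability Matching among the adaptive operator selection algorithms it covers. As a rough illustration of that general idea only, the sketch below keeps a quality estimate per genetic operator and turns the estimates into selection probabilities with a guaranteed minimum. The function names, the floor `p_min`, the learning rate `alpha`, and the reward values are all assumptions made for this example; none of them are taken from the thesis.

```python
import random


def probability_matching(qualities, p_min=0.05):
    """Map per-operator quality estimates to selection probabilities.

    Every operator keeps at least p_min (an assumed floor); the rest of
    the probability mass is shared in proportion to estimated quality.
    """
    k = len(qualities)
    total = sum(qualities)
    if total == 0:
        # No information yet: select uniformly.
        return [1.0 / k] * k
    return [p_min + (1 - k * p_min) * q / total for q in qualities]


def update_quality(old_quality, reward, alpha=0.3):
    """Exponential moving average of the rewards an operator earned."""
    return (1 - alpha) * old_quality + alpha * reward


def choose_operator(probabilities, rng):
    """Draw an operator index according to the given probabilities."""
    indices = range(len(probabilities))
    return rng.choices(indices, weights=probabilities, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    qualities = [1.0, 3.0]  # made-up impact estimates for two operators
    probs = probability_matching(qualities, p_min=0.1)
    op = choose_operator(probs, rng)
    # Pretend the chosen operator earned a reward of 2.0 this generation.
    qualities[op] = update_quality(qualities[op], reward=2.0)
    print(probs, op, qualities)
```

The floor keeps rarely rewarded operators from disappearing entirely, so an operator that only becomes useful late in a run can still be rediscovered.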

SNU Open Repository and Archive

Semantically-based crossover in genetic programming: application to real-valued symbolic regression

Author: Galván López Edgar
McKay Robert I.
O'Neill Michael
Quang Uy Nguyen
Xuan Hoai Nguyen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

We investigate the effects of semantically-based crossover operators in genetic programming, applied to real-valued symbolic regression problems. We propose two new relations derived from the semantic distance between subtrees, known as semantic equivalence and semantic similarity. These relations are used to guide variants of the crossover operator, resulting in two new crossover operators: semantics aware crossover (SAC) and semantic similarity-based crossover (SSC). SAC, which was introduced and studied previously, is included here for comparison and analysis. SSC extends SAC by more closely controlling the semantic distance between subtrees to which crossover may be applied. The new operators were tested on some real-valued symbolic regression problems and compared with standard crossover (SC), context aware crossover (CAC), Soft Brood Selection (SBS), and No Same Mate (NSM) selection. The experimental results show that, on the problems examined and with computational effort measured by the number of function node evaluations, only SSC and SBS were significantly better than SC, and SSC was often better than SBS. Further experiments were also conducted to analyse the sensitivity of performance to the parameter settings for SSC. This analysis leads to the conclusion that SSC is more constructive and has higher locality than SAC, NSM and SC; we believe these are the main reasons for the improved performance of SSC.
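
To make the two relations in the abstract above more concrete, here is a minimal sketch of one way a semantic distance between subtrees could be estimated by sampling, with equivalence and similarity defined by thresholds on that distance. Representing subtrees as plain Python callables, the sample of 20 points, and the threshold values `eps`, `lower` and `upper` are all assumptions for illustration; the abstract does not give the paper's exact definitions or settings.

```python
import random


def sampling_semantic_distance(f, g, points):
    """Mean absolute difference between two subtrees on sample points."""
    return sum(abs(f(x) - g(x)) for x in points) / len(points)


def semantically_equivalent(f, g, points, eps=1e-4):
    """Treat two subtrees as equivalent if their outputs nearly coincide."""
    return sampling_semantic_distance(f, g, points) < eps


def semantically_similar(f, g, points, lower=1e-4, upper=0.4):
    """Similar: behaviour differs, but only by a bounded amount."""
    d = sampling_semantic_distance(f, g, points)
    return lower < d < upper


if __name__ == "__main__":
    rng = random.Random(0)
    points = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    print(semantically_equivalent(lambda x: x + x, lambda x: 2 * x, points))
    print(semantically_similar(lambda x: x, lambda x: x + 0.1, points))
    print(semantically_similar(lambda x: x, lambda x: x + 5.0, points))
```

Under these assumptions, a crossover guided by similarity would swap only subtree pairs whose distance falls inside the band, whereas one guided by equivalence would merely avoid swapping subtrees that behave identically.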

MURAL - Maynooth University Research Archive Library

Research Repository UCD

Irish Universities

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Using Dimensional Aware Genetic Programming to find interpretable Dispatching Rules for the Job Shop Scheduling Problem

Author: Álvaro Manuel Festas Pereira da Silva
Publication date: 07/07/2021

Dispatching Rules (DRs) have been used in several applications in manufacturing systems. They assign priorities to the jobs in a queue and so choose the next job to be executed. As they are challenging to design, genetic programming (GP) is being used to find better-performing DRs. In GP, several different DRs are evolved, and the DRs improve through operations and selection processes inspired by nature. However, little research has been done on reaching small and interpretable DRs. Usually, the generated expressions tend to become extremely large, with a couple of hundred terms or more. This work innovates by using CFG (context-free grammar) methods, particularly CFG-GP and GE (Grammatical Evolution), to reach DRs that are dimensionally aware. These methods are compared because they have several distinct characteristics and had never been used for this problem. The objective is that, by forcing the syntax of the DRs to be correct, it will be possible to reach smaller and more interpretable DRs. Furthermore, an enumerator was built that finds the best possible expression for a small DR size, which serves as a baseline to evaluate how well the different algorithms can explore these spaces and give the best possible DRs for a specific size. The results show a significant performance improvement from using DAGP methods for this problem. Moreover, GP/GE and CFG-GP can explore the small DRs optimally or close to optimally, managing to find the best small DRs.
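
As a small illustration of what a dispatching rule does, the sketch below treats a rule as a priority function over jobs and picks the highest-priority job from a queue. The `Job` fields and the two hand-written rules (shortest processing time and earliest due date) are textbook examples assumed for this sketch; they are not rules evolved or reported in the work above.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    processing_time: float  # assumed unit: minutes
    due_date: float         # assumed unit: minutes from now


def next_job(queue, priority):
    """Return the job with the highest priority under the given rule."""
    return max(queue, key=priority)


def shortest_processing_time(job):
    """SPT rule: shorter jobs get higher priority."""
    return -job.processing_time


def earliest_due_date(job):
    """EDD rule: jobs due sooner get higher priority."""
    return -job.due_date


if __name__ == "__main__":
    queue = [
        Job("a", processing_time=5.0, due_date=20.0),
        Job("b", processing_time=2.0, due_date=30.0),
        Job("c", processing_time=9.0, due_date=10.0),
    ]
    print(next_job(queue, shortest_processing_time).name)
    print(next_job(queue, earliest_due_date).name)
```

Both example rules return a quantity measured in time units. A dimensionally aware grammar, as described in the abstract, would rule out expressions that mix incompatible units, for instance adding a time to a job count.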

Repositório Aberto da Universidade do Porto

CES-479 A Linear Estimation-of-Distribution GP System

Author: McPhee NF
Poli R
Publication venue: CES-479
Publication date: 01/01/2008

We present N-gram GP, an estimation of distribution algorithm for the evolution of linear computer programs. The algorithm learns and samples the joint probability distribution of triplets of instructions (or 3-grams) at the same time as it is learning and sampling a program length distribution. We have tested N-gram GP on symbolic regression problems where the target function is a polynomial of up to degree 12 and lawn-mower problems with lawn sizes of up to 12 × 12. Results show that the algorithm is effective and scales better on these problems than either linear GP or simple stochastic hill-climbing.
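
The following sketch illustrates the general idea of learning a distribution over instruction triplets and over program lengths from a set of linear programs, then sampling new programs from both. It is a simplification under stated assumptions: programs are lists of instruction names, the 3-gram counts are used as a conditional distribution over the next instruction, the first two instructions are drawn uniformly, and unseen contexts fall back to uniform sampling. None of these details are given in the abstract.

```python
import random
from collections import Counter, defaultdict


def learn_trigrams(programs):
    """Count which instruction follows each pair of instructions."""
    counts = defaultdict(Counter)
    for prog in programs:
        for a, b, c in zip(prog, prog[1:], prog[2:]):
            counts[(a, b)][c] += 1
    return counts


def learn_lengths(programs):
    """Count how often each program length occurs."""
    return Counter(len(p) for p in programs)


def sample_program(counts, lengths, instructions, rng):
    """Sample a length, then grow a program one instruction at a time."""
    length = rng.choices(list(lengths), weights=list(lengths.values()), k=1)[0]
    # Assumption: the first two instructions are drawn uniformly.
    prog = [rng.choice(instructions) for _ in range(min(2, length))]
    while len(prog) < length:
        followers = counts.get((prog[-2], prog[-1]))
        if followers:
            options = list(followers)
            weights = list(followers.values())
            nxt = rng.choices(options, weights=weights, k=1)[0]
        else:
            # Unseen context: fall back to a uniform choice.
            nxt = rng.choice(instructions)
        prog.append(nxt)
    return prog


if __name__ == "__main__":
    rng = random.Random(0)
    selected = [["a", "b", "c", "a", "b", "c"]]  # made-up "good" programs
    counts = learn_trigrams(selected)
    lengths = learn_lengths(selected)
    print(sample_program(counts, lengths, ["a", "b", "c"], rng))
```

In an estimation of distribution loop, the counts would be re-learned each generation from the better programs, and the next population would be sampled from the updated model.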

University of Essex Research Repository

CiteSeerX

Adapting a Hyper-heuristic to Respond to Scalability Issues in Combinatorial Optimisation

Author: Marshall Richard J.
Publication venue: 'Victoria University of Wellington Library'
Publication date: 01/01/2015

The development of a heuristic to solve an optimisation problem in a new domain, or a specific variation of an existing problem domain, is often beyond the means of many smaller businesses. This is largely due to the task normally needing to be assigned to a human expert, and such experts tend to be scarce and expensive. One of the aims of hyper-heuristic research is to automate all or part of the heuristic development process and thereby bring the generation of new heuristics within the means of more organisations. A second aim of hyper-heuristic research is to ensure that the process by which a domain specific heuristic is developed is itself independent of the problem domain. This enables a hyper-heuristic to exist and operate above the combinatorial optimisation problem “domain barrier” and generalise across different problem domains. A common issue with heuristic development is that a heuristic is often designed or evolved using small problem instances and then assumed to perform well on larger problem instances. The goal of this thesis is to extend current hyper-heuristic research towards answering the question: How can a hyper-heuristic efficiently and effectively adapt the selection, generation and manipulation of domain specific heuristics as you move from small size and/or narrow domain problems to larger size and/or wider domain problems? In other words, how can different hyper-heuristics respond to scalability issues? Each hyper-heuristic has its own strengths and weaknesses. In the context of hyper-heuristic research, this thesis contributes towards understanding scalability issues by firstly developing a compact and effective heuristic that can be applied to other problem instances of differing sizes in a compatible problem domain. We construct a hyper-heuristic for the Capacitated Vehicle Routing Problem domain to establish whether a heuristic for a specific problem domain can be developed which is compact and easy to interpret.
The results show that generation of a simple but effective heuristic is possible. Secondly, we develop two different types of hyper-heuristic and compare their performance across different combinatorial optimisation problem domains. We construct and compare simplified versions of two existing hyper-heuristics (adaptive and grammar-based), and analyse how each handles the trade-off between computation speed and quality of the solution. The performance of the two hyper-heuristics is tested on seven different problem domains compatible with the HyFlex (Hyper-heuristic Flexible) framework. The results indicate that the adaptive hyper-heuristic is able to deliver solutions of a pre-defined quality in a shorter computational time than the grammar-based hyper-heuristic. Thirdly, we investigate how the adaptive hyper-heuristic developed in the second stage of this thesis can respond to problem instances of the same size, but containing different features and complexity. We investigate how, with minimal knowledge about the problem domain and features of the instance being worked on, a hyper-heuristic can modify its processes to respond to problem instances containing different features and problem domains of different complexity. In this stage we allow the adaptive hyper-heuristic to select alternative vectors for the selection of problem domain operators, and acceptance criteria used to determine whether solutions should be retained or discarded. We identify a consistent difference between the best performing pairings of selection vector and acceptance criteria, and those pairings which perform poorly. This thesis shows that hyper-heuristics can respond to scalability issues, although not all do so with equal ease. The flexibility of an adaptive hyper-heuristic enables it to perform faster than the more rigid grammar-based hyper-heuristic, but at the expense of losing a reusable heuristic.

Victoria University of Wellington

ResearchArchive at Victoria University of Wellington

The application of an artificial immune system for solving the identification problem

Author: Astakhova
Hart
Hunt
Publication venue: 'EDP Sciences'
Publication date: 01/01/2017

Crossref

Utilising restricted for-loops in genetic programming

Author: Li X
Publication venue: RMIT University
Publication date: 01/01/2007

Genetic programming is an approach that utilises the power of evolution to allow computers to evolve programs. While loops are natural components of most programming languages and appear in every reasonably-sized application, they are rarely used in genetic programming. This work investigates a number of restricted looping constructs to determine whether any significant benefits can be obtained in genetic programming. Possible benefits include solving problems which cannot be solved without loops, evolving smaller solutions which can be more easily understood by human programmers, and solving existing problems more quickly by using fewer evaluations. In this thesis, a number of explicit restricted loop formats were formulated and tested on the Santa Fe ant problem, a modified ant problem, a sorting problem, a visit-every-square problem and a difficult object classification problem. The experimental results showed that these explicit loops can be successfully used in genetic programming. The evolutionary process can decide when, where and how to use them. Runs with these loops tended to generate smaller solutions in fewer evaluations. Solutions with loops were found to some problems that could not be solved without loops. The results and analysis of this thesis have established that there are significant benefits in using loops in genetic programming. Restricted loops can avoid the difficulties of evolving consistent programs and the infinite iterations problem. Researchers and other users of genetic programming should not be afraid of loops.
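
One simple way to see how a restricted loop sidesteps the infinite iterations problem is to clamp the iteration count before the loop runs, so that any evolved program terminates. The sketch below shows that idea only; the cap `MAX_ITERATIONS`, the function name, and the `(index, state)` body signature are assumptions for illustration and do not reproduce the specific loop formats defined in the thesis.

```python
MAX_ITERATIONS = 100  # hypothetical cap, not a value from the thesis


def restricted_for(count, body, state):
    """Run body a bounded number of times and return the final state.

    The requested count is clamped to [0, MAX_ITERATIONS], so the loop
    always terminates regardless of what value an evolved subtree supplies.
    """
    n = max(0, min(int(count), MAX_ITERATIONS))
    for i in range(n):
        state = body(i, state)
    return state


if __name__ == "__main__":
    # Sum the loop indices 0..4.
    print(restricted_for(5, lambda i, s: s + i, 0))
    # An absurdly large request is cut off at the cap.
    print(restricted_for(10**9, lambda i, s: s + 1, 0))
```

Because the body receives and returns the state explicitly, the loop node can be evaluated like any other function node in a program tree.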

RMIT Research Repository

FixMiner: Mining Relevant Fix Patterns for Automated Program Repair

Author: Bissyandé Tegawendé F.
Kim Dongsun
Klein Jacques
Koyuncu Anil
Liu Kui
Monperrus Martin
Traon Yves Le
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/09/2019

Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to APR systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages Rich Edit Script, a specialized tree structure of the edit scripts that captures the AST-level context of the code changes. FixMiner uses different tree representations of Rich Edit Scripts for each round of clustering to identify similar changes. These are abstract syntax trees, edit actions trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns into an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner's generated plausible patches are correct.
Comment: 31 pages, 11 figures

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg