2,468 research outputs found

    Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

    This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use for this task. Comment: postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers ed

    Preface


    Adaptive Operator Mechanism for Genetic Programming

    Thesis (Ph.D.), Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2013. Robert Ian McKay.

    Genetic programming (GP) is an effective evolutionary algorithm for many problems, especially suited to model learning. GP has many parameters, usually defined by the user according to the problem, and its performance is sensitive to their values. Parameter setting has been a major focus of study in evolutionary computation. However, there is still no general guideline for choosing efficient settings, and the usual method is trial and error. The method used in this thesis, the adaptive operator mechanism, replaces the user's action in setting the rates of application of genetic operators: it autonomously controls the genetic operators during a run.

    This thesis extends the adaptive operator mechanism to genetic programming, applying existing adaptive operator algorithms and developing them for TAG3P, a grammar-guided GP which supports a wide variety of useful genetic operators. Existing adaptive operator selection algorithms are successfully applied to TAG3P; their performances are competitive with systems without an adaptive operator mechanism. However, they showed some drawbacks, which we discuss. To overcome them, we suggest three variants on operator selection, which performed somewhat better.

    We have also investigated the evaluation of operator impact, which measures the effect of operator applications on the improvement of the solution. Because the impact guides the operator rates, its evaluation is very important in the adaptive operator mechanism. There are two issues in the evaluation of operator impact: the resource and the method. In principle, all history information of a run can be used as a resource, but the fitness value, which is directly related to the improvement of the solution, is usually used. Across a variety of problems, we used two kinds of resources in this thesis: accuracy and structure. Even with the same resources, the evaluated impacts differ by method. We suggest several methods for the evaluation of operator impact. Although they require only small changes, they have a large effect on performance.

    Finally, we verified the adaptive operator mechanism by applying it to a real-world application: modeling of algal blooms in the Nakdong River. The objective of this application is a model that describes and predicts the ecosystem of the Nakdong River. We verified it with two studies: fitting the parameters of an expert-derived model for the Nakdong River with a GA, and modeling by extending the expert-derived model with TAG3P.

    Contents:
    1 Introduction: 1.1 Background and Motivation; 1.2 Our Approach and Its Contributions; 1.3 Outline
    2 Related Works: 2.1 Evolutionary Algorithms (2.1.1 Genetic Algorithm; 2.1.2 Genetic Programming; 2.1.3 Tree Adjoining Grammar based Genetic Programming)
    3 Adaptive Mechanism and Adaptive Operator Selection: 3.1 Adaptive Mechanism; 3.2 Adaptive Operator Selection (3.2.1 Operator Selection; 3.2.2 Evaluation of Operator Impact); 3.3 Algorithms of Adaptive Operator Selection (3.3.1 Probability Matching; 3.3.2 Adaptive Pursuit; 3.3.3 Multi-Armed Bandits)
    4 Preliminary Experiment for Adaptive Operator Mechanism: 4.1 Test Problems; 4.2 Experimental Design (4.2.1 Search Space; 4.2.2 General Parameter Settings); 4.3 Results and Discussion
    5 Operator Selection: 5.1 Operator Selection Algorithms for GP (5.1.1 Powered Probability Matching; 5.1.2 Adaptive Probability Matching; 5.1.3 Recursive Adaptive Pursuit); 5.2 Experiments and Results (5.2.1 Test Problems; 5.2.2 Experimental Design; 5.2.3 Results and Discussion)
    6 Evaluation of Operator Impact: 6.1 Rates for the Amount of Individual Usage (6.1.1 Definition of Rates for the Amount of Individual Usage; 6.1.2 Results and Discussion); 6.2 Ratio for the Improvement of Fitness (6.2.1 Pairs and Group; 6.2.2 Ratio and Children Fitness; 6.2.3 Experimental Design; 6.2.4 Result and Discussion); 6.3 Ranking Point (6.3.1 Definition of Ranking Point; 6.3.2 Experimental Design; 6.3.3 Result and Discussion); 6.4 Pre-Search Structure (6.4.1 Definition of Pre-Search Structure; 6.4.2 Preliminary Experiment for Sampling; 6.4.3 Experimental Design; 6.4.4 Result and Discussion)
    7 Application: Nakdong River Modeling: 7.1 Problem Description (7.1.1 Outline; 7.1.2 Data Description; 7.1.3 Model Description; 7.1.4 Methods); 7.2 Results (7.2.1 Parameter Optimization; 7.2.2 Modeling); 7.3 Summary
    8 Conclusion: 8.1 Summary; 8.2 Future Works
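
    The thesis outline lists probability matching among the adaptive operator selection algorithms. As a rough illustration of how measured operator impact can drive application rates, here is a minimal sketch of the textbook form of probability matching. It is not the thesis's own implementation or any of its variants; the class name, the operator names, and the values of `p_min` and `alpha` are assumptions chosen for the example, and rewards are assumed to be non-negative.

```python
import random


class ProbabilityMatching:
    """Textbook probability matching (illustrative sketch, not the thesis's code)."""

    def __init__(self, operators, p_min=0.05, alpha=0.3):
        # p_min: floor probability so no operator is ever starved (assumed value).
        # alpha: learning rate of the running quality estimate (assumed value).
        self.operators = list(operators)
        if p_min * len(self.operators) >= 1.0:
            raise ValueError("p_min is too large for this many operators")
        self.p_min = p_min
        self.alpha = alpha
        self.quality = {op: 1.0 for op in self.operators}

    def probabilities(self):
        """Selection probability of each operator, proportional to its quality."""
        k = len(self.operators)
        total = sum(self.quality.values())
        if total == 0:
            return {op: 1.0 / k for op in self.operators}
        return {
            op: self.p_min + (1.0 - k * self.p_min) * q / total
            for op, q in self.quality.items()
        }

    def select(self, rng=random):
        """Draw one operator according to the current probabilities."""
        probs = self.probabilities()
        weights = [probs[op] for op in self.operators]
        return rng.choices(self.operators, weights=weights)[0]

    def update(self, op, reward):
        """Move the quality estimate of `op` toward the observed reward."""
        self.quality[op] += self.alpha * (reward - self.quality[op])


# Hypothetical operator names; the reward stands in for a measured impact.
pm = ProbabilityMatching(["crossover", "mutation", "insertion"])
pm.update("crossover", 2.0)
chosen = pm.select(random.Random(0))
```

    In a run, `update` would be called with whatever impact measure is in use (for example a fitness improvement), so operators that recently helped are applied more often while `p_min` keeps the others available.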

    Semantically-based crossover in genetic programming: application to real-valued symbolic regression

    We investigate the effects of semantically-based crossover operators in genetic programming, applied to real-valued symbolic regression problems. We propose two new relations derived from the semantic distance between subtrees, known as semantic equivalence and semantic similarity. These relations are used to guide variants of the crossover operator, resulting in two new crossover operators: semantics aware crossover (SAC) and semantic similarity-based crossover (SSC). SAC, which was introduced and studied previously, is included here for the purpose of comparison and analysis. SSC extends SAC by more closely controlling the semantic distance between subtrees to which crossover may be applied. The new operators were tested on some real-valued symbolic regression problems and compared with standard crossover (SC), context aware crossover (CAC), Soft Brood Selection (SBS), and No Same Mate (NSM) selection. The experimental results show that, on the problems examined and with computational effort measured by the number of function node evaluations, only SSC and SBS were significantly better than SC, and SSC was often better than SBS. Further experiments were conducted to analyse the sensitivity of performance to the parameter settings for SSC. This analysis leads to the conclusion that SSC is more constructive and has higher locality than SAC, NSM and SC; we believe these are the main reasons for the improved performance of SSC.
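
    To make the idea of a semantic distance between subtrees concrete, the sketch below estimates it by evaluating two subtrees on a shared set of sample points and averaging the absolute differences. The abstract does not give the exact definition or thresholds, so the function names, the use of plain Python callables as stand-ins for subtrees, and the bounds `epsilon`, `lower` and `upper` are all assumptions made for illustration.

```python
import random


def sampling_semantic_distance(f, g, points):
    """Mean absolute difference of two subtrees' outputs on the sample points."""
    return sum(abs(f(x) - g(x)) for x in points) / len(points)


def semantically_equivalent(f, g, points, epsilon=1e-4):
    """Treat two subtrees as equivalent when their distance is below epsilon."""
    return sampling_semantic_distance(f, g, points) < epsilon


def semantically_similar(f, g, points, lower=1e-4, upper=0.4):
    """Treat two subtrees as similar when they differ, but not by too much."""
    distance = sampling_semantic_distance(f, g, points)
    return lower < distance < upper


# Callables stand in for GP subtrees; sample points are drawn with a fixed seed.
rng = random.Random(0)
points = [rng.uniform(-1.0, 1.0) for _ in range(20)]

f = lambda x: x * x + x          # subtree 1
g = lambda x: x * (x + 1)        # same function, different syntax
h = lambda x: x * x + x + 0.1    # slightly shifted behaviour
```

    Under these assumed bounds, `f` and `g` count as equivalent, while `f` and `h` are similar but not equivalent, which is the kind of pair a similarity-guided crossover would prefer to exchange.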

    Using Dimensional Aware Genetic Programming to find interpretable Dispatching Rules for the Job Shop Scheduling Problem

    Dispatching Rules (DRs) have been used in several applications in manufacturing systems. They assign priorities to the jobs in a queue in order to choose the next job to be executed. As they are challenging to design, genetic programming (GP) is being used to find better-performing DRs. In GP, several different DRs are evolved and, through operations and selection processes inspired by nature, the DRs improve. However, little research has been done on reaching small and interpretable DRs. Usually, the generated expressions tend to become extremely large, with a couple of hundred terms or more. This work innovates by using context-free grammar (CFG) methods, particularly CFG-GP and Grammatical Evolution (GE), to reach DRs which are dimensionally aware. These methods are compared, as they have several distinct characteristics and have never been used for this problem. The objective is that, by forcing the syntax of the DRs to be correct, it will be possible to reach smaller and more interpretable DRs. Furthermore, an enumerator was built that finds the best possible expression for small DR sizes; it serves as a baseline to evaluate how well the different algorithms can explore these spaces and give the best possible DRs for a specific size. The results show a significant performance improvement when using DAGP methods for this problem. Moreover, GP/GE and CFG-GP can explore the small DRs optimally or close to optimally, managing to find the best small DRs.
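
    As a small illustration of what a dispatching rule does, the sketch below treats a rule as a priority function over job attributes and picks the job with the lowest priority value. The two rules shown (shortest processing time and minimum slack) are classic hand-written rules, not rules evolved in this work; the `Job` fields and function names are assumptions for the example. Both rules only add or subtract time quantities, which is the kind of unit consistency a dimensionally aware grammar would enforce.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    processing_time: float  # time units
    due_date: float         # time units


def shortest_processing_time(job, now):
    """Classic SPT rule: prefer the job that finishes quickest."""
    return job.processing_time


def minimum_slack(job, now):
    """Classic slack rule: time left before the due date after processing."""
    return job.due_date - now - job.processing_time


def next_job(queue, rule, now=0.0):
    """Return the queued job with the smallest priority value under `rule`."""
    return min(queue, key=lambda job: rule(job, now))


queue = [Job("A", 5.0, 8.0), Job("B", 2.0, 20.0), Job("C", 4.0, 6.0)]
by_spt = next_job(queue, shortest_processing_time)
by_slack = next_job(queue, minimum_slack)
```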

    CES-479 A Linear Estimation-of-Distribution GP System

    We present N-gram GP, an estimation of distribution algorithm for the evolution of linear computer programs. The algorithm learns and samples the joint probability distribution of triplets of instructions (or 3-grams) at the same time as it is learning and sampling a program length distribution. We have tested N-gram GP on symbolic regression problems where the target function is a polynomial of up to degree 12 and lawn-mower problems with lawn sizes of up to 12 × 12. Results show that the algorithm is effective and scales better on these problems than either linear GP or simple stochastic hill-climbing.
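
    The abstract describes learning and sampling a distribution over instruction triplets together with a program length distribution. The sketch below shows one simple way this could look: count 3-grams and lengths in a set of selected programs, then sample a length and generate instructions conditioned on the previous two. It is a simplified illustration under assumptions, not the authors' algorithm; the function names, the unigram fallback for unseen contexts, and the toy instruction set are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def learn_model(programs):
    """Count start triplets, 3-gram continuations, instructions and lengths."""
    starts = Counter()             # first triplet of each program
    follow = defaultdict(Counter)  # (a, b) -> counts of the next instruction
    unigrams = Counter()           # fallback when a context was never seen
    lengths = Counter()            # empirical program length distribution
    for prog in programs:
        if len(prog) < 3:
            raise ValueError("this sketch assumes programs of length >= 3")
        lengths[len(prog)] += 1
        starts[tuple(prog[:3])] += 1
        unigrams.update(prog)
        for a, b, c in zip(prog, prog[1:], prog[2:]):
            follow[(a, b)][c] += 1
    return starts, follow, unigrams, lengths


def draw(counter, rng):
    """Sample a key of `counter` with probability proportional to its count."""
    items = list(counter)
    return rng.choices(items, weights=[counter[i] for i in items])[0]


def sample_program(model, rng):
    """Sample a length, then grow a program one instruction at a time."""
    starts, follow, unigrams, lengths = model
    length = draw(lengths, rng)
    prog = list(draw(starts, rng))
    while len(prog) < length:
        context = follow.get((prog[-2], prog[-1]))
        prog.append(draw(context if context else unigrams, rng))
    return prog[:length]


# Toy instruction sequences standing in for selected linear programs.
selected = [["load", "add", "store", "add"], ["load", "add", "store"]]
model = learn_model(selected)
program = sample_program(model, random.Random(1))
```

    In an estimation-of-distribution loop, the sampled programs would be evaluated, the better ones selected, and the counts re-estimated from them.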

    Adapting a Hyper-heuristic to Respond to Scalability Issues in Combinatorial Optimisation

    The development of a heuristic to solve an optimisation problem in a new domain, or a specific variation of an existing problem domain, is often beyond the means of many smaller businesses. This is largely because the task normally needs to be assigned to a human expert, and such experts tend to be scarce and expensive. One of the aims of hyper-heuristic research is to automate all or part of the heuristic development process and thereby bring the generation of new heuristics within the means of more organisations. A second aim is to ensure that the process by which a domain-specific heuristic is developed is itself independent of the problem domain. This enables a hyper-heuristic to exist and operate above the combinatorial optimisation problem "domain barrier" and generalise across different problem domains. A common issue with heuristic development is that a heuristic is often designed or evolved using small problem instances and then assumed to perform well on larger ones. The goal of this thesis is to extend current hyper-heuristic research towards answering the question: how can a hyper-heuristic efficiently and effectively adapt the selection, generation and manipulation of domain-specific heuristics as you move from small and/or narrow-domain problems to larger and/or wider-domain problems? In other words, how can different hyper-heuristics respond to scalability issues? Each hyper-heuristic has its own strengths and weaknesses. In the context of hyper-heuristic research, this thesis contributes towards understanding scalability issues by firstly developing a compact and effective heuristic that can be applied to other problem instances of differing sizes in a compatible problem domain. We construct a hyper-heuristic for the Capacitated Vehicle Routing Problem domain to establish whether a heuristic for a specific problem domain can be developed which is compact and easy to interpret. The results show that generation of a simple but effective heuristic is possible. Secondly, we develop two different types of hyper-heuristic and compare their performance across different combinatorial optimisation problem domains. We construct and compare simplified versions of two existing hyper-heuristics (adaptive and grammar-based), and analyse how each handles the trade-off between computation speed and quality of the solution. The performance of the two hyper-heuristics is tested on seven different problem domains compatible with the HyFlex (Hyper-heuristic Flexible) framework. The results indicate that the adaptive hyper-heuristic is able to deliver solutions of a pre-defined quality in a shorter computational time than the grammar-based hyper-heuristic. Thirdly, we investigate how the adaptive hyper-heuristic developed in the second stage of this thesis can respond to problem instances of the same size but containing different features and complexity. We investigate how, with minimal knowledge about the problem domain and the features of the instance being worked on, a hyper-heuristic can modify its processes to respond to problem instances containing different features and to problem domains of different complexity. In this stage we allow the adaptive hyper-heuristic to select alternative vectors for the selection of problem domain operators, and alternative acceptance criteria used to determine whether solutions should be retained or discarded. We identify a consistent difference between the best-performing pairings of selection vector and acceptance criteria and those pairings which perform poorly. This thesis shows that hyper-heuristics can respond to scalability issues, although not all do so with equal ease. The flexibility of an adaptive hyper-heuristic enables it to perform faster than the more rigid grammar-based hyper-heuristic, but at the expense of losing a reusable heuristic.

    Utilising restricted for-loops in genetic programming

    Genetic programming is an approach that utilises the power of evolution to allow computers to evolve programs. While loops are natural components of most programming languages and appear in every reasonably sized application, they are rarely used in genetic programming. This work investigates a number of restricted looping constructs to determine whether any significant benefits can be obtained in genetic programming. Possible benefits include solving problems which cannot be solved without loops, evolving smaller solutions which can be more easily understood by human programmers, and solving existing problems more quickly by using fewer evaluations. In this thesis, a number of explicit restricted loop formats were formulated and tested on the Santa Fe ant problem, a modified ant problem, a sorting problem, a visit-every-square problem and a difficult object classification problem. The experimental results showed that these explicit loops can be successfully used in genetic programming. The evolutionary process can decide when, where and how to use them. Runs with these loops tended to generate smaller solutions in fewer evaluations. For some problems that could not be solved without loops, solutions with loops were found. The results and analysis of this thesis have established that there are significant benefits in using loops in genetic programming. Restricted loops can avoid the difficulties of evolving consistent programs and the infinite iterations problem. Researchers and other users of genetic programming should not be afraid of loops.

    FixMiner: Mining Relevant Fix Patterns for Automated Program Repair

    Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to APR systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages the Rich Edit Script, a specialized tree structure of the edit scripts that captures the AST-level context of the code changes. FixMiner uses a different tree representation of Rich Edit Scripts for each round of clustering to identify similar changes: abstract syntax trees, edit action trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns into an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner's generated plausible patches are correct. Comment: 31 pages, 11 figures