5 research outputs found

    Faster Evolutionary Multi-Objective Optimization via GALE, the Geometric Active Learner

    Goal optimization has long been a topic of great interest in computer science. The literature contains many thousands of papers that discuss methods for searching for optimal solutions to complex problems. In the case of multi-objective optimization, such a search yields iteratively improved approximations to the Pareto frontier, i.e. the set of best solutions lying along a trade-off curve of competing objectives.

    To approximate the Pareto frontier, one method that is ubiquitous throughout the field of optimization is stochastic search. Stochastic search engines explore solution spaces by randomly mutating candidate guesses to generate new solutions. This mutation policy is employed by the most commonly used tools (e.g. NSGA-II, SPEA2), with the goals of a) avoiding local optima and b) expanding diversity in the set of generated approximations. Such blind mutation policies explore many sub-optimal solutions that are discarded when better solutions are found. Hence, this approach has two problems. Firstly, stochastic search can be unnecessarily computationally expensive, since it evaluates an overwhelming number of candidates. Secondly, the generated approximations to the Pareto frontier are usually very large and can be difficult to understand.

    To solve these two problems, a more directed, less stochastic approach than standard search tools is necessary. This thesis presents GALE (Geometric Active Learning). GALE is an active learner that finds approximations to the Pareto frontier by spectrally clustering candidates with a near-linear-time recursive descent algorithm that iteratively divides the candidates into halves (called leaves at the bottom level). Active learning in GALE selects a minimal, most-informative subset of candidates by evaluating only the two most-different candidates at each descending split; hence, GALE requires at most 2·log2(N) evaluations per generation. The candidates of each leaf are then non-stochastically mutated in the most promising direction along that piece; the leaves are piece-wise approximations to the Pareto frontier.

    The experiments of this thesis lead to the following conclusion: a near-linear-time recursive binary division of the decision space of candidates in a multi-objective optimization algorithm can find useful directions in which to mutate instances, and can find quality solutions much faster than traditional randomization approaches. Specifically, in comparative studies with standard methods (NSGA-II and SPEA2) applied to a variety of models, GALE required orders of magnitude fewer evaluations to find solutions. As a result, GALE can run dramatically faster than the other methods, especially for realistic models.
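    The split-and-prune descent described above can be sketched in a few lines of Python. The sketch below is illustrative only, not the thesis implementation: the `evaluate` function, the `dominates` Pareto test, and the simple distance-based projection onto the pole axis are assumptions standing in for the real model and for the spectral projection GALE actually uses.

```python
import random

def dist(a, b):
    """Euclidean distance between two decision vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def poles(pop):
    """Approximate the two most-different candidates ('poles') of a population."""
    anchor = random.choice(pop)
    east = max(pop, key=lambda c: dist(c, anchor))
    west = max(pop, key=lambda c: dist(c, east))
    return east, west

def gale_descent(pop, evaluate, dominates, min_leaf=8):
    """Recursively halve pop, evaluating only the two poles of each split and
    pruning any half whose pole is dominated; with pruning, one descent costs
    on the order of 2*log2(N) evaluations per generation."""
    if len(pop) <= min_leaf:
        return [pop]                                    # a leaf: one piece of the frontier
    east, west = poles(pop)
    e, w = evaluate(east), evaluate(west)               # the only evaluations at this level
    # Order candidates along the west-east axis and cut at the median.
    pop = sorted(pop, key=lambda c: dist(c, west) - dist(c, east))
    mid = len(pop) // 2
    halves = []
    if not dominates(e, w):
        halves.append(pop[:mid])                        # keep west half unless east pole dominates
    if not dominates(w, e):
        halves.append(pop[mid:])                        # keep east half unless west pole dominates
    leaves = []
    for half in halves:
        leaves.extend(gale_descent(half, evaluate, dominates, min_leaf))
    return leaves
```

    In the full algorithm, the members of each surviving leaf would then be mutated non-stochastically toward that leaf's better pole before the next generation begins.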

    Multimodal Code Search


    Practical synthesis from real-world oracles

    As software systems become increasingly heterogeneous, the ability of compilers to reason about an entire system has decreased. When components of a system are not implemented as traditional programs, but rather as specialised hardware, optimised architecture-specific libraries, or network services, the compiler is unable to cross these abstraction barriers and analyse the system as a whole. If these components could be modelled or understood as programs, then the compiler would be able to reason about their behaviour without concern for their internal implementation details: a homogeneous view of the entire system would be afforded. However, such components rarely correspond to any original program. This means that, to facilitate this homogeneous analysis, programmatic models of component behaviour must be learned or constructed automatically. Constructing these models is an inductive program synthesis problem, albeit a challenging one that is largely beyond the ability of existing implementations. For the problem to be made tractable, information provided by the underlying context (i.e. the real component behaviour to be matched) must be integrated. This thesis presents three program synthesis approaches that integrate contextual information to synthesise programmatic models for real, existing components. The first, Annote, exploits informally encoded information about a component's interface (e.g. from documentation) by weaving that information into an extended type-and-attribute system for component interfaces. The second, Presyn, learns a pair of cooperating probabilistic models from prior syntheses that aim to predict likely program structure based on a component's interface. Finally, Haze uses observations of common side-effects of component executions to bias the search for programs. These approaches are each evaluated against comparable synthesisers from the literature on a set of benchmark problems derived from real components. Learning models for component behaviour is only a partial solution; the compiler must also have some mechanism to use those models for program analysis and transformation. This thesis additionally proposes a novel mechanism for context-sensitive automatic API migration based on synthesised programmatic models, and evaluates its effectiveness on real application code. In summary, this thesis proposes a new framing for program synthesis problems that target the behaviour of real components, and demonstrates three different potential approaches to synthesis in this spirit. The success of these approaches is evaluated against implementations from the literature, and their results are used to drive a novel API migration technique.
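    The common framing shared by these approaches can be illustrated with a toy example of oracle-guided enumerative synthesis in Python. This is a hedged sketch under simplifying assumptions, not Annote, Presyn, or Haze: the primitive set, the integer domain, and the brute-force enumeration are placeholders; in the thesis, contextual information (interface documentation, learned structural priors, or observed side-effects) is what reorders and prunes such a search.

```python
from itertools import product

# Toy primitive operations a synthesised model may compose; an assumption for illustration.
PRIMITIVES = {
    "inc": lambda x: x + 1,     # x -> x + 1
    "dbl": lambda x: x * 2,     # x -> 2x
    "neg": lambda x: -x,        # x -> -x
}

def run(program, x):
    """Apply a sequence of primitive operations to an input value."""
    for op in program:
        x = PRIMITIVES[op](x)
    return x

def synthesize(component, inputs, max_len=3):
    """Return the shortest op sequence that matches the real component on all
    observed inputs; context would reorder or prune this enumeration."""
    observations = [(x, component(x)) for x in inputs]       # query the real component
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in observations):
                return program
    return None

# Example: learn a programmatic model of a black-box component computing 2 * (x + 1).
print(synthesize(lambda x: 2 * (x + 1), inputs=[0, 1, 5]))   # e.g. ('inc', 'dbl')
```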

    Active learning for software engineering

    Software applications have grown increasingly complex to deliver the features desired by users. Software modularity has been used as a way to mitigate the costs of developing such complex software. Active learning-based program inference provides an elegant framework that exploits this modularity to tackle development correctness, performance, and cost in large applications. Inferred programs can be used for many purposes, including generation of secure code, code re-use through automatic encapsulation, adaptation to new platforms or languages, and optimization. We show through detailed examples how our approach can infer three modules in a representative application. Finally, we outline the broader paradigm and open research questions.
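    As a rough illustration of the active-learning loop behind this kind of program inference, the Python sketch below keeps a pool of candidate programs for a module, repeatedly queries the real module on the input where the candidates disagree most, and discards candidates that contradict the observed answer. The module, hypothesis pool, and query pool are toy assumptions, not the paper's examples.

```python
def disagreement(candidates, x):
    """Number of distinct outputs the surviving candidates produce on input x."""
    return len({c(x) for c in candidates})

def infer(module, candidates, query_pool, budget=10):
    """Active learning loop: query the module where candidates disagree most,
    then keep only candidates consistent with the observed output."""
    for _ in range(budget):
        if len(candidates) <= 1:
            break                                          # converged (or budget wasted elsewhere)
        x = max(query_pool, key=lambda q: disagreement(candidates, q))
        y = module(x)                                      # one evaluation of the real module
        candidates = [c for c in candidates if c(x) == y]
    return candidates

# Example: recover which simple function a black-box module implements.
hypotheses = [lambda x: x + 1, lambda x: x * 2, lambda x: x * x]
print(len(infer(lambda x: x * x, hypotheses, query_pool=range(-3, 4))))   # ideally 1
```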