
    Evolutionary prototype selection for multi-output regression

    A novel approach to prototype selection for multi-output regression data sets is presented. A multi-objective evolutionary algorithm is used to evaluate the selections using two criteria: training data set compression and prediction quality expressed in terms of root mean squared error. A multi-target regressor based on k-NN was used during training to evaluate the error, while the tests were performed using four different multi-target predictive models. The distance matrices used by the multi-target regressor were cached to accelerate performance. Multiple Pareto fronts were also used to prevent overfitting and to obtain a broader range of solutions, by using different probabilities in the initialization of populations and different evolutionary parameters in each one. The results obtained with the benchmark data sets showed that the proposed method greatly reduced data set size and, at the same time, improved the predictive capabilities of the multi-output regressors trained on the reduced data set. Funding: NCN (Polish National Science Center) grant “Evolutionary Methods in Data Selection” No. 2017/01/X/ST6/00202; project TIN2015-67534-P (MINECO/FEDER, UE) of the Ministerio de Economía y Competitividad of the Spanish Government; and project BU085P17 (JCyL/FEDER, UE) of the Junta de Castilla y León, co-financed with European Union FEDER funds.
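
    As a rough illustration of the two selection criteria, the sketch below scores one candidate prototype selection with a k-NN multi-target regressor; the boolean-mask encoding, the value of k, and the variable names are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def selection_fitness(mask, X, Y, k=3):
    """Return (compression, rmse) for a boolean prototype mask over X, Y."""
    Xs, Ys = X[mask], Y[mask]                     # retained prototypes
    model = KNeighborsRegressor(n_neighbors=min(k, len(Xs)))
    model.fit(Xs, Ys)                             # k-NN natively handles multiple targets
    pred = model.predict(X)                       # error measured on the full training set
    rmse = np.sqrt(np.mean((pred - Y) ** 2))
    compression = 1.0 - mask.mean()               # fraction of instances removed
    return compression, rmse                      # both objectives feed the Pareto ranking
```

    A multi-objective evolutionary algorithm would evolve the masks and keep the non-dominated (compression, rmse) trade-offs; caching the distance matrices, as the authors do, avoids recomputing distances for every candidate mask.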

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise from mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error, or simply natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences, and it can identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques, drawn from the full gamut of Computer Science and Statistics, are now used. In this paper, we present a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

    Multi-objective improvement of software using co-evolution and smart seeding

    Optimising non-functional properties of software is an important part of the implementation process. One such property is execution time, and compilers target its reduction using a variety of optimisation techniques. Compiler optimisation is not always able to produce semantically equivalent alternatives that improve execution times, even if such alternatives are known to exist; often, this is due to the local nature of such optimisations. In this paper we present a novel framework for optimising existing software using a hybrid of evolutionary optimisation techniques. Given as input the implementation of a program or function, we use Genetic Programming to evolve a new semantically equivalent version, optimised to reduce execution time subject to a given probability distribution of inputs. We employ a co-evolved population of test cases to encourage the preservation of the program’s semantics, and exploit the original program through seeding of the population in order to focus the search. We carry out experiments to identify the important factors in maximising efficiency gains. Although in this work we have optimised execution time, other non-functional criteria could be optimised in a similar manner.
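
    The following sketch illustrates the co-evolutionary scoring idea under stated assumptions: candidate programs are rewarded for matching the original's outputs on the evolved test inputs and for running quickly, while test inputs are rewarded for exposing discrepancies. The names (original, candidates, tests) are illustrative, not the paper's API.

```python
import time

def score_programs(original, candidates, tests):
    """Score each candidate by (semantic errors, total runtime) on the tests."""
    scores = []
    for prog in candidates:
        errors, elapsed = 0, 0.0
        for x in tests:
            t0 = time.perf_counter()
            out = prog(x)
            elapsed += time.perf_counter() - t0
            if out != original(x):                # the original program is the oracle
                errors += 1
        scores.append((errors, elapsed))          # semantics first, speed second
    return scores

def score_tests(original, candidates, tests):
    # a test is fitter the more candidates it distinguishes from the original
    return [sum(prog(x) != original(x) for prog in candidates) for x in tests]
```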

    Towards the Evolution of Novel Vertical-Axis Wind Turbines

    Renewable and sustainable energy is one of the most important challenges currently facing mankind. Wind has made an increasing contribution to the world's energy supply mix, but still remains a long way from reaching its full potential. In this paper, we investigate the use of artificial evolution to design vertical-axis wind turbine prototypes that are physically instantiated and evaluated under approximated wind tunnel conditions. An artificial neural network is used as a surrogate model to assist learning and is found to reduce the number of fabrications required to reach a higher aerodynamic efficiency, resulting in an important cost reduction. Unlike in other approaches, such as computational fluid dynamics simulations, no mathematical formulations are used and no model assumptions are made.
    Comment: 14 pages, 11 figures
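
    A minimal sketch of surrogate-assisted evolution in the spirit described above, assuming a vector encoding of turbine designs and a costly evaluate_in_wind_tunnel function (both hypothetical): a neural network learns the design-to-efficiency mapping from the fabrications made so far and pre-screens offspring so that only the most promising design is physically built.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
archive_X, archive_y = [], []                      # physically evaluated designs

def bootstrap(n_init, dim, evaluate_in_wind_tunnel):
    # fabricate a few random designs first so the surrogate has training data
    for _ in range(n_init):
        d = rng.uniform(-1.0, 1.0, dim)
        archive_X.append(d)
        archive_y.append(evaluate_in_wind_tunnel(d))

def surrogate_step(parents, evaluate_in_wind_tunnel):
    surrogate = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000)
    surrogate.fit(np.array(archive_X), np.array(archive_y))
    # generate many cheap candidates, screen them with the surrogate
    offspring = [p + rng.normal(0, 0.1, p.shape) for p in parents for _ in range(10)]
    best = offspring[int(np.argmax(surrogate.predict(np.array(offspring))))]
    y = evaluate_in_wind_tunnel(best)              # the only costly evaluation per step
    archive_X.append(best)
    archive_y.append(y)
    return best, y
```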

    An adaptive stigmergy-based system for evaluating technological indicator dynamics in the context of smart specialization

    Regional innovation is increasingly considered an important enabler of welfare. It is no coincidence that the European Commission has started looking at regional peculiarities and dynamics, in order to focus Research and Innovation Strategies for Smart Specialization towards effective investment policies. In this context, this work aims to support policy makers in the analysis of innovation-relevant trends. We exploit a European database of regional patent applications to determine the dynamics of a set of technological innovation indicators. For this purpose, we design and develop a software system for assessing unfolding trends in such indicators. In contrast with conventional knowledge-based design, our approach is biologically inspired and based on self-organization of information: a functional structure, called a track, emerges and persists spontaneously at run time when local dynamism occurs in the data. Further prototyping of tracks allows a better distinction of critical phenomena during unfolding events, with a better assessment of their progression. The proposed mechanism works only if its structural parameters are correctly tuned for the given historical context; determining such parameters is not a simple task, since different indicators may have different dynamics. For this purpose, we adopt an adaptation mechanism based on differential evolution. The study includes the problem statement and its characterization in the literature, as well as the proposed solving approach, experimental setting, and results.
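
    Since the abstract names differential evolution explicitly, a minimal DE/rand/1/bin loop for tuning the structural parameters on historical data might look as follows; the quality function, the normalized bounds, and the constants are assumptions for illustration.

```python
import numpy as np

def tune_parameters(quality, dim, pop_size=20, F=0.8, CR=0.9, gens=100):
    """Maximize quality(params) over [0, 1]^dim with DE/rand/1/bin."""
    rng = np.random.default_rng(42)
    pop = rng.uniform(0.0, 1.0, (pop_size, dim))   # normalized structural parameters
    fit = np.array([quality(p) for p in pop])
    for _ in range(gens):
        for i in range(pop_size):
            idx = rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)
            a, b, c = pop[idx]
            mutant = np.clip(a + F * (b - c), 0.0, 1.0)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True        # take at least one gene from the mutant
            trial = np.where(cross, mutant, pop[i])
            f = quality(trial)
            if f > fit[i]:                         # greedy one-to-one replacement
                pop[i], fit[i] = trial, f
    return pop[int(np.argmax(fit))]
```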

    Evolutionary improvement of programs

    Most applications of genetic programming (GP) involve the creation of an entirely new function, program, or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task, achieved in part by optimizing compilers. However, modern compilers are not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist; this is usually due to the limited, local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate, on eight example functions, that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate. We employ a coevolved population of test cases to encourage the preservation of the function's semantics, and we exploit the original program both through seeding of the population, in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ a rigorous experimental method to provide practical insights into how to address these issues.
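
    A hedged sketch of the seeding and oracle ideas mentioned above: the initial GP population is built largely from the original program and light mutations of it, and the unmodified program defines correct behaviour on sampled inputs. The tree representation and the mutate_ast, run, and sample_inputs helpers are assumptions, not the paper's actual API.

```python
import copy

def seed_population(original_ast, size, mutate_ast, seed_fraction=0.5):
    """Build an initial population biased towards the original program."""
    population = []
    for i in range(size):
        indiv = copy.deepcopy(original_ast)        # exploit the original implementation
        if i >= size * seed_fraction:
            indiv = mutate_ast(mutate_ast(indiv))  # the rest start further away
        elif i > 0:
            indiv = mutate_ast(indiv)              # light variations around the seed
        population.append(indiv)
    return population

def passes_oracle(candidate_ast, original_ast, run, sample_inputs, n=100):
    # the original program acts as the oracle for semantic equivalence
    return all(run(candidate_ast, x) == run(original_ast, x) for x in sample_inputs(n))
```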

    Local Rule-Based Explanations of Black Box Decision Systems

    Recent years have witnessed the rise of accurate but obscure decision systems, which hide the logic of their internal decision processes from users. The lack of explanations for the decisions of black box systems is a key ethical issue and a limitation to the adoption of machine learning components in socially sensitive and safety-critical contexts; we therefore need explanations that reveal the reasons why a predictor takes a certain decision. In this paper we focus on the problem of black box outcome explanation, i.e., explaining the reasons for the decision taken on a specific instance. We propose LORE, an agnostic method able to provide interpretable and faithful explanations. LORE first learns a local interpretable predictor on a synthetic neighborhood generated by a genetic algorithm. It then derives from the logic of the local interpretable predictor a meaningful explanation consisting of a decision rule, which explains the reasons for the decision, and a set of counterfactual rules, suggesting the changes in the instance's features that would lead to a different outcome. Extensive experiments show that LORE outperforms existing methods and baselines both in the quality of explanations and in the accuracy of mimicking the black box.
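
    The shape of this pipeline can be sketched as follows, with two loud caveats: the genetic neighborhood generation is replaced by simple Gaussian perturbation for brevity, and black_box, x, and the tree depth are illustrative assumptions rather than LORE's actual design.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def explain(black_box, x, n=1000, sigma=0.3):
    """Fit a local tree around instance x and read off its decision rule."""
    rng = np.random.default_rng(0)
    Z = x + rng.normal(0, sigma, size=(n, x.shape[0]))    # synthetic neighborhood
    y = black_box(Z)                                      # labels from the black box
    tree = DecisionTreeClassifier(max_depth=4).fit(Z, y)  # local interpretable predictor
    path = tree.decision_path(x.reshape(1, -1)).indices   # nodes visited by x
    premises = []
    for node in path[:-1]:                                # the leaf holds no split
        f = tree.tree_.feature[node]
        thr = tree.tree_.threshold[node]
        op = "<=" if x[f] <= thr else ">"
        premises.append(f"feature[{f}] {op} {thr:.2f}")
    return " AND ".join(premises), tree                   # premise of the decision rule
```

    Counterfactual rules would be read off in the same way from tree paths that end in a different class.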

    A systematic review of data quality issues in knowledge discovery tasks

    Data volumes are growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of this data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.