12 research outputs found

    Evolving hash functions by means of genetic programming

    Proceedings of the 8th annual conference on Genetic and evolutionary computation. Seattle, Washington, USA, July 08-12, 2006. The design of hash functions by means of evolutionary computation is a relatively new and unexplored problem. In this work, we use Genetic Programming (GP) to evolve robust and fast hash functions. We use a fitness function based on a non-linearity measure, producing evolved hashes with a good degree of Avalanche Effect. Efficiency is assured by using only very fast operators (both in hardware and software) and by limiting the number of nodes. Using this approach, we have created a new hash function, which we call gp-hash, that is able to outperform a set of five human-generated, widely used hash functions. This article has been financed by the Spanish-funded research MCyT project OP:LINK, Ref:TIN2005-08818-C04-02.
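
The Avalanche Effect mentioned in this abstract can be illustrated with a small measurement. The sketch below is not the paper's fitness function, which is based on a non-linearity measure not detailed in the abstract. It is a minimal stand-in that estimates the fraction of output bits that change when one input bit is flipped. The names `avalanche_score` and `fnv1a_32` are illustrative, and FNV-1a serves only as a familiar example hash, not as gp-hash.

```python
import random

MASK32 = 0xFFFFFFFF

def fnv1a_32(key: int) -> int:
    """FNV-1a over the 4 bytes of a 32-bit key (example hash, not gp-hash)."""
    h = 0x811C9DC5                      # FNV offset basis (32-bit)
    for shift in (0, 8, 16, 24):        # feed the key byte by byte
        h ^= (key >> shift) & 0xFF
        h = (h * 0x01000193) & MASK32   # FNV prime (32-bit)
    return h

def avalanche_score(hash_fn, trials: int = 2000, bits: int = 32, seed: int = 0) -> float:
    """Mean fraction of output bits that flip when one random input bit is flipped."""
    rng = random.Random(seed)
    flipped = 0
    for _ in range(trials):
        key = rng.getrandbits(bits)
        bit = 1 << rng.randrange(bits)              # single-bit perturbation
        diff = hash_fn(key) ^ hash_fn(key ^ bit)    # output bits that changed
        flipped += bin(diff).count("1")
    return flipped / (trials * bits)

identity_score = avalanche_score(lambda k: k)   # identity: exactly one bit flips
fnv_score = avalanche_score(fnv1a_32)
print(f"identity: {identity_score:.4f}  fnv1a: {fnv_score:.4f}")
```

A score near 0.5 is what one would hope for from a well-mixing 32-bit hash; the identity function scores exactly 1/32, the worst case for a bijective function.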

    Finding state-of-the-art non-cryptographic hashes with genetic programming

    Proceeding of: 9th International Conference, Reykjavik, Iceland, September 9-13, 2006. The design of non-cryptographic hash functions by means of evolutionary computation is a relatively new and unexplored problem. In this paper, we use the Genetic Programming paradigm to evolve collision-free and fast hash functions. To achieve robustness against collisions, we use a fitness function based on a non-linearity concept, producing evolved hashes with a good degree of Avalanche Effect. The other main issue, efficiency, is assured by using only very fast operators (both in hardware and software) and by limiting the number of nodes. Using this approach, we have created a new hash function, which we call gp-hash, that is able to outperform a set of five human-generated, widely used hash functions. This article has been financed by the Spanish-funded research MCyT project OP:LINK, Ref:TIN2005-08818-C04-02.

    A first attempt at constructing genetic programming expressions for EEG classification

    Proceeding of: 15th International Conference on Artificial Neural Networks ICANN 2005, Poland, 11-15 September, 2005. In BCI (Brain Computer Interface) research, the classification of EEG signals is a domain where raw data has to undergo some preprocessing so that the right attributes for classification are obtained. Several transformational techniques have been used for this purpose: Principal Component Analysis, the Adaptive Autoregressive Model, FFT or Wavelet Transforms, etc. However, it would be useful to automatically build significant attributes appropriate for each particular problem. In this paper, we use Genetic Programming to evolve projections that translate EEG data into a new vector space (the coordinates of this space being the new attributes), where projected data can be more easily classified. Although our method is applied here in a straightforward way to check for feasibility, it has achieved reasonable classification results that are comparable to those obtained by other state-of-the-art algorithms. In the future, we expect that by carefully choosing primitive functions, Genetic Programming will be able to give original results that cannot be matched by other machine learning classification algorithms.

    Genetic programming based data projections for classification tasks

    In this paper we present a GP-based method for automatically evolving projections, so that data can be more easily classified in the projected spaces. At the same time, our approach can reduce dimensionality by constructing more relevant attributes. The fitness of each projection measures how easy it is to classify the dataset after applying the projection. This is quickly computed by a Simple Linear Perceptron. We have tested our approach in three domains. The experiments show that it obtains good results compared to other Machine Learning approaches, while reducing dimensionality in many cases.
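
The fitness idea in this abstract (score a projection by how well a simple linear perceptron classifies the projected data) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the learning rate, the epoch count, and the XOR-style toy data are all hypothetical, and the "evolved" projection is written by hand rather than produced by GP.

```python
def perceptron_accuracy(points, labels, epochs=50, lr=0.1):
    """Train a linear perceptron and return its accuracy on the same data."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):            # y is +1 or -1
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:                 # misclassified: perceptron update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    correct = sum(
        1 for x, y in zip(points, labels)
        if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0
    )
    return correct / len(points)

def projection_fitness(projection, data, labels):
    """Fitness of a projection: perceptron accuracy in the projected space."""
    projected = [projection(x) for x in data]
    return perceptron_accuracy(projected, labels)

# Toy XOR-like dataset: not linearly separable in the original space.
data = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

raw_fitness = projection_fitness(lambda p: p, data, labels)
# Hand-written stand-in for an evolved expression: 2 attributes reduced to 1.
product_fitness = projection_fitness(lambda p: (p[0] * p[1],), data, labels)
print(raw_fitness, product_fitness)
```

In this toy case the one-dimensional projection makes the classes linearly separable, so the perceptron reaches full accuracy while the raw representation cannot. That is the kind of signal such a fitness function would reward, along with the drop from two attributes to one.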

    GPPE: a method to generate ad-hoc feature extractors for prediction in financial domains

    When dealing with classification and regression problems, there is a strong need for high-quality attributes. This is a key issue not only in financial problems, but in many Data Mining domains. Constructive Induction methods help to overcome this problem by mapping the original representation into a new one, where prediction becomes easier. In this work we present GPPE: a GP-based method that projects data from an original data space into another one where data approaches linear behavior (linear separability or linear regression). Also, GPPE is able to reduce the dimensionality of the problem by recombining related attributes and discarding irrelevant ones. We have applied GPPE to two financial domains: Bankruptcy prediction and IPO Underpricing prediction. In both cases GPPE automatically generated a new data representation that obtained competitive prediction rates and drastically reduced the dimensionality of the problem.

    An experimental study on fitness distributions of tree shapes in GP with one-point crossover

    Proceeding of: 12th European Conference, EuroGP 2009, Tübingen, Germany, April 15-17. In Genetic Programming (GP), One-Point Crossover is an alternative to Standard Crossover, which has destructive properties and performs poorly. One-Point Crossover acts in two phases, first making the population converge to a common tree shape, then looking for the best individual within that shape. We therefore understand One-Point Crossover as performing an implicit evolution of tree shapes. We want to know whether making this evolution explicit could improve the search power of GP, but we first need to define how this evolution could be performed. In this work we made an exhaustive study of fitness distributions of tree shapes for 6 different GP problems. We were able to identify common properties of the distributions, and we propose a method to explicitly evaluate tree shapes. Based on this method, in the future, we want to implement a new genetic operator and a novel representation system for GP. This work has been funded by the Spanish Ministry of Education and Science and FEDER under contract TIN2005-08818-C04 (the OPLINK project) and by Comunidad de Madrid under contract 2008/00035/001 (Técnicas de Aprendizaje Automático Aplicadas al Interfaz Cerebro-Ordenador).
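
For readers unfamiliar with One-Point Crossover, the sketch below shows the usual idea: both parents are traversed from the root, a crossover point is picked inside the region where their shapes match, and the subtrees at that same position are swapped. The tree encoding (nested tuples with the operator first, strings as terminals), the function names, and the choice to allow the root as a crossover point are assumptions made for illustration; the paper's own setup may differ.

```python
import random

def common_points(a, b, path=()):
    """Paths (tuples of child indices) to node pairs in the common region."""
    points = [path]
    a_kids = a[1:] if isinstance(a, tuple) else ()
    b_kids = b[1:] if isinstance(b, tuple) else ()
    if len(a_kids) == len(b_kids):      # same arity: descend into both trees
        for i, (ca, cb) in enumerate(zip(a_kids, b_kids), start=1):
            points.extend(common_points(ca, cb, path + (i,)))
    return points

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def one_point_crossover(a, b, rng):
    """Swap the subtrees found at one shared position of both parents."""
    path = rng.choice(common_points(a, b))
    child_a = replace_subtree(a, path, get_subtree(b, path))
    child_b = replace_subtree(b, path, get_subtree(a, path))
    return child_a, child_b

def size(tree):
    """Number of nodes in a tree."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(size(child) for child in tree[1:])

parent_a = ("+", ("*", "x", "y"), "z")
parent_b = ("-", "x", ("+", "y", ("*", "z", "x")))
child_a, child_b = one_point_crossover(parent_a, parent_b, random.Random(0))
print(child_a, child_b)
```

Because both children take material from the same position in each parent, the offspring tend to preserve the shape shared by the parents near the root, which is consistent with the convergence to a common tree shape that the abstract describes.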

    Optimization Algorithms for Large-Scale Real-World Instances of the Frequency Assignment Problem

    Mobile communications are currently experiencing strong growth and becoming more and more indispensable. One of the key issues in the design of mobile networks is the Frequency Assignment Problem (FAP). This problem is crucial at present and will remain important in the foreseeable future. Real-world instances of FAP typically involve very large networks, which can only be handled by heuristic methods. In the present work, we are interested in optimizing frequency assignments for problems described in a mathematical formalism that incorporates actual interference information, measured directly in the field, as is done in current GSM networks. To achieve this goal, a range of metaheuristics have been designed, adapted, and rigorously compared on two actual GSM networks modeled according to this formalism. In order to generate high-quality solutions quickly and reliably, all metaheuristics combine their global search capabilities with a local-search method specially tailored for this domain. The experiments and statistical tests show that, in general, all metaheuristics are able to improve upon results published in previous studies, but two of the metaheuristics emerge as the best performers: a population-based algorithm (Scatter Search) and a trajectory-based (1+1) Evolutionary Algorithm. Finally, the analysis of the frequency plans obtained offers insight into how the interference cost is reduced in the optimal plans.
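
As a rough illustration of the trajectory-based (1+1) Evolutionary Algorithm named in this abstract, the sketch below keeps a single frequency plan, mutates it, and accepts the mutant only if it is not worse. Everything specific here is an assumption: the toy interference matrix, the cost model (co-channel interference only), the mutation rate of 1/n, and the function names. The tailored local search that the paper combines with each metaheuristic is omitted.

```python
import random

def plan_cost(plan, interference):
    """Sum of pairwise interference between transmitters sharing a frequency."""
    cost = 0.0
    n = len(plan)
    for i in range(n):
        for j in range(i + 1, n):
            if plan[i] == plan[j]:
                cost += interference[i][j]
    return cost

def one_plus_one_ea(interference, n_freqs, iterations=2000, seed=0):
    """(1+1) EA: one parent, one mutated child, keep the child if not worse."""
    rng = random.Random(seed)
    n = len(interference)
    parent = [rng.randrange(n_freqs) for _ in range(n)]
    parent_cost = plan_cost(parent, interference)
    history = [parent_cost]
    for _ in range(iterations):
        child = parent[:]
        for i in range(n):                      # reassign each transmitter with prob. 1/n
            if rng.random() < 1.0 / n:
                child[i] = rng.randrange(n_freqs)
        child_cost = plan_cost(child, interference)
        if child_cost <= parent_cost:           # elitist acceptance
            parent, parent_cost = child, child_cost
        history.append(parent_cost)
    return parent, parent_cost, history

# Hypothetical symmetric interference matrix for 6 transmitters.
INTERFERENCE = [
    [0, 5, 1, 0, 2, 0],
    [5, 0, 4, 1, 0, 0],
    [1, 4, 0, 3, 1, 2],
    [0, 1, 3, 0, 6, 1],
    [2, 0, 1, 6, 0, 4],
    [0, 0, 2, 1, 4, 0],
]
best_plan, best_cost, history = one_plus_one_ea(INTERFERENCE, n_freqs=3)
print(best_plan, best_cost)
```

Accepting equal-cost children lets the search drift across plateaus, a common choice for this kind of algorithm. Real GSM instances would also need adjacent-channel interference and per-transmitter frequency constraints, which this toy model leaves out.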

    Diseño automático de funciones hash no criptográficas (Automatic design of non-cryptographic hash functions)

    Non-cryptographic hash functions are among the most widely used tools in computer science. Their countless fields of application range from lexical analyzers and compilers to databases, caches, communication networks, Bloom filters, pattern recognition algorithms, computer games, DNS servers, file systems, and practically any piece of code that needs to look up or index information at high speed. Their tremendous usefulness stems from the fact that they can perform searches in constant time, regardless of the size of the set being searched. However, the design of these mathematical expressions remains, to this day, a task that is little known to software engineers, scarcely documented, and traditionally carried out by experts through practically artisanal processes. The main reason is that a good hash function must generate pseudorandom, seemingly unpredictable outputs, and its design therefore involves highly non-linear structures and chaotic systems. This kind of design is, by definition, very unintuitive and poses significant difficulties, even for hashing experts. On the other hand, these same characteristics that are so demanding for humans seem very well suited for artificial intelligence techniques such as Genetic Programming (GP) to automate the work, replacing the experts in the production of good hash functions. This thesis presents GP-hash, a GP-based system capable of automatically generating high-quality non-cryptographic hash functions. This document will show empirically that GP-hash can generate general-purpose hash functions whose performance equals or exceeds that of the functions most widely used in industry, all of them created by some of the world's leading hashing experts, and which form the current state of the art.
    This result alone is already important, since it supports the claim that GP can match (and sometimes surpass) humans at a task that clearly requires a certain level of intelligence. In addition, GP-hash can also be used to generate tailor-made hash functions, specifically designed to offer optimal performance on a particular problem. It will be argued that, if the GP-hash system is trained under certain conditions, a very high percentage of the generated functions easily outperform the most widely used general-purpose functions on the problem in question. This would allow software engineers, with or without specific knowledge of hashing, to avoid one of the most delicate decisions in this type of system, and one of those that cause the most performance problems: choosing a hash function suited to their particular case. Instead, they will be able to use GP-hash to obtain a function specifically designed for their problem, which will very likely deliver excellent performance on it. The practical applications of such a system are enormous and immediate, and could come to have a significant impact on the software industry. Finally, during the course of this research it was observed that there was no standardized, universally accepted method for comparing non-cryptographic hash functions with one another, which was a fundamental problem for the goals of this thesis: one cannot claim that a hash function is competitive without an objective way of comparing it with other functions. The final contribution of this work is to fill this structural gap: the hashing metrics most widely used in the literature will be collected and analyzed, and a systematic, structured reference framework will be proposed to the scientific community for standardizing the evaluation of non-cryptographic hash functions.