Finding state-of-the-art non-cryptographic hashes with genetic programming
Proceedings of: 9th International Conference, Reykjavik, Iceland, September 9-13, 2006. The design of non-cryptographic hash functions by means of evolutionary computation is a relatively new and unexplored problem. In this paper, we use the Genetic Programming paradigm to evolve collision-free and fast hash functions. To achieve robustness against collisions we use a fitness function based on a non-linearity concept, producing evolved hashes with a good degree of Avalanche Effect. The other main issue, efficiency, is ensured by using only very fast operators (both in hardware and software) and by limiting the number of nodes. Using this approach, we have created a new hash function, which we call gp-hash, that is able to outperform a set of five human-generated, widely used hash functions. This article has been financed by the Spanish MCyT-funded research project OP:LINK, Ref: TIN2005-08818-C04-02.
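The avalanche-based fitness the abstract describes can be illustrated with a short sketch: flip one input bit, count how many output bits change, and average over random inputs (an ideal hash flips about half the output bits, i.e. a score of 0.5). The `toy_hash` below is a well-known generic 32-bit integer mixer used purely as a stand-in; it is not the paper's evolved gp-hash, whose definition is not given here.

```python
import random

def toy_hash(x: int) -> int:
    # A well-known 32-bit integer mixer (illustrative stand-in only;
    # NOT the paper's evolved gp-hash)
    x = ((x >> 16) ^ x) * 0x45D9F3B & 0xFFFFFFFF
    x = ((x >> 16) ^ x) * 0x45D9F3B & 0xFFFFFFFF
    return ((x >> 16) ^ x) & 0xFFFFFFFF

def avalanche_score(h, trials=1000, bits=32):
    """Average fraction of output bits that flip when one random
    input bit is flipped; 0.5 is the ideal Avalanche Effect."""
    total = 0.0
    for _ in range(trials):
        x = random.getrandbits(bits)
        bit = 1 << random.randrange(bits)
        flipped = h(x) ^ h(x ^ bit)
        total += bin(flipped).count("1") / bits
    return total / trials
```

A fitness function along these lines rewards candidate hashes whose score is closest to 0.5.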
Evolving hash functions by means of genetic programming
Proceedings of the 8th annual conference on Genetic and evolutionary computation, Seattle, Washington, USA, July 8-12, 2006. The design of hash functions by means of evolutionary computation is a relatively new and unexplored problem. In this work, we use Genetic Programming (GP) to evolve robust and fast hash functions. We use a fitness function based on a non-linearity measure, producing evolved hashes with a good degree of Avalanche Effect. Efficiency is ensured by using only very fast operators (both in hardware and software) and by limiting the number of nodes. Using this approach, we have created a new hash function, which we call gp-hash, that is able to outperform a set of five human-generated, widely used hash functions. This article has been financed by the Spanish MCyT-funded research project OP:LINK, Ref: TIN2005-08818-C04-02.
Applying security features to GA4GH Phenopackets
The Global Alliance for Genomics and Health (GA4GH) has developed a standard file format called Phenopacket to improve the exchange of phenotypic information over the network. However, this standard does not implement any security mechanism, which allows an attacker who gets hold of a Phenopacket to obtain sensitive information. This project aims to provide security features within the Phenopacket schema to ensure a secure exchange. To achieve this objective, it is necessary to understand the structure of the schema in order to classify which fields need to be protected. Once the schema has been analyzed, an investigation is conducted into which technologies are currently the most secure, leading to the implementation of three security mechanisms: digital signature, encryption, and hashing. To conclude, several verification tests are performed to ensure that both the creation of the Phenopacket and the security measures applied have been correctly implemented, confirming that data exchange is possible without revealing any sensitive data.
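As a rough illustration of the hashing and integrity mechanisms the abstract describes, the sketch below wraps a Phenopacket-like JSON payload in an envelope carrying a SHA-256 digest and an HMAC. The field names and the use of a shared-key HMAC are simplifying assumptions; the actual project uses asymmetric digital signatures and encryption, which require a third-party library such as `cryptography`.

```python
import hashlib
import hmac
import json

def protect_packet(packet: dict, key: bytes) -> dict:
    """Wrap a packet with a digest and a keyed MAC (hypothetical envelope
    format, not the GA4GH schema)."""
    payload = json.dumps(packet, sort_keys=True).encode()
    return {
        "payload": packet,
        "sha256": hashlib.sha256(payload).hexdigest(),   # integrity
        "hmac": hmac.new(key, payload, hashlib.sha256).hexdigest(),  # authenticity
    }

def verify_packet(envelope: dict, key: bytes) -> bool:
    """Recompute digest and MAC over the canonical payload and compare."""
    payload = json.dumps(envelope["payload"], sort_keys=True).encode()
    ok_hash = hashlib.sha256(payload).hexdigest() == envelope["sha256"]
    ok_mac = hmac.compare_digest(
        hmac.new(key, payload, hashlib.sha256).hexdigest(), envelope["hmac"])
    return ok_hash and ok_mac
```

Canonical JSON serialization (`sort_keys=True`) is what makes the digest reproducible on the receiving side.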
Privacy in the Genomic Era
Genome sequencing technology has advanced at a rapid pace and it is now possible to generate highly detailed genotypes inexpensively. The collection and analysis of such data has the potential to support various applications, including personalized medical services. While the benefits of the genomics revolution are trumpeted by the biomedical community, the increased availability of such data has major implications for personal privacy, notably because the genome has certain essential features, which include (but are not limited to) (i) an association with traits and certain diseases, (ii) identification capability (e.g., forensics), and (iii) revelation of family relationships. Moreover, direct-to-consumer DNA testing increases the likelihood that genome data will be made available in less regulated environments, such as the Internet and for-profit companies. The problem of genome data privacy thus resides at the crossroads of computer science, medicine, and public policy. While computer scientists have addressed data privacy for various data types, less attention has been dedicated to genomic data. Thus, the goal of this paper is to provide a systematization of knowledge for the computer science community. In doing so, we address some of the (sometimes erroneous) beliefs of this field and we report on a survey we conducted about genome data privacy with biomedical specialists. Then, after characterizing the genome privacy problem, we review the state of the art regarding privacy attacks on genomic data and strategies for mitigating such attacks, as well as contextualizing these attacks from the perspective of medicine and public policy. The paper concludes with an enumeration of the challenges for genome data privacy and presents a framework to systematize the analysis of threats and the design of countermeasures as the field moves forward.
An experimental study on fitness distributions of tree shapes in GP with one-point crossover
Proceedings of: 12th European Conference, EuroGP 2009, Tübingen, Germany, April 15-17. In Genetic Programming (GP), One-Point Crossover is an alternative to the destructive properties and poor performance of Standard Crossover. One-Point Crossover acts in two phases, first making the population converge to a common tree shape, then looking for the best individual within that shape. So, we understand that One-Point Crossover performs an implicit evolution of tree shapes. We want to know whether making this evolution explicit could improve the search power of GP, but we first need to define how this evolution could be performed. In this work we made an exhaustive study of the fitness distributions of tree shapes for 6 different GP problems. We were able to identify common properties in the distributions, and we propose a method to explicitly evaluate tree shapes. Based on this method, in the future, we want to implement a new genetic operator and a novel representation system for GP. This work has been funded by the Spanish Ministry of Education and Science and FEDER under contract TIN2005-08818-C04 (the OPLINK project) and by Comunidad de Madrid under contract 2008/00035/001 (Técnicas de Aprendizaje Automático Aplicadas al Interfaz Cerebro-Ordenador).
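One way an explicit evaluation of tree shapes could look, as a purely hypothetical sketch (the shape encoding, primitives, and toy symbolic-regression task below are assumptions, not the paper's experimental setup): sample many random labelings of a fixed shape and summarize the resulting fitness distribution.

```python
import random
import statistics

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_labeling(shape):
    """Assign random primitives to a fixed tree shape.
    A shape is () for a leaf or (left, right) for a binary internal node."""
    if shape == ():
        return random.choice(["x", random.uniform(-1, 1)])
    op = random.choice(list(OPS))
    return (op, random_labeling(shape[0]), random_labeling(shape[1]))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, float):  # ephemeral constant
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def shape_fitness(shape, target, samples=200):
    """Estimate a shape's fitness distribution: sample random labelings of the
    shape and return (best, mean) squared error against the target function."""
    xs = [i / 10 for i in range(-10, 11)]
    errors = []
    for _ in range(samples):
        t = random_labeling(shape)
        errors.append(sum((evaluate(t, x) - target(x)) ** 2 for x in xs))
    return min(errors), statistics.mean(errors)
```

Summary statistics like these (best, mean, spread) are the kind of quantities a shape-level fitness function could rank shapes by.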
High-Level Object Oriented Genetic Programming in Logistic Warehouse Optimization
This dissertation focuses on optimizing work-flow in logistic warehouses and distribution centers. The main aim is to optimize process planning, scheduling, and dispatching. The problem belongs to the NP-hard complexity class, so finding an optimal solution is computationally very demanding. The main motivation for this work is to fill the gap between the optimization methods developed by researchers in the academic world and the methods used in commercial production environments. The core of the optimization algorithm is built on genetic programming driven by a context-free grammar. The main contributions of the thesis are a) to propose a new optimization algorithm that respects the following optimization criteria: makespan, resource utilization, and the congestion of warehouse aisles that may occur during task processing; b) to analyze historical operational data from a warehouse and to develop a set of benchmarks that can serve as reference baseline results for further research; and c) to try to outperform the baseline results set by a skilled and trained operations manager of one of the largest warehouses in Central Europe.
CDCL(Crypto) and Machine Learning based SAT Solvers for Cryptanalysis
Over the last two decades, we have seen a dramatic improvement in the efficiency of conflict-driven clause-learning Boolean satisfiability (CDCL SAT) solvers over industrial problems from a variety of applications such as verification, testing, security, and AI. The availability of such powerful general-purpose search tools as the SAT solver has led many researchers to propose SAT-based methods for cryptanalysis, including techniques for finding collisions in hash functions and breaking symmetric encryption schemes.
A feature of all of the previously proposed SAT-based cryptanalysis work is that it is blackbox, in the sense that the cryptanalysis problem is encoded as a SAT instance and then a CDCL SAT solver is invoked to solve said instance. A weakness of this approach is that the encoding thus generated may be too large for any modern solver to solve efficiently. Perhaps a more important weakness is that the solver is in no way specialized or tuned to solve the given instance. Finally, very little work has been done to leverage parallelism in the context of SAT-based cryptanalysis.
To address these issues, we developed a set of methods that improve on state-of-the-art SAT-based cryptanalysis along three fronts. First, we describe an approach called CDCL(Crypto) (inspired by the CDCL(T) paradigm) to tailor the internal subroutines of the CDCL SAT solver with domain-specific knowledge about cryptographic primitives. Specifically, we extend the propagation and conflict analysis subroutines of CDCL solvers with specialized code that has knowledge of the cryptographic primitive being analyzed by the solver. We demonstrate the power of this framework on two cryptanalysis tasks: algebraic fault attacks on, and differential cryptanalysis of, the SHA-1 and SHA-256 cryptographic hash functions. Second, we propose a machine-learning-based parallel SAT solver that performs well on cryptographic problems relative to many state-of-the-art parallel SAT solvers. Finally, we use a formulation of SAT as Bayesian moment matching to address the heuristic initialization problem in SAT solvers.
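To give a flavor of the SAT encodings involved: XOR constraints dominate hash-function circuits, and each two-input XOR becomes four CNF clauses in DIMACS-style literal numbering (positive integers for variables, negation by sign). The brute-force `solve` below is only a tiny stand-in for a real CDCL solver, included to show how clauses are consumed; it is workable only on toy instances.

```python
from itertools import product

def xor_clauses(a, b, c):
    """CNF clauses enforcing c = a XOR b, in DIMACS-style numbering."""
    return [[-a, -b, -c],  # a and b      -> not c
            [a, b, -c],    # not a, not b -> not c
            [a, -b, c],    # not a, b     -> c
            [-a, b, c]]    # a, not b     -> c

def solve(num_vars, clauses):
    """Brute-force satisfiability check (stand-in for a CDCL solver).
    Returns a satisfying assignment as a tuple of bools, or None."""
    for bits in product([False, True], repeat=num_vars):
        assign = lambda lit: bits[abs(lit) - 1] ^ (lit < 0)
        if all(any(assign(lit) for lit in clause) for clause in clauses):
            return bits
    return None
```

For example, adding the unit clauses `[[1], [3]]` to `xor_clauses(1, 2, 3)` forces variables 1 and 3 true, from which the solver deduces variable 2 false. Real encodings of SHA-1/SHA-256 chain millions of such constraints, which is exactly why instance size and solver specialization matter.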
- …