3 research outputs found

    Balanced Families of Perfect Hash Functions and Their Applications

    The construction of perfect hash functions is a well-studied topic. In this paper, this concept is generalized with the following definition. We say that a family of functions from $[n]$ to $[k]$ is a $\delta$-balanced $(n,k)$-family of perfect hash functions if for every $S \subseteq [n]$, $|S|=k$, the number of functions that are 1-1 on $S$ is between $T/\delta$ and $\delta T$ for some constant $T>0$. The standard definition of a family of perfect hash functions requires that there will be at least one function that is 1-1 on $S$, for each $S$ of size $k$. In the new notion of balanced families, we require the number of 1-1 functions to be almost the same (taking $\delta$ to be close to 1) for every such $S$. Our main result is that for any constant $\delta > 1$, a $\delta$-balanced $(n,k)$-family of perfect hash functions of size $2^{O(k \log \log k)} \log n$ can be constructed in time $2^{O(k \log \log k)} n \log n$. Using the technique of color-coding we can apply our explicit constructions to devise approximation algorithms for various counting problems in graphs. In particular, we exhibit a deterministic polynomial time algorithm for approximating both the number of simple paths of length $k$ and the number of simple cycles of size $k$ for any $k \leq O(\frac{\log n}{\log \log \log n})$ in a graph with $n$ vertices. The approximation is up to any fixed desirable relative error.
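    The balance condition in the definition above can be checked by brute force on tiny instances. The following is only an illustrative sketch, not the paper's construction: the names `injective_counts` and `is_balanced` are hypothetical, functions are represented as tuples `f` with `f[x]` in `range(k)`, and the test uses the observation that a suitable $T>0$ exists exactly when the smallest count is positive and the largest count is at most $\delta^2$ times the smallest.

```python
from itertools import combinations, product


def injective_counts(family, n, k):
    """For each k-subset S of range(n), count the functions in `family`
    that are 1-1 on S. Each function is a tuple f with f[x] in range(k)."""
    counts = {}
    for S in combinations(range(n), k):
        counts[S] = sum(1 for f in family if len({f[x] for x in S}) == k)
    return counts


def is_balanced(family, n, k, delta):
    """Return True if some T > 0 satisfies T/delta <= count <= delta*T
    for every k-subset, i.e. the family is delta-balanced (brute force)."""
    counts = injective_counts(family, n, k)
    lo, hi = min(counts.values()), max(counts.values())
    # T must lie in [hi/delta, delta*lo]; that interval contains a
    # positive T exactly when lo > 0 and hi <= delta**2 * lo.
    return lo > 0 and hi <= delta * delta * lo


# The family of all functions from [4] to [2] is perfectly balanced:
# every 2-subset is hit injectively by the same number of functions.
all_functions = [tuple(f) for f in product(range(2), repeat=4)]

# A two-function family that is a perfect hash family (every 2-subset has
# at least one 1-1 function) but whose counts vary between subsets.
small_family = [(0, 1, 0, 1), (0, 0, 1, 1)]

print(is_balanced(all_functions, 4, 2, 1.0))
print(is_balanced(small_family, 4, 2, 1.2))
print(is_balanced(small_family, 4, 2, 1.5))
```

    The exhaustive loop over all $k$-subsets is exponential and is meant only to make the definition concrete; the paper's contribution is an explicit construction of small balanced families, which this sketch does not attempt.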

    A Methodological Framework for the Reconstruction of Contiguous Regions of Ancestral Genomes and Its Application to Mammalian Genomes

    The reconstruction of ancestral genome architectures and gene orders from homologies between extant species is a long-standing problem, considered by both cytogeneticists and bioinformaticians. The two approaches were recently compared and discussed in a series of papers, sometimes with diverging points of view regarding their performance. We describe a general methodological framework for reconstructing ancestral genome segments from conserved syntenies in extant genomes. We show that this problem, from a computational point of view, is naturally related to physical mapping of chromosomes and benefits from the combinatorial tools developed for that purpose. We develop this framework into a new reconstruction method that considers conserved gene clusters with similar gene content, mimicking principles used in most cytogenetic studies, although on a different kind of data. We implement it and apply it to datasets of mammalian genomes. We perform intensive theoretical and experimental comparisons with other bioinformatics methods for the reconstruction of ancestral genome segments. We show that the method we propose is stable and reliable: it gives convergent results using several kinds of data at different levels of resolution, and all predicted ancestral regions are well supported. The results eventually come very close to those of cytogenetic studies. This suggests that the comparison of methods for ancestral genome reconstruction should include the algorithmic aspects of the methods as well as the disciplinary differences in data acquisition.