422 research outputs found

    On Greedy Algorithms for Binary de Bruijn Sequences

    Full text link
    We propose a general greedy algorithm for binary de Bruijn sequences, called Generalized Prefer-Opposite (GPO) Algorithm, and its modifications. By identifying specific feedback functions and initial states, we demonstrate that most previously-known greedy algorithms that generate binary de Bruijn sequences are particular cases of our new algorithm

    A Study of Syntactic and Semantic Artifacts and its Application to Lambda Definability, Strong Normalization, and Weak Normalization in the Presence of...

    Get PDF
    Church's lambda-calculus underlies the syntax (i.e., the form) and the semantics (i.e., the meaning) of functional programs. This thesis is dedicated to studying man-made constructs (i.e., artifacts) in the lambda calculus. For example, one puts the expressive power of the lambda calculus to the test in the area of lambda definability. In this area, we present a course-of-value representation bridging Church numerals and Scott numerals. We then turn to weak and strong normalization using Danvy et al.'s syntactic and functional correspondences. We give a new account of Felleisen and Hieb's syntactic theory of state, and of abstract machines for strong normalization due to Curien, Crégut, Lescanne, and Kluge

    Large Genomes Assembly Using MAPREDUCE Framework

    Get PDF
    Knowing the genome sequence of an organism is the essential step toward understanding its genomic and genetic characteristics. Currently, whole genome shotgun (WGS) sequencing is the most widely used genome sequencing technique to determine the entire DNA sequence of an organism. Recent advances in next-generation sequencing (NGS) techniques have enabled biologists to generate large DNA sequences in a high-throughput and low-cost way. However, the assembly of NGS reads faces significant challenges due to short reads and an enormously high volume of data. Despite recent progress in genome assembly, current NGS assemblers cannot generate high-quality results or efficiently handle large genomes with billions of reads. In this research, we proposed a new Genome Assembler based on MapReduce (GAMR), which tackles both limitations. GAMR is based on a bi-directed de Bruijn graph and implemented using the MapReduce framework. We designed a distributed algorithm for each step in GAMR, making it scalable in assembling large-scale genomes. We also proposed novel gap-filling algorithms to improve assembly results to achieve higher accuracy and more extended continuity. We evaluated the assembly performance of GAMR using benchmark data and compared it against other NGS assemblers. We also demonstrated the scalability of GAMR by using it to assemble loblolly pine (~22Gbp). The results showed that GAMR finished the assembly much faster and with a much lower requirement of computing resources

    Combinatorics of explicit substitutions

    Full text link
    λυ\lambda\upsilon is an extension of the λ\lambda-calculus which internalises the calculus of substitutions. In the current paper, we investigate the combinatorial properties of λυ\lambda\upsilon focusing on the quantitative aspects of substitution resolution. We exhibit an unexpected correspondence between the counting sequence for λυ\lambda\upsilon-terms and famous Catalan numbers. As a by-product, we establish effective sampling schemes for random λυ\lambda\upsilon-terms. We show that typical λυ\lambda\upsilon-terms represent, in a strong sense, non-strict computations in the classic λ\lambda-calculus. Moreover, typically almost all substitutions are in fact suspended, i.e. unevaluated, under closures. Consequently, we argue that λυ\lambda\upsilon is an intrinsically non-strict calculus of explicit substitutions. Finally, we investigate the distribution of various redexes governing the substitution resolution in λυ\lambda\upsilon and investigate the quantitative contribution of various substitution primitives
    corecore