On Greedy Algorithms for Binary de Bruijn Sequences
We propose a general greedy algorithm for binary de Bruijn sequences, called
the Generalized Prefer-Opposite (GPO) algorithm, together with its modifications.
By identifying specific feedback functions and initial states, we demonstrate
that most previously known greedy algorithms that generate binary de Bruijn
sequences are particular cases of our new algorithm.
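As an illustration of what a greedy construction looks like, the sketch below implements the classical prefer-one rule, one of the well-known greedy algorithms for binary de Bruijn sequences. It is not the GPO algorithm itself, whose feedback functions are not given in this abstract; the function name `prefer_one` is chosen here for illustration only.

```python
def prefer_one(n):
    """Greedy prefer-one construction of a binary de Bruijn sequence of order n.

    Start from n zeros. At each step append 1 if the resulting n-bit window
    has not occurred yet, otherwise append 0 if that window is new, otherwise stop.
    """
    seq = [0] * n
    seen = {tuple(seq)}
    while True:
        for bit in (1, 0):  # prefer 1, fall back to 0
            suffix = seq[-(n - 1):] if n > 1 else []
            window = tuple(suffix + [bit])
            if window not in seen:
                seen.add(window)
                seq.append(bit)
                break
        else:
            # Neither extension yields a new window: the construction is done.
            break
    # The first 2**n bits, read cyclically, form the de Bruijn sequence.
    return "".join(map(str, seq[:2 ** n]))


print(prefer_one(3))  # 00011101
```

Read cyclically, the output for n = 3 contains each of the eight 3-bit words exactly once.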
A Study of Syntactic and Semantic Artifacts and its Application to Lambda Definability, Strong Normalization, and Weak Normalization in the Presence of...
Church's lambda calculus underlies the syntax (i.e., the form) and the semantics (i.e., the meaning) of functional programs. This thesis is dedicated to studying man-made constructs (i.e., artifacts) in the lambda calculus. For example, one puts the expressive power of the lambda calculus to the test in the area of lambda definability. In this area, we present a course-of-value representation bridging Church numerals and Scott numerals. We then turn to weak and strong normalization using Danvy et al.'s syntactic and functional correspondences. We give a new account of Felleisen and Hieb's syntactic theory of state, and of abstract machines for strong normalization due to Curien, Crégut, Lescanne, and Kluge.
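For readers unfamiliar with the two numeral encodings mentioned above, here is a minimal sketch using Python lambdas. It shows only the standard Church and Scott encodings; the course-of-value representation developed in the thesis is not reproduced, and all names (`church_succ`, `scott_to_int`, and so on) are illustrative.

```python
# Church numerals: the numeral n means "apply f to x, n times".
church_zero = lambda f: lambda x: x
church_succ = lambda n: lambda f: lambda x: f(n(f)(x))


def church_to_int(n):
    """Decode a Church numeral by iterating +1 starting from 0."""
    return n(lambda k: k + 1)(0)


# Scott numerals: the numeral n is a case analysis (zero branch, successor branch).
scott_zero = lambda z: lambda s: z
scott_succ = lambda n: lambda z: lambda s: s(n)


def scott_to_int(n):
    """Decode a Scott numeral by repeatedly taking its predecessor."""
    done = object()  # sentinel returned by the zero branch
    count = 0
    while True:
        pred = n(done)(lambda p: p)
        if pred is done:
            return count
        n, count = pred, count + 1


church_three = church_succ(church_succ(church_succ(church_zero)))
scott_three = scott_succ(scott_succ(scott_succ(scott_zero)))
print(church_to_int(church_three), scott_to_int(scott_three))  # 3 3
```

A Church numeral gives iteration directly, while a Scott numeral gives its predecessor in one step.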
Large Genomes Assembly Using MAPREDUCE Framework
Knowing the genome sequence of an organism is an essential step toward understanding its genomic and genetic characteristics. Currently, whole genome shotgun (WGS) sequencing is the most widely used technique for determining the entire DNA sequence of an organism. Recent advances in next-generation sequencing (NGS) techniques have enabled biologists to generate large DNA sequences in a high-throughput and low-cost way. However, the assembly of NGS reads faces significant challenges due to short reads and an enormously high volume of data. Despite recent progress in genome assembly, current NGS assemblers cannot generate high-quality results or efficiently handle large genomes with billions of reads. In this research, we propose a new Genome Assembler based on MapReduce (GAMR), which tackles both limitations. GAMR is based on a bi-directed de Bruijn graph and is implemented using the MapReduce framework. We designed a distributed algorithm for each step in GAMR, making it scalable to large-scale genomes. We also propose novel gap-filling algorithms that improve the accuracy and continuity of the assembly results. We evaluated the assembly performance of GAMR using benchmark data and compared it against other NGS assemblers. We also demonstrated the scalability of GAMR by using it to assemble the loblolly pine genome (~22 Gbp). The results showed that GAMR finished the assembly much faster and with much lower computing resource requirements.
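To make the underlying data structure concrete, the sketch below builds a plain (single-strand) de Bruijn graph from a set of reads. It is a simplified illustration only: GAMR uses a bi-directed graph and distributes the work with MapReduce, neither of which is shown here, and the function name `de_bruijn_graph` is hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k over the read; each k-mer contributes one edge.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


print(de_bruijn_graph(["ACGT", "CGTA"], 3))
```

Overlapping reads share k-mers, so their edges merge in the graph; an assembler then looks for paths through it.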
Combinatorics of explicit substitutions
λυ is an extension of the λ-calculus which
internalises the calculus of substitutions. In the current paper, we
investigate the combinatorial properties of λυ, focusing on the
quantitative aspects of substitution resolution. We exhibit an unexpected
correspondence between the counting sequence for λυ-terms and the
famous Catalan numbers. As a by-product, we establish effective sampling
schemes for random λυ-terms. We show that typical
λυ-terms represent, in a strong sense, non-strict computations
in the classic λ-calculus. Moreover, typically almost all substitutions
are in fact suspended, i.e. unevaluated, under closures. Consequently, we argue
that λυ is an intrinsically non-strict calculus of explicit
substitutions. Finally, we investigate the distribution of various redexes
governing the substitution resolution in λυ and investigate the
quantitative contribution of various substitution primitives.
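For reference, the Catalan numbers mentioned in this abstract can be computed from the closed form C_n = binom(2n, n) / (n + 1). The sketch below lists only the first few values; the precise correspondence with the counting sequence of terms is established in the paper and is not reproduced here.

```python
from math import comb


def catalan(n):
    """Return the n-th Catalan number using the closed form binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


print([catalan(n) for n in range(8)])  # [1, 1, 2, 5, 14, 42, 132, 429]
```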