
    On Binary de Bruijn Sequences from LFSRs with Arbitrary Characteristic Polynomials

    We propose a construction of de Bruijn sequences by the cycle joining method from linear feedback shift registers (LFSRs) with an arbitrary characteristic polynomial f(x). We study in detail the cycle structure of the set Ω(f(x)) that contains all sequences produced by a specific LFSR on distinct inputs, and provide a fast way to find a state of each cycle. This leads to an efficient algorithm to find all conjugate pairs between any two cycles, yielding the adjacency graph. The approach is practical for generating a large class of de Bruijn sequences up to order n ≈ 20. Many previously proposed constructions of de Bruijn sequences are shown to be special cases of our construction.
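
As a rough illustration of the objects involved (a sketch, not the paper's cycle-joining algorithm), the code below enumerates the cycle structure underlying Ω(f(x)) by brute force over all 2^n LFSR states. The example polynomial f(x) = x^4 + x + 1 and its tap positions are assumptions chosen for illustration; the brute-force approach is feasible only for small n, whereas the paper derives the cycle structure analytically.

```python
# Minimal sketch (not the paper's method): partition all 2^n states of
# an LFSR into cycles by brute force. Feasible only for small n.
from itertools import product

def lfsr_step(state, taps):
    # state = (s_t, ..., s_{t+n-1}); the new bit is the XOR of the
    # tapped positions, per the linear recurrence defined by f(x).
    fb = 0
    for t in taps:
        fb ^= state[t]
    return state[1:] + (fb,)

def cycle_structure(taps, n):
    """Return the cycles of the LFSR state graph (a permutation of
    the state space whenever position 0 is tapped, i.e. f(0) != 0)."""
    seen, cycles = set(), []
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        cycle, s = [], start
        while s not in seen:
            seen.add(s)
            cycle.append(s)
            s = lfsr_step(s, taps)
        cycles.append(cycle)
    return cycles

# f(x) = x^4 + x + 1 (primitive): expect the all-zero fixed point plus
# one cycle through the remaining 15 states.
for c in cycle_structure(taps=(0, 1), n=4):
    print(len(c), "starting at", c[0])
```

Joining such cycles at conjugate pairs, organised by the adjacency graph, is what ultimately yields a de Bruijn sequence.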

    Large Genomes Assembly Using MapReduce Framework

    Knowing the genome sequence of an organism is the essential step toward understanding its genomic and genetic characteristics. Currently, whole-genome shotgun (WGS) sequencing is the most widely used technique to determine the entire DNA sequence of an organism. Recent advances in next-generation sequencing (NGS) techniques have enabled biologists to generate large volumes of DNA sequence in a high-throughput, low-cost way. However, the assembly of NGS reads faces significant challenges due to short read lengths and an enormously high volume of data. Despite recent progress in genome assembly, current NGS assemblers cannot generate high-quality results or efficiently handle large genomes with billions of reads. In this research, we proposed a new Genome Assembler based on MapReduce (GAMR), which tackles both limitations. GAMR is based on a bi-directed de Bruijn graph and is implemented using the MapReduce framework. We designed a distributed algorithm for each step in GAMR, making it scalable for assembling large-scale genomes. We also proposed novel gap-filling algorithms that improve assembly accuracy and contiguity. We evaluated the assembly performance of GAMR using benchmark data and compared it against other NGS assemblers. We also demonstrated the scalability of GAMR by using it to assemble the loblolly pine genome (~22 Gbp). The results showed that GAMR finished the assembly much faster and with a much lower requirement of computing resources.
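
As a loose sketch of the central data structure (not GAMR's actual implementation), the snippet below builds the edge list of a de Bruijn graph from reads in a map/shuffle/reduce pattern. The value k = 4 and the toy reads are invented for illustration; a real assembler would distribute both phases across a cluster and use a bi-directed graph to handle both DNA strands.

```python
# Minimal sketch (not GAMR): de Bruijn graph construction from reads
# in a map/shuffle/reduce pattern, run locally for illustration.
from collections import defaultdict

def map_phase(read, k):
    # Emit one edge (prefix -> suffix) per k-mer in the read.
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield kmer[:-1], kmer[1:]

def reduce_phase(pairs):
    # Group edges by source node, counting multiplicities (coverage).
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in pairs:
        graph[src][dst] += 1
    return graph

reads = ["ACGTAC", "CGTACG", "GTACGT"]  # toy reads, not real data
edges = (e for r in reads for e in map_phase(r, k=4))
for src, dsts in reduce_phase(edges).items():
    print(src, "->", dict(dsts))
```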

    Searching for patterns in Conway's Game of Life

    Conway’s Game of Life (Life) is a simple cellular automaton, discovered by John Conway in 1970, that exhibits complex emergent behavior. Life enthusiasts have been searching for building blocks with specific properties (patterns) to answer unsolved problems in Life for the past five decades. Finding patterns in Life is difficult due to the large search space. Current search algorithms use an explorative approach based on the rules of the game, but this can only sample a small fraction of the search space. More recently, people have used SAT solvers to search for patterns. These solvers are not specifically tuned to this problem and thus waste a lot of time processing Life’s rules in an engine that does not understand them. We propose a novel SAT-based approach that replaces the binary tree used by traditional SAT solvers with a grid-based approach, complemented by an injection of Game-of-Life-specific knowledge. This leads to a significant speedup in searching. As a fortunate side effect, our solver can be generalized to solve general SAT problems. Because it is grid-based, all manipulations are embarrassingly parallel, allowing implementation on massively parallel hardware.
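
For a sense of the search problem (this is the naive baseline, not the grid-based solver the abstract proposes), the sketch below brute-forces all 2^9 candidate patterns on a 3×3 grid and keeps the still lifes, i.e. patterns fixed by Life's rule. Real pattern searches involve vastly larger grids and multiple generations, which is why SAT-style methods are attractive.

```python
# Minimal sketch (not the paper's solver): brute-force search for
# 3x3 still lifes, checked on a padded 5x5 grid with dead boundary.
from itertools import product

def step(cells, w, h):
    # One Life generation; cells is the set of live (x, y) coordinates.
    nxt = set()
    for x in range(w):
        for y in range(h):
            n = sum((nx, ny) in cells
                    for nx in (x - 1, x, x + 1)
                    for ny in (y - 1, y, y + 1)
                    if (nx, ny) != (x, y))
            # Birth on 3 neighbours; survival on 2 or 3.
            if n == 3 or (n == 2 and (x, y) in cells):
                nxt.add((x, y))
    return nxt

for bits in product((0, 1), repeat=9):
    cells = {(i % 3, i // 3) for i, b in enumerate(bits) if b}
    # Shift into the centre of a 5x5 grid so births outside the 3x3
    # box are detected and such patterns are rejected.
    shifted = {(x + 1, y + 1) for (x, y) in cells}
    if cells and step(shifted, 5, 5) == shifted:
        print(sorted(cells))
```

Even at this toy size the search visits every candidate; the speedups claimed above come from pruning that space with Life-specific knowledge instead.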

    Practical implementation of a dependently typed functional programming language

    Types express a program's meaning, and checking types ensures that a program has the intended meaning. In a dependently typed programming language, types are predicated on values, leading to the possibility of expressing invariants of a program's behaviour in its type. Dependent types allow us to give more detailed meanings to programs, and hence to be more confident of their correctness. This thesis considers the practical implementation of a dependently typed programming language, using the Epigram notation defined by McBride and McKinna. Epigram is a high-level notation for dependently typed functional programming, elaborating to a core type theory based on Luo's UTT and using Dybjer's inductive families and elimination rules to implement pattern matching. This gives us a rich framework for reasoning about programs. However, a naive implementation introduces several run-time overheads, since the type system blurs the distinction between types and values; these overheads include the duplication of values and the storage of redundant information and explicit proofs. A practical implementation of any programming language should be as efficient as possible; in this thesis we see how the apparent efficiency problems of dependently typed programming can be overcome, and that in many cases the richer type information allows us to apply optimisations which are not directly available in traditional languages. I introduce three storage optimisations on inductive families: forcing, detagging and collapsing. I further introduce a compilation scheme from the core type theory to G-machine code, including a pattern matching compiler for elimination rules and a compilation scheme for efficient run-time implementation of Peano's natural numbers. We also see some low-level optimisations for the removal of identity functions, unused arguments and impossible case branches. As a result, we see that a dependent type theory is an effective base on which to build a feasible programming language.
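
To make the Peano-number overhead concrete (a toy sketch in Python, standing in for the compiled G-machine representation the thesis actually targets), compare a naive constructor-chain encoding of the naturals with the machine-integer representation that such a compilation scheme produces.

```python
# Minimal sketch of the overhead a Peano-number optimisation removes.
# A naive compiled representation stores n as a chain of n "S" nodes;
# an optimised scheme (only mimicked here in Python) represents the
# same inductive type by a machine integer.

class Z:                      # zero
    pass

class S:                      # successor
    def __init__(self, pred):
        self.pred = pred

def plus(m, n):
    # Structural recursion: one heap allocation per unit of m.
    return n if isinstance(m, Z) else S(plus(m.pred, n))

def to_int(m):
    c = 0
    while isinstance(m, S):
        m, c = m.pred, c + 1
    return c

three = S(S(S(Z())))
print(to_int(plus(three, three)))   # 6, via six heap nodes
print(3 + 3)                        # 6, via one machine addition
```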

    Ontology based model framework for conceptual design of treatment flow sheets

    The primary objective of wastewater treatment is the removal of pollutants to meet given legal effluent standards. To further reduce operating costs, additional recovery of resources and energy is desired in industrial and municipal wastewater treatment. Hence, the objective in the early stage of planning treatment facilities lies in the identification and evaluation of promising configurations of treatment units. This early stage of planning may best be supported by software tools able to deal with a variety of different treatment configurations. In chemical process engineering, various design tools are available that automatically identify feasible process configurations for obtaining desired products from given educts. In contrast, the adaptation of these design tools for the automatic generation of treatment unit configurations (process chains) that achieve preset effluent standards is hampered for three reasons. First, pollutants in wastewater are usually defined not as chemical substances but by compound parameters grouping constituents with equal properties (e.g. all particulate matter). Consequently, the variation of a single compound parameter leads to a change in related parameters (e.g. the relation between Chemical Oxygen Demand and Total Suspended Solids). Furthermore, mathematical models of treatment processes are tailored towards fractions of compound parameters. This hampers the generic representation of these process models, which in turn is essential for the automatic identification of treatment configurations. Second, treatment technologies for wastewater treatment rely on a variety of chemical, biological, and physical phenomena. Approaches to mathematically describe these phenomena cover a wide range of modeling techniques, including stochastic, conceptual, and deterministic approaches, and differ in the temporal and spatial resolutions considered. This again hampers a generic representation of process models. Third, the automatic identification of treatment configurations may be achieved either by the use of design rules or by permutation of all possible combinations of units stored within a database of treatment units. The first approach depends on past experience translated into design rules, so no innovative new treatment configurations can be identified. The second approach, identifying all possible configurations, collapses under the extremely high number of treatment configurations that cannot be mastered, a consequence of combinatorial explosion. It follows that an appropriate planning algorithm should function without the need for additional design rules and should identify feasible configurations directly while discarding impractical ones. This work presents a planning tool for the identification and evaluation of treatment configurations that tackles the problems addressed above. The planning tool comprises two major parts: an external declarative knowledge base and the actual planning tool, which includes a goal-oriented planning algorithm. The knowledge base describes parameters for wastewater characterization (the material model) and a set of treatment units represented by process models (the process model). The knowledge base is formalized in the Web Ontology Language (OWL).
The developed data model, the organizational structure of the knowledge base, describes relations between wastewater parameters and process models to enable a generic representation of process models. Through this, parameters for wastewater characterization as well as treatment units can be altered or added to the knowledge base without the need to synchronize already included parameter representations or process models. Furthermore, the knowledge base describes relations between parameters and properties of water constituents. This allows tracking the changes of all wastewater parameters that result from modeling the removal efficiency of applied treatment units. So far, two generic treatment units have been represented within the knowledge base: separation and conversion units. These two raw types have been applied to represent different types of clarifiers and biological treatment units. The developed planning algorithm is based on Means-Ends Analysis (MEA), a goal-oriented search algorithm that posts goals derived from the wastewater state and limit-value restrictions in order to select only those treatment units that are likely to solve the treatment problem. To this end, all treatment units are qualified by postconditions that describe the effect of each unit. In addition, units are characterized by preconditions that state the application range of each unit. The planning algorithm furthermore allows the identification of simple cycles to account for moving-bed reactor systems (e.g. the functional unit of aeration tank and clarifier). Identified treatment configurations are evaluated by the total estimated cost of each configuration. The planning tool has been tested on five use cases, some of which contained multiple sources and sinks. This demonstrated the ability to identify water reuse capabilities as well as solutions that go beyond end-of-pipe treatment. Beyond its original area of application, the planning tool may be used for advanced investigations: the knowledge base and planning algorithm may be further developed to identify configurations for any type of material and energy recovery.
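
As a hedged toy (not the thesis's planner), the sketch below shows the flavour of goal-directed unit selection with pre- and postconditions. The unit names, removal fractions, application ranges, and effluent limits are all hypothetical values invented for illustration.

```python
# Toy sketch of goal-directed unit selection in the spirit of
# Means-Ends Analysis (not the thesis's planner). All unit names,
# removal fractions, and limits are hypothetical illustration values.

UNITS = {
    # name: (precondition on influent, fractional removal per parameter)
    "primary_clarifier": (lambda w: w["TSS"] > 50,
                          {"TSS": 0.6, "COD": 0.3}),
    "activated_sludge":  (lambda w: w["COD"] > 30,
                          {"COD": 0.85, "TSS": 0.5}),
}

def plan(water, limits, chain=()):
    # Goal reached: every parameter at or below its limit.
    if all(water[p] <= lim for p, lim in limits.items()):
        return list(chain)
    for name, (precond, removal) in UNITS.items():
        if name in chain or not precond(water):
            continue  # already used, or outside the application range
        # MEA flavour: only consider units whose postcondition reduces
        # a currently violated parameter (a "difference" to remove).
        if not any(water[p] > lim and removal.get(p, 0) > 0
                   for p, lim in limits.items()):
            continue
        effluent = {p: v * (1 - removal.get(p, 0.0))
                    for p, v in water.items()}
        result = plan(effluent, limits, chain + (name,))
        if result is not None:
            return result
    return None  # no feasible configuration found

print(plan({"COD": 400.0, "TSS": 200.0}, {"COD": 60.0, "TSS": 50.0}))
# -> ['primary_clarifier', 'activated_sludge']
```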

    Foundations of Software Science and Computation Structures

    This open access book constitutes the proceedings of the 25th International Conference on Foundations of Software Science and Computation Structures, FOSSACS 2022, which was held during April 4-6, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 23 regular papers presented in this volume were carefully reviewed and selected from 77 submissions. They deal with research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.