Search CORE

1,375 research outputs found

A Graph-Theoretical Approach to the Selection of the Minimum Tiling Path from a Physical Map

Author: Bozdag Serdar
Publication venue: e-Publications@Marquette
Publication date: 01/03/2013

The problem of computing the minimum tiling path (MTP) from a set of clones arranged in a physical map is a cornerstone of hierarchical (clone-by-clone) genome sequencing projects. We formulate this problem in a graph-theoretical framework and then solve it with a combination of minimum hitting set and minimum spanning tree algorithms. The tool implementing this strategy, called FMTP, shows improved performance compared to the widely used software FPC. When we execute FMTP and FPC on the same physical map, the MTP produced by FMTP covers a higher portion of the genome and uses a smaller number of clones. For instance, on the rice genome the MTP produced by our tool would reduce the cost of a clone-by-clone sequencing project by about 11 percent. Source code, benchmark data sets, and documentation of FMTP are freely available at http://code.google.com/p/fingerprint-based-minimal-tiling-path/ under the MIT license.
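As a rough illustration of the minimum hitting set building block named in the abstract above, the following sketch uses the standard greedy approximation. It is not FMTP's algorithm. The function name `greedy_hitting_set` and the clone labels in the usage note are hypothetical, and the greedy rule gives an approximate answer rather than a guaranteed minimum.

```python
def greedy_hitting_set(sets):
    """Greedy approximation of a minimum hitting set.

    Repeatedly picks the element that occurs in the most not-yet-hit sets.
    Empty input sets cannot be hit by any element and are skipped here
    (an assumption made for this sketch).
    """
    remaining = [set(s) for s in sets if s]
    chosen = []
    while remaining:
        # Count how many remaining sets each element would hit.
        counts = {}
        for s in remaining:
            for element in s:
                counts[element] = counts.get(element, 0) + 1
        # Sort first so that ties are broken deterministically.
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return chosen
```

For example, if each input set lists the hypothetical clones that could cover one region, `greedy_hitting_set([{"c1", "c2"}, {"c2", "c3"}, {"c3", "c4"}])` selects two clones that together touch every region.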

epublications@Marquette

Self-assembly: modelling, simulation, and planning

Author: Lukáš Bertl
Publication venue: Czech Technical University in Prague. Computing and Information Centre.
Publication date: 04/06/2019

Self-assembly is the process in which a collection of disordered units organise themselves into ordered patterns or functional structures without any external direction, solely through local interactions among the components. This thesis focuses on the theory of tile-based self-assembly systems and their synthesis. First, an introduction to the field of tile-based self-assembly systems is given, followed by a thorough description of common types of tile assembly systems such as the abstract Tile Assembly Model (aTAM), the kinetic Tile Assembly Model (kTAM), and the 2-Handed Assembly Model (2HAM). After that, various recently developed models and models with specific applications are listed. A brief summary of the origins of tile-based self-assembly is also included, together with a short review of recent results. Two general open problems are presented, with the main focus on the Pattern Self-Assembly Tile Set Synthesis (PATS) problem, which is an NP-hard combinatorial optimisation problem. The Partition Search with Heuristics (PS-H) algorithm is presented, as it is used for solving the PATS problem. Next, two applications are introduced that were developed to study the abstract tile assembly models and the synthesis of tile sets for pattern self-assembly. The first application is a simulator capable of simulating aTAM and 2HAM systems in 2D. The second application is a solver of the PATS problem built around the PS-H algorithm. The main features and design decisions are described for both applications. Finally, results from several experiments are presented. One set of experiments focused on verifying the computational complexity of the algorithms developed for the simulator, and the other set studied the influence of the properties of the pattern on the tile assembly system synthesised by our implementation of the PATS problem solver. It was shown that the algorithm for simulating aTAM systems has linear time complexity, whereas the algorithm for simulating 2HAM systems has exponential time complexity, which varies strongly with the simulated system. The synthesiser application is capable of finding a relatively small solution even for quite large input patterns in a reasonable amount of time.
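To make the aTAM attachment rule mentioned in the abstract above concrete, here is a deliberately naive sketch. It is not the thesis's simulator and makes no attempt at the linear running time reported there. It assumes the usual textbook rule that a tile may attach at an empty position when the strengths of its glues matching the adjacent tiles sum to at least the temperature. All names (`Tile`, `binding_strength`, `simulate_atam`) are hypothetical.

```python
from collections import namedtuple

# Each side holds a glue as a (label, strength) pair, or None for no glue.
Tile = namedtuple("Tile", "name north east south west")

OFFSETS = {"north": (0, 1), "east": (1, 0), "south": (0, -1), "west": (-1, 0)}
OPPOSITE = {"north": "south", "east": "west", "south": "north", "west": "east"}


def binding_strength(assembly, pos, tile):
    """Sum the strengths of glues on `tile` that match its neighbours at `pos`."""
    total = 0
    for side, (dx, dy) in OFFSETS.items():
        neighbour = assembly.get((pos[0] + dx, pos[1] + dy))
        if neighbour is None:
            continue
        glue = getattr(tile, side)
        if glue is not None and glue == getattr(neighbour, OPPOSITE[side]):
            total += glue[1]
    return total


def simulate_atam(tiles, seed, temperature, max_steps=1000):
    """Grow an assembly from `seed` at (0, 0), one tile per step.

    Stops when no tile can attach anywhere (a terminal assembly)
    or after `max_steps` attachments.
    """
    assembly = {(0, 0): seed}
    for _ in range(max_steps):
        # Empty positions adjacent to the current assembly, in a fixed order.
        frontier = sorted(
            {(x + dx, y + dy) for (x, y) in assembly for dx, dy in OFFSETS.values()}
            - set(assembly)
        )
        attachment = next(
            ((pos, tile) for pos in frontier for tile in tiles
             if binding_strength(assembly, pos, tile) >= temperature),
            None,
        )
        if attachment is None:
            break
        assembly[attachment[0]] = attachment[1]
    return assembly
```

This sketch rescans the whole frontier after every attachment, which is simple to read but quadratic or worse; an efficient simulator would update only the positions next to the newly placed tile.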

Digital Library of the Czech Technical University in Prague

A Survey on Array Storage, Query Languages, and Systems

Author: Cheng Yu
Rusu Florin
Publication date: 19/02/2013

Since scientific investigation is one of the most important providers of massive amounts of ordered data, there is a renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to array partitioning into chunks. The identification of a reduced set of array operators to form the foundation for an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete though. We greatly appreciate pointers towards any work we might have forgotten to mention. Comment: 44 pages
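The idea of partitioning an array into chunks, mentioned in the abstract above, can be sketched with the simplest scheme, regular (fixed-size) chunking. This is only an illustration of the general concept, not a scheme taken from the survey, and the function name `locate_cell` is hypothetical.

```python
def locate_cell(index, chunk_shape):
    """Map a cell index to (chunk id, offset inside that chunk).

    Assumes regular chunking: every chunk has the same shape and
    chunks are aligned to the array origin.
    """
    chunk_id = tuple(i // c for i, c in zip(index, chunk_shape))
    offset = tuple(i % c for i, c in zip(index, chunk_shape))
    return chunk_id, offset
```

With 4x4 chunks, `locate_cell((5, 7), (4, 4))` places cell (5, 7) in chunk (1, 1) at offset (1, 3), so a range query only needs to read the chunks whose ids overlap the requested range.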

arXiv.org e-Print Archive

CiteSeerX

Optimizing the Performance of Streaming Numerical Kernels on the IBM Blue Gene/P PowerPC 450 Processor

Author: Bailey D
Ganapathi A
IBM Blue Gene Team
Kamil S
Nguyen A
Peng L
Sosa C and International Business Machines Corporation
Williams S
Publication venue: SAGE Publications
Publication date: 17/01/2012

Several emerging petascale architectures use energy-efficient processors with vectorized computational units and in-order thread processing. On these architectures the sustained performance of streaming numerical kernels, ubiquitous in the solution of partial differential equations, represents a challenge despite the regularity of memory access. Sophisticated optimization techniques are required to fully utilize the Central Processing Unit (CPU). We propose a new method for constructing streaming numerical kernels using a high-level assembly synthesis and optimization framework. We describe an implementation of this method in Python targeting the IBM Blue Gene/P supercomputer's PowerPC 450 core. This paper details the high-level design, construction, simulation, verification, and analysis of these kernels utilizing a subset of the CPU's instruction set. We demonstrate the effectiveness of our approach by implementing several three-dimensional stencil kernels over a variety of cached memory scenarios and analyzing the mechanically scheduled variants, including a 27-point stencil achieving a 1.7x speedup over the best previously published results
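For readers unfamiliar with the term, the following plain reference sketch shows what a 27-point stencil computes: each interior output cell is a weighted sum over the 3x3x3 neighbourhood of the corresponding input cell. It is unrelated to the paper's assembly synthesis framework and is not optimised in any way. The function name `stencil27` and the choice to leave boundary cells at zero are assumptions made for this illustration.

```python
def stencil27(grid, weights):
    """Apply a 27-point stencil to the interior of a 3D grid.

    `grid` is a nested list indexed [i][j][k]; `weights` is a 3x3x3
    nested list. Boundary cells of the output are left at 0.0.
    """
    ni, nj, nk = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * nk for _ in range(nj)] for _ in range(ni)]
    for i in range(1, ni - 1):
        for j in range(1, nj - 1):
            for k in range(1, nk - 1):
                acc = 0.0
                # Visit the cell itself and its 26 neighbours.
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        for dk in (-1, 0, 1):
                            acc += (weights[di + 1][dj + 1][dk + 1]
                                    * grid[i + di][j + dj][k + dk])
                out[i][j][k] = acc
    return out
```

Every input value is read 27 times while only one value is written per cell, which is why data reuse in registers and cache dominates the performance of such kernels.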

arXiv.org e-Print Archive

Crossref


A SIMD architecture for hard real-time systems

Author: Spliet Roy
Publication venue: University of Cambridge
Publication date: 31/03/2020

Emerging safety-critical systems require high-performance data-parallel architectures and, problematically, ones that can guarantee tight and safe worst-case execution times. Given the complexity of existing architectures like GPUs, it is unlikely that sufficiently accurate models and algorithms for timing analysis will emerge in the foreseeable future. This motivates a clean-slate approach to designing a real-time data-parallel architecture. In this work I present Sim-D: a wide-SIMD architecture for hard real-time systems. Similar to GPUs, Sim-D performs hardware strip-mining to schedule the work for a compute kernel in entities called work-groups. Sim-D schedules the work for each work-group as a sequence of uninterruptible access and execute program phases, interleaving the phases of two work-groups. By providing performance isolation between the memory and compute resources, the execution time of each phase can be tightly bounded through static analysis. I present a predictable closed-page DRAM controller that processes requests for large 1D and 2D blocks of data, as well as indirect indexed transfers. These large transfers coalesce the data requests of a whole work-group. For a linear 4KiB transfer over a 64-bit data bus, the utilisation provably exceeds 78% for DDR4-3200AA DRAM. For 2D blocks, a well-chosen tiling configuration can achieve similar efficiency. I show that bounds on the execution time of indexed transfers are pessimistic by nature, but propose a novel snoopy indexed transfer mechanism that permits more reasonable bounds when the buffer size is limited. Finally, I present a worst-case execution time calculation algorithm for Sim-D. This algorithm is paired with two hardware work-group scheduling policies that deterministically reduce run-time variance. The worst-case execution time analysis algorithm combines static control flow analysis with a simulation-based cost model for execution and DRAM transfers. Its key novelty is the addition of a stage that considers work-group scheduling effects. I show that the work-group scheduling policies degrade performance on average by 8.9%, but permit the calculation of worst-case execution time bounds that are tight within 14.3% on average for benchmarks that avoid inefficient indexed transfers.
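The utilisation figure quoted in the abstract above can be unpacked with a little arithmetic. The sketch below assumes that utilisation means the number of clock cycles in which the data bus carries data divided by the total cycles of the transfer; this definition is an assumption, not taken from the thesis. It relies only on the fact that DDR memory moves two data words per clock cycle. The function name `data_bus_cycles` is hypothetical.

```python
def data_bus_cycles(num_bytes, bus_bits, transfers_per_clock=2):
    """Clock cycles during which the bus carries data for one transfer."""
    bytes_per_clock = (bus_bits // 8) * transfers_per_clock
    return num_bytes / bytes_per_clock


# A linear 4 KiB transfer over a 64-bit DDR data bus.
data_cycles = data_bus_cycles(4 * 1024, 64)

# If utilisation = data_cycles / total_cycles exceeds 0.78 (assumed
# definition), the whole transfer must finish within this many cycles.
max_total_cycles = data_cycles / 0.78
max_overhead_cycles = max_total_cycles - data_cycles
```

Under these assumptions the data itself occupies 256 clock cycles, so a utilisation above 78% leaves a budget of roughly 72 cycles for activation, precharge, and other command overhead across the whole transfer.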

Apollo (Cambridge)

A High-Throughput Solver for Marginalized Graph Kernels on GPU

Author: Buluc A
Popovici DT
Selvitopi O
Tang YH
Publication venue: eScholarship, University of California
Publication date: 25/02/2020

We present the design and optimization of a linear solver on General Purpose GPUs for the efficient and high-throughput evaluation of the marginalized graph kernel between pairs of labeled graphs. The solver implements a preconditioned conjugate gradient (PCG) method to compute the solution to a generalized Laplacian equation associated with the tensor product of two graphs. To cope with the gap between the instruction throughput and the memory bandwidth of current generation GPUs, our solver forms the tensor product linear system on-the-fly without storing it in memory when performing matrix-vector dot product operations in PCG. Such on-the-fly computation is accomplished by using threads in a warp to cooperatively stream the adjacency and edge label matrices of individual graphs by small square matrix blocks called tiles, which are then staged in registers and the shared memory for later reuse. Warps across a thread block can further share tiles via the shared memory to increase data reuse. We exploit the sparsity of the graphs hierarchically by storing only non-empty tiles using a coordinate format and nonzero elements within each tile using bitmaps. In addition, we propose a new partition-based reordering algorithm for aggregating nonzero elements of the graphs into fewer but denser tiles to improve the efficiency of the sparse format. We carry out extensive theoretical analyses on the graph tensor product primitives for tiles of various density and evaluate their performance on synthetic and real-world datasets. Our solver delivers three to four orders of magnitude speedup over existing CPU-based solvers such as GraKeL and GraphKernels. The capability of the solver enables kernel-based learning tasks at unprecedented scales.
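The idea of applying a tensor product matrix without ever storing it, described in the abstract above, can be shown on the CPU with a few lines of plain Python. This is a sketch of the general Kronecker product identity only, not the paper's GPU solver, and it ignores edge labels, tiling, and preconditioning. The function names `kron_matvec` and `kron_explicit` are hypothetical, and both matrices are assumed to be square.

```python
def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product.

    Uses (A kron B)[i*m + k][j*m + l] = A[i][j] * B[k][l], where A is
    n x n and B is m x m. Zero entries of A are skipped, a very small
    nod to sparsity.
    """
    n, m = len(A), len(B)
    y = [0.0] * (n * m)
    for i in range(n):
        for j in range(n):
            a = A[i][j]
            if a == 0:
                continue
            for k in range(m):
                for l in range(m):
                    y[i * m + k] += a * B[k][l] * x[j * m + l]
    return y


def kron_explicit(A, B):
    """Form the full Kronecker product; only used to check the result."""
    n, m = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]
```

The explicit product needs storage proportional to (nm)^2, while the on-the-fly version only ever holds the two small matrices and the vectors, which is the memory saving the abstract refers to.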

arXiv.org e-Print Archive

eScholarship - University of California

Proceedings of JAC 2010. Journées Automates Cellulaires

Author: Kari Jarkko (ed.)
Publication venue: TUCS Turku Centre for Computer Science
Publication date: 09/12/2010

The second Symposium on Cellular Automata “Journées Automates Cellulaires” (JAC 2010) took place in Turku, Finland, on December 15-17, 2010. The first two conference days were held in the Educarium building of the University of Turku, while the talks of the third day were given onboard passenger ferry boats in the beautiful Turku archipelago, along the route Turku–Mariehamn–Turku. The conference was organized by FUNDIM, the Fundamentals of Computing and Discrete Mathematics research center at the mathematics department of the University of Turku. The program of the conference included 17 submitted papers that were selected by the international program committee, based on three peer reviews of each paper. These papers form the core of these proceedings. I want to thank the members of the program committee and the external referees for the excellent work they have done in choosing the papers to be presented at the conference. In addition to the submitted papers, the program of JAC 2010 included four distinguished invited speakers: Michel Coornaert (Université de Strasbourg, France), Bruno Durand (Université de Provence, Marseille, France), Dora Giammarresi (Università di Roma Tor Vergata, Italy) and Martin Kutrib (Universität Gießen, Germany). I sincerely thank the invited speakers for accepting our invitation to come and give a plenary talk at the conference. The invited talk by Bruno Durand was eventually given by his co-author Alexander Shen, and I thank him for agreeing to give the presentation at short notice. Abstracts or extended abstracts of the invited presentations appear in the first part of this volume. The program also included several informal presentations describing very recent developments and ongoing research projects. I wish to thank all the speakers for their contribution to the success of the symposium. I would also like to thank the sponsors and our collaborators: the Finnish Academy of Science and Letters, the French National Research Agency project EMC (ANR-09-BLAN-0164), Turku Centre for Computer Science, the University of Turku, and Centro Hotel. Finally, I sincerely thank the members of the local organizing committee for making the conference possible. These proceedings are published both in an electronic format and in print. The electronic proceedings are available on the electronic repository HAL, managed by several French research agencies. The printed version is published in the general publications series of TUCS, Turku Centre for Computer Science. We thank both HAL and TUCS for agreeing to publish the proceedings. Transferred from Doria.

UTUPub