
3 research outputs found

Efficient tilings of de Bruijn and Kautz graphs

Authors: Leonard, Jud; Stewart, Lawrence C.; Taylor, Washington
Publication date: 01/01/2011

Kautz and de Bruijn graphs have a high degree of connectivity which makes them ideal candidates for massively parallel computer network topologies. In order to realize a practical computer architecture based on these graphs, it is useful to have a means of constructing a large-scale system from smaller, simpler modules. In this paper we consider the mathematical problem of uniformly tiling a de Bruijn or Kautz graph. This can be viewed as a generalization of the graph bisection problem. We focus on the problem of graph tilings by a set of identical subgraphs. Tiles should contain a maximal number of internal edges so as to minimize the number of edges connecting distinct tiles. We find necessary and sufficient conditions for the construction of tilings. We derive a simple lower bound on the number of edges which must leave each tile, and construct a class of tilings whose number of edges leaving each tile agrees asymptotically in form with the lower bound to within a constant factor. These tilings make possible the construction of large-scale computing systems based on de Bruijn and Kautz graph topologies. Comment: 29 pages, 11 figures
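
The quantity the abstract minimizes, the number of edges leaving each tile, can be illustrated with a minimal sketch. This is not the paper's construction: the function names `de_bruijn_edges` and `edges_leaving_tiles` and the choice of tiling by first symbol are assumptions made here for illustration only, and the partition shown is not claimed to be optimal.

```python
from itertools import product


def de_bruijn_edges(d, n):
    """Directed edges of the de Bruijn graph B(d, n).

    Nodes are length-n words over {0, ..., d-1}; each word w has an edge
    to the word obtained by dropping its first symbol and appending s,
    for every symbol s.
    """
    return [
        (w, w[1:] + (s,))
        for w in product(range(d), repeat=n)
        for s in range(d)
    ]


def edges_leaving_tiles(edges, tile_of):
    """Count, for each tile, the edges whose head lies in another tile.

    tile_of is a hypothetical labelling function mapping a node to the
    identifier of the tile that contains it.
    """
    leaving = {}
    for tail, head in edges:
        tile = tile_of(tail)
        leaving.setdefault(tile, 0)
        if tile_of(head) != tile:
            leaving[tile] += 1
    return leaving


# Illustrative partition (an assumption, not taken from the paper):
# split the binary graph B(2, 3) into two tiles by the first symbol.
edges = de_bruijn_edges(2, 3)
leaving = edges_leaving_tiles(edges, lambda w: w[0])
print(len(edges), leaving)
```

In this small case each tile holds four nodes with eight outgoing edges in total, so comparing the per-tile count against that total shows how much of a tile's wiring stays internal under a given partition.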

arXiv.org e-Print Archive

CiteSeerX

Assemblage adaptatif de génomes et de méta-génomes par passage de messages (Adaptive assembly of genomes and metagenomes by message passing)

Author: Boisvert, Sébastien
Publication venue: Bibliothèque de l'Université Laval
Publication date: 01/01/2014

In general, methods and processes now produce more data than a human can assimilate. Big data, when analyzed well, increases the understanding of the processes operating inside systems and therefore encourages their improvement. Analyzing deoxyribonucleic acid (DNA) sequences gives a better understanding of living beings, for example through systems biology. High-throughput DNA sequencers are massively parallel instruments and produce a great deal of data. Computing infrastructures, such as supercomputers or cloud computing, are also massively parallel because of their distributed nature. However, computers understand neither French nor English; they must be programmed. Software systems for analyzing genomic data with supercomputers must also be massively parallel. The message passing interface makes it possible to create such software, and a granular design makes it possible to interleave communication and computation inside the processes of a computing system. Such systems produce results quickly from data. Here, the software RayPlatform, Ray (including the workflows called Ray Meta and Ray Communities) and Ray Cloud Browser are presented. The main application of this family of products is the adaptive assembly and profiling of genomes by message passing.

Generally speaking, current processes (industrial, direct-to-consumer, or research-related) yield far more data than humans can manage. Big Data is a trend of its own and concerns itself with the betterment of humankind through better understanding of processes and systems. To achieve that end, the means is to leverage massive amounts of big data in order to better comprehend what they contain, mean, and imply. DNA sequencing is such a process and contributes to the discovery of knowledge in genetics and other fields. DNA sequencing instruments are parallel objects and output unprecedented volumes of data. Computer infrastructures, cloud and other means of computation open the door to the analysis of the data described above. However, they need to be programmed, for they are not acquainted with natural languages. Massively parallel software must match the parallelism of supercomputers and other distributed computing systems before attempting to decipher big data. Message passing, and the message passing interface, allows one to create such tools, and a granular design of blueprints consolidates the production of results. Herein, a line of products that includes RayPlatform, Ray (which includes workflows called Ray Meta and Ray Communities for metagenomics) and Ray Cloud Browser is presented. Its main application is scalable (adaptive) assembly and profiling of genomes using message passing.

CorpusUL

Subject Index Volumes 1–200

Publication venue: Published by Elsevier B.V.

Elsevier - Publisher Connector