2,549 research outputs found

    A Graph-Theoretical Approach to the Selection of the Minimum Tiling Path from a Physical Map

    The problem of computing the minimum tiling path (MTP) from a set of clones arranged in a physical map is a cornerstone of hierarchical (clone-by-clone) genome sequencing projects. We formulate this problem in a graph-theoretical framework, and then solve it by a combination of minimum hitting set and minimum spanning tree algorithms. The tool implementing this strategy, called FMTP, shows improved performance compared to the widely used software FPC. When we execute FMTP and FPC on the same physical map, the MTP produced by FMTP covers a higher portion of the genome and uses a smaller number of clones. For instance, on the rice genome the MTP produced by our tool would reduce the cost of a clone-by-clone sequencing project by about 11 percent. Source code, benchmark data sets, and documentation of FMTP are freely available at http://code.google.com/p/fingerprint-based-minimal-tiling-path/ under the MIT license.
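The abstract names minimum hitting set as one ingredient of the approach. As a rough illustration only, and not FMTP's actual algorithm, the sketch below shows the standard greedy approximation for hitting set. The function name `greedy_hitting_set` and the toy data (each set lists the hypothetical clones covering one region) are assumptions made for this example.

```python
def greedy_hitting_set(sets):
    """Greedy approximation: choose elements until every set is hit.

    `sets` is an iterable of non-empty collections. Empty sets are
    skipped because no element can hit them.
    """
    remaining = [set(s) for s in sets if s]
    chosen = []
    while remaining:
        # Count how many still-unhit sets each element would hit.
        counts = {}
        for s in remaining:
            for element in s:
                counts[element] = counts.get(element, 0) + 1
        # Take the element hitting the most sets; sorting makes ties deterministic.
        best = max(sorted(counts), key=lambda e: counts[e])
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return chosen


# Hypothetical toy input: each set holds the clones that cover one region.
regions = [{"c1", "c2"}, {"c2", "c3"}, {"c3", "c4"}, {"c4"}]
selected = greedy_hitting_set(regions)
print(selected)
```

The greedy rule gives a logarithmic-factor approximation in general, not an optimal hitting set, so it should be read as a baseline rather than as the method evaluated in the paper.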

    A Survey on Array Storage, Query Languages, and Systems

    Since scientific investigation is one of the most important providers of massive amounts of ordered data, there is a renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to array partitioning into chunks. The identification of a reduced set of array operators to form the foundation for an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete though. We greatly appreciate pointers towards any work we might have forgotten to mention. Comment: 44 pages
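To make the idea of partitioning an array into chunks concrete, here is a minimal sketch of regular (fixed-shape) chunking, one of several possible schemes. The helper names `chunk_of` and `chunks_overlapping` and the 4x4 chunk shape are assumptions for illustration and are not taken from the survey.

```python
from itertools import product


def chunk_of(cell, chunk_shape):
    """Map a cell coordinate to (chunk id, offset inside that chunk)."""
    chunk_id = tuple(c // s for c, s in zip(cell, chunk_shape))
    offset = tuple(c % s for c, s in zip(cell, chunk_shape))
    return chunk_id, offset


def chunks_overlapping(lo, hi, chunk_shape):
    """List the ids of all chunks that intersect the inclusive box [lo, hi]."""
    ranges = [range(l // s, h // s + 1) for l, h, s in zip(lo, hi, chunk_shape)]
    return list(product(*ranges))


# Hypothetical 2-D array stored in 4x4 chunks.
location = chunk_of((5, 12), (4, 4))
touched = chunks_overlapping((2, 2), (5, 3), (4, 4))
print(location, touched)
```

A range query only needs to read the chunks returned by `chunks_overlapping`, which is why chunk shape has such a strong effect on query cost.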

    Sequencing of 15 622 Gene-bearing BACs Clarifies the Gene-dense Regions of the Barley Genome

    Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley–Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low recombination is particularly relevant

    Local symmetry preserving operations on polyhedra


    Self-assembly: modelling, simulation, and planning

    Self-assembly is the process in which a collection of disordered units organise themselves into ordered patterns or functional structures without any external direction, solely using local interactions among the components. This thesis focuses on the theory of tile-based self-assembly systems and their synthesis. First, an introduction to the field of tile-based self-assembly systems is given, followed by a thorough description of the common types of tile assembly systems: the abstract Tile Assembly Model (aTAM), the kinetic Tile Assembly Model (kTAM), and the 2-Handed Assembly Model (2HAM). After that, various recently developed models and models with specific applications are listed. A brief summary of the origins of tile-based self-assembly is also included, together with a short review of recent results. Two general open problems are presented, with the main focus on the Pattern Self-Assembly Tile Set Synthesis (PATS) problem, which is an NP-hard combinatorial optimisation problem. The Partition Search with Heuristics (PS-H) algorithm is presented, as it is used for solving the PATS problem. Next, two applications are introduced that were developed to study the abstract tile assembly models and the synthesis of tile sets for pattern self-assembly. The first application is a simulator capable of simulating aTAM and 2HAM systems in 2D. The second application is a solver of the PATS problem based on the PS-H algorithm. The main features and design decisions are described for both applications. Finally, results from several experiments are presented. One set of experiments focused on verifying the computational complexity of the algorithms developed for the simulator, and the other set studied the influence of the properties of the pattern on the tile assembly system synthesised by our implementation of the PATS problem solver. It was shown that the algorithm for simulating aTAM systems has linear time complexity, whereas the algorithm for simulating 2HAM systems has exponential time complexity, which moreover depends strongly on the simulated system. The synthesiser application is capable of finding a relatively small solution even for quite large input patterns in a reasonable amount of time.
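For readers unfamiliar with the aTAM, the following is a minimal sketch of seeded tile growth, written for illustration and not taken from the thesis's simulator. It assumes a tile is a 4-tuple of glues in the order north, east, south, west, a glue is either `None` or a `(label, strength)` pair, and the first tile type that can bind is always chosen, whereas the aTAM in general allows nondeterministic choices. All names (`simulate`, `binding_strength`, the toy tiles) are hypothetical.

```python
from collections import deque

# For each side index (N, E, S, W): offset to the neighbour and the
# index of the neighbour's side that faces back at us.
DIRS = [(0, 1, 2), (1, 0, 3), (0, -1, 0), (-1, 0, 1)]


def binding_strength(tile, pos, grid):
    """Sum the strengths of glues on `tile` that match adjacent placed tiles."""
    total = 0
    for side, (dx, dy, opposite) in enumerate(DIRS):
        neighbour = grid.get((pos[0] + dx, pos[1] + dy))
        glue = tile[side]
        if neighbour is not None and glue is not None and glue == neighbour[opposite]:
            total += glue[1]
    return total


def simulate(tiles, seed, temperature, max_tiles=1000):
    """Grow an assembly from `seed` at (0, 0); returns {position: tile}."""
    grid = {(0, 0): seed}
    frontier = deque([(0, 0)])
    while frontier and len(grid) < max_tiles:
        x, y = frontier.popleft()
        for dx, dy, _ in DIRS:
            pos = (x + dx, y + dy)
            if pos in grid:
                continue
            # Attach the first tile type whose total binding strength
            # reaches the temperature threshold (a simplifying assumption).
            for tile in tiles:
                if binding_strength(tile, pos, grid) >= temperature:
                    grid[pos] = tile
                    frontier.append(pos)
                    break
    return grid


# Hypothetical toy system: a seed and two tiles that form a row of three.
seed = (None, ("a", 1), None, None)
t1 = (None, ("b", 1), None, ("a", 1))
t2 = (None, None, None, ("b", 1))
assembly = simulate([t1, t2], seed, temperature=1)
print(sorted(assembly))
```

Each placed tile is dequeued once and inspects its four neighbours, so for a fixed tile set the work in this sketch grows linearly with the number of placed tiles. An empty position is re-examined whenever a new neighbour is placed, which is what allows cooperative binding at temperature 2.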