299 research outputs found

    Highly Scalable Algorithms for Robust String Barcoding

    String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial-size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome based selection results in a number of distinguishers nearly matching the information-theoretic lower bounds for the problem.

    The approximability of the String Barcoding problem

    The String Barcoding (SBC) problem, introduced by Rash and Gusfield (RECOMB, 2002), consists in finding a minimum set of substrings that can be used to distinguish between all members of a set of given strings. In a computational biology context, the given strings represent a set of known viruses, while the substrings can be used as probes for a hybridization experiment via microarray. Eventually, one aims at the classification of new strings (unknown viruses) through the result of the hybridization experiment. In this paper we show that SBC is as hard to approximate as Set Cover. Furthermore, we show that the constrained version of SBC (with probes of bounded length) is also hard to approximate. These negative results are tight.
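
    The problem statement above can be illustrated with a small sketch. This is not the method of any paper listed here: it is a plain greedy set-cover style heuristic, and the names `greedy_barcode`, `barcode`, `lower_bound` and the `max_len` cap on probe length are assumptions made for illustration. Each chosen substring splits the strings into those that contain it and those that do not, so k substrings can separate at most 2^k strings, which gives the ceil(log2 n) lower bound for the basic (non-robust) version.

```python
from itertools import combinations
from math import ceil, log2


def all_substrings(s, max_len):
    """Every substring of s with length 1..max_len (hypothetical helper)."""
    return {s[i:i + k] for k in range(1, max_len + 1)
            for i in range(len(s) - k + 1)}


def greedy_barcode(strings, max_len=3):
    """Greedy heuristic: keep picking the substring that separates the most
    still-unseparated pairs. Returns a valid, not necessarily minimum, set."""
    candidates = set()
    for s in strings:
        candidates |= all_substrings(s, max_len)
    # Pairs of string indices not yet distinguished by any chosen substring.
    unresolved = set(combinations(range(len(strings)), 2))
    chosen = []
    while unresolved:
        def gain(t):
            # A pair is separated when exactly one of its strings contains t.
            return sum((t in strings[i]) != (t in strings[j])
                       for i, j in unresolved)
        best = max(sorted(candidates), key=gain)  # sorted for determinism
        if gain(best) == 0:
            raise ValueError("some strings cannot be distinguished")
        chosen.append(best)
        unresolved = {(i, j) for i, j in unresolved
                      if (best in strings[i]) == (best in strings[j])}
    return chosen


def barcode(s, probes):
    """Presence/absence vector of the probes in s."""
    return tuple(p in s for p in probes)


def lower_bound(n):
    """k probes yield at most 2**k distinct barcodes, so k >= ceil(log2 n)."""
    return ceil(log2(n)) if n > 1 else 0
```

    For example, calling `greedy_barcode(["ACGT", "ACGA", "TCGT", "GGGA"])` returns a list of probes under which the four `barcode` vectors are pairwise distinct, and the list can never be shorter than `lower_bound(4)`.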

    High-Throughput SNP Genotyping by SBE/SBH

    Despite much progress over the past decade, current Single Nucleotide Polymorphism (SNP) genotyping technologies still offer an insufficient degree of multiplexing when required to handle user-selected sets of SNPs. In this paper we propose a new genotyping assay architecture combining multiplexed solution-phase single-base extension (SBE) reactions with sequencing by hybridization (SBH) using universal DNA arrays such as all k-mer arrays. In addition to PCR amplification of genomic DNA, SNP genotyping using SBE/SBH assays involves the following steps: (1) Synthesizing primers complementing the genomic sequence immediately preceding SNPs of interest; (2) Hybridizing these primers with the genomic DNA; (3) Extending each primer by a single base using polymerase enzyme and dideoxynucleotides labeled with 4 different fluorescent dyes; and finally (4) Hybridizing extended primers to a universal DNA array and determining the identity of the bases that extend each primer by hybridization pattern analysis. Our contributions include a study of multiplexing algorithms for SBE/SBH genotyping assays and preliminary experimental results showing the achievable tradeoffs between the number of array probes and primer length on one hand and the number of SNPs that can be assayed simultaneously on the other. Simulation results on datasets both randomly generated and extracted from the NCBI dbSNP database suggest that the SBE/SBH architecture provides a flexible and cost-effective alternative to genotyping assays currently used in the industry, enabling genotyping of up to hundreds of thousands of user-specified SNPs per assay.

    The string barcoding problem

    In this paper we consider an approach to solve the string barcoding problem. This approach is based on an explicit reduction from the problem to the satisfiability problem.

    The shortest common superstring problem

    We consider the problem of the shortest common superstring. We describe an approach to solve the problem. This approach is based on an explicit reduction from the problem to the satisfiability problem. © 2013 Anna Gorbenko and Vladimir Popov
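
    To make the problem concrete: a common superstring of a set of strings contains each of them as a substring, and the task is to find a shortest one. The sketch below is not the satisfiability reduction described in the abstract; it is the classical greedy merge heuristic, shown only to illustrate the problem, and the function names are assumptions. It always returns a valid common superstring, though not necessarily a shortest one.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Repeatedly merge the pair with the largest overlap (heuristic)."""
    # Drop duplicates and any string already contained in another one.
    uniq = list(dict.fromkeys(strings))
    parts = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    while len(parts) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(parts):
            for j, b in enumerate(parts):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = parts[i] + parts[j][k:]  # glue right part after the overlap
        parts = [p for idx, p in enumerate(parts) if idx not in (i, j)]
        parts.append(merged)
    return parts[0] if parts else ""
```

    For the input `["ABC", "BCD", "CDE"]` the heuristic first merges "ABC" and "BCD" into "ABCD", then appends "CDE" with an overlap of two characters, giving "ABCDE".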

    Restricted common superstrings

    In this paper we consider an approach to solve the restricted common superstring problem. This approach is based on an explicit reduction from the problem to the satisfiability problem. © 2013 Anna Gorbenko and Vladimir Popov

    The minimum test collection problem

    In this paper we consider an approach to solve the minimum test collection problem. This approach is based on an explicit reduction from the problem to the satisfiability problem.

    The shortest common parameterized supersequence problem

    In this paper, we consider the problem of the shortest common parameterized supersequence. In particular, we consider an explicit reduction from the problem to the satisfiability problem. © 2013 Anna Gorbenko and Vladimir Popov