209 research outputs found

    Design and Evaluation of a BLAST Ungapped Extension Accelerator, Master's Thesis

    The amount of biosequence data being produced each year is growing exponentially. Extracting useful information from this massive amount of data is becoming an increasingly difficult task. This thesis focuses on accelerating BLAST, the most widely used software tool for analyzing genomic data. It presents Mercury BLAST, a novel method for accelerating searches through massive DNA databases. Mercury BLAST takes a streaming approach to the BLAST computation by offloading the performance-critical sections onto reconfigurable hardware. This hardware is then used in combination with the processor of the host system to deliver BLAST results in a fraction of the time taken by the general-purpose processor alone. Mercury BLAST makes use of new algorithms combined with reconfigurable hardware to accelerate BLAST-like similarity search. An evaluation of this method for use in real BLAST-like searches is presented along with a characterization of the quality of results associated with using these new algorithms in specialized hardware. The primary focus of this thesis is the design of the ungapped extension stage of Mercury BLAST. The architecture of the ungapped extension stage is described along with the context of this stage within the Mercury BLAST system. The design is compact and runs over 20× faster than the standard software ungapped extension, yielding close to 50× speedup over the complete software BLAST application. The quality of Mercury BLAST results is essentially equivalent to that of standard BLAST results.
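    For orientation, ungapped extension in general can be sketched as follows. This is a minimal software illustration of the classic X-drop style of extension, not the Mercury BLAST hardware algorithm, which the abstract does not specify. The function name, the scoring values (match, mismatch), and the x_drop threshold are all hypothetical choices made for the example.

```python
def ungapped_extend(query, subject, q_pos, s_pos, word_len,
                    match=1, mismatch=-3, x_drop=5):
    """Extend a seed hit in both directions without gaps.

    Returns (score, q_start, q_end), where query[q_start:q_end] is the
    extended segment. Scoring values and x_drop are illustrative assumptions.
    """
    def score(a, b):
        return match if a == b else mismatch

    # Score of the seed word itself.
    seed = sum(score(query[q_pos + i], subject[s_pos + i])
               for i in range(word_len))

    def extend(qi, si, step):
        # Walk outward from the seed, remembering the best running score
        # and stopping once the score falls more than x_drop below it.
        best = running = 0
        best_len = n = 0
        while 0 <= qi < len(query) and 0 <= si < len(subject):
            running += score(query[qi], subject[si])
            n += 1
            if running > best:
                best, best_len = running, n
            elif best - running > x_drop:
                break
            qi += step
            si += step
        return best, best_len

    right, r_len = extend(q_pos + word_len, s_pos + word_len, 1)
    left, l_len = extend(q_pos - 1, s_pos - 1, -1)
    return seed + left + right, q_pos - l_len, q_pos + word_len + r_len
```

    Calling ungapped_extend("ACGTACGTAC", "TTGTACGTTT", 2, 2, 4) starts from the shared word GTAC and grows it to the right until mismatches push the running score too far below its best value.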

    ToPoliNano: Nanoarchitectures Design Made Real

    Many facts about emerging nanotechnologies are yet to be assessed. There are still major concerns, for instance, about the maximum achievable device density, or about which architecture best fits a specific application. Growing complexity requires taking into account many aspects of technology, application and architecture at the same time. Researchers face problems that are not new per se, but are now subject to very different constraints that need to be captured by design tools. Among the emerging nanotechnologies, two-dimensional nanowire-based arrays represent promising nanostructures, especially for massively parallel computing architectures. Few attempts have been made to enable the exploration of architectural solutions based on extensive and reliable nanoarray characterization. Moreover, in the nanotechnology arena there is still no clear winner, so it is important to be able to target different technologies in order not to miss the next big thing. We present a tool, ToPoliNano, that enables such a multi-technological characterization in terms of logic behavior, power and timing performance, and area and layout constraints, on the basis of specific technological and topological descriptions. This tool can aid the design process, besides providing a comprehensive simulation framework for DC and timing simulations and detailed power analysis. Design and simulation results will be shown for nanoarray-based circuits. ToPoliNano is the first real design tool that tackles the top-down design of a circuit based on emerging technologies.

    RegExpBlasting (REB), a Regular Expression Blasting algorithm based on multiply aligned sequences

    Background: One of the most frequent uses of bioinformatics tools concerns functional characterization of a newly produced nucleotide sequence (a query sequence) by applying Blast or FASTA against a set of sequences (the subject sequences). However, in some specific contexts, it is useful to compare the query sequence against a cluster such as a MultiAlignment (MA). We present here the RegExpBlasting (REB) algorithm, which compares an unclassified sequence with a dataset of patterns defined by applying Regular Expression rules to a given input dataset of MAs. The REB algorithm workflow consists of (i) the definition of a dataset of multialignments; (ii) the association of each MA with a pattern, defined by application of regular expression rules; and (iii) the automatic characterization of a submitted biosequence according to the function of the sequences described by the pattern best matching the query sequence. Results: An application of this algorithm is used in the "characterize your sequence" tool available in the PPNEMA resource. PPNEMA is a resource of Ribosomal Cistron sequences from various species, grouped according to nematode genera. It allows the retrieval of plant nematode multialigned sequences or the classification of new nematode rDNA sequences by applying REB. The same algorithm also supports automatic updating of the PPNEMA database. The present paper gives examples of the use of REB within PPNEMA. Conclusion: The use of REB in PPNEMA updating and in the PPNEMA "characterize your sequence" option clearly demonstrates the power of the method. REB can also rapidly solve any other bioinformatics problem in which a new sequence must be added to a pre-existing cluster. The statistical tests carried out here show the powerful flexibility of the method.
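    The three-step workflow described above can be illustrated with a small sketch. The abstract does not state the actual regular expression rules used by REB, so the column-to-pattern rule below (one character class per alignment column, optional where the column contains a gap) and the "longest match wins" criterion are assumptions made purely for illustration, as are the function names and the toy clusters.

```python
import re

def alignment_to_regex(aligned_seqs):
    """Build a regex from equal-length aligned sequences ('-' marks a gap).

    Assumed rule: each column becomes a literal or a character class,
    and the token is made optional if any sequence has a gap there.
    Residues are assumed to be plain letters, so no escaping is needed.
    """
    parts = []
    for column in zip(*aligned_seqs):
        residues = sorted(set(column) - {"-"})
        if not residues:
            continue  # column made of gaps only
        token = residues[0] if len(residues) == 1 else "[" + "".join(residues) + "]"
        if "-" in column:
            token += "?"
        parts.append(token)
    return "".join(parts)

def best_matching_cluster(query, clusters):
    """Return the cluster whose pattern matches the longest stretch of query."""
    best_name, best_len = None, 0
    for name, seqs in clusters.items():
        m = re.search(alignment_to_regex(seqs), query)
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name

# Hypothetical toy multialignments, one per cluster.
clusters = {
    "genusA": ["ACGT-A", "ACCTTA"],
    "genusB": ["GGGAAA", "GGCAAA"],
}
assigned = best_matching_cluster("TTACGTATT", clusters)
```

    Here the genusA alignment collapses to the pattern AC[CG]TT?A, which matches the query, while the genusB pattern does not, so the query is assigned to genusA.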

    Feedbacks in QCA: a Quantitative Approach

    In the post-CMOS scenario a primary role is played by the quantum-dot cellular automata (QCA) technology. Irrespective of the specific implementation principle (e.g., molecular, magnetic, or semiconductive in the current scenario), the intrinsic deep-level pipelined behavior is the dominant issue. It has important consequences for circuit design and performance, especially in the presence of feedbacks in sequential circuits. Though already partially addressed in the literature, these consequences must still be fully understood, and solutions thoroughly worked out, to allow this technology any further advancement. This paper conducts an exhaustive analysis of the effects and consequences deriving from the presence of loops in QCA circuits. For each problem that arises, a solution is presented. The analysis is performed using, as a test architecture, a complex systolic array circuit for biosequence analysis (Smith–Waterman algorithm), which represents one of the most promising applications for QCA technology. The circuit is based on nanomagnetic logic as the QCA implementation, is designed down to the layout level considering technological constraints and experimentally validated structures, counts up to approximately 2.3 million nanomagnets, and is described and simulated in an HDL, using realistic protein alignment sequences as a testbench. The results presented here constitute a fundamental advancement in the emerging technologies field since: 1) they are based on a quantitative approach relying on a realistic and complex circuit involving a large variety of QCA blocks; 2) they are reckoned strictly from current technological limits without relying on unrealistic assumptions; 3) they provide general rules to design complex sequential circuits with intrinsically pipelined technologies, like QCA; and 4) they prove with a real application benchmark how to maximize circuit performance.
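    As background on the benchmark, the Smith–Waterman recurrence can be sketched in a few lines of sequential software. This is only a reference illustration of the algorithm, not the QCA systolic array from the paper; the linear gap penalty and the scoring values are assumptions chosen for the example. Each cell depends on its left, upper, and upper-left neighbours, which is what lets a systolic array compute the cells of one anti-diagonal in parallel.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (linear gap penalty).

    Scoring values are illustrative assumptions. Only two rows of the
    dynamic-programming matrix are kept, since only the score is returned.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

    For example, smith_waterman_score("GGACGTGG", "TTACGTTT") scores the shared segment ACGT and ignores the unrelated flanks.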

    A Mutagenetic Tree Hidden Markov Model for Longitudinal Clonal HIV Sequence Data

    RNA viruses provide prominent examples of measurably evolving populations. In HIV infection, the development of drug resistance is of particular interest, because precise predictions of the outcome of this evolutionary process are a prerequisite for the rational design of antiretroviral treatment protocols. We present a mutagenetic tree hidden Markov model for the analysis of longitudinal clonal sequence data. Using HIV mutation data from clinical trials, we estimate the order and rate of occurrence of seven amino acid changes that are associated with resistance to the reverse transcriptase inhibitor efavirenz. Comment: 20 pages, 6 figures

    Performance Evaluation for Hybrid Architectures

    In this dissertation we discuss methodologies for estimating the performance of applications on hybrid architectures, systems that include various types of computing resources (e.g. traditional general-purpose processors, chip multiprocessors, reconfigurable hardware). A common use of hybrid architectures will be to deploy coarse pipeline stages of an application on suitable compute units, with communication paths for transferring data. The first problem we focus on relates to sizing the data queues between the different processing elements of a hybrid system. Much of the discussion centers on our analytical models, which can be used to derive performance metrics of interest, such as throughput and stalling probability, for networks of processing elements with finite data buffering between them. We then discuss the reliability of performance models. There we start by presenting scenarios where our analytical models are reliable, and introduce tests that can detect their inapplicability. As we transition into the question of reliability of performance models, we assess the accuracy and applicability of various evaluation methods. We present results from our experiments to show the need for measuring and accounting for operating system effects in architectural modeling and estimation.
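    To give a feel for how buffer size relates to throughput and to the probability of a full buffer, the sketch below uses the textbook M/M/1/K queue. This is a stand-in chosen for illustration only: the abstract does not say which analytical models the dissertation uses, and a loss queue (arrivals dropped when full) is only a loose analogy for a pipeline stage that stalls its upstream neighbour. The function name and parameters are hypothetical.

```python
def mm1k_metrics(arrival_rate, service_rate, capacity):
    """Full-buffer probability and throughput of an M/M/1/K queue.

    capacity is K, the maximum number of items in the system
    (including the one in service). Textbook model, used here as
    an assumed stand-in for a single finite buffer between two stages.
    """
    rho = arrival_rate / service_rate
    if rho == 1.0:
        p_full = 1.0 / (capacity + 1)
    else:
        p_full = (1 - rho) * rho**capacity / (1 - rho**(capacity + 1))
    # Only arrivals that find free space are accepted.
    throughput = arrival_rate * (1 - p_full)
    return p_full, throughput
```

    Evaluating mm1k_metrics for increasing capacity at a fixed load shows the full-buffer probability shrinking and the throughput approaching the arrival rate, which is the trade-off a queue-sizing study has to quantify.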

    The Genome and Methylome of a Subsocial Small Carpenter Bee, Ceratina calcarata

    Understanding the evolution of animal societies, considered to be a major transition in evolution, is a key topic in evolutionary biology. Recently, new gateways for understanding social evolution have opened up due to advances in genomics, allowing for unprecedented opportunities in studying social behavior on a molecular level. In particular, highly eusocial insect species (caste-containing societies with nonreproductives that care for siblings) have taken center stage in studies of the molecular evolution of sociality. Despite advances in genomic studies of both solitary and eusocial insects, we still lack genomic resources for early insect societies. Studying the genetic basis of social traits requires comparing genomes from a diversity of organisms ranging from solitary to complex social forms. Here we present the genome of a subsocial bee, Ceratina calcarata. This study begins to address the types of genomic changes associated with the earliest origins of simple sociality using the small carpenter bee. Genes associated with lipid transport and DNA recombination have undergone positive selection in C. calcarata relative to other bee lineages. Furthermore, we provide the first methylome of a noneusocial bee. Ceratina calcarata contains the complete enzymatic toolkit for DNA methylation. As in the honey bee and many other holometabolous insects, DNA methylation is targeted to exons. The addition of this genome allows for new lines of research into the genetic and epigenetic precursors to complex social behaviors.