    A two-base encoded DNA sequence alignment problem in computational biology

    The recent introduction of instruments capable of producing millions of DNA sequence reads in a single run is rapidly changing the landscape of genetics. The primary objective of the "sequence alignment" problem is to search for a new algorithm that facilitates the use of two-base encoded data for large-scale re-sequencing projects. This algorithm should be able to perform local sequence alignment as well as error detection and correction in a reliable and systematic manner, enabling the direct comparison of encoded DNA sequence reads to a candidate reference DNA sequence. We first briefly review two well-known sequence alignment approaches and provide a rudimentary improvement for implementation on parallel systems. We then carefully examine a unique sequencing technique, the SOLiD™ System, that can be implemented, followed by the results from global and local sequence alignment. In this report, the team presents an explanation of the algorithms for color space sequence data from the high-throughput re-sequencing technology and a theoretical parallel approach to the dynamic programming method for global and local alignment. The combination of the di-base approach and dynamic programming provides a possible viewpoint for large-scale re-sequencing projects. We anticipate that distributed computing will be the next-generation engine for large-scale problems such as this.
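To make the two-base (color space) encoding concrete, here is a minimal sketch. It assumes the commonly described scheme in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). The function names are hypothetical and are not taken from the report.

```python
# Illustrative two-base (color space) encoding. The 2-bit base codes and the
# XOR rule are an assumption about the encoding, not the report's own code.
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_BASE = {bits: base for base, bits in BASE_BITS.items()}


def encode_color_space(seq):
    """Return (first_base, colors); each color covers one overlapping base pair."""
    colors = [BASE_BITS[a] ^ BASE_BITS[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode_color_space(first_base, colors):
    """Rebuild the base sequence from the known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        # XOR is its own inverse, so the previous base plus the color gives the next base.
        bases.append(BITS_BASE[BASE_BITS[bases[-1]] ^ color])
    return "".join(bases)


first, colors = encode_color_space("ATGGCA")
print(first, colors)                      # A [3, 1, 0, 3, 1]
print(decode_color_space(first, colors))  # ATGGCA
```

Because each base is decoded from the one before it, a single wrong color changes every base after it. In this sketch, changing the second color from 1 to 0 decodes to ATTTAC instead of ATGGCA. This is one reason to compare reads to the reference directly in color space, as the abstract proposes.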

    Parallel Smith-Waterman Algorithm for Gene Sequencing

    The Smith-Waterman algorithm is a robust basis for developing efficient parallel computing systems for biological gene sequences. This work reviews existing approaches to gene sequencing and alignment using Smith-Waterman, along with their strengths and weaknesses. The Smith-Waterman algorithm calculates the local alignment of two given sequences and is used to identify similar RNA, DNA, and protein segments. It uses a dynamic programming approach to identify the best local alignments of biological gene pairs, and it finds the optimal local alignment under the given scoring system. DOI: 10.17762/ijritcc2321-8169.150515
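The dynamic programming recurrence behind Smith-Waterman can be sketched as follows. This is a plain sequential version with a linear gap penalty. The scoring values (match=2, mismatch=-1, gap=-1) are illustrative assumptions and are not taken from the paper.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # first row and column stay at 0
    best, best_pos = 0, (0, 0)

    # Fill: each cell is the best of "start fresh" (0), a diagonal step, or a gap.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))  # (6, 'ACG', 'ACG')
```

Clamping each cell at zero is what makes the alignment local: a poorly matching prefix is dropped rather than carried forward as a negative score.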

    Massively Parallel Algorithm for Multiple Sequence Alignment Based on Artificial Bee Colony

    In silico biological sequence processing is a key task in molecular biology. This scientific area requires powerful computing resources for exploring large sets of biological data. Parallel in silico simulations based on methods and algorithms for the analysis of biological data using high-performance distributed computing are essential for accelerating research and reducing investment. Multiple sequence alignment is a widely used method for biological sequence processing. The goal of this method is the alignment of DNA and protein sequences. This paper presents an innovative parallel algorithm, MSA_BG, for multiple alignment of biological sequences that is highly scalable and locality aware. The MSA_BG algorithm we describe is iterative and is based on the concept of Artificial Bee Colony (ABC) metaheuristics and the concept of algorithmic and architectural spaces correlation. The metaphor of the ABC metaheuristics has been constructed and the functionalities of the agents have been defined. The conceptual parallel model of computation has been designed and the algorithmic framework of the designed parallel algorithm constructed. Experimental simulations based on a parallel implementation of the MSA_BG algorithm for multiple sequence alignment on a heterogeneous compact computer cluster and the BlueGene/P supercomputer have been carried out for a case study investigating influenza virus variability. The performance estimation and profiling analyses have shown that the parallel system is well balanced with respect to both workload and machine size

    Comparison of Data Partitioning Schema of Parallel Pairwise Alignment on Shared Memory System

    The pairwise alignment (PA) algorithm is widely used in bioinformatics to analyze biological sequences. With advances in sequencer technology, massive numbers of DNA fragments are sequenced much more quickly and cheaply. The alignment algorithm needs to be parallelized to align them in a shorter time. Many previous studies have parallelized the PA algorithm using various data partitioning schemas, but it is unclear which one is best. The data partitioning schema is important for parallel PA performance because the algorithm uses a dynamic programming technique that needs intense inter-thread communication. In this paper, we compared four partitioning schemas to find the best-performing one on a shared memory system. The schemas are: blocked columnwise, rowwise, antidiagonal, and blocked columnwise with manual scheduling and loop unrolling. The last schema gave the best performance, with 89% efficiency on 4 threads. This result provides fine-grained parallelism that can be used further to develop parallel multiple sequence alignment (MSA)
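The antidiagonal schema rests on a dependency property of the dynamic programming matrix: cell (i, j) needs only (i-1, j-1), (i-1, j), and (i, j-1), so all cells with the same i + j can be computed independently. The sketch below walks a global alignment matrix in that order. It is sequential, the helper names and scoring values are assumptions for illustration, and it is not the paper's implementation. In a threaded version, each inner list of cells would be the unit of work split across threads.

```python
def antidiagonals(rows, cols):
    """Yield one list of interior (i, j) cells per antidiagonal (constant i + j)."""
    for d in range(2, rows + cols - 1):
        lo = max(1, d - (cols - 1))   # keeps j = d - i within 1..cols-1
        hi = min(rows - 1, d - 1)     # keeps i within 1..rows-1 and j >= 1
        yield [(i, d - i) for i in range(lo, hi + 1)]


def global_score_wavefront(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score, filling the matrix one antidiagonal at a time."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        H[i][0] = i * gap
    for j in range(cols):
        H[0][j] = j * gap

    for cells in antidiagonals(rows, cols):
        # Cells in this list depend only on earlier antidiagonals, so the loop
        # below could be distributed across threads with a barrier after each list.
        for i, j in cells:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
    return H[-1][-1]


print(list(antidiagonals(3, 3)))              # [[(1, 1)], [(1, 2), (2, 1)], [(2, 2)]]
print(global_score_wavefront("ACGT", "AGT"))  # 2
```

The first and last antidiagonals contain very few cells, so available parallelism ramps up and then back down. That trade-off is part of why comparing partitioning schemas, as the abstract does, is worthwhile.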

    Multiple Biological Sequence Alignment: Scoring Functions, Algorithms, and Evaluations

    Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences' structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-Complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes. In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structure scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity. In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm for use in our three new multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in a progressive fashion. The use of a dynamic weighted tree allows errors in the early alignment stages to be corrected in the subsequent stages. The other two algorithms utilize sequence knowledge bases and sequence consistency to produce biologically meaningful sequence alignments. To improve the speed of multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm

    DNA sequence alignment: hybrid parallel programming on a multicore cluster

    DNA sequence alignment is one of the most important operations in computational biology. In 1981, Smith and Waterman developed a method for local sequence alignment. Because of its computational and memory requirements, various heuristics have been developed to reduce execution time at the expense of accuracy in the result. As a consequence, heuristics do not ensure that the best alignment is found. For this reason, it is interesting to study how to apply the computing power of different parallel platforms to speed up the sequence alignment process without losing accuracy. In this article, a new parallelization strategy (HI-M) for the Smith-Waterman algorithm on a multi-core cluster is presented, configuring a pipeline with a hybrid communication model. Additionally, a performance analysis is carried out and compared with two previously presented parallel solutions. Finally, experimental results are presented, as well as future lines of research. Facultad de Informátic
