694 research outputs found

Building Large Phylogenetic Trees on Coarse-Grained Parallel Machines

Author: Keane Thomas
McInerney James
Naughton Thomas J.
Page Andrew
Travers Simon
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006

Abstract: Phylogenetic analysis is an area of computational biology concerned with the reconstruction of evolutionary relationships between organisms, genes, and gene families. Maximum likelihood evaluation has proven to be one of the most reliable methods for constructing phylogenetic trees. The huge computational requirements associated with maximum likelihood analysis mean that it is not feasible to produce large phylogenetic trees using a single processor. We have completed a fully cross-platform, coarse-grained distributed application, DPRml, which overcomes many of the limitations imposed by the current set of parallel phylogenetic programs. We have completed a set of efficiency tests that show how to maximise efficiency while using the program to build large phylogenetic trees. The software is publicly available under the terms of the GNU general public licence from the system webpage at http://www.cs.nuim.ie/distributed

MURAL - Maynooth University Research Archive Library

Building large phylogenetic trees on coarse-grained parallel machines

Author: Keane T.M.
McInerney J.O.
Naughton Thomas J.
Page A.J.
Travers S.A.A.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2005

Phylogenetic analysis is an area of computational biology concerned with the reconstruction of evolutionary relationships between organisms, genes, and gene families. Maximum likelihood evaluation has proven to be one of the most reliable methods for constructing phylogenetic trees. The huge computational requirements associated with maximum likelihood analysis mean that it is not feasible to produce large phylogenetic trees using a single processor. We have completed a fully cross-platform, coarse-grained distributed application, DPRml, which overcomes many of the limitations imposed by the current set of parallel phylogenetic programs. We have completed a set of efficiency tests that show how to maximise efficiency while using the program to build large phylogenetic trees. The software is publicly available under the terms of the GNU general public licence from the system webpage at http://www.cs.nuim.ie/distribute

MURAL - Maynooth University Research Archive Library

MultiPhyl: a high-throughput phylogenomics webserver using distributed computing

Author: Keane Thomas M
Mcinerney James
McInerney James O
Naughton Thomas J
Publication venue: Oxford University Press
Publication date: 01/01/2007

With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php

CiteSeerX

Crossref

MURAL - Maynooth University Research Archive Library

PubMed Central

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

The University of Manchester - Institutional Repository

A Domain Decomposition Strategy for Alignment of Multiple Biological Sequences on Multiprocessor Platforms

Author: Ashfaq Khokhar
Fahad Saeed
Publication venue: 'Elsevier BV'
Publication date: 11/05/2009

Multiple Sequence Alignment (MSA) of biological sequences is a fundamental problem in computational biology due to its critical significance in wide-ranging applications including haplotype reconstruction, sequence homology, phylogenetic analysis, and prediction of evolutionary origins. The MSA problem is considered NP-hard, and known heuristics for the problem do not scale well with an increasing number of sequences. On the other hand, with the advent of a new breed of fast sequencing techniques, it is now possible to generate thousands of sequences very quickly. For rapid sequence analysis, it is therefore desirable to develop fast MSA algorithms that scale well with the increase in dataset size. In this paper, we present a novel domain-decomposition-based technique to solve the MSA problem on multiprocessing platforms. The domain-decomposition-based technique, in addition to yielding better quality, gives an enormous advantage in terms of execution time and memory requirements. The proposed strategy makes it possible to decrease the time complexity of any known heuristic of O(N)^x complexity by a factor of O(1/p)^x, where N is the number of sequences, x depends on the underlying heuristic approach, and p is the number of processing nodes. In particular, we propose a highly scalable algorithm, Sample-Align-D, for aligning biological sequences using the Muscle system as the underlying heuristic. The proposed algorithm has been implemented on a cluster of workstations using the MPI library. Experimental results for different problem sizes are analyzed in terms of quality of alignment, execution time and speed-up. Comment: 36 pages, 17 figures, Accepted manuscript in Journal of Parallel and Distributed Computing (JPDC)

arXiv.org e-Print Archive

Crossref

High performance reconfigurable architectures for bioinformatics and computational biology applications

Author: Kasap Server
Publication venue: The University of Edinburgh
Publication date: 01/01/2010

Edinburgh Research Archive

On the design of architecture-aware algorithms for emerging applications

Author: Kang Seunghwa
Publication venue: Georgia Institute of Technology
Publication date: 30/01/2011

This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed in this dissertation have widely varying computational characteristics. For example, we consider both dense numerical computations and sparse graph algorithms. This dissertation also covers emerging applications from image processing, complex network analysis, and computational biology. We map these problems to diverse multicore processors and manycore accelerators. We also use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in the problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also identify several limitations of current system software and architectures, and directions for improving them. The discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced via collaborative efforts among researchers and practitioners from different domains. This dissertation contributes to these efforts by providing benchmarks and suggestions to improve system software and architectures. Ph.D. Committee Chair: Bader, David; Committee Member: Hong, Bo; Committee Member: Riley, George; Committee Member: Vuduc, Richard; Committee Member: Wills, Scot

Scholarly Materials And Research @ Georgia Tech

The Role of Mutations in Protein Structural Dynamics and Function: A Multi-scale Computational Approach

Author
Publication venue
Publication date: 01/01/2011

Abstract: Proteins are a fundamental unit in biology. Although proteins have been extensively studied, there is still much to investigate. The mechanism by which proteins fold into their native state, how evolution shapes structural dynamics, and the dynamic mechanisms of many diseases are not well understood. In this thesis, protein folding is explored using a multi-scale modeling method including (i) geometric-constraint-based simulations that efficiently search for native-like topologies and (ii) reservoir replica exchange molecular dynamics, which identifies the low free energy structures and refines these structures toward the native conformation. A test set of eight proteins and three ancestral steroid receptor proteins are folded to 2.7Å all-atom RMSD from their experimental crystal structures. Protein evolution and disease-associated mutations (DAMs) are most commonly studied by in silico multiple sequence alignment methods. Here, however, the structural dynamics are incorporated to give insight into the evolution of three ancestral proteins and the mechanism of several diseases in human ferritin protein. The differences in conformational dynamics of these evolutionarily related, functionally diverged ancestral steroid receptor proteins are investigated by obtaining the most collective motion through essential dynamics. Strikingly, this analysis shows that evolutionarily diverged proteins of the same family do not share the same dynamic subspace. Rather, those sharing the same function are simultaneously clustered together and distant from those functionally diverged homologs. This dynamics analysis also identifies 77% of mutations (functional and permissive) necessary to evolve new function. In silico methods for prediction of DAMs rely on differences in evolution rate due to purifying selection, and therefore the accuracy of DAM prediction decreases at fast and slow evolvable sites.
Here, we investigate structural dynamics through computing the contribution of each residue to the biologically relevant fluctuations and from this define a metric: the dynamic stability index (DSI). Using DSI we study the mechanism for three diseases observed in the human ferritin protein. The T30I and R40G DAMs show a loss of dynamic stability at the C-terminus helix and nearby regulatory loop, agreeing with experimental results implicating the same regulatory loop as a cause in cataracts syndrome. Dissertation/Thesis. Ph.D. Physics 201

ASU Digital Repository

Dynamic homology and phylogenetic systematics: a unified approach using POY

Author: Aagesen Lone
Arango Claudia P.
D’Haese Cyrille
Faivovich Julián
Giribet Gonzalo
Grant Taran
Janies Daniel
Smith William Leo
Varón Andrés
Wheeler Ward C.
Publication venue: 'American Museum of Natural History (BioOne sponsored)'
Publication date: 01/01/2006

KU ScholarWorks

Darwin's Rainbow: Evolutionary radiation and the spectrum of consciousness

Author: Wallace Robert G.
Wallace Rodrick
Publication venue
Publication date: 01/09/2006

Evolution is littered with paraphyletic convergences: many roads lead to functional Romes. We propose here another example - an equivalence class structure factoring the broad realm of possible realizations of the Baars Global Workspace consciousness model. The construction suggests many different physiological systems can support rapidly shifting, sometimes highly tunable, temporary assemblages of interacting unconscious cognitive modules. The discovery implies various animal taxa exhibiting behaviors we broadly recognize as conscious are, in fact, simply expressing different forms of the same underlying phenomenon. Mathematically, we find much slower, and even multiple simultaneous, versions of the basic structure can operate over very long timescales, a kind of paraconsciousness often ascribed to group phenomena. The variety of possibilities, a veritable rainbow, suggests minds today may be only a small surviving fraction of ancient evolutionary radiations - bush phylogenies of consciousness and paraconsciousness. Under this scenario, the resulting diversity was subsequently pruned by selection and chance extinction. Though few traces of the radiation may be found in the direct fossil record, exaptations and vestiges are scattered across the living mind. Humans, for instance, display an uncommonly profound synergism between individual consciousness and their embedding cultural heritages, enabling efficient Lamarckian adaptation

CogPrints Cognitive Sciences Eprint Archive