202,033 research outputs found

    MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems

    Get PDF
    This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record, Jorge González-Domínguez, Yongchao Liu, Juan Touriño, Bertil Schmidt; MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems, Bioinformatics, Volume 32, Issue 24, 15 December 2016, Pages 3826–3828, is available online at https://doi.org/10.1093/bioinformatics/btw558.
    Abstract: MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. In this work we present MSAProbs-MPI, a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20. Another strong point is that MSAProbs-MPI can deal with large datasets for which MSAProbs and QuickProbs might fail due to time and memory constraints, respectively.
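    The distributed-memory parallelisation itself is not reproduced here, but a minimal mpi4py sketch of the general idea (spreading the all-against-all pairwise stage of a multiple sequence alignment across MPI ranks) may help; pairwise_score and the example sequences are illustrative placeholders, not MSAProbs-MPI code.
```python
# Illustrative sketch only (assumes the third-party mpi4py package), not MSAProbs-MPI
# source code: distribute the all-against-all pairwise stage of a multiple sequence
# alignment across MPI ranks and gather the partial results on rank 0.
from itertools import combinations

from mpi4py import MPI


def pairwise_score(seq_a, seq_b):
    # Placeholder scoring function; a real aligner would compute pair-HMM posteriors.
    return sum(a == b for a, b in zip(seq_a, seq_b))


comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    seqs = ["MKTAYIAKQR", "MKTAYIEKQR", "MKSAYIAKQA", "MRTAYIAKQR"]  # toy input
    pairs = list(combinations(range(len(seqs)), 2))
    chunks = [pairs[i::size] for i in range(size)]  # one share of pairs per rank
else:
    seqs, chunks = None, None

seqs = comm.bcast(seqs, root=0)          # every rank needs all sequences
my_pairs = comm.scatter(chunks, root=0)  # each rank receives its share of pairs
my_scores = {(i, j): pairwise_score(seqs[i], seqs[j]) for i, j in my_pairs}
all_scores = comm.gather(my_scores, root=0)

if rank == 0:
    merged = {pair: score for part in all_scores for pair, score in part.items()}
    print(merged)
```
    Run under an MPI launcher, for example mpiexec -n 8 python sketch.py; rank 0 scatters the pair list and merges the gathered partial score dictionaries.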

    Evaluating the Relationship Between Running Times and DNA Sequence Sizes using a Generic-Based Filtering Program.

    Get PDF
    Generic programming depends on the decomposition of programs into simpler components which may be developed separately and combined arbitrarily, subject only to well-defined interfaces. Bioinformatics deals with the application of computational techniques to data present in the biological sciences. A genetic sequence is a succession of letters that represents the basic structure of a hypothetical DNA molecule, with the capacity to carry information. This research article studied the relationship between the running times of a generic-based filtering program and samples of genetic sequences of increasing orders of magnitude in size. A graphical result was obtained to depict this relationship, and the complexity of the generic tree program was found to be O(log2 N). This research article provides a systematic application of generic programming to bioinformatics, which could be instrumental in major discoveries in bioinformatics with regard to efficient data management and analysis.
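    As a hedged illustration of the O(log2 N) behaviour described above (not the authors' generic tree program), the sketch below times a binary-search membership filter over the k-mers of random DNA sequences of increasing size; build_index, contains, and the k-mer length of 8 are assumptions made for the example.
```python
# Hedged illustration (not the authors' program): time a membership filter whose
# per-query cost is O(log2 N) by binary-searching sorted k-mers drawn from random
# DNA sequences of increasing size.
import bisect
import random
import time


def build_index(sequence, k=8):
    """Return a sorted list of the distinct k-mers occurring in a DNA sequence."""
    return sorted({sequence[i:i + k] for i in range(len(sequence) - k + 1)})


def contains(index, kmer):
    """Binary search over the sorted index: O(log2 N) comparisons per query."""
    pos = bisect.bisect_left(index, kmer)
    return pos < len(index) and index[pos] == kmer


random.seed(0)
for n in (10_000, 100_000, 1_000_000):  # sequence sizes over increasing orders of magnitude
    seq = "".join(random.choice("ACGT") for _ in range(n))
    index = build_index(seq)
    queries = ["".join(random.choice("ACGT") for _ in range(8)) for _ in range(10_000)]
    start = time.perf_counter()
    hits = sum(contains(index, q) for q in queries)
    elapsed = time.perf_counter() - start
    print(f"N={n:>9,}  hits={hits:>6}  query time={elapsed:.4f}s")
```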

    BioGUID: resolving, discovering, and minting identifiers for biodiversity informatics

    Get PDF
    Background: Linking together the data of interest to biodiversity researchers (including specimen records, images, taxonomic names, and DNA sequences) requires services that can mint, resolve, and discover globally unique identifiers (including, but not limited to, DOIs, HTTP URIs, and LSIDs). Results: BioGUID implements a range of services, the core ones being an OpenURL resolver for bibliographic resources and an LSID resolver. The LSID resolver supports Linked Data-friendly resolution using HTTP 303 redirects and content negotiation. Additional services include journal ISSN look-up, author name matching, and a tool to monitor the status of biodiversity data providers. Conclusion: BioGUID is available at http://bioguid.info/. Source code is available from http://code.google.com/p/bioguid/
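    A minimal sketch of the resolution mechanism described in the Results (content negotiation plus an HTTP 303 redirect), assuming the third-party requests library; the LSID embedded in resolver_url is a placeholder for illustration, not a documented BioGUID identifier.
```python
# Minimal sketch of Linked Data-friendly resolution (content negotiation plus an
# HTTP 303 redirect), assuming the third-party requests library. The LSID below is
# a placeholder, not a documented BioGUID identifier.
import requests

resolver_url = "http://bioguid.info/urn:lsid:example.org:names:12345"  # placeholder
headers = {"Accept": "application/rdf+xml"}  # negotiate for an RDF representation

resp = requests.get(resolver_url, headers=headers, allow_redirects=False, timeout=10)
if resp.status_code == 303:
    # 303 See Other points at a document *about* the identified resource.
    doc = requests.get(resp.headers["Location"], headers=headers, timeout=10)
    print(doc.headers.get("Content-Type"), len(doc.content), "bytes")
else:
    print("No 303 redirect; got HTTP", resp.status_code)
```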

    3D time series analysis of cell shape using Laplacian approaches

    Get PDF
    Background: Fundamental cellular processes such as cell movement, division or food uptake critically depend on cells being able to change shape. Fast acquisition of three-dimensional image time series has now become possible, but we lack efficient tools for analysing shape deformations in order to understand the real three-dimensional nature of shape changes. Results: We present a framework for 3D+time cell shape analysis. The main contribution is threefold: First, we develop a fast, automatic random walker method for cell segmentation. Second, a novel topology fixing method is proposed to fix segmented binary volumes without spherical topology. Third, we show that algorithms used for each individual step of the analysis pipeline (cell segmentation, topology fixing, spherical parameterization, and shape representation) are closely related to the Laplacian operator. The framework is applied to the shape analysis of neutrophil cells. Conclusions: The method we propose for cell segmentation is faster than the traditional random walker method or the level set method, and performs better on 3D time series of neutrophil cells, which are comparatively noisy as stacks have to be acquired fast enough to account for cell motion. Our method for topology fixing outperforms the tools provided by SPHARM-MAT and SPHARM-PDM in terms of their successful fixing rates. The different tasks in the presented pipeline for 3D+time shape analysis of cells can be solved using Laplacian approaches, opening the possibility of eventually combining individual steps in order to speed up computations.
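    For readers who want to experiment with the classical Laplacian-based random walker that this pipeline builds on, scikit-image ships an implementation; the sketch below runs it on a synthetic noisy 3D volume and is not the authors' faster variant or their neutrophil data.
```python
# Sketch using scikit-image's implementation of the classical Laplacian-based
# random walker (Grady's method) on a synthetic noisy 3D volume; it does not
# reproduce the paper's faster segmentation method or its neutrophil data.
import numpy as np
from skimage.segmentation import random_walker

rng = np.random.default_rng(0)

# Synthetic 3D "cell": a bright sphere embedded in Gaussian noise.
z, y, x = np.ogrid[:40, :40, :40]
volume = ((z - 20) ** 2 + (y - 20) ** 2 + (x - 20) ** 2 < 10 ** 2).astype(float)
volume += rng.normal(scale=0.35, size=volume.shape)

# Seed labels: 1 = inside the cell, 2 = background, 0 = unlabeled voxels.
labels = np.zeros(volume.shape, dtype=np.uint8)
labels[20, 20, 20] = 1
labels[2, 2, 2] = 2

segmentation = random_walker(volume, labels, beta=130)
print("foreground voxels:", int((segmentation == 1).sum()))
```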

    Differential Functional Constraints Cause Strain-Level Endemism in Polynucleobacter Populations.

    Get PDF
    The adaptation of bacterial lineages to local environmental conditions creates the potential for broader genotypic diversity within a species, which can enable a species to dominate across ecological gradients because of niche flexibility. The genus Polynucleobacter maintains both free-living and symbiotic ecotypes and has an apparently ubiquitous distribution in freshwater ecosystems. Subspecies-level resolution supplemented with metagenome-derived genotype analysis revealed that differential functional constraints, not geographic distance, produce and maintain strain-level genetic conservation in Polynucleobacter populations across three geographically proximal riverine environments. Genes associated with cofactor biosynthesis and one-carbon metabolism showed habitat specificity, and protein-coding genes of unknown function and membrane transport proteins were under positive selection across each habitat. Characterized by different median ratios of nonsynonymous to synonymous evolutionary changes (dN/dS ratios) and a limited but statistically significant negative correlation between the dN/dS ratio and codon usage bias between habitats, the free-living and core genotypes were observed to be evolving under strong purifying selection pressure. Highlighting the potential role of genetic adaptation to the local environment, the two-component system protein-coding genes were highly stable (dN/dS ratio, < 0.03). These results suggest that despite the impact of the habitat on genetic diversity, and hence niche partition, strong environmental selection pressure maintains a conserved core genome for Polynucleobacter populations. IMPORTANCE: Understanding the biological factors influencing habitat-wide genetic endemism is important for explaining observed biogeographic patterns. Polynucleobacter is a genus of bacteria that seems to have found a way to colonize myriad freshwater ecosystems and by doing so has become one of the most abundant bacteria in these environments. We sequenced metagenomes from locations across the Chicago River system and assembled Polynucleobacter genomes from different sites and compared how the nucleotide composition, gene codon usage, and the ratio of synonymous (codes for the same amino acid) to nonsynonymous (codes for a different amino acid) mutations varied across these population genomes at each site. The environmental pressures at each site drove purifying selection for functional traits that maintained a streamlined core genome across the Chicago River Polynucleobacter population while allowing for site-specific genomic adaptation. These adaptations enable Polynucleobacter to become dominant across different riverine environmental gradients.
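    The dN/dS statistic used throughout this abstract can be illustrated with a simplified counting scheme (differing codons counted once, no pathway averaging or multiple-hit correction); the sketch assumes Biopython, equal-length gap-free coding sequences, and toy example codons, and does not reproduce the paper's metagenome-derived estimates.
```python
# Simplified, illustrative dN/dS (strictly a pN/pS proportion: differing codons are
# counted once, with no pathway averaging or multiple-hit correction). Assumes
# Biopython and equal-length, gap-free coding sequences; it does not reproduce the
# paper's metagenome-derived estimates.
from Bio.Data import CodonTable

TABLE = CodonTable.standard_dna_table.forward_table  # codon -> amino acid
BASES = "ACGT"


def translate(codon):
    return TABLE.get(codon, "*")  # '*' for stop codons


def site_counts(codon):
    """Split a codon's 3 positions into synonymous and nonsynonymous sites."""
    syn = nonsyn = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if translate(mutant) == translate(codon):
                syn += 1
            else:
                nonsyn += 1
    return syn / 3.0, nonsyn / 3.0


def pn_ps(seq1, seq2):
    S = N = Sd = Nd = 0.0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s, n = site_counts(c1)
        S, N = S + s, N + n
        if c1 != c2:
            if translate(c1) == translate(c2):
                Sd += 1
            else:
                Nd += 1
    return (Nd / N) / (Sd / S) if Sd else float("inf")


# Toy example: three synonymous differences and no nonsynonymous ones -> ratio 0.0,
# i.e. the signature of purifying selection discussed above.
print(pn_ps("ATGGCTAAAGGTCTT", "ATGGCAAAAGGCCTA"))
```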

    Cloud Storage and Bioinformatics in a private cloud deployment: Lessons for Data Intensive research

    No full text
    This paper describes service portability for a private cloud deployment, including a detailed case study of Cloud Storage and bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). Our Cloud Storage design and deployment is based on Storage Area Network (SAN) technologies; we describe its functionalities, technical implementation, architecture, and user support. Experiments for data services (backup automation, data recovery, and data migration) are performed, and the results confirm that backup automation completes swiftly and is reliable for data-intensive research. The data recovery results confirm that execution time is proportional to the quantity of recovered data, but the failure rate increases exponentially. The data migration results confirm that execution time is proportional to the disk volume of migrated data, but again the failure rate increases exponentially. In addition, benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules, and simulations for medical training. Our Cloud Storage solution described here offers cost reduction, time savings, and user friendliness.
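    As a generic, hedged illustration of the kind of backup automation and verification timed in these experiments (not the CCAF/SAN tooling of the paper), the sketch below copies a directory tree and re-checks each file's checksum; the paths "data" and "backup/data" are placeholders.
```python
# Generic, hedged illustration of backup automation with checksum verification
# (standard library only); this is not the CCAF/SAN tooling evaluated in the paper,
# and the paths "data" and "backup/data" are placeholders.
import hashlib
import shutil
import time
from pathlib import Path


def sha256(path, chunk=1 << 20):
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def backup(src: Path, dst: Path):
    """Copy src to dst, then re-hash every file to detect silent copy failures."""
    start = time.perf_counter()
    shutil.copytree(src, dst, dirs_exist_ok=True)
    failures = [f for f in src.rglob("*")
                if f.is_file() and sha256(f) != sha256(dst / f.relative_to(src))]
    return time.perf_counter() - start, failures


elapsed, failed = backup(Path("data"), Path("backup/data"))
print(f"backup took {elapsed:.1f}s with {len(failed)} verification failures")
```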

    A Robust and Universal Metaproteomics Workflow for Research Studies and Routine Diagnostics Within 24 h Using Phenol Extraction, FASP Digest, and the MetaProteomeAnalyzer

    Get PDF
    The investigation of microbial proteins by mass spectrometry (metaproteomics) is a key technology for simultaneously assessing the taxonomic composition and the functionality of microbial communities in medical, environmental, and biotechnological applications. We present an improved metaproteomics workflow using an updated sample preparation and a new version of the MetaProteomeAnalyzer software for data analysis. High resolution by multidimensional separation (GeLC, MudPIT) was sacrificed in order to enable fast analysis of a broad range of different samples in less than 24 h. The improved workflow generated at least twice as many protein identifications as our previous workflow and a drastic increase in taxonomic and functional annotations. Improvements to all aspects of the workflow, particularly its speed, are first steps toward potential routine clinical diagnostics (i.e., fecal samples) and the analysis of technical and environmental samples. The MetaProteomeAnalyzer is provided to the scientific community as a central remote server solution at www.mpa.ovgu.de. Peer reviewed.