Computational Strategies for Scalable Genomics Analysis.
The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis. Various big data technologies have been explored to scale up and scale out current bioinformatics solutions to mine the big genomics data. In this review, we survey some of these exciting developments in the applications of parallel distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in the context of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for computer scientists with interests in genomics applications.
An Introduction to Programming for Bioscientists: A Python-based Primer
Computing has revolutionized the biological sciences over the past several
decades, such that virtually all contemporary research in the biosciences
utilizes computer programs. The computational advances have come on many
fronts, spurred by fundamental developments in hardware, software, and
algorithms. These advances have influenced, and even engendered, a phenomenal
array of bioscience fields, including molecular evolution and bioinformatics;
genome-, proteome-, transcriptome- and metabolome-wide experimental studies;
structural genomics; and atomistic simulations of cellular-scale molecular
assemblies as large as ribosomes and intact viruses. In short, much of
post-genomic biology is increasingly becoming a form of computational biology.
The ability to design and write computer programs is among the most
indispensable skills that a modern researcher can cultivate. Python has become
a popular programming language in the biosciences, largely because (i) its
straightforward semantics and clean syntax make it a readily accessible first
language; (ii) it is expressive and well-suited to object-oriented programming,
as well as other modern paradigms; and (iii) the many available libraries and
third-party toolkits extend the functionality of the core language into
virtually every biological domain (sequence and structure analyses,
phylogenomics, workflow management systems, etc.). This primer offers a basic
introduction to coding, via Python, and it includes concrete examples and
exercises to illustrate the language's usage and capabilities; the main text
culminates with a final project in structural bioinformatics. A suite of
Supplemental Chapters is also provided. Starting with basic concepts, such as
that of a 'variable', the Chapters methodically advance the reader to the point
of writing a graphical user interface to compute the Hamming distance between
two DNA sequences.
Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables,
numerous exercises, and 19 pages of Supporting Information; currently in
press at PLOS Computational Biology
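The Hamming distance mentioned in the abstract above counts the positions at which two equal-length sequences differ. The following is a minimal sketch of that computation only, not the primer's own code and without the graphical interface; the function name `hamming_distance` and the example sequences are assumptions chosen for illustration.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    # The Hamming distance is only defined for sequences of equal length.
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    # Compare the sequences position by position and count mismatches.
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical example sequences (not taken from the primer).
distance = hamming_distance("GATTACA", "GACTATA")
print(distance)
```

Raising an error on unequal lengths, rather than silently truncating to the shorter sequence, is a design choice of this sketch.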
BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models
Background: Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the
biological sciences. The number of published quantitative models is growing steadily thanks to increasing interest in
the use of models as well as the development of improved software systems and the availability of better, cheaper
computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model
repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in
these repositories should be extensively tested and encoded in community-supported and standardised formats. In
addition, the models and their components should be cross-referenced with other resources in order to allow their
unambiguous identification.
Description: BioModels Database (http://www.ebi.ac.uk/biomodels/) is aimed at addressing exactly these needs. It is a
freely-accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative
models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by
BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled
vocabularies as well as linked to relevant data resources. Models can be examined online or downloaded in various
formats. Reaction network diagrams generated from the models are also available in several formats. BioModels
Database also provides features such as online simulation and the extraction of components from large scale models
into smaller submodels. Finally, the system provides a range of web services that external software systems can use to
access up-to-date data from the database.
Conclusions: BioModels Database has become a recognised reference resource for systems biology. It is being used by
the community in a variety of ways; for example, it is used to benchmark different simulation systems, and to study the
clustering of models based upon their annotations. Model deposition to the database today is advised by several
publishers of scientific journals. The models in BioModels Database are freely distributed and reusable; the underlying
software infrastructure is also available from SourceForge (https://sourceforge.net/projects/biomodels/) under the GNU
General Public License.
STOCHSIMGPU: Parallel stochastic simulation for the Systems Biology Toolbox 2 for MATLAB
Motivation: The importance of stochasticity in biological systems is becoming increasingly recognised, and the computational cost of biologically realistic stochastic simulations urgently requires development of efficient software. We present a new software tool, STOCHSIMGPU, which exploits graphics processing units (GPUs) for parallel stochastic simulations of biological/chemical reaction systems, and show that significant gains in efficiency can be made. It is integrated into MATLAB and works with the Systems Biology Toolbox 2 (SBTOOLBOX2) for MATLAB.
Results: The GPU-based parallel implementation of the Gillespie stochastic simulation algorithm (SSA), the logarithmic direct method (LDM), and the next reaction method (NRM) is approximately 85 times faster than the sequential implementation of the NRM on a central processing unit (CPU). Using our software does not require any changes to the user's models, since it acts as a direct replacement of the stochastic simulation software of the SBTOOLBOX2.
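For readers unfamiliar with the Gillespie stochastic simulation algorithm named in the abstract above, the following is a minimal sequential sketch of its direct method in Python. It is not STOCHSIMGPU's implementation, it does not use a GPU, and it does not cover the LDM or NRM variants. The function name `gillespie_direct`, its parameters, and the one-species decay model used as a demonstration are all assumptions made for illustration.

```python
import random


def gillespie_direct(x0, stoich, propensity_fns, t_end, rng):
    """Simulate one trajectory with the Gillespie direct method.

    x0             -- initial molecule counts, one per species
    stoich         -- stoich[j][i] is the change in species i when reaction j fires
    propensity_fns -- one function per reaction, mapping the state to a rate
    t_end          -- simulation stops once time would exceed this value
    rng            -- a random.Random instance (seeded for reproducibility)
    """
    t = 0.0
    state = list(x0)
    trajectory = [(t, tuple(state))]
    while True:
        rates = [f(state) for f in propensity_fns]
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        # The waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick reaction j with probability rates[j] / total.
        threshold = rng.random() * total
        cumulative = 0.0
        for j, rate in enumerate(rates):
            cumulative += rate
            if threshold < cumulative:
                break
        # Apply the stoichiometric change of the chosen reaction.
        for i, change in enumerate(stoich[j]):
            state[i] += change
        trajectory.append((t, tuple(state)))
    return trajectory


# Hypothetical demo model: first-order decay A -> (nothing), rate 0.5 * A.
rng = random.Random(1)
trajectory = gillespie_direct(
    x0=[50],
    stoich=[[-1]],
    propensity_fns=[lambda s: 0.5 * s[0]],
    t_end=1000.0,
    rng=rng,
)
print(len(trajectory), trajectory[-1])
```

Each call produces one independent trajectory. Workloads built from many such independent runs are a natural target for the kind of parallel execution the abstract describes.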
GPU acceleration for statistical gene classification
The use of bioinformatic tools in routine clinical diagnostics still faces a number of issues. The more complex and advanced bioinformatic tools become, the more performance is required of the computing platforms. Unfortunately, the cost of parallel computing platforms is usually prohibitive for both public and small private medical practices. This paper presents a successful experience in using the parallel processing capabilities of graphics processing units (GPUs) to speed up bioinformatic tasks such as statistical classification of gene expression profiles. The results show that using open source CUDA programming libraries yields a significant increase in performance and therefore narrows the gap between advanced bioinformatic tools and real medical practice.
Molecular simulations and visualization: introduction and overview
Here we provide an introduction and overview of current progress in the field of molecular simulation and visualization, touching on the following topics: (1) virtual and augmented reality for immersive molecular simulations; (2) advanced visualization and visual analytic techniques; (3) new developments in high performance computing; and (4) applications and model building.