An Open Framework for Extensible Multi-Stage Bioinformatics Software
In research labs, there is often a need to customise software at every step
in a given bioinformatics workflow, but traditionally it has been difficult to
obtain both a high degree of customisability and good performance.
Performance-sensitive tools are often highly monolithic, which can make
research difficult. We present a novel set of software development principles
and a bioinformatics framework, Friedrich, which is currently in early
development. Friedrich applications support both early stage experimentation
and late stage batch processing, since they simultaneously allow for good
performance and a high degree of flexibility and customisability. These
benefits are obtained in large part by basing Friedrich on the multiparadigm
programming language Scala. We present a case study in the form of a basic
genome assembler and its extension with new functionality. Our architecture has
the potential to greatly increase the overall productivity of software
developers and researchers in bioinformatics.
Comment: 12 pages, 1 figure, to appear in proceedings of PRIB 201
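The abstract describes Friedrich's multi-stage design only at a high level. As a hedged illustration (the stage names and API below are invented for this sketch, not taken from Friedrich's actual Scala interfaces), a pipeline whose stages can be swapped, reordered, or extended independently might look like:

```python
from typing import Callable, List

# A stage is any function mapping one intermediate result to the next;
# a pipeline is an ordered list of stages that can be extended or
# customised without touching the other stages.
Stage = Callable[[object], object]

def run_pipeline(stages: List[Stage], data: object) -> object:
    for stage in stages:
        data = stage(data)
    return data

# Hypothetical assembly-like stages; all names are illustrative only.
def read_input(path):
    # Stand-in for loading sequencing reads from a file.
    return ["ACGT", "CGTA", "GTAC"]

def filter_short(reads, min_len=4):
    return [r for r in reads if len(r) >= min_len]

def count_reads(reads):
    return len(reads)

pipeline = [read_input, filter_short, count_reads]
result = run_pipeline(pipeline, "reads.fastq")
```

Replacing or inserting a stage (e.g. a different filter for early-stage experimentation) changes only the `pipeline` list, which is the kind of customisability the abstract claims.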
Ontology-based knowledge representation of experiment metadata in biological data mining
According to the PubMed resource from the U.S. National Library of Medicine,
over 750,000 scientific articles have been published in the ~5000 biomedical journals
worldwide in the year 2007 alone. The vast majority of these publications include results from hypothesis-driven experimentation in overlapping biomedical research domains. Unfortunately, the sheer volume of information being generated by the biomedical research enterprise has made it virtually impossible for investigators to stay aware of the latest findings in their domain of interest, let alone to be able to assimilate and mine data from related investigations for purposes of meta-analysis. While computers have the potential for assisting investigators in the extraction, management and analysis of these data, information contained in the traditional journal publication is still largely unstructured, free-text descriptions of study design, experimental application and results interpretation, making it difficult for computers to gain access to the content of what is being conveyed without significant manual intervention. In order to circumvent these roadblocks and make the most of the output from the biomedical research enterprise, a variety of related standards in knowledge representation are being developed, proposed and adopted in the biomedical community. In this chapter, we will explore the current status of efforts to develop minimum information standards for the representation of a biomedical experiment, ontologies composed of shared vocabularies assembled into subsumption hierarchical structures, and extensible relational data models that link the information components together in a machine-readable and human-useable framework for data mining purposes
Bioconductor: open software development for computational biology and bioinformatics.
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples
Object-Oriented Paradigms for Modelling Vascular Tumour Growth: a Case Study
Motivated by a family of related hybrid multiscale models, we have built an object-oriented framework for developing and implementing multiscale models of vascular tumour growth. The models are implemented in our framework as a case study to highlight how object-oriented programming techniques and good object-oriented design may be used effectively to develop hybrid multiscale models of vascular tumour growth. The intention is that this paper will serve as a useful reference for researchers modelling complex biological systems and that these researchers will employ some of the techniques presented herein in their own projects
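The abstract does not show the framework's classes; as a hypothetical sketch of the object-oriented idea (all class names and update rules below are invented), submodels at different scales can share one interface so that a coordinator composes them without knowing their internals:

```python
from abc import ABC, abstractmethod

class ScaleModel(ABC):
    """Common interface shared by submodels at every scale."""
    @abstractmethod
    def update(self, dt: float) -> None: ...

class CellModel(ScaleModel):
    def __init__(self):
        self.cells = 100
    def update(self, dt):
        # Toy growth rule standing in for a cellular-scale model.
        self.cells = int(self.cells * (1 + 0.1 * dt))

class VesselModel(ScaleModel):
    def __init__(self):
        self.oxygen = 1.0
    def update(self, dt):
        # Toy depletion rule standing in for a vascular-scale model.
        self.oxygen *= (1 - 0.05 * dt)

class HybridSimulation:
    """Coordinator: advances each submodel in turn. A new scale plugs
    in by implementing ScaleModel, without changing this class."""
    def __init__(self, submodels):
        self.submodels = submodels
    def step(self, dt=1.0):
        for m in self.submodels:
            m.update(dt)

sim = HybridSimulation([CellModel(), VesselModel()])
for _ in range(3):
    sim.step()
```

The design choice this illustrates is the one the paper advocates: depending on an abstract interface rather than on concrete submodels keeps the hybrid multiscale composition open to extension.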
An open and extensible framework for spatially explicit land use change modelling in R: the lulccR package (0.1.0)
Land use change has important consequences for biodiversity and the
sustainability of ecosystem services, as well as for global
environmental change. Spatially explicit land use change models
improve our understanding of the processes driving change and make
predictions about the quantity and location of future and past
change. Here we present the lulccR package, an object-oriented
framework for land use change modelling written in the R programming
language. The contribution of the work is to resolve the following
limitations associated with the current land use change modelling
paradigm: (1) the source code for model implementations is
frequently unavailable, severely compromising the reproducibility of
scientific results and making it impossible for members of the
community to improve or adapt models for their own purposes; (2)
ensemble experiments to capture model structural uncertainty are
difficult because of fundamental differences between implementations
of different models; (3) different aspects of the modelling
procedure must be performed in different environments because
existing applications usually only perform the spatial allocation of
change. The package includes a stochastic ordered allocation
procedure as well as an implementation of the widely used CLUE-S
algorithm. We demonstrate its functionality by simulating land use
change at the Plum Island Ecosystems site, using a dataset included
with the package. It is envisaged that lulccR will enable future
model development and comparison within an open environment
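The package's actual procedures (including the stochastic ordered allocation and CLUE-S) are more involved than can be shown here; as a deterministic toy simplification, invented for this sketch and not lulccR's implementation, ordered allocation converts the most suitable cells first until a demanded quantity of change is met:

```python
# Toy ordered allocation: given per-cell suitability scores for a
# target land use and a demand (number of cells to convert), convert
# the most suitable cells first. Simplified illustration only.

def ordered_allocation(suitability, demand):
    """suitability: dict cell_id -> score; demand: cells to convert.
    Returns the set of converted cell ids."""
    ranked = sorted(suitability, key=suitability.get, reverse=True)
    return set(ranked[:demand])

scores = {"c1": 0.9, "c2": 0.2, "c3": 0.7, "c4": 0.5}
converted = ordered_allocation(scores, demand=2)
```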
Harnessing the Power of Many: Extensible Toolkit for Scalable Ensemble Applications
Many scientific problems require multiple distinct computational tasks to be
executed in order to achieve a desired solution. We introduce the Ensemble
Toolkit (EnTK) to address the challenges of scale, diversity and reliability
they pose. We describe the design and implementation of EnTK, characterize its
performance and integrate it with two distinct exemplar use cases: seismic
inversion and adaptive analog ensembles. We perform nine experiments,
characterizing EnTK overheads, strong and weak scalability, and the performance
of two use case implementations, at scale and on production infrastructures. We
show how EnTK meets the following general requirements: (i) implementing
dedicated abstractions to support the description and execution of ensemble
applications; (ii) support for execution on heterogeneous computing
infrastructures; (iii) efficient scalability up to O(10^4) tasks; and (iv)
fault tolerance. We discuss novel computational capabilities that EnTK enables
and the scientific advantages arising thereof. We propose EnTK as an important
addition to the suite of tools in support of production scientific computing
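EnTK's concrete API is not given in the abstract, so the sketch below is a generic, hypothetical ensemble abstraction rather than EnTK's interface. It illustrates two of the listed requirements: a dedicated abstraction for describing an ensemble of tasks (i), and a minimal form of fault tolerance via retries (iv):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical ensemble abstraction (not EnTK's actual API): a task is
# a zero-argument callable; the ensemble executes every task and
# retries failed ones up to max_retries times.

def run_ensemble(tasks, max_retries=2, workers=4):
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for name, task in tasks.items():
            for attempt in range(max_retries + 1):
                try:
                    results[name] = pool.submit(task).result()
                    break
                except Exception:
                    if attempt == max_retries:
                        results[name] = None  # task permanently failed
    return results

# A task that fails once, then succeeds, to exercise the retry path.
flaky_state = {"calls": 0}
def flaky():
    flaky_state["calls"] += 1
    if flaky_state["calls"] < 2:
        raise RuntimeError("transient failure")
    return "ok"

out = run_ensemble({"t1": lambda: 1 + 1, "t2": flaky})
```

A production toolkit would additionally handle heterogeneous resources and scale to O(10^4) tasks, which this sketch does not attempt.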
On Designing Multicore-aware Simulators for Biological Systems
The stochastic simulation of biological systems is an increasingly popular
technique in bioinformatics. It is often an enlightening technique, but it can
be computationally expensive. We discuss the main
opportunities to speed it up on multi-core platforms, which pose new challenges
for parallelisation techniques. These opportunities are developed in two
general families of solutions, covering both a single simulation and a bulk
of independent simulations (either replicas or runs derived from a parameter sweep).
The proposed solutions are tested on the parallelisation of the CWC simulator
(Calculus of Wrapped Compartments), carried out by way of the FastFlow
programming framework, making fast development and efficient execution on
multi-cores possible.
Comment: 19 pages + cover page
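The "bulk of independent simulations" family is embarrassingly parallel: replicas share no state, so each can run on its own core. The paper implements this in C++ with FastFlow; the sketch below is a generic Python illustration of the same family (a seeded random walk stands in for one stochastic simulation run), not the CWC/FastFlow implementation:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_replica(seed, steps=1000):
    """Toy stochastic simulation: a seeded random walk standing in
    for one run of a stochastic biochemical simulator. Seeding makes
    each replica independently reproducible."""
    rng = random.Random(seed)
    x = 0
    for _ in range(steps):
        x += 1 if rng.random() < 0.5 else -1
    return x

def run_replicas(seeds):
    # One independent task per seed; no shared state between replicas.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(simulate_replica, seeds))

results = run_replicas(range(8))
```

For CPU-bound simulators a process pool (or, as in the paper, a native framework such as FastFlow) would be used instead of threads, which here only illustrate the task decomposition.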
The Alliance of Genome Resources: Building a Modern Data Ecosystem for Model Organism Databases.
Model organisms are essential experimental platforms for discovering gene functions, defining protein and genetic networks, uncovering functional consequences of human genome variation, and for modeling human disease. For decades, researchers who use model organisms have relied on Model Organism Databases (MODs) and the Gene Ontology Consortium (GOC) for expertly curated annotations, and for access to integrated genomic and biological information obtained from the scientific literature and public data archives. Through the development and enforcement of data and semantic standards, these genome resources provide rapid access to the collected knowledge of model organisms in human readable and computation-ready formats that would otherwise require countless hours for individual researchers to assemble on their own. Since their inception, the MODs for the predominant biomedical model organisms [Mus sp (laboratory mouse), Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Danio rerio, and Rattus norvegicus] along with the GOC have operated as a network of independent, highly collaborative genome resources. In 2016, these six MODs and the GOC joined forces as the Alliance of Genome Resources (the Alliance). By implementing shared programmatic access methods and data-specific web pages with a unified "look and feel," the Alliance is tackling barriers that have limited the ability of researchers to easily compare common data types and annotations across model organisms. To adapt to the rapidly changing landscape for evaluating and funding core data resources, the Alliance is building a modern, extensible, and operationally efficient "knowledge commons" for model organisms using shared, modular infrastructure