The Importance of Modularity in Bioinformatics Tools

Author: Chris T. Evelo
Martijn P. van Iersel
Martina Kutmon
Thomas Kelder
Publication date: 13/07/2011

In the last decade the number of bioinformatics tools has increased enormously. There are tools to store, analyse, visualize, edit or generate biological data, and more are in development. Still, the demand for increased functionality in a single piece of software must be balanced against the need for modularity to keep the software maintainable. In complex systems, the conflicting demands of features and maintainability are often resolved by plug-in systems.

For example, Cytoscape, an open source platform for complex-network analysis and visualization, uses a plug-in system so that the application can be extended without changing the core. This not only allows new functionality to be integrated without a new release but also lets other developers contribute plug-ins that they need in their research.

Most tools have their own, individual plug-in system to meet the needs of the application. These are often very simple and easy to use. However, the increasing complexity of plug-ins demands more functionality of the plug-in system. We want to reuse components in different contexts, we want to have simple plug-in interfaces and we want to allow communication and dependencies between plug-ins. Many tools implemented in Java are facing these problems and there seems to be a common solution: the integration of an established modularity framework, like OSGi. To our knowledge, a number of developers of bioinformatics tools are already implementing, planning or thinking about the integration of OSGi into their applications, e.g. Cytoscape, Protege, PathVisio, ImageJ, Jalview or Chipster. The adoption of modularity frameworks in the development of bioinformatics applications is steadily increasing and should be considered in the design of new software.

By modularity in the traditional computer science sense, we mean the division of a software application into logical parts with separate concerns. To ease the development of software tools, the application is separated into smaller logical parts, which are implemented individually. A set of modules can form a larger application, but only if a proper glue is used; OSGi is an example of such a glue. OSGi allows developers to build an infrastructure into an application for adding and using different modules. It provides mechanisms that allow the individual modules to rely on and interact with each other, making it possible to put different modules together to solve the problem at hand. Later, modules can be removed and new ones added to tackle another problem. As Katy Boerner writes in her article 'Plug-and-Play Macroscopes', we should 'implement software frameworks that empower domain scientists to assemble their own continuously evolving macroscopes, adding and upgrading existing (and removing obsolete) plug-ins to arrive at a set that is truly relevant for their work'.
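The idea of modules that rely on each other and can be added or removed at runtime can be illustrated with a small sketch. This is not OSGi and does not use any OSGi API; the class `ModuleRegistry`, its methods, and the module names `parser` and `importer` are hypothetical and serve only to show dependency checks on adding and removing modules.

```python
class ModuleRegistry:
    """Toy registry (hypothetical, not OSGi): named modules with dependencies."""

    def __init__(self):
        # name -> (service object, tuple of required module names)
        self._modules = {}

    def add(self, name, service, requires=()):
        # Refuse to add a module whose dependencies are not present yet.
        missing = [dep for dep in requires if dep not in self._modules]
        if missing:
            raise LookupError(f"{name} needs missing modules: {missing}")
        self._modules[name] = (service, tuple(requires))

    def remove(self, name):
        # Refuse to remove a module that another module still relies on.
        dependents = [n for n, (_, reqs) in self._modules.items() if name in reqs]
        if dependents:
            raise RuntimeError(f"{name} is still required by: {dependents}")
        del self._modules[name]

    def get(self, name):
        return self._modules[name][0]

    def names(self):
        return sorted(self._modules)


registry = ModuleRegistry()
# A reusable parser module, and an importer module that depends on it.
registry.add("parser", lambda text: text.split("\t"))
registry.add("importer", lambda line: registry.get("parser")(line), requires=["parser"])
fields = registry.get("importer")("geneA\tgeneB")
```

In this sketch the importer looks the parser up through the registry instead of importing it directly, which is the kind of loose coupling a modularity framework is meant to provide.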

Some of these modules will be specific to one application, but many can be reused by other tools. These are general features such as the import or export of different file formats, a layout algorithm that could be used by several visualization tools, or a lookup in an external online database. Why should every tool implement its own parser or algorithm? Modularity can help to share functionality. There is no need to start from scratch and implement everything anew, so developers can focus on new and important features.

Adding modularity, or better, a modularity framework, to an existing software application is not a trivial task. The developers of Cytoscape are currently undertaking this challenge with the coming version 3. We are also working on the integration of OSGi into our pathway visualization tool PathVisio, and we now want to share and compare our experiences so that others can benefit from our discoveries. This will help them not only in deciding whether OSGi is a suitable solution for them but also in the integration process itself.

Crossref

Nature Precedings

WikiPathways: building research communities on biological pathways.

Author: Conklin Bruce R
Evelo Chris T
Hanspers Kristina
Kelder Thomas
Kutmon Martina
Pico Alexander R
van Iersel Martijn P
Publication venue: eScholarship, University of California
Publication date: 16/11/2011

Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further

PubMed Central

eScholarship - University of California

BridgeDb: standardized access to gene, protein and metabolite identifier mapping services

Author: Alexander R. Pico
Bruce R. Conklin
Chris T. A. Evelo
Isaac Ho
Jianjiong Gao
Kristina Hanspers
Martijn P. van Iersel
Thomas Kelder
Publication date: 17/10/2010

Many interesting problems in bioinformatics require integration of data from various sources. Examples include combining microarray data with a pathway database, or merging co-citation networks with protein-protein interaction networks. Invariably this leads to an identifier mapping problem, where different datasets are annotated with identifiers that are related, but originate from different databases.

Solutions for the identifier mapping problem exist, such as Biomart, Synergizer, Cronos, PICR, HMS and many more. This creates an opportunity for bioinformatics tool developers. Tools can be made to flexibly support multiple mapping services or mapping services could be combined to get broader coverage. This approach requires an interface layer between tools and mapping services. BridgeDb provides such an interface layer, in the form of both a Java and REST API.

Because of the standardized interface layer, BridgeDb is not tied to a specific source of mapping information. You can switch easily between flat files, relational databases and several different web services. Mapping services can be combined to support multi-omics experiments or to integrate custom microarray annotations. BridgeDb isn't just yet another mapping service: it tries to build further on existing work, and integrate multiple partial solutions. The framework is intended for customization and adaptation to any identifier mapping service. 
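The notion of a standardized interface layer that several mapping sources implement, and that lets sources be combined for broader coverage, can be sketched as follows. This is not the BridgeDb API: the names `IdMapper`, `TableMapper`, `CombinedMapper` and `map_id`, as well as the database names and identifiers, are made up for illustration.

```python
from abc import ABC, abstractmethod


class IdMapper(ABC):
    """Hypothetical common interface that every mapping source implements."""

    @abstractmethod
    def map_id(self, source_db, identifier, target_db):
        """Return the set of identifiers in target_db linked to the input."""


class TableMapper(IdMapper):
    """Mapping source backed by in-memory rows (stands in for a flat file)."""

    def __init__(self, rows):
        # rows: iterable of (source_db, source_id, target_db, target_id)
        self._table = {}
        for s_db, s_id, t_db, t_id in rows:
            self._table.setdefault((s_db, s_id, t_db), set()).add(t_id)

    def map_id(self, source_db, identifier, target_db):
        return set(self._table.get((source_db, identifier, target_db), set()))


class CombinedMapper(IdMapper):
    """Unions the results of several sources behind the same interface."""

    def __init__(self, *mappers):
        self._mappers = mappers

    def map_id(self, source_db, identifier, target_db):
        result = set()
        for mapper in self._mappers:
            result |= mapper.map_id(source_db, identifier, target_db)
        return result


# Made-up database names and identifiers, purely for illustration.
local = TableMapper([("GeneDB", "G1", "ProteinDB", "P1")])
remote = TableMapper([("GeneDB", "G1", "ProteinDB", "P2")])
combined = CombinedMapper(local, remote)
hits = combined.map_id("GeneDB", "G1", "ProteinDB")
```

Because a tool would only call `map_id` on the shared interface, a single source can be swapped for a combined one without changing the calling code.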

BridgeDb makes it easy to add an important capability to existing tools. BridgeDb has already been integrated into several popular bioinformatics applications, such as Cytoscape, WikiPathways, PathVisio, Vanted and Taverna. To encourage tool developers to start using BridgeDb, we've created code examples, online documentation, and a mailing list for questions.

We believe that, to meet the challenges that are encountered in bioinformatics today, the software development process should follow a few essential principles: user friendliness, code reuse, modularity and open source. BridgeDb adheres to these principles, and can serve as a useful model for others to follow. BridgeDb can function to increase user-friendliness of graphical applications. It re-uses work from other projects such as BioMart and MIRIAM. BridgeDb consists of several small modules, integrated through a common interface (API). Components of BridgeDb can be left out or replaced, for maximum flexibility. BridgeDb was open source from the very beginning of the project. The philosophy of open source is closely aligned to academic values, of building on top of the work of giants. 

The BridgeDb library is available at http://www.bridgedb.org.
A paper about BridgeDb was published in BMC Bioinformatics, 2010 Jan 4;11(1):5.

BridgeDb blog: http://www.helixsoft.nl/blog/?tag=bridgedb

Crossref

Nature Precedings

Answering biological questions: querying a systems biology database for nutrigenomics

Author: B Andreopoulos
B Ommen van
Chris T. Evelo
D Field
EH Baehrecke
Jahn-Takeshi Saito
Kees van Bochove
L Beltrame
MP Iersel van
NF Noy
P Rocca-Serra
T Kelder
TF Rayner
The Gene Ontology Consortium
TR Gruber
Publication venue: Springer-Verlag
Publication date: 01/01/2011

The requirement of systems biology for connecting different levels of biological research leads directly to a need for integrating vast amounts of diverse information in general and of omics data in particular. The nutritional phenotype database addresses this challenge for nutrigenomics. A particularly urgent objective in coping with the data avalanche is making biologically meaningful information accessible to the researcher. This contribution describes how we intend to meet this objective with the nutritional phenotype database. We outline relevant parts of the system architecture, describe the kinds of data managed by it, and show how the system can support retrieval of biologically meaningful information by means of ontologies, full-text queries, and structured queries. Our contribution points out critical issues and describes several technical hurdles. It demonstrates how pathway analysis can improve queries and comparisons for nutrition studies. Finally, three directions for future research are given.

Maastricht University Research Portal

Crossref

Springer - Publisher Connector

PubMed Central

Folding and unfolding phylogenetic trees and networks

Author: C Semple
D Gusfield
DH Huson
F Pardi
G Cardona
I Kanj
K Huber
Katharina T. Huber
L Iersel van
M Lott
Mike Steel
P Boldi
S Willson
T Marcussen
T Wu
Taoyang Wu
Vincent Moulton
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/06/2015

Phylogenetic networks are rooted, labelled directed acyclic graphs which are commonly used to represent reticulate evolution. There is a close relationship between phylogenetic networks and multi-labelled trees (MUL-trees). Indeed, any phylogenetic network N can be "unfolded" to obtain a MUL-tree U(N) and, conversely, a MUL-tree T can in certain circumstances be "folded" to obtain a phylogenetic network F(T) that exhibits T. In this paper, we study properties of the operations U and F in more detail. In particular, we introduce the class of stable networks, phylogenetic networks N for which F(U(N)) is isomorphic to N, characterise such networks, and show that they are related to the well-known class of tree-sibling networks. We also explore how the concept of displaying a tree in a network N can be related to displaying the tree in the MUL-tree U(N). To do this, we develop a phylogenetic analogue of graph fibrations. This allows us to view U(N) as the analogue of the universal cover of a digraph, and to establish a close connection between displaying trees in U(N) and reconciling phylogenetic trees with networks.
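The unfolding of a network into a MUL-tree can be illustrated with a short sketch. It rests on assumptions that are not taken from the abstract: the network is given as a dictionary from each vertex to its list of children, a leaf's vertex name doubles as its label, and vertices with exactly one child are suppressed in the result. The function names `unfold` and `leaf_labels` and the example network are hypothetical.

```python
def unfold(network, vertex):
    """Copy the part of the network reachable from `vertex` as a tree.

    A vertex reachable along several paths (a reticulation) is copied once
    per path, so its leaf labels can occur more than once in the result.
    """
    children = network.get(vertex, [])
    if not children:
        return vertex  # leaf: the vertex name is used as its label
    if len(children) == 1:
        # Assumption of this sketch: skip vertices with a single child.
        return unfold(network, children[0])
    return tuple(unfold(network, child) for child in children)


def leaf_labels(tree):
    """List the leaf labels of a nested-tuple tree, with repetitions."""
    if isinstance(tree, str):
        return [tree]
    return [label for subtree in tree for label in leaf_labels(subtree)]


# Hypothetical network: h is a reticulation with parents a and b.
network = {
    "r": ["a", "b"],
    "a": ["x", "h"],
    "b": ["h", "y"],
    "h": ["z"],
}
mul_tree = unfold(network, "r")
labels = sorted(leaf_labels(mul_tree))
```

In this example the leaf z sits below the reticulation h, so it appears twice in the unfolded tree, once under each parent of h.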

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

PubMed Central

University of East Anglia digital repository

Reconstructing phylogenetic level-1 networks from nondense binet and trinet sets

Author: AV Aho
B Holland
C Choy
C Semple
Celine Scornavacca
D Gusfield
D Huson
DH Huson
E Bapteste
F Pardi
G Cardona
H Poormohammadi
J Jansson
K Strimmer
Katharina T. Huber
KT Huber
Leo van Iersel
LJJ Iersel van
P Gambette
Taoyang Wu
Vincent Moulton
Y Yu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/11/2014

Binets and trinets are phylogenetic networks with two and three leaves, respectively. Here we consider the problem of deciding if there exists a binary level-1 phylogenetic network displaying a given set T of binary binets or trinets over a taxon set X, and constructing such a network whenever it exists. We show that this is NP-hard for trinets but polynomial-time solvable for binets. Moreover, we show that the problem is still polynomial-time solvable for inputs consisting of binets and trinets as long as the cycles in the trinets have size three. Finally, we present an O(3^{|X|} poly(|X|)) time algorithm for general sets of binets and trinets. The latter two algorithms generalise to instances containing level-1 networks with arbitrarily many leaves, and thus provide some of the first supernetwork algorithms for computing networks from a set of rooted phylogenetic networks.

arXiv.org e-Print Archive

CiteSeerX

Crossref

TU Delft Repository

Springer - Publisher Connector

INRIA a CCSD electronic archive server

HAL Descartes

HAL-IRD

University of East Anglia digital repository

HAL-CIRAD

Locating a Tree in a Phylogenetic Network in Quadratic Time

Author: BME Moret
G Cardona
IA Kanj
JM Chan
K McBreen
L Iersel van
L Nakhleh
L Parida
L Wang
P Jenkins
T Dagan
T Marcussen
TJ Treangen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/02/2015

A fundamental problem in the study of phylogenetic networks is to determine whether or not a given phylogenetic network contains a given phylogenetic tree. We develop a quadratic-time algorithm for this problem for binary nearly-stable phylogenetic networks. We also show that the number of reticulations in a reticulation-visible or nearly-stable phylogenetic network is bounded from above by a function linear in the number of taxa.

arXiv.org e-Print Archive

Crossref

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Mining Biological Pathways Using WikiPathways Web Services

Author: A Doerr
AL Tarca
Alexander R. Pico
AR Pico
Bruce R. Conklin
Chris Evelo
D Nam
JW Huss 3rd
K Tarassov
Kristina Hanspers
L Matthews
LD Stein
M Kanehisa
Martijn P. van Iersel
MP van Iersel
MS Cline
N Salomonis
O Keskin
P Fisher
P Shannon
SW Doniger
T Ideker
T Oinn
Thomas Kelder
Winston Hide
Y Li
Publication venue: Public Library of Science
Publication date: 01/01/2009

WikiPathways is a platform for creating, updating, and sharing biological pathways [1]. Pathways can be edited and downloaded using the wiki-style website. Here we present a SOAP web service that provides programmatic access to WikiPathways that is complementary to the website. We describe the functionality that this web service offers and discuss several use cases in detail. Exposing WikiPathways through a web service opens up new ways of utilizing pathway information and assisting the community curation process

CiteSeerX

Public Library of Science (PLOS)

Maastricht University Research Portal

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Bioinformatics for the NuGO proof of principle study: analysis of gene expression in muscle of ApoE3*Leiden mice on a high-fat diet using PathVisio

Author: AR Pico
B Efron
Chris T. A. Evelo
DE Kelley
EW Kraegen
JW Ryder
JY Kim
L Li
M Muurling
Marjan van Erk
Martijn P. van Iersel
ME Adriaens
MP Iersel van
PA Hansen
R Kleemann
R Vettor
Robert Kleemann
S Zadelaar
Susan L. M. Coort
Teake Kooistra
VK Randhawa
Y Kim
Z Jiang
Publication venue: Springer-Verlag
Publication date: 01/01/2008

Insulin resistance is a characteristic of type-2 diabetes and its development is associated with an increased fat consumption. Muscle is one of the tissues that becomes insulin resistant after high fat (HF) feeding. The aim of the present study is to identify processes involved in the development of HF-induced insulin resistance in muscle of ApoE3*Leiden mice by using microarrays. These mice are known to become insulin resistant on a HF diet. Differential gene expression was measured in muscle using the Affymetrix mouse plus 2.0 array. To get more insight into the processes affected, pathway analysis was performed with a new tool, PathVisio. PathVisio is a pathway editor customized with plug-ins (1) to visualize microarray data on pathways and (2) to perform statistical analysis to select pathways of interest. The present study demonstrated that with pathway analysis, using PathVisio, a large variety of processes can be investigated. The significantly regulated genes in muscle of ApoE3*Leiden mice after 12 weeks of HF feeding were involved in several biological pathways including fatty acid beta oxidation, fatty acid biosynthesis, insulin signaling, oxidative stress and inflammation.

Maastricht University Research Portal

Crossref

Springer - Publisher Connector

PubMed Central

The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services

Author: A Kasprzyk
Alexander R Pico
AR Pico
B Waegele
Bruce R Conklin
C Perez-Iratxeta
Chris T Evelo
DS Wishart
F Iragne
GF Berriz
Isaac Ho
Jianjiong Gao
KJ Bussey
Kristina Hanspers
Martijn P van Iersel
MP van Iersel
N Le Novere
N Salomonis
P Shannon
RG Cote
S Ahmed
Thomas Kelder
TJ Hubbard
W Huang da
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Many complementary solutions are available for the identifier mapping problem. This creates an opportunity for bioinformatics tool developers. Tools can be made to flexibly support multiple mapping services or mapping services could be combined to get broader coverage. This approach requires an interface layer between tools and mapping services. RESULTS: Here we present BridgeDb, a software framework for gene, protein and metabolite identifier mapping. This framework provides a standardized interface layer through which bioinformatics tools can be connected to different identifier mapping services. This approach makes it easier for tool developers to support identifier mapping. Mapping services can be combined or merged to support multi-omics experiments or to integrate custom microarray annotations. BridgeDb provides its own ready-to-go mapping services, both in webservice and local database forms. However, the framework is intended for customization and adaptation to any identifier mapping service. BridgeDb has already been integrated into several bioinformatics applications. CONCLUSION: By uncoupling bioinformatics tools from mapping services, BridgeDb improves capability and flexibility of those tools. All described software is open source and available at http://www.bridgedb.org

Maastricht University Research Portal

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California