144 research outputs found

    The Importance of Modularity in Bioinformatics Tools

    In the last decade the number of bioinformatics tools has increased enormously. There are tools to store, analyse, visualize, edit or generate biological data, and more are in development. Still, the demand for increased functionality in a single piece of software must be balanced against the need for modularity to keep the software maintainable. In complex systems, the conflicting demands of features and maintainability are often resolved by plug-in systems.

For example, Cytoscape, an open source platform for complex network analysis and visualization, uses a plug-in system to allow the application to be extended without changing the core. This not only allows new functionality to be integrated without a new release but also lets other developers contribute the plug-ins they need in their research.

Most tools have their own, individual plug-in system to meet the needs of the application. These are often very simple and easy to use. However, the increasing complexity of plug-ins demands more functionality of the plug-in system. We want to reuse components in different contexts, we want to have simple plug-in interfaces and we want to allow communication and dependencies between plug-ins. Many tools implemented in Java are facing these problems and there seems to be a common solution: the integration of an established modularity framework, like OSGi. To our knowledge, a number of developers of bioinformatics tools are already implementing, planning or thinking about the integration of OSGi into their applications, e.g. Cytoscape, Protege, PathVisio, ImageJ, Jalview or Chipster. The adoption of modularity frameworks in the development of bioinformatics applications is steadily increasing and should be considered in the design of new software.

By modularity in the traditional computer science sense, we mean the division of a software application into logical parts with separate concerns. To ease the development of software tools, the application is separated into smaller logical parts, which are implemented individually. A set of modules can form a larger application, but only if a proper glue is used; OSGi is an example of such a glue. OSGi makes it possible to build an infrastructure into an application for adding and using different modules. It provides mechanisms that allow the individual modules to rely on and interact with each other, opening the possibility of combining different modules to solve the problem at hand. Later, modules can be removed and new ones can be added to tackle another problem. As Katy Boerner writes in her article 'Plug-and-Play Macroscopes', we should 'implement software frameworks that empower domain scientists to assemble their own continuously evolving macroscopes, adding and upgrading existing (and removing obsolete) plug-ins to arrive at a set that is truly relevant for their work'.
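
The idea of a glue that lets modules be added, removed and made to depend on each other can be illustrated with a minimal sketch. It is written in Python for brevity and is not the OSGi API: the `ModuleRegistry` class, the service names `import.edges` and `analysis.degree`, and both example modules are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List


class ModuleRegistry:
    """Toy service registry: the 'glue' through which modules find each other."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, service: Callable[..., object]) -> None:
        # Add (or replace) a module's service under a well-known name.
        self._services[name] = service

    def unregister(self, name: str) -> None:
        # Remove a module; other modules notice only when they look it up.
        self._services.pop(name, None)

    def lookup(self, name: str) -> Callable[..., object]:
        if name not in self._services:
            raise LookupError(f"no module provides service '{name}'")
        return self._services[name]

    def available(self) -> List[str]:
        return sorted(self._services)


def parse_edge_list(text: str) -> List[tuple]:
    """Hypothetical importer module: one 'source target' pair per line."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


def make_degree_counter(registry: ModuleRegistry) -> Callable[[str], Dict[str, int]]:
    """Hypothetical analysis module that depends on the 'import.edges' service."""

    def count_degrees(text: str) -> Dict[str, int]:
        # The dependency is resolved at call time, so the importer can be
        # swapped or removed without touching this module's code.
        edges = registry.lookup("import.edges")(text)
        degrees: Dict[str, int] = {}
        for source, target in edges:
            degrees[source] = degrees.get(source, 0) + 1
            degrees[target] = degrees.get(target, 0) + 1
        return degrees

    return count_degrees


registry = ModuleRegistry()
registry.register("import.edges", parse_edge_list)
registry.register("analysis.degree", make_degree_counter(registry))
degrees = registry.lookup("analysis.degree")("A B\nB C\n")
```

In this sketch the analysis module never imports the parser directly; it only asks the registry for a named service. A real modularity framework adds much more on top of this, such as versioning and life-cycle management, which the sketch deliberately leaves out.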

Some of these modules will be specific to one application, but many of them can be reused by other tools. We are talking about general features like the import or export of different file formats, a layout algorithm that could be used by several visualization tools, or a lookup in an external online database. Why should every tool implement its own parser or algorithm? Modularity can help to share functionality. There is no need to start from scratch and implement everything anew, so developers can focus on new and important features.

Adding modularity, or better, a modularity framework, to an existing software application is not a trivial task. The developers of Cytoscape are currently undertaking this challenge with the coming version 3. We are also working on the integration of OSGi into our pathway visualization tool PathVisio, and we now want to share and compare our experiences so that others can benefit from our discoveries. This will help them not only in deciding whether OSGi is a suitable solution for them but also in the integration process itself.

    eXamine: a Cytoscape app for exploring annotated modules in networks

    Background. Biological networks have growing importance for the interpretation of high-throughput "omics" data. Statistical and combinatorial methods make it possible to obtain mechanistic insights through the extraction of smaller subnetwork modules. Further enrichment analyses provide set-based annotations of these modules. Results. We present eXamine, a set-oriented visual analysis approach for annotated modules that displays set membership as contours on top of a node-link layout. Our approach extends Self Organizing Maps to simultaneously lay out nodes, links, and set contours. Conclusions. We implemented eXamine as a freely available Cytoscape app. Using eXamine we study a module that is activated by the virally-encoded G-protein coupled receptor US28 and formulate a novel hypothesis about its functioning.

    Tools for visualization and analysis of molecular networks, pathways, and -omics data.

    Biological pathways have become the standard way to represent the coordinated reactions and actions of a series of molecules in a cell. A series of interconnected pathways is referred to as a biological network, which denotes a more holistic view of the entanglement of cellular reactions. Biological pathways and networks are not only an appropriate way to visualize molecular reactions; they have also become a leading method in -omics data analysis and visualization. Here, we review a set of pathway and network visualization and analysis methods and take a look at potential future developments in the field.

    Bioinformatics for the NuGO proof of principle study: analysis of gene expression in muscle of ApoE3*Leiden mice on a high-fat diet using PathVisio

    Insulin resistance is a characteristic of type-2 diabetes and its development is associated with increased fat consumption. Muscle is one of the tissues that become insulin resistant after high fat (HF) feeding. The aim of the present study is to identify processes involved in the development of HF-induced insulin resistance in muscle of ApoE3*Leiden mice by using microarrays. These mice are known to become insulin resistant on a HF diet. Differential gene expression was measured in muscle using the Affymetrix mouse plus 2.0 array. To get more insight into the affected processes, pathway analysis was performed with a new tool, PathVisio. PathVisio is a pathway editor customized with plug-ins (1) to visualize microarray data on pathways and (2) to perform statistical analysis to select pathways of interest. The present study demonstrated that with pathway analysis, using PathVisio, a large variety of processes can be investigated. The significantly regulated genes in muscle of ApoE3*Leiden mice after 12 weeks of HF feeding were involved in several biological pathways including fatty acid beta oxidation, fatty acid biosynthesis, insulin signaling, oxidative stress and inflammation.

    Using a data triangle to understand molecular nutrition

    Until recently nutrigenomics was mainly about transcriptomics-related data. That already confronted us with overwhelming analytical problems. We learned to mathematically and statistically treat genome wide expression studies and studies directed to gene expression regulation. Nutrigenomics researchers had to become bilingual, speaking English and R [1], and learned to think about co-expression, clusters and false discovery rates. The latter in fact proved to be a trap. Removing all the false positives made us lose the information we were really interested in. To understand the results of our genomics experiments we often had to confront what we were measuring with what we already knew. After all, false positives are not likely to all be related to the same meaningful biological process. That asked for the development of new analytical tools like Cytoscape for network analysis and PathVisio for pathway analysis. More importantly, we had to structure what we know. Text mining and data mining helped us to do that, but what was really needed was mobilization of all the knowledge that is present in the heads of the scientific community. WikiPathways was our contribution to the rapidly emerging field of community curation. Thus we started to become able to integrate different types of technologies that span the full gene expression pipeline and to understand that in the biological context.
Today the story repeats itself. Genome wide genetics is becoming real. We can do Genome Wide Association Studies and soon we can sequence individual genomes in relation to food intake and phenotypic responses. And then what? How can we deal with that new avalanche of data? The oversampling problems will be a few orders of magnitude larger; after all there can be hundreds of SNPs in every gene. There will just be too many to understand which SNPs are important from the data alone. We will again have to relate them to the biological processes. But is that enough? I think not. We will only understand the outcome of those large scale genetics studies if we not only attribute the SNPs to genes and thereby to pathways. We will also have to consider the actual sequences and see what the functional effect is that the SNP causes. Is it likely to influence transcription factor binding, miRNA effects, or protein-protein interactions? This calls for new types of data integration, for which we already have the tools. And it calls for new creative ways to do that. What we really need is teams of creative minds. Some new initiatives seem to show that these are already being formed.

1: http://www.r-project.org 

    Answering biological questions: querying a systems biology database for nutrigenomics

    The requirement of systems biology for connecting different levels of biological research leads directly to a need for integrating vast amounts of diverse information in general and of omics data in particular. The nutritional phenotype database addresses this challenge for nutrigenomics. A particularly urgent objective in coping with the data avalanche is making biologically meaningful information accessible to the researcher. This contribution describes how we intend to meet this objective with the nutritional phenotype database. We outline relevant parts of the system architecture, describe the kinds of data managed by it, and show how the system can support retrieval of biologically meaningful information by means of ontologies, full-text queries, and structured queries. Our contribution points out critical points and describes several technical hurdles. It demonstrates how pathway analysis can improve queries and comparisons for nutrition studies. Finally, three directions for future research are given.

    Data integration with biological pathways

    Biological experiments generate large amounts of data, but unfortunately these are not always optimally used. That is why BiGCaT, the Bio-informatics department of UM, has developed new software in collaboration with the Gladstone institute in San Francisco. This new software can link these data to dozens of online databases. Moreover, the data are attractively presented on illustrations that represent the processes in the cell, the so-called biological pathways. These illustrations are made by means of a specially developed wiki. With our software, two earlier studies into long-term food shortage were joined together. This reanalysis has led to new insights, without the need for an expensive experiment. The results will contribute to better treatment of patients who have problems absorbing food due to illness.

    Biotransformation pathway maps in WikiPathways enable direct visualization of drug metabolism related expression changes.

    In recent decades, our knowledge of the genetics and functional genomics of drug-metabolizing enzymes has increased and a wealth of data on drug-related 'omics' has become available. Despite the availability of large amounts of biological information on xenobiotic biotransformation, the number of available biotransformation pathway maps that can easily be used for visualization of multiple omics data is limited. Here, we created integrated biotransformation pathway maps suitable for multiple omics analysis using PathVisio. The ease of visualizing data on these maps was demonstrated using published microarray data from human hepatocyte-like cell models, exemplifying how the biotransformation pathway maps can be used for model selection where a sufficient capacity for metabolizing chemicals is a prerequisite for a suitable model.

    WikiPathways: building research communities on biological pathways.

    Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further