221 research outputs found

    Using a data triangle to understand molecular nutrition

    Until recently, nutrigenomics was mainly about transcriptomics-related data. That already confronted us with overwhelming analytical problems. We learned to treat genome-wide expression studies, and studies directed at gene expression regulation, mathematically and statistically. Nutrigenomics researchers had to become bilingual, speaking English and R [1], and learned to think about co-expression, clusters and false discovery rates. The latter in fact proved to be a trap. Removing all the false positives made us lose the information we were really interested in. To understand the results of our genomics experiments we often had to confront what we were measuring with what we already knew. After all, false positives are not likely to all be related to the same meaningful biological process. That called for the development of new analytical tools like Cytoscape for network analysis and PathVisio for pathway analysis. More importantly, we had to structure what we know. Text mining and data mining helped us to do that, but what was really needed was mobilization of all the knowledge that is present in the heads of the scientific community. WikiPathways was our contribution to the rapidly emerging field of community curation. Thus we became able to integrate different types of technologies that span the full gene expression pipeline and to understand the results in their biological context.
Today the story repeats itself. Genome-wide genetics is becoming real. We can do genome-wide association studies, and soon we will be able to sequence individual genomes in relation to food intake and phenotypic responses. And then what? How can we deal with that new avalanche of data? The oversampling problems will be a few orders of magnitude larger; after all, there can be hundreds of SNPs in every gene. There will simply be too many to tell from the data alone which SNPs are important. We will again have to relate them to the biological processes. But is that enough? I think not. We will only understand the outcome of those large-scale genetics studies if we do more than attribute the SNPs to genes and thereby to pathways: we will also have to consider the actual sequences and determine the functional effect that each SNP causes. Is it likely to influence transcription factor binding, miRNA effects, or protein-protein interactions? This calls for new types of data integration, for which we already have the tools. And it calls for new, creative ways to do that. What we really need is teams of creative minds. Some new initiatives seem to show that these are already being formed.

1: http://www.r-project.org 

    Exposing WikiPathways as Linked Open Data

    Biology has become a data-intensive science. Discovery of new biological facts increasingly relies on the ability to find and match appropriate biological data, for instance for functional annotation of genes of interest or for identification of pathways affected by over-expressed genes. Functional and pathway information about genes and proteins is typically distributed over a variety of databases and the literature.

Pathways are a convenient, easy-to-interpret way to describe known biological interactions. WikiPathways provides community-curated pathways. WikiPathways users integrate their knowledge with facts from the literature and biological databases. The curated pathway is then reviewed and possibly corrected or enriched. Different tools (e.g. PathVisio and Cytoscape) support the integration of WikiPathways knowledge for additional tasks, such as the integration with personal data sets.

Data from WikiPathways is increasingly also used for advanced analysis where it is integrated or compared with other data. Currently, integration with data from different biological sources is mostly done manually. This can be a very time-consuming task because the curator often first needs to find the available resources, needs to learn about their specific content and qualities, and often spends a lot of time technically combining the two.

Semantic Web and Linked Data technologies eliminate the barriers between database silos by relying on a set of standards and best practices for representing and describing data. The architecture of the Semantic Web relies on the architecture of the web itself for integrating and mapping uniform resource identifiers (URIs), coupled with basic inference mechanisms to enable matching concepts and properties across data sources. Semantic Web and Linked Data technologies are increasingly being successfully applied as integration engines for linking biological elements.

Exposing WikiPathways content as Linked Open Data to the Semantic Web enables rapid, semi-automated integration with the growing number of biological resources available from the Linked Open Data cloud. It also allows fast queries of WikiPathways itself.

We have harmonised WikiPathways content according to a selected set of vocabularies (BioPAX, ChEMBL, etc.), common to resources already available as Linked Open Data.
WikiPathways content is now available as Linked Open Data for dynamic querying through a SPARQL endpoint: http://semantics.bigcat.unimaas.nl:8000/sparql
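
A query against such an endpoint could look like the following minimal sketch. It uses only the Python standard library and follows the generic SPARQL protocol (a `query` parameter on a GET request, JSON results). The `wp:` prefix and the use of `dc:title` are assumptions made for illustration and should be checked against the vocabulary the endpoint actually serves; the endpoint address is the one quoted in the abstract and may no longer be reachable. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Endpoint quoted in the abstract; it may no longer be reachable.
ENDPOINT = "http://semantics.bigcat.unimaas.nl:8000/sparql"

# Assumed vocabulary: the wp: prefix and dc:title are illustrative only.
QUERY = """
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?pathway ?title
WHERE {
  ?pathway a wp:Pathway ;
           dc:title ?title .
}
LIMIT 5
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON result document into plain dictionaries."""
    document = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in document["results"]["bindings"]
    ]
```

To run the query, pass `build_request(ENDPOINT, QUERY)` to `urllib.request.urlopen` and hand the decoded response body to `parse_bindings`. Keeping request construction and result parsing separate from the network call makes both parts testable offline.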

    The Importance of Modularity in Bioinformatics Tools

    In the last decade the number of bioinformatics tools has increased enormously. There are tools to store, analyse, visualize, edit or generate biological data, and more are still in development. However, the demand for increased functionality in a single piece of software must be balanced against the need for modularity to keep the software maintainable. In complex systems, the conflicting demands of features and maintainability are often resolved by plug-in systems.

For example Cytoscape, an open source platform for Complex-Network Analysis and Visualization, is using a plug-in system to allow the extension of the application without changing the core. This not only allows the integration of new functionality without a new release but offers the possibility for other developers to contribute plug-ins which are needed in their research.

Most tools have their own, individual plug-in system to meet the needs of the application. These are often very simple and easy to use. However, the increasing complexity of plug-ins demands more functionality of the plug-in system. We want to reuse components in different contexts, we want to have simple plug-in interfaces and we want to allow communication and dependencies between plug-ins. Many tools implemented in Java are facing these problems and there seems to be a common solution: the integration of an established modularity framework, like OSGi. To our knowledge, a number of developers of bioinformatics tools are already implementing, planning or thinking about the integration of OSGi into their applications, e.g. Cytoscape, Protege, PathVisio, ImageJ, Jalview or Chipster. The adoption of modularity frameworks in the development of bioinformatics applications is steadily increasing and should be considered in the design of new software.

By modularity in the traditional computer science sense, we mean the division of a software application into logical parts with separate concerns. To ease the development of software tools, the application is separated into smaller logical parts, which are implemented individually. A set of modules can form a larger application, but only if a proper glue is used; OSGi is an example of such a glue. OSGi makes it possible to build an infrastructure into an application for adding and using different modules. It provides mechanisms that allow the individual modules to rely on and interact with each other, opening the possibility of putting together different modules to solve the problem at hand. Later, modules can be removed and new ones can be added to tackle another problem. As Katy Boerner writes in her article 'Plug-and-Play Macroscopes', we should 'implement software frameworks that empower domain scientists to assemble their own continuously evolving macroscopes, adding and upgrading existing (and removing obsolete) plug-ins to arrive at a set that is truly relevant for their work'.
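
The idea of a glue layer that lets modules depend on each other, and be added or removed later, can be illustrated with a toy registry. This is a sketch in Python and is not OSGi: the class `ModuleRegistry`, its methods, and the module names `id-lookup` and `importer` are all hypothetical, and a real framework additionally handles versioning, class loading and dynamic service lookup.

```python
from typing import Callable, Dict, List, Sequence, Set

Services = Dict[str, object]


class ModuleRegistry:
    """Toy stand-in for a modularity framework (this is not OSGi)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[Services], object]] = {}
        self._requires: Dict[str, List[str]] = {}

    def register(self, name: str, factory: Callable[[Services], object],
                 requires: Sequence[str] = ()) -> None:
        """Add a module together with the names of the modules it relies on."""
        self._factories[name] = factory
        self._requires[name] = list(requires)

    def remove(self, name: str) -> None:
        """Remove a module, refusing if another module still depends on it."""
        dependants = [m for m, deps in self._requires.items() if name in deps]
        if dependants:
            raise ValueError(f"{name} is still required by {dependants}")
        del self._factories[name]
        del self._requires[name]

    def start_order(self) -> List[str]:
        """Return module names so that dependencies precede dependants."""
        order: List[str] = []
        done: Set[str] = set()
        visiting: Set[str] = set()

        def visit(name: str) -> None:
            if name in done:
                return
            if name in visiting:
                raise ValueError(f"circular dependency at {name}")
            if name not in self._factories:
                raise KeyError(f"missing module: {name}")
            visiting.add(name)
            for dependency in self._requires[name]:
                visit(dependency)
            visiting.discard(name)
            done.add(name)
            order.append(name)

        for name in sorted(self._factories):
            visit(name)
        return order

    def start(self) -> Services:
        """Instantiate every module, handing it the services started so far."""
        services: Services = {}
        for name in self.start_order():
            services[name] = self._factories[name](services)
        return services


# Two hypothetical modules: a lookup table and an importer that uses it.
registry = ModuleRegistry()
registry.register("id-lookup", lambda services: {"TP53": "7157"}.get)
registry.register(
    "importer",
    lambda services: lambda text: [services["id-lookup"](t) for t in text.split()],
    requires=["id-lookup"],
)
services = registry.start()
```

The importer never constructs its own lookup; it receives whatever module is registered under that name, which is what allows a component to be swapped or reused in another tool.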

Some of these modules are going to be specific for one application but a lot of these modules can actually be reused by other tools. We are talking about general features like the import or export of different file formats, a layout algorithm that could be used by several visualization tools or the lookup in an external online database. Why should every tool implement its own parser or algorithm? Modularity can help to share functionality. There is no need to start from scratch and implement everything anew, thus developers can focus on new and important features.

Adding modularity, or better, a modularity framework, to an existing software application is not a trivial task. The developers of Cytoscape are currently undertaking this challenge with the coming version 3. We are also working on the integration of OSGi into our pathway visualization tool PathVisio, and we now want to share and compare our experiences so that others can benefit from our discoveries. This will help them not only in deciding whether OSGi is a suitable solution for them but also in the integration process itself.

    Bilberries potentially alleviate stress-related retinal gene expression induced by a high-fat diet in mice

    PURPOSE: Obesity- and diabetes-associated visual impairment and vascular dysfunctions are increasing as causes of vision loss. The detailed mechanisms of how obesity and diabetes affect eye health are still largely unknown, but animal models have been useful in exploring the effects of potential protective compounds, i.e., compounds characterized by antioxidant and anti-inflammatory properties. These properties occur in anthocyanins, and bilberries (European wild blueberries, Vaccinium myrtillus) are a major source of dietary anthocyanins in Nordic diets. The main aim of the present work was to study the protective effects of dietary bilberries (BB) on the level of gene expression in retinas in mice that develop obesity when fed a high-fat diet (HFD). METHODS: Mice (n=6 per group, four groups) were fed ad libitum a normal control diet (NCD), a HFD, or a diet with 5% bilberries (NCD+BB, HFD+BB) for 12 weeks. Food consumption, weight gain, and blood pressure were measured during the feeding period and whole blood serum markers of obesity at sacrifice. Retinas were collected, RNA was extracted from all 24 mice, and pooled samples from four mice per group were hybridized to Mouse-Ref8 V2 Expression BeadChips (Illumina platform) with 25,697 probes for genes and transcript variants. The expression profiles in the retinas were analyzed using R, PathVisio, and DAVID to screen for high fat–induced changes as well as for bilberry-induced changes in the HFD up- or downregulated transcripts. RESULTS: The HFD and HFD+BB groups showed increased weight gain from week 5, as well as increased final weight, blood glucose, serum free fatty acids, and systolic blood pressure, as compared to mice fed the control diets (Mann–Whitney’s U-test, p<0.05). Bilberries had no significant effect on these parameters other than a trend to reduce systolic blood pressure in the HFD-fed mice (101±4 versus 113±9 mmHg, p=0.10).
Gene ontology enrichment analysis of 810 differentially expressed genes (F-test, p<0.05) in the retina displayed differential regulation of genes in ontology groups, mainly pathways for apoptosis, inflammation, and oxidative stress, especially systemic lupus erythematosus, mitogen-activated protein kinase, and glutathione metabolism. Mice fed a HFD had increased retinal gene expression of several crystallins, while the HFD+BB mice showed potential downregulation of these crystallins when compared to the HFD mice. Bilberries also reduced the expression of genes in the mitogen-activated protein kinase (MAPK) pathway and increased those in the glutathione metabolism pathway. CONCLUSIONS: HFD feeding induces differential expression of several stress-related genes in the mouse retina. Despite minor effects in the phenotype, a diet rich in bilberries mitigates the upregulation of crystallins otherwise induced by HFD. Thus, the early stages of obesity-associated and stress-related gene expression changes in the retina may be prevented with bilberries in the diet.

    WikiPathways: building research communities on biological pathways.

    Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further.

    BridgeDb: standardized access to gene, protein and metabolite identifier mapping services

    Many interesting problems in bioinformatics require integration of data from various sources, for example when combining microarray data with a pathway database, or merging co-citation networks with protein-protein interaction networks. Invariably this leads to an identifier mapping problem, where different datasets are annotated with identifiers that are related, but originate from different databases.

Solutions for the identifier mapping problem exist, such as BioMart, Synergizer, Cronos, PICR, HMS and many more. This creates an opportunity for bioinformatics tool developers. Tools can be made to flexibly support multiple mapping services, or mapping services could be combined to get broader coverage. This approach requires an interface layer between tools and mapping services. BridgeDb provides such an interface layer, in the form of both a Java and a REST API.

Because of the standardized interface layer, BridgeDb is not tied to a specific source of mapping information. You can switch easily between flat files, relational databases and several different web services. Mapping services can be combined to support multi-omics experiments or to integrate custom microarray annotations. BridgeDb isn't just yet another mapping service: it tries to build further on existing work and to integrate multiple partial solutions. The framework is intended for customization and adaptation to any identifier mapping service.

BridgeDb makes it easy to add an important capability to existing tools. BridgeDb has already been integrated into several popular bioinformatics applications, such as Cytoscape, WikiPathways, PathVisio, Vanted and Taverna. To encourage tool developers to start using BridgeDb, we've created code examples, online documentation, and a mailing list for questions.

We believe that, to meet the challenges that are encountered in bioinformatics today, the software development process should follow a few essential principles: user friendliness, code reuse, modularity and open source. BridgeDb adheres to these principles, and can serve as a useful model for others to follow. BridgeDb can help to increase the user-friendliness of graphical applications. It re-uses work from other projects such as BioMart and MIRIAM. BridgeDb consists of several small modules, integrated through a common interface (API). Components of BridgeDb can be left out or replaced, for maximum flexibility. BridgeDb was open source from the very beginning of the project. The philosophy of open source is closely aligned with the academic value of building on top of the work of giants.

The BridgeDb library is available at http://www.bridgedb.org.
A paper about BridgeDb was published in BMC Bioinformatics, 2010 Jan 4;11(1):5.

BridgeDb blog: http://www.helixsoft.nl/blog/?tag=bridgedb
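
The notion of an interface layer between tools and mapping services can be sketched as follows. This is a Python illustration of the design idea only and does not reproduce BridgeDb's Java or REST API: the names `IdMapper`, `TableMapper` and `MapperStack` are hypothetical, and the small in-memory tables stand in for flat files, databases or web services. The identifiers used are the Ensembl, Entrez Gene and UniProt entries for human TP53.

```python
from typing import Dict, Iterable, Protocol, Set, Tuple

Xref = Tuple[str, str]  # (data source name, identifier)


class IdMapper(Protocol):
    """What a tool needs from any mapping service, whatever its backend."""

    def map_id(self, xref: Xref, target: str) -> Set[Xref]:
        ...


class TableMapper:
    """Backend that answers from an in-memory table (stands in for a flat file)."""

    def __init__(self, rows: Iterable[Tuple[Xref, Xref]]) -> None:
        self._links: Dict[Xref, Set[Xref]] = {}
        for left, right in rows:
            # Store each link in both directions so lookups are symmetric.
            self._links.setdefault(left, set()).add(right)
            self._links.setdefault(right, set()).add(left)

    def map_id(self, xref: Xref, target: str) -> Set[Xref]:
        return {x for x in self._links.get(xref, set()) if x[0] == target}


class MapperStack:
    """Combine several backends behind the same interface for broader coverage."""

    def __init__(self, mappers: Iterable[IdMapper]) -> None:
        self._mappers = list(mappers)

    def map_id(self, xref: Xref, target: str) -> Set[Xref]:
        found: Set[Xref] = set()
        for mapper in self._mappers:
            found |= mapper.map_id(xref, target)
        return found


# Two partial sources, each covering a different target database.
genes = TableMapper([(("Ensembl", "ENSG00000141510"), ("Entrez Gene", "7157"))])
proteins = TableMapper([(("Ensembl", "ENSG00000141510"), ("UniProt", "P04637"))])
stack = MapperStack([genes, proteins])
```

A tool written against `IdMapper` works unchanged whether it is handed a single backend or the combined `stack`, which is the property the abstract attributes to a standardized interface layer.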

    CyTargetLinker app update: A flexible solution for network extension in Cytoscape

    Here, we present an update of the open-source CyTargetLinker app for Cytoscape (http://apps.cytoscape.org/apps/cytargetlinker) that introduces new automation features. CyTargetLinker provides a simple interface to extend networks with links to relevant data and/or knowledge extracted from so-called linksets. The linksets are provided on the CyTargetLinker website (https://cytargetlinker.github.io/) or can be custom-made for specific use cases. The new automation feature enables users to programmatically execute the app's functionality in Cytoscape (command line tool) and with external tools (e.g. R, Jupyter, Python, etc.). This allows users to share their analysis workflows and therefore increase repeatability and reproducibility. Three use cases demonstrate automated workflows, combinations with other Cytoscape apps and core Cytoscape functionality. We first extend a protein-protein interaction network, created with the stringApp, with compound-target interactions and disease-gene annotations. In the second use case, we created a workflow to load differentially expressed genes from an experimental dataset and extend it with gene-pathway associations. Lastly, we chose an example outside the biological domain and used CyTargetLinker to create an author-article-journal network for the five authors of this manuscript using a two-step extension mechanism. With 400 downloads per month in the last year and nearly 20,000 downloads in total, CyTargetLinker shows its adoption and relevance in the field of network biology. As of August 2019, the original publication had been cited in 83 articles, demonstrating its applicability in biomedical research.
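
The extension idea described above, including the two-step author-article-journal case, can be sketched in plain Python. This does not call CyTargetLinker or Cytoscape: the function `extend_network` and the edge labels are hypothetical, and the app itself offers further options (such as the direction of extension) that are not modelled here.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, target, interaction type)


def extend_network(nodes: Set[str], edges: List[Edge],
                   linkset: Iterable[Edge]) -> Tuple[Set[str], List[Edge]]:
    """Add every linkset edge that touches a node already in the network.

    Membership is checked against the nodes present before this step, so a
    single call extends the network by exactly one step.
    """
    new_nodes = set(nodes)
    new_edges = list(edges)
    for source, target, kind in linkset:
        if source in nodes or target in nodes:
            new_nodes.update((source, target))
            if (source, target, kind) not in new_edges:
                new_edges.append((source, target, kind))
    return new_nodes, new_edges


# Hypothetical linksets for a two-step extension.
author_article = [
    ("author A", "article 1", "wrote"),
    ("author B", "article 2", "wrote"),
]
article_journal = [
    ("article 1", "journal X", "published in"),
    ("article 2", "journal Y", "published in"),
]

step1_nodes, step1_edges = extend_network({"author A"}, [], author_article)
step2_nodes, step2_edges = extend_network(step1_nodes, step1_edges, article_journal)
```

Starting from a single author, the first call pulls in that author's article and the second call pulls in the journal, while entries unrelated to the starting network are left out.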

    An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions

    Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work points out the need for extensive bioinformatics analysis for array quality assessment before and after gene expression clustering and pathway analysis. A study focused on the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point-by-point standard analysis procedure, based on a combination of clustering and data visualization, for the analysis of microarray data.

    Beyond Pathway Analysis: Identification of Active Subnetworks in Rett Syndrome

    Pathway and network approaches are valuable tools in the analysis and interpretation of large, complex omics data. Even in the field of rare diseases, like Rett syndrome, omics data are available, and the maximum use of such data requires sophisticated tools for comprehensive analysis and visualization of the results. Pathway analysis with differential gene expression data has proven to be extremely successful in identifying affected processes in disease conditions. In this type of analysis, pathways from different databases like WikiPathways and Reactome are used as separate, independent entities. Here, we show for the first time how these pathway models can be used and integrated into one large network using the WikiPathways RDF containing all human WikiPathways and Reactome pathways, to perform network analysis on transcriptomics data. This network was imported into the network analysis tool Cytoscape to perform active submodule analysis. Using a publicly available Rett syndrome gene expression dataset from frontal and temporal cortex, classical enrichment analysis, including pathway and Gene Ontology analysis, revealed mainly immune response, neuron-specific and extracellular matrix processes. Our active module analysis provided a valuable extension of the analysis, prominently showing the regulatory mechanism of MECP2, especially on DNA maintenance, cell cycle, transcription, and translation. In conclusion, using pathway models for classical enrichment and more advanced network analysis enables a more comprehensive analysis of gene expression data and provides novel results.

    Introducing WikiPathways as a Data-Source to Support Adverse Outcome Pathways for Regulatory Risk Assessment of Chemicals and Nanomaterials

    A paradigm shift is taking place in risk assessment to replace animal models, reduce the economic resources required, and refine the methodologies to test the growing number of chemicals and nanomaterials. Therefore, approaches such as transcriptomics, proteomics, and metabolomics have become valuable tools in toxicological research, and are finding their way into regulatory toxicity. One promising framework to bridge the gap between the molecular-level measurements and risk assessment is the concept of adverse outcome pathways (AOPs). These pathways comprise mechanistic knowledge and connect biological events from a molecular level toward an adverse effect outcome after exposure to a chemical. However, the implementation of omics-based approaches in the AOPs and their acceptance by the risk assessment community is still a challenge. Because the existing modules in the main repository for AOPs, the AOP Knowledge Base (AOP-KB), do not currently allow the integration of omics technologies, additional tools are required for omics-based data analysis and visualization. Here we show how WikiPathways can serve as a supportive tool to make omics data interoperable with the AOP-Wiki, part of the AOP-KB. Manual matching of key events (KEs) indicated that 67% could be linked with molecular pathways. Automatic connection through linkage of identifiers between the databases showed that only 30% of AOP-Wiki chemicals were found on WikiPathways. Looser linkage through gene names in KE and Key Event Relationship descriptions gave an overlap of 70% and 71%, respectively. This shows many opportunities to create more direct connections, for example with extended ontology annotations, improving interoperability. This interoperability allows the needed integration of omics data linked to the molecular pathways with AOPs.
A new AOP Portal on WikiPathways is presented to allow the community of AOP developers to collaborate and populate the molecular pathways that underlie the KEs of AOP-Wiki. We conclude that the integration of WikiPathways and AOP-Wiki will improve risk assessment because omics data will be linked directly to KEs and therefore allow the comprehensive understanding and description of AOPs. To make this assessment reproducible and valid, major changes are needed in both WikiPathways and AOP-Wiki.