    Service-based analysis of biological pathways

    Background: Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled by leveraging a Web service-based modeling and mining approach. Results: Inspired by molecular recognition and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and service-based simulation that verifies the user's hypotheses. Conclusion: Web service modeling of biological processes allows easy access to and invocation of these processes on the Web. The Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. The algorithms presented in this paper for automatically highlighting interesting subgraphs within an identified pathway network enable the user to formulate hypotheses, which can be tested using our simulation algorithm, also described in this paper.
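
The idea of linking process service models into candidate pathways can be illustrated with a minimal sketch. It is not PathExplorer's algorithm: the service names, the entity names, and the rule "link A to B when an output of A is an input of B" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical process models: name -> (input entities, output entities).
SERVICES = {
    "synthesis":  ({"substrate"}, {"enzyme"}),
    "activation": ({"enzyme"}, {"active_enzyme"}),
    "signaling":  ({"active_enzyme"}, {"response"}),
    "transport":  ({"substrate"}, {"stored_substrate"}),
}

def link_services(services):
    """Add an edge A -> B when some output of A is an input of B."""
    graph = defaultdict(list)
    for a, (_, outputs) in services.items():
        for b, (inputs, _) in services.items():
            if a != b and outputs & inputs:
                graph[a].append(b)
    return graph

def find_pathways(graph, start, goal, path=None):
    """Enumerate simple paths (no repeated service) from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in path:
            found.extend(find_pathways(graph, nxt, goal, path))
    return found

graph = link_services(SERVICES)
pathways = find_pathways(graph, "synthesis", "signaling")
```

Here `pathways` holds every chain of services leading from `synthesis` to `signaling`; a real tool would additionally rank such chains and verify them by simulation, as the abstract describes.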

    CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations

    Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms, and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive and integrative multi-omics-based analysis. We aimed to address this issue by developing a new data integration/representation methodology and its application by constructing a biological data resource. CROssBAR is a comprehensive system that integrates large-scale biological/biomedical data from various resources and stores them in a NoSQL database. CROssBAR is enriched with the deep-learning-based prediction of relationships between numerous data entries, which is followed by the rigorous analysis of the enriched data to obtain biologically meaningful modules. These complex sets of entities and relationships are displayed to users via easy-to-interpret, interactive knowledge graphs within an open-access service. CROssBAR knowledge graphs incorporate relevant genes-proteins, molecular interactions, pathways, phenotypes, diseases, as well as known/predicted drugs and bioactive compounds, and they are constructed on-the-fly based on simple non-programmatic user queries. These intensely processed heterogeneous networks are expected to aid systems-level research, especially to infer biological mechanisms in relation to genes, proteins, their ligands, and diseases.

    BioNessie - a grid enabled biochemical networks simulation environment

    The simulation of biochemical networks provides insight and understanding about the underlying biochemical processes and pathways used by cells and organisms. BioNessie is a biochemical network simulator which has been developed at the University of Glasgow. This paper describes the simulator and focuses in particular on how it has been extended to benefit from a wide variety of high-performance compute resources across the UK through Grid technologies to support larger-scale simulations.

    WikiPathways: building research communities on biological pathways

    Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further.

    Spectral analysis of gene expression profiles using gene networks

    Microarrays have become extremely useful for analysing genetic phenomena, but establishing a relation between microarray analysis results (typically a list of genes) and their biological significance is often difficult. Currently, the standard approach is to map the results a posteriori onto gene networks to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help in the statistical analysis of gene expression data and in their biological interpretation. Here we propose a method to integrate a priori the knowledge of a gene network in the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised classification algorithms of expression profiles, resulting in classifiers with biological relevance. We applied the method to the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains. It performed at least as well as the usual classification but provided much more biologically relevant results and allowed a direct biological interpretation.
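
The spectral decomposition described above can be sketched as follows. This is an illustration under assumptions, not the authors' exact method: it uses the combinatorial Laplacian L = D - A as the graph operator and an exponential filter exp(-beta * lambda) to attenuate high-frequency components; the function names, the parameter `beta`, and the toy network are hypothetical.

```python
import numpy as np

def laplacian(adjacency):
    """Combinatorial graph Laplacian L = D - A of an undirected graph."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def smooth_profile(expression, adjacency, beta=1.0):
    """Attenuate high-frequency components of an expression profile.

    The profile is expanded in the Laplacian eigenbasis and each coefficient
    is scaled by exp(-beta * eigenvalue), so components that vary sharply
    between neighbouring genes (large eigenvalues) are damped the most.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian(adjacency))
    coefficients = eigenvectors.T @ expression       # project onto eigenbasis
    filtered = np.exp(-beta * eigenvalues) * coefficients
    return eigenvectors @ filtered                    # back to gene space

# Toy network (assumed): a path of four genes, g0 - g1 - g2 - g3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
profile = np.array([1.0, -1.0, 1.0, -1.0])   # alternates along the path
smoothed = smooth_profile(profile, A, beta=1.0)
```

A profile that is constant over the network passes through unchanged, while one that alternates between neighbouring genes is shrunk; the smoothed profiles could then be fed to any standard classifier.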

    Understanding trade pathways to target biosecurity surveillance

    Increasing trends in global trade make it extremely difficult to prevent the entry of all potential invasive species (IS). Establishing early detection strategies thus becomes an important part of the continuum used to reduce the introduction of invasive species. One part necessary to ensure the success of these strategies is the determination of priority survey areas based on invasion pressure. We used a pathway-centred conceptual model of pest invasion to address these questions: what role does global trade play in invasion pressure of plant ecosystems and how could an understanding of this role be used to enhance early detection strategies? We concluded that the relative level of invasion pressure for destination ecosystems can be influenced by the intensity of pathway usage (import volume and frequency), the number and type of pathways with a similar destination, and the number of different ecological regions that serve as the source for imports to the same destination. As these factors increase, pressure typically intensifies because of increasing a) propagule pressure, b) likelihood of transporting pests with higher intrinsic invasion potential, and c) likelihood of transporting pests into ecosystems with higher invasibility. We used maritime containerized imports of live plants into the contiguous U.S. as a case study to illustrate the practical implications of the model to determine hotspot areas of relative invasion pressure for agricultural and forest ecosystems (two ecosystems with high potential invasibility). Our results illustrated how a pathway-centred model could be used to highlight potential target areas for early detection strategies for IS. Many of the hotspots in agricultural and forest ecosystems were within major U.S. metropolitan areas. Invasion ecologists can utilize pathway-centred conceptual models to a) better understand the role of human-mediated pathways in pest establishment, b) enhance current methodologies for IS risk analysis, and c) develop strategies for IS early detection-rapid response programs.
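
The factors that the abstract links to invasion pressure (import volume, import frequency, number of pathways, number of source ecological regions) can be tabulated per destination as in the sketch below. The record layout and all names and values are hypothetical, and no combined pressure score is computed because the abstract does not give one.

```python
from collections import defaultdict

# Hypothetical import records: (destination, pathway, source ecoregion, volume).
RECORDS = [
    ("port_a", "maritime_live_plants", "region_1", 120),
    ("port_a", "maritime_live_plants", "region_2", 80),
    ("port_a", "air_cut_flowers", "region_2", 10),
    ("port_b", "maritime_live_plants", "region_1", 40),
]

def pressure_factors(records):
    """Summarise, per destination, the factors the model ties to pressure."""
    summary = defaultdict(lambda: {"volume": 0, "shipments": 0,
                                   "pathways": set(), "source_regions": set()})
    for destination, pathway, region, volume in records:
        entry = summary[destination]
        entry["volume"] += volume            # intensity: import volume
        entry["shipments"] += 1              # intensity: import frequency
        entry["pathways"].add(pathway)       # distinct pathways used
        entry["source_regions"].add(region)  # distinct source ecoregions
    return {
        dest: {"volume": e["volume"], "shipments": e["shipments"],
               "n_pathways": len(e["pathways"]),
               "n_source_regions": len(e["source_regions"])}
        for dest, e in summary.items()
    }

factors = pressure_factors(RECORDS)
```

Destinations that rank high on several of these factors at once would be natural candidates for priority survey areas.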

    CancerLinker: Explorations of Cancer Study Network

    Interactive visualization tools are highly desirable for biologists and cancer researchers who want to explore complex structures, detect patterns and find relationships among the bio-molecules responsible for a cancer type. A pathway responsible for a specific cancer type contains various bio-molecules in different layers of the cell. Researchers are highly interested in understanding the relationships among the proteins of different pathways, and furthermore want to know how those proteins interact in different pathways for various cancer types. Biologists find it useful to merge the data of different cancer studies into a single network and see the relationships among the different proteins, which can help them detect the proteins common to cancer studies and hence reveal the pattern of interactions of those proteins. We introduce CancerLinker, a visual analytic tool that helps researchers explore cancer study interaction networks. Twenty-six cancer studies are merged to explore pathway data and bio-molecule relationships that can answer some significant questions in cancer research. CancerLinker also helps biologists explore the critical mutated proteins in multiple cancer studies. A bubble graph is constructed to visualize common proteins based on their frequency and biological assemblies. Parallel coordinates highlight patterns of patient profiles (obtained from cBioportal by WebAPI services) on different attributes for a specified cancer study. Comment: 7 pages, 9 figures

    Uniformly curated signaling pathways reveal tissue-specific cross-talks and support drug target discovery

    Motivation: Signaling pathways control a large variety of cellular processes. However, currently, even within the same database signaling pathways are often curated at different levels of detail. This makes comparative and cross-talk analyses difficult. Results: We present SignaLink, a database containing 8 major signaling pathways from Caenorhabditis elegans, Drosophila melanogaster, and humans. Based on 170 review and approx. 800 research articles, we have compiled pathways with semi-automatic searches and uniform, well-documented curation rules. We found that in humans any two of the 8 pathways can cross-talk. We quantified the possible tissue- and cancer-specific activity of cross-talks and found pathway-specific expression profiles. In addition, we identified 327 proteins relevant for drug target discovery. Conclusions: We provide a novel resource for comparative and cross-talk analyses of signaling pathways. The identified multi-pathway and tissue-specific cross-talks contribute to the understanding of the signaling complexity in health and disease and underscore its importance in network-based drug target selection. Availability: http://SignaLink.org. Comment: 9 pages, 4 figures, 2 tables and supplementary information with 5 figures and 13 tables
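
Counting cross-talks between pathway pairs can be sketched as below. This is a simplified illustration, not SignaLink's curation or analysis procedure: it assumes each protein belongs to exactly one pathway and treats every interaction between proteins of different pathways as one cross-talk. The protein identifiers and interactions are toy data.

```python
from collections import Counter

# Toy annotations (assumed): protein -> pathway, plus directed interactions.
PATHWAY_OF = {"p1": "Wnt", "p2": "Wnt", "p3": "Notch", "p4": "Hedgehog"}
INTERACTIONS = [("p1", "p2"), ("p1", "p3"), ("p3", "p4"), ("p2", "p3")]

def count_cross_talks(pathway_of, interactions):
    """Count interactions whose endpoints lie in two different pathways."""
    counts = Counter()
    for source, target in interactions:
        a, b = pathway_of[source], pathway_of[target]
        if a != b:
            # Sort the pair so (Wnt, Notch) and (Notch, Wnt) share one key.
            counts[tuple(sorted((a, b)))] += 1
    return counts

cross_talks = count_cross_talks(PATHWAY_OF, INTERACTIONS)
```

Pathway pairs absent from `cross_talks` have no cross-talk in the input data; restricting `INTERACTIONS` to proteins expressed in a given tissue would give a tissue-specific count.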

    Data access and integration in the ISPIDER proteomics grid

    Grid computing has great potential for supporting the integration of complex, fast-changing biological data repositories to enable distributed data analysis. One scenario where Grid computing has such potential is provided by proteomics resources, which are rapidly being developed with the emergence of affordable, reliable methods to study the proteome. The protein identifications arising from these methods derive from multiple repositories which need to be integrated to enable uniform access to them. A number of technologies exist which enable these resources to be accessed in a Grid environment, but the independent development of these resources means that significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture which supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture for the integration of several autonomous proteomics data resources.