
    A network approach for managing and processing big cancer data in clouds

    Translational cancer research requires integrative analysis of multiple levels of big cancer data to identify and treat cancer. Such data is decentralised, growing, and continually updated, and the content held or archived in different information sources partially overlaps, creating redundancies as well as contradictions and inconsistencies. To address these issues, we develop a data network model and technology for constructing and managing big cancer data. To support data processing and analysis in this data network approach, we employ a semantic content network approach and adopt the CELAR cloud platform. The prototype implementation shows that the CELAR cloud can satisfy the on-demand needs of various data resources for the management and processing of big cancer data.

    Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases

    Many studies have sought efficient solutions for subgraph similarity search over certain (deterministic) graphs because of its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy-preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Unlike previous works, which assume that edges in an uncertain graph are independent of each other, we study uncertain graphs where edges' occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete; thus, we employ a filter-and-verify framework to speed up the search. In the filtering phase, we develop tight lower and upper bounds of subgraph similarity probability based on a probabilistic matrix index, PMI. PMI is composed of discriminative subgraph features associated with tight lower and upper bounds of subgraph isomorphism probability. Based on PMI, we can filter out a large number of probabilistic graphs and maximize the pruning capability. During the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. The efficiency of our proposed solutions has been verified through extensive experiments. Comment: VLDB201
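
The filter-and-verify idea described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the bounds are supplied by the caller rather than derived from a PMI index, edges are sampled independently (the paper studies correlated edges), and the names `filter_and_verify`, `estimate_probability`, and `sample_world` are hypothetical. Candidates whose upper bound falls below the threshold are pruned, candidates whose lower bound already meets it are accepted, and only the remainder are checked by Monte Carlo sampling.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]
World = FrozenSet[Edge]


def sample_world(edge_probs: Dict[Edge, float], rng: random.Random) -> World:
    # ASSUMPTION: edges exist independently; the paper handles correlated edges.
    return frozenset(e for e, p in edge_probs.items() if rng.random() < p)


def estimate_probability(edge_probs: Dict[Edge, float],
                         matches: Callable[[World], bool],
                         samples: int,
                         rng: random.Random) -> float:
    # Monte Carlo estimate: fraction of sampled worlds in which the query matches.
    hits = sum(matches(sample_world(edge_probs, rng)) for _ in range(samples))
    return hits / samples


def filter_and_verify(graphs: Dict[str, Dict[Edge, float]],
                      bounds: Dict[str, Tuple[float, float]],
                      matches: Callable[[World], bool],
                      threshold: float,
                      samples: int = 2000,
                      seed: int = 0) -> List[str]:
    # `bounds` maps each graph to a (lower, upper) bound on its match probability.
    # ASSUMPTION: bounds are given as input; in the paper they come from the PMI index.
    rng = random.Random(seed)
    answers = []
    for name, edge_probs in graphs.items():
        lower, upper = bounds[name]
        if upper < threshold:
            continue  # filtered out: cannot reach the threshold
        if lower >= threshold:
            answers.append(name)  # accepted without verification
            continue
        # Verification phase: bounds are inconclusive, so sample possible worlds.
        if estimate_probability(edge_probs, matches, samples, rng) >= threshold:
            answers.append(name)
    return answers
```

The tighter the bounds, the fewer candidates reach the sampling step, which is where most of the cost lies.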

    Organizing Knowledge for Web Retrieval using SKOS: A Case Study in Human Protein Chain

    Organizing and controlling the millions of web resources associated with scholarly publications makes effective knowledge management the most challenging task today. An effort is made to map human protein chains against different neurological disorders. After analyzing the facets in this domain, a thesaurus is constructed, a SKOS relational structure is built, and the result is finally converted into an RDF/XML-compliant format for knowledge representation, manipulation, interoperability and effective retrieval.
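
As a rough illustration of the last step described here, the following Python sketch serializes a tiny thesaurus as SKOS concepts in RDF/XML using only the standard library. The base URI, the concept identifiers and labels, and the function name `build_skos_rdfxml` are hypothetical placeholders and are not taken from the paper; only the RDF and SKOS namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"
BASE = "http://example.org/thesaurus/"  # ASSUMPTION: placeholder base URI


def build_skos_rdfxml(concepts):
    # `concepts` maps a concept id to (preferred label, broader concept id or None).
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for concept_id, (label, broader_id) in concepts.items():
        node = ET.SubElement(
            root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": BASE + concept_id}
        )
        pref = ET.SubElement(node, f"{{{SKOS}}}prefLabel", {f"{{{XML}}}lang": "en"})
        pref.text = label
        if broader_id is not None:
            # skos:broader links a narrower term to its parent in the thesaurus.
            ET.SubElement(
                node, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": BASE + broader_id}
            )
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-term thesaurus: one top concept and one narrower term.
example = {
    "neurological-disorder": ("Neurological disorder", None),
    "example-disorder": ("Example disorder", "neurological-disorder"),
}
document = build_skos_rdfxml(example)
```

A dedicated RDF library would normally be preferable for larger vocabularies; the standard-library version is used here only to keep the sketch self-contained.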

    XML in Motion from Genome to Drug

    Information technology (IT) has emerged as central to the solution of contemporary genomics and drug discovery problems. Researchers involved in genomics, proteomics, transcriptional profiling, high-throughput structure determination, and other sub-disciplines of bioinformatics have a direct impact on this IT revolution. As the full genome sequences of many species and data from structural genomics, micro-arrays, and proteomics have become available, integrating these data into a common platform requires sophisticated bioinformatics tools. Organizing these data into knowledge bases and developing appropriate software tools for analyzing them are going to be major challenges. XML (eXtensible Markup Language) forms the backbone of biological data representation and exchange over the internet, enabling researchers to aggregate data from various heterogeneous data resources. The present article gives a comprehensive overview of the integration of XML with a particular type of biological database, mainly those dealing with sequence-structure-function relationships, and of its application to drug discovery. This e-medical science approach should be applied to other scientific domains. The latest trend in semantic web applications is also highlighted.

    OHMI: The Ontology of Host-Microbiome Interactions

    Host-microbiome interactions (HMIs) are critical for the modulation of biological processes and are associated with several diseases, and extensive HMI studies have generated large amounts of data. We propose that the logical representation of the knowledge derived from these data and the standardized representation of experimental variables and processes can foster integration of data and reproducibility of experiments and thereby further HMI knowledge discovery. A community-based Ontology of Host-Microbiome Interactions (OHMI) was developed following the OBO Foundry principles. OHMI leverages established ontologies to create logically structured representations of microbiomes, microbial taxonomy, host species, host anatomical entities, and HMIs under different conditions and associated study protocols and types of data analysis and experimental results

    Development of Integrative Bioinformatics Applications using Cloud Computing resources and Knowledge Organization Systems (KOS).

    Use of semantic web abstractions, in particular domain-neutral Knowledge Organization Systems (KOS), to manage distributed, cloud-based, integrative bioinformatics infrastructure. This presentation derives from a recent publication:

Almeida JS, Deus HF, Maass W. (2010) S3DB core: a framework for RDF generation and management in bioinformatics infrastructures. BMC Bioinformatics. 2010 Jul 20;11(1):387. [PMID 20646315].

These PowerPoint slides were presented at Semantic Web Applications and Tools for Life Sciences December 10th, 2010, Berlin, Germany (http://www.swat4ls.org/2010/progr.php), keynote 9-10 am

    Biological data integration using Semantic Web technologies

    Current research in biology heavily depends on the availability and efficient use of information. In order to build new knowledge, various sources of biological data must often be combined. Semantic Web technologies, which provide a common framework allowing data to be shared and reused between applications, can be applied to the management of disseminated biological data. However, due to some specificities of biological data, the application of these technologies to life science constitutes a real challenge. Through a use case of biological data integration, we show in this paper that current Semantic Web technologies are maturing and can be applied to the development of large applications. However, to get the best from these technologies, improvements are needed in both tool performance and knowledge modeling.
