    A Query Integrator and Manager for the Query Web

    We introduce two concepts: the Query Web as a layer of interconnected queries over the document web and the semantic web, and a Query Web Integrator and Manager (QI) that enables the Query Web to evolve. QI permits users to write, save and reuse queries over any web-accessible source, including other queries saved in other installations of QI. The saved queries may be in any language (e.g. SPARQL, XQuery); the only condition for interconnection is that the queries return their results in some form of XML. This condition allows queries to chain off each other, and to be written in whatever language is appropriate for the task. We illustrate the potential use of QI for several biomedical use cases, including ontology view generation using a combination of graph-based and logical approaches, value set generation for clinical data management, image annotation using terminology obtained from an ontology web service, ontology-driven brain imaging data integration, small-scale clinical data integration, and wider-scale clinical data integration. Such use cases illustrate the current range of applications of QI and lead us to speculate about the potential evolution from smaller groups of interconnected queries into a larger query network that layers over the document and semantic web. The resulting Query Web could greatly aid researchers and others who now have to manually navigate through multiple information sources in order to answer specific questions.
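The chaining condition described in this abstract (each query returns XML, so another query can consume it) can be illustrated with a minimal sketch. It does not use QI itself: the two functions, the `results` and `term` element names, and the sample terms are hypothetical stand-ins for saved queries and their output.

```python
import xml.etree.ElementTree as ET


def query_terms() -> str:
    """Hypothetical stand-in for a saved query; returns its results as XML."""
    root = ET.Element("results")
    for name in ("hippocampus", "amygdala"):
        ET.SubElement(root, "term").text = name
    return ET.tostring(root, encoding="unicode")


def query_uppercase_terms(upstream_xml: str) -> str:
    """Hypothetical second query that chains off the XML of the first."""
    source = ET.fromstring(upstream_xml)
    root = ET.Element("results")
    for term in source.iter("term"):
        ET.SubElement(root, "term").text = (term.text or "").upper()
    return ET.tostring(root, encoding="unicode")


# The second query needs to know only the XML shape, not how the first was written.
chained = query_uppercase_terms(query_terms())
print(chained)
```

Because the only contract between the two steps is the XML they exchange, either step could in principle be replaced by a query written in another language.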

    Exposing WikiPathways as Linked Open Data

    Biology has become a data-intensive science. Discovery of new biological facts increasingly relies on the ability to find and match appropriate biological data, for instance for functional annotation of genes of interest or for identification of pathways affected by over-expressed genes. Functional and pathway information about genes and proteins is typically distributed over a variety of databases and the literature.

Pathways are a convenient, easy-to-interpret way to describe known biological interactions. WikiPathways provides community-curated pathways. WikiPathways users integrate their knowledge with facts from the literature and biological databases. The curated pathway is then reviewed and possibly corrected or enriched. Different tools (e.g. PathVisio and Cytoscape) support the integration of WikiPathways knowledge for additional tasks, such as the integration with personal data sets.

Data from WikiPathways is increasingly also used for advanced analysis where it is integrated or compared with other data. Currently, integration with data from different biological sources is mostly done manually. This can be a very time-consuming task because the curator often first needs to find the available resources, needs to learn about their specific content and qualities, and often spends a lot of time technically combining the two.

Semantic Web and Linked Data technologies eliminate the barriers between database silos by relying on a set of standards and best practices for representing and describing data. The architecture of the Semantic Web relies on the architecture of the web itself for integrating and mapping Uniform Resource Identifiers (URIs), coupled with basic inference mechanisms to enable matching concepts and properties across data sources. Semantic Web and Linked Data technologies are increasingly being applied successfully as integration engines for linking biological elements.

Exposing WikiPathways content as Linked Open Data to the Semantic Web enables rapid, semi-automated integration with the growing number of biological resources available from the Linked Open Data cloud. It also allows fast queries of WikiPathways itself.

We have harmonised WikiPathways content according to a selected set of vocabularies (BioPAX, ChEMBL, etc.) common to resources already available as Linked Open Data.
WikiPathways content is now available as Linked Open Data for dynamic querying through a SPARQL endpoint: http://semantics.bigcat.unimaas.nl:8000/sparql
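
As a minimal sketch of what dynamic querying through a SPARQL endpoint involves, the snippet below builds a SPARQL Protocol GET request URL for the endpoint quoted above. The query is a generic placeholder that lists a few arbitrary triples and assumes nothing about the WikiPathways vocabularies. The helper name `build_sparql_get_url` is hypothetical, and the endpoint address is taken from the abstract and may no longer be reachable, so no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the abstract; it may no longer be live.
ENDPOINT = "http://semantics.bigcat.unimaas.nl:8000/sparql"

# Generic placeholder query: list any five triples in the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter,
    as the SPARQL Protocol prescribes for query-via-GET."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Fetching this URL with any HTTP client, typically with an `Accept` header such as `application/sparql-results+json`, would return the result bindings if the endpoint is available.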

    Using Ontologies for Semantic Data Integration

    While big data analytics is considered one of the most important paths to competitive advantage for today’s enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly crucial, and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques that are at the basis of the paradigm, and the main challenges that remain to be addressed.

    A Shared Ontology Approach to Semantic Representation of BIM Data

    Architecture, engineering, construction and facility management (AEC-FM) projects involve a large number of participants that must exchange information and combine their knowledge for successful completion of a project. Currently, most of the AEC-FM domains store their information about a project in text documents or use XML, relational, or object-oriented formats that make information integration difficult. The AEC-FM industry is not taking advantage of the full potential of the Semantic Web for streamlining sharing, connecting, and combining information from different domains. The Semantic Web is designed to solve the information integration problem by creating a web of structured and connected data that can be processed by machines. It allows combining information from different sources with different underlying schemas distributed over the Internet. In the Semantic Web, all data instances and data schemas are stored in a graph data store, which makes it easy to merge data from different sources. This paper presents a shared ontology approach to semantic representation of building information. The semantic representation of building information facilitates finding and integrating building information distributed in several knowledge bases. A case study demonstrates the development of a semantic-based building design knowledge base.

    The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation

    Background. 
The complexity and inter-related nature of biological data pose a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community.

Description. 
SADI – Semantic Automated Discovery and Integration – is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services “stack”, SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers.

Conclusions.
SADI Services are fully compliant with, and utilize only, foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behavior we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner very similar to data housed in static triple-stores, thus facilitating the intersection of Web services and Semantic Web technologies.

    UniProt in RDF: Tackling Data Integration and Distributed Annotation with the Semantic Web

    The UniProt knowledgebase (UniProtKB) is a comprehensive repository of protein sequence and annotation data. We collect information from the scientific literature and other databases and provide links to over one hundred biological resources. Such links between different databases are an important basis for data integration, but the lack of a common standard to represent and link information makes data integration an expensive business. At UniProt we have started to tackle this problem by using the Resource Description Framework (http://www.w3.org/RDF/) to represent our data. RDF is a core technology for the World Wide Web Consortium's Semantic Web activities (http://www.w3.org/2001/sw/) and is therefore well suited to work in a distributed and decentralized environment. The RDF data model represents arbitrary information as a set of simple statements of the form subject-predicate-object. To enable the linking of data on the Web, RDF requires that each resource have a (globally) unique identifier. These identifiers allow everybody to make statements about a given resource and, together with the simple structure of the RDF data model, make it easy to combine the statements made by different people (or databases) to allow queries across different datasets. RDF is thus an industry standard that can make a major contribution to solving two important problems of bioinformatics: distributed annotation and data integration.
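
The subject-predicate-object model and the role of shared identifiers can be sketched in a few lines of plain Python. This is an illustration of the idea only, not UniProt's actual RDF: every URI below uses the reserved `example.org` domain and is hypothetical, as are the `Triple` type and the `describe` helper.

```python
from typing import NamedTuple, Set, Tuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers; real datasets would use their own URIs.
PROTEIN = "http://example.org/protein/P1"
NAME = "http://example.org/vocab/name"
ANNOTATED_WITH = "http://example.org/vocab/annotatedWith"

# Two independent sources make statements about the same resource.
db_a = {Triple(PROTEIN, NAME, "Example protein")}
db_b = {
    Triple(PROTEIN, ANNOTATED_WITH, "http://example.org/term/T1"),
    Triple(PROTEIN, NAME, "Example protein"),  # same statement as in db_a
}

# Because both sources use the same identifier, merging is a set union;
# the duplicated statement collapses into one.
merged = db_a | db_b


def describe(graph: Set[Triple], subject: str) -> Set[Tuple[str, str]]:
    """Return every (predicate, object) pair stated about a subject."""
    return {(t.predicate, t.obj) for t in graph if t.subject == subject}


print(len(merged))
print(sorted(describe(merged, PROTEIN)))
```

A query over `merged` sees annotations from both sources at once, which is the cross-dataset querying the abstract describes.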