6 research outputs found

    Normalizing Resource Identifiers using Lexicons in the Global Change Information System: Linking Earth Science Identifiers, Concepts, and Communities

    Earth Science informatics involves collaboration between multiple groups of people with diverse specializations and goals, often using variations in terminology to refer to common resources. The uniformity of the resource identifiers often does not cross organizational boundaries. Because of this, permanent, widely used, unambiguous identifiers for resources are elusive. We examine real-world cases of changing and inconsistent identifiers, which inherently work against persistence and uniformity. We also present a solution which mediates factors in these situations, namely the creation of lexicons: mappings of sets of terms to URIs which are curated within the Global Change Information System (GCIS). We discuss aspects of the GCIS which facilitate the use of lexicons: an information model which disambiguates resources, a RESTful API which provides metadata through content negotiation, and a strategy for long-term curation of URIs, including mechanisms for handling changes to URIs and variations in terms used by different communities while providing persistent URIs and preserving relationships between resources. We provide working definitions of terms, contexts, and lexicons, and relate them to the practical challenges of disambiguation and curation. We also discuss the mechanisms employed and architecture of the GCIS, and how these choices facilitate representation of persistent identifiers and mappings of them to identifiers used colloquially within various earth science communities of practice.

    Distributed Join Approaches for W3C-Conform SPARQL Endpoints

    Currently many SPARQL endpoints are freely available and accessible at no cost to users: anyone can submit SPARQL queries to SPARQL endpoints via a standardized protocol, where the queries are processed on the datasets of the SPARQL endpoints and the query results are sent back to the user in a standardized format. As these distributed execution environments for semantic big data (the intersection of semantic data and big data) are freely accessible, the Semantic Web is an ideal playground for big data research. However, when utilizing these distributed execution environments, questions about performance arise. Especially when several datasets (local ones and those residing in SPARQL endpoints) need to be combined, distributed joins need to be computed. In this work we give an overview of the various possibilities for distributed join processing in SPARQL endpoints which follow the SPARQL specification and hence are "W3C conform". We also introduce new distributed join approaches as variants of the Bitvector-Join and a combination of the Semi- and Bitvector-Join. Finally, we compare all the existing and newly proposed distributed join approaches for W3C conform SPARQL endpoints in an extensive experimental evaluation.

    Querying the web of interlinked datasets using VOID descriptions

    WWW 2012 Workshop on Linked Data on the Web, LDOW 2012 -- 16 April 2012 through 16 April 2012 -- Lyon -- 102197
    Query processing is an important way of accessing data on the Semantic Web. Today, the Semantic Web is characterized as a web of interlinked datasets, and thus querying the web can be seen as dataset integration on the web. This dataset integration must also be transparent to the data consumer, as if she were querying the whole web. To decide which datasets should be selected and integrated for a query, one requires metadata describing the web of data. In this paper, to enable this transparency, we introduce a federated query engine called WoDQA (Web of Data Query Analyzer), which automatically discovers datasets relevant to a query using VOID documents as metadata. WoDQA focuses on powerful dataset elimination by analyzing query structure with respect to the metadata of datasets. Dataset and linkset descriptions in VOID documents are analyzed for a SPARQL query and a federated query is constructed. By means of the linkset concept of VOID, links between datasets are incorporated into the selection of federated data sources. The current version of WoDQA is available as a SPARQL endpoint.

    Optimizing Analytical Queries over Semantic Web Sources
