2 research outputs found

    Identifying candidate datasets for data interlinking

    Get PDF
    One of the design principles that can stimulate the growth and increase the usefulness of the Web of data is URIs linkage. However, the related URIs are typically in different datasets managed by different publishers. Hence, the designer of a new dataset must be aware of the existing datasets and inspect their content to define sameAs links. This paper proposes a technique based on probabilistic classifiers that, given a datasets S to be published and a set T of known published datasets, ranks each Ti ∈ T according to the probability that links between S and Ti can be found by inspecting the most relevant datasets. Results from our technique show that the search space can be reduced up to 85%, thereby greatly decreasing the computational effort. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-39200-9_29
    corecore