slides

Luzzu - A Framework for Linked Data Quality Assessment

Abstract

With the increasing adoption and growth of the Linked Open Data cloud [9], with RDFa, Microformats and other ways of embedding data into ordinary Web pages, and with initiatives such as schema.org, the Web is currently being complemented with a Web of Data. Thus, the Web of Data shares many characteristics with the original Web of Documents, which also varies in quality. This heterogeneity makes it challenging to determine the quality of the data published on the Web and to subsequently make this information explicit to data consumers. The main contribution of this article is LUZZU, a quality assessment framework for Linked Open Data. Apart from providing quality metadata and quality problem reports that can be used for data cleaning, LUZZU is extensible: third party metrics can be easily plugged-in the framework. The framework does not rely on SPARQL endpoints, and is thus free of all the problems that come with them, such as query timeouts. Another advantage over SPARQL based qual- ity assessment frameworks is that metrics implemented in LUZZU can have more complex functionality than triple matching. Using the framework, we performed a quality assessment of a number of statistical linked datasets that are available on the LOD cloud. For this evaluation, 25 metrics from ten different dimensions were implemented

    Similar works