With the increasing adoption and growth of the Linked Open Data cloud [9],
with RDFa, Microformats and other ways of embedding data into ordinary Web
pages, and with initiatives such as schema.org, the Web is currently being
complemented with a Web of Data. Thus, the Web of Data shares many
characteristics with the original Web of Documents, which also varies in
quality. This heterogeneity makes it challenging to determine the quality of
the data published on the Web and to subsequently make this information
explicit to data consumers. The main contribution of this article is LUZZU, a
quality assessment framework for Linked Open Data. Apart from providing quality
metadata and quality problem reports that can be used for data cleaning, LUZZU
is extensible: third-party metrics can easily be plugged into the framework. The
framework does not rely on SPARQL endpoints, and is thus free of all the
problems that come with them, such as query timeouts. Another advantage over
SPARQL-based quality assessment frameworks is that metrics implemented in
LUZZU can have more complex functionality than triple matching. Using the
framework, we performed a quality assessment of a number of statistical linked
datasets that are available on the LOD cloud. For this evaluation, 25 metrics
from ten different dimensions were implemented.
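
To illustrate what a pluggable metric that goes beyond single triple matching might look like, the following is a minimal sketch in Python. It is not LUZZU's actual API: the names QualityMetric, LabelledEntityRatio, compute, metric_value and assess are hypothetical and chosen for illustration only, and triples are assumed to arrive as plain (subject, predicate, object) string tuples streamed from a dump rather than fetched from a SPARQL endpoint.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Set, Tuple

# A triple is modelled as a plain tuple of strings (an assumption of this sketch).
Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class QualityMetric(ABC):
    """Hypothetical plug-in interface: a metric sees one triple at a time."""

    @abstractmethod
    def compute(self, triple: Triple) -> None:
        """Update internal state with a single streamed triple."""

    @abstractmethod
    def metric_value(self) -> float:
        """Return the final value once all triples have been processed."""


class LabelledEntityRatio(QualityMetric):
    """Share of typed subjects that also carry an rdfs:label.

    The metric keeps state across triples, which a single triple-pattern
    match cannot express on its own.
    """

    def __init__(self) -> None:
        self.typed: Set[str] = set()
        self.labelled: Set[str] = set()

    def compute(self, triple: Triple) -> None:
        subject, predicate, _ = triple
        if predicate == RDF_TYPE:
            self.typed.add(subject)
        elif predicate == RDFS_LABEL:
            self.labelled.add(subject)

    def metric_value(self) -> float:
        if not self.typed:
            return 0.0  # convention of this sketch for datasets without typed subjects
        return len(self.typed & self.labelled) / len(self.typed)


def assess(triples: Iterable[Triple], metrics: List[QualityMetric]) -> Dict[str, float]:
    """Stream every triple through every registered metric in a single pass."""
    for triple in triples:
        for metric in metrics:
            metric.compute(triple)
    return {type(m).__name__: m.metric_value() for m in metrics}


# Tiny illustrative dataset: two typed subjects, one of which has a label.
SAMPLE: List[Triple] = [
    ("http://example.org/a", RDF_TYPE, "http://example.org/Observation"),
    ("http://example.org/a", RDFS_LABEL, "Observation A"),
    ("http://example.org/b", RDF_TYPE, "http://example.org/Observation"),
]

if __name__ == "__main__":
    print(assess(SAMPLE, [LabelledEntityRatio()]))
```

Under these assumptions, adding a further metric only requires another QualityMetric subclass passed to assess; the single streaming pass over the data stays unchanged.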