The W3C-standardized Semantic Web languages enable users to capture
data without a schema in a manner that is intuitive to them. The challenge is that,
for the data to be useful, it must be possible to query it, and to query it
efficiently, which necessitates a schema. Understanding the structure of data is thus
important to both users and storage implementers: the structure of the data gives
users insight into how to query the data, while storage implementers can use the
structure to optimize queries. In this paper we propose that data mining routines
be used to infer candidate n-ary relations with related uniqueness- and null-free
constraints, which can be used to construct an informative Armstrong RDF dataset.
The benefit of an informative Armstrong RDF dataset is that it provides example
data based on the original data which is a fraction of the size of the original data,
while capturing the constraints of the original data faithfully. A case study on a
DBPedia person dataset showed that the associated informative Armstrong RDF
dataset contained 0.00003% of the statements of the original DBPedia dataset.
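
As a rough illustration of the kind of constraints referred to above, the following minimal sketch checks null-free and uniqueness constraints on a small candidate n-ary relation. It is not the mining routine proposed in the paper: the function names, the toy rows, and the brute-force search are assumptions made for illustration only, and a missing value (None) is treated as an ordinary value when comparing rows, which is a simplification.

```python
from itertools import combinations


def null_free_columns(rows, columns):
    """Return the columns that have a value in every row."""
    return [c for c in columns if all(row.get(c) is not None for row in rows)]


def minimal_unique_column_sets(rows, columns):
    """Return the minimal column sets whose projection has no duplicate rows.

    Brute force over column subsets by increasing size; supersets of an
    already found unique set are skipped so that only minimal sets remain.
    """
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            if any(set(u) <= set(combo) for u in found):
                continue
            projection = [tuple(row.get(c) for c in combo) for row in rows]
            if len(set(projection)) == len(projection):
                found.append(combo)
    return found


# Hypothetical person-like rows; None marks a missing property value.
columns = ["name", "birthYear", "deathYear"]
rows = [
    {"name": "Ann", "birthYear": 1950, "deathYear": 2010},
    {"name": "Bob", "birthYear": 1950, "deathYear": None},
    {"name": "Ann", "birthYear": 1960, "deathYear": None},
    {"name": "Bob", "birthYear": 1960, "deathYear": None},
]

print(null_free_columns(rows, columns))           # ['name', 'birthYear']
print(minimal_unique_column_sets(rows, columns))  # [('name', 'birthYear')]
```

In this toy relation no single property identifies a row, but the pair (name, birthYear) does, and both of those properties are null-free. An Armstrong dataset in the sense used above would be a small example that satisfies exactly such constraints and violates the ones that do not hold.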