In privacy-preserving data publishing, approaches using Value Generalization
Hierarchies (VGHs) form an important class of anonymization algorithms. VGHs
play a key role in the utility of published datasets as they dictate how the
anonymization of the data occurs. For categorical attributes, it is imperative
to preserve the semantics of the original data in order to achieve a higher
utility. Despite this, semantics have not being formally considered in the
specification of VGHs. Moreover, there are no methods that allow the users to
assess the quality of their VGH. In this paper, we propose a measurement
scheme, based on ontologies, to quantitatively evaluate the quality of VGHs, in
terms of semantic consistency and taxonomic organization, with the aim of
producing higher-quality anonymizations. We demonstrate, through a case study,
how our evaluation scheme can be used to compare the quality of multiple VGHs
and can help to identify faulty VGHs.Comment: 18 pages, 7 figures, presented in the Privacy in Statistical
Databases Conference 2014 (Ibiza, Spain