Many applications, including provenance and some analyses of social networks,
require path-based queries over graph-structured data. When these graphs
contain sensitive information, withholding that information can break paths, resulting in uninformative
query results. This paper presents techniques that give users more
informative graph query results; the techniques leverage a common industry
practice of providing what we call surrogates: alternate, less sensitive
versions of nodes and edges releasable to a broader community. We describe
techniques for interposing surrogate nodes and edges to protect sensitive graph
components, while maximizing graph connectivity and giving users as much
information as possible. In this work, we formalize the problem of creating a
protected account G' of a graph G. We provide a utility measure to compare the
informativeness of alternate protected accounts and an opacity measure for
protected accounts, which indicates the likelihood that an attacker can
recreate the topology of the original graph from the protected account. We
provide an algorithm to create a maximally useful protected account of a
sensitive graph, and show through evaluation with the PLUS prototype that using
surrogates and protected accounts adds value for the user, with no significant
impact on the time required to generate results for graph queries.

Comment: VLDB201