In many applications, an organization may want to acquire data from many data
owners. Data marketplaces allow data owners to form coalitions and jointly
produce the data assemblage that data buyers need. To encourage coalitions to produce data, it
is critical to allocate revenue to data owners in a fair manner according to
their contributions. Although Shapley fairness and its alternatives have been
well explored in the literature to facilitate revenue allocation in data assemblage,
computing the exact Shapley value for many data owners and large data sets
assembled through coalitions remains challenging due to the combinatorial nature of
the Shapley value. In this paper, we explore the decomposability of utility in data
assemblage by formulating the independent utility assumption. We argue that
independent utility enjoys many applications. Moreover, we identify interesting
properties of independent utility and develop fast computation techniques for
the exact Shapley value under independent utility. Our experimental results on a
series of benchmark data sets show that our new approach not only guarantees
the exactness of the Shapley value, but also achieves faster computation by orders
of magnitude.

Comment: Accepted by VLDB 202