Tight Bound of Incremental Cover Trees for Dynamic Diversification
Dynamic diversification---finding a set of data points with maximum diversity
from a time-dependent sample pool---is an important task in recommender
systems, web search, database search, and notification services, to avoid
showing users duplicate or very similar items. The incremental cover tree (ICT),
which offers high computational efficiency and flexibility, has been applied to
this task and has shown good performance. Specifically, it was empirically
observed that ICT typically provides a set whose diversity is only marginally
worse than that of the greedy max-min (GMM) algorithm, the state-of-the-art
method for static diversification, whose performance bound is optimal for any
polynomial time algorithm. Nevertheless, the known performance bound for ICT is
4 times worse than this optimal bound. In this paper, we aim to close this
gap between theory and empirical observations. To this end, we
first analyze variants of ICT methods and derive tighter performance bounds.
We then investigate the gap between the obtained bound and empirical
observations by using specially designed artificial data for which the optimal
diversity is known. Finally, we analyze the tightness of the bound, and show
that the bound cannot be further improved, i.e., this paper provides the
tightest possible bound for ICT methods. In addition, we demonstrate a new use
of dynamic diversification for generative image samplers, where prototypes are
incrementally collected from a stream of artificial images generated by an
image sampler.
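
For readers unfamiliar with the GMM baseline mentioned above, the following is a minimal sketch of greedy max-min selection for the static case. It is not the paper's implementation: the function names, the use of Euclidean distance, and the choice of the first point as the seed are all assumptions made for illustration. Diversity is measured here as the smallest pairwise distance within the selected set.

```python
import math


def min_pairwise_distance(points):
    """Diversity of a set: the smallest distance between any two members."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )


def greedy_max_min(points, k, start=0):
    """Select k points by repeatedly adding the point farthest from the selection.

    `start` (index of the seed point) is an illustrative assumption; any seed works.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    chosen = {start}
    # nearest[i] = distance from points[i] to its closest selected point
    nearest = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Pick the unselected point whose nearest selected neighbour is farthest away.
        nxt = max(
            (i for i in range(len(points)) if i not in chosen),
            key=nearest.__getitem__,
        )
        selected.append(nxt)
        chosen.add(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return [points[i] for i in selected]


pool = [(0, 0), (1, 0), (10, 0), (0.5, 0), (5, 0)]
subset = greedy_max_min(pool, k=3)
print(subset, min_pairwise_distance(subset))
```

Each round costs one pass over the pool, so selecting k of n points takes O(nk) distance evaluations. The sketch recomputes nothing when the pool changes, which is the limitation that motivates incremental structures such as ICT in the dynamic setting.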