The simplicial condition and other stronger conditions that imply it have
recently played a central role in developing polynomial time algorithms with
provable asymptotic consistency and sample complexity guarantees for topic
estimation in separable topic models. Of these algorithms, those that rely
solely on the simplicial condition are impractical while the practical ones
need stronger conditions. In this paper, we demonstrate, for the first time,
that the simplicial condition is a fundamental, algorithm-independent,
information-theoretic necessary condition for consistent separable topic
estimation. Furthermore, under solely the simplicial condition, we present a
practical quadratic-complexity algorithm based on random projections which
consistently detects all novel words of all topics using only up to
second-order empirical word moments. This algorithm is amenable to distributed
implementation making it attractive for 'big-data' scenarios involving a
network of large distributed databases