Recent work has shown that finite mixture models with m components are
identifiable, while making no assumptions on the mixture components, so long as
one has access to groups of samples of size 2m−1 which are known to come from
the same mixture component. In this work we generalize that result and show
that, if every subset of k mixture components of a mixture model are linearly
independent, then that mixture model is identifiable with only (2m−1)/(k−1)
samples per group. We further show that this value cannot be improved. We prove
an analogous result for a stronger form of identifiability known as
"determinedness" along with a corresponding lower bound. This independence
assumption almost surely holds if mixture components are chosen randomly from a
k-dimensional space. We describe some implications of our results for
multinomial mixture models and topic modeling