Low-Rank Approximation of Weighted Tree Automata
We describe a technique to minimize weighted tree automata (WTA), a powerful
formalism that subsumes probabilistic context-free grammars (PCFGs) and
latent-variable PCFGs. Our method relies on a singular value decomposition of
the underlying Hankel matrix defined by the WTA. Our main theoretical result is
an efficient algorithm for computing the SVD of an infinite Hankel matrix
implicitly represented as a WTA. We provide an analysis of the approximation
error induced by the minimization, and we evaluate our method on real-world
data originating in a newswire treebank. We show that the model achieves lower
perplexity than previous methods for PCFG minimization, and is also much more
stable due to the absence of local optima.
Comment: To appear in AISTATS 201