Most current word prediction systems use n-gram language models (LMs)
to estimate the probability of the following word in a phrase. In recent
years there have been many attempts to enrich such language models with further
syntactic or semantic information. We explore the predictive power of
Latent Semantic Analysis (LSA), a method that has been shown to provide
reliable information on long-distance semantic dependencies between words in a
context. We present and evaluate here several methods that integrate LSA-based
information with a standard language model: a semantic cache, partial
reranking, and different forms of interpolation. We found that all methods show
significant improvements over the 4-gram baseline, and most of them also improve
on a simple cache model.

Comment: 10 pages; EMNLP'2007 Conference (Prague
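
To make the idea of interpolation concrete, the following is a minimal sketch of the simplest variant, linear interpolation of two next-word distributions. It is not necessarily any of the formulations evaluated in the paper. The function name `interpolate`, the weight `lam`, and the toy probabilities are assumptions chosen for illustration only.

```python
def interpolate(p_ngram, p_lsa, lam=0.5):
    """Linearly mix two next-word distributions over a shared vocabulary.

    p_ngram and p_lsa map candidate words to probabilities; lam is the
    weight given to the LSA-based distribution (hypothetical parameter).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(p_ngram) | set(p_lsa)
    return {
        w: (1.0 - lam) * p_ngram.get(w, 0.0) + lam * p_lsa.get(w, 0.0)
        for w in vocab
    }


# Toy, invented probabilities for three candidate words.
p_ngram = {"cat": 0.5, "dog": 0.3, "car": 0.2}
p_lsa = {"cat": 0.2, "dog": 0.2, "car": 0.6}

mixed = interpolate(p_ngram, p_lsa, lam=0.25)
ranked = sorted(mixed, key=mixed.get, reverse=True)
```

Because both inputs sum to one and the weights sum to one, the mixed distribution also sums to one, so it can be used directly to rank prediction candidates.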