Making Predictions with Textual Contents
Forecasting real-world quantities from information in textual descriptions has recently attracted significant research interest, although previous studies have
focused only on applications involving the English language.
This document presents an experimental study on the subject of making predictions with textual
contents written in Portuguese, using documents from three distinct domains. I specifically
report on experiments using different types of regression models, using state-of-the-art feature
weighting schemes, and using features derived from cluster-based word representations.
Through controlled experiments, I show that prediction models using the textual information achieve better results than simple baselines such as taking the average value over the training data, and that richer document representations (i.e., using Brown clusters and the Delta-TF-IDF feature weighting scheme) result in slight performance improvements.
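
To make the Delta-TF-IDF weighting concrete, the following is a minimal sketch of the commonly cited two-class formulation, in which a term's weight in a document is its count times the difference between its inverse document frequencies in two reference sets of documents. It is not taken from the study itself: the function name `delta_tfidf`, the add-one smoothing, and the idea of forming the two sets by splitting training documents into high-target and low-target groups are all assumptions made for illustration.

```python
import math
from collections import Counter


def delta_tfidf(doc_tokens, pos_docs, neg_docs):
    """Return {term: weight} for one tokenized document.

    pos_docs and neg_docs are lists of token lists. In a regression
    setting they might hold training documents with high and low target
    values respectively (an assumption, not the study's stated setup).
    """
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    # Document frequency of each term within each reference set.
    pos_df = Counter(t for d in pos_docs for t in set(d))
    neg_df = Counter(t for d in neg_docs for t in set(d))

    weights = {}
    for term, count in Counter(doc_tokens).items():
        # Add-one smoothing (an assumption) avoids division by zero
        # for terms missing from one of the two sets.
        idf_pos = math.log2((n_pos + 1) / (pos_df[term] + 1))
        idf_neg = math.log2((n_neg + 1) / (neg_df[term] + 1))
        weights[term] = count * (idf_pos - idf_neg)
    return weights


# Toy data, invented for illustration only.
high = [["preco", "alto"], ["alto", "luxo"]]
low = [["preco", "baixo"], ["baixo", "simples"]]
weights = delta_tfidf(["alto", "preco", "alto"], high, low)
```

In this sketch a term spread evenly across both sets (here `preco`) receives a weight of zero, while a term concentrated in one set (here `alto`) receives a weight far from zero. The sign simply reflects which set the term favors and depends on the order in which the two sets are passed.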