4 research outputs found
Using sentiment and social network analyses to predict opening-movie box-office success
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 59-60). In this thesis, we explore notions of collective intelligence in the form of web metrics, social network analysis, and sentiment analysis to predict the box-office income of movies. Successful prediction techniques would help those in the movie industry gauge their likely return and adjust pre- and post-release marketing efforts. The approaches in this thesis may also be applied to prediction in other markets. We explore several modeling approaches to predict performance on the Hollywood Stock Exchange (HSX) prediction market as well as overall gross income. Some models use only a single movie's data to predict its future success, while other models build from the data of all the movies together. The most successful model presented in this thesis improves on HSX and provides high correlations and low predictive error on both HSX delist prices and the final gross income of the movies. We also provide insights for future work to build on this thesis to potentially uncover movies that perform exceptionally poorly or exceptionally well. by Lyric Doshi. M.Eng.
Kepler: Robust Learning for Faster Parametric Query Optimization
Most existing parametric query optimization (PQO) techniques rely on
traditional query optimizer cost models, which are often inaccurate and result
in suboptimal query performance. We propose Kepler, an end-to-end
learning-based approach to PQO that demonstrates significant speedups in query
latency over a traditional query optimizer. Central to our method is Row Count
Evolution (RCE), a novel plan generation algorithm based on perturbations in
the sub-plan cardinality space. While previous approaches require accurate cost
models, we bypass this requirement by evaluating candidate plans via actual
execution data and training an ML model to predict the fastest plan given
parameter binding values. Our models leverage recent advances in neural network
uncertainty in order to robustly predict faster plans while avoiding
regressions in query performance. Experimentally, we show that Kepler achieves
significant improvements in query runtime on multiple datasets on PostgreSQL.
Comment: SIGMOD 202
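
The Kepler abstract describes two ideas that a small example can make concrete: labeling each parameter binding with the plan that actually ran fastest, and declining to switch plans when the model is unsure. The Python sketch below is illustrative only and is not Kepler's implementation. The function names, the plan identifiers, the sample latencies, and the fixed confidence threshold are all assumptions, and the threshold is a simple stand-in for the neural network uncertainty techniques the abstract mentions.

```python
def label_fastest_plans(executions):
    """Map each parameter binding to the plan with the lowest measured latency.

    `executions` is a hypothetical structure: {binding: {plan_id: latency_ms}}.
    The resulting labels could serve as training targets for a plan classifier.
    """
    return {
        binding: min(latencies, key=latencies.get)
        for binding, latencies in executions.items()
    }


def choose_plan(plan_probs, default_plan, min_confidence=0.8):
    """Pick the predicted plan only when the model is confident enough.

    `plan_probs` maps plan ids to predicted probabilities of being fastest.
    Falling back to `default_plan` (the traditional optimizer's choice) is an
    assumed, simplified way to avoid regressions under uncertainty.
    """
    best_plan = max(plan_probs, key=plan_probs.get)
    if plan_probs[best_plan] >= min_confidence:
        return best_plan
    return default_plan


# Made-up execution data for two parameter bindings of one query template.
executions = {
    ("US", 2020): {"default": 120.0, "plan_a": 35.0, "plan_b": 90.0},
    ("FR", 2021): {"default": 40.0, "plan_a": 75.0, "plan_b": 42.0},
}
labels = label_fastest_plans(executions)

confident = choose_plan({"default": 0.10, "plan_a": 0.85, "plan_b": 0.05}, "default")
unsure = choose_plan({"default": 0.30, "plan_a": 0.45, "plan_b": 0.25}, "default")
```

In this sketch the confident prediction selects `plan_a`, while the uncertain one keeps the default plan, which trades some potential speedup for protection against regressions.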