
    Stock Forecasts with LSTM and Web Sentiment

    Traditional time-series techniques, such as auto-regressive and moving average models, can struggle when applied to stock data because of the randomness inherent to the markets. In this study, Long Short-Term Memory Recurrent Neural Networks, or LSTMs, have been applied to pricing data along with sentiment scores derived from web sources such as Twitter and other financial media outlets. The project team used this approach to complement the technical indicators observed at the end of each trading day for three stocks from the NASDAQ stock exchange over a 12-year span. A common benchmark for assessing model performance on time-series data uses the prior day’s closing price of a given stock to predict the next day’s closing value, a naive but surprisingly accurate method when measured by mean absolute error. The main objective of the paper is to use predictions from the various models assembled for the research to calculate whether the next day’s closing price will rise or fall when compared against the last predicted value. On average, all models showed a roughly 2% accuracy improvement over the largely balanced up and down movements for the tickers used in the study.
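The naive benchmark and the rise/fall evaluation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the sample prices, and the tie-handling rule (a zero change counts as "not up") are assumptions, and comparing each prediction against the previous prediction is only one reading of "the last predicted value".

```python
from typing import List, Tuple


def naive_baseline(closes: List[float]) -> Tuple[List[float], List[float]]:
    """Predict each day's close as the prior day's close.

    Returns (predicted, actual), aligned so that predicted[i] is the
    forecast for actual[i].
    """
    return closes[:-1], closes[1:]


def mean_absolute_error(predicted: List[float], actual: List[float]) -> float:
    """Average absolute difference between forecasts and true values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def direction_accuracy(predicted: List[float], actual: List[float]) -> float:
    """Share of days on which the predicted move matches the actual move.

    Assumption: the predicted move is 'up' when a prediction exceeds the
    previous prediction; the actual move is 'up' when the close exceeds
    the previous close. A zero change counts as 'not up'.
    """
    hits = 0
    for i in range(1, len(actual)):
        predicted_up = predicted[i] > predicted[i - 1]
        actual_up = actual[i] > actual[i - 1]
        hits += predicted_up == actual_up
    return hits / (len(actual) - 1)


# Hypothetical closing prices, for illustration only.
closes = [100.0, 102.0, 101.0, 105.0, 104.0]
naive_pred, actual = naive_baseline(closes)
baseline_mae = mean_absolute_error(naive_pred, actual)

# Hypothetical model forecasts for the same four target days.
model_pred = [101.0, 101.5, 104.0, 104.5]
model_direction = direction_accuracy(model_pred, actual)
```

Because up and down days are described as largely balanced, a direction accuracy near 50% is the level a model has to beat in this setup.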

    Purchase Intentions on Social Media as Predictors of Consumer Spending

    The paper addresses the problem of forecasting consumer expenditure from social media data. Previous research on the topic exploited the intuition that search engine traffic reflects purchase intentions and constructed predictive models of consumer behaviour from search query volumes. In contrast, we derive predictors from explicit expressions of purchase intentions found in social media posts. Two types of predictors created from these expressions are explored: those based on word embeddings and those based on topical word clusters. We introduce a new clustering method, which takes into account temporal co-occurrence of words, in addition to their semantic similarity, in order to create predictors relevant to the forecasting problem. The predictors are evaluated against baselines that use only macroeconomic variables, and against models trained on search traffic data. Conducting experiments with three different regression methods on Facebook and Twitter data, we find that both word embeddings and word clusters help to reduce forecasting errors in comparison to purely macroeconomic models. In most experimental settings, the error reduction is statistically significant and is comparable to the error reduction achieved with search traffic variables.

    Nowcasting user behaviour with social media and smart devices on a longitudinal basis: from macro- to micro-level modelling

    The adoption of social media and smart devices by millions of users worldwide over the last decade has resulted in an unprecedented opportunity for NLP and social sciences. Users publish their thoughts and opinions on everyday issues through social media platforms, while they record their digital traces through their smart devices. Mining these rich resources offers new opportunities in sensing real-world events and indices (e.g., political preference, mental health indices) in a longitudinal fashion, either at the macro (population) level or at the micro (user) level. The current project aims at developing approaches to “nowcast” (predict the current state of) such indices at both levels of granularity. First, we build natural language resources for the static tasks of sentiment analysis, emotion disclosure and sarcasm detection over user-generated content. These are important for opinion monitoring on a large scale. Second, we propose a general approach that leverages textual data derived from generic social media streams to nowcast political indices at the macro level. Third, we leverage temporally sensitive and asynchronous information to nowcast the political stance of social media users at the micro level using multiple kernel learning. We then focus further on micro-level modelling, to account for heterogeneous data sources, such as information derived from users' smart phones, SMS and social media messages, to nowcast time-varying mental health indices of a small cohort of users on a longitudinal basis. Finally, we present the challenges faced when applying such micro-level approaches in a real-world setting and propose directions for future research.

    Analyzing Tweets For Predicting Mental Health States Using Data Mining And Machine Learning Algorithms

    Tweets are usually the outcome of people’s feelings on various topics. Twitter allows users to post casual and emotional thoughts to share in real time. Around 20% of U.S. adults use Twitter. Using the word-frequency and singular value decomposition methods, we identified the behavior of individuals through their tweets. We graded depressive and anti-depressive keywords using the tweet time-series, time-window, and time-stamp methods. We have collected around four million tweets since 2018. A parameter (Depressive Index) is computed using the F1 score and Matthews correlation coefficient (MCC) to indicate the depressive level. A framework showing the Depressive Index and the Happiness Index is prepared with the time, location, and keywords and delivers F1 score, MCC, and CI values. COVID-19 changed the routines of most people’s lives and affected mental health. We studied the tweets and compared them with the COVID-19 growth. The Happiness Index from our work is compared with the World Happiness Report for Georgia, New York, and Sri Lanka. An interactive framework is prepared to analyze the tweets, depict the Happiness Index, and compare it. Bad words in tweets are analyzed, and a map showing the Happiness Index is computed for all US states and compared with WalletHub data. We add tweets continuously, and the framework delivers an atlas of maps based on the Happiness Index; we make these maps available for further study. We forecasted tweets with real-time data. Our tweet results and the WHO COVID-19 reports follow a similar pattern. A new moving average method was presented; this unique process gave perfect results at peaks of the function and improved the error percentage. An interactive GUI portal computes the Happiness Index, Depression Index, feel-good factors, and keyword predictions, and prepares a Happiness Index map. We plan to create a public web portal to help users obtain these results. Upon completion of the proposed GUI application, users can instantly get the Happiness Index, Depression Index values, Happiness map, and keyword predictions for the desired dates and geographical locations.
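The two evaluation metrics named in this abstract, the F1 score and the Matthews correlation coefficient, can be computed from confusion-matrix counts as sketched below. The abstract does not say how the Depressive Index combines them, so this sketch stops at the standard metric definitions; the function names and the sample counts are assumptions for illustration.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a "depressive vs. not depressive" tweet classifier.
tp, tn, fp, fn = 40, 45, 5, 10
f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
```

MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced, whereas F1 ignores true negatives.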

    Proceedings of MathSport International 2017 Conference

    Proceedings of MathSport International 2017 Conference, held in the Botanical Garden of the University of Padua, June 26-28, 2017. MathSport International organizes biennial conferences dedicated to all topics where mathematics and sport meet. Topics include: performance measures, optimization of sports performance, statistics and probability models, mathematical and physical models in sports, competitive strategies, statistics and probability match outcome models, optimal tournament design and scheduling, decision support systems, analysis of rules and adjudication, econometrics in sport, analysis of sporting technologies, financial valuation in sport, e-sports (gaming), betting and sports.