A multi-layer dimension reduction algorithm for text mining of news in forex / Arman Khadjeh Nassirtoussi
The information explosion has caused demand for customized text mining to skyrocket in every imaginable area. Text mining is needed in many areas, including search engine development, spam filtering and text summarization. Every context requires its own customized text mining algorithms in order to achieve the best results. The specific context of this research is market prediction for the foreign exchange market. The objective is to use news headlines to predict market movements 1 to 3 hours after news release.
The literature on recent research efforts in behavioral economics confirms that investors’ aggregate behavioral reactions to information released in the news can drive prices up or down. This theoretical basis constitutes the economic foundation of this investigation.
Once the problem at hand is understood in economic terms, available systems in the literature that operate in a comparable context are reviewed. The major finding of this review is that context-specific text mining algorithms are lacking. The main underlying text-mining challenge that seems to deserve immediate attention is the sparse and high-dimensional nature of the feature space.
Therefore, this work produces a multi-layer dimension reduction algorithm to respond to this need.
The algorithm tackles a different root cause of the problem at each layer. The first layer is termed the Semantic Abstraction Layer and addresses the problem of co-reference in text mining, which contributes to sparsity. Co-reference occurs when two or more words in a text corpus refer to the same concept. This work produces a custom approach named Heuristic-Hypernyms Modeling, which treats words that share the same parent word as a single entity. As a result, prediction accuracy increases significantly at this layer, which is attributed to appropriate noise reduction in the feature space.
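The idea of merging words that share a parent word can be illustrated with a minimal sketch. It is not the thesis's Heuristic-Hypernyms Modeling: the `HYPERNYMS` map, the helper names and the sample headlines are hypothetical, and a real system would presumably derive parent words from a lexical database rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical toy hypernym map (child word -> parent word), for illustration only.
HYPERNYMS = {
    "dollar": "currency",
    "euro": "currency",
    "yen": "currency",
    "rise": "increase",
    "climb": "increase",
    "surge": "increase",
}


def abstract_tokens(tokens):
    """Replace each token with its parent word when one is known."""
    return [HYPERNYMS.get(token, token) for token in tokens]


def bag_of_words(headlines):
    """Count features over all headlines after hypernym abstraction."""
    counts = Counter()
    for headline in headlines:
        counts.update(abstract_tokens(headline.lower().split()))
    return counts


headlines = ["Euro climb continues", "Dollar surge continues", "Yen rise stalls"]

# Feature space before abstraction: one dimension per distinct surface word.
raw_vocab = {token for headline in headlines for token in headline.lower().split()}

# Feature space after abstraction: words with the same parent share one dimension.
merged = bag_of_words(headlines)

print(len(raw_vocab), "raw features ->", len(merged), "abstracted features")
```

In this toy corpus the feature space shrinks because the three currency words and the three movement words each collapse into one dimension, which is the kind of sparsity reduction the layer aims for.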
The second layer is termed the Sentiment Integration Layer. It integrates sentiment analysis capability into the algorithm by proposing a sentiment weight named SumScore that reflects investors’ sentiment. This layer reduces the dimensions by eliminating those that have zero sentiment value and thereby improves prediction accuracy.
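The abstract does not define SumScore, so the following is only a sketch under a loud assumption: each term has a positive and a negative score in a sentiment lexicon, and the weight is their sum. The `LEXICON` values and the function names are hypothetical.

```python
# Hypothetical sentiment lexicon: term -> (positive score, negative score).
LEXICON = {
    "gain": (0.75, 0.0),
    "crisis": (0.0, 0.875),
    "volatile": (0.25, 0.5),
    "bank": (0.0, 0.0),
}


def sum_score(term):
    """Assumed definition: positive score plus negative score; 0 if unknown."""
    positive, negative = LEXICON.get(term, (0.0, 0.0))
    return positive + negative


def sentiment_filter(features):
    """Weight each feature by its sentiment score and drop zero-sentiment ones."""
    return {
        term: weight * sum_score(term)
        for term, weight in features.items()
        if sum_score(term) > 0
    }


# Feature weights for one headline (for example term frequencies).
features = {"gain": 1.0, "crisis": 2.0, "bank": 3.0, "monday": 1.0}
weighted = sentiment_filter(features)
print(weighted)
```

Terms with no sentiment value ("bank", and "monday", which is missing from the lexicon) are removed, so the layer both reweights and reduces the feature space.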
The third layer encompasses a dynamic model creation algorithm, termed Synchronous Targeted Feature Reduction (STFR). It suits the challenge at hand, which involves mining a stream of text. It updates the models with the most recent information available and, more importantly, it ensures that the dimensions are reduced to a number that is many times smaller.
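The abstract gives no algorithmic detail for STFR, so the sketch below only illustrates one plausible reading and should not be taken as the author's method. It assumes that a model is rebuilt for each incoming headline from a sliding window of recent headlines, and that only the dimensions present in the incoming headline are kept. The window size, the labels and the function name are hypothetical.

```python
from collections import deque


def targeted_training_set(history, incoming_tokens):
    """Restrict recent history to the features of the incoming headline.

    Returns the retained dimensions and the training rows (features, label)
    that share at least one feature with the incoming headline.
    """
    target = set(incoming_tokens)
    rows = []
    for tokens, label in history:
        shared = target & set(tokens)
        if shared:
            rows.append((sorted(shared), label))
    return sorted(target), rows


# Sliding window: only the most recent headlines are kept (assumed size 3).
history = deque(maxlen=3)
stream = [
    (["euro", "falls"], "down"),
    (["dollar", "gains"], "up"),
    (["euro", "gains", "sharply"], "up"),
    (["yen", "steady"], "flat"),
]
for tokens, label in stream:
    history.append((tokens, label))

# A new headline arrives; build a small, targeted training set for it.
dims, rows = targeted_training_set(history, ["euro", "gains"])
print(dims, rows)
```

The per-headline model sees only two dimensions here instead of the full vocabulary of the window, which mirrors the claim that the number of dimensions becomes many times smaller.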
The algorithm and each of its layers are extensively evaluated using real market data and news content across multiple years and have proven to be solid and superior to any other comparable solution. In addition to a well-rounded, multifaceted algorithm, this work contributes a much-needed research framework for this context, with a test bed of data that should make future research endeavors more convenient. The produced algorithm is scalable, and its modular design allows each of its layers to be improved in future research.
FineNews: fine-grained semantic sentiment analysis on financial microblogs and news
In this paper, a fine-grained supervised approach is proposed to identify bullish and bearish sentiments associated with companies and stocks, by predicting a real-valued score between −1 and +1. We propose a supervised approach learned by using several feature sets, consisting of lexical features, semantic features and a combination of lexical and semantic features. Our study reveals that semantic features, most notably BabelNet synsets and semantic frames, can be successfully applied for Sentiment Analysis within the financial domain to achieve better results. Moreover, a comparative study has been conducted between our supervised approach and unsupervised approaches. The obtained experimental results show that our approach outperforms the others.
Using frame-based resources for sentiment analysis within the financial domain
User-generated data in blogs and social networks have recently become a valuable resource for sentiment analysis in the financial domain, since they have been shown to be extremely significant to marketing research companies and public opinion organizations. In order to identify bullish and bearish sentiments associated with companies and stocks, we propose a fine-grained approach that returns a continuous score in the [-1,+1] range. Our supervised approach leverages a frame-based ontological resource which produces feature sets such as lexical features, semantic features and their combination. One outcome of our analysis suggests that the frame-based ontological resource we have used might be successfully applied for sentiment analysis within the financial domain, achieving better results than traditional sentiment analysis methods that do not embody semantics. We also show the higher performance of a fine-grained approach based solely on the evaluation of specific substrings of the message, rather than on features extracted from the whole text of a financial microblog message through the frame-based ontological resource. We have also compared our system with semi-supervised and unsupervised approaches, and the results indicate that our approach outperforms the others. Last but not least, our approach is general and can be applied on top of any existing supervised method of polarity detection.
Forex exchange rate forecasting using deep recurrent neural networks
Deep learning has substantially advanced the state of the art in computer vision, natural language processing, and other fields. The paper examines the potential of deep learning for exchange rate forecasting. We systematically compare long short-term memory networks and gated recurrent units to traditional recurrent network architectures as well as feedforward networks in terms of their directional forecasting accuracy and the profitability of trading model predictions. Empirical results indicate the suitability of deep networks for exchange rate forecasting in general but also evidence the difficulty of implementing and tuning corresponding architectures. Especially with regard to trading profit, a simpler neural network may perform as well as if not better than a more complex deep neural network.
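The two evaluation criteria named above, directional accuracy and trading profit, can be sketched as follows. This is an illustrative assumption about how such metrics are commonly computed, not the paper's exact protocol: the trading rule (go long when the predicted return is positive, short otherwise, with no transaction costs) and the sample numbers are hypothetical.

```python
def directional_accuracy(actual_returns, predicted_returns):
    """Fraction of periods in which the predicted direction matches the actual one."""
    hits = sum(
        1
        for actual, predicted in zip(actual_returns, predicted_returns)
        if (actual > 0) == (predicted > 0)
    )
    return hits / len(actual_returns)


def trading_profit(actual_returns, predicted_returns):
    """Cumulative return of a long/short rule driven by the predicted sign.

    Assumed rule: long if the prediction is positive, short otherwise;
    transaction costs are ignored.
    """
    return sum(
        actual if predicted > 0 else -actual
        for actual, predicted in zip(actual_returns, predicted_returns)
    )


# Hypothetical per-period returns (in percent) and model predictions.
actual = [0.5, -0.25, 0.25, -0.5]
predicted = [0.1, 0.2, 0.3, -0.1]

accuracy = directional_accuracy(actual, predicted)
profit = trading_profit(actual, predicted)
print(accuracy, profit)
```

The two metrics can diverge: a model that is right on the large moves and wrong on the small ones may earn more than a model with higher directional accuracy, which is one reason to report both.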