
    Mutual-Excitation of Cryptocurrency Market Returns and Social Media Topics

    Cryptocurrencies have recently experienced a new wave of price volatility and interest; activity within social media communities relating to cryptocurrencies has increased significantly. There is currently limited documented knowledge of factors which could indicate future price movements. This paper aims to decipher relationships between cryptocurrency price changes and topic discussion on social media to provide, among other things, an understanding of which topics are indicative of future price movements. To achieve this, a well-known dynamic topic modelling approach is applied to social media communication to retrieve information about the temporal occurrence of various topics. A Hawkes model is then applied to find interactions between topics and cryptocurrency prices. The results show particular topics tend to precede certain types of price movements, for example the discussion of 'risk and investment vs trading' being indicative of price falls, the discussion of 'substantial price movements' being indicative of volatility, and the discussion of 'fundamental cryptocurrency value' by technical communities being indicative of price rises. The knowledge of topic relationships gained here could be built into a real-time system, providing trading or alerting signals.

    Comment: 3rd International Conference on Knowledge Engineering and Applications (ICKEA 2018), Moscow, Russia (June 25-27, 2018).
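    The Hawkes model mentioned above is a self- and mutually-exciting point process: past events temporarily raise the probability of future events. A minimal sketch of the univariate exponential-kernel conditional intensity follows; the event times and parameter values are illustrative only, not taken from the paper.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a univariate Hawkes process with an
    exponential kernel: lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)),
    summing over past events t_i < t."""
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in event_times if ti < t)

# Hypothetical times (in hours) at which a topic was discussed.
mentions = [1.0, 1.5, 4.0]

# Before any mention the intensity sits at the baseline mu; after a burst
# of mentions it is elevated, which is the "excitation" the paper exploits.
before = hawkes_intensity(0.5, mentions, mu=0.2, alpha=0.8, beta=1.0)
after = hawkes_intensity(2.0, mentions, mu=0.2, alpha=0.8, beta=1.0)
```

    In the paper's multivariate setting, each topic and each price-movement type gets its own dimension, and cross-excitation terms capture topics preceding price moves.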

    Stock market random forest-text mining system mining critical indicators of stock market movements

    The Stock Market (SM) is believed to be a significant sector of a free-market economy, as it plays a crucial role in the growth of a country's commerce and industry. The increasing importance of SMs and their direct influence on the economy are the main reasons for analysing SM movements. The need to determine early-warning indicators of an SM crisis has been the focus of study by many economists and politicians. Whilst most research into the identification of these critical indicators has applied data mining to uncover hidden knowledge, very few attempts have adopted a text mining approach. This paper demonstrates how text mining combined with the Random Forest algorithm can offer a novel approach to the extraction of critical indicators and the classification of related news articles. The findings of this study extend the current classification of critical indicators from three to eight classes; they also show that Random Forest can outperform other classifiers and produce high accuracy.
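    The text-mining-plus-Random-Forest pipeline described above can be sketched with scikit-learn: TF-IDF features feed a Random Forest classifier. The toy corpus and the two indicator classes here are purely illustrative (the paper itself uses eight classes).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy news snippets with hypothetical indicator-class labels.
docs = [
    "central bank raises interest rates sharply",
    "interest rates climb after bank decision",
    "oil prices surge on supply fears",
    "crude oil supply disruption pushes prices higher",
]
labels = ["rates", "rates", "oil", "oil"]

# TF-IDF vectorisation followed by a Random Forest classifier.
clf = make_pipeline(TfidfVectorizer(),
                    RandomForestClassifier(n_estimators=50, random_state=0))
clf.fit(docs, labels)
pred = clf.predict(["bank raises interest rates again"])[0]
```

    In practice the corpus would be the labelled news-article collection and the label set the eight critical-indicator classes.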

    Forecasting movements of health-care stock prices based on different categories of news articles using multiple kernel learning

    The market state changes when a new piece of information arrives. It affects decisions made by investors and is considered to be an important data source that can be used for financial forecasting. Recently, information derived from news articles has become a part of financial predictive systems. The usage of news articles and their forecasting potential have been extensively researched; however, so far no attempts have been made to utilise different categories of news articles simultaneously. This paper studies how the concurrent, and appropriately weighted, usage of news articles with different degrees of relevance to the target stock can improve the performance of financial forecasting and support the decision-making process of investors and traders. Stock price movements are predicted using the multiple kernel learning technique, which integrates information extracted from multiple news categories while separate kernels are used to analyse each category. News articles are partitioned according to their relevance to the target stock, its sub-industry, industry, industry group and sector. The experiments are run on stocks from the Health Care sector and show that increasing the number of relevant news categories used as data sources for financial forecasting improves the performance of the predictive system in comparison with approaches based on a lower number of categories.
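    The core of multiple kernel learning is a combined kernel built as a weighted sum of per-category kernels. The sketch below fixes the weights by hand for illustration; real MKL algorithms (e.g. SimpleMKL) learn them jointly with the classifier, and the feature matrices here are invented.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

# Hypothetical feature matrices for two news categories: articles about
# the target stock itself, and articles about its sector.
X_stock = np.array([[0.1, 0.9], [0.8, 0.2], [0.2, 0.7]])
X_sector = np.array([[0.5], [0.4], [0.6]])

# One kernel per category, combined with (here, fixed) mixing weights.
K = (0.7 * rbf_kernel(X_stock, X_stock, gamma=1.0)
     + 0.3 * rbf_kernel(X_sector, X_sector, gamma=1.0))
```

    A convex combination of valid kernels is itself a valid kernel, so K can be handed directly to any kernel classifier such as an SVM.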

    Liquidity commonality does not imply liquidity resilience commonality: A functional characterisation for ultra-high frequency cross-sectional LOB data

    We present a large-scale study of commonality in liquidity and resilience across assets in an ultra high-frequency (millisecond-timestamped) Limit Order Book (LOB) dataset from a pan-European electronic equity trading facility. We first show that extant work in quantifying liquidity commonality through the degree of explanatory power of the dominant modes of variation of liquidity (extracted through Principal Component Analysis) fails to account for heavy-tailed features in the data, thus producing potentially misleading results. We employ Independent Component Analysis, which not only decorrelates the liquidity measures in the asset cross-section but also reduces higher-order statistical dependencies. To measure commonality in liquidity resilience, we utilise a novel characterisation of resilience as the time required for return to a threshold liquidity level. This reflects a dimension of liquidity that is not captured by the majority of liquidity measures and has important ramifications for understanding supply and demand pressures for market makers in electronic exchanges, as well as for regulators and HFTs. When the metric is mapped out across a range of thresholds, it produces the daily Liquidity Resilience Profile (LRP) for a given asset. This daily summary of liquidity resilience behaviour from the vast LOB dataset is then amenable to a functional data representation. This enables the comparison of liquidity resilience in the asset cross-section via functional linear sub-space decompositions and functional regression. The functional regression results presented here suggest that market factors for liquidity resilience (as extracted through functional principal components analysis) can explain between 10% and 40% of the variation in liquidity resilience at low liquidity thresholds, but are less explanatory at more extreme levels, where individual asset factors take effect.
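    The resilience characterisation above (time to return to a threshold liquidity level) can be sketched directly. Here the bid-ask spread stands in as the liquidity proxy and the sample series is invented; the paper's LRP is this quantity mapped out across many thresholds per day.

```python
def resilience_time(spread, times, threshold):
    """Time for the spread (a liquidity proxy, where larger = less liquid)
    to return to or below `threshold` after first exceeding it.
    Returns None if liquidity never recovers within the sample."""
    breach_time = None
    for t, s in zip(times, spread):
        if breach_time is None:
            if s > threshold:
                breach_time = t  # liquidity deteriorates past the threshold
        elif s <= threshold:
            return t - breach_time  # time taken to recover
    return None

# Spread breaches the threshold of 2 at t=1 and recovers at t=3.
recovery = resilience_time([1, 3, 4, 2, 1], [0, 1, 2, 3, 4], 2)
```

    Computing this over a grid of thresholds for each trading day yields a curve per asset per day, which is what makes the functional-data treatment in the paper natural.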

    Using Text Mining to Analyze Quality Aspects of Unstructured Data: A Case Study for “stock-touting” Spam Emails

    The growth in the utilization of text mining tools and techniques in the last decade has been primarily driven by the increase in the sheer volume of unstructured texts and the need to extract useful and, more importantly, quality information from them. The impetus to analyse unstructured data efficiently and effectively as part of the decision-making processes within an organization has further motivated the need to better understand how to use text mining tools and techniques. This paper describes a case study of a stock spam e-mail architecture that demonstrates the process of refining linguistic resources to extract relevant, high-quality information, including stock profiles, financial keywords, positive and negative stock and company news, and compound phrases, from stock spam e-mails. The context of such a study is to identify high-quality information patterns that can be used to support relevant authorities in detecting and analyzing fraudulent activities.
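    Extracting structured fields such as a touted ticker and target price from spam text is typically pattern-based. The patterns, the stoplist, and the sample e-mail below are all assumptions for illustration, not the linguistic resources the paper actually refined.

```python
import re

# Illustrative patterns: an all-caps token as a ticker candidate, and a
# touted target price following the word "target".
TICKER = re.compile(r"\b[A-Z]{2,5}\b")
TARGET = re.compile(r"target(?:\s+price)?\s*\$?(\d+(?:\.\d+)?)", re.I)

email = "HOT pick! Buy ABCD now, target price $2.50 before it explodes!"

# A small stoplist filters all-caps hype words from ticker candidates.
tickers = [t for t in TICKER.findall(email) if t not in {"HOT", "BUY", "NOW"}]
target = TARGET.search(email)
```

    A production system would validate ticker candidates against an exchange symbol list rather than a hand-written stoplist.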

    Analysis of S&P500 using News Headlines Applying Machine Learning Algorithms

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.

    Financial risk now touches everyone's life, directly or indirectly shaping people's daily decisions and their consequences. The financial system comprises all the companies that produce and sell, making them an essential factor. This study addresses the impact that people, through the news headlines they write, can have on companies' stock prices. The index studied in this research is the S&P 500, which compiles the 500 biggest companies in the USA, and the question is how the index can be affected by news articles written by humans at distinct and influential newspapers. Many people worldwide "play the game" of investing in stocks, winning or losing large amounts of money. This study also tries to understand how strongly this news and the index are correlated. With the increased volume of data available, computational power is needed to process it all; this is where machine learning methods become crucial. It is therefore necessary to understand how these methods can be applied and how they influence the final decision of the human who always asks the same question: can stock prices be predicted? Answering it first requires understanding the correlation between news articles, one of the elements able to impact stock prices, and the stock prices themselves. This study focuses on the correlation between news and the S&P 500.

    ALGA: Automatic Logic Gate Annotator for Building Financial News Events Detectors

    We present a new automatic data labelling framework called ALGA (Automatic Logic Gate Annotator). The framework helps to create large amounts of annotated data for training domain-specific financial news event detection classifiers more quickly. The ALGA framework implements a rules-based approach to annotating a training dataset. This method has the following advantages: 1) unlike traditional data labelling methods, it helps to filter relevant news articles from noise; 2) it allows easier transferability to other domains and better interpretability of models trained on automatically labelled data. To create this framework, we focus on U.S.-based companies that operate in the Apparel and Footwear industry. We show that event detection classifiers trained on the data generated by our framework can achieve state-of-the-art performance on the domain-specific financial event detection task. We also create a domain-specific dictionary of event synonyms.
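    A rules-based "logic gate" labeller can be sketched as boolean conditions combined with AND/OR: an article gets an event label only when a company match AND an event-keyword match fire together, which is what filters noise articles out. The company names, event classes, and keyword rules below are all invented for illustration.

```python
# Hypothetical firms and event keyword rules (not from the paper).
COMPANIES = {"acme apparel", "zenith footwear"}
EVENT_KEYWORDS = {
    "store closure": {"closes", "closure", "shut"},
    "product recall": {"recall", "recalled"},
}

def annotate(article: str):
    """Label an article with every event whose keyword rule fires,
    gated (AND) on the article mentioning a tracked company."""
    text = article.lower()
    has_company = any(c in text for c in COMPANIES)
    return [event for event, kws in EVENT_KEYWORDS.items()
            if has_company and any(k in text for k in kws)]

labels = annotate("Acme Apparel closes 40 stores amid falling sales")
```

    Articles that trigger event keywords without a company match fall through unlabelled, which is the noise-filtering behaviour the abstract describes.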

    Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market

    Each day, a large amount of text data is generated. This data comes from various sources and may contain valuable information. In this article, we use text mining methods to discover whether there is a connection between news articles and changes of the S&P 500 stock index. The index values and documents were divided into time windows according to the direction of the index value changes. We achieved a classification accuracy of 65-74%.
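    The windowing step described above (segmenting the series by the direction of index changes) can be sketched as follows; the sample values are illustrative, and ties are arbitrarily counted as "down".

```python
def direction_windows(values):
    """Split an index series into (start, end, direction) runs of
    consecutive moves in the same direction; turning points belong
    to both adjacent windows."""
    if len(values) < 2:
        return []
    windows, start = [], 0
    cur = "up" if values[1] > values[0] else "down"
    for i in range(2, len(values)):
        d = "up" if values[i] > values[i - 1] else "down"
        if d != cur:
            windows.append((start, i - 1, cur))
            start, cur = i - 1, d
    windows.append((start, len(values) - 1, cur))
    return windows

wins = direction_windows([100, 102, 105, 103, 101, 104])
```

    Documents published inside each window are then pooled and used as the training text for that window's direction label.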