    Machine Learning and Alternative Data Analytics for Fashion Finance

    This dissertation investigates the application of Machine Learning, Natural Language Processing and computational finance to a novel area, Fashion Finance, and specifically to identifying investment opportunities within the Apparel industry using influential alternative data sources such as Instagram. Fashion investment is challenging because of the ephemeral nature of the industry and the difficulty it poses for investors who lack an understanding of how to analyze trend-driven consumer brands. Unstructured online data (e-commerce stores, social media, online blogs, news, etc.) introduce new opportunities for extracting investment signals. We focus on how trading signals can be generated from Instagram data and from events reported in news articles. Part of this research work was done in collaboration with Arabesque Asset Management. Farfetch, the online luxury retailer, and Living Bridge Private Equity provided industry advice.

    Research Datasets
    The datasets used for this research are collected from various sources and include the following types of data:
    - Financial data: daily stock prices of 50 U.S. and European Apparel and Footwear equities, daily U.S. Retail Trade and U.S. Consumer Non-Durables sector indices, Form 10-K reports.
    - Instagram data: daily Instagram profile follower counts for 11 fashion companies.
    - News data: 0.5 million news articles that mention the selected 50 equities.

    Research Experiments
    The thesis consists of the following studies:
    1. Relationship between Instagram Popularity and Stock Prices. This study investigates the link between changes in a company's popularity on Instagram (daily follower counts) and movements in its stock price and revenue. We use cross-correlation analysis to find whether signals derived from the follower data could help to infer a company's future financial performance. Two hypothetical trading strategies are designed to test whether changes in a company's Instagram popularity could improve returns. The hypotheses are tested with the Wilcoxon signed-rank test.
    2. Dynamic Density-based News Clustering. The aim of this study is twofold: 1) analyse the characteristics of relevant news event articles and how they differ from noisy or irrelevant news; 2) using these insights, design an unsupervised framework that clusters news articles and identifies event clusters without predefined parameters or expert knowledge. The framework incorporates the density-based clustering algorithm DBSCAN, where the clustering parameters are selected dynamically with a Gaussian Mixture Model and by maximizing the inter-cluster Information Entropy.
    3. ALGA: Automatic Logic Gate Annotator for Event Detection. We design a news classification model for detecting fashion events that are likely to impact a company's stock price. The articles are represented by the following text embeddings: TF-IDF, Doc2Vec and BERT (Transformer Neural Network). The study comprises two parts: 1) we design a domain-specific automatic news labelling framework, ALGA, which incorporates topic extraction (Latent Dirichlet Allocation) and clustering (DBSCAN) algorithms in addition to other filters to annotate the dataset; 2) using the labelled dataset, we train a Logistic Regression classifier for identifying financially relevant news. The model shows state-of-the-art results in the domain-specific financial event detection problem.

    Contribution to Science
    This research work presents the following contributions to science:
    - Introducing original work in Machine Learning and Natural Language Processing applications for analysing alternative data on ephemeral fashion assets.
    - Introducing new metrics to measure and track a fashion brand's popularity for investment decision making.
    - Designing the dynamic news event clustering framework, which finds event clusters of various sizes in news articles without predefined parameters.
    - Presenting the original Automatic Logic Gate Annotator framework (ALGA) for automatic labelling of news articles for the financial event detection task.
    - Designing the Apparel and Footwear news event classifier using the datasets generated by the ALGA framework and showing state-of-the-art performance in a domain-specific financial event detection task.
    - Building the Fashion Finance Dictionary, which contains 320 phrases related to various financially relevant events in the Apparel and Footwear industry.
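The cross-correlation analysis mentioned in study 1 of the abstract above can be illustrated with a minimal sketch. It is not the dissertation's code: the function names, the choice of Pearson correlation at each lag, and the idea of correlating the follower series at day t with the price series at day t + lag are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lagged_cross_correlation(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + lag] for lag = 0..max_lag.

    A high value at lag k suggests that moves in `leader` tend to be
    followed by similar moves in `follower` k steps later.
    """
    if len(leader) != len(follower):
        raise ValueError("series must have the same length")
    result = {}
    for lag in range(max_lag + 1):
        x = leader[: len(leader) - lag]  # drop the last `lag` observations
        y = follower[lag:]               # drop the first `lag` observations
        result[lag] = pearson(x, y)
    return result


def best_lag(correlations):
    """Lag with the largest correlation (hypothetical helper)."""
    return max(correlations, key=correlations.get)
```

In practice the inputs would be daily changes in follower counts and daily returns rather than raw levels, and any apparent lead would still need a significance test before being treated as a signal.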

    Good and bad events: Combining network-based event detection with sentiment analysis

    This is the final version. Available on open access from Springer via the DOI in this record.
    The huge volume and velocity of media content published on the Web present a substantial challenge to human analysts. In prior work, we developed a system (network event detection, NED) to assist analysts by detecting events within high-volume news streams in real time. NED can process a heterogeneous stream of news articles or social media user posts, combining text mining and network analysis to detect breaking news stories and generate an easy-to-understand event summary. In this paper, we expand the NED event detection and summarisation approach in two ways. First, we introduce a new approach to named entity disambiguation for tweets, which contain minimal information due to brevity. Second, we apply sentiment analysis techniques to documents associated with a detected event to characterise the event as either broadly ‘positive’ or ‘negative’ based on media portrayal. Our expansion focuses on Twitter streams since Twitter has become an important news dissemination platform and is often the site where emerging events are first seen. To test the extended methodology, we apply it here to three data sets related to political elections in the UK and the USA. The addition of sentiment analysis to the NED event detection methodology improves the insight gained by the user by allowing quick evaluation of the perceived impact of an event. This approach may have potential applications in domains where public sentiment is relevant to decision-making around events, such as financial markets and politics.
    Adarga Ltd.; Turing Institute; University of Exete

    A classification-based approach to economic event detection in Dutch news text

    Breaking news on economic events such as stock splits or mergers and acquisitions has been shown to have a substantial impact on the financial markets. As it is important to be able to automatically identify events in news items accurately and in a timely manner, we present in this paper proof-of-concept experiments for a supervised machine learning approach to economic event detection in newswire text. For this purpose, we created a corpus of Dutch financial news articles in which 10 types of company-specific economic events were annotated. We trained classifiers using various lexical, syntactic and semantic features. We obtain good results based on a basic set of shallow features, thus showing that this method is a viable approach for economic event detection in news text

    The Effects of Twitter Sentiment on Stock Price Returns

    Social media are increasingly reflecting and influencing the behavior of other complex systems. In this paper we investigate the relations between the well-known micro-blogging platform Twitter and financial markets. In particular, we consider, over a period of 15 months, the Twitter volume and sentiment about the 30 companies that form the Dow Jones Industrial Average (DJIA) index. We find a relatively low Pearson correlation and Granger causality between the corresponding time series over the entire time period. However, we find a significant dependence between the Twitter sentiment and abnormal returns during the peaks of Twitter volume. This is valid not only for the expected Twitter volume peaks (e.g., quarterly announcements), but also for peaks corresponding to less obvious events. We formalize the procedure by adapting the well-known "event study" from economics and finance to the analysis of Twitter data. The procedure makes it possible to automatically identify events as Twitter volume peaks, to compute the prevailing sentiment (positive or negative) expressed in tweets at these peaks, and finally to apply the "event study" methodology to relate them to stock returns. We show that sentiment polarity of Twitter peaks implies the direction of cumulative abnormal returns. The amount of cumulative abnormal returns is relatively low (about 1-2%), but the dependence is statistically significant for several days after the events.
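The two mechanical steps of the procedure described in this abstract, finding volume peaks and accumulating abnormal returns after them, can be sketched as below. This is an illustration under stated assumptions, not the paper's implementation: the peak rule (volume above the trailing mean plus k standard deviations) and the market-adjusted definition of abnormal return (stock return minus index return) are simplifications chosen here, and all function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def find_volume_peaks(volume, window=5, k=2.0):
    """Indices where volume exceeds trailing mean + k * trailing std.

    Assumed rule for illustration; the baseline uses the `window`
    observations immediately before each day.
    """
    peaks = []
    for t in range(window, len(volume)):
        baseline = volume[t - window:t]
        threshold = mean(baseline) + k * pstdev(baseline)
        if volume[t] > threshold:
            peaks.append(t)
    return peaks


def cumulative_abnormal_return(stock_ret, market_ret, event_day, horizon):
    """Sum of (stock return - market return) from event_day to event_day + horizon.

    Market-adjusted abnormal returns are an assumption made here; an event
    study may instead fit a market model on a pre-event estimation window.
    """
    last = min(event_day + horizon, len(stock_ret) - 1)
    return sum(stock_ret[t] - market_ret[t] for t in range(event_day, last + 1))
```

Relating the result to sentiment would then amount to comparing the sign of the cumulative abnormal return with the prevailing polarity of tweets at each detected peak.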

    News, liquidity dynamics and intraday jumps: evidence from the HUF/EUR market

    We study intraday jumps on a pure limit order FX market by linking them to news announcements and liquidity shocks. First, we show that jumps are frequent and contribute greatly to the return volatility. Nearly half of the jumps can be linked with scheduled and unscheduled news announcements. Furthermore, we show that jumps are information based, whether they are linked with news announcements or not. Prior to jumps, liquidity does not deviate from its normal level, nor do liquidity shocks offer any predictive power for jump occurrence. Jumps emerge not as a result of unusually low liquidity but rather as a result of an unusually high demand for immediacy concentrated on one side of the book. During and after the jump, a dynamic order placement process emerges: some participants endogenously become liquidity providers and absorb the increased demand for immediacy. We detect an interesting asymmetry and find the liquidity providers to be more reluctant to add liquidity when confronted with a news announcement around the jump. Further evidence shows that participants submit more limit orders relative to market orders after a jump. Consequently, the informational role of order flow becomes less pronounced in the thick order book after the jump

    Economic event detection in company-specific news text

    This paper presents a dataset and supervised classification approach for economic event detection in English news articles. Currently, the economic domain is lacking resources and methods for data-driven supervised event detection. The detection task is conceived as a sentence-level classification task for 10 different economic event types. Two different machine learning approaches were tested: a rich feature set Support Vector Machine (SVM) set-up and a word-vector-based long short-term memory recurrent neural network (RNN-LSTM) set-up. We show satisfactory results for most event types, with the linear kernel SVM outperforming the other experimental set-ups
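To make the idea of sentence-level classification with a linear SVM concrete, here is a toy sketch. It is far simpler than the rich-feature set-up the abstract describes: it uses plain bag-of-words counts, handles a single binary event type (event vs. no event) instead of 10 types, and trains with a basic subgradient method on the hinge loss. Every name, hyperparameter, and example sentence is an assumption for illustration.

```python
def featurize(sentence):
    """Bag-of-words counts over lowercased whitespace tokens."""
    counts = {}
    for token in sentence.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def score(weights, bias, feats):
    """Linear decision value w . x + b for a sparse feature dict."""
    return sum(weights.get(tok, 0.0) * val for tok, val in feats.items()) + bias


def train_linear_svm(sentences, labels, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1 or -1."""
    weights, bias = {}, 0.0
    data = [(featurize(s), y) for s, y in zip(sentences, labels)]
    for _ in range(epochs):
        for feats, y in data:
            margin = y * score(weights, bias, feats)
            # Regularization step: shrink every weight slightly.
            for tok in weights:
                weights[tok] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                for tok, val in feats.items():
                    weights[tok] = weights.get(tok, 0.0) + lr * y * val
                bias += lr * y
    return weights, bias


def predict(weights, bias, sentence):
    """Return +1 (event sentence) or -1 (no event)."""
    return 1 if score(weights, bias, featurize(sentence)) > 0 else -1
```

A multi-type version would typically train one such binary classifier per event type (one-vs-rest) and use a tested library implementation rather than this hand-rolled loop.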