Learning to explain causal rationale of stock price changes in financial reports

Abstract

Department of Computer Science and Engineering

When a critical event occurs, it is often necessary to provide an appropriate explanation. Several theoretical and empirical foundations for discovering causes and effects in temporal data have been established. For textual data, however, simple causality modeling is not enough to handle the variation in natural language. To address the challenges of textual causality modeling, we annotate and create a large causality text dataset, called "Causal Rationale of Stock Price Changes" (CR-SPC), for fine-tuning pre-trained language models. Our dataset includes 283K sentences from the 10-K annual reports of U.S. companies, together with sentence-level labels, from which we observe diverse patterns of causality for stock price changes in each industrial sector. Because of this diversity and an imbalance in training data across sectors, a BERT fine-tuning baseline trained on sector-only data shows biased performance. We propose to transfer from related sectors, implemented as a two-stage fine-tuning framework. The first stage fine-tunes on data from related sectors to overcome the limited training resources; the second stage then fine-tunes for the given sector. Our proposed framework yields significantly improved results for detecting causal rationale in industrial sectors with small amounts of data. Furthermore, we generate labels for 382K unlabeled sentences and augment the dataset by self-training on the CR-SPC dataset.
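The two-stage schedule described above can be illustrated with a minimal sketch. This is not the authors' implementation: a tiny bag-of-words logistic regression stands in for BERT, all function names (`featurize`, `predict_proba`, `fine_tune`, `two_stage_fine_tune`) are hypothetical, and the example sentences are invented toy data, not drawn from CR-SPC. The only point it shows is the ordering: train on related-sector data first, then continue from those weights on the target sector.

```python
import math
from collections import defaultdict


def featurize(sentence):
    """Bag-of-words features: lower-cased token counts (stand-in for a BERT encoder)."""
    counts = defaultdict(float)
    for token in sentence.lower().split():
        counts[token] += 1.0
    return counts


def predict_proba(weights, sentence):
    """Probability that the sentence states a causal rationale."""
    score = sum(weights.get(tok, 0.0) * val for tok, val in featurize(sentence).items())
    return 1.0 / (1.0 + math.exp(-score))


def fine_tune(weights, examples, epochs=50, lr=0.5):
    """Continue training from `weights` with SGD on (sentence, label) pairs."""
    weights = dict(weights)  # copy so the starting weights are left untouched
    for _ in range(epochs):
        for sentence, label in examples:
            error = label - predict_proba(weights, sentence)
            for tok, val in featurize(sentence).items():
                weights[tok] = weights.get(tok, 0.0) + lr * error * val
    return weights


def two_stage_fine_tune(related_sector_data, target_sector_data):
    """Stage 1: related sectors. Stage 2: the given (low-resource) sector."""
    stage1 = fine_tune({}, related_sector_data)
    return fine_tune(stage1, target_sector_data)


# Invented toy data (assumption, for illustration only): label 1 = causal rationale.
related = [
    ("revenue decreased due to lower demand", 1),
    ("the company is incorporated in delaware", 0),
    ("net income increased because of higher sales", 1),
    ("our fiscal year ends in december", 0),
]
target = [
    ("margins fell due to rising fuel costs", 1),
    ("we operate twelve regional offices", 0),
]

stage1_weights = fine_tune({}, related)
model = two_stage_fine_tune(related, target)
```

In this sketch the second stage starts from weights that already associate cue words such as "due" with causal sentences, which is the intuition behind transferring from related sectors when the target sector has few labeled examples.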
