
    A Novel Distributed Representation of News (DRNews) for Stock Market Predictions

    In this study, a novel Distributed Representation of News (DRNews) model is developed and applied in deep learning-based stock market predictions. With the merit of integrating contextual information and cross-documental knowledge, the DRNews model creates news vectors that describe both the semantic information and potential linkages among news events through an attributed news network. Two stock market prediction tasks, namely short-term stock movement prediction and stock crises early warning, are implemented in the framework of the attention-based Long Short-Term Memory (LSTM) network. The results suggest that DRNews substantially enhances the performance of both tasks compared with five baseline news embedding models. Further, the attention mechanism suggests that short-term stock trends and stock market crises are both influenced by daily news, with the former responding more strongly to information related to the stock market per se, whilst the latter responds more to news on the banking sector and economic policies.

    An Ensemble Classifier for Stock Trend Prediction Using Sentence-Level Chinese News Sentiment and Technical Indicators

    In the financial market, predicting stock trends based on stock market news is a challenging task, and researchers are devoted to developing forecasting models. From the existing literature, the performance of the forecasting model is better when news sentiment and technical analysis are considered than when only one of them is used. However, analyzing news sentiment for trend forecasting is a difficult task, especially for Chinese news, because it is unstructured data and extracting the most important features is difficult. Moreover, positive or negative news does not always affect stock prices in a certain way. Therefore, in this paper, we propose an approach to build an ensemble classifier using sentiment in Chinese news at sentence level and technical indicators to predict stock trends. In the training stages, we first divide each news item into a set of sentences. TextRank and word2vec are then used to generate a predefined number of key sentences. The sentiment scores of these key sentences are computed using the given financial lexicon. The sentiment values of the key phrases, the three values of the technical indicators and the stock trend label are merged as a training instance. Based on the sentiment values of the key sets, the corpora are divided into positive and negative news datasets. The two datasets formed are then used to build positive and negative stock trend prediction models using the support vector machine. To increase the reliability of the prediction model, a third classifier is created using the Bollinger Bands. These three classifiers are combined to form an ensemble classifier. In the testing phase, a voting mechanism is used with the trained ensemble classifier to make the final decision based on the trading signals generated by the three classifiers. 
Finally, experiments were conducted on five years of news and stock prices of one company to show the effectiveness of the proposed approach. The results show that the accuracy and P/L ratio of the proposed approach, 61% and 4.0821 respectively, are better than those of the existing approach.
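
The final voting step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 20-period window, the band width of 2 standard deviations, the buy-below/sell-above rule, and the sign-of-sum vote are all assumptions chosen for illustration, and the two SVM classifiers are represented only by their output signals.

```python
from statistics import mean, pstdev


def bollinger_signal(prices, window=20, k=2.0):
    """Return +1 (buy), -1 (sell) or 0 (hold) for the latest price.

    Assumed rule: a close below the lower band is a buy signal and a
    close above the upper band is a sell signal.
    """
    if len(prices) < window:
        return 0  # not enough history to form the bands
    recent = prices[-window:]
    middle = mean(recent)
    width = k * pstdev(recent)
    last = prices[-1]
    if last < middle - width:
        return 1
    if last > middle + width:
        return -1
    return 0


def majority_vote(signals):
    """Combine trading signals by the sign of their sum (assumed scheme)."""
    total = sum(signals)
    return (total > 0) - (total < 0)


# Hypothetical signals from the positive-news and negative-news classifiers.
svm_positive_signal = 1
svm_negative_signal = -1
prices = [10.0] * 19 + [5.0]  # a sharp drop on the last day
decision = majority_vote(
    [svm_positive_signal, svm_negative_signal, bollinger_signal(prices)]
)
```

With these hypothetical inputs the two SVM signals cancel out, so the Bollinger Bands signal decides the vote.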

    Recent Advances in Stock Market Prediction Using Text Mining: A Survey

    Market prediction offers great profit avenues and is a fundamental stimulus for most researchers in this area. To predict the market, most researchers use either technical or fundamental analysis. Technical analysis focuses on analyzing the direction of prices to predict future prices, while fundamental analysis depends on analyzing unstructured textual information like financial news and earnings reports. More and more valuable market information has now become publicly available online. This highlights the significance of text mining strategies for extracting information to analyze market behavior. While many papers have reviewed prediction techniques based on technical analysis methods, papers that concentrate on the use of text mining methods are scarce. In contrast to other current review articles, which concentrate on discussing the many methods used for forecasting the stock market, this study aims to compare many machine learning (ML) and deep learning (DL) methods used for sentiment analysis to find which method could be more effective in prediction and for which types and amounts of data. The study also clarifies recent research findings and their potential future directions by giving a detailed analysis of the textual data processing and future research opportunities for each reviewed study.

    Recent Advances in Social Data and Artificial Intelligence 2019

    The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts of, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace.

    Lagged correlation networks

    Technological advances have provided scientists with large high-dimensional datasets that describe the behaviors of complex systems: from the statistics of energy levels in complex quantum systems, to the time-dependent transcription of genes, to price fluctuations among assets in a financial market. In this environment, where it may be difficult to infer the joint distribution of the data, network science has flourished as a way to gain insight into the structure and organization of such systems by focusing on pairwise interactions. This work focuses on a particular setting, in which a system is described by multivariate time series data. We consider time-lagged correlations among elements in this system, in such a way that the measured interactions among elements are asymmetric. Finally, we allow these interactions to be characteristically weak, so that statistical uncertainties may be important to consider when inferring the structure of the system. We introduce a methodology for constructing statistically validated networks to describe such a system, extend the methodology to accommodate interactions with a periodic component, and show how consideration of bipartite community structures in these networks can aid in the construction of robust statistical models. An example of such a system is a financial market, in which high frequency returns data may be used to describe contagion, or the spreading of shocks in price among assets. These data provide the experimental testing ground for our methodology. We study NYSE data from both the present day and one decade ago, examine the time scales over which the validated lagged correlation networks exist, and relate differences in the topological properties of the networks to an increasing economic efficiency. We uncover daily periodicities in the validated interactions, and relate our findings to explanations of the Epps Effect, an empirical phenomenon of financial time series. 
We also study bipartite community structures in networks composed of market returns and news sentiment signals for 40 countries. We compare the degrees to which markets anticipate news, and news anticipates markets, and use the community structures to construct a recommender system for inputs to prediction models. Finally, we complement this work with novel investigations of the exogenous news items that may drive the financial system using topic models. This includes an analysis of how investors and the general public may interact with these news items using Internet search data, and how the diversity of stories in the news both responds to and influences market movements.
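
The asymmetric, time-lagged correlations this abstract refers to can be illustrated with a short Python sketch. It computes only the raw lagged Pearson correlation matrix; the statistical validation step that the thesis builds on top of it is omitted, and the function names and the toy series are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(series, lag=1):
    """Return C where C[i][j] = corr(series[i] at t, series[j] at t + lag).

    lag must be at least 1. In general C[i][j] != C[j][i], which is what
    gives the resulting network its direction.
    """
    n = len(series)
    return [
        [pearson(series[i][:-lag], series[j][lag:]) for j in range(n)]
        for i in range(n)
    ]


# Toy data: y repeats x one step later, so x "leads" y.
x = [1, 3, 2, 5, 4, 6]
y = [0, 1, 3, 2, 5, 4]
C = lagged_correlation([x, y], lag=1)
```

In this toy example the entry `C[0][1]` (x leading y) is larger than `C[1][0]` (y leading x), so a directed edge would point from x to y.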

    4th. International Conference on Advanced Research Methods and Analytics (CARMA 2022)

    Research methods in economics and social sciences are evolving with the increasing availability of Internet and Big Data sources of information. As these sources, methods, and applications become more interdisciplinary, the 4th International Conference on Advanced Research Methods and Analytics (CARMA) is a forum for researchers and practitioners to exchange ideas and advances on how emerging research methods and sources are applied to different fields of social sciences as well as to discuss current and future challenges. Due to the COVID pandemic, CARMA 2022 is planned as a simultaneous virtual and face-to-face conference.
    Doménech I De Soria, J.; Vicente Cuervo, MR. (2022). 4th. International Conference on Advanced Research Methods and Analytics (CARMA 2022). Editorial Universitat Politècnica de València. https://doi.org/10.4995/CARMA2022.2022.1595

    Prediction of Stock Prices, Base Interest Rates, and Spreads Using Text Data

    Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Industrial Engineering, February 2018. 조성준.
    Methodologies in financial research based on a variety of prediction models have been actively developed for the analysis of market behaviors. The significance of prediction modeling in the financial market cannot be overemphasized, especially given that it leads directly to large transaction profits. To be applicable for the active agents in the market, these research results require both predictability and interpretability. In this study, we propose methodologies suitable for incorporating the distinct characteristics of different financial data into the analysis for the purpose of effective prediction modeling. First, we propose a methodology that quantitatively and qualitatively predicts stock price movements through sentiment analysis of corporate disclosures in the stock market. The proposed method predicts stock price movements by embedding the documents, and the classes of documents defined to fit the purpose of our study, in the same projection space based on the learned distributed representations, and compares the predictive performance against various existing models. The results provide evidence of the effectiveness of our predictions through visualization of document sentiments. In addition, we propose a methodology specifically designed for predicting the vote results on the base interest rate, which is the most important factor in the bond market, developed within the premise of the Korean bond market. Our methodology computes sentence sentiments from the monetary policy decision recorded as text data, which is released before the announcement of the vote result; these are then aggregated to the document level to express the document sentiment of the monetary policy decision as values. Using these sentiments, we predict the vote results of the base rate.
    Finally, we define a framework for predicting the spread, the difference between two bond rates with different maturities. The framework mainly considers the following three aspects as the standards for the effectiveness of research: interpretability, proper prediction metrics, and the reporting methods. The framework uses wrapper approaches for the practical interpretation of important variables, while using PARE, in combination with MAE, as prediction metrics, to take into account the tolerance of the spread. Lastly, we suggest various visualizations and hierarchical illustrations of significant variables as more applicable and effective reporting methods. This dissertation defines a variety of financial problems, proposes analytical methodologies, compares quantitative prediction power, and provides qualitative evidence. The proposed methodologies prove to serve as a quick and accurate data-driven decision-making support tool for active agents in real-world settings.
    Contents:
    Chapter 1 Introduction: 1.1 Financial Markets; 1.2 Data-driven Decision Making; 1.3 Outlook of this Dissertation
    Chapter 2 Literature Review: 2.1 Financial Predictability Modeling; 2.2 Financial Interpretability Modeling; 2.3 Data-driven Modeling Techniques
    Chapter 3 Prediction of Stock Price through Sentiment Analysis of Corporate Disclosures: 3.1 Background; 3.2 Proposed Method (3.2.1 Distributed Representation; 3.2.2 Visualization; 3.2.3 Model-based Prediction); 3.3 Experimental Results (3.3.1 Data Descriptions; 3.3.2 Experimental Settings; 3.3.3 Quantitative Prediction; 3.3.4 Qualitative Prediction); 3.4 Summary
    Chapter 4 Predicting the Korean Monetary Policy Committees Vote Results with Monetary Policy Decision Text: 4.1 Background; 4.2 Proposed Method (4.2.1 Sentence Representation; 4.2.2 Prediction Models of Sentence Sentiment; 4.2.3 Aggregation of Sentence Sentiment); 4.3 Experimental Results (4.3.1 Data Descriptions; 4.3.2 Sentence Sentiment Prediction of a Monetary Policy Decision; 4.3.3 Vote Result Prediction); 4.4 Summary
    Chapter 5 Modeling the 3-10 Year Spreads with Economic Indicators: 5.1 Background; 5.2 Proposed Method (5.2.1 Preprocessing; 5.2.2 Prediction Models; 5.2.3 Feature Selection; 5.2.4 Evaluation; 5.2.5 Reporting); 5.3 Experimental Results (5.3.1 Data Descriptions; 5.3.2 Experimental Settings; 5.3.3 Spread Prediction); 5.4 Summary
    Chapter 6 Conclusion: 6.1 Contributions; 6.2 Future Work
    Bibliography
    Abstract in Korean

    Robustness, Heterogeneity and Structure Capturing for Graph Representation Learning and its Application

    Graph neural networks (GNNs) are potent methods for graph representation learning (GRL), which extract knowledge from complicated (graph) structured data in various real-world scenarios. However, GRL still faces many challenges. Firstly, GNN-based node classification may deteriorate substantially by overlooking the possibility of noisy data in graph structures, as models wrongly process the relations among nodes in the input graphs as the ground truth. Secondly, nodes and edges have different types in the real world, and it is essential to capture this heterogeneity in graph representation learning. Next, relations among nodes are not restricted to pairwise relations, and it is necessary to capture the complex relations accordingly. Finally, the absence of structural encodings, such as positional information, deteriorates the performance of GNNs. This thesis proposes novel methods to address the aforementioned problems: 1. Bayesian Graph Attention Network (BGAT): Developed for situations with scarce data, this method addresses the influence of spurious edges. Incorporating Bayesian principles into the graph attention mechanism enhances robustness, leading to competitive performance against benchmarks (Chapter 3). 2. Neighbour Contrastive Heterogeneous Graph Attention Network (NC-HGAT): By enhancing a cutting-edge self-supervised heterogeneous graph neural network model (HGAT) with neighbour contrastive learning, this method addresses heterogeneity and uncertainty simultaneously. Extra attention to edge relations in heterogeneous graphs also aids in subsequent classification tasks (Chapter 4). 3. A novel ensemble learning framework is introduced for predicting stock price movements. It adeptly captures both group-level and pairwise relations, leading to notable advancements over the existing state-of-the-art. The integration of hypergraph and graph models, coupled with the utilisation of auxiliary data via GNNs before a recurrent neural network (RNN), provides a deeper understanding of long-term dependencies between similar entities in multivariate time series analysis (Chapter 5). 4. A novel framework for graph structure learning is introduced, segmenting graphs into distinct patches. By harnessing the capabilities of transformers and integrating other position encoding techniques, this approach robustly captures intricate structural information within a graph. This results in a more comprehensive understanding of its underlying patterns (Chapter 6).

    Machine Learning

    Machine learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience.