2,727 research outputs found

    Six papers on computational methods for the analysis of structured and unstructured data in the economic domain

    This work investigates the application of computational methods to structured and unstructured data. The domains of application are two closely connected fields with the common goal of promoting the stability of the financial system: systemic risk and bank supervision. The work explores different families of models and applies them to different tasks: graphical Gaussian network models to address bank interconnectivity, topic models to monitor bank news, and deep learning for text classification. New applications and variants of these models are investigated, with particular attention to the combined use of textual and structured data. The penultimate chapter introduces a deep-learning-based sentiment polarity classification tool for Italian, intended to simplify future research relying on sentiment analysis. The different models have proven useful for leveraging numerical (structured) and textual (unstructured) data. Graphical Gaussian models and topic models have been adopted for inspection and descriptive tasks, while deep learning has been applied mainly to predictive (classification) problems. Overall, the integration of textual and numerical information has proven useful for analysis related to systemic risk and bank supervision: it has led either to higher predictive performance or to an enhanced capability of explaining phenomena and correlating them with other events.

    A Combined Approach for Extracting Financial Instrument-Specific Investor Sentiment from Weblogs

    Investor sentiment about future returns of financial instruments is a highly relevant source of information for investment managers and other stakeholders in the financial industry. Investor sentiments are abundant in financial blog texts. Making use of these sentiments constitutes a massive information management challenge, given the millions of blog articles with ever-changing and growing amounts of information that need to be acquired and interpreted. We propose a novel approach for extracting investor sentiment from blogs by combining machine learning at the document level with knowledge-based information extraction at the sentence level. The proposed artifact is a financial instrument-specific investor sentiment extraction method, which we apply to a set of blog articles. The evaluation suggests that the combined approach achieves higher precision than a standalone knowledge-based approach.

    Stock Prediction Based on Social Media Data via Sentiment Analysis: a Study on Reddit

    With the development of the internet and information technology, online text data has become available and accessible for research in many fields, including stock prediction. Social media, one of the biggest content generators on the internet, is a rich data resource for text mining and stock prediction. It has a large capacity, high data density, and fast information spread. This thesis analyzes the relationship between stock-related text on social media (Reddit) and the price changes of the corresponding stocks. Sentiment analysis is first applied to extract individual users’ emotions and opinions about the stocks. The extracted features are then analyzed via descriptive statistics and predictive analysis, using the Pearson correlation coefficient and machine learning models. The predictive analysis is designed to examine the dependence between the social media text data and stock price changes by evaluating the performance of predictions. Four indicators are used in the evaluation: “prediction accuracy on price change direction” and three indicators from simulated algorithmic trading experiments based on the prediction results, namely “total profit with trading strategy for single stock”, “daily profit efficiency of trading strategy” and “total profit with Portfolio trading strategy”. Compared with a Buy and Hold (B&H) baseline strategy, the predictions show good results in terms of “daily profit efficiency” and “total profit with Portfolio trading strategy”. The online forum text from Reddit is therefore shown to be correlated with future stock price changes, and incorporating its information into portfolio trading strategies might yield more profit than the B&H strategy.
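The evaluation ideas named in this abstract (Pearson correlation, accuracy on price change direction, and comparison with a Buy and Hold baseline) can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the "go long only when sentiment is positive" trading rule, and the input layout (one sentiment score per day paired with the following day's return) are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def direction_accuracy(sentiment, next_returns):
    """Share of days on which the sign of the sentiment score matches
    the sign of the following return (hypothetical indicator)."""
    hits = sum(1 for s, r in zip(sentiment, next_returns) if (s > 0) == (r > 0))
    return hits / len(sentiment)


def strategy_vs_buy_and_hold(sentiment, next_returns):
    """Total return of an assumed long-only sentiment rule and of Buy and Hold.

    Assumed rule: hold the stock on days with positive sentiment,
    stay in cash otherwise. Buy and Hold holds the stock every day.
    """
    strategy, buy_hold = 1.0, 1.0
    for s, r in zip(sentiment, next_returns):
        buy_hold *= 1 + r
        if s > 0:
            strategy *= 1 + r
    return strategy - 1, buy_hold - 1
```

Calling `pearson(scores, returns)` on aligned daily series gives the correlation, while `strategy_vs_buy_and_hold(scores, returns)` returns the two total returns as a pair, so the sentiment rule can be compared directly with the baseline.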

    Financial sentiment analysis of quarterly reports and stock performance

    This thesis examines the use of financial sentiment analysis for quarterly reports published by companies listed on the Oslo Stock Exchange (OSE). It also uses methods from computer science to transform financial reports from the raw PDF format into financial sentiment scores, and it discusses the relationship between predicted financial sentiment and stock performance for selected companies and industries. The thesis applies FinBERT, a well-known and recently developed language model for financial sentiment analysis, which is built upon the more general language model BERT. The motivation for the study is the increasing interest in machine learning and Natural Language Processing (NLP) for financial applications. Modern modeling techniques allow investors to make more informed decisions, and the rise of language modeling has made it possible to derive insight into people's opinions from news and social networks. However, only a minority of studies investigate the language of quarterly reports. Methodologically, quarterly reports from the first quarter of 2019 to the fourth quarter of 2021 are downloaded from the investor relations pages of the selected companies. The downloaded reports are the input to a data pipeline that extracts the text and predicts the financial sentiment using Python tools such as PDFMiner and the Transformers library. The predicted sentiment is then loaded into a pipeline for visualization and stock performance comparisons based on stock data downloaded with the open-source tool yfinance. The thesis concludes that extracting text from financial PDF files is feasible. Furthermore, the FinBERT model predicts financial sentiment with higher accuracy than the more general BERT model. However, the relationship between stock performance and predicted sentiment is not strong, despite individual differences. Additionally, the relationship is stronger for stock performance in the past. Nevertheless, this thesis demonstrates the value of domain-specific NLP for applications in the financial industry.

    Analysis of S&P500 using News Headlines Applying Machine Learning Algorithms

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. Financial risk is now part of everyone’s life, directly or indirectly affecting people’s daily lives, their decisions, and the consequences of those decisions. The financial system comprises all the companies that produce and sell, which makes them an essential factor. This study addresses the impact that people can have, through the news headlines they write, on companies’ stock prices. The index studied in this research is the S&P 500, which compiles the 500 biggest companies in the USA, and the study examines how the index can be affected by news articles written by humans at distinct and influential newspapers. Many people worldwide “play the game” of investing in stocks, winning or losing large amounts of money. This study also tries to understand how strongly this news and the index can be correlated. With the increasing amount of data available, computational power is needed to help process it all, and this is where machine learning methods can play a crucial role. It is therefore necessary to understand how these methods can be applied and how they can influence the final decision of the human who always asks the same question: can stock prices be predicted? Answering that requires first understanding the correlation between news articles, one of the elements able to affect stock prices, and the stock prices themselves. This study focuses on the correlation between news and the S&P 500.

    Predictive Analytics on Emotional Data Mined from Digital Social Networks with a Focus on Financial Markets

    This is a cumulative dissertation comprising five articles. User-Generated Content (UGC) makes up a substantial part of communication via social media. In this dissertation, UGC that carries and facilitates the exchange of emotions is referred to as “emotional data.” People “produce” emotional data, that is, they express their emotions via tweets, forum posts, blogs, and so on, or they “consume” it by being influenced by expressed sentiments, feelings, opinions, and the like. Decisions often depend on shared emotions and data, which in turn lead to new data because decisions may change behaviors or results. “Emotional Data Intelligence” ultimately seeks an answer to the question of how all the different emotions expressed in public online sources influence decision-making processes. The overarching research question of this dissertation is whether network structures and emotional sentiment data extracted from digital social networks contain predictive information or are just noise. The underlying data was collected from different social media sources, such as Twitter, blogs, message boards, and online news, and from social networking sites, such as Xing. Using the methodologies of social network analysis (SNA), sentiment analysis, and predictive analysis, the individual contributions of this dissertation study whether sentiment data from social media or online social networking structures can predict real-world behaviors. The focus lies on the analysis of emotional data and network structures and their predictive power for financial markets. Through the formal construction of the data analysis methodologies introduced in the individual contributions, this dissertation contributes to the theories of social network analysis, sentiment analysis, and predictive analytics.

    Evidence on Competitive Advantage and Superior Stock Market Performance

    This article analyzes the value-relevance of industry-based and resource-based competitive advantage in a large sample of firms listed on the Oslo Stock Exchange. We measure competitive advantage by a single variable and perform a new decomposition into its underlying sources. In 1986-2005, the industry-based and the resource-based competitive advantage explain more than 20% of abnormal stock market returns, accumulated over five years. The resource-based advantage is almost four times more important than the industry-based advantage. Differences in both the return and the risk capability of firms’ net assets relative to their industry peers are significant parts of the resource-based advantage, estimated at 60% and 40%, respectively. Keywords: competitive advantage; superior performance; value-relevance of performance metrics