
    Predicting volatility with Twitter sentiment: an application to the US stock market.

    This dissertation investigates whether individual tweets related to the S&P 500 Index can predict the volatility of future returns. A sample of 3,329,267 tweets containing the keyword “SPX” was collected for the period 2012 to 2021. We applied Principal Component Analysis (PCA) to reduce the dimensionality of the word-frequency data and then integrated the components into the Heterogeneous Autoregressive (HAR) model. We evaluated the in-sample and out-of-sample forecasting performance of various HAR-PCA models using different estimation-window schemes and compared them with the original HAR model. We found that HAR-PCA models generally outperform the HAR model, especially during periods of particularly high and low volatility. Our findings demonstrate the economic relevance of HAR-PCA models for portfolio investment and contribute to the literature by linking investor sentiment to return volatility using a word-based method, which avoids the complications of applying more advanced algorithms.
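    The HAR regression step described in the abstract can be sketched in plain Python. The feature construction below (daily lag plus weekly and monthly averages of realized volatility) is the standard HAR-RV setup and is only an illustrative assumption, since the abstract does not spell out the exact specification:

    ```python
    def har_features(rv, week=5, month=22):
        """Build standard HAR regressors from a realized-volatility series.

        For each day t, the predictors are yesterday's RV and the average RV
        over the previous week and month; the target is RV at day t.
        (The dissertation's sentiment PCA components would be appended
        as additional regressors.)
        """
        rows = []
        for t in range(month, len(rv)):
            rv_d = rv[t - 1]
            rv_w = sum(rv[t - week:t]) / week
            rv_m = sum(rv[t - month:t]) / month
            rows.append(((rv_d, rv_w, rv_m), rv[t]))
        return rows
    ```

    A constant series yields constant features and targets, which is a quick sanity check before fitting the regression by OLS.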

    Machine Learning-Driven Decision Making based on Financial Time Series

    The abstract is provided in the attachment.

    Signed path dependence in financial markets: Applications and implications

    Despite decades of studies, there is still no consensus on what type of serial dependence, if any, might be present in risky asset returns. The serial dependence structure in asset returns is complex and challenging to study: it varies over time, across observed time resolutions, by asset type, with liquidity and exchange, and even in its statistical structure. The focus of the work in this thesis is to capture a previously unexplored notion of serial dependence that is applicable to any asset class and can be treated parametrically or non-parametrically, depending on the preferred modelling approach. The aim of this research is to develop new approaches by providing a model-free definition of serial dependence based on how the sign of cumulative innovations over a given lookback horizon correlates with future cumulative innovations over a given forecast horizon. This concept is then theoretically validated on well-known time series model classes and used to build a predictive econometric model for future market returns, which is applied to empirical forecasting by means of a profit-seeking trading strategy. The empirical experiment revealed strong evidence of serial dependence in equity markets, statistically and economically significant even in the presence of trading costs. Subsequently, this thesis provides an empirical study of the prices of energy commodities, gold, and copper in the futures markets and demonstrates that, for these assets, the level of asymmetry of asset returns varies through time and can be forecast using past returns. A new time series model is proposed based on this phenomenon and is also empirically validated. The thesis concludes by embedding into option pricing theory the findings of previous chapters pertaining to the signed path dependence structure. 
    This is achieved by devising a model-free empirical risk-neutral distribution based on Polynomial Chaos Expansion and Stochastic Bridge Interpolators that includes information from the entire set of observable European call option prices under all available strikes and maturities for a given underlying asset, whilst the real-world measure includes the effects of serial dependence based on the sign of previous returns. The risk premium behaviour is subsequently inferred from the two distributions using the Radon-Nikodym derivative of the empirical risk-neutral distribution with respect to the modelled real-world distribution.
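    The model-free notion of signed path dependence can be illustrated with a minimal sketch: correlate the sign of the cumulative return over a lookback window with the cumulative return over the following forecast window. The exact estimator in the thesis may differ; this is only the core idea:

    ```python
    def signed_dependence(returns, lookback, horizon):
        """Average of sign(past cumulative return) * (future cumulative return).

        A positive value indicates momentum-like (trend-following) serial
        dependence; a negative value indicates mean reversion.
        """
        stats = []
        for t in range(lookback, len(returns) - horizon + 1):
            past = sum(returns[t - lookback:t])
            future = sum(returns[t:t + horizon])
            sign = (past > 0) - (past < 0)
            stats.append(sign * future)
        return sum(stats) / len(stats)
    ```

    On a steadily trending series the statistic is positive, while on a strictly alternating series it is negative, matching the momentum/mean-reversion interpretation.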

    Explaining Exchange Rate Forecasts with Macroeconomic Fundamentals Using Interpretive Machine Learning

    The complexity and ambiguity of financial and economic systems, along with frequent changes in the economic environment, have made it difficult to make precise predictions that are supported by theory-consistent explanations. Interpreting the prediction models used for forecasting important macroeconomic indicators is highly valuable for understanding relations among different factors, increasing trust in the prediction models, and making predictions more actionable. In this study, we develop a fundamentals-based model for the Canadian-U.S. dollar exchange rate within an interpretative framework. We propose a comprehensive approach using machine learning to predict the exchange rate and employ interpretability methods to accurately analyze the relationships among macroeconomic variables. Moreover, we implement an ablation study based on the output of the interpretations to improve the predictive accuracy of the models. Our empirical results show that crude oil, as Canada's main commodity export, is the leading factor that determines the exchange rate dynamics, with time-varying effects. The changes in the sign and magnitude of the contributions of crude oil to the exchange rate are consistent with significant events in the commodity and energy markets and the evolution of the crude oil trend in Canada. Gold and the TSX stock index are found to be the second and third most important variables that influence the exchange rate. Accordingly, this analysis provides trustworthy and practical insights for policymakers and economists and accurate knowledge about the predictive model's decisions, which are supported by theoretical considerations.
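    The attribution step (measuring how much each macro variable, such as crude oil, contributes to the forecast) can be approximated with permutation importance. The paper's actual interpretability method is not specified in the abstract, so the helper below is an illustrative stand-in rather than the study's procedure:

    ```python
    import random

    def permutation_importance(predict, X, y, feature, n_repeats=10, seed=0):
        """Mean increase in squared error when one feature column is shuffled.

        A large increase means the model relies on that feature; shuffling an
        irrelevant (here, constant) feature leaves the error unchanged.
        """
        rng = random.Random(seed)

        def mse(rows):
            return sum((predict(r) - yi) ** 2 for r, yi in zip(rows, y)) / len(y)

        base = mse(X)
        scores = []
        for _ in range(n_repeats):
            col = [row[feature] for row in X]
            rng.shuffle(col)
            shuffled = [row[:feature] + [v] + row[feature + 1:]
                        for row, v in zip(X, col)]
            scores.append(mse(shuffled) - base)
        return sum(scores) / n_repeats
    ```

    With a toy model that depends only on the first feature, the first feature's importance is positive and the second's is zero, which is the pattern the paper reports for crude oil versus minor variables.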

    Reconstructing Dynamical Systems From Stochastic Differential Equations to Machine Learning

    Modeling complex systems with large numbers of degrees of freedom has become a grand challenge over the past decades. Typically, only a few variables of complex systems are observed in terms of measured time series, while the majority of them, which potentially interact with the observed ones, remain hidden. Throughout this thesis, we tackle the problem of reconstructing and predicting the underlying dynamics of complex systems using different data-driven approaches. In the first part, we address the inverse problem of inferring an unknown network structure of complex systems, reflecting spreading phenomena, from observed event series. We study the pairwise statistical similarity between the sequences of event timings at all nodes through event synchronization (ES) and event coincidence analysis (ECA), relying on the idea that functional connectivity can serve as a proxy for structural connectivity. In the second part, we focus on reconstructing the underlying dynamics of complex systems from their dominant macroscopic variables using different stochastic differential equations (SDEs). We investigate the performance of three SDE approaches: the Langevin equation (LE), the generalized Langevin equation (GLE), and the empirical model reduction (EMR) approach. Our results reveal that the LE performs better for systems with weak memory, while it fails to reconstruct the underlying dynamics of systems with memory effects and colored-noise forcing. 
    In these situations, the GLE and EMR are more suitable candidates, since the interactions between observed and unobserved variables are accounted for in terms of memory effects. In the last part of this thesis, we develop a model based on the echo state network (ESN), combined with the past noise forecasting (PNF) method, to predict real-world complex systems. Our results show that the proposed model captures the crucial features of the underlying dynamics of climate variability.
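    For the Langevin-equation part, the drift can in principle be recovered from data by regressing observed increments on the state. The sketch below simulates an Ornstein-Uhlenbeck process and re-estimates its relaxation rate; it is an illustrative toy, not the thesis's reconstruction code:

    ```python
    import random

    def simulate_ou(theta, sigma, dt, n, seed=1):
        """Euler-Maruyama simulation of dx = -theta * x * dt + sigma * dW."""
        rng = random.Random(seed)
        x, path = 0.0, [0.0]
        for _ in range(n):
            x += -theta * x * dt + sigma * dt ** 0.5 * rng.gauss(0.0, 1.0)
            path.append(x)
        return path

    def estimate_theta(path, dt):
        """Least-squares drift estimate: regress increments dx on x (no intercept)."""
        num = sum((path[i + 1] - path[i]) * path[i] for i in range(len(path) - 1))
        den = sum(x * x for x in path[:-1])
        return -num / (den * dt)
    ```

    For a memoryless (Markovian) process like this, the estimate converges to the true rate; for systems with memory effects or colored noise, this simple LE-style regression is biased, which is the failure mode the thesis addresses with the GLE and EMR.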

    Computational Intelligence Applied to Financial Price Prediction: A State of the Art Review

    The following work reviews the most important research in computational intelligence applied to the financial price prediction problem. The article is organized as follows: the first section summarizes the role of predictability in the Neoclassical financial world and criticizes the zero-predictability framework. The second section presents the main computational intelligence techniques applied to financial price prediction. The third section describes common features of the reviewed works.

    Machine learning applied to active fixed-income portfolio management: a Lasso logit approach

    The use of quantitative methods constitutes a standard component of the institutional investors' portfolio management toolkit. In the last decade, several empirical studies have employed probabilistic or classification models to predict stock market excess returns, model bond ratings and default probabilities, and forecast yield curves. To the authors' knowledge, little research exists into their application to active fixed-income management. This paper contributes to filling this gap by comparing a machine learning algorithm, the Lasso logit regression, with a passive (buy-and-hold) investment strategy in the construction of a duration management model for high-grade bond portfolios, specifically focusing on US Treasury bonds. Additionally, a two-step procedure is proposed, together with a simple averaging of variables with similar statistical features, aimed at minimising the potential overfitting of traditional machine learning algorithms. A method to select thresholds that translate probabilities into signals based on conditional probability distributions is also introduced. A broad set of financial and economic variables is used to obtain a duration signal, and the resulting strategy is compared with alternative investment strategies. Most of the variables selected by the model relate to financial flows and economic fundamentals, although the parameters do not appear to be stable over time, suggesting that the relevance of the variables is dynamic and that continuous model evaluation is required. Moreover, the model achieves a statistically significant excess return over the passive strategy. 
    These results support the inclusion of quantitative tools in the active portfolio management process of institutional investors, with particular attention to potential overfitting and unstable parameters. Quantitative tools should be regarded as a complement to qualitative and fundamental analysis, together with the portfolio manager's experience, in order to make better-grounded investment decisions.
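    The threshold-selection step (turning predicted probabilities into duration signals) can be sketched as a grid search over the empirical distribution. The grid and the hit-rate criterion below are illustrative assumptions, not the paper's exact conditional-distribution procedure:

    ```python
    def probability_to_signal(probs, realized, grid=None):
        """Pick the threshold that maximises the historical hit rate, then map
        probabilities to +1 (lengthen duration) / -1 (shorten duration) signals.

        probs: historical predicted probabilities of a rate move
        realized: realized direction of the move (+1 / -1) for each period
        """
        grid = grid or [i / 20 for i in range(1, 20)]

        def hit_rate(th):
            signals = [1 if p >= th else -1 for p in probs]
            return sum(1 for s, r in zip(signals, realized) if s * r > 0) / len(realized)

        best = max(grid, key=hit_rate)
        return best, [1 if p >= best else -1 for p in probs]
    ```

    On a toy history where high probabilities coincide with positive moves, the search settles on a threshold separating the two regimes and reproduces the realized directions exactly.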