4,082 research outputs found

    A Model-driven Visual Analytic Framework for Local Pattern Analysis

    The ultimate goal of any visual analytic task is to make sense of the data and gain insights. Unfortunately, discovering useful information is becoming more challenging as data scale grows: human cognitive capabilities remain constant, whereas the scale and complexity of data do not. Meanwhile, visual analytics relies largely on a human analyst in the loop, which poses a challenge to the traditional human-driven workflow. It is almost impossible to show every detail to the user while diving into a local region of the data to explain the phenomena hidden in it. For example, when exploring data subsets, it is important to determine which partitions of the data contain more important information. Determining the subset of features is also vital before further analysis. Furthermore, modeling these subsets of data locally can yield valuable findings but can also introduce bias. In this work, a model-driven visual analytic framework is proposed to help identify interesting local patterns from these three aspects. This dissertation tackles these subproblems in three topics: model-driven data exploration, model-driven feature analysis, and local model diagnosis. First, model-driven data exploration focuses on modeling subsets of data to identify the co-movement of time-series data within certain time partitions, which is an important application in domains such as medical science, finance, business, and engineering. Second, model-driven feature analysis discovers the important subset of interesting features while analyzing local feature similarities. Within a financial risk dataset collected by a domain expert, we discover that the feature correlations among different data partitions (i.e., small and large companies) are very different. Third, local model diagnosis provides a tool to identify interesting local regression models in local regions of the data space, which makes it possible for analysts to model the whole data space with a set of local models while knowing their strengths and weaknesses. The three tools provide an integrated solution for identifying interesting patterns within local subsets of data.

    Adaptive Algorithms For Classification On High-Frequency Data Streams: Application To Finance

    International Mention in the doctoral degree. In recent years, the problem of concept drift has gained importance in the financial domain. The succession of manias, panics and crashes has stressed the nonstationary nature of financial markets and the likelihood of drastic structural changes in them. The most recent literature suggests the use of conventional machine learning and statistical approaches for this. However, these techniques are unable or slow to adapt to non-stationarities and may require re-training over time, which is computationally expensive and brings financial risks. This thesis proposes a set of adaptive algorithms to deal with high-frequency data streams and applies these to the financial domain. We present approaches to handle different types of concept drift and perform predictions using up-to-date models. These mechanisms are designed to provide fast reaction times and are thus applicable to high-frequency data. The core experiments of this thesis are based on the prediction of the price movement direction at different intraday resolutions in the SPDR S&P 500 exchange-traded fund. The proposed algorithms are benchmarked against other popular methods from the data stream mining literature and achieve competitive results. We believe that this thesis opens good research prospects for financial forecasting during market instability and structural breaks. Results have shown that our proposed methods can improve prediction accuracy in many of these scenarios. Indeed, the results obtained are compatible with ideas against the efficient market hypothesis. However, we cannot claim that we can consistently beat buy and hold; therefore, we cannot reject that hypothesis. Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Gustavo Recio Isasi. Secretary: Pedro Isasi Viñuela. Member: Sandra García Rodrígue

    Analysis of Stock Portfolio with Global Economic Factors using Dynamic Data Modelling

    This article analyzes stock portfolios and worldwide economic challenges. This study uses APIs to retrieve data and analyze correlations, in order to determine and measure how economic indicators affect stock performance. This work introduces an autoencoder-based model to better comprehend the complex relationship between economic conditions and stock portfolio dynamics. The analysis begins with an API retrieval of a wide range of macroeconomic information. These indicators include global economic metrics like GDP, unemployment, CPI, federal funds rates, and treasury bill rates. After collection, data is carefully curated and prepared for analysis. This study uses correlation analysis to understand the relationship between economic variables and stock portfolio performance. This study explores how economic conditions affect stock prices and portfolio returns. This study seeks to discover trends, dependencies, and future issues that may affect investment decisions. This study also introduces an autoencoder-based neural network model to capture complex nonlinear relationships between economic variables and stock portfolio behavior. Deep learning improves interpretability and prediction, allowing a better understanding of the complex financial ecosystem dynamics. The inquiry provides valuable insights for investors, financial experts, and regulators. This study advances data-driven investment and risk management solutions. The autoencoder-based approach also reveals latent structures and hidden factors that affect stock portfolios. This novel approach opens new study options. In conclusion, this study provides a thorough stock portfolio analysis approach for global economic challenges. API data retrieval, correlation analysis, and a novel autoencoder model are used in this work to better understand the complicated relationships between economic indicators and financial markets. These insights can improve investment and policy decisions in a more integrated and dynamic global economy.

    Machine Learning Methods to Exploit the Predictive Power of Open, High, Low, Close (OHLC) Data

    Novel machine learning techniques are developed for the prediction of financial markets, with a combination of supervised, unsupervised and Bayesian optimisation machine learning methods shown to give a predictive power rarely previously observed. A new data mining technique named Deep Candlestick Mining (DCM) is proposed that is able to discover highly predictive, dataset-specific candlestick patterns (arrangements of open, high, low, close (OHLC) aggregated price data structures) which significantly outperform traditional candlestick patterns. The power that OHLC features can provide is further investigated, using LSTM RNNs and XGBoost trees, in the prediction of the directional change of the mid-price, defined here as the mid-point between either the open and close or the high and low of an OHLC bar. This target variable has been overlooked in the literature, which is surprising given the relative ease of predicting it, significantly in excess of noisier financial quantities. However, the true value of this quantity is only known upon the period's ending – i.e. it is an after-the-fact observation. To make use of and enhance the remarkable predictability of the mid-price directional change, multi-period predictions are investigated by training many LSTM RNNs (XGBoost trees being used to identify powerful OHLC input feature combinations), over different time horizons, to construct a Bayesian optimised trend prediction ensemble. This fusion of long-, medium- and short-term information results in a model capable of predicting market trend direction to greater than 70% better than random. A trading strategy is constructed to demonstrate how this predictive power can be used by exploiting an artefact of the LSTM RNN training process which allows the trading system to size and place trades in accordance with the ensemble's predictive certainty.
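
    As a rough illustration of the target variable described in this abstract, the sketch below computes a mid-price for each OHLC bar and labels the direction of change between consecutive bars. The abstract only defines the mid-price as the mid-point of either open/close or high/low; the comparison of consecutive bars, the +1/-1/0 encoding, and the names `mid_price` and `mid_price_direction` are assumptions made here for illustration, not the thesis's implementation.

```python
from typing import List, Tuple

# One OHLC bar as (open, high, low, close); this tuple layout is an assumption.
Bar = Tuple[float, float, float, float]


def mid_price(bar: Bar, use_high_low: bool = True) -> float:
    """Mid-point of high/low (default) or of open/close for a single bar."""
    open_, high, low, close = bar
    if use_high_low:
        return (high + low) / 2.0
    return (open_ + close) / 2.0


def mid_price_direction(bars: List[Bar], use_high_low: bool = True) -> List[int]:
    """Label each consecutive pair of bars with +1 (mid-price rose),
    -1 (fell) or 0 (unchanged). Assumed encoding, for illustration only."""
    mids = [mid_price(bar, use_high_low) for bar in bars]
    labels = []
    for previous, current in zip(mids, mids[1:]):
        if current > previous:
            labels.append(1)
        elif current < previous:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


# Hypothetical bars, not data from the thesis.
example_bars = [
    (10.0, 12.0, 8.0, 11.0),
    (11.0, 14.0, 10.0, 13.0),
    (13.0, 13.0, 9.0, 10.0),
]
directions = mid_price_direction(example_bars)
```

    Because the mid-price of a bar is only known once the bar has closed, each label here describes a completed period, which matches the abstract's remark that the quantity is an after-the-fact observation.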