4,082 research outputs found
A Model-driven Visual Analytic Framework for Local Pattern Analysis
The ultimate goal of any visual analytic task is to make sense of the data and gain insights. Unfortunately, discovering useful information is becoming more challenging as data scale grows: human cognitive capabilities remain constant whereas the scale and complexity of data do not. Meanwhile, visual analytics relies largely on a human analyst in the loop, which poses a challenge to the traditional human-driven workflow. It is almost impossible to show every detail to the user while diving into a local region of the data to explain phenomena hidden there. For example, while exploring data subsets, it is important to determine which partitions of the data contain more important information. Determining the subset of features is also vital before any further analysis. Furthermore, modeling these subsets of data locally can yield valuable findings but also introduces bias. In this work, a model-driven visual analytic framework is proposed to help identify interesting local patterns from these three aspects. This dissertation tackles these subproblems under three topics: model-driven data exploration, model-driven feature analysis and local model diagnosis. First, model-driven data exploration focuses on modeling subsets of data to identify the co-movement of time-series data within certain time partitions, an important application in domains such as medical science, finance, business and engineering. Second, model-driven feature analysis discovers the important subset of interesting features while analyzing local feature similarities. Within the financial risk dataset collected by a domain expert, we discover that the feature correlations among different data partitions (i.e., small and large companies) are very different.
Third, local model diagnosis provides a tool to identify interesting local regression models in local regions of the data space, which makes it possible for analysts to model the whole data space with a set of local models while knowing their strengths and weaknesses. The three tools provide an integrated solution for identifying interesting patterns within local subsets of data
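The observation that feature correlations differ between data partitions can be illustrated with a minimal sketch. It is not the dissertation's framework: the function names, the row layout and the `size`, `leverage` and `risk` fields are all hypothetical, and plain Pearson correlation is assumed as the similarity measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_by_partition(rows, partition_key, feature_a, feature_b):
    """Group rows by partition label and correlate two features per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[partition_key], []).append(row)
    return {
        label: pearson(
            [r[feature_a] for r in members],
            [r[feature_b] for r in members],
        )
        for label, members in groups.items()
    }


# Hypothetical toy data: the two features move together for small
# companies and in opposite directions for large ones.
rows = [
    {"size": "small", "leverage": 1.0, "risk": 2.0},
    {"size": "small", "leverage": 2.0, "risk": 4.0},
    {"size": "small", "leverage": 3.0, "risk": 6.0},
    {"size": "large", "leverage": 1.0, "risk": 6.0},
    {"size": "large", "leverage": 2.0, "risk": 4.0},
    {"size": "large", "leverage": 3.0, "risk": 2.0},
]
result = correlation_by_partition(rows, "size", "leverage", "risk")
```

On this toy data the two partitions show opposite correlations, which a single correlation computed over all rows would hide.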
Adaptive Algorithms For Classification On High-Frequency Data Streams: Application To Finance
International Mention in the doctoral degree.
In recent years, the problem of concept drift has gained importance in the financial
domain. The succession of manias, panics and crashes has stressed the nonstationary
nature and the likelihood of drastic structural changes in financial markets.
The most recent literature suggests the use of conventional machine learning and statistical
approaches for this. However, these techniques are unable or slow to adapt
to non-stationarities and may require re-training over time, which is computationally
expensive and brings financial risks.
This thesis proposes a set of adaptive algorithms to deal with high-frequency data
streams and applies these to the financial domain. We present approaches to handle
different types of concept drifts and perform predictions using up-to-date models.
These mechanisms are designed to provide fast reaction times and are thus applicable
to high-frequency data. The core experiments of this thesis are based on the prediction
of the price movement direction at different intraday resolutions in the SPDR S&P 500
exchange-traded fund. The proposed algorithms are benchmarked against other popular
methods from the data stream mining literature and achieve competitive results.
We believe that this thesis opens good research prospects for financial forecasting
during market instability and structural breaks. Results have shown that our proposed
methods can improve prediction accuracy in many of these scenarios. Indeed, the
results obtained are compatible with ideas against the efficient market hypothesis.
However, we cannot claim that we can consistently beat buy and hold; therefore, we
cannot reject that hypothesis.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. Chair: Gustavo Recio Isasi. Secretary: Pedro Isasi Viñuela. Member: Sandra García Rodrígue
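The idea of predicting with an up-to-date model under concept drift can be sketched as follows. This is not one of the thesis's algorithms: it assumes a simple majority vote over a sliding window of recent price directions (+1 up, -1 down), evaluated in the test-then-train (prequential) manner common in data stream mining. The class and function names and the window sizes are hypothetical.

```python
from collections import deque


class SlidingWindowDirectionPredictor:
    """Predicts the majority direction seen in the most recent observations."""

    def __init__(self, window_size=5):
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        # deque with maxlen silently forgets the oldest observation,
        # which is what lets the model follow a drifting stream.
        self.window = deque(maxlen=window_size)

    def predict(self):
        if not self.window:
            return 1  # arbitrary default before any data has been seen
        ups = sum(1 for d in self.window if d > 0)
        return 1 if 2 * ups >= len(self.window) else -1

    def update(self, direction):
        self.window.append(direction)


def prequential_accuracy(directions, window_size=5):
    """Test-then-train: predict each item first, then learn from it."""
    model = SlidingWindowDirectionPredictor(window_size)
    hits = 0
    for d in directions:
        if model.predict() == d:
            hits += 1
        model.update(d)
    return hits / len(directions)


# Hypothetical stream with an abrupt drift: ten up-moves, then ten down-moves.
stream = [1] * 10 + [-1] * 10
short_window = prequential_accuracy(stream, window_size=3)
long_window = prequential_accuracy(stream, window_size=10)
```

On this stream the short window recovers from the drift after fewer mistakes than the long one. That faster reaction costs stability on noisy data, a trade-off the sketch does not explore.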
A survey of simulation techniques in commerce and defence
Despite the developments in Modelling and Simulation (M&S) tools and techniques over the past years, there has been a gap in M&S research and practice in healthcare: no toolkit exists to help modellers and simulation practitioners select an appropriate set of techniques. This study is a preliminary step towards this goal. This paper presents some results from a systematic literature survey on applications of M&S in the commerce and defence domains that could inspire improvements in healthcare. Interim results show that in the commercial sector Discrete-Event Simulation (DES) has been the most widely used technique, with System Dynamics (SD) in second place. In the defence sector, however, SD has gained relatively more attention. SD has been found quite useful for the analysis of qualitative and soft factors. From both surveys, it is clear that there is a growing trend towards using hybrid M&S approaches
Analysis of Stock Portfolio with Global Economic Factors using Dynamic Data Modelling
This article analyzes stock portfolios and worldwide economic challenges. This study uses APIs to retrieve data and analyze correlations in order to determine and measure how economic indicators affect stock performance. This work introduces an autoencoder-based model to better comprehend the complex relationship between economic conditions and stock portfolio dynamics. The analysis begins with an API retrieval of a wide range of macroeconomic information. These indicators include global economic metrics like GDP, unemployment, CPI, federal funds rates, and treasury bill rates. After collection, data is carefully curated and prepared for analysis. This study uses correlation analysis to understand the relationship between economic variables and stock portfolio performance. This study explores how economic conditions affect stock prices and portfolio returns. This study seeks to discover trends, dependencies, and future issues that may affect investment decisions. This study also introduces an autoencoder-based neural network model to capture complex nonlinear relationships between economic variables and stock portfolio behavior. Deep learning improves interpretability and prediction, allowing a better understanding of the complex financial ecosystem dynamics. The inquiry provides valuable insights for investors, financial experts, and regulators. This study advances data-driven investment and risk management solutions. The autoencoder-based approach also reveals latent structures and hidden factors that affect stock portfolios. This novel approach opens new study options. In conclusion, this study provides a thorough stock portfolio analysis approach for global economic challenges. API data retrieval, correlation analysis, and a novel autoencoder model are used in this work to better understand the complicated relationships between economic indicators and financial markets. These insights can improve investment and policy decisions in a more integrated and dynamic global economy
Machine Learning Methods to Exploit the Predictive Power of Open, High, Low, Close (OHLC) Data
Novel machine learning techniques are developed for the prediction of financial markets, with a combination of supervised, unsupervised and Bayesian optimisation machine learning methods shown to give predictive power rarely observed previously. A new data mining technique named Deep Candlestick Mining (DCM) is proposed that is able to discover highly predictive dataset-specific candlestick patterns (arrangements of open, high, low, close (OHLC) aggregated price data structures) which significantly outperform traditional candlestick patterns. The power that OHLC features can provide is further investigated, using LSTM RNNs and XGBoost trees, in the prediction of a directional change in the mid-price, defined here as the mid-point between either the open and close or the high and low of an OHLC bar. This target variable has been overlooked in the literature, which is surprising given the relative ease of predicting it, significantly in excess of noisier financial quantities. However, the true value of this quantity is only known upon the period's ending – i.e. it is an after-the-fact observation. To make use of and enhance the remarkable predictability of the mid-price directional change, multi-period predictions are investigated by training many LSTM RNNs (XGBoost trees being used to identify powerful OHLC input feature combinations), over different time horizons, to construct a Bayesian-optimised trend prediction ensemble. This fusion of long-, medium- and short-term information results in a model capable of predicting market trend direction to greater than 70% better than random. A trading strategy is constructed to demonstrate how this predictive power can be used by exploiting an artefact of the LSTM RNN training process which allows the trading system to size and place trades in accordance with the ensemble's predictive certainty
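The mid-price target described in the abstract can be made concrete with a short sketch. The two mid-point definitions follow the abstract. Everything else is an assumption for illustration: the function names, the dictionary layout of a bar, and the reading of "directional change" as the sign of the difference between consecutive bars' mid-prices.

```python
def mid_price(bar, kind="hl"):
    """Mid-point of an OHLC bar: high/low ("hl") or open/close ("oc")."""
    if kind == "hl":
        return (bar["high"] + bar["low"]) / 2
    if kind == "oc":
        return (bar["open"] + bar["close"]) / 2
    raise ValueError("kind must be 'hl' or 'oc'")


def mid_price_directions(bars, kind="hl"):
    """Sign of the mid-price change between consecutive bars.

    Returns +1 for a rise, -1 for a fall and 0 for no change. The result
    has one element fewer than `bars`. Each label depends on the completed
    bar, so it is only known once that bar's period has ended.
    """
    mids = [mid_price(bar, kind) for bar in bars]
    directions = []
    for previous, current in zip(mids, mids[1:]):
        if current > previous:
            directions.append(1)
        elif current < previous:
            directions.append(-1)
        else:
            directions.append(0)
    return directions


# Hypothetical bars, for illustration only.
bars = [
    {"open": 10.0, "high": 12.0, "low": 9.0, "close": 11.0},
    {"open": 11.0, "high": 14.0, "low": 10.0, "close": 13.0},
    {"open": 13.0, "high": 13.0, "low": 8.0, "close": 9.0},
]
hl_directions = mid_price_directions(bars, kind="hl")
oc_directions = mid_price_directions(bars, kind="oc")
```

Labels of this kind could serve as the classification target for a model such as those named in the abstract; the sketch covers only the target construction, not the prediction.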
- …