101 research outputs found

    Deep learning neural networks based algorithmic trading strategy for colombian financial market using tick by tick and order book data

    This work presents an innovative and highly competitive Algorithmic Trading (AT) strategy based on a Convolutional Neural Network price direction predictor that uses High Frequency (HF) transaction and Limit Order Book (LOB) data. The data come from the US and Colombian markets. Data processing covers more than 5 million raw data files for 21 stocks from different industries (energy, finance, technology, construction, among others). Because the data come from two different sources (transactions and the LOB), feature engineering is needed to homogenize the inputs. For transaction data, an image-like representation, the Gramian Angular Field (GAF), is used. It converts Financial Time Series (FTS) to polar coordinates and creates a kernel based on cosine differences. Additionally, this work proposes a transformation for LOB data. This representation includes all available information derived from the raw LOB data and creates an image-like representation of the LOB. These two sources feed into a proposed 3D Convolutional Neural Network (3D-CNN) architecture that generates price direction predictions. These predictions serve as a trading signal generator for two algorithmic trading strategies. Both take real market constraints into consideration, such as liquidity provision and transaction costs, and the two strategies operate under different risk-aversion constraints. The proposed 3D-CNN predictor achieves between 70% and 74% Directional Accuracy (DA), while reducing model parameters and making inputs time invariant. Moreover, the trading results illustrate that the proposed CNN predictor can lead to profitable trades and liquidity improvement in the Colombian market. Under different take-profit, stop-loss, and transaction-cost constraints, both the aggressive and the conservative strategy produce positive returns over the same period on Colombian market data. In addition, the number of trades performed by the aggressive strategy helps to explain how AT may positively affect liquidity provision in developing financial markets.
    Resumen: This work presents two algorithmic trading strategies based on an innovative and highly competitive convolutional-network method for predicting price direction in high-frequency financial time series, using both the order book and transactions. The information used includes data from the US and Colombian markets. More than five million files were processed, with information on 21 stocks from different sectors (energy, finance, technology, construction, among others). The input includes two different data sources (transactions and the order book), so feature engineering is needed to homogenize it. For transaction data, an image-based representation was used, with a transformation known as the Gramian Angular Field (GAF). It converts a time series to polar coordinates and creates a kernel based on cosine differences. In addition, this work proposes a transformation of the order book. This representation includes all available order book information and turns it into an image. The represented information is passed to a proposed convolutional network architecture, which generates price direction predictions. The predictions serve as trading signals for two algorithmic trading strategies. Both include real market constraints, such as liquidity levels and transaction costs. The two proposed strategies operate under different risk conditions. The proposed convolutional network achieves between 70% and 74% directional accuracy, while reducing the model parameters and making the inputs time invariant. Additionally, the results of the trading strategies illustrate that the convolutional predictor can lead to profits and liquidity improvements in the Colombian market. Under different take-profit, stop-loss, and transaction-cost conditions, both the aggressive and the conservative strategy reported positive returns over the same period. Additionally, the aggressive strategy helps to explain the positive impact on liquidity in emerging financial markets. Doctorad
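
The Gramian Angular Field step described in the abstract above can be illustrated with a small sketch. It assumes the common summation variant, where a series is rescaled to [-1, 1], mapped to angles with arccos, and turned into a matrix of cos(phi_i + phi_j). The abstract does not say which GAF variant or rescaling the thesis uses, so the function names and the sample prices here are hypothetical.

```python
import math

def rescale(series):
    """Min-max rescale a series to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]

def gramian_angular_field(series):
    """Return the summation-type GAF matrix G[i][j] = cos(phi_i + phi_j)."""
    scaled = rescale(series)
    # Clamp to guard against tiny floating-point overshoot outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# Illustrative prices only (not data from the thesis).
prices = [10.0, 10.5, 10.2, 11.0]
image = gramian_angular_field(prices)  # 4 x 4 symmetric matrix
```

The resulting matrix is symmetric and has one row and column per time step, which is what makes it usable as an image-like input to a convolutional network.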

    Forex Trading Signal Extraction with Deep Learning Models

    The rise of AI technology has popularized deep learning models for financial trading prediction, promising substantial profits with minimal risk. Institutions like Westpac, Commonwealth Bank of Australia, Macquarie Bank, and Bloomberg invest heavily in this technology. Researchers have also explored AI's potential in the exchange rate market. This thesis focuses on developing deep learning models for accurate forex market prediction and AI-powered trading strategies. Three deep learning models are introduced: an event-driven LSTM model, an attention-based VGG16 model named MHATTN-VGG16, and a pre-trained model called TradingBERT. These models aim to enhance signal extraction and price forecasting in forex trading, offering valuable insights for decision-making. The first model, an LSTM, predicts retracement points, which are crucial for identifying trend reversals. It outperforms baseline models such as GRU and RNN, thanks to noise reduction in the training data. Experiments determine the optimal number of timesteps for trend identification, showing promise for building a robotic trading platform. The second model, MHATTN-VGG16, predicts maximum and minimum price movements in forex chart images. It combines VGG16 with multi-head attention and positional encoding to classify financial chart images effectively. The third model uses a pre-trained BERT architecture to transform trading price data into normalized embeddings, enabling meaningful signal extraction from financial data. This study pioneers the use of pre-trained models in financial trading and introduces a method for converting continuous price data into categorized elements, leveraging the success of BERT. This thesis contributes innovative approaches to deep learning in algorithmic trading, offering traders and investors precision and confidence in navigating financial markets.
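
The abstract above mentions converting continuous price data into categorized elements so that a BERT-style model can consume them, but it does not describe the method. The sketch below shows one simple way such a conversion could work, by binning period-to-period returns into a small vocabulary. The bin edges, token names, and sample prices are all hypothetical and are not taken from the thesis.

```python
from bisect import bisect_right

# Hypothetical return thresholds and token vocabulary (assumptions).
EDGES = [-0.01, -0.001, 0.001, 0.01]
TOKENS = ["DOWN_BIG", "DOWN", "FLAT", "UP", "UP_BIG"]

def returns(prices):
    """Simple returns between consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def tokenize(prices):
    """Map each return to the token of the bin it falls into."""
    return [TOKENS[bisect_right(EDGES, r)] for r in returns(prices)]

# Illustrative prices only.
tokens = tokenize([100.0, 102.0, 101.9, 101.9, 100.0])
```

A sequence of such tokens can then be treated like a sentence, which is the property a pre-trained language-model architecture needs.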

    Research toward mastering complex reasoning with Transformers: applications to visual, dialogue, and mathematical reasoning

    Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2021. 2. 조성준. As deep learning models have advanced, research has shifted toward sophisticated tasks that require complex reasoning, rather than simple classification tasks. These complex tasks require multiple reasoning steps that resemble human intelligence. Architecture-wise, recurrent neural networks and convolutional neural networks have long been the mainstream models for deep learning. However, both suffer from shortcomings inherent to their architectures. Nowadays, the attention-based Transformer is replacing them due to its superior architecture and performance. In particular, the encoder of the Transformer has been extensively studied in the field of natural language processing. However, for the Transformer to be effective on data with distinct structures and characteristics, appropriate adjustments to its structure are required. In this dissertation, we propose novel architectures based on the Transformer encoder for various supervised learning tasks with different data types and characteristics. The tasks that we consider are visual IQ tests, dialogue state tracking, and mathematical question answering. For the visual IQ test, the input is in a visual format with hierarchy. To deal with this, we propose a hierarchical Transformer encoder with structured representation, a novel neural network architecture that improves both perception and reasoning. The hierarchical structure of the Transformer encoders and the architecture of each individual Transformer encoder both fit the characteristics of visual IQ test data. For dialogue state tracking, value prediction for multiple domain-slot pairs is required. To address this issue, we propose a dialogue state tracking model that uses a pre-trained language model, which is a pre-trained Transformer encoder, for domain-slot relationship modeling. 
We introduce special tokens for each domain-slot pair, which enable effective dependency modeling among domain-slot pairs through the pre-trained language encoder. Finally, for mathematical question answering, we propose a method to pre-train a Transformer encoder on a mathematical question answering dataset for improved performance. Our pre-training method, Question-Answer Masked Language Modeling, uses both the question and answer text, which suits the mathematical question answering dataset. Through experiments, we show that each of our proposed methods is effective for its corresponding task and data type.
    Recurrent neural networks and convolutional neural networks have long been the main models in deep learning. However, both have limitations that stem from their own architectures. Recently, the attention-based Transformer has been replacing them thanks to its better performance and architecture. The Transformer encoder has been studied especially extensively in natural language processing. However, for the Transformer to work properly on data with particular structures and characteristics, appropriate changes to its architecture are required. In this dissertation, we propose new Transformer-encoder-based models that can be applied to supervised learning on various data types and characteristics. The tasks addressed in this work are visual IQ tests, dialogue state tracking, and mathematical question answering. The input of a visual IQ test has a visual form with hierarchy. To handle this, we propose a hierarchical Transformer encoder model that can process structured representations, a new neural network architecture that improves performance in both perception and reasoning. Both the hierarchical structure of the Transformer encoders and the structure of each individual Transformer encoder suit the characteristics of visual IQ test data. Dialogue state tracking requires values for multiple domain-slot pairs. To address this, we propose modeling domain-slot relationships with a pre-trained language model, that is, a pre-trained Transformer encoder. Introducing a special token for each domain-slot pair makes it possible to model the relationships among domain-slot pairs effectively. Finally, for mathematical question answering, we propose improving performance on the task by pre-training on mathematical question answering data. Our pre-training method, Question-Answer Masked Language Modeling, uses both the question and answer text and is therefore well suited to mathematical question answering data. Through experiments, we show that each proposed method is effective for its corresponding task and data type.
    Table of contents:
    Abstract
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Literature Review
        2.1 Related Works on Transformer
        2.2 Related Works on Visual IQ Tests
            2.2.1 RPM-related studies
            2.2.2 Object Detection related studies
        2.3 Related works on Dialogue State Tracking
        2.4 Related Works on Mathematical Question Answering
            2.4.1 Pre-training of Neural Networks
            2.4.2 Language Model Pre-training
            2.4.3 Mathematical Reasoning with Neural Networks
    Chapter 3 Hierarchical end-to-end architecture of Transformer encoders for solving visual IQ tests
        3.1 Background
            3.1.1 Perception
            3.1.2 Reasoning
        3.2 Proposed Model
            3.2.1 Perception Module: Object Detection Model
            3.2.2 Reasoning Module: Hierarchical Transformer Encoder
            3.2.3 Contrasting Module and Loss function
        3.3 Experimental results
            3.3.1 Dataset
            3.3.2 Experimental Setup
            3.3.3 Results for Perception Module
            3.3.4 Results for Reasoning Module
            3.3.5 Ablation studies
        3.4 Chapter Summary
    Chapter 4 Domain-slot relationship modeling using Transformers for dialogue state tracking
        4.1 Background
        4.2 Proposed Method
            4.2.1 Domain-Slot-Context Encoder
            4.2.2 Slot-gate classifier
            4.2.3 Slot-value classifier
            4.2.4 Total objective function
        4.3 Experimental Results
            4.3.1 Dataset
            4.3.2 Experimental Setup
            4.3.3 Results for the MultiWOZ-2.1 dataset
            4.3.4 Ablation Studies
        4.4 Chapter Summary
    Chapter 5 Pre-training of Transformers with Question-Answer Masked Language Modeling for Mathematical Question Answering
        5.1 Background
        5.2 Proposed Method
            5.2.1 Pre-training: Question-Answer Masked Language Modeling
            5.2.2 Fine-tuning: Mathematical Question Answering
        5.3 Experimental Results
            5.3.1 Dataset
            5.3.2 Experimental Setup
            5.3.3 Experimental Results on the Mathematics dataset
        5.4 Chapter Summary
    Chapter 6 Conclusion
        6.1 Contributions
        6.2 Future Work
    Bibliography
    Abstract in Korean
    Acknowledgments
    Docto
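
The Question-Answer Masked Language Modeling idea named in this dissertation's abstract, masking tokens across both the question and the answer text, can be sketched as follows. This is a minimal illustration under stated assumptions: the mask rate, the `[SEP]` separator, whitespace tokens, and the function name are hypothetical, and the dissertation's actual masking scheme is not described in the abstract.

```python
import random

MASK = "[MASK]"
SEP = "[SEP]"

def mask_question_answer(question, answer, mask_rate=0.15, seed=0):
    """Mask random tokens drawn from the question AND the answer.

    Returns (inputs, labels): labels hold the original token at masked
    positions and None elsewhere, as in masked language modeling.
    """
    rng = random.Random(seed)
    tokens = question + [SEP] + answer
    # The separator itself is never masked.
    candidates = [i for i, t in enumerate(tokens) if t != SEP]
    n_mask = max(1, round(mask_rate * len(candidates)))
    chosen = set(rng.sample(candidates, n_mask))
    inputs = [MASK if i in chosen else t for i, t in enumerate(tokens)]
    labels = [t if i in chosen else None for i, t in enumerate(tokens)]
    return inputs, labels

# Illustrative example with whitespace tokens.
inputs, labels = mask_question_answer("what is 2 + 3 ?".split(), ["5"])
```

Because masked positions are sampled from the concatenated sequence, the model may be asked to recover part of the question from the answer as well as the other way around.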

    Machine Learning based Models for Fresh Produce Yield and Price Forecasting for Strawberry Fruit

    Building market price forecasting models for Fresh Produce (FP) is crucial to protect retailers and consumers from highly priced FP. However, forecasting FP prices is highly complex because of the very short shelf life of FP, the inability to store it long term, and external factors such as weather and climate change. This forecasting problem has traditionally been modelled as a time series problem. Models for grain yield forecasting and for forecasting other, non-agricultural prices are common. However, forecasting of FP prices is recent and has not been fully explored. In this thesis, the forecasting models built to fill this void are solely machine learning based, which is also a novelty. The growth and success of deep learning, a type of machine learning, has largely been attributed to the availability of big data and high-end computational power. In this thesis, several machine learning models (both conventional and deep learning based) are built to predict future yield and prices of FP (strawberry prices are said to be more difficult to forecast than those of other FP, so strawberries are used here as the main product). The data used to build these prediction models comprise California weather data, California strawberry yield, California strawberry farm-gate prices, and a retailer's purchase price data. The prediction models are compared using a new aggregated error measure (AGM) proposed in this thesis, which combines mean absolute error, mean squared error, and the R^2 coefficient of determination. The best two models are found to be an Attention CNN-LSTM (AC-LSTM) and an Attention ConvLSTM (ACV-LSTM). Different ensemble techniques, such as a voting regressor and stacking with Support Vector Regression (SVR), are then used to arrive at the best prediction. 
The experimental results show that, across the various examined applications, the proposed model, a stacking ensemble of the AC-LSTM and ACV-LSTM using a linear SVR, performs best on the proposed aggregated error measure. To show the robustness of the proposed model, it was also tested on predicting WTI and Brent crude oil prices, and the results proved consistent with those of the FP price prediction.
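
The three components that the abstract above says are combined into the aggregated error measure (mean absolute error, mean squared error, and R^2) can be computed as in the sketch below. The abstract does not state how the thesis combines them, so no combined score is shown; the function names and sample values are hypothetical.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

# Illustrative values only (not data from the thesis).
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.0, 2.5, 4.0]
scores = {"mae": mae(actual, forecast),
          "mse": mse(actual, forecast),
          "r2": r2(actual, forecast)}
```

Note that MAE and MSE are lower-is-better while R^2 is higher-is-better, so any aggregate has to reconcile the two directions before models can be ranked on a single number.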

    An analysis of ensemble empirical mode decomposition applied to trend prediction on financial time series

    Advisor: Luiz Eduardo S. Oliveira. Co-advisor: David Menotti. Dissertation (master's) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 20/07/2017. Includes references: f. 63-72.
    Resumo: Financial time series are notoriously difficult to analyze and forecast given their non-stationary and highly oscillatory nature. In this thesis, the non-parametric decomposition technique Ensemble Empirical Mode Decomposition (EEMD) is evaluated as a feature-extraction technique for time series from market indexes and exchange rates; these features are used, together with different machine learning models, to classify short-term trends. The results obtained on two distinct financial datasets suggest that the promising results reported in the literature were obtained with the inadvertent addition of look-ahead bias, arising from applying this technique as part of the preprocessing of the time series. In contrast to the conclusions found in the literature, our results indicate that applying EEMD to generate a better representation of financial data is not, by itself, sufficient to substantially improve the accuracy and cumulative return obtained by predictive models, compared with the results obtained using time series of percentage changes. Keywords: Trend Prediction, Machine Learning, Financial Time Series.
    Abstract: Financial time series are notoriously difficult to analyse and predict, given their nonstationary, highly oscillatory nature. In this thesis, the effectiveness of the Ensemble Empirical Mode Decomposition (EEMD) is evaluated at generating a representation for market indexes and exchange rates that improves short-term trend prediction for these financial instruments. 
The results obtained on two different financial datasets suggest that the promising results reported for EEMD on financial time series in other studies were obtained by inadvertently adding look-ahead bias to the testing protocol, by pre-processing the entire series with EEMD, which does affect the predictive results. In contrast to conclusions found in the literature, our results indicate that applying EEMD to generate a better representation for financial time series is not sufficient, by itself, to substantially improve the accuracy and cumulative return obtained by the same models using the raw data. Keywords: Trend Prediction, Machine Learning, Financial Time Series.
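
The look-ahead bias described in the abstract above, where a transform applied to the entire series leaks future information into earlier features, can be demonstrated with a much simpler transform than EEMD. The sketch below swaps EEMD for min-max scaling purely for illustration; the function names and sample values are hypothetical.

```python
def scale_full(series):
    """Min-max scale using the WHOLE series: every value sees the future."""
    lo, hi = min(series), max(series)
    return [(x - lo) / (hi - lo) for x in series]

def scale_causal(series):
    """Min-max scale each point using only data up to and including it."""
    out = []
    for t, x in enumerate(series):
        past = series[: t + 1]
        lo, hi = min(past), max(past)
        out.append(0.0 if hi == lo else (x - lo) / (hi - lo))
    return out

# Two histories that differ only in their final (future) value.
history = [2.0, 1.0, 3.0, 2.5]
altered = [2.0, 1.0, 3.0, 10.0]

# Whole-series preprocessing changes the EARLY features when the future changes.
leaky_before = scale_full(history)[:3]
leaky_after = scale_full(altered)[:3]

# Causal preprocessing leaves the early features untouched.
causal_before = scale_causal(history)[:3]
causal_after = scale_causal(altered)[:3]
```

The same check applies to any preprocessing step: if changing a future observation changes a feature at an earlier time, that feature would not have been available to a model at prediction time.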