4,318 research outputs found

    A Shared Task on Bandit Learning for Machine Translation

    We introduce and describe the results of a novel shared task on bandit learning for machine translation. The task was organized jointly by Amazon and Heidelberg University for the first time at the Second Conference on Machine Translation (WMT 2017). The goal of the task is to encourage research on learning machine translation from weak user feedback instead of human references or post-edits. On each of a sequence of rounds, a machine translation system is required to propose a translation for an input, and receives a real-valued estimate of the quality of the proposed translation for learning. This paper describes the shared task's learning and evaluation setup, using services hosted on Amazon Web Services (AWS), the data and evaluation metrics, and the results of various machine translation architectures and learning protocols. Comment: Conference on Machine Translation (WMT) 2017.
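The bandit protocol described above (propose a translation, receive a scalar quality estimate, learn) can be made concrete with a short sketch. The class below is a toy log-linear policy with a REINFORCE-style update; the candidate generation, feature extraction, and `get_feedback` service are hypothetical stand-ins, not the actual WMT 2017 task infrastructure.

```python
# Minimal sketch of one bandit learning round for MT, assuming a small pool of
# candidate translations with precomputed feature vectors (hypothetical setup).
import numpy as np

class BanditMTSystem:
    """Toy log-linear translation policy updated from scalar feedback."""

    def __init__(self, num_features, learning_rate=0.01):
        self.w = np.zeros(num_features)
        self.lr = learning_rate

    def propose(self, candidates, features):
        """Sample one candidate translation from a softmax over feature scores."""
        scores = features @ self.w                      # (num_candidates,)
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        idx = np.random.choice(len(candidates), p=probs)
        return idx, candidates[idx], probs

    def update(self, idx, features, probs, reward):
        """Policy-gradient step: reinforce the sampled translation by its reward."""
        grad = features[idx] - probs @ features         # sampled minus expected features
        self.w += self.lr * reward * grad

# One round of the protocol (get_feedback is a placeholder for the feedback service):
# idx, hypothesis, probs = system.propose(candidates, feature_matrix)
# reward = get_feedback(hypothesis)   # real-valued quality estimate
# system.update(idx, feature_matrix, probs, reward)
```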

    Modeling, forecasting and trading the EUR exchange rates with hybrid rolling genetic algorithms: support vector regression forecast combinations

    The motivation of this paper is to introduce a hybrid Rolling Genetic Algorithm-Support Vector Regression (RG-SVR) model for optimal parameter selection and feature subset combination. The algorithm is applied to the task of forecasting and trading the EUR/USD, EUR/GBP and EUR/JPY exchange rates. The proposed methodology genetically searches over a feature space (pool of individual forecasts) and then combines the optimal feature subsets (SVR forecast combinations) for each exchange rate. This is achieved by applying a fitness function specialized for financial purposes and adopting a sliding window approach. The individual forecasts are derived from several linear and non-linear models. RG-SVR is benchmarked against genetically and non-genetically optimized SVR and SVM models that dominate the relevant literature, along with the robust ARBF-PSO neural network. The statistical and trading performance of all models is investigated during the period of 1999–2012. As it turns out, RG-SVR presents the best performance in terms of statistical accuracy and trading efficiency for all the exchange rates under study. This superiority confirms the success of the implemented fitness function and training procedure, and validates the benefits of the proposed algorithm.
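A simplified sketch of the core idea may help: a genetic algorithm evolves binary masks over the pool of individual forecasts, and each mask is scored by how well an SVR combiner trained on that subset predicts the out-of-sample portion of a sliding window. The fitness function below uses plain RMSE for brevity; the paper's actual fitness is specialized for trading performance, and all hyperparameters here are assumptions.

```python
# Sketch of a GA over forecast subsets with an SVR combiner (simplified fitness).
import numpy as np
from sklearn.svm import SVR

def fitness(mask, forecasts, target, train_size):
    """Score a feature subset: RMSE of an SVR combiner on the out-of-sample window."""
    if mask.sum() == 0:
        return np.inf
    X = forecasts[:, mask.astype(bool)]
    model = SVR(kernel="rbf", C=1.0, epsilon=0.01)
    model.fit(X[:train_size], target[:train_size])
    pred = model.predict(X[train_size:])
    return np.sqrt(np.mean((pred - target[train_size:]) ** 2))

def genetic_search(forecasts, target, train_size, pop_size=20, generations=30):
    """Evolve binary masks selecting which individual forecasts to combine."""
    rng = np.random.default_rng(0)
    n_feat = forecasts.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_feat))
    for _ in range(generations):
        scores = np.array([fitness(m, forecasts, target, train_size) for m in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]          # keep the best half
        children = parents.copy()
        cut = n_feat // 2
        children[:, cut:] = np.roll(parents[:, cut:], 1, axis=0)     # one-point crossover
        mutate = rng.random(children.shape) < 0.05                   # bit-flip mutation
        children = np.where(mutate, 1 - children, children)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(m, forecasts, target, train_size) for m in pop])
    return pop[np.argmin(scores)]
```

In the rolling setup, this search would be repeated as the window slides forward, so the selected subset can adapt over time.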

    Forex Trading Signal Extraction with Deep Learning Models

    The rise of AI technology has popularized deep learning models for financial trading prediction, promising substantial profits with minimal risk. Institutions like Westpac, Commonwealth Bank of Australia, Macquarie Bank, and Bloomberg invest heavily in this transformative technology. Researchers have also explored AI's potential in the exchange rate market. This thesis focuses on developing advanced deep learning models for accurate forex market prediction and AI-powered trading strategies. Three deep learning models are introduced: an event-driven LSTM model, an Attention-based VGG16 model named MHATTN-VGG16, and a pre-trained model called TradingBERT. These models aim to enhance signal extraction and price forecasting in forex trading, offering valuable insights for decision-making. The first model, an LSTM, predicts retracement points crucial for identifying trend reversals. It outperforms baseline models like GRU and RNN, thanks to noise reduction in the training data. Experiments determine the optimal number of timesteps for trend identification, showing promise for building a robotic trading platform. The second model, MHATTN-VGG16, predicts maximum and minimum price movements in forex chart images. It combines VGG16 with multi-head attention and positional encoding to effectively classify financial chart images. The third model utilizes a pre-trained BERT architecture to transform trading price data into normalized embeddings, enabling meaningful signal extraction from financial data. This study pioneers the use of pre-trained models in financial trading and introduces a method for converting continuous price data into categorized elements, leveraging the success of BERT. This thesis contributes innovative approaches to deep learning in algorithmic trading, offering traders and investors precision and confidence in navigating financial markets.
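To make the first model concrete, the sketch below shows an LSTM classifier over fixed-length price windows that predicts whether the window ends at a retracement (trend-reversal) point. The architecture, window length, input features, and labels are illustrative assumptions; the thesis's actual model, noise-reduction step, and training data differ.

```python
# Illustrative LSTM retracement-point classifier (assumed architecture and inputs).
import torch
import torch.nn as nn

class RetracementLSTM(nn.Module):
    """Classify whether a price window ends at a retracement point."""

    def __init__(self, n_features=4, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)   # 2 classes: retracement / no retracement

    def forward(self, x):                        # x: (batch, timesteps, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])          # classify from the final hidden state

# Example: score a batch of 30-step OHLC windows (batch=8, timesteps=30, features=4).
model = RetracementLSTM()
logits = model(torch.randn(8, 30, 4))
print(logits.shape)   # torch.Size([8, 2])
```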

    SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

    Going deeper and wider in neural architectures improves accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either need to change to less desirable network architectures, or nontrivially dissect a network across multiple GPUs. These distract DL practitioners from concentrating on their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime that enables network training far beyond the GPU DRAM capacity. SuperNeurons features three memory optimizations, Liveness Analysis, Unified Tensor Pool, and Cost-Aware Recomputation, which together effectively reduce the network-wide peak memory usage down to the maximal memory usage among layers. We also address the performance issues in those memory-saving techniques. Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for training, but also dynamically allocates the memory for convolution workspaces to achieve high performance. Evaluations against Caffe, Torch, MXNet and TensorFlow have demonstrated that SuperNeurons trains networks at least 3.2432 times deeper than current frameworks, with leading performance. In particular, SuperNeurons can train ResNet2500, which has 10^4 basic network layers, on a 12 GB K40c. Comment: PPoPP 2018: 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
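The liveness-analysis idea can be illustrated with a small simulation: if a tensor is freed (or offloaded) as soon as no later layer reads it, peak memory tracks the largest per-layer footprint rather than the sum over the whole network. The data structures below are hypothetical simplifications, not the SuperNeurons runtime; backward-pass reuse and recomputation are omitted.

```python
# Conceptual sketch of liveness-based tensor freeing during a forward pass.
def last_use(layers):
    """Map each tensor name to the index of the last layer that reads it."""
    last = {}
    for i, layer in enumerate(layers):
        for t in layer["inputs"]:
            last[t] = i
    return last

def peak_memory_with_liveness(layers, sizes):
    """Simulate forward execution, freeing tensors after their last use."""
    last = last_use(layers)
    live, current, peak = set(), 0, 0
    for i, layer in enumerate(layers):
        live.add(layer["output"])
        current += sizes[layer["output"]]
        peak = max(peak, current)
        for t in list(live):                      # free tensors whose last reader has run
            if last.get(t, -1) <= i and t != layer["output"]:
                live.discard(t)
                current -= sizes[t]
    return peak

# Toy 3-layer chain: conv1 -> relu1 -> conv2, each consuming the previous output.
layers = [
    {"inputs": ["x"], "output": "conv1"},
    {"inputs": ["conv1"], "output": "relu1"},
    {"inputs": ["relu1"], "output": "conv2"},
]
sizes = {"x": 0, "conv1": 4, "relu1": 4, "conv2": 4}
print(peak_memory_with_liveness(layers, sizes))   # 8, not 12: only two tensors live at once
```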

    Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures

    Near Threshold Voltage (NTV) computing has recently been proposed as a technique to save energy, at the cost of incurring higher error rates including, among others, Silent Data Corruption (SDC). In this paper, we evaluate the energy efficiency of dense linear algebra routines using several low-power multicore processors and analyze whether the potential energy reduction achieved when scaling the processor to operate at a low voltage compensates for the cost of integrating a fault tolerance mechanism that tackles SDC. Our study targets algorithm-based fault-tolerant versions of the dense matrix-vector and matrix-matrix multiplication kernels (GEMV and GEMM, respectively), using the BLIS framework, as well as an implementation of the LU factorization with partial pivoting built on top of GEMM. Furthermore, we tailor the study to a number of representative 32-bit and 64-bit multicore processors from ARM that were specifically designed for energy efficiency. (C) 2017 Elsevier B.V. All rights reserved.

    The researchers from Universidad Jaume I were supported by project CICYT TIN2014-53495-R of MINECO and FEDER, and by the FPU program of MECD. The researcher from Universitat Politecnica de Catalunya was supported by projects TIN2015-65316-P from the Spanish Ministry of Education and 2014 SGR 1051 from the Generalitat de Catalunya, Dep. d'Innovacio, Universitats i Empresa.

    Catalán, S.; Herrero, J. R.; Quintana-Ortí, E. S.; Rodríguez-Sánchez, R. (2018). Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures. Parallel Computing, 73:28-39. https://doi.org/10.1016/j.parco.2017.05.004
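The algorithm-based fault tolerance (ABFT) idea applied to GEMM in the paper above rests on checksum encoding: extra checksum rows/columns are carried through the multiplication so that a silent data corruption in the result can be detected afterwards. The sketch below shows that classic checksum scheme in a minimal form; the tolerance handling and the absence of a correction/location step are simplifying assumptions, not the paper's implementation.

```python
# Minimal ABFT-style GEMM: checksums propagate through C = A @ B and are re-verified.
import numpy as np

def abft_gemm(A, B, tol=1e-8):
    """Compute C = A @ B with row/column checksums and flag inconsistencies."""
    A_c = np.vstack([A, A.sum(axis=0)])                  # extra row = column sums of A
    B_c = np.hstack([B, B.sum(axis=1, keepdims=True)])   # extra column = row sums of B
    C_c = A_c @ B_c                                      # checksums propagate through GEMM
    C = C_c[:-1, :-1]
    # Each column of C should sum to the checksum row; each row to the checksum column.
    col_ok = np.abs(C.sum(axis=0) - C_c[-1, :-1]) < tol * (1 + np.abs(C_c[-1, :-1]))
    row_ok = np.abs(C.sum(axis=1) - C_c[:-1, -1]) < tol * (1 + np.abs(C_c[:-1, -1]))
    corrupted = not (col_ok.all() and row_ok.all())
    return C, corrupted

# A corrupted entry is detectable because it breaks both its row and column checksum.
rng = np.random.default_rng(1)
A, B = rng.standard_normal((64, 64)), rng.standard_normal((64, 64))
C, bad = abft_gemm(A, B)
print(bad)   # False: no corruption injected
```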