1,694 research outputs found

    Towards Generalizable Reinforcement Learning for Trade Execution

    Optimized trade execution aims to sell (or buy) a given amount of assets within a given time at the lowest possible trading cost. Recently, reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data. However, we find that many existing RL methods exhibit considerable overfitting, which prevents their real-world deployment. In this paper, we provide an extensive study of the overfitting problem in optimized trade execution. First, we model optimized trade execution as offline RL with dynamic context (ORDC), where the context represents market variables that cannot be influenced by the trading policy and are collected in an offline manner. Under this framework, we derive a generalization bound and find that the overfitting issue is caused by the large context space and the limited context samples available in the offline setting. Accordingly, we propose to learn compact representations of the context to address the overfitting problem, either by leveraging prior knowledge or in an end-to-end manner. To evaluate our algorithms, we also implement a carefully designed simulator based on historical limit order book (LOB) data to provide a high-fidelity benchmark for different algorithms. Our experiments on this high-fidelity simulator demonstrate that our algorithms can effectively alleviate overfitting and achieve better performance. Comment: Accepted by IJCAI-2
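    To make the ORDC framing concrete, the following minimal sketch (not the paper's code) replays market context from an offline dataset that the agent cannot influence, and encodes it with a compact hand-crafted representation before the policy decides how much to sell at each step. Names such as `context_dataset`, `encode_context`, and the TWAP-like placeholder policy are illustrative assumptions.

```python
import numpy as np

# Offline "context": replayed mid-prices the policy cannot affect (ORDC-style).
rng = np.random.default_rng(0)
context_dataset = rng.normal(100.0, 1.0, size=(50, 10))  # 50 days x 10 steps

def encode_context(window):
    """Compact, prior-knowledge context representation: last price, mean, volatility."""
    return np.array([window[-1], window.mean(), window.std()])

def run_episode(prices, total_qty=1.0, horizon=10):
    remaining, cash = total_qty, 0.0
    for t in range(horizon):
        feats = encode_context(prices[: t + 1])
        sell_frac = 1.0 / (horizon - t)          # TWAP-like placeholder policy
        if feats[0] > feats[1]:                  # price above running mean: sell a bit more
            sell_frac = min(1.0, sell_frac * 1.2)
        qty = remaining * sell_frac
        cash += qty * prices[t]                  # executed at the replayed market price
        remaining -= qty
    return cash                                  # proceeds; compare against an arrival-price benchmark

avg = np.mean([run_episode(day) for day in context_dataset])
print(f"average proceeds per unit sold: {avg:.2f}")
```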

    MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning

    Reinforcement learning has become one of the best approaches for training a computer game emulator capable of human-level performance. In a reinforcement learning approach, an optimal value function is learned across a set of actions, or decisions, that lead to a set of states giving different rewards, with the objective of maximizing the overall reward. A policy assigns an expected return to each state-action pair. We call a policy optimal if its value function is optimal. QLBS, the Q-Learner in the Black-Scholes(-Merton) Worlds, applies reinforcement learning concepts, notably the popular Q-learning algorithm, to the financial stochastic model of Black, Scholes and Merton. It is, however, specifically optimized for geometric Brownian motion and vanilla options; its range of application is therefore limited to vanilla option pricing within financial markets. We propose MQLV, Modified Q-Learner for the Vasicek model, a new reinforcement learning approach that determines the optimal policy of money management based on the aggregated financial transactions of the clients. It unlocks new frontiers for establishing personalized credit card limits or fulfilling bank loan applications, targeting the retail banking industry. MQLV extends the simulation to mean-reverting stochastic diffusion processes and uses a digital function, a Heaviside step function expressed in its discrete form, to estimate the probability of a future event such as a payment default. In our experiments, we first show the similarities between a set of historical financial transactions and Vasicek-generated transactions and then underline the potential of MQLV on generated Monte Carlo simulations. Finally, MQLV is the first Q-learning Vasicek-based methodology addressing transparent decision-making processes in retail banking.
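    A hedged sketch of the two ingredients the abstract names, a Vasicek mean-reverting diffusion for the aggregated transaction process and a discrete Heaviside indicator to estimate an event probability such as payment default, is given below; the parameter values (kappa, theta, sigma, barrier) are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def vasicek_paths(x0, kappa, theta, sigma, horizon, n_steps, n_paths):
    """Euler simulation of the mean-reverting Vasicek SDE dX = kappa*(theta - X)dt + sigma dW."""
    dt = horizon / n_steps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        x = x + kappa * (theta - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

def default_probability(terminal_values, barrier):
    # Discrete Heaviside step: 1 if the terminal balance ends below the barrier, else 0.
    return np.mean(np.heaviside(barrier - terminal_values, 0.0))

terminal = vasicek_paths(x0=1.0, kappa=0.8, theta=1.2, sigma=0.4,
                         horizon=1.0, n_steps=250, n_paths=100_000)
print(f"estimated default probability: {default_probability(terminal, barrier=0.5):.4f}")
```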

    Agent-based simulation of electricity markets: a literature review

    Liberalisation, climate policy and the promotion of renewable energy pose challenges to players in the electricity sector in many countries. Policy makers have to consider issues such as market power, the bounded rationality of players and the appearance of fluctuating energy sources in order to provide adequate legislation. Furthermore, the interactions between markets and environmental policy instruments are becoming an issue of increasing importance. A promising approach for the scientific analysis of these developments is the field of agent-based simulation. The goal of this article is to provide an overview of the current work applying this methodology to the analysis of electricity markets.
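    As a minimal illustration of what such agent-based models typically contain (not taken from any reviewed model), the sketch below has generator agents submit price-quantity offers and clears the market at a uniform price; in a full simulation the agents would adapt their bids between rounds. All names and numbers are made up for the sketch.

```python
bids = [  # (agent, offered quantity in MWh, offer price in EUR/MWh)
    ("coal", 300.0, 35.0),
    ("gas", 200.0, 55.0),
    ("wind", 150.0, 0.0),
]
demand = 420.0

def clear_market(bids, demand):
    """Uniform-price clearing: accept the cheapest offers until demand is met."""
    accepted, remaining = [], demand
    for agent, qty, price in sorted(bids, key=lambda b: b[2]):
        take = min(qty, remaining)
        if take > 0:
            accepted.append((agent, take, price))
            remaining -= take
    clearing_price = accepted[-1][2] if accepted else 0.0   # price of the marginal offer
    return clearing_price, accepted

price, dispatch = clear_market(bids, demand)
print(f"clearing price: {price} EUR/MWh, dispatch: {dispatch}")
```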

    Modelling of a System for the Detection of Weak Signals Through Text Mining and NLP. Proposal of Improvement by a Quantum Variational Circuit

    In this doctoral thesis, a system to detect weak signals related to future transcendental changes is proposed and tested. While most known solutions are based on the use of structured data, the proposed system quantitatively detects these signals using heterogeneous and unstructured information from scientific, journalistic, and social media sources. Predicting new trends in an environment has many applications. For instance, companies and startups face constant changes in their markets that are very difficult to predict. For this reason, developing systems that automatically detect significant future changes at an early stage is relevant for any organization to make the right decisions in time. This work has been designed to obtain weak signals of the future in any field, depending only on the input dataset of documents. Text mining and natural language processing techniques are applied to process all these documents. As a result, a map of ranked terms, a list of automatically classified keywords and a list of multi-word expressions are obtained. The overall system has been tested in four different sectors: solar panels, artificial intelligence, remote sensing, and medical imaging. This work has obtained promising results, evaluated with two different methodologies. As a result, the system was able to successfully detect, at a very early stage, new trends that have become more and more important today. Quantum computing is a new paradigm for a multitude of computing applications. This doctoral thesis also presents a study of the technologies currently available for the physical implementation of qubits and quantum gates, establishing their main advantages and disadvantages as well as the available frameworks for programming and implementing quantum circuits. In order to improve the effectiveness of the system, a quantum circuit design based on support vector machines (SVMs) is described for solving classification problems. This circuit is specially designed for the noisy intermediate-scale quantum (NISQ) processors that are currently available. As an experiment, the circuit has been tested on a real quantum computer based on superconducting qubits by IBM as an improvement for the text mining subsystem in the detection of weak signals. The results obtained with the quantum experiment show interesting outcomes, with a performance improvement of close to 20% over conventional systems, but also confirm that ongoing technological development is still required to take full advantage of quantum computing. Griol Barres, I. (2022). Modelling of a System for the Detection of Weak Signals Through Text Mining and NLP. Proposal of Improvement by a Quantum Variational Circuit [Doctoral thesis]. Universitat PolitĂšcnica de ValĂšncia. https://doi.org/10.4995/Thesis/10251/183029
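    A hedged sketch of the text-mining step the abstract describes, ranking terms by how strongly their frequency grows from an older document set to a recent one as a simple proxy for an emerging (weak-signal) term, is shown below; the toy corpora and the scoring formula are illustrative only, not the thesis's actual pipeline.

```python
from collections import Counter
import re

old_docs = ["classical machine learning models for image analysis",
            "statistical models for sensor data"]
new_docs = ["transformer models for remote sensing imagery",
            "transformer based medical imaging pipelines"]

def term_counts(docs):
    """Lowercase word counts over a document set."""
    tokens = re.findall(r"[a-z]+", " ".join(docs).lower())
    return Counter(tokens)

old_tf, new_tf = term_counts(old_docs), term_counts(new_docs)

def emergence_score(term, smoothing=1.0):
    # Ratio of recent to past frequency; high values flag candidate weak signals.
    return (new_tf[term] + smoothing) / (old_tf[term] + smoothing)

ranked = sorted(new_tf, key=emergence_score, reverse=True)
print("candidate weak-signal terms:", ranked[:5])
```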

    Agent-Based Simulations of Blockchain protocols illustrated via Kadena's Chainweb

    While many distributed consensus protocols provide robust liveness and consistency guarantees in the presence of malicious actors, quantitative estimates of how economic incentives affect security are few and far between. In this paper, we describe a system for simulating how adversarial agents, both economically rational and Byzantine, interact with a blockchain protocol. This system provides statistical estimates of the economic difficulty of an attack and of how the presence of certain actors influences protocol-level statistics, such as the expected time to regain liveness. The simulation system is influenced by the design of algorithmic trading and reinforcement learning systems that use explicit modeling of an agent's reward mechanism to evaluate and optimize a fully autonomous agent. We implement and apply this simulation framework to Kadena's Chainweb, a parallelized Proof-of-Work system whose complexity lies in how miner incentive compliance affects security and censorship resistance. We provide the first formal description of Chainweb in the literature and use it to motivate our simulation design. Our simulation results include a phase transition in block height growth rate as a function of shard connectivity and empirical evidence that censorship in Chainweb is too costly for rational miners to engage in. We conclude with an outlook on how simulation can guide and optimize protocol development in a variety of contexts, including Proof-of-Stake parameter optimization and peer-to-peer networking design. Comment: 10 pages, 7 figures, accepted to the IEEE S&B 2019 conference
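    The toy simulation below (not Kadena's code) illustrates the kind of protocol-level statistic the paper studies: parallel chains may only extend when the neighbour chains they reference have kept pace, so block-height growth depends on how densely the chains are braided. The ring topology, mining probability, and round count are assumptions for the sketch.

```python
import random

def simulate(n_chains=10, degree=2, p_mine=0.3, rounds=2000, seed=0):
    """Average per-chain height growth per round under a simple braiding constraint."""
    rng = random.Random(seed)
    heights = [0] * n_chains
    # Each chain references its next `degree` neighbours on a ring (assumed topology).
    neighbours = [[(i + d) % n_chains for d in range(1, degree + 1)] for i in range(n_chains)]
    for _ in range(rounds):
        for i in range(n_chains):
            # A chain extends only if every referenced neighbour is not behind it.
            if rng.random() < p_mine and all(heights[j] >= heights[i] for j in neighbours[i]):
                heights[i] += 1
    return sum(heights) / (n_chains * rounds)

for deg in (1, 3, 5):
    print(f"degree {deg}: growth rate {simulate(degree=deg):.3f}")
```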

    IMM: An Imitative Reinforcement Learning Approach with Predictive Representation Learning for Automatic Market Making

    Market making (MM) has attracted significant attention in financial trading owing to its essential role in ensuring market liquidity. With strong capabilities in sequential decision-making, reinforcement learning (RL) has achieved remarkable success in quantitative trading. Nonetheless, most existing RL-based MM methods focus on optimizing single-price-level strategies, which suffer from frequent order cancellations and loss of queue priority. Strategies involving multiple price levels align better with actual trading scenarios. However, because multi-price-level strategies involve a comprehensive trading action space, effectively training profitable RL agents for MM remains challenging. Inspired by the efficient workflow of professional human market makers, we propose Imitative Market Maker (IMM), a novel RL framework that leverages both knowledge from suboptimal signal-based experts and direct policy interactions to develop multi-price-level MM strategies efficiently. The framework starts by introducing effective state and action representations adept at encoding information about multi-price-level orders. Furthermore, IMM integrates a representation learning unit capable of capturing both short- and long-term market trends to mitigate adverse selection risk. Subsequently, IMM formulates an expert strategy based on signals and trains the agent through a combination of RL and imitation learning techniques, leading to efficient learning. Extensive experimental results on four real-world market datasets demonstrate that IMM outperforms current RL-based market making strategies on several financial criteria. The findings of the ablation study substantiate the effectiveness of the model components.
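    A hedged sketch of the training idea the abstract outlines follows: the agent's objective mixes an RL term with an imitation term that pulls the policy towards a signal-based expert quoting several price levels per side. The linear policy, the expert rule, the stand-in policy-gradient term, and the weight `beta` are illustrative assumptions, not IMM's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_levels = 8, 3                    # LOB features; quote sizes at 3 levels per side

W = rng.normal(0, 0.1, size=(2 * n_levels, n_features))   # toy linear policy parameters

def policy(state):
    return np.tanh(W @ state)                  # quote intensities for bid/ask levels in [-1, 1]

def expert(state):
    # Placeholder signal-based expert: skew bid/ask quotes with a trend signal (state[0]).
    skew = np.clip(state[0], -1, 1)
    return np.concatenate([np.full(n_levels, 0.5 - 0.3 * skew),
                           np.full(n_levels, 0.5 + 0.3 * skew)])

def combined_loss(state, advantage, beta=0.5):
    a = policy(state)
    rl_term = -advantage * a.sum()                      # crude stand-in for a policy-gradient term
    imitation_term = np.mean((a - expert(state)) ** 2)  # pull the policy towards the expert's quotes
    return rl_term + beta * imitation_term

state = rng.normal(size=n_features)
print("combined loss:", combined_loss(state, advantage=0.1))
```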
