1,694 research outputs found

    Towards Generalizable Reinforcement Learning for Trade Execution

    Optimized trade execution aims to sell (or buy) a given amount of assets within a given time at the lowest possible trading cost. Recently, reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data. However, we find that many existing RL methods exhibit considerable overfitting, which prevents their real-world deployment. In this paper, we provide an extensive study of the overfitting problem in optimized trade execution. First, we model optimized trade execution as offline RL with dynamic context (ORDC), where the context represents market variables that cannot be influenced by the trading policy and are collected in an offline manner. Under this framework, we derive a generalization bound and find that the overfitting issue is caused by the large context space and the limited context samples available in the offline setting. Accordingly, we propose to learn compact representations of the context to address the overfitting problem, either by leveraging prior knowledge or in an end-to-end manner. To evaluate our algorithms, we also implement a carefully designed simulator based on historical limit order book (LOB) data to provide a high-fidelity benchmark for different algorithms. Our experiments on this high-fidelity simulator demonstrate that our algorithms can effectively alleviate overfitting and achieve better performance. Comment: Accepted by IJCAI-2
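    To make the ORDC framing concrete, the following minimal sketch (not the paper's code) replays market context from an offline dataset that the agent cannot influence, and encodes it with a compact hand-crafted representation before the policy decides how much to sell at each step. Names such as `context_dataset`, `encode_context`, and the TWAP-like placeholder policy are illustrative assumptions.

```python
import numpy as np

# Offline "context": replayed mid-prices the policy cannot affect (ORDC-style).
rng = np.random.default_rng(0)
context_dataset = rng.normal(100.0, 1.0, size=(50, 10))  # 50 days x 10 steps

def encode_context(window):
    """Compact, prior-knowledge context representation: last price, mean, volatility."""
    return np.array([window[-1], window.mean(), window.std()])

def run_episode(prices, total_qty=1.0, horizon=10):
    remaining, cash = total_qty, 0.0
    for t in range(horizon):
        feats = encode_context(prices[: t + 1])
        sell_frac = 1.0 / (horizon - t)          # TWAP-like placeholder policy
        if feats[0] > feats[1]:                  # price above running mean: sell a bit more
            sell_frac = min(1.0, sell_frac * 1.2)
        qty = remaining * sell_frac
        cash += qty * prices[t]                  # executed at the replayed market price
        remaining -= qty
    return cash                                  # proceeds; compare against an arrival-price benchmark

avg = np.mean([run_episode(day) for day in context_dataset])
print(f"average proceeds per unit sold: {avg:.2f}")
```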

    MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning

    Reinforcement learning has become one of the best approaches for training a computer game emulator capable of human-level performance. In a reinforcement learning approach, an optimal value function is learned across a set of actions, or decisions, that lead to a set of states giving different rewards, with the objective of maximizing the overall reward. A policy assigns an expected return to each state-action pair. We call a policy optimal if its value function is optimal. QLBS, the Q-Learner in the Black-Scholes(-Merton) Worlds, applies reinforcement learning concepts, notably the popular Q-learning algorithm, to the financial stochastic model of Black, Scholes and Merton. It is, however, specifically optimized for geometric Brownian motion and vanilla options; its range of application is therefore limited to vanilla option pricing within financial markets. We propose MQLV, Modified Q-Learner for the Vasicek model, a new reinforcement learning approach that determines the optimal policy of money management based on the aggregated financial transactions of the clients. It unlocks new frontiers for establishing personalized credit card limits or fulfilling bank loan applications, targeting the retail banking industry. MQLV extends the simulation to mean-reverting stochastic diffusion processes and uses a digital function, a Heaviside step function expressed in its discrete form, to estimate the probability of a future event such as a payment default. In our experiments, we first show the similarities between a set of historical financial transactions and Vasicek-generated transactions and then underline the potential of MQLV on generated Monte Carlo simulations. Finally, MQLV is the first Q-learning Vasicek-based methodology addressing transparent decision-making processes in retail banking.
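    A hedged sketch of the two ingredients the abstract names, a Vasicek mean-reverting diffusion for the aggregated transaction process and a discrete Heaviside indicator to estimate an event probability such as payment default, is given below; the parameter values (kappa, theta, sigma, barrier) are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def vasicek_paths(x0, kappa, theta, sigma, horizon, n_steps, n_paths):
    """Euler simulation of the mean-reverting Vasicek SDE dX = kappa*(theta - X)dt + sigma dW."""
    dt = horizon / n_steps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        x = x + kappa * (theta - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

def default_probability(terminal_values, barrier):
    # Discrete Heaviside step: 1 if the terminal balance ends below the barrier, else 0.
    return np.mean(np.heaviside(barrier - terminal_values, 0.0))

terminal = vasicek_paths(x0=1.0, kappa=0.8, theta=1.2, sigma=0.4,
                         horizon=1.0, n_steps=250, n_paths=100_000)
print(f"estimated default probability: {default_probability(terminal, barrier=0.5):.4f}")
```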

    Agent-based simulation of electricity markets: a literature review

    Liberalisation, climate policy and the promotion of renewable energy pose challenges to players in the electricity sector in many countries. Policy makers have to consider issues such as market power, the bounded rationality of players and the appearance of fluctuating energy sources in order to provide adequate legislation. Furthermore, the interactions between markets and environmental policy instruments are becoming an issue of increasing importance. A promising approach for the scientific analysis of these developments is the field of agent-based simulation. The goal of this article is to provide an overview of the current work applying this methodology to the analysis of electricity markets.
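    As a minimal illustration of what such agent-based models typically contain (not taken from any reviewed model), the sketch below has generator agents submit price-quantity offers and clears the market at a uniform price; in a full simulation the agents would adapt their bids between rounds. All names and numbers are made up for the sketch.

```python
bids = [  # (agent, offered quantity in MWh, offer price in EUR/MWh)
    ("coal", 300.0, 35.0),
    ("gas", 200.0, 55.0),
    ("wind", 150.0, 0.0),
]
demand = 420.0

def clear_market(bids, demand):
    """Uniform-price clearing: accept the cheapest offers until demand is met."""
    accepted, remaining = [], demand
    for agent, qty, price in sorted(bids, key=lambda b: b[2]):
        take = min(qty, remaining)
        if take > 0:
            accepted.append((agent, take, price))
            remaining -= take
    clearing_price = accepted[-1][2] if accepted else 0.0   # price of the marginal offer
    return clearing_price, accepted

price, dispatch = clear_market(bids, demand)
print(f"clearing price: {price} EUR/MWh, dispatch: {dispatch}")
```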

    Modelling of a System for the Detection of Weak Signals Through Text Mining and NLP. Proposal of Improvement by a Quantum Variational Circuit

    In this doctoral thesis, a system to detect weak signals related to future transcendental changes is proposed and tested. While most known solutions are based on the use of structured data, the proposed system quantitatively detects these signals using heterogeneous and unstructured information from scientific, journalistic, and social media sources. Predicting new trends in an environment has many applications. For instance, companies and startups face constant changes in their markets that are very difficult to predict. For this reason, developing systems that automatically detect significant future changes at an early stage is relevant for any organization to make the right decisions in time. This work has been designed to obtain weak signals of the future in any field, depending only on the input dataset of documents. Text mining and natural language processing techniques are applied to process all these documents. As a result, a map of ranked terms, a list of automatically classified keywords and a list of multi-word expressions are obtained. The overall system has been tested in four different sectors: solar panels, artificial intelligence, remote sensing, and medical imaging. This work has obtained promising results, evaluated with two different methodologies. As a result, the system was able to successfully detect, at a very early stage, new trends that have become more and more important today. Quantum computing is a new paradigm for a multitude of computing applications. This doctoral thesis also presents a study of the technologies currently available for the physical implementation of qubits and quantum gates, establishing their main advantages and disadvantages as well as the available frameworks for programming and implementing quantum circuits. In order to improve the effectiveness of the system, a quantum circuit design based on support vector machines (SVMs) is described for solving classification problems. This circuit is specially designed for the noisy intermediate-scale quantum (NISQ) processors that are currently available. As an experiment, the circuit has been tested on a real quantum computer based on superconducting qubits by IBM as an improvement for the text mining subsystem in the detection of weak signals. The results obtained with the quantum experiment show interesting outcomes, with a performance improvement of close to 20% over conventional systems, but also confirm that ongoing technological development is still required to take full advantage of quantum computing. Griol Barres, I. (2022). Modelling of a System for the Detection of Weak Signals Through Text Mining and NLP. Proposal of Improvement by a Quantum Variational Circuit [Doctoral thesis]. Universitat PolitĂšcnica de ValĂšncia. https://doi.org/10.4995/Thesis/10251/183029
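    A hedged sketch of the text-mining step the abstract describes, ranking terms by how strongly their frequency grows from an older document set to a recent one as a simple proxy for an emerging (weak-signal) term, is shown below; the toy corpora and the scoring formula are illustrative only, not the thesis's actual pipeline.

```python
from collections import Counter
import re

old_docs = ["classical machine learning models for image analysis",
            "statistical models for sensor data"]
new_docs = ["transformer models for remote sensing imagery",
            "transformer based medical imaging pipelines"]

def term_counts(docs):
    """Lowercase word counts over a document set."""
    tokens = re.findall(r"[a-z]+", " ".join(docs).lower())
    return Counter(tokens)

old_tf, new_tf = term_counts(old_docs), term_counts(new_docs)

def emergence_score(term, smoothing=1.0):
    # Ratio of recent to past frequency; high values flag candidate weak signals.
    return (new_tf[term] + smoothing) / (old_tf[term] + smoothing)

ranked = sorted(new_tf, key=emergence_score, reverse=True)
print("candidate weak-signal terms:", ranked[:5])
```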

    Agent-Based Simulations of Blockchain protocols illustrated via Kadena's Chainweb

    While many distributed consensus protocols provide robust liveness and consistency guarantees in the presence of malicious actors, quantitative estimates of how economic incentives affect security are few and far between. In this paper, we describe a system for simulating how adversarial agents, both economically rational and Byzantine, interact with a blockchain protocol. This system provides statistical estimates of the economic difficulty of an attack and of how the presence of certain actors influences protocol-level statistics, such as the expected time to regain liveness. The simulation system is influenced by the design of algorithmic trading and reinforcement learning systems that use explicit modeling of an agent's reward mechanism to evaluate and optimize a fully autonomous agent. We implement and apply this simulation framework to Kadena's Chainweb, a parallelized Proof-of-Work system whose complexity lies in how miner incentive compliance affects security and censorship resistance. We provide the first formal description of Chainweb in the literature and use it to motivate our simulation design. Our simulation results include a phase transition in block height growth rate as a function of shard connectivity and empirical evidence that censorship in Chainweb is too costly for rational miners to engage in. We conclude with an outlook on how simulation can guide and optimize protocol development in a variety of contexts, including Proof-of-Stake parameter optimization and peer-to-peer networking design. Comment: 10 pages, 7 figures, accepted to the IEEE S&B 2019 conference
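    The toy simulation below (not Kadena's code) illustrates the kind of protocol-level statistic the paper studies: parallel chains may only extend when the neighbour chains they reference have kept pace, so block-height growth depends on how densely the chains are braided. The ring topology, mining probability, and round count are assumptions for the sketch.

```python
import random

def simulate(n_chains=10, degree=2, p_mine=0.3, rounds=2000, seed=0):
    """Average per-chain height growth per round under a simple braiding constraint."""
    rng = random.Random(seed)
    heights = [0] * n_chains
    # Each chain references its next `degree` neighbours on a ring (assumed topology).
    neighbours = [[(i + d) % n_chains for d in range(1, degree + 1)] for i in range(n_chains)]
    for _ in range(rounds):
        for i in range(n_chains):
            # A chain extends only if every referenced neighbour is not behind it.
            if rng.random() < p_mine and all(heights[j] >= heights[i] for j in neighbours[i]):
                heights[i] += 1
    return sum(heights) / (n_chains * rounds)

for deg in (1, 3, 5):
    print(f"degree {deg}: growth rate {simulate(degree=deg):.3f}")
```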

    IMM: An Imitative Reinforcement Learning Approach with Predictive Representation Learning for Automatic Market Making

    Market making (MM) has attracted significant attention in financial trading owing to its essential role in ensuring market liquidity. With strong capabilities in sequential decision-making, reinforcement learning (RL) has achieved remarkable success in quantitative trading. Nonetheless, most existing RL-based MM methods focus on optimizing single-price-level strategies, which suffer from frequent order cancellations and loss of queue priority. Strategies involving multiple price levels align better with actual trading scenarios. However, because multi-price-level strategies involve a comprehensive trading action space, effectively training profitable RL agents for MM remains challenging. Inspired by the efficient workflow of professional human market makers, we propose Imitative Market Maker (IMM), a novel RL framework that leverages both knowledge from suboptimal signal-based experts and direct policy interactions to develop multi-price-level MM strategies efficiently. The framework starts by introducing effective state and action representations adept at encoding information about multi-price-level orders. Furthermore, IMM integrates a representation learning unit capable of capturing both short- and long-term market trends to mitigate adverse selection risk. Subsequently, IMM formulates an expert strategy based on signals and trains the agent through a combination of RL and imitation learning techniques, leading to efficient learning. Extensive experimental results on four real-world market datasets demonstrate that IMM outperforms current RL-based market making strategies on several financial criteria. The findings of the ablation study substantiate the effectiveness of the model components.
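    A hedged sketch of the training idea the abstract outlines follows: the agent's objective mixes an RL term with an imitation term that pulls the policy towards a signal-based expert quoting several price levels per side. The linear policy, the expert rule, the stand-in policy-gradient term, and the weight `beta` are illustrative assumptions, not IMM's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_levels = 8, 3                    # LOB features; quote sizes at 3 levels per side

W = rng.normal(0, 0.1, size=(2 * n_levels, n_features))   # toy linear policy parameters

def policy(state):
    return np.tanh(W @ state)                  # quote intensities for bid/ask levels in [-1, 1]

def expert(state):
    # Placeholder signal-based expert: skew bid/ask quotes with a trend signal (state[0]).
    skew = np.clip(state[0], -1, 1)
    return np.concatenate([np.full(n_levels, 0.5 - 0.3 * skew),
                           np.full(n_levels, 0.5 + 0.3 * skew)])

def combined_loss(state, advantage, beta=0.5):
    a = policy(state)
    rl_term = -advantage * a.sum()                      # crude stand-in for a policy-gradient term
    imitation_term = np.mean((a - expert(state)) ** 2)  # pull the policy towards the expert's quotes
    return rl_term + beta * imitation_term

state = rng.normal(size=n_features)
print("combined loss:", combined_loss(state, advantage=0.1))
```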
