1,675 research outputs found

    Practical Deep Reinforcement Learning Approach for Stock Trading

    Stock trading strategy plays a crucial role in investment companies. However, it is challenging to obtain an optimal strategy in the complex and dynamic stock market. We explore the potential of deep reinforcement learning to optimize stock trading strategy and thus maximize investment return. 30 stocks are selected as our trading stocks, and their daily prices are used as the training and trading market environment. We train a deep reinforcement learning agent and obtain an adaptive trading strategy. The agent's performance is evaluated and compared with the Dow Jones Industrial Average and the traditional min-variance portfolio allocation strategy. The proposed deep reinforcement learning approach is shown to outperform the two baselines in terms of both the Sharpe ratio and cumulative returns.
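    The two evaluation metrics named in this abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names, the zero default risk-free rate, and the 252-trading-day annualization factor are assumptions chosen for illustration, and the input is a plain list of daily simple returns.

```python
import math
import statistics


def cumulative_return(daily_returns):
    """Compound a sequence of daily simple returns into one total return."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample std. dev.

    risk_free_daily and periods_per_year are illustrative defaults
    (assumptions), not values taken from the paper.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    if len(excess) < 2:
        raise ValueError("need at least two returns to estimate volatility")
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)
```

    A strategy and a baseline would each be passed through both functions over the same trading window, so that the comparison covers risk-adjusted performance as well as raw growth.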

    Study on stock trading and portfolio optimization using genetic network programming

    Degree system: new ; Report number: 甲3002号 ; Degree type: Doctor of Engineering ; Date conferred: 2010/3/15 ; Waseda University degree record number: 新525

    Study on probabilistic model building genetic network programming

    Degree system: new ; Report number: 甲3776号 ; Degree type: Doctor of Engineering ; Date conferred: 2013/3/15 ; Waseda University degree record number: 新6149 ; Waseda University

    Reinforcement Learning Applied to Trading Systems: A Survey

    Financial domain tasks, such as trading in market exchanges, are challenging and have long attracted researchers. The recent achievements and the consequent notoriety of Reinforcement Learning (RL) have also increased its adoption in trading tasks. RL uses a framework with well-established formal concepts, which raises its attractiveness in learning profitable trading strategies. However, using RL in the financial area without due attention can keep new researchers from following standards or lead them to neglect relevant conceptual guidelines. In this work, we embrace the seminal RL technical fundamentals, concepts, and recommendations to perform a unified, theoretically grounded examination and comparison of previous research that could serve as a structuring guide for the field of study. A selection of twenty-nine articles, drawn from a large volume of available studies, was reviewed under our classification, which considers RL's most common formulations and design patterns. This classification allowed for precise inspection of the most relevant aspects regarding data input, preprocessing, state and action composition, adopted RL techniques, evaluation setups, and overall results. Our analysis approach, organized around fundamental RL concepts, allowed for a clear identification of current system design best practices, gaps that require further investigation, and promising research opportunities. Finally, this review attempts to promote the development of this field of study by facilitating researchers' commitment to standards adherence and helping them to avoid straying away from the RL constructs' firm ground. Comment: 38 pages

    An investigation into the use of neural networks for the prediction of the stock exchange of Thailand

    Stock markets are affected by many interrelated factors such as economics and politics at both national and international levels. Predicting stock indices and determining the set of relevant factors for making accurate predictions are complicated tasks. Neural networks are one of the popular approaches used for research on stock market forecasting. This study developed neural networks to predict the movement direction of the Stock Exchange of Thailand (SET) index on the next trading day. The SET has yet to be studied extensively, and research focused on the SET will contribute to understanding its unique characteristics and will lead to identifying relevant information to assist investment in this stock market. Experiments were carried out to determine the best network architecture, training method, and input data to use for this task. With regard to network architecture, feedforward networks with three layers were used (an input layer, a hidden layer and an output layer), and networks with different numbers of nodes in the hidden layer were tested and compared. With regard to training method, neural networks were trained with back-propagation and with genetic algorithms. With regard to input data, three sets of inputs were used: internal indicators, external indicators, and a combination of both. The internal indicators are based on calculations derived from the SET, while the external indicators are deemed to be factors beyond the control of Thailand, such as the Dow Jones Index.

    Affinity-Based Reinforcement Learning: A New Paradigm for Agent Interpretability

    The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities. Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.

    Stock Market Prediction via Deep Learning Techniques: A Survey

    Stock market prediction has been a traditional yet complex problem researched within diverse research areas and application domains due to its non-linear, highly volatile and complex nature. Existing surveys on stock market prediction often focus on traditional machine learning methods instead of deep learning methods. Deep learning has dominated many domains and, in recent years, has gained much success and popularity in stock market prediction. This motivates us to provide a structured and comprehensive overview of the research on stock market prediction focusing on deep learning techniques. We present four elaborated subtasks of stock market prediction and propose a novel taxonomy to summarize the state-of-the-art models based on deep neural networks from 2011 to 2022. In addition, we also provide detailed statistics on the datasets and evaluation metrics commonly used in the stock market. Finally, we highlight some open issues and point out several future directions by sharing some new perspectives on stock market prediction.

    my Human Brain Project (mHBP)

    How can we make an agent that thinks like us humans? An agent that can have proprioception and intrinsic motivation, identify deception, use small amounts of energy, transfer knowledge between tasks, and evolve? This is the problem that this thesis focuses on. Being able to create a piece of software that can perform tasks like a human being is a goal that, if achieved, will allow us to extend our own capabilities to a very high level and have more tasks performed in a predictable fashion. This is one of the motivations for this thesis. To address this problem, we have proposed a modular architecture for Reinforcement Learning computation and developed an implementation to exercise this architecture. This software, which we call mHBP, is created in Python using Webots as an environment for the agent, and Neo4J, a graph database, as memory. mHBP takes the sensory data or other inputs and produces, based on the body parts / tools that the agent has available, an output consisting of actions to perform. A good part of this thesis involves experimental design with several iterations, exploring a theoretical approach to RL based on graph databases. We conclude, with our work in this thesis, that it is possible to represent episodic data in a graph, and that it is also possible to interconnect Webots, Python and Neo4J to support a stable architecture for Reinforcement Learning. In this work we also find a way to search for policies using the Neo4J querying language: Cypher. Another key conclusion of this work is that state representation needs further research to find a state definition that enables policy search to produce more useful policies. The article “REINFORCEMENT LEARNING: A LITERATURE REVIEW (2020)” at Research Gate with doi 10.13140/RG.2.2.30323.76327 is an outcome of this thesis.