Practical Deep Reinforcement Learning Approach for Stock Trading
Stock trading strategy plays a crucial role in investment companies. However,
it is challenging to obtain an optimal strategy in the complex and dynamic stock
market. We explore the potential of deep reinforcement learning to optimize
stock trading strategy and thus maximize investment return. Thirty stocks are
selected as our trading stocks and their daily prices are used as the training
and trading market environment. We train a deep reinforcement learning agent
and obtain an adaptive trading strategy. The agent's performance is evaluated
and compared with Dow Jones Industrial Average and the traditional min-variance
portfolio allocation strategy. The proposed deep reinforcement learning
approach is shown to outperform the two baselines in terms of both the Sharpe
ratio and cumulative returns.
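The two evaluation metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the zero default risk-free rate, and the 252-trading-day annualisation factor are all assumptions.

```python
import math
import statistics


def cumulative_return(daily_returns):
    """Compound simple daily returns into one total return."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualised Sharpe ratio: mean excess return over its sample std dev.

    The risk-free rate and the annualisation factor are assumed defaults.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    std = statistics.stdev(excess)  # needs at least two observations
    if std == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / std * math.sqrt(periods_per_year)
```

A strategy and a baseline would each be reduced to a list of daily returns and passed through both functions, so that they are compared on the same window.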
Study on stock trading and portfolio optimization using genetic network programming
System: New; Report number: Kō No. 3002; Degree type: Doctor of Engineering; Date conferred: 2010/3/15; Waseda University degree certificate number: New 525
Study on probabilistic model building genetic network programming
System: New; Report number: Kō No. 3776; Degree type: Doctor of Engineering; Date conferred: 2013/3/15; Waseda University degree certificate number: New 6149; Waseda University
Reinforcement Learning Applied to Trading Systems: A Survey
Financial domain tasks, such as trading in market exchanges, are challenging
and have long attracted researchers. The recent achievements and the consequent
notoriety of Reinforcement Learning (RL) have also increased its adoption in
trading tasks. RL uses a framework with well-established formal concepts, which
raises its attractiveness in learning profitable trading strategies. However,
applying RL in the financial area without due care can prevent new researchers
from following standards or lead them to overlook relevant conceptual guidelines. In
this work, we embrace the seminal RL technical fundamentals, concepts, and
recommendations to perform a unified, theoretically-grounded examination and
comparison of previous research that could serve as a structuring guide for the
field of study. A selection of twenty-nine articles was reviewed under our
classification that considers RL's most common formulations and design patterns
from a large volume of available studies. This classification allowed for
precise inspection of the most relevant aspects regarding data input,
preprocessing, state and action composition, adopted RL techniques, evaluation
setups, and overall results. Our analysis approach, organized around fundamental
RL concepts, allowed for a clear identification of current system design best
practices, gaps that require further investigation, and promising research
opportunities. Finally, this review attempts to promote the development of this
field of study by facilitating researchers' commitment to standards adherence
and helping them to avoid straying away from the RL constructs' firm ground.
Comment: 38 pages
An investigation into the use of neural networks for the prediction of the stock exchange of Thailand
Stock markets are affected by many interrelated factors such as economics and politics at both national and international levels. Predicting stock indices and determining the set of relevant factors for making accurate predictions are complicated tasks. Neural networks are one of the popular approaches used for research on stock market forecasting. This study developed neural networks to predict the movement direction of the Stock Exchange of Thailand (SET) index on the next trading day. The SET has yet to be studied extensively, and research focused on the SET will contribute to understanding its unique characteristics and will lead to identifying relevant information to assist investment in this stock market. Experiments were carried out to determine the best network architecture, training method, and input data to use for this task. With regard to network architecture, feedforward networks with three layers were used (an input layer, a hidden layer and an output layer), and networks with different numbers of nodes in the hidden layer were tested and compared. With regard to training method, neural networks were trained with back-propagation and with genetic algorithms. With regard to input data, three sets of inputs were used, namely internal indicators, external indicators and a combination of both. The internal indicators are based on calculations derived from the SET, while the external indicators are deemed to be factors beyond the control of Thailand, such as the Dow Jones Index
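The three-layer feedforward architecture described in this abstract can be sketched as a single forward pass. This is only an illustration under assumptions: the sigmoid activations, the 0.5 decision threshold, and all function and parameter names are hypothetical, and the training step (back-propagation or a genetic algorithm) is omitted.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Input layer -> one hidden layer -> single output in (0, 1).

    hidden_weights holds one row of weights per hidden node; the number of
    rows is the hidden-layer size that the study varies.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def predict_direction(probability_up, threshold=0.5):
    """Map the network output to a next-day movement label (assumed threshold)."""
    return "up" if probability_up >= threshold else "down"
```

The input vector would hold the internal indicators, the external indicators, or both, depending on which of the three input sets is being tested.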
Affinity-Based Reinforcement Learning: A New Paradigm for Agent Interpretability
The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities.
Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.
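One way to picture regularising an objective with a prior action distribution is shown below for a single state with discrete actions. The abstract does not give the exact form of the regulariser, so the KL-divergence penalty, the weight `lam`, and the function names here are assumptions made for illustration only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must be positive wherever p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regularised_objective(policy, rewards, prior, lam):
    """Expected reward minus a penalty for deviating from the prior.

    policy, prior: action probabilities for one state.
    rewards: reward per action.
    lam: assumed regularisation weight; 0 recovers plain reward maximisation.
    """
    expected_reward = sum(p * r for p, r in zip(policy, rewards))
    return expected_reward - lam * kl_divergence(policy, prior)
```

With a uniform prior this penalty equals a constant minus the policy entropy, so it reduces to entropy regularisation; a non-uniform prior instead pulls the agent towards specific preferred actions.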
Stock Market Prediction via Deep Learning Techniques: A Survey
Stock market prediction has been a traditional yet complex problem
researched within diverse research areas and application domains due to its
non-linear, highly volatile and complex nature. Existing surveys on stock
market prediction often focus on traditional machine learning methods instead
of deep learning methods. Deep learning has dominated many domains and has gained
much success and popularity in stock market prediction in recent years. This
motivates us to provide a structured and comprehensive overview of the research
on stock market prediction focusing on deep learning techniques. We present
four elaborated subtasks of stock market prediction and propose a novel
taxonomy to summarize the state-of-the-art models based on deep neural networks
from 2011 to 2022. In addition, we also provide detailed statistics on the
datasets and evaluation metrics commonly used in the stock market. Finally, we
highlight some open issues and point out several future directions by sharing
some new perspectives on stock market prediction.
my Human Brain Project (mHBP)
How can we make an agent that thinks like us humans? An agent that can have
proprioception, intrinsic motivation, identify deception, use small amounts of energy, transfer
knowledge between tasks and evolve? This is the problem that this thesis is focusing on.
Being able to create a piece of software that can perform tasks like a human being is
a goal that, if achieved, will allow us to extend our own capabilities to a very high level, and
have more tasks performed in a predictable fashion. This is one of the motivations for this
thesis.
To address this problem, we have proposed a modular architecture for
Reinforcement Learning computation and developed an implementation to exercise this
architecture. This software, which we call mHBP, is written in Python using Webots
as an environment for the agent, and Neo4J, a graph database, as memory. mHBP takes
the sensory data or other inputs, and produces, based on the body parts / tools that the
agent has available, an output consisting of actions to perform.
This thesis involves experimental design with several iterations, exploring a
theoretical approach to RL based on graph databases. We conclude, with our work in this
thesis, that it is possible to represent episodic data in a graph, and that it is also possible to
interconnect Webots, Python and Neo4J to support a stable architecture for Reinforcement
Learning. In this work we also find a way to search for policies using the Neo4J querying
language: Cypher. Another key conclusion of this work is that state representation needs
further research to find a state definition that enables policy search to produce more
useful policies.
The article “REINFORCEMENT LEARNING: A LITERATURE REVIEW (2020)” at
ResearchGate with doi 10.13140/RG.2.2.30323.76327 is an outcome of this thesis.