A Survey of Forex and Stock Price Prediction Using Deep Learning
The prediction of stock and foreign exchange (Forex) prices has long been a
popular and profitable area of study. Deep learning applications have proven to
yield better accuracy and returns in the field of financial prediction and
forecasting. In this survey we selected papers from the DBLP database for
comparison and analysis. We classified papers according to different deep
learning methods, which included: Convolutional neural network (CNN), Long
Short-Term Memory (LSTM), Deep neural network (DNN), Recurrent Neural Network
(RNN), Reinforcement Learning, and other deep learning methods such as HAN,
NLP, and Wavenet. Furthermore, this paper reviewed the dataset, variable,
model, and results of each article. The survey presented the results through
the most used performance metrics: RMSE, MAPE, MAE, MSE, accuracy, Sharpe
ratio, and return rate. We found that recent models combining LSTM
with other methods, such as DNN, are widely researched. Reinforcement
learning and other deep learning methods yielded strong returns and performance.
We conclude that in recent years the trend of using deep-learning-based methods
for financial modeling has been rising rapidly.
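As a concrete illustration of the performance metrics listed above, the regression-error measures can be computed as follows. This is a generic sketch with toy values, not code or data from any surveyed paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute the error metrics commonly reported in forecasting surveys."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n                # mean absolute error
    mse = sum(e * e for e in errors) / n                 # mean squared error
    rmse = math.sqrt(mse)                                # root mean squared error
    # mean absolute percentage error, in percent (assumes no zero actuals)
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n * 100
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

# Toy example: actual vs. predicted closing prices (illustrative values only).
metrics = regression_metrics([100.0, 102.0, 101.0, 105.0],
                             [101.0, 101.0, 103.0, 104.0])
print(metrics)
```

Accuracy, Sharpe ratio, and return rate are trading-oriented metrics and depend on the backtest setup rather than on point forecasts alone.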
The 8th International Conference on Time Series and Forecasting
The aim of ITISE 2022 is to create a friendly environment that could lead to the establishment or strengthening of scientific collaborations and exchanges among attendees. Therefore, ITISE 2022 is soliciting high-quality original research papers (including significant works-in-progress) on any aspect of time series analysis and forecasting, in order to motivate the generation and use of new knowledge, computational techniques, and methods for forecasting in a wide range of fields.
Realizing EDGAR: eliminating information asymmetries through artificial intelligence analysis of SEC filings
The U.S. Securities and Exchange Commission (SEC) maintains a publicly-accessible database of all required filings of all publicly traded companies. Known as EDGAR (Electronic Data Gathering, Analysis, and Retrieval), this database contains documents ranging from annual reports of major companies to personal disclosures of senior managers. However, the common user and particularly the retail investor are overwhelmed by the deluge of information, not empowered. EDGAR as it currently functions entrenches the information asymmetry between these retail investors and the large financial institutions with which they often trade. With substantial research staffs and budgets coupled to an industry standard of “playing both sides” of a transaction, these investors “in the know” lead price fluctuations while others must follow.
In general, this thesis applies recent technological advancements to the development of software tools that will derive valuable insights from EDGAR documents in an efficient time period. While numerous such commercial products currently exist, all come with significant price tags and many still rely on significant human involvement in deriving such insights. Recent years, however, have seen an explosion in the fields of Machine Learning (ML) and Natural Language Processing (NLP), which show promise in automating many of these functions with greater efficiency. ML aims to develop software which learns parameters from large datasets as opposed to traditional software which merely applies a programmer’s logic. NLP aims to read, understand, and generate language naturally, an area where recent ML advancements have proven particularly adept.
Specifically, this thesis serves as an exploratory study in applying recent advancements in ML and NLP to the vast range of documents contained in the EDGAR database. While algorithms will likely never replace the hordes of research analysts that now saturate securities markets nor the advantages that accrue to large and diverse trading desks, they do hold the potential to provide small yet significant insights at little cost.
This study first examines methods for document acquisition from EDGAR, with a focus on a baseline efficiency sufficient for the real-time trading needs of market participants. Next, it applies recent advancements in ML and NLP, specifically recurrent neural networks, to the task of standardizing financial statements across different filers. Finally, the conclusion contextualizes these findings in an environment of continued technological and commercial evolution.
Large Language Models in Finance: A Survey
Recent advances in large language models (LLMs) have opened new possibilities
for artificial intelligence applications in finance. In this paper, we provide
a practical survey focused on two key aspects of utilizing LLMs for financial
tasks: existing solutions and guidance for adoption.
First, we review current approaches employing LLMs in finance, including
leveraging pretrained models via zero-shot or few-shot learning, fine-tuning on
domain-specific data, and training custom LLMs from scratch. We summarize key
models and evaluate their performance improvements on financial natural
language processing tasks.
Second, we propose a decision framework to guide financial professionals in
selecting the appropriate LLM solution based on their use case constraints
around data, compute, and performance needs. The framework provides a pathway
from lightweight experimentation to heavy investment in customized LLMs.
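The adoption pathway described above can be sketched as a simple decision rule. The thresholds, budget labels, and branch descriptions below are purely hypothetical illustrations of such a framework, not the survey's actual criteria.

```python
def choose_llm_strategy(labeled_examples: int, compute_budget: str,
                        needs_domain_vocab: bool) -> str:
    """Hypothetical decision rule mirroring a lightweight-to-heavy adoption path.

    All thresholds and labels are illustrative assumptions, not from the survey.
    """
    if labeled_examples < 100:
        # Too little data to fine-tune reliably: stay with prompting.
        return "zero-shot or few-shot prompting of a pretrained model"
    if compute_budget == "low":
        return "few-shot prompting with curated in-context examples"
    if needs_domain_vocab and compute_budget == "very high":
        # Only the heaviest investment justifies pretraining from scratch.
        return "train a custom domain LLM from scratch"
    return "fine-tune a pretrained model on domain-specific data"

print(choose_llm_strategy(50, "low", False))
```

The point of such a rule is ordering: cheap experimentation first, escalating to fine-tuning or custom pretraining only when data, compute, and performance needs justify it.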
Lastly, we discuss limitations and challenges around leveraging LLMs in
financial applications. Overall, this survey aims to synthesize the
state-of-the-art and provide a roadmap for responsibly applying LLMs to advance
financial AI.
Comment: Accepted by the 4th ACM International Conference on AI in Finance
(ICAIF-23). https://ai-finance.or
DATA-DRIVEN ANALYTICAL MODELS FOR IDENTIFICATION AND PREDICTION OF OPPORTUNITIES AND THREATS
During the lifecycle of mega engineering projects such as energy facilities,
infrastructure projects, or data centers, executives in charge should take into account
the potential opportunities and threats that could affect the execution of such projects.
These opportunities and threats can arise from different domains, including
geopolitical, economic, and financial, and can have an impact on different
entities, such as countries, cities, or companies. The goal of this research is to provide
a new approach to identify and predict opportunities and threats using large and diverse
data sets, and ensemble Long-Short Term Memory (LSTM) neural network models to
inform domain specific foresights. In addition to predicting the opportunities and
threats, this research proposes new techniques to help decision-makers for deduction
and reasoning purposes. The proposed models and results provide structured output to
inform the executive decision-making process concerning large engineering projects
(LEPs). This research proposes new techniques that provide not only reliable
time-series predictions but also uncertainty quantification to help make more informed decisions.
The proposed ensemble framework consists of the following components: first,
processed domain knowledge is used to extract a set of entity-domain features; second,
structured learning, based on Dynamic Time Warping (DTW) to learn similarity
between sequences and Hierarchical Clustering Analysis (HCA), is used to determine
which features are relevant for a given prediction problem; and finally, an automated
decision based on the input and structured learning from the DTW-HCA is used to
build a training data-set which is fed into a deep LSTM neural network for time-series
predictions. A set of further ensemble components is proposed, such as Monte Carlo
simulations and time-label assignment, which offer a controlled setting for assessing the
impact of external shocks and a temporal alert system, respectively. The developed
model can be used to inform decision makers about the set of opportunities and threats
that their entities and assets face as a result of being engaged in an LEP, accounting for
epistemic uncertainty.
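The DTW similarity step described above is a classic dynamic-programming recurrence. The sketch below is a generic textbook implementation of that measure, not the authors' code.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    Classic O(len(a) * len(b)) dynamic program; a generic sketch of the
    similarity measure used in a DTW-HCA pipeline, not the authors' code.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # dp[i][j] = minimal cost of aligning a[:i] with b[:j]
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # stretch a
                                  dp[i][j - 1],      # stretch b
                                  dp[i - 1][j - 1])  # match step
    return dp[n][m]

# Series that differ only by a time shift remain close under DTW,
# which is why it suits similarity learning over misaligned sequences.
print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0
```

The resulting pairwise DTW distances can then feed a hierarchical clustering step to group entity-domain features by similarity.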
INVESTMENTS IN TIMES OF UNCERTAINTY: FORMATION OF PORTFOLIOS USING RANDOM FOREST
Objective: based on a systematic approach using machine learning, this research aims to propose a model for the selection and allocation of assets that allows for building profitable and safe portfolios, even in times of uncertainty and low predictability.
Methodology: we used the machine learning algorithm called random forest to associate the independent variables with a dependent one and learn the probability of positive returns in the month following the data collection. According to the probabilities, the stocks were allocated into long, short, or non-allocated portfolios. Finally, we allocated a share of gold, which is a protective asset widely used in times of crisis and uncertainty.
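The probability-to-portfolio mapping described above can be sketched as a simple threshold rule. The 0.6/0.4 cutoffs and the ticker names below are hypothetical illustrations, not values from the study.

```python
def allocate(prob_positive: float,
             long_cut: float = 0.6, short_cut: float = 0.4) -> str:
    """Map a predicted probability of a positive next-month return to a book.

    The 0.6/0.4 cutoffs are hypothetical illustrations, not the paper's values.
    """
    if prob_positive >= long_cut:
        return "long"
    if prob_positive <= short_cut:
        return "short"
    return "not allocated"

# Hypothetical model outputs (e.g. random-forest predict_proba) for four
# stocks, plus a fixed gold position as the protective hedge.
probs = {"AAA3": 0.75, "BBB4": 0.55, "CCC3": 0.30, "DDD4": 0.62}
portfolio = {ticker: allocate(p) for ticker, p in probs.items()}
portfolio["GOLD"] = "hedge"
print(portfolio)
```

In practice the probabilities would come from a classifier such as a random forest trained on the independent variables, with the gold share sized separately as a crisis hedge.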
Results and contributions: the study reached its goal and demonstrated that it is possible to build profitable and safe investment portfolios, even in times of greater uncertainty and volatility, as in 2020 due to the Covid-19 pandemic. We found that the model is effective both in moments of crisis and in moments of greater predictability, as in the period from 2016 to 2019, when the stock exchange had an uptrend.
Relevance: the relevance of this study points to an unprecedented historical context in Brazil, where uncertainties regarding both the local and world economy have demanded advanced prediction studies to minimize risks and contribute to results for investors. In addition, we highlight that, following a short period of low Selic rates (2019 to 2021), the Central Bank raised the rate again, shifting interest toward assets that are more profitable and safer than stocks.
Impact on the area: the study has a positive impact on the finance area, since studies in this field promote greater stability for investors and thus a better flow of capital to companies, which, in turn, contributes to society and the growth of the country.
Stock Trading Optimization through Model-based Reinforcement Learning with Normalizing Flows
With the fast development of quantitative portfolio optimization in financial
engineering, lots of promising algorithmic trading strategies have shown
competitive advantages in recent years. However, the environment of real
financial markets is complex and hard to fully simulate, considering the
non-stationarity of stock data, unpredictable hidden causal factors, and so
on. Fortunately, the difference of stock prices is often a stationary series,
and if the internal relationships between stock differences can be linked to the
decision-making process, the portfolio should be able to achieve better
performance. In this paper, we adopt normalizing flows to simulate the
high-dimensional joint probability of the complex trading environment, and
develop a novel model-based reinforcement learning framework
to better understand the intrinsic mechanisms of quantitative online trading.
Second, we experiment with various stocks from three different financial markets
(Dow, NASDAQ and S&P 500) and show that among these three financial markets,
Dow gets the best performance results on various evaluation metrics under our
back-testing system. In particular, our proposed method even resists large drops
(lower maximum drawdown) during the COVID-19 pandemic period, when the financial
market experienced an unpredictable crisis. All these results are comparatively better
than modeling the state transition dynamics with independent Gaussian
Processes. Third, we utilize a causal analysis method to study the causal
relationship among different stocks of the environment. Further, by visualizing
high dimensional state transition data comparisons from real and virtual buffer
with t-SNE, we uncover some effective patterns of bet…
Comment: arXiv admin note: text overlap with arXiv:2205.1505
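Maximum drawdown, the robustness metric emphasized above, measures the largest peak-to-trough loss of an equity curve. A minimal sketch, with a toy curve for illustration:

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                  # running high-water mark
        worst = max(worst, (peak - value) / peak)  # deepest decline so far
    return worst

# Toy curve: rises to 120, crashes to 90 (a 25% drawdown), then recovers.
print(max_drawdown([100, 110, 120, 95, 90, 105]))  # 0.25
```

A lower value indicates a strategy that lost less from its high-water mark during a crisis period such as the 2020 crash.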
A Comprehensive Survey of Deep Learning: Advancements, Applications, and Challenges
Artificial intelligence's "deep learning" discipline has taken off, revolutionizing a variety of industries, from computer vision and natural language processing to healthcare and finance. Deep learning has shown extraordinary effectiveness in resolving complicated problems and has a wide range of potential applications, from autonomous vehicles to healthcare. The purpose of this survey is to study deep learning's present condition, including recent advancements, difficulties, and constraints, since the field is growing rapidly. The basic ideas of deep learning, such as neural networks, activation functions, and optimization algorithms, are introduced first. We then explore numerous architectures, emphasizing their distinct properties and uses, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs). The survey also provides a thorough review of further deep learning concepts, applications, and difficulties. It aims to aid academics, professionals, and others who want to learn more about deep learning and explore its applications to challenging real-world problems.