92 research outputs found
A Systematic Review for Transformer-based Long-term Series Forecasting
The emergence of deep learning has yielded noteworthy advancements in time
series forecasting (TSF). Transformer architectures, in particular, have
witnessed broad adoption in TSF tasks, having proven highly effective at
extracting semantic correlations among the elements of a long sequence. A
range of variants has enabled the transformer architecture to handle
long-term time series forecasting (LTSF)
tasks. In this article, we first present a comprehensive overview of
transformer architectures and their subsequent enhancements developed to
address various LTSF tasks. Then, we summarize the publicly available LTSF
datasets and relevant evaluation metrics. Furthermore, we provide valuable
insights into the best practices and techniques for effectively training
transformers in the context of time-series analysis. Lastly, we propose
potential research directions in this rapidly evolving field.
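As background for the surveyed architectures, the scaled dot-product self-attention that transformers apply over a sequence can be sketched as follows. This is a generic NumPy illustration of the mechanism, not any particular LTSF variant:

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over an embedded time series.

    x:          (seq_len, d) sequence of embedded time steps
    wq, wk, wv: (d, d) query/key/value projection matrices
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[1])           # pairwise step similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over time steps
    return weights @ v                               # context-mixed representation

rng = np.random.default_rng(0)
seq_len, d = 8, 4
x = rng.standard_normal((seq_len, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
```

Each output step is a weighted mixture of every input step, which is what lets these models capture long-range correlations in a single layer.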
Two deep learning approaches to forecasting disaggregated freight flows: convolutional and encoder–decoder recurrent
Time series forecasting of disaggregated freight flows is a key issue in decision-making by port authorities. For this purpose,
and to test new deep learning techniques, we have selected seven time series of goods imported from Morocco to Spain
through the port of Algeciras and tested two deep neural network forecasting models: dilated causal
convolutional and encoder–decoder recurrent. We have experimented with four granularities for each series:
quarterly, monthly, weekly and daily. The results show that our neural network models can manage these raw series without
first removing seasonality or trend. We also highlight the ability of neural models to work with a fixed input size of one
year, making good predictions with the same input size at every granularity. The two deep learning models have
globally improved on the benchmarks of the M4 forecasting competition. Each neural network model obtains its best results
under different circumstances: the recurrent one with daily granularity and intermittent series, and the convolutional one
with weekly and monthly granularities.
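The dilated causal convolution underlying the first model can be sketched minimally. This toy NumPy version (illustrative, not the authors' implementation) shows how left padding and dilation keep the output at time t from seeing the future:

```python
import numpy as np

def dilated_causal_conv(x, kernel, dilation):
    """1-D dilated causal convolution.

    The output at time t depends only on x[t], x[t-d], x[t-2d], ...
    x: (T,) series, kernel: (K,) filter taps, dilation: gap between taps.
    """
    T, K = len(x), len(kernel)
    pad = (K - 1) * dilation                 # left-pad so nothing leaks from the future
    xp = np.concatenate([np.zeros(pad), x])
    out = np.zeros(T)
    for t in range(T):
        for k in range(K):
            out[t] += kernel[k] * xp[pad + t - k * dilation]
    return out

x = np.arange(10, dtype=float)
y = dilated_causal_conv(x, np.array([1.0, -1.0]), dilation=2)
# y[t] = x[t] - x[t-2]: a causal two-step difference
```

Stacking such layers with growing dilation gives a receptive field that covers a full year of history, which is how a fixed one-year input window can serve all granularities.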
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]
The field of urban spatial-temporal prediction is advancing rapidly with the
development of deep learning techniques and the availability of large-scale
datasets. However, challenges persist in accessing and utilizing diverse urban
spatial-temporal datasets from different sources and stored in different
formats, as well as determining effective model structures and components with
the proliferation of deep learning models. This work addresses these challenges
and provides three significant contributions. Firstly, we introduce "atomic
files", a unified storage format designed for urban spatial-temporal big data,
and validate its effectiveness on 40 diverse datasets, simplifying data
management. Secondly, we present a comprehensive overview of technological
advances in urban spatial-temporal prediction models, guiding the development
of robust models. Thirdly, we conduct extensive experiments using diverse
models and datasets, establishing a performance leaderboard and identifying
promising research directions. Overall, this work effectively manages urban
spatial-temporal data, guides future efforts, and facilitates the development
of accurate and efficient urban spatial-temporal prediction models. It can
potentially make long-term contributions to urban spatial-temporal data
management and prediction, ultimately leading to improved urban living
standards.
FC-GAGA: Fully Connected Gated Graph Architecture for Spatio-Temporal Traffic Forecasting
Forecasting of multivariate time-series is an important problem that has
applications in traffic management, cellular network configuration, and
quantitative finance. A special case of the problem arises when there is a
graph available that captures the relationships between the time-series. In
this paper we propose a novel learning architecture that achieves performance
competitive with or better than the best existing algorithms, without requiring
knowledge of the graph. The key element of our proposed architecture is the
learnable fully connected hard graph gating mechanism that enables the use of
the state-of-the-art and highly computationally efficient fully connected
time-series forecasting architecture in traffic forecasting applications.
Experimental results for two public traffic network datasets illustrate the
value of our approach, and ablation studies confirm the importance of each
element of the architecture. The code is available here:
https://github.com/boreshkinai/fc-gaga
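The hard graph gating idea can be illustrated with a minimal sketch. This is a generic ReLU-gated node-mixing toy, assuming nothing beyond the abstract; `gated_mix` and `w` are illustrative names, not the exact FC-GAGA formulation:

```python
import numpy as np

def gated_mix(h, w):
    """Hard-gated mixing of per-node time-series windows.

    h: (n_nodes, T) history window for each node
    w: (n_nodes, n_nodes) learnable gate logits; relu(w) drives some
       entries exactly to zero, hard-gating node pairs the model deems
       unrelated, so no graph has to be supplied up front.
    """
    gate = np.maximum(w, 0.0)        # zero entries fully block a connection
    return gate @ h                  # each node sees a gated sum over all nodes

rng = np.random.default_rng(1)
h = rng.standard_normal((3, 5))
w = np.array([[1.0, -5.0,  0.5],
              [-5.0, 1.0, -5.0],
              [ 0.2, -5.0, 1.0]])
mixed = gated_mix(h, w)
# node 1 has negative logits to both neighbours, so after the ReLU
# it mixes with nothing and keeps only its own history
```

The gate matrix plays the role of the unknown graph: it is learned jointly with the forecaster instead of being provided as an input.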
Deep Attentive Time Series Modelling for Quantitative Finance
Time series modelling and forecasting is a persistent problem with extensive
implications in scientific, business, industrial, and economic areas. This thesis’s contribution
is twofold. Firstly, we propose a novel probabilistic time series forecasting
methodology that introduces the use of Fourier domain-based attention models,
merging classic signal processing spectral filtering techniques with machine learning
architectures. Secondly, we take advantage of the abundance of financial intraday
high-frequency data to develop deep learning-based solutions for modelling financial
time series. Machine learning methods can potentially enhance the performance
of traditional methodologies used by practitioners. Deep neural networks’ feature
extraction capabilities, which can benefit from the rising accessibility of high-frequency
data, and attention mechanisms, which help to model temporal patterns,
are largely responsible for this potential.
Concerning our first major contribution, this thesis empirically demonstrates
that spectral domain-based machine learning models can learn the properties of time
series datasets and integrate this information to improve the forecasting accuracy.
Simultaneously, Fourier domain-based models alleviate some of the inconveniences
commonly associated with deep autoregressive models. These architectures, prone
to prioritising recent past data, often ignore critical global information not contained
in previous time steps. Additionally, they are susceptible to error accumulation
and propagation, and may not yield interpretable results. The proposed model, the
Spectral Attention Autoregressive Model (SAAM), mitigates these problems by
combining deep autoregressive models with a Spectral Attention (SA) module. This
module uses two attention models operating over the Fourier domain representation
of the time series’ embedding. Through spectral filtering, SAAM differentiates
between the components of the frequency domain that should be considered noise
and subsequently filtered out, and the global patterns that are relevant and should
be incorporated into the predictions. Empirical evaluation proves how the proposed
Spectral Attention module can be integrated into various deep autoregressive
models, consistently improving the results of these base architectures and achieving
state-of-the-art performance.
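The spectral filtering idea that the SA module builds on can be sketched as follows. This minimal NumPy example (not the learned attention model itself) keeps the strongest frequency components as global patterns and discards the rest as noise:

```python
import numpy as np

def spectral_filter(x, keep):
    """Keep the `keep` largest-magnitude frequency bins of x, zero the
    rest, and return the filtered series (its global-pattern part)."""
    spec = np.fft.rfft(x)
    weakest = np.argsort(np.abs(spec))[:-keep]   # all but the top-`keep` bins
    spec[weakest] = 0.0
    return np.fft.irfft(spec, n=len(x))

t = np.linspace(0, 1, 256, endpoint=False)
clean = np.sin(2 * np.pi * 3 * t)                        # dominant global pattern
noisy = clean + 0.3 * np.random.default_rng(2).standard_normal(256)
filtered = spectral_filter(noisy, keep=4)
# the filtered series lies much closer to the clean sinusoid than the noisy one
```

SAAM replaces this fixed top-k rule with attention models that learn, per series, which parts of the spectrum carry global structure and which are noise.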
Afterwards, this thesis shifts toward showcasing the benefits of machine learning
solutions in two different quantitative finance scenarios, proving how attention-based deep learning approaches compare favourably to classic parametric-based models and providing
solutions for various algorithmic and high-frequency trading problems. In the context of volatility
forecasting, which plays a central role among equity risk measures, we show that Dilated Causal
Convolutional-based neural networks offer significant performance gains compared to
well-established volatility-oriented parametric models. The proposed model, called DeepVol,
showcases how data-driven models can avoid the limitations of classical methods by taking
advantage of the abundance of high-frequency data. DeepVol outperforms baseline methods while
exhibiting robustness in the presence of volatility shocks, showing its ability to extract
universal features and transfer learning to out-of-distribution data. Consequently, data-driven
approaches should be carefully considered in the context of volatility forecasting, as they can be
instrumental in the valuation of financial
derivatives, risk management, and the formation of investment portfolios.
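The forecasting target in this setting is typically realized volatility computed from intraday returns; a minimal sketch of that standard estimator (illustrative context, not DeepVol itself):

```python
import numpy as np

def realized_volatility(prices):
    """Realized volatility over one session: the square root of the
    sum of squared intraday log returns."""
    log_ret = np.diff(np.log(prices))
    return np.sqrt(np.sum(log_ret ** 2))

# synthetic intraday path: 390 one-minute prices for a single session
rng = np.random.default_rng(3)
prices = 100.0 * np.exp(np.cumsum(0.001 * rng.standard_normal(390)))
rv = realized_volatility(prices)
```

High-frequency data make this target observable almost directly, which is the abundance the abstract argues data-driven models should exploit.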
Finally, this thesis presents a survival analysis model for estimating the distribution of fill
times for limit orders posted in the Limit Order Book (LOB). The proposed model, which does not
make assumptions about the underlying stochastic processes, employs a convolutional-Transformer
encoder and a monotonic neural network decoder to relate the time-varying features of the LOB to
the distribution of fill times. It grants practitioners the capability of making informed decisions
between market orders and limit orders, which in practice entails a trade-off between immediate
execution and price premium. We offer an exhaustive comparison of the survival functions resulting
from different order placement strategies, offering insight into the fill probability of orders
placed within the spread. Empirical evaluation reveals the superior performance of the monotonic
encoder-decoder convolutional-Transformer compared to state-of-the-art benchmarks, leading to more accurate
predictions and improved economic value.
Spatiotemporal convolutional network for time-series prediction and causal inference
Making predictions in a robust way is not easy for nonlinear systems. In this
work, a neural network computing framework, i.e., a spatiotemporal
convolutional network (STCN), was developed to efficiently and accurately
render a multistep-ahead prediction of a time series by employing a
spatial-temporal information (STI) transformation. The STCN combines the
advantages of both the temporal convolutional network (TCN) and the STI
equation, which maps the high-dimensional/spatial data to the future temporal
values of a target variable, thus naturally providing the prediction of the
target variable. From the observed variables, the STCN also infers the causal
factors of the target variable in the sense of Granger causality, which are in
turn selected as effective spatial information to improve the prediction
robustness. The STCN was successfully applied to both benchmark systems and
real-world datasets, in all of which it showed superior and robust performance in
multistep-ahead prediction, even when the data were perturbed by noise. From
both theoretical and computational viewpoints, the STCN has great potential in
practical applications in artificial intelligence (AI) or machine learning
fields as a model-free method based only on the observed data, and also opens a
new way to explore the observed high-dimensional data in a dynamical manner for
machine learning.
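The Granger-causal selection the STCN performs can be illustrated with a minimal lag-regression sketch. `granger_gain` is an illustrative toy (no significance test), not the STCN's actual inference procedure:

```python
import numpy as np

def granger_gain(y, x, lags=2):
    """Drop in y's one-step prediction MSE when lags of x are added to
    lags of y. A clearly positive gain suggests x Granger-causes y."""
    T = len(y)
    y_t = y[lags:]
    own = np.array([y[t - lags:t] for t in range(lags, T)])        # y's lags only
    both = np.array([np.concatenate([y[t - lags:t], x[t - lags:t]])
                     for t in range(lags, T)])                     # plus x's lags
    def mse(design):
        beta, *_ = np.linalg.lstsq(design, y_t, rcond=None)
        resid = y_t - design @ beta
        return float(np.mean(resid ** 2))
    return mse(own) - mse(both)

rng = np.random.default_rng(4)
x = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.9 * x[t - 1] + 0.1 * rng.standard_normal()  # x drives y at lag 1

g_forward = granger_gain(y, x)   # large: past x helps predict y
g_reverse = granger_gain(x, y)   # near zero: past y does not help predict x
```

Variables whose lags yield a clear gain are the kind the STCN retains as effective spatial information for robust multistep-ahead prediction.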
Online advertising revenue forecasting: an interpretable deep learning approach
This paper investigates whether publishers’ Google AdSense online advertising revenues can be predicted from Peekd’s proprietary database using deep learning methodologies. Peekd is a Berlin (Germany) based data science company which primarily provides e-retailers with sales and shopper intelligence. I find that, using a single deep learning model, AdSense revenues can be predicted across publishers. Additionally, using unsupervised clustering, publishers were grouped and related time series were fed as covariates when making predictions. No performance improvement was found with this technique. Finally, I find that in the short term, publishers’ AdSense revenues embed temporal patterns similar to those of web traffic.