92 research outputs found

    A Systematic Review for Transformer-based Long-term Series Forecasting

    Full text link
    The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have seen broad adoption in TSF tasks and have proven to be the most successful solution for extracting semantic correlations among the elements of a long sequence. Various variants have enabled the transformer architecture to handle long-term time series forecasting (LTSF) tasks effectively. In this article, we first present a comprehensive overview of transformer architectures and the subsequent enhancements developed to address various LTSF tasks. We then summarize the publicly available LTSF datasets and the relevant evaluation metrics. Furthermore, we provide insights into best practices and techniques for effectively training transformers for time-series analysis. Lastly, we propose potential research directions in this rapidly evolving field.

    Two deep learning approaches to forecasting disaggregated freight flows: convolutional and encoder–decoder recurrent

    Get PDF
    Time series forecasting of disaggregated freight flow is a key issue in decision-making by port authorities. For this purpose, and to test new deep learning techniques, we have selected seven time series of goods imported from Morocco to Spain through the port of Algeciras, and we have tested two deep neural network forecasting models: dilated causal convolutional and encoder–decoder recurrent. We have experimented with four different granularities for each series: quarterly, monthly, weekly and daily. The results show that our neural network models can handle these raw series without first removing seasonality or trend. We also highlight the ability of the neural models to work with a fixed input size of one year, making good predictions with the same input size across all granularities. The two deep learning models improve overall on the benchmarks of the M4 forecasting competition. Each neural network model obtains its best results under different circumstances: the recurrent one with daily granularity and intermittent series, and the convolutional one with weekly and monthly granularities.
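    The dilated causal convolution at the core of the first model can be sketched in a few lines. This is a hypothetical, dependency-free illustration of the building block, not the paper's implementation; the kernel and dilation values are made up for exposition.

```python
# Sketch of a single dilated causal convolution layer: the output at
# time t depends only on series[t], series[t - dilation],
# series[t - 2*dilation], ... so no future information leaks in.

def dilated_causal_conv(series, kernel, dilation=1):
    """Convolve `series` with `kernel`, looking only at past values."""
    out = []
    for t in range(len(series)):
        acc = 0.0
        for i, w in enumerate(kernel):
            j = t - i * dilation          # index into the past only
            if j >= 0:
                acc += w * series[j]      # implicit zero-padding before t = 0
        out.append(acc)
    return out

daily = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dilated_causal_conv(daily, kernel=[0.5, 0.5], dilation=2))
```

    Stacking such layers with dilations 1, 2, 4, ... doubles the receptive field at each level, which is how a fixed one-year input window can cover a long history at daily as well as weekly granularity.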

    Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]

    Full text link
    The field of urban spatial-temporal prediction is advancing rapidly with the development of deep learning techniques and the availability of large-scale datasets. However, challenges persist in accessing and utilizing diverse urban spatial-temporal datasets from different sources and stored in different formats, as well as determining effective model structures and components with the proliferation of deep learning models. This work addresses these challenges and provides three significant contributions. Firstly, we introduce "atomic files", a unified storage format designed for urban spatial-temporal big data, and validate its effectiveness on 40 diverse datasets, simplifying data management. Secondly, we present a comprehensive overview of technological advances in urban spatial-temporal prediction models, guiding the development of robust models. Thirdly, we conduct extensive experiments using diverse models and datasets, establishing a performance leaderboard and identifying promising research directions. Overall, this work effectively manages urban spatial-temporal data, guides future efforts, and facilitates the development of accurate and efficient urban spatial-temporal prediction models. It can potentially make long-term contributions to urban spatial-temporal data management and prediction, ultimately leading to improved urban living standards. Comment: 14 pages, 3 figures. arXiv admin note: text overlap with arXiv:2304.1434

    FC-GAGA: Fully Connected Gated Graph Architecture for Spatio-Temporal Traffic Forecasting

    Full text link
    Forecasting of multivariate time-series is an important problem that has applications in traffic management, cellular network configuration, and quantitative finance. A special case of the problem arises when there is a graph available that captures the relationships between the time-series. In this paper we propose a novel learning architecture that achieves performance competitive with or better than the best existing algorithms, without requiring knowledge of the graph. The key element of our proposed architecture is the learnable fully connected hard graph gating mechanism that enables the use of the state-of-the-art and highly computationally efficient fully connected time-series forecasting architecture in traffic forecasting applications. Experimental results for two public traffic network datasets illustrate the value of our approach, and ablation studies confirm the importance of each element of the architecture. The code is available here: https://github.com/boreshkinai/fc-gaga
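    The "learnable fully connected hard graph gating" idea can be sketched as follows. This is an illustrative assumption-laden reconstruction, not the exact FC-GAGA parameterisation: each node learns an embedding, pairwise scores pass through a ReLU so that many gates are exactly zero ("hard"), and each node's input is augmented with the gated histories of the other nodes.

```python
# Hard graph gating sketch: gates g[i][j] = ReLU(e_i . e_j), so
# negative-scoring edges are pruned to exactly zero, and each node
# aggregates other nodes' histories weighted by its gates.

def relu(x):
    return x if x > 0.0 else 0.0

def hard_graph_gate(embeddings):
    """Pairwise gates from learnable node embeddings (here fixed)."""
    n = len(embeddings)
    return [[relu(sum(a * b for a, b in zip(embeddings[i], embeddings[j])))
             for j in range(n)] for i in range(n)]

def gated_inputs(histories, gates):
    """Each node sees the other nodes' histories scaled by its gates."""
    n = len(histories)
    return [[sum(gates[i][j] * histories[j][t] for j in range(n))
             for t in range(len(histories[i]))] for i in range(n)]

embeddings = [[1.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
gates = hard_graph_gate(embeddings)
print(gates[0])  # node 0's gates; its edge to node 2 is shut off
```

    Because the gate matrix is learned end to end, no graph needs to be supplied up front, which matches the paper's claim of competitive accuracy without knowledge of the graph.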

    Deep Attentive Time Series Modelling for Quantitative Finance

    Get PDF
    International Doctorate Mention. Time series modelling and forecasting is a persistent problem with extensive implications in scientific, business, industrial, and economic areas. This thesis's contribution is twofold. Firstly, we propose a novel probabilistic time series forecasting methodology that introduces Fourier domain-based attention models, merging classic spectral filtering techniques from signal processing with machine learning architectures. Secondly, we take advantage of the abundance of financial intraday high-frequency data to develop deep learning-based solutions for modelling financial time series. Machine learning methods can potentially enhance the performance of the traditional methodologies used by practitioners, largely thanks to the feature extraction capabilities of deep neural networks, which benefit from the rising accessibility of high-frequency data, and to attention mechanisms, which help to model temporal patterns. Concerning our first major contribution, this thesis empirically demonstrates that spectral domain-based machine learning models can learn the properties of time series datasets and integrate this information to improve forecasting accuracy. At the same time, Fourier domain-based models alleviate some of the inconveniences commonly associated with deep autoregressive models: these architectures are prone to prioritising recent past data, often ignoring critical global information not contained in previous time steps; they are susceptible to error accumulation and propagation; and their results are often hard to interpret. The proposed model, the Spectral Attention Autoregressive Model (SAAM), mitigates these problems by combining deep autoregressive models with a Spectral Attention (SA) module. This module uses two attention models operating over the Fourier domain representation of the time series' embedding.
Through spectral filtering, SAAM differentiates between the components of the frequency domain that should be considered noise and subsequently filtered out, and the global patterns that are relevant and should be incorporated into the predictions. Empirical evaluation shows that the proposed Spectral Attention module can be integrated into various deep autoregressive models, consistently improving the results of these base architectures and achieving state-of-the-art performance. Afterwards, this thesis shifts toward showcasing the benefits of machine learning solutions in two different quantitative finance scenarios, proving how attention-based deep learning approaches compare favourably to classic parametric-based models and providing solutions for various algorithmic and high-frequency trading problems. In the context of volatility forecasting, which plays a central role among equity risk measures, we show that Dilated Causal Convolutional-based neural networks offer significant performance gains compared to well-established volatility-oriented parametric models. The proposed model, called DeepVol, showcases how data-driven models can avoid the limitations of classical methods by taking advantage of the abundance of high-frequency data. DeepVol outperforms baseline methods while exhibiting robustness in the presence of volatility shocks, showing its ability to extract universal features and transfer learning to out-of-distribution data. Consequently, data-driven approaches should be carefully considered in the context of volatility forecasting, as they can be instrumental in the valuation of financial derivatives, risk management, and the formation of investment portfolios. Finally, this thesis presents a survival analysis model for estimating the distribution of fill times for limit orders posted in the Limit Order Book (LOB).
The proposed model, which does not make assumptions about the underlying stochastic processes, employs a convolutional-Transformer encoder and a monotonic neural network decoder to relate the time-varying features of the LOB to the distribution of fill times. It grants practitioners the capability of making informed decisions between market orders and limit orders, which in practice entails a trade-off between immediate execution and price premium. We offer an exhaustive comparison of the survival functions resulting from different order placement strategies, offering insight into the fill probability of orders placed within the spread. Empirical evaluation reveals the superior performance of the monotonic encoder-decoder convolutional-Transformer compared to state-of-the-art benchmarks, leading to more accurate predictions and improved economic value. Doctoral Programme in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. Committee: Juan José Murillo Fuentes (chair); Emilio Parrado Hernández (secretary); Manuel Gómez Rodríguez (member).
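    The spectral filtering step inside the Spectral Attention module can be sketched as follows. This is a simplified assumption: the frequency weights here are a fixed low-pass mask, whereas in SAAM they are produced by learned attention models, and a naive DFT stands in for an efficient FFT to keep the example dependency-free.

```python
import cmath

# Move a signal to the Fourier domain, re-weight each frequency bin
# with an attention-like weight in [0, 1], and transform back. Bins
# weighted near zero are treated as noise and filtered out; bins
# weighted near one carry the global patterns kept for prediction.

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def spectral_filter(x, weights):
    """Scale each frequency bin of x by its attention weight."""
    return idft([w * Xk for w, Xk in zip(weights, dft(x))])

# A constant-plus-oscillation signal: keeping only bin 0 (the mean)
# removes the oscillation, as the module would remove "noise" bins.
x = [1.0, 3.0, 1.0, 3.0]
weights = [1.0, 0.0, 0.0, 0.0]
print([round(v, 6) for v in spectral_filter(x, weights)])
```

    In the thesis the weights are conditioned on the embedding of the series, so which bins count as noise is learned per dataset rather than fixed in advance.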

    Spatiotemporal convolutional network for time-series prediction and causal inference

    Full text link
    Making predictions in a robust way is not easy for nonlinear systems. In this work, a neural network computing framework, the spatiotemporal convolutional network (STCN), was developed to efficiently and accurately render multistep-ahead predictions of a time series by employing a spatial-temporal information (STI) transformation. The STCN combines the advantages of the temporal convolutional network (TCN) and the STI equation, which maps high-dimensional/spatial data to the future temporal values of a target variable, thus naturally providing the prediction of the target variable. From the observed variables, the STCN also infers the causal factors of the target variable in the sense of Granger causality, which are in turn selected as effective spatial information to improve prediction robustness. The STCN was successfully applied to both benchmark systems and real-world datasets, showing superior and robust performance in multistep-ahead prediction, even when the data were perturbed by noise. From both theoretical and computational viewpoints, the STCN has great potential for practical applications in artificial intelligence (AI) and machine learning as a model-free method based only on observed data, and it also opens a new way to explore observed high-dimensional data in a dynamical manner. Comment: 23 pages, 6 figures.
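    The STI transformation can be sketched in its simplest form. The linear map and its weights below are illustrative assumptions; in the STCN this map is realised by a temporal convolutional network and learned from data.

```python
# Spatial-temporal information (STI) transformation sketch: a snapshot
# of D observed variables at time t is mapped to the next L values of
# a single target variable, converting spatial (high-dimensional)
# information into a multistep-ahead temporal prediction.

def sti_transform(snapshot, phi):
    """Map D observed variables to L future target values: y = Phi @ x."""
    return [sum(w * x for w, x in zip(row, snapshot)) for row in phi]

D, L = 3, 2                       # 3 observed variables, 2-step horizon
snapshot = [0.5, 1.0, -0.5]       # observed variables at time t
phi = [[1.0, 0.0, 1.0],           # L x D map (illustrative weights)
       [0.0, 1.0, 1.0]]
future = sti_transform(snapshot, phi)
print(future)  # predicted [y_{t+1}, y_{t+2}]
```

    The Granger-causality step then prunes the columns of this map: variables whose histories do not reduce the target's prediction error are dropped from the snapshot, which is what makes the spatial information "effective".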

    Online advertising revenue forecasting: an interpretable deep learning approach

    Get PDF
    This paper investigates whether publishers' Google AdSense online advertising revenues can be predicted from Peekd's proprietary database using deep learning methodologies. Peekd is a Berlin (Germany) based data science company that primarily provides e-retailers with sales and shopper intelligence. I find that, using a single deep learning model, AdSense revenues can be predicted across publishers. Additionally, publishers were grouped using unsupervised clustering, and related time series were fed as covariates when making predictions; no performance improvement was found with this technique. Finally, I find that in the short term, publishers' AdSense revenues embed temporal patterns similar to those of web traffic.
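    The covariate construction tried above can be sketched as follows. This is a hedged illustration: a naive correlation threshold stands in for the paper's unsupervised clustering, and all names and values are made up for exposition.

```python
# Group publishers whose revenue series co-move, then give each series
# the elementwise average of its cluster-mates as an extra covariate
# channel for the forecasting model.

def correlation(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def cluster_covariates(series, threshold=0.9):
    """For each series, average the series correlated above `threshold`."""
    covs = []
    for i, s in enumerate(series):
        mates = [t for j, t in enumerate(series)
                 if j != i and correlation(s, t) > threshold]
        if mates:
            covs.append([sum(vals) / len(mates) for vals in zip(*mates)])
        else:
            covs.append([0.0] * len(s))   # no mates: neutral covariate
    return covs

revenues = [[1.0, 2.0, 3.0],   # publisher A, rising
            [2.0, 4.0, 6.0],   # publisher B, co-moves with A
            [5.0, 1.0, 5.0]]   # publisher C, unrelated
print(cluster_covariates(revenues)[0])  # B's series becomes A's covariate
```

    As the abstract reports, feeding such related series did not improve accuracy in the paper's experiments; the sketch only shows what the tested pipeline looked like.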