A neural network for mining large volumes of time series data
Efficiently mining large volumes of time series data is among the most challenging problems fundamental to many fields, such as industrial process monitoring, medical data analysis and business forecasting. This paper discusses a high-performance neural network for mining large time series data sets and some practical issues in time series data mining. Examples are presented of how this technology is used to search engine data within a major UK eScience Grid project (DAME) that supports the maintenance of Rolls-Royce aero-engines.
Predicting expected TCP throughput using genetic algorithm
Predicting the expected throughput of TCP is important for several purposes, such as determining handover criteria for future multihomed mobile nodes or determining the expected throughput of a given MPTCP subflow for load balancing. However, this is challenging due to the time-varying behavior of the underlying network characteristics. In this paper, we present a genetic-algorithm-based prediction model for estimating TCP throughput values. Our approach uses a genetic algorithm to find the best-matching combination of mathematical functions that approximates a given time series of TCP throughput samples. Based on collected historical data points of measured TCP throughput, our algorithm estimates the expected throughput over time. We evaluate the quality of the prediction using different selection and diversity strategies for creating new chromosomes. We also explore the use of different fitness functions to evaluate the goodness of a chromosome. The goal is to show how different tunings of the genetic algorithm affect the prediction. Using extensive simulations over several TCP throughput traces, we find that the genetic algorithm successfully finds reasonably matching mathematical functions that describe the sampled TCP throughput values with good fidelity. We also explore the effectiveness of predicting time series throughput samples for a given prediction horizon and estimate the prediction error and confidence. Peer Reviewed. Postprint (author's final draft)
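The idea of evolving a combination of functions that fits throughput samples can be illustrated with a minimal sketch. The abstract does not specify the chromosome encoding, the function set, or the operators, so everything here is an assumption made for illustration: a chromosome is a list of weights over a small fixed basis (`BASIS`), fitness is the mean squared error, selection is a size-3 tournament, and the best individual is carried over unchanged (elitism).

```python
import math
import random

# Hypothetical basis functions; a chromosome holds one weight per function.
BASIS = [lambda t: 1.0, lambda t: t, lambda t: math.sin(t)]


def predict(chromosome, t):
    """Evaluate the weighted sum of basis functions at time t."""
    return sum(w * f(t) for w, f in zip(chromosome, BASIS))


def mse(chromosome, samples):
    """Mean squared error of a chromosome over (t, throughput) samples."""
    return sum((predict(chromosome, t) - y) ** 2 for t, y in samples) / len(samples)


def evolve(samples, pop_size=40, generations=60, seed=0):
    """Return the best chromosome and the best error seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in BASIS] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: mse(c, samples))
        history.append(mse(pop[0], samples))
        next_pop = [pop[0][:]]  # elitism: keep the current best unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = min(rng.sample(pop, 3), key=lambda c: mse(c, samples))
            p2 = min(rng.sample(pop, 3), key=lambda c: mse(c, samples))
            # Uniform crossover followed by occasional Gaussian mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [w + rng.gauss(0, 0.3) if rng.random() < 0.2 else w
                     for w in child]
            next_pop.append(child)
        pop = next_pop
    pop.sort(key=lambda c: mse(c, samples))
    history.append(mse(pop[0], samples))
    return pop[0], history


# Synthetic samples (not taken from any real TCP trace).
samples = [(float(t), 10 + 0.5 * t + 2 * math.sin(t)) for t in range(20)]
best, history = evolve(samples)
```

Because the elite individual is copied into each new generation, the best error recorded in `history` never increases. Swapping in a different selection scheme or fitness function only requires changing the tournament lines or `mse`.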
Neural Networks for Complex Data
Artificial neural networks are simple and efficient machine learning tools.
Defined originally in the traditional setting of simple vector data, neural
network models have evolved to address more and more difficulties of complex
real world problems, ranging from time evolving data to sophisticated data
structures such as graphs and functions. This paper summarizes advances on
those themes from the last decade, with a focus on results obtained by members
of the SAMM team of Université Paris
Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong
Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H.
Tong [arXiv:1104.3073]. Comment: Published at http://dx.doi.org/10.1214/11-STS345B in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Improving Short-Term Electricity Price Forecasting Using Day-Ahead LMP with ARIMA Models
Short-term electricity price forecasting has become important for demand side
management and power generation scheduling. Especially as the electricity
market becomes more competitive, a more accurate price prediction than the
day-ahead locational marginal price (DALMP) published by the independent system
operator (ISO) will benefit participants in the market by increasing profit or
improving load demand scheduling. Hence, the main idea of this paper is to use
autoregressive integrated moving average (ARIMA) models to obtain a better LMP
prediction than the DALMP by utilizing the published DALMP, historical
real-time LMP (RTLMP) and other useful information. First, a set of seasonal
ARIMA (SARIMA) models utilizing the DALMP and historical RTLMP are developed
and compared with autoregressive moving average (ARMA) models that use the
differences between DALMP and RTLMP on their forecasting capability. A
generalized autoregressive conditional heteroskedasticity (GARCH) model is
implemented to further improve the forecasting by accounting for the price
volatility. The models are trained and evaluated using real market data in the
Midcontinent Independent System Operator (MISO) region. The evaluation results
indicate that the ARMAX-GARCH model, where an exogenous time series indicates
weekend days, improves the short-term electricity price prediction accuracy and
outperforms the other proposed ARIMA models. Comment: IEEE PES 2017 General Meeting, Chicago, I
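The role of the exogenous weekend indicator can be illustrated with a much simpler model than the one in the abstract. The sketch below fits a first-order autoregression with one exogenous input, y_t = c + phi * y_{t-1} + beta * w_t, by ordinary least squares. It has no moving-average or GARCH component, so it is not the ARMAX-GARCH model the paper evaluates; the function names and the synthetic price series are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_arx1(prices, weekend):
    """Least-squares estimate of (c, phi, beta) via the normal equations."""
    rows = [[1.0, prices[t - 1], float(weekend[t])]
            for t in range(1, len(prices))]
    targets = prices[1:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free prices generated from known parameters
# (c=5.0, phi=0.8, beta=-3.0); not real market data.
weekend = [1 if t % 7 in (5, 6) else 0 for t in range(60)]
prices = [30.0]
for t in range(1, 60):
    prices.append(5.0 + 0.8 * prices[-1] - 3.0 * weekend[t])

c, phi, beta = fit_arx1(prices, weekend)
```

On this noise-free series the fit should recover the generating parameters up to floating-point error. On real prices, a statistics library with proper SARIMA and GARCH estimation would be the more appropriate tool.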
Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching
Personalization in marketing aims at improving the shopping experience of
customers by tailoring services to individuals. In order to achieve this,
businesses must be able to make personalized predictions regarding the next
purchase. That is, one must forecast the exact list of items that will comprise
the next purchase, i.e., the so-called market basket. Despite its relevance to
firm operations, this problem has received surprisingly little attention in
prior research, largely due to its inherent complexity. In fact,
state-of-the-art approaches are limited to intuitive decision rules for pattern
extraction. However, the simplicity of the pre-coded rules impedes performance,
since decision rules operate in an autoregressive fashion: the rules can only
make inferences from past purchases of a single customer without taking into
account the knowledge transfer that takes place between customers. In contrast,
our research overcomes the limitations of pre-set rules by contributing a novel
predictor of market baskets from sequential purchase histories: our predictions
are based on similarity matching in order to identify similar purchase habits
among the complete shopping histories of all customers. Our contributions are
as follows: (1) We propose similarity matching based on subsequential dynamic
time warping (SDTW) as a novel predictor of market baskets. Thereby, we can
effectively identify cross-customer patterns. (2) We leverage the Wasserstein
distance for measuring the similarity among embedded purchase histories. (3) We
develop a fast approximation algorithm for computing a lower bound of the
Wasserstein distance in our setting. An extensive series of computational
experiments demonstrates the effectiveness of our approach. The accuracy of
identifying the exact market baskets based on state-of-the-art decision rules
from the literature is outperformed by a factor of 4.0. Comment: Accepted for oral presentation at 25th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD 2019)
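Contributions (1) and (2) combine subsequence matching with a distance between baskets, which the following minimal sketch illustrates. It makes strong simplifying assumptions that are not from the paper: each item is embedded as a single number, baskets being compared have equal size (so the one-dimensional Wasserstein-1 distance reduces to the mean absolute difference of sorted values), and the function names are hypothetical. The paper's embeddings and its lower-bound approximation are not reproduced here.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def subsequence_dtw(query, series, cost=lambda a, b: abs(a - b)):
    """Best DTW alignment of `query` to any contiguous part of `series`.

    Returns (distance, end_index), where end_index is the position in
    `series` at which the best-matching subsequence ends.
    """
    n, m = len(query), len(series)
    # First row: the match may start anywhere in `series` at no extra cost.
    prev = [cost(query[0], series[j]) for j in range(m)]
    for i in range(1, n):
        cur = [prev[0] + cost(query[i], series[0])]
        for j in range(1, m):
            step = min(prev[j], cur[j - 1], prev[j - 1])
            cur.append(cost(query[i], series[j]) + step)
        prev = cur
    # Last row: the match may end anywhere in `series`.
    end = min(range(m), key=lambda j: prev[j])
    return prev[end], end


# Scalar example: the query occurs exactly inside the longer series.
scalar_match = subsequence_dtw([1, 2, 3], [9, 9, 1, 2, 3, 9])

# Basket example: each element is a basket of 1-D item embeddings,
# compared with the Wasserstein distance instead of an absolute difference.
basket_match = subsequence_dtw([[1.0, 2.0]], [[5.0, 6.0], [2.0, 1.0]],
                               cost=wasserstein_1d)
```

Passing the basket distance in as `cost` keeps the alignment logic independent of how baskets are embedded, so a different distance can be substituted without touching the dynamic program.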
Evolutionary rule-based system for IPO underpricing prediction
Genetic and Evolutionary Computation Conference, Washington DC, USA, 25-29 June 2005. Academic literature has long documented the existence of significant price gains on the first trading day of initial public offerings (IPOs). Most of the empirical analysis carried out to date to explain underpricing through the offering structure is based on multiple linear regression. The alternative that we suggest is a rule-based system defined by a genetic algorithm using a Michigan approach. The system offers significant advantages in two areas: (1) higher predictive performance, and (2) robustness to outlier patterns. The importance of the latter should be emphasized, since the non-trivial task of selecting the patterns to be excluded from the training sample severely affects the results. We compare the predictions provided by the algorithm to those obtained from linear models frequently used in the IPO literature. The predictions are based on seven classic variables. The results suggest that there is a clear correlation between the selected variables and the initial return, making it possible to predict, to a certain extent, the closing price. This article has been financed by the Spanish-funded MCyT research project TRACER, Ref: TIC2002-04498-C05-04M