45,835 research outputs found

    A neural network for mining large volumes of time series data

    Efficiently mining large volumes of time series data is among the most challenging problems in many fields, such as industrial process monitoring, medical data analysis and business forecasting. This paper discusses a high-performance neural network for mining large time series data sets and some practical issues in time series data mining. Examples are presented of how this technology is used to search engine data within a major UK eScience Grid project (DAME) that supports the maintenance of Rolls-Royce aero-engines.

    Predicting expected TCP throughput using genetic algorithm

    Predicting the expected throughput of TCP is important for several purposes, such as determining handover criteria for future multihomed mobile nodes or determining the expected throughput of a given MPTCP subflow for load balancing. However, this is challenging due to the time-varying behavior of the underlying network characteristics. In this paper, we present a genetic-algorithm-based prediction model for estimating TCP throughput values. Our approach uses a genetic algorithm to find the best-matching combination of mathematical functions that approximates a given time series of TCP throughput samples. Based on collected historical measurements of TCP throughput, our algorithm estimates the expected throughput over time. We evaluate the quality of the prediction using different selection and diversity strategies for creating new chromosomes. We also explore the use of different fitness functions to evaluate the goodness of a chromosome. The goal is to show how different tunings of the genetic algorithm affect the prediction. Using extensive simulations over several TCP throughput traces, we find that the genetic algorithm successfully finds reasonable matching mathematical functions that describe the sampled TCP throughput values with good fidelity. We also explore the effectiveness of predicting time series throughput samples for a given prediction horizon and estimate the prediction error and confidence. Peer Reviewed. Postprint (author's final draft)

    Neural Networks for Complex Data

    Artificial neural networks are simple and efficient machine learning tools. Defined originally in the traditional setting of simple vector data, neural network models have evolved to address more and more difficulties of complex real world problems, ranging from time evolving data to sophisticated data structures such as graphs and functions. This paper summarizes advances on those themes from the last decade, with a focus on results obtained by members of the SAMM team of Université Paris

    Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong

    Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong [arXiv:1104.3073]. Comment: Published at http://dx.doi.org/10.1214/11-STS345B in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Improving Short-Term Electricity Price Forecasting Using Day-Ahead LMP with ARIMA Models

    Short-term electricity price forecasting has become important for demand-side management and power generation scheduling. As the electricity market becomes more competitive, a price prediction more accurate than the day-ahead locational marginal price (DALMP) published by the independent system operator (ISO) will benefit market participants by increasing profit or improving load demand scheduling. Hence, the main idea of this paper is to use autoregressive integrated moving average (ARIMA) models to obtain a better LMP prediction than the DALMP by utilizing the published DALMP, historical real-time LMP (RTLMP) and other useful information. First, a set of seasonal ARIMA (SARIMA) models utilizing the DALMP and historical RTLMP is developed and compared, in terms of forecasting capability, with autoregressive moving average (ARMA) models that use the differences between DALMP and RTLMP. A generalized autoregressive conditional heteroskedasticity (GARCH) model is implemented to further improve the forecasting by accounting for price volatility. The models are trained and evaluated using real market data in the Midcontinent Independent System Operator (MISO) region. The evaluation results indicate that the ARMAX-GARCH model, where an exogenous time series indicates weekend days, improves the short-term electricity price prediction accuracy and outperforms the other proposed ARIMA models. Comment: IEEE PES 2017 General Meeting, Chicago, I
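The idea of modeling the difference between DALMP and RTLMP can be illustrated with a minimal sketch. This is not the paper's ARMA, SARIMA or ARMAX-GARCH model: it assumes a zero-mean AR(1) process on the spread (RTLMP minus DALMP), fitted by ordinary least squares, and the function names and numbers are hypothetical toy values.

```python
def fit_ar1(spread):
    """Least-squares AR(1) coefficient for a zero-mean series (an assumption)."""
    num = sum(cur * prev for cur, prev in zip(spread[1:], spread[:-1]))
    den = sum(prev * prev for prev in spread[:-1])
    return num / den


def forecast_rtlmp(dalmp_next, spread_history):
    """Correct the published day-ahead price with the predicted spread.

    spread_history holds past values of RTLMP - DALMP (hypothetical data).
    """
    phi = fit_ar1(spread_history)
    predicted_spread = phi * spread_history[-1]
    return dalmp_next + predicted_spread


# Toy spread values in $/MWh, invented for illustration only.
toy_spread = [4.0, 2.0, 1.0, 0.5]
print(fit_ar1(toy_spread))
print(forecast_rtlmp(30.0, toy_spread))
```

A full treatment would add moving-average terms, seasonality and a volatility model, for which a dedicated time series library is the more appropriate tool.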

    Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching

    Personalization in marketing aims at improving the shopping experience of customers by tailoring services to individuals. In order to achieve this, businesses must be able to make personalized predictions regarding the next purchase. That is, one must forecast the exact list of items that will comprise the next purchase, i.e., the so-called market basket. Despite its relevance to firm operations, this problem has received surprisingly little attention in prior research, largely due to its inherent complexity. In fact, state-of-the-art approaches are limited to intuitive decision rules for pattern extraction. However, the simplicity of the pre-coded rules impedes performance, since decision rules operate in an autoregressive fashion: the rules can only make inferences from past purchases of a single customer without taking into account the knowledge transfer that takes place between customers. In contrast, our research overcomes the limitations of pre-set rules by contributing a novel predictor of market baskets from sequential purchase histories: our predictions are based on similarity matching in order to identify similar purchase habits among the complete shopping histories of all customers. Our contributions are as follows: (1) We propose similarity matching based on subsequential dynamic time warping (SDTW) as a novel predictor of market baskets. Thereby, we can effectively identify cross-customer patterns. (2) We leverage the Wasserstein distance for measuring the similarity among embedded purchase histories. (3) We develop a fast approximation algorithm for computing a lower bound of the Wasserstein distance in our setting. An extensive series of computational experiments demonstrates the effectiveness of our approach. The accuracy of identifying the exact market baskets based on state-of-the-art decision rules from the literature is outperformed by a factor of 4.0. Comment: Accepted for oral presentation at 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019)
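The subsequence matching in contribution (1) can be sketched in a few lines. This is a generic subsequence DTW under simplifying assumptions, not the authors' implementation: the sequences are plain numbers and the local cost is an absolute difference, whereas the abstract describes a Wasserstein distance between embedded baskets. The function name is hypothetical.

```python
def subsequence_dtw(query, series):
    """Find where `query` best matches inside the longer `series`.

    Returns (cost, end_index), where end_index is the position in
    `series` at which the best-matching subsequence ends.
    """
    n, m = len(query), len(series)
    inf = float("inf")
    # acc[i][j]: cheapest cost of aligning query[:i] with some
    # subsequence of series that ends at series[j - 1].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    # Row 0 is all zeros so a match may start anywhere in `series`.
    acc[0] = [0.0] * (m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost is an assumption; swap in any distance here.
            local = abs(query[i - 1] - series[j - 1])
            acc[i][j] = local + min(acc[i - 1][j],      # repeat series item
                                    acc[i][j - 1],      # repeat query item
                                    acc[i - 1][j - 1])  # advance both
    # The match may also end anywhere, so scan the last row.
    end = min(range(1, m + 1), key=lambda j: acc[n][j])
    return acc[n][end], end - 1


cost, end_index = subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])
print(cost, end_index)
```

In a prediction setting, the item that follows the matched subsequence in another customer's history would be a natural candidate for the next basket; that step is omitted here.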

    Evolutionary rule-based system for IPO underpricing prediction

    Genetic And Evolutionary Computation Conference. Washington DC, USA, 25-29 June 2005. Academic literature has long documented the existence of important price gains on the first trading day of initial public offerings (IPOs). Most of the empirical analysis carried out to date to explain underpricing through the offering structure is based on multiple linear regression. The alternative that we suggest is a rule-based system defined by a genetic algorithm using a Michigan approach. The system offers significant advantages in two areas: 1) higher predictive performance, and 2) robustness to outlier patterns. The importance of the latter should be emphasized, since the non-trivial task of selecting the patterns to be excluded from the training sample severely affects the results. We compare the predictions provided by the algorithm to those obtained from linear models frequently used in the IPO literature. The predictions are based on seven classic variables. The results suggest that there is a clear correlation between the selected variables and the initial return, making it possible to predict, to a certain extent, the closing price. This article has been financed by the Spanish-funded research MCyT project TRACER, Ref: TIC2002-04498-C05-04M