
    A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition

    Multi-step ahead forecasting is still an open challenge in time series forecasting. Several approaches to this complex problem have been proposed in the literature, but an extensive comparison on a large number of tasks is still missing. This paper aims to fill this gap by reviewing existing strategies for multi-step ahead forecasting and comparing them in theoretical and practical terms. To that end, we performed a large-scale comparison of these strategies on a large experimental benchmark (namely, the 111 series from the NN5 forecasting competition). In addition, we considered the effects of deseasonalization, input variable selection, and forecast combination on these strategies and on multi-step ahead forecasting at large. Three findings are consistently supported by the experimental results: Multiple-Output strategies are the best-performing approaches, deseasonalization leads to uniformly improved forecast accuracy, and input selection is more effective when performed in conjunction with deseasonalization.
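
The recursive and direct strategies that such reviews compare can be sketched in a few lines. The illustration below is not from the paper; it uses a plain least-squares linear learner purely to show how the two strategies differ in how models are trained and applied:

```python
import numpy as np

def _fit_linear(X, y):
    # Ordinary least squares with an intercept column.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def _predict_linear(coef, x):
    return coef[0] + np.dot(coef[1:], x)

def recursive_forecast(series, n_lags, horizon):
    """Recursive strategy: one one-step model; predictions are fed back as inputs."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    coef = _fit_linear(X, np.asarray(series[n_lags:], dtype=float))
    window = list(series[-n_lags:])
    preds = []
    for _ in range(horizon):
        yhat = _predict_linear(coef, window)
        preds.append(yhat)
        window = window[1:] + [yhat]  # error can accumulate here
    return preds

def direct_forecast(series, n_lags, horizon):
    """Direct strategy: a separate model trained for each forecast step h."""
    preds = []
    for h in range(1, horizon + 1):
        X = np.array([series[i:i + n_lags]
                      for i in range(len(series) - n_lags - h + 1)])
        y = np.asarray(series[n_lags + h - 1:], dtype=float)
        coef = _fit_linear(X, y)
        preds.append(_predict_linear(coef, series[-n_lags:]))
    return preds
```

Multiple-Output strategies, which the paper finds best-performing, instead fit a single model that emits all horizon values jointly, preserving the dependency between forecasted steps.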

    Multi-resolution forecast aggregation for time series in agri datasets

    A wide variety of phenomena are characterized by time-dependent dynamics that can be analyzed using time series methods. Various time series analysis techniques have been proposed, each addressing certain aspects of the data. In time series analysis, forecasting is a challenging problem when estimating over extended time horizons, which effectively requires multi-step-ahead (MSA) prediction. The two original solutions to MSA are the direct and the recursive approaches. Recent studies have mainly focused on combining these methods in an attempt to overcome the loss of sequential correlation in the direct strategy and the accumulation of error in the recursive strategy. This paper introduces a technique known as Multi-Resolution Forecast Aggregation (MRFA), which incorporates an additional concept known as Resolutions of Impact. MRFA is shown to have favourable prediction capabilities in comparison to a number of state-of-the-art methods.

    Extreme Learning Machine Based Prognostics of Battery Life

    This paper presents a prognostic scheme for estimating the remaining useful life of lithium-ion batteries. The proposed scheme utilizes a prediction module that aims to obtain precise predictions for both short and long prediction horizons. The prediction module makes use of extreme learning machines for one-step and multi-step ahead predictions, using various prediction strategies, including iterative, direct, and DirRec, which use the constant-current experimental capacity data to estimate the remaining useful life. The data-driven prognostic approach is highly dependent on the availability of a large quantity of high-quality observations; an insufficient amount of data can result in unsatisfactory prognostics. In this paper, the prognostic scheme is used to estimate the remaining useful life of a battery with insufficient direct data available, taking advantage of observations from a fleet of similar batteries operating under similar conditions. Experimental results show that the proposed scheme provides a fast and efficient estimate of the remaining useful life of the batteries and achieves superior results compared with various state-of-the-art prediction techniques.
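
For readers unfamiliar with extreme learning machines, the core idea is that the hidden-layer weights are drawn at random and left untrained; only the output weights are solved in closed form by least squares, which makes training very fast. A minimal illustrative sketch of that idea (not the paper's implementation):

```python
import numpy as np

class ELM:
    """Minimal extreme learning machine for regression (illustrative sketch)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, fixed hidden layer with tanh activations.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Closed-form least-squares solve for the output weights only.
        self.beta, *_ = np.linalg.lstsq(H, np.asarray(y, dtype=float), rcond=None)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta
```

In the iterative (recursive) and DirRec battery-capacity settings described above, such a model would be wrapped in the corresponding multi-step strategy, with capacity lags as inputs.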

    Machine condition prognosis using multi-step ahead prediction and neuro-fuzzy systems

    This paper presents an approach for predicting the operating conditions of a machine based on an adaptive neuro-fuzzy inference system (ANFIS) combined with the direct prediction strategy for multi-step ahead time series prediction. In this study, the number of available observations and the number of predicted steps are initially determined using the false nearest neighbor method and the auto mutual information technique, respectively. These values are then used as inputs to the prediction models to forecast the future values of the machine's operating conditions. The performance of the proposed approach is evaluated on real trending data from a low-methane compressor. The results show that the ANFIS prediction model can track changes in machine condition and has potential as a tool for machine fault prognosis.
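
The auto mutual information technique mentioned above is commonly estimated from a 2-D histogram of the series against a lagged copy of itself, with the first local minimum of the curve taken as the embedding delay. The following is a rough sketch of that standard heuristic, as an assumption about the generic method rather than the paper's code:

```python
import numpy as np

def auto_mutual_information(series, max_lag, bins=16):
    """Histogram estimate of I(x_t; x_{t-lag}) for lag = 1..max_lag."""
    x = np.asarray(series, dtype=float)
    ami = []
    for lag in range(1, max_lag + 1):
        a, b = x[:-lag], x[lag:]
        joint, _, _ = np.histogram2d(a, b, bins=bins)
        pxy = joint / joint.sum()
        px = pxy.sum(axis=1, keepdims=True)   # marginal of x_t
        py = pxy.sum(axis=0, keepdims=True)   # marginal of x_{t-lag}
        nz = pxy > 0
        ami.append(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
    return np.array(ami)

def first_minimum_lag(ami):
    """First local minimum of the AMI curve (falls back to the global one)."""
    for i in range(1, len(ami) - 1):
        if ami[i] < ami[i - 1] and ami[i] < ami[i + 1]:
            return i + 1  # lags are 1-based
    return int(np.argmin(ami)) + 1
```

The companion false nearest neighbor method (not sketched here) then fixes the embedding dimension by increasing it until neighbors stop being artifacts of projection.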

    Three Essays on Mixture Model and Gaussian Processes

    This dissertation includes three essays. In the first essay, I study the problem of density estimation using normal mixture models. Instead of selecting the 'right' number of components in a normal mixture model, I propose an Averaged Normal Mixture (ANM) model that estimates the underlying density by model averaging, combining normal mixture models with different numbers of components. I use two methods to estimate the mixing weights of the proposed model: one based on likelihood cross-validation and the other based on Bayesian information criterion (BIC) weights. I also establish the theoretical properties of the proposed estimator, and simulation results demonstrate its good performance in estimating different types of underlying densities. The proposed method is also applied to a real-world data set, and the empirical evidence demonstrates the efficiency of this estimator. The second essay studies short-term electricity demand forecasting using Gaussian processes and different forecast strategies. I propose a hybrid forecasting strategy that combines the strengths of different forecasting schemes to predict the 24 hourly electricity demands for the next day. This method is shown to provide superior point and probabilistic forecasts. I demonstrate the economic utility of the proposed method by illustrating how the Gaussian process probabilistic forecasts can be used to reduce the expected cost of electricity supply relative to conventional regression methods, and, in a decision-theoretic framework, to derive an optimal bidding strategy under a stylized asymmetric loss function for electricity suppliers. The third essay studies a non-stationary modeling approach based on Gaussian process regression for crop yield modeling and crop insurance rate estimation. This approach departs from the conventional two-step estimation procedure and allows the yield distributions to vary over time. I further develop a performance-weighted model averaging method that constructs densities as mixtures of Gaussian processes. This method not only facilitates information pooling but also greatly improves the flexibility of the resulting predictive density of crop yields. Simulation results on crop insurance premium rates show that the proposed method compares favorably to conventional two-stage estimators, especially when the underlying distributions are non-stationary. I illustrate the efficacy of the proposed method with an application to crop insurance policy selection by insurance companies. I adopt a decision-theoretic framework and demonstrate that insurance companies can use the proposed method to effectively identify profitable policies under symmetric or asymmetric loss functions.
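
BIC-based model-averaging weights of the kind the first essay describes typically follow the standard transformation w_i ∝ exp(-ΔBIC_i / 2), where ΔBIC_i is each model's gap to the smallest BIC. A minimal sketch under that assumption (the dissertation's exact estimator may differ):

```python
import numpy as np

def bic_weights(bic_values):
    """Convert per-model BIC scores into normalized averaging weights."""
    bic = np.asarray(bic_values, dtype=float)
    delta = bic - bic.min()          # gap to the best (smallest) BIC
    w = np.exp(-0.5 * delta)         # standard BIC-weight transformation
    return w / w.sum()

def averaged_density(x, densities, weights):
    """Averaged model: weighted sum of the component densities at x."""
    return sum(w * d(x) for w, d in zip(weights, densities))
```

A lower BIC thus earns a model exponentially more weight, while every candidate still contributes to the averaged density.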

    Methodologies for time series prediction and missing value imputation

    The amount of data collected worldwide is increasing all the time. More sophisticated measuring instruments and increases in computer processing power produce more and more data, which demands greater collection, transmission, and storage capacity. Even though computers are faster, large databases also need good, accurate methodologies to be useful in practice. Some techniques are not feasible for very large databases or cannot provide the necessary accuracy. As the title proclaims, this thesis focuses on two aspects of working with databases: time series prediction and missing value imputation. The first is a function approximation and regression problem but can, in some cases, also be formulated as a classification task. Accurate prediction of future values depends heavily not only on a good model that is well trained and validated, but also on preprocessing, input variable selection or projection, and the choice of output approximation strategy. The importance of all these choices increases as the prediction horizon extends further into the future. The second focus area deals with missing values in a database. Missing values can be a nuisance, but they can also prohibit the use of certain methodologies and degrade the performance of others. Hence, missing value imputation is a necessary part of preprocessing a database. The imputation has to be done carefully in order to retain the integrity of the database and not insert unwanted artifacts that would aggravate the job of the final data analysis methodology. Furthermore, even though accuracy is always the main requisite of a good methodology, computational time has to be considered alongside precision. In this thesis, a large variety of strategies for output approximation and variable processing in time series prediction are presented.
    There is also a detailed presentation of new methodologies and tools for solving the problem of missing values. The strategies and methodologies are compared against state-of-the-art ones and shown to be accurate and useful in practice.
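
As a concrete baseline for the imputation problem this thesis addresses, a simple gap-filling routine for a univariate series might look like the following. This is a generic illustration of the task, not one of the thesis's new methods:

```python
import numpy as np

def impute_linear(series):
    """Fill NaN gaps in a 1-D series by linear interpolation.

    Leading/trailing NaNs are filled with the nearest observed value.
    Deliberately simple: a baseline against which more careful
    imputation methods can be compared.
    """
    x = np.asarray(series, dtype=float)
    idx = np.arange(len(x))
    observed = ~np.isnan(x)
    if not observed.any():
        raise ValueError("series contains no observed values")
    # np.interp clamps to the edge values outside the observed range,
    # which handles leading and trailing NaNs.
    return np.interp(idx, idx[observed], x[observed])
```

More sophisticated imputers (nearest-neighbor, regression, or model-based methods) aim to preserve the database's structure better than this straight-line fill, at higher computational cost.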