
    Bagging Time Series Models

    A common problem in out-of-sample prediction is that there are potentially many relevant predictors that individually have only weak explanatory power. We propose bootstrap aggregation of pre-test predictors (or bagging for short) as a means of constructing forecasts from multiple regression models with local-to-zero regression parameters and errors subject to possible serial correlation or conditional heteroskedasticity. Bagging is designed for situations in which the number of predictors (M) is moderately large relative to the sample size (T). We show how to implement bagging in the dynamic multiple regression model and provide asymptotic justification for the bagging predictor. A simulation study shows that bagging tends to produce large reductions in the out-of-sample prediction mean squared error and provides a useful alternative to forecasting from factor models when M is large, but much smaller than T. We also find that bagging indicators of real economic activity greatly reduces the prediction mean squared error of forecasts of U.S. CPI inflation at horizons of one month and one year.
    Keywords: forecasting; bootstrap; model selection; pre-testing; forecast aggregation; factor models; inflation.
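    A minimal sketch of the pre-test bagging idea described above, assuming a moving-block bootstrap and ordinary least-squares t-statistics. The function name `bagged_pretest_forecast`, the critical value, and the block length are illustrative choices for this sketch, not the paper's exact estimator (which, for example, also allows HAC-type corrections for serial correlation):

```python
import numpy as np

def bagged_pretest_forecast(y, X, x_new, B=100, t_crit=1.96, block=12, seed=None):
    """Average of B pre-test forecasts, each fitted on a moving-block
    bootstrap sample: predictors whose t-statistics fall below t_crit are
    dropped before re-estimating and forecasting."""
    rng = np.random.default_rng(seed)
    T, M = X.shape
    n_blocks = int(np.ceil(T / block))
    forecasts = []
    for _ in range(B):
        # moving-block bootstrap to respect serial correlation
        starts = rng.integers(0, T - block + 1, size=n_blocks)
        idx = np.concatenate([np.arange(s, s + block) for s in starts])[:T]
        yb, Xb = y[idx], X[idx]
        Z = np.column_stack([np.ones(T), Xb])
        beta, *_ = np.linalg.lstsq(Z, yb, rcond=None)
        resid = yb - Z @ beta
        sigma2 = resid @ resid / (T - M - 1)
        se = np.sqrt(sigma2 * np.diag(np.linalg.pinv(Z.T @ Z)))
        keep = np.abs(beta[1:] / se[1:]) > t_crit        # the pre-test step
        Zk = np.column_stack([np.ones(T), Xb[:, keep]])
        bk, *_ = np.linalg.lstsq(Zk, yb, rcond=None)
        forecasts.append(bk[0] + x_new[keep] @ bk[1:])
    return float(np.mean(forecasts))
```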

    Ensemble predictions : empirical studies on learners' performance and sample distributions

    University of Technology, Sydney. Faculty of Engineering and Information Technology.
    Imbalanced data problems are among the most challenging in Data Mining and Machine Learning research. This dissertation investigates the performance of ensemble learning systems on different types of data environments, and proposes novel ensemble learning approaches for solving imbalanced data problems. Bagging is one of the most effective ensemble methods for classification tasks. Despite the popularity of bagging in many real-world applications, it has a major drawback on extremely imbalanced data. Much research has addressed the problems of imbalanced data by using over-sampling and/or under-sampling methods to generate an equally balanced training set to improve the performance of the prediction models. However, it is unclear which ratio is best for training, and under which conditions bagging is outperformed by other sampling schemes on extremely imbalanced data. Previous research has mainly been concerned with studying unstable learners as the key to ensuring the performance gain of a bagging predictor, with many key factors remaining unclear. Two questions have not been well answered: (1) What are the key factors for bagging predictors to achieve the best predictive performance for applications? and (2) What is the impact of varying the levels of class distribution on bagging predictors in different data environments? There is a lack of empirical investigation of these issues in the literature. The main contributions of this dissertation are as follows:
    1. This dissertation proposes novel approaches: uneven balanced bagging, to boost the performance of the prediction model for solving imbalanced problems, and hybrid-sampling, to enhance bagging for solving highly imbalanced time series classification problems.
    2. This dissertation asserts that robustness and stability are two key factors for building a high-performance bagging predictor. It also derives a new method, utilizing a two-dimensional robustness and stability decomposition, to rank the base learners into different categories for the purpose of comparing the performance of bagging predictors with respect to different learning algorithms. The experimental results demonstrate that bagging is influenced by the combination of robustness and instability, and indicate that robustness is important for bagging to achieve a highly accurate prediction model.
    3. This dissertation investigates the sensitivity of bagging predictors. We demonstrate that bagging MLP and NB are insensitive to different levels of imbalanced class distribution.
    4. This dissertation investigates the impact of varying levels of class distribution on bagging predictors with different learning algorithms on a range of data environments, to allow data mining practitioners to choose the best learners and understand what to expect when using bagging predictors.
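    A hedged sketch of the kind of per-replicate under-sampling scheme whose training ratio this thesis studies. The function `balanced_bagging_predict`, the `ratio` parameter, and the decision-tree base learner are illustrative assumptions, not the dissertation's exact uneven balanced bagging or hybrid-sampling algorithms:

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def balanced_bagging_predict(X, y, X_test, base=None, B=50, ratio=1.0, seed=None):
    """Bag B classifiers, each trained on a bootstrap of the minority class
    plus an under-sampled draw of the majority class; predict by majority vote.
    `ratio` is the majority:minority size in each replicate (1.0 = balanced)."""
    rng = np.random.default_rng(seed)
    base = base if base is not None else DecisionTreeClassifier()
    pos = np.flatnonzero(y == 1)          # minority class assumed labelled 1
    neg = np.flatnonzero(y == 0)          # majority class assumed labelled 0
    votes = np.zeros(len(X_test))
    for _ in range(B):
        p = rng.choice(pos, size=len(pos), replace=True)               # bootstrap minority
        n = rng.choice(neg, size=int(ratio * len(pos)), replace=True)  # under-sample majority
        idx = np.concatenate([p, n])
        clf = clone(base).fit(X[idx], y[idx])
        votes += clf.predict(X_test)
    return (votes / B >= 0.5).astype(int)  # majority vote over the B members
```

    Varying `ratio` reproduces the question raised in the abstract of which majority-to-minority training ratio works best for a given base learner.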

    Bagging Binary Predictors for Time Series

    Bootstrap aggregating, or bagging, introduced by Breiman (1996a), has been shown to be effective in improving unstable forecasts. Theoretical and empirical work on classification and regression trees and on variable selection in linear and non-linear regression has shown that bagging can generate substantial prediction gains. However, most of the existing literature on bagging has been limited to cross-sectional settings with symmetric cost functions. In this paper, we extend the application of bagging to time series settings with asymmetric cost functions, particularly for predicting signs and quantiles. We link quantile predictions to binary predictions in a unified framework. We find that bagging may improve the accuracy of unstable predictions for time series data under certain conditions. Various bagging forecast combinations are used, such as equal-weighted and Bayesian Model Averaging (BMA) weighted combinations. For demonstration, we present results from Monte Carlo experiments and from empirical applications using monthly S&P500 and NASDAQ stock index returns.
    Keywords: Asymmetric cost function, Bagging, Binary prediction, BMA, Forecast combination, Majority voting, Quantile prediction, Time Series.
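    For illustration, a minimal sketch of the equal-weighted (majority-voting) combination applied to sign prediction, assuming a moving-block bootstrap and a logistic base predictor. The name `bagged_sign_forecast` and the block length are assumptions of this sketch; the paper's framework also covers quantile predictions, asymmetric costs, and BMA weights:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def bagged_sign_forecast(X, y_sign, x_new, B=100, block=10, seed=None):
    """Majority-vote bagging of a binary (sign) predictor on time-series data,
    resampling with a moving-block bootstrap to preserve serial dependence."""
    rng = np.random.default_rng(seed)
    T = len(y_sign)
    n_blocks = int(np.ceil(T / block))
    up_votes, fitted = 0, 0
    for _ in range(B):
        starts = rng.integers(0, T - block + 1, size=n_blocks)
        idx = np.concatenate([np.arange(s, s + block) for s in starts])[:T]
        if len(np.unique(y_sign[idx])) < 2:   # skip degenerate one-class samples
            continue
        clf = LogisticRegression(max_iter=1000).fit(X[idx], y_sign[idx])
        up_votes += clf.predict(x_new.reshape(1, -1))[0]
        fitted += 1
    return int(up_votes / fitted >= 0.5)      # 1 = predict a positive return
```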

    Localized Regression

    The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to the case of few variables. This restriction does not hold if localization is combined with a reduction of dimension. In particular, it is shown that localization yields powerful classifiers even in higher dimensions if it is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition, we demonstrate that automatic choice of localization, predictor selection and penalty parameters based on cross validation works well. Finally, the method is applied to real data sets and its real-world performance is compared to that of alternative procedures.
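    As an illustration of the basic localization idea (without the paper's robustness weights, local predictor selection, or cross-validated tuning), a kernel-weighted logistic fit at the query point. The function name `localized_logistic_predict` and the Gaussian kernel are assumptions of this sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def localized_logistic_predict(X, y, x0, bandwidth=1.0):
    """Classify the query point x0 with a logistic model whose training
    observations are weighted by a Gaussian kernel centred at x0, so that
    nearby points dominate the fit (localization)."""
    d2 = np.sum((X - x0) ** 2, axis=1)           # squared distances to x0
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))     # localization weights
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, y, sample_weight=w)
    return clf.predict(x0.reshape(1, -1))[0]
```

    The bandwidth controls how local the fit is; in the paper this kind of tuning parameter is chosen data-adaptively rather than fixed.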

    Neural network ensembles: Evaluation of aggregation algorithms

    Ensembles of artificial neural networks show improved generalization capabilities that outperform those of single networks. However, for aggregation to be effective, the individual networks must be as accurate and diverse as possible. An important problem is, then, how to tune the aggregate members in order to have an optimal compromise between these two conflicting conditions. We present here an extensive evaluation of several algorithms for ensemble construction, including new proposals, and compare them with standard methods in the literature. We also discuss a potential problem with sequential aggregation algorithms: the infrequent but damaging selection, through their heuristics, of particularly bad ensemble members. We introduce modified algorithms that cope with this problem by allowing individual weighting of aggregate members. Our algorithms and their weighted modifications are favorably tested against other methods in the literature, producing a noticeable improvement in performance on most of the standard statistical databases used as benchmarks.
    Comment: 35 pages, 2 figures. In press, AI Journal.
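    A generic sketch of individually weighting ensemble members by their validation error, so that an occasionally selected bad member is down-weighted rather than given an equal vote. The softmax-style weighting and the name `weighted_ensemble_predict` are illustrative assumptions, not the specific algorithms evaluated in the paper:

```python
import numpy as np

def weighted_ensemble_predict(member_preds, val_errors, temperature=1.0):
    """Aggregate member predictions with weights that decay exponentially in
    each member's validation error, so poorly performing members contribute
    little to the ensemble output.

    member_preds: array of shape (n_members, n_samples)
    val_errors:   validation error per member, shape (n_members,)
    """
    errors = np.asarray(val_errors, dtype=float)
    w = np.exp(-(errors - errors.min()) / temperature)  # stabilized exponential weights
    w /= w.sum()                                         # normalize to sum to one
    return w @ np.asarray(member_preds)                  # weighted average prediction
```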