    Trimmed bagging.

    Bagging has been found to be successful in increasing the predictive performance of unstable classifiers. Bagging draws bootstrap samples from the training sample, applies the classifier to each bootstrap sample, and then averages over all obtained classification rules. The idea of trimmed bagging is to exclude the bootstrapped classification rules that yield the highest error rates, as estimated by the out-of-bag error rate, and to aggregate over the remaining ones. In this note we explore the potential benefits of trimmed bagging. On the basis of numerical experiments, we conclude that trimmed bagging performs comparably to standard bagging when applied to unstable classifiers such as decision trees, but yields better results when applied to more stable base classifiers, such as support vector machines.
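    The trimming rule described above is concrete enough to sketch. Below is a minimal Python illustration, not the authors' code: the bootstrap count, the 25% trim fraction, and majority-vote aggregation are assumptions chosen for the example, and integer class labels are assumed.

```python
import numpy as np
from sklearn.base import clone
from sklearn.metrics import zero_one_loss

def trimmed_bagging(base_estimator, X, y, n_estimators=50, trim_frac=0.25, seed=None):
    """Fit bagged copies of base_estimator, then drop the trim_frac
    fraction of rules with the worst out-of-bag (OOB) error."""
    rng = np.random.default_rng(seed)
    n = len(X)
    fitted, oob_errors = [], []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)             # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)        # out-of-bag indices
        est = clone(base_estimator).fit(X[idx], y[idx])
        err = zero_one_loss(y[oob], est.predict(X[oob])) if len(oob) else np.inf
        fitted.append(est)
        oob_errors.append(err)
    # Keep the (1 - trim_frac) fraction of rules with the lowest OOB error.
    keep = np.argsort(oob_errors)[: int(np.ceil((1 - trim_frac) * n_estimators))]
    return [fitted[i] for i in keep]

def predict_majority(estimators, X):
    """Aggregate the retained rules by majority vote (integer labels assumed)."""
    votes = np.stack([est.predict(X) for est in estimators])
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)
```

    With a stable base learner such as sklearn.svm.SVC, trimming away the worst bootstrap rules is precisely where the note reports gains over standard bagging.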

    Two-Stage Bagging Pruning for Reducing the Ensemble Size and Improving the Classification Performance

    Ensemble methods, such as the traditional bagging algorithm, can usually improve the performance of a single classifier. However, they usually require large storage space and relatively time-consuming predictions. Many approaches have been developed to reduce the ensemble size and improve the classification performance by pruning the traditional bagging algorithm. In this article, we propose a two-stage strategy for pruning the traditional bagging algorithm by combining two simple approaches: accuracy-based pruning (AP) and distance-based pruning (DP). These two methods, as well as their two combinations “AP+DP” and “DP+AP” as the two-stage pruning strategy, were all examined. Compared with the single pruning methods, we found that the two-stage pruning methods can further reduce the ensemble size and improve the classification performance. The “AP+DP” method generally performs better than the “DP+AP” method when using four base classifiers: decision tree, Gaussian naive Bayes, K-nearest neighbor, and logistic regression. Moreover, compared to traditional bagging, the two-stage method “AP+DP” improved the classification accuracy by 0.88%, 4.06%, 1.26%, and 0.96%, respectively, averaged over 28 datasets under the four base classifiers. It was also observed that “AP+DP” outperformed three other existing algorithms, Brag, Nice, and TB, assessed on 8 common datasets. In summary, the proposed two-stage pruning methods are simple and promising approaches that can both reduce the ensemble size and improve the classification accuracy.
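    Accuracy-based pruning is straightforward to sketch; the abstract does not define the distance criterion, so the diversity-based reading of DP below is purely an assumption, as are the validation split and the keep sizes.

```python
import numpy as np

def accuracy_prune(estimators, X_val, y_val, keep=20):
    """AP: keep the `keep` members with the highest validation accuracy."""
    acc = [np.mean(e.predict(X_val) == y_val) for e in estimators]
    return [estimators[i] for i in np.argsort(acc)[::-1][:keep]]

def distance_prune(estimators, X_val, keep=10):
    """DP (assumed form): greedily keep mutually dissimilar members,
    measured by Hamming distance between their prediction vectors."""
    preds = [e.predict(X_val) for e in estimators]
    kept = [0]
    while len(kept) < min(keep, len(estimators)):
        # Pick the estimator farthest, on average, from those already kept.
        dist = [np.mean([np.mean(preds[i] != preds[j]) for j in kept])
                if i not in kept else -1.0
                for i in range(len(estimators))]
        kept.append(int(np.argmax(dist)))
    return [estimators[i] for i in kept]

# Two-stage "AP+DP": accuracy pruning first, then distance pruning.
# pruned = distance_prune(accuracy_prune(ensemble, X_val, y_val), X_val)
```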

    Multiple classifier architectures and their application to credit risk assessment

    Multiple classifier systems combine several individual classifiers to deliver a final classification decision. An increasingly controversial question is whether such systems can outperform the single best classifier and, if so, what form of multiple classifier system yields the greatest benefit. In this paper the performance of several multiple classifier systems is evaluated in terms of their ability to correctly classify consumers as good or bad credit risks. Empirical results suggest that many, but not all, multiple classifier systems deliver significantly better performance than the single best classifier. Overall, bagging and boosting outperform other multi-classifier systems, and a new boosting algorithm, Error Trimmed Boosting, outperforms bagging and AdaBoost by a significant margin.
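    Error Trimmed Boosting itself is not specified in the abstract, so it is not reproduced here; the sketch below only sets up the kind of comparison described, pitting bagging and AdaBoost against a single classifier on a synthetic stand-in for a good/bad credit-risk dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, class-imbalanced stand-in for a consumer credit dataset.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.7, 0.3], random_state=0)
models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(),
                                 n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC {auc.mean():.3f} +/- {auc.std():.3f}")
```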

    Advanced composite aileron for L-1011 transport aircraft: Aileron manufacture

    The fabrication activities of the Advanced Composite Aileron (ACA) program are discussed. These activities included detail fabrication, manufacturing development, assembly, repair and quality assurance. Five ship sets of ailerons were manufactured. The detail fabrication of ribs, spar and covers was accomplished on male tools using a common cure cycle. Graphite epoxy tape and fabric and syntactic epoxy materials were utilized in the fabrication. The ribs and spar were net cured and required no post-cure trim. Material inconsistencies led to additional manufacturing development of the front spar during the production effort. The assembly effort was accomplished in subassembly and assembly fixtures. The manual drilling system utilized a dagger-type drill in a hydraulic feed control hand drill. Coupon testing was performed for each detail.

    Hedge fund return predictability: To combine forecasts or combine information?

    While the majority of the predictability literature has been devoted to traditional asset classes, the literature on the predictability of hedge fund returns is quite scant. We focus on assessing the out-of-sample predictability of hedge fund strategies by employing an extensive list of predictors. Aiming to reduce the uncertainty risk associated with a single predictor model, we first combine the individual forecasts. We consider various combining methods, ranging from simple averaging schemes to more sophisticated ones, such as discounting forecast errors, cluster combining and principal components combining. Our second approach combines the information in the predictors and applies kitchen sink, bootstrap aggregating (bagging), lasso, ridge and elastic net specifications. Our statistical and economic evaluation findings point to the superiority of simple combination methods. We also provide evidence on the use of hedge fund return forecasts for hedge fund risk measurement and portfolio allocation. Dynamically constructing portfolios based on the combination forecasts of hedge fund returns leads to considerably improved portfolio performance.
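    The two competing approaches can be contrasted in a few lines. The sketch below is illustrative only: it uses plain linear models, an equal-weight average for the combine-forecasts route, and a lasso (one of the specifications named above) for the combine-information route; variable names and the penalty level are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def combine_forecasts(X_train, y_train, X_test):
    """Combine forecasts: one univariate model per predictor, then a
    simple equal-weight average of the individual forecasts."""
    forecasts = [LinearRegression().fit(X_train[:, [j]], y_train)
                                   .predict(X_test[:, [j]])
                 for j in range(X_train.shape[1])]
    return np.mean(forecasts, axis=0)

def combine_information(X_train, y_train, X_test, alpha=0.01):
    """Combine information: all predictors enter one model at once."""
    return Lasso(alpha=alpha).fit(X_train, y_train).predict(X_test)
```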

    Wide consensus aggregation in the Wasserstein space. Application to location-scatter families

    We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques such as boosting or bagging. To account for the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Fréchet means. To allow quick applicability, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location-scatter families. For these families, we provide an iterative algorithm for the effective computation of trimmed barycenters, built on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide range of statistical problems.
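    For the Gaussian location-scatter case, the (untrimmed) Wasserstein barycenter has an explicit fixed-point characterization: the barycenter mean is the weighted average of the means, and the covariance solves S = sum_i w_i (S^{1/2} S_i S^{1/2})^{1/2}. The sketch below implements the standard fixed-point iteration for this equation; the trimming step, which is the paper's device for handling discrepant estimations, is deliberately omitted, and the initialization and tolerance are illustrative.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2_barycenter(means, covs, weights, n_iter=100, tol=1e-10):
    """W2 barycenter of Gaussians N(m_i, S_i) via the fixed-point iteration
    S <- S^{-1/2} (sum_i w_i (S^{1/2} S_i S^{1/2})^{1/2})^2 S^{-1/2}."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    m_bar = np.sum(w[:, None] * np.asarray(means, float), axis=0)
    S = np.asarray(covs[0], float)                   # initial guess
    for _ in range(n_iter):
        root = np.real(sqrtm(S))
        inv_root = np.linalg.inv(root)
        T = sum(wi * np.real(sqrtm(root @ Si @ root))
                for wi, Si in zip(w, covs))
        S_next = inv_root @ T @ T @ inv_root
        if np.linalg.norm(S_next - S) < tol:
            return m_bar, S_next
        S = S_next
    return m_bar, S
```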

    Advanced manufacturing development of a composite empennage component for L-1011 aircraft

    This is the final report of technical work conducted during the fourth phase of a multiphase program whose objective was the design, development and flight evaluation of an advanced composite empennage component manufactured in a production environment at a cost competitive with that of its metal counterpart, and at a weight savings of at least 20 percent. The empennage component selected for this program is the vertical fin box of the L-1011 aircraft. The box structure extends from the fuselage production joint to the tip rib and includes the front and rear spars. During Phase 4 of the program, production-quality tooling was designed and manufactured to produce three sets of covers, ribs, spars, miscellaneous parts, and subassemblies to assemble three complete ACVF units. Recurring and nonrecurring cost data were compiled and documented in the updated producibility/design-to-cost plan. Nondestructive inspections, quality control tests, and quality acceptance tests were performed in accordance with the quality assurance plan and the structural integrity control plan. Records were maintained to provide traceability of material and parts throughout the manufacturing development phase. It was also determined that additional tooling would not be required to support the current and projected L-1011 production rate.

    A Mathematical Formalization of Hierarchical Temporal Memory's Spatial Pooler

    Hierarchical temporal memory (HTM) is an emerging machine learning algorithm with the potential to provide a means of performing predictions on spatiotemporal data. The algorithm, inspired by the neocortex, currently lacks a comprehensive mathematical framework. This work brings together all aspects of the spatial pooler (SP), a critical learning component in HTM, under a single unifying framework. The primary learning mechanism is explored, and a maximum likelihood estimator for determining the degree of the permanence update is proposed. The boosting mechanisms are studied and found to be relevant only during the initial few iterations of the network. Observations are made relating HTM to well-known algorithms such as competitive learning and attribute bagging. Methods are provided for using the SP for classification as well as dimensionality reduction. Empirical evidence verifies that, given proper parameterizations, the SP may be used for feature learning. (Comment: This work was submitted for publication and is currently under review. For associated code, see https://github.com/tehtechguy/mHT)
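    For readers unfamiliar with the SP, its core loop (overlap computation, inhibition, Hebbian permanence update) can be sketched compactly. The version below is heavily simplified and is not the paper's formalization: it uses global inhibition, omits the boosting mechanism discussed above, and all parameter values are illustrative.

```python
import numpy as np

def spatial_pooler_step(perms, x, n_active=40, threshold=0.5,
                        p_inc=0.03, p_dec=0.015):
    """One simplified SP iteration. perms: (n_columns, n_inputs) synapse
    permanences; x: binary (0/1) input vector."""
    connected = perms >= threshold                   # permanence -> connected synapse
    overlap = connected.astype(int) @ x              # overlap score per column
    active = np.argsort(overlap)[-n_active:]         # top-k columns win (inhibition)
    # Hebbian permanence update for winning columns only: strengthen
    # synapses on active input bits, weaken the rest.
    perms[active] += np.where(x.astype(bool), p_inc, -p_dec)
    np.clip(perms, 0.0, 1.0, out=perms)
    return active, perms
```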

    Integrating Specialized Classifiers Based on Continuous Time Markov Chain

    Specialized classifiers, namely those dedicated to a subset of classes, are often adopted in real-world recognition systems. However, integrating such classifiers is nontrivial. Existing methods, e.g. weighted averaging, usually implicitly assume that all constituents of an ensemble cover the same set of classes, and can produce misleading predictions when used to combine specialized classifiers. This work explores a novel approach: instead of combining predictions from individual classifiers directly, it first decomposes the predictions into sets of pairwise preferences, treats them as transition channels between classes, constructs a continuous-time Markov chain thereon, and uses the equilibrium distribution of this chain as the final prediction. This allows us to form a coherent picture over all specialized predictions. On large public datasets, the proposed method obtains considerable improvement compared to mainstream ensemble methods, especially when the classifier coverage is highly unbalanced. (Comment: Published at IJCAI-17, typo fixed)
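    A minimal version of the equilibrium computation is easy to sketch. How the decomposed pairwise preferences are encoded as transition rates below is our assumption, not necessarily the paper's exact construction.

```python
import numpy as np

def ctmc_equilibrium(pref):
    """pref[i, j]: aggregated preference for class i over class j (assumed
    encoding). Build a CTMC whose rate of jumping from j to i is pref[i, j]
    and return its equilibrium distribution as the fused prediction."""
    n = pref.shape[0]
    Q = pref.T.astype(float).copy()      # Q[j, i]: transition rate j -> i
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows of a rate matrix sum to zero
    # Equilibrium: pi @ Q = 0 with pi summing to 1 (solved by least squares).
    A = np.vstack([Q.T, np.ones(n)])
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi
```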