8,054 research outputs found

    Logistic Ensemble Models

    Get PDF
    Predictive models that are developed in a regulated industry or a regulated application, like determination of credit worthiness must be interpretable and “rational” (e.g., improvements in basic credit behavior must result in improved credit worthiness scores). Machine Learning technologies provide very good performance with minimal analyst intervention, so they are well suited to a high volume analytic environment but the majority are “black box” tools that provide very limited insight or interpretability into key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods with a core model enhancement strategy, found in machine learning. The resulting prediction methodology provides solid performance, from minimal analyst effort, while providing the interpretability and rationality, required in regulated industries

    Ensemble deep learning: A review

    Get PDF
    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architecture is showing better performance as compared to the shallow or traditional classification models. Deep ensemble learning models combine the advantages of both the deep learning models as well as the ensemble learning such that the final model has better generalization performance. This paper reviews the state-of-art deep ensemble models and hence serves as an extensive summary for the researchers. The ensemble models are broadly categorised into ensemble models like bagging, boosting and stacking, negative correlation based deep ensemble models, explicit/implicit ensembles, homogeneous /heterogeneous ensemble, decision fusion strategies, unsupervised, semi-supervised, reinforcement learning and online/incremental, multilabel based deep ensemble models. Application of deep ensemble models in different domains is also briefly discussed. Finally, we conclude this paper with some future recommendations and research directions

    Predicting media memorability using ensemble models

    Get PDF
    Memorability, defined as the quality of being worth remembering, is a pressing issue in media as we struggle to organize and retrieve digital content and make it more useful in our daily lives. The Predicting Media Memorability task in MediaEval 2019 tackles this problem by creating a challenge to automatically predict memorability scores building on the work developed in 2018. Our team ensembled transfer learning approaches with video captions using embeddings and our own pre-computed features which outperformed Medieval 2018’s state-of-the-art architectures

    Ensemble Models in Forecasting Financial Markets

    Get PDF

    Combining local- and large-scale models to predict the distributions of invasive plant species

    Get PDF
    Habitat-distribution models are increasingly used to predict the potential distributions of invasive species and to inform monitoring. However, these models assume that species are in equilibrium with the environment, which is clearly not true for most invasive species. Although this assumption is frequently acknowledged, solutions have not been adequately addressed. There are several potential methods for improving habitat-distribution models. Models that require only presence data may be more effective for invasive species, but this assumption has rarely been tested. In addition, combining modeling types to form ‘ensemble’ models may improve the accuracy of predictions. However, even with these improvements, models developed for recently invaded areas are greatly influenced by the current distributions of species and thus reflect near- rather than long-term potential for invasion. Larger scale models from species’ native and invaded ranges may better reflect long-term invasion potential, but they lack finer scale resolution. We compared logistic regression (which uses presence/absence data) and two presence-only methods for modeling the potential distributions of three invasive plant species on the Olympic Peninsula in Washington State, USA. We then combined the three methods to create ensemble models. We also developed climate-envelope models for the same species based on larger scale distributions and combined models from multiple scales to create an index of near- and long-term invasion risk to inform monitoring in Olympic National Park (ONP). Neither presence-only nor ensemble models were more accurate than logistic regression for any of the species. Larger scale models predicted much greater areas at risk of invasion. Our index of near- and long-term invasion risk indicates that \u3c4% of ONP is at high near-term risk of invasion while 67-99% of the Park is at moderate or high long-term risk of invasion. We demonstrate how modeling results can be used to guide the design of monitoring protocols and monitoring results can in turn be used to refine models. We propose that by using models from multiple scales to predict invasion risk and by explicitly linking model development to monitoring, it may be possible to overcome some of the limitations of habitat-distribution models
    corecore