    Ensemble methods for solving problems of medical diagnosis

    Full text link
    A consolidating method for analyzing series of observations is proposed, based on a fitted mixture of classifiers built on the principal components, which makes it possible to study any number of markers. Unlike the longitudinal approach, it eliminates the need to involve regression analysis methods, with their inherent uncertainties in the choice of particular models. The consolidating method yields an original result in the area of early disease diagnosis: for every choice of markers, classification accuracy increases with the length of the series of examinations. Comment: 4 pages, 1 figure, 4 tables
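
    As a rough illustration of the kind of consolidation described above, the hypothetical sketch below pools per-examination classifier outputs into a single diagnosis for a series of examinations; the PCA + logistic-regression pipeline and all names are illustrative assumptions, not the paper's fitted mixture model.

```python
# Hypothetical sketch: consolidating a series of examinations of one patient by
# averaging per-examination class probabilities from a principal-component
# classifier. Synthetic data; not the paper's exact procedure.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_patients, n_markers = 200, 6
X = rng.normal(size=(n_patients, n_markers))           # one examination per row
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n_patients) > 0).astype(int)

clf = make_pipeline(PCA(n_components=3), LogisticRegression(max_iter=1000))
clf.fit(X, y)

def consolidated_diagnosis(series):
    """Average class probabilities over repeated examinations of one patient."""
    probs = clf.predict_proba(np.asarray(series))       # shape (n_exams, 2)
    return int(probs[:, 1].mean() > 0.5)

patient_series = rng.normal(size=(4, n_markers))        # a series of four examinations
print(consolidated_diagnosis(patient_series))
```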

    Churn Prediction Task in MOOC

    Get PDF
    Churn prediction is a common task for machine learning applications in business. In this paper, the task is adapted to the problem of the low efficiency of massive open online courses (only 5% of all students finish their course). The approach is demonstrated on the course “Methods and algorithms of the graph theory” held on the national platform of online education in Russia. The paper covers all the steps needed to build an intelligent system that predicts which students are active during the course but unlikely to finish it. The first part consists of constructing a suitable sample for prediction, exploratory data analysis (EDA), and choosing the most appropriate week of the course at which to make predictions. The second part is about choosing the right metric and building models. An approach using ensembles such as stacking is also proposed to increase prediction accuracy. As a result, a general approach to building a churn prediction model for an online course is presented. This approach can be used to make online education adaptive and intelligent for each individual student.
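
    The stacking step could look roughly like the sketch below; the synthetic weekly-activity features and the choice of base learners are placeholders rather than the features and models used in the paper.

```python
# Minimal sketch of a stacked churn model for MOOC activity data. Feature names
# and synthetic data are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_students = 1000
# Assumed weekly activity counts: videos watched, quizzes attempted, forum posts.
X = rng.poisson(lam=[5, 3, 1], size=(n_students, 3)).astype(float)
churned = (X.sum(axis=1) + rng.normal(scale=2, size=n_students) < 7).astype(int)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
print(cross_val_score(stack, X, churned, scoring="roc_auc", cv=5).mean())
```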

    Averaging of density kernel estimators

    Get PDF
    Averaging provides an alternative to bandwidth selection for kernel density estimation. We propose a procedure to linearly combine several kernel estimators of a density obtained from different, possibly data-driven, bandwidths. The method relies on minimizing an easily tractable approximation of the integrated square error of the combination. It provides, at a small computational cost, a final estimator that improves on the initial estimators in most cases. The averaged estimator is proved to be asymptotically as efficient as the best possible combination (the oracle), with an error term that decreases faster than the minimax rate obtained with separate learning and validation samples. The performance is tested numerically, with results that compare favorably to other existing procedures in terms of mean integrated square error.
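
    A rough sketch of the idea, under simplifying assumptions (Riemann-sum integration for the cross-product terms, a leave-one-out estimate of the data term, and non-negative weights summing to one), could look as follows; the paper's exact criterion and constraints may differ.

```python
# Sketch: linearly combine kernel density estimators with different bandwidths
# by minimizing a cross-validation approximation of the integrated squared error.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(1, 1.0, 150)])
bandwidths = [0.1, 0.3, 0.6, 1.0]
grid = np.linspace(x.min() - 3, x.max() + 3, 512)
dx = grid[1] - grid[0]

def kde(points, eval_at, h):
    """Gaussian kernel density estimate of `points` evaluated at `eval_at`."""
    return norm.pdf(eval_at[:, None], loc=points[None, :], scale=h).mean(axis=1)

F = np.stack([kde(x, grid, h) for h in bandwidths])   # each estimator on the grid
A = (F @ F.T) * dx                                    # A[k, l] ~ integral of f_k * f_l

# Leave-one-out estimate of b[k] ~ integral of f_k * f (the unknown density).
n = len(x)
b = np.array([
    np.mean([kde(np.delete(x, i), x[i:i + 1], h)[0] for i in range(n)])
    for h in bandwidths
])

objective = lambda w: w @ A @ w - 2 * w @ b           # ISE of the combination, up to a constant
res = minimize(objective, np.full(len(bandwidths), 1 / len(bandwidths)),
               bounds=[(0, 1)] * len(bandwidths),
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
print(dict(zip(bandwidths, np.round(res.x, 3))))
```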

    Prediction of infectious disease epidemics via weighted density ensembles

    Full text link
    Accurate and reliable predictions of infectious disease dynamics can be valuable to public health organizations that plan interventions to decrease or prevent disease transmission. A great variety of models have been developed for this task, using different model structures, covariates, and targets for prediction. Experience has shown that the performance of these models varies; some tend to do better or worse in different seasons or at different points within a season. Ensemble methods combine multiple models to obtain a single prediction that leverages the strengths of each model. We considered a range of ensemble methods that each form a predictive density for a target of interest as a weighted sum of the predictive densities from component models. In the simplest case, equal weight is assigned to each component model; in the most complex case, the weights vary with the region, prediction target, week of the season when the predictions are made, a measure of component model uncertainty, and recent observations of disease incidence. We applied these methods to predict measures of influenza season timing and severity in the United States, at both the national and regional levels, using three component models. We trained the models on retrospective predictions from 14 seasons (1997/1998 - 2010/2011) and evaluated each model's prospective, out-of-sample performance in the five subsequent influenza seasons. In this test phase, the ensemble methods showed overall performance that was similar to the best of the component models, but offered more consistent performance across seasons than the component models. Ensemble methods offer the potential to deliver more reliable predictions to public health decision makers. Comment: 20 pages, 6 figures
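
    One way to estimate such weights, shown below purely as an illustrative sketch, is to maximize the log score of the weighted mixture on past predictions with a simple EM-style update; the paper evaluates a range of weighting schemes, from equal weights to weights that vary by region, target and week, so this is only the simplest static case.

```python
# Illustrative sketch: estimate static ensemble weights for a weighted sum of
# predictive densities by maximizing the log score on retrospective forecasts.
import numpy as np

def fit_ensemble_weights(density_at_truth, n_iter=200):
    """density_at_truth: (n_forecasts, n_models) array holding each component
    model's predictive density evaluated at the eventually observed value."""
    n, m = density_at_truth.shape
    w = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        resp = w * density_at_truth                # E-step: per-forecast responsibilities
        resp /= resp.sum(axis=1, keepdims=True)
        w = resp.mean(axis=0)                      # M-step: updated mixture weights
    return w

# Toy data: model 0's density at the observed value tends to be larger,
# so it should receive the largest weight.
rng = np.random.default_rng(7)
d = np.column_stack([rng.gamma(4, 0.3, 500),
                     rng.gamma(2, 0.3, 500),
                     rng.gamma(2, 0.3, 500)])
w = fit_ensemble_weights(d)
print(w)   # ensemble predictive density: sum_k w[k] * f_k(target)
```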

    Aggregating density estimators: an empirical study

    Full text link
    We present some new density estimation algorithms obtained by bootstrap aggregation, like bagging. Our algorithms are analyzed and empirically compared to other methods from the statistical literature, such as stacking and boosting for density estimation. We show by extensive simulations that ensemble learning is effective for density estimation, as it is for classification. Although our algorithms do not always outperform the other methods, some of them are as simple as bagging, more intuitive, and have lower computational cost.
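
    A minimal sketch of the bagging variant, assuming a Gaussian kernel with SciPy's default (Scott's rule) bandwidth on each bootstrap resample:

```python
# Sketch: bootstrap-aggregated kernel density estimation. Fit a KDE on each
# bootstrap resample and average the fitted densities on a grid.
import numpy as np
from scipy.stats import gaussian_kde

def bagged_kde(data, eval_points, n_boot=50, seed=0):
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choice(data, size=len(data), replace=True)
        estimates.append(gaussian_kde(sample)(eval_points))  # Scott's rule bandwidth by default
    return np.mean(estimates, axis=0)

rng = np.random.default_rng(3)
data = np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 0.5, 100)])
grid = np.linspace(-4, 7, 200)
print(bagged_kde(data, grid)[:5])
```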