211 research outputs found

    Puddles

    COMET: A Recipe for Learning and Using Large Ensembles on Massive Data

    COMET is a single-pass MapReduce algorithm for learning on large-scale data. It builds multiple random forest ensembles on distributed blocks of data and merges them into a mega-ensemble. This approach is appropriate when learning from massive-scale data that is too large to fit on a single machine. To get the best accuracy, IVoting should be used instead of bagging to generate the training subset for each decision tree in the random forest. Experiments with two large datasets (5GB and 50GB compressed) show that COMET compares favorably (in both accuracy and training time) to learning on a subsample of data using a serial algorithm. Finally, we propose a new Gaussian approach for lazy ensemble evaluation which dynamically decides how many ensemble members to evaluate per data point; this can reduce evaluation cost by 100X or more.
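
The abstract does not spell out the lazy evaluation rule, so the following is only a rough sketch of the general idea under stated assumptions: ensemble members are queried one at a time, the running mean vote is treated as approximately Gaussian, and evaluation stops once a normal confidence interval around that mean no longer contains the decision threshold. The function name `lazy_vote`, the parameters `z`, `min_members` and `threshold`, and the stopping rule itself are illustrative assumptions, not the authors' method.

```python
import math


def lazy_vote(members, x, z=2.576, min_members=10, threshold=0.5):
    """Evaluate ensemble members one by one and stop early when confident.

    members: iterable of callables returning a 0/1 vote for input x
             (hypothetical interface, assumed for this sketch).
    Returns (predicted_label, number_of_members_evaluated).
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's update)
    for member in members:
        vote = float(member(x))
        n += 1
        delta = vote - mean
        mean += delta / n
        m2 += delta * (vote - mean)
        # Only test for early stopping once a sample variance is defined.
        if n >= max(min_members, 2):
            std_err = math.sqrt(m2 / (n - 1) / n)
            # Assumed rule: stop when the threshold lies outside the
            # normal-approximation interval mean +/- z * std_err.
            if abs(mean - threshold) > z * std_err:
                break
    return mean > threshold, n


# A unanimous ensemble is resolved after the minimum number of evaluations,
# while an evenly split one is evaluated in full.
unanimous = [lambda x: 1] * 1000
split = [(lambda x: 1) if i % 2 == 0 else (lambda x: 0) for i in range(1000)]
label_u, used_u = lazy_vote(unanimous, None)
label_s, used_s = lazy_vote(split, None)
```

In this sketch the saving comes entirely from easy points: the more lopsided the votes, the sooner the interval clears the threshold and the fewer members are evaluated.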

    SMOTE: Synthetic Minority Over-sampling Technique

    An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
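
The abstract mentions synthetic minority examples without giving the construction. A minimal sketch of the usual SMOTE idea is shown below, assuming numeric features and Euclidean distance: each synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority-class neighbours. The function name `smote_sketch`, its parameters, and the toy data are hypothetical, and the under-sampling of the majority class described in the abstract is not shown.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (illustrative values only).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_synthetic=6, k=2)
```

Because every synthetic point lies between two existing minority samples, the new examples stay inside the region the minority class already occupies rather than duplicating existing points exactly.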

    H11 Implementing physiotherapy Huntington's disease guidelines in clinical practice: a global survey

    Background: Clinical practice guidelines are often not optimally translated to clinical care. Following the publication of the Huntington’s disease (HD) physiotherapy clinical practice guidelines in 2020, the European Huntington’s Disease Network Physiotherapy Working Group (EHDN PWG) identified a need to explore perceived facilitators and barriers to their implementation. The aims of this study were to explore physiotherapists’ awareness of the 2020 guidelines and the perceived barriers and facilitators to their implementation. Methods: An observational study was carried out using an online survey. Participants were physiotherapists recruited via the EHDN and physiotherapy associations in the United Kingdom, Australia, and the United States of America. The survey used Likert scales to gather data on agreement and disagreement with statements of barriers and facilitators to implementation of each of the six recommendations in the guidelines. Results: There were 32 respondents: 18 from Europe, 7 from Australia, 5 from the USA, 1 from Africa (1 missing data). The majority were aware of the guidelines (69%), with 75% working with clients with HD for less than 40% of their time. Key findings were that HD-specific attributes (physical, behavioural and low motivation) were perceived to be barriers to implementation of recommendations (≥ 70% agreement). Support from colleagues (81-91% agreement), an individualised plan (72-88% agreement) and physiotherapists’ expertise in HD (81-91% agreement) were found to be facilitators of implementation in all six of the recommendations. Conclusions: This study is the first to explore implementation of guidelines in physiotherapy clinical practice. Resources from PWG to support physiotherapists need to focus on ways to implement recommendations specifically related to management of physical, behavioural and motivational problems associated with HD. This would enhance physiotherapists’ expertise, a facilitator to implementation of clinical practice guidelines.