COMET: A Recipe for Learning and Using Large Ensembles on Massive Data
COMET is a single-pass MapReduce algorithm for learning on large-scale data.
It builds multiple random forest ensembles on distributed blocks of data and
merges them into a mega-ensemble. This approach is appropriate when learning
from massive-scale data that is too large to fit on a single machine. To get
the best accuracy, IVoting should be used instead of bagging to generate the
training subset for each decision tree in the random forest. Experiments with
two large datasets (5GB and 50GB compressed) show that COMET compares favorably
(in both accuracy and training time) to learning on a subsample of data using a
serial algorithm. Finally, we propose a new Gaussian approach for lazy ensemble
evaluation which dynamically decides how many ensemble members to evaluate per
data point; this can reduce evaluation cost by 100X or more.
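The lazy-evaluation idea above can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: it assumes binary 0/1 tree votes and stops as soon as a normal-approximation confidence interval around the running vote fraction excludes the 0.5 decision boundary; the function name, z-value, and minimum tree count are all illustrative assumptions.

```python
import math
import random

def lazy_vote(tree_predictions, z=2.58, min_trees=10):
    """Evaluate ensemble members one at a time, stopping early once the
    running vote fraction is statistically far from the 0.5 boundary
    under a Gaussian approximation (illustrative sketch)."""
    votes = 0
    n = 0
    for v in tree_predictions:
        votes += v
        n += 1
        if n >= min_trees:
            p = votes / n
            # standard error of the vote fraction after n trees
            se = math.sqrt(max(p * (1 - p), 1e-12) / n)
            if abs(p - 0.5) > z * se:
                break  # the outcome is already settled; skip remaining trees
    return int(votes / n >= 0.5), n  # (prediction, trees actually evaluated)

# Usage: 1000 trees that agree 90% of the time -- far fewer are evaluated.
random.seed(0)
preds = [1 if random.random() < 0.9 else 0 for _ in range(1000)]
label, used = lazy_vote(preds)
print(label, used)
```

When most trees agree, the confidence interval tightens quickly and evaluation stops after a handful of members, which is the source of the large per-point savings the abstract describes.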
ASCI visualization tool evaluation, Version 2.0
The charter of the ASCI Visualization Common Tools subgroup was to investigate and evaluate 3D scientific visualization tools. As part of that effort, a Tri-Lab evaluation effort was launched in February of 1996. The first step was to agree on a thoroughly documented list of 32 features against which all tool candidates would be evaluated. These evaluation criteria were both gleaned from a user survey and determined from informed extrapolation into the future, particularly as concerns the 3D nature and extremely large size of ASCI data sets. The second step was to winnow a field of 41 candidate tools down to 11. The selection principle was to be as inclusive as practical, retaining every tool that seemed to hold any promise of fulfilling all of ASCI's visualization needs. These 11 tools were then closely investigated by volunteer evaluators distributed across LANL, LLNL, and SNL. This report contains the results of those evaluations, as well as a discussion of the evaluation philosophy and criteria.
Implementing physiotherapy Huntington's disease guidelines in clinical practice: a global survey
Background Clinical practice guidelines are often not optimally translated to clinical care. Following the publication of the Huntington’s disease (HD) physiotherapy clinical practice guidelines in 2020, the European Huntington’s Disease Network Physiotherapy Working Group (EHDN PWG) identified a need to explore perceived facilitators and barriers to their implementation. The aims of this study were to explore physiotherapists’ awareness of and perceived barriers and facilitators to implementation of the 2020 guidelines.
Methods An observational study was carried out using an online survey. Participants were physiotherapists recruited via the EHDN and physiotherapy associations in the United Kingdom, Australia, and United States of America. The survey gathered data on agreement and disagreement with statements of barriers and facilitators to implementation of each of six recommendations in the guidelines using Likert scales.
Results There were 32 respondents: 18 from Europe, 7 from Australia, 5 from the USA, 1 from Africa, and 1 with missing region data. The majority were aware of the guidelines (69%), with 75% working with clients with HD < 40% of their time. Key findings were that HD-specific attributes (physical, behavioural and low motivation) were perceived to be barriers to implementation of recommendations (≥ 70% agreement). Support from colleagues (81-91% agreement), an individualised plan (72-88% agreement) and physiotherapists’ expertise in HD (81-91% agreement) were found to be facilitators of implementation in all six of the recommendations.
Conclusions This study is the first to explore implementation of these guidelines in physiotherapy clinical practice. Resources from the PWG to support physiotherapists need to focus on ways to implement recommendations specifically related to management of physical, behavioural and motivational problems associated with HD. This would enhance physiotherapists’ expertise, a facilitator of implementation of clinical practice guidelines.
SMOTE: Synthetic Minority Over-sampling Technique
An approach to the construction of classifiers from imbalanced datasets is
described. A dataset is imbalanced if the classification categories are not
approximately equally represented. Often real-world datasets are predominantly
composed of "normal" examples with only a small percentage of "abnormal" or
"interesting" examples. It is also the case that the cost of misclassifying an
abnormal (interesting) example as a normal example is often much higher than
the cost of the reverse error. Under-sampling of the majority (normal) class
has been proposed as a good means of increasing the sensitivity of a classifier
to the minority class. This paper shows that a combination of our method of
over-sampling the minority (abnormal) class and under-sampling the majority
(normal) class can achieve better classifier performance (in ROC space) than
only under-sampling the majority class. This paper also shows that a
combination of our method of over-sampling the minority class and
under-sampling the majority class can achieve better classifier performance (in
ROC space) than varying the loss ratios in Ripper or class priors in Naive
Bayes. Our method of over-sampling the minority class involves creating
synthetic minority class examples. Experiments are performed using C4.5, Ripper
and a Naive Bayes classifier. The method is evaluated using the area under the
Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
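The synthetic-example step at the heart of the method can be illustrated with a short NumPy sketch: each new sample is placed at a random point on the line segment between a minority example and one of its k nearest minority-class neighbours. The function name, neighbour count, and toy data below are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def smote(minority, n_synthetic, k=5, rng=None):
    """Generate synthetic minority samples by interpolating between each
    chosen sample and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(rng)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(len(minority))
        x = minority[i]
        # Euclidean distances from x to every minority sample
        d = np.linalg.norm(minority - x, axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the sample itself
        nn = minority[rng.choice(neighbours)]
        gap = rng.random()  # random position along the segment x -> nn
        synthetic.append(x + gap * (nn - x))
    return np.array(synthetic)

# Usage: 10 minority points in 2-D, oversampled with 30 synthetic points.
minority = np.random.default_rng(0).normal(size=(10, 2))
new = smote(minority, n_synthetic=30, rng=1)
print(new.shape)  # (30, 2)
```

Because the new points lie between existing minority examples rather than duplicating them, the classifier's decision region for the minority class becomes broader and more general.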
Physical therapy and exercise interventions in Huntington's disease: a mixed methods systematic review protocol
Review question/objective:
The review seeks to evaluate the effectiveness of physical therapy and exercise interventions in Huntington's disease (HD). The review question is: What is the effectiveness of physiotherapy and therapeutic exercise interventions in people with HD, and what are patients’, families’ and caregivers’ perceptions of these interventions?
The specific objectives are:
This mixed methods review seeks to develop an aggregated synthesis of quantitative, qualitative and narrative systematic reviews on physiotherapy and exercise interventions in HD, in an attempt to derive conclusions and recommendations useful for clinical practice and policy decision-making.
Local Area Signal-to-Noise Ratio (LASNR) algorithm for Image Segmentation
Many automated image-based applications need to find small spots in a variably noisy image. For humans, it is relatively easy to distinguish objects from local surroundings no matter what else may be in the image. We attempt to capture this distinguishing capability computationally by calculating a measurement that estimates the strength of signal within an object versus the noise in its local neighborhood. First, we hypothesize various sizes for the object and corresponding background areas. Then, we compute the Local Area Signal to Noise Ratio (LASNR) at every pixel in the image, resulting in a new image with LASNR values for each pixel. All pixels exceeding a pre-selected LASNR value become seed pixels, or initiation points, and are grown to include the full area extent of the object. Since growing the seed is a separate operation from finding the seed, each object can be any size and shape. Thus, the overall process is a 2-stage segmentation method that first finds object seeds and then grows them to find the full extent of the object. This algorithm was designed, optimized and is in daily use for the accurate and rapid inspection of optics from a large laser system (National Ignition Facility (NIF), Lawrence Livermore National Laboratory, Livermore, CA), which includes images with background noise, ghost reflections, different illumination and other sources of variation.
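The 2-stage find-then-grow process can be sketched as follows. This is a minimal NumPy illustration, not the NIF production code: stage 1 scores each pixel by the mean of a small object window relative to the mean and standard deviation of a surrounding background ring, and stage 2 flood-fills outward from seed pixels over connected pixels above a lower threshold. The window sizes and thresholds are illustrative assumptions.

```python
import numpy as np

def lasnr_segment(img, obj=1, bg=4, snr_thresh=5.0, grow_thresh=1.0):
    """Two-stage segmentation sketch: score pixels by a local SNR,
    threshold to get seeds, then grow seeds over connected pixels."""
    h, w = img.shape
    snr = np.zeros_like(img, dtype=float)
    for y in range(bg, h - bg):
        for x in range(bg, w - bg):
            inner = img[y - obj:y + obj + 1, x - obj:x + obj + 1]
            outer = img[y - bg:y + bg + 1, x - bg:x + bg + 1].copy()
            # mask out the object window so the ring is pure background
            outer[bg - obj:bg + obj + 1, bg - obj:bg + obj + 1] = np.nan
            mu, sd = np.nanmean(outer), np.nanstd(outer)
            snr[y, x] = (inner.mean() - mu) / (sd + 1e-9)
    # Stage 1: seeds are pixels whose local SNR exceeds the high threshold.
    seeds = np.argwhere(snr > snr_thresh)
    # Stage 2: flood-fill growth from each seed over above-threshold pixels,
    # so the recovered object can be any size and shape.
    mask = np.zeros_like(img, dtype=bool)
    stack = [tuple(s) for s in seeds]
    while stack:
        y, x = stack.pop()
        if mask[y, x] or snr[y, x] <= grow_thresh:
            continue
        mask[y, x] = True
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                stack.append((ny, nx))
    return mask

# Usage: a bright 3x3 spot on unit-variance noise is seeded and grown.
rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, (32, 32))
img[14:17, 14:17] += 10.0
mask = lasnr_segment(img)
print(mask[15, 15], mask[2, 2])
```

Separating the high seed threshold from the lower growth threshold is what lets a single strong detection recover the full, irregular extent of a faint object.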