EC3: Combining Clustering and Classification for Ensemble Learning
Classification and clustering algorithms have been proved to be successful
individually in different contexts. Both of them have their own advantages and
limitations. For instance, although classification algorithms are more powerful
than clustering methods in predicting class labels of objects, they do not
perform well when there is a lack of sufficient manually labeled reliable data.
On the other hand, although clustering algorithms do not produce label
information for objects, they provide supplementary constraints (e.g., if two
objects are clustered together, it is more likely that the same label is
assigned to both of them) that one can leverage for label prediction of a set
of unknown objects. Therefore, systematic utilization of both these types of
algorithms together can lead to better prediction performance. In this paper,
we propose a novel algorithm, called EC3, that merges classification and
clustering together in order to support both binary and multi-class
classification. EC3 is based on a principled combination of multiple
classification and multiple clustering methods using an optimization function.
We theoretically show the convexity and optimality of the problem and solve it
by block coordinate descent method. We additionally propose iEC3, a variant of
EC3 that handles imbalanced training data. We perform an extensive experimental
analysis by comparing EC3 and iEC3 with 14 baseline methods (7 well-known
standalone classifiers, 5 ensemble classifiers, and 2 existing methods that
merge classification and clustering) on 13 standard benchmark datasets. We show
that our methods outperform the baselines on every dataset, achieving up to 10%
higher AUC. Moreover, our methods are faster (1.21 times faster than the best
baseline) and more resilient to noise and class imbalance than the best
baseline method.
Comment: 14 pages, 7 figures, 11 tables
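The idea that clusterings supply constraints for label prediction can be illustrated with a toy consensus procedure. The sketch below is not EC3's objective or algorithm, which the abstract does not spell out. It assumes a simpler stand-in objective: the sum over cluster memberships of ||u_i - q_g||^2 plus alpha times the sum over objects of ||u_i - p_i||^2, where p_i is an averaged classifier output, u_i the refined label vector of object i, and q_g the label vector of cluster g. The function name, parameters, and toy data are hypothetical. Each block has a closed-form minimizer, so block coordinate descent simply alternates between the two updates.

```python
def consensus_labels(class_probs, clusterings, alpha=1.0, n_iter=50):
    """Refine classifier outputs with cluster co-membership (illustrative only).

    class_probs: one probability vector per object (e.g. averaged over classifiers).
    clusterings: list of clusterings, each a list with one cluster id per object.
    alpha: weight (> 0) tying each object to its classifier output.
    """
    n, k = len(class_probs), len(class_probs[0])

    # One group per (clustering index, cluster id) pair.
    groups = {}
    for c_idx, assignment in enumerate(clusterings):
        for obj, cluster_id in enumerate(assignment):
            groups.setdefault((c_idx, cluster_id), []).append(obj)
    members = list(groups.values())

    # For each object, the indices of the groups it belongs to.
    obj_groups = [[] for _ in range(n)]
    for g, objs in enumerate(members):
        for obj in objs:
            obj_groups[obj].append(g)

    u = [list(p) for p in class_probs]
    for _ in range(n_iter):
        # Block 1: with u fixed, the best q_g is the mean of its members.
        q = [[sum(u[o][c] for o in objs) / len(objs) for c in range(k)]
             for objs in members]
        # Block 2: with q fixed, the best u_i is a weighted average of its
        # groups' vectors and its own classifier output.
        u = [[(sum(q[g][c] for g in obj_groups[i]) + alpha * class_probs[i][c])
              / (len(obj_groups[i]) + alpha) for c in range(k)]
             for i in range(n)]
    return u


# Toy data: objects 2 and 5 are weakly misclassified, but each shares a
# cluster with two confidently classified objects.
probs = [[0.9, 0.1], [0.9, 0.1], [0.45, 0.55],
         [0.1, 0.9], [0.1, 0.9], [0.55, 0.45]]
refined = consensus_labels(probs, clusterings=[[0, 0, 0, 1, 1, 1]])
predicted = [row.index(max(row)) for row in refined]
print(predicted)
```

In this toy run the cluster constraint pulls the two weakly misclassified objects toward the label of their cluster mates, which is the kind of complementary signal the abstract describes.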
Self-Adaptive Surrogate-Assisted Covariance Matrix Adaptation Evolution Strategy
This paper presents a novel mechanism to adapt surrogate-assisted
population-based algorithms. This mechanism is applied to ACM-ES, a recently
proposed surrogate-assisted variant of CMA-ES. The resulting algorithm,
saACM-ES, adjusts online the lifelength of the current surrogate model (the
number of CMA-ES generations before learning a new surrogate) and the surrogate
hyper-parameters. Both heuristics significantly improve the quality of the
surrogate model, yielding a significant speed-up of saACM-ES compared to the
ACM-ES and CMA-ES baselines. The empirical validation of saACM-ES on the
BBOB-2012 noiseless testbed demonstrates the efficiency of the proposed
approach and its scalability with respect to the problem dimension and the
population size; the approach reaches new best results on some of the
benchmark problems.
Comment: Genetic and Evolutionary Computation Conference (GECCO 2012) (2012)
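Adjusting the surrogate lifelength online can be pictured as mapping the surrogate's measured error to the number of generations it may be reused. The sketch below assumes a simple linear rule; the abstract does not give the actual rule used by saACM-ES, and the function name, error scale, threshold, and maximum are all hypothetical.

```python
def surrogate_lifelength(model_error, error_threshold=0.5, max_generations=20):
    """Generations to keep using the current surrogate (illustrative rule).

    model_error: surrogate error measured on freshly evaluated points;
                 0.0 is treated as a perfect model.
    error_threshold: assumed error level at which the surrogate is not
                     trusted at all.
    max_generations: assumed upper bound on the lifelength.
    """
    model_error = max(0.0, model_error)
    if model_error >= error_threshold:
        return 0  # surrogate too inaccurate: fall back to the true objective
    # Linear interpolation: lower error gives a longer lifelength.
    fraction = (error_threshold - model_error) / error_threshold
    return int(fraction * max_generations)


for err in (0.0, 0.25, 0.5):
    print(err, surrogate_lifelength(err))
```

The point of such a rule is that an accurate surrogate saves more true function evaluations, while an inaccurate one is retrained sooner.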
Hyperparameter Importance Across Datasets
With the advent of automated machine learning, automated hyperparameter
optimization methods are by now routinely used in data mining. However, this
progress is not yet matched by equal progress on automatic analyses that yield
information beyond performance-optimizing hyperparameter settings. In this
work, we aim to answer the following two questions: Given an algorithm, what
are generally its most important hyperparameters, and what are typically good
values for these? We present methodology and a framework to answer these
questions based on meta-learning across many datasets. We apply this
methodology using the experimental meta-data available on OpenML to determine
the most important hyperparameters of support vector machines, random forests
and AdaBoost, and to infer priors for all their hyperparameters. The results,
obtained fully automatically, provide a quantitative basis to focus efforts in
both manual algorithm design and in automated hyperparameter optimization. The
conducted experiments confirm that the hyperparameters selected by the proposed
method are indeed the most important ones and that the obtained priors also
lead to statistically significant improvements in hyperparameter optimization.
Comment: © 2018. Copyright is held by the owner/author(s).
Publication rights licensed to ACM. This is the author's version of the work.
It is posted here for your personal use, not for redistribution. The
definitive Version of Record was published in Proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery & Data Mining
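One way to make "hyperparameter importance across datasets" concrete is to measure, per dataset, how much of the performance variance is explained by a single hyperparameter, and then average over datasets. The sketch below is a crude variance-ratio stand-in, not the framework presented in the paper; the function names and the toy scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance


def main_effect_importance(runs, name):
    """Fraction of score variance explained by one hyperparameter alone.

    runs: list of (config dict, score) pairs from a single dataset.
    name: hyperparameter whose marginal effect is measured.
    """
    scores = [score for _, score in runs]
    total = pvariance(scores)
    if total == 0:
        return 0.0
    by_value = defaultdict(list)
    for config, score in runs:
        by_value[config[name]].append(score)
    overall = mean(scores)
    # Between-group variance of the per-value mean scores.
    between = sum(len(v) * (mean(v) - overall) ** 2
                  for v in by_value.values()) / len(scores)
    return between / total


def importance_across_datasets(runs_by_dataset, names):
    """Average the per-dataset importance of each hyperparameter."""
    return {n: mean(main_effect_importance(runs, n)
                    for runs in runs_by_dataset.values())
            for n in names}


# Toy meta-data (made up): on both datasets the score depends only on gamma.
toy = {
    "dataset_a": [({"gamma": 0.1, "C": 1}, 0.9), ({"gamma": 0.1, "C": 10}, 0.9),
                  ({"gamma": 1.0, "C": 1}, 0.5), ({"gamma": 1.0, "C": 10}, 0.5)],
    "dataset_b": [({"gamma": 0.1, "C": 1}, 0.8), ({"gamma": 0.1, "C": 10}, 0.8),
                  ({"gamma": 1.0, "C": 1}, 0.6), ({"gamma": 1.0, "C": 10}, 0.6)],
}
print(importance_across_datasets(toy, ["gamma", "C"]))
```

On a full grid of configurations this ratio captures only the marginal effect of each hyperparameter; interaction effects between hyperparameters would need a richer decomposition.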
Evolutionary model type selection for global surrogate modeling
Due to the scale and computational complexity of currently used simulation codes, global surrogate models (metamodels) have become indispensable tools for exploring and understanding the design space. Due to their compact formulation, they are cheap to evaluate and thus readily facilitate visualization, design space exploration, rapid prototyping, and sensitivity analysis. They can also be used as accurate building blocks in design packages or larger simulation environments. Consequently, there is great interest in techniques that facilitate the construction of such approximation models while minimizing the computational cost and maximizing model accuracy. Many surrogate model types exist (Support Vector Machines, Kriging, Neural Networks, etc.), but no type is optimal in all circumstances, nor is there any hard theory available that can help make this choice. In this paper we present an automatic approach to the model type selection problem. We describe an adaptive global surrogate modeling environment with adaptive sampling, driven by speciated evolution. Different model types are evolved cooperatively using a Genetic Algorithm (heterogeneous evolution) and compete to approximate the iteratively selected data. In this way the optimal model type and complexity for a given data set or simulation code can be dynamically determined. The utility and performance of the approach are demonstrated on a number of problems where it outperforms traditional sequential execution of each model type.