17 research outputs found
Hyperparameter Importance Across Datasets
With the advent of automated machine learning, automated hyperparameter
optimization methods are by now routinely used in data mining. However, this
progress is not yet matched by equal progress on automatic analyses that yield
information beyond performance-optimizing hyperparameter settings. In this
work, we aim to answer the following two questions: Given an algorithm, what
are generally its most important hyperparameters, and what are typically good
values for these? We present methodology and a framework to answer these
questions based on meta-learning across many datasets. We apply this
methodology using the experimental meta-data available on OpenML to determine
the most important hyperparameters of support vector machines, random forests
and Adaboost, and to infer priors for all their hyperparameters. The results,
obtained fully automatically, provide a quantitative basis to focus efforts in
both manual algorithm design and in automated hyperparameter optimization. The
conducted experiments confirm that the hyperparameters selected by the proposed
method are indeed the most important ones and that the obtained priors also
lead to statistically significant improvements in hyperparameter optimization.Comment: \c{opyright} 2018. Copyright is held by the owner/author(s).
Publication rights licensed to ACM. This is the author's version of the work.
It is posted here for your personal use, not for redistribution. The
definitive Version of Record was published in Proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery & Data Minin
How to Identify Investor's types in real financial markets by means of agent based simulation
The paper proposes a computational adaptation of the principles underlying
principal component analysis with agent based simulation in order to produce a
novel modeling methodology for financial time series and financial markets.
Goal of the proposed methodology is to find a reduced set of investor s models
(agents) which is able to approximate or explain a target financial time
series. As computational testbed for the study, we choose the learning system L
FABS which combines simulated annealing with agent based simulation for
approximating financial time series. We will also comment on how L FABS s
architecture could exploit parallel computation to scale when dealing with
massive agent simulations. Two experimental case studies showing the efficacy
of the proposed methodology are reported.Comment: 18 pages, in pres
Hyperparameter Learning via Distributional Transfer
Bayesian optimisation is a popular technique for hyperparameter learning but
typically requires initial exploration even in cases where similar prior tasks
have been solved. We propose to transfer information across tasks using learnt
representations of training datasets used in those tasks. This results in a
joint Gaussian process model on hyperparameters and data representations.
Representations make use of the framework of distribution embeddings into
reproducing kernel Hilbert spaces. The developed method has a faster
convergence compared to existing baselines, in some cases requiring only a few
evaluations of the target objective
Towards meta-learning for multi-target regression problems
Several multi-target regression methods were devel-oped in the last years
aiming at improving predictive performanceby exploring inter-target correlation
within the problem. However, none of these methods outperforms the others for
all problems. This motivates the development of automatic approachesto
recommend the most suitable multi-target regression method. In this paper, we
propose a meta-learning system to recommend the best predictive method for a
given multi-target regression problem. We performed experiments with a
meta-dataset generated by a total of 648 synthetic datasets. These datasets
were created to explore distinct inter-targets characteristics toward
recommending the most promising method. In experiments, we evaluated four
different algorithms with different biases as meta-learners. Our meta-dataset
is composed of 58 meta-features, based on: statistical information, correlation
characteristics, linear landmarking, from the distribution and smoothness of
the data, and has four different meta-labels. Results showed that induced
meta-models were able to recommend the best methodfor different base level
datasets with a balanced accuracy superior to 70% using a Random Forest
meta-model, which statistically outperformed the meta-learning baselines.Comment: To appear on the 8th Brazilian Conference on Intelligent Systems
(BRACIS
Metalearning: a survey of trends and technologies
Metalearning attracted considerable interest in the machine learning community in the last years. Yet, some disagreement remains on what does or what does not constitute a metalearning problem and in which contexts the term is used in. This survey aims at giving an all-encompassing overview of the research directions pursued under the umbrella of metalearning, reconciling different definitions given in scientific literature, listing the choices involved when designing a metalearning system and identifying some of the future research challenges in this domain. © 2013 The Author(s)
Hybrid ACO and SVM algorithm for pattern classification
Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to
solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach originated from statistical approaches. However, SVM suffers two main problems which include feature subset selection and parameter tuning. Most approaches related to tuning SVM parameters discretize the continuous value of the parameters which will give a negative effect on the classification performance. This study presents four algorithms for tuning the
SVM parameters and selecting feature subset which improved SVM classification accuracy with smaller size of feature subset. This is achieved by performing the SVM parameters’ tuning and feature subset selection processes simultaneously. Hybridization algorithms between ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters while
the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results obtained from the proposed algorithms are better when compared with other approaches in terms of classification accuracy and size of the feature subset. The average classification
accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average size of feature subset is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed-variable ACO