9 research outputs found

Recommending Learning Algorithms and Their Associated Hyperparameters

Author: Giraud-Carrier Christophe
Martinez Tony
Mitchell Logan
Smith Michael R.
Publication date: 07/07/2014

The success of machine learning on a given task depends on, among other things, which learning algorithm is selected and its associated hyperparameters. Selecting an appropriate learning algorithm and setting its hyperparameters for a given data set can be a challenging task, especially for users who are not experts in machine learning. Previous work has examined using meta-features to predict which learning algorithm and hyperparameters should be used. However, choosing a set of meta-features that are predictive of algorithm performance is difficult. Here, we propose to apply collaborative filtering techniques to learning algorithm and hyperparameter selection, and find that doing so avoids determining which meta-features to use and outperforms traditional meta-learning approaches in many cases. Comment: Short paper, 2 pages, 2 tables

arXiv.org e-Print Archive

CiteSeerX

An Easy to Use Repository for Comparing and Improving Machine Learning Algorithm Usage

Author: Giraud-Carrier Christophe
Martinez Tony
Smith Michael R.
White Andrew
Publication date: 05/06/2014

The results from most machine learning experiments are used for a specific purpose and then discarded. This results in a significant loss of information and requires rerunning experiments to compare learning algorithms. This also requires implementation of another algorithm for comparison, which may not always be correctly implemented. By storing the results from previous experiments, machine learning algorithms can be compared easily and the knowledge gained from them can be used to improve their performance. The purpose of this work is to provide easy access to previous experimental results for learning and comparison. These stored results are comprehensive, storing the prediction for each test instance as well as the learning algorithm, hyperparameters, and training set that were used. Previous results are particularly important for meta-learning, which, in a broad sense, is the process of learning from previous machine learning results such that the learning process is improved. While other experiment databases do exist, one of our focuses is on easy access to the data. We provide meta-learning data sets that are ready to be downloaded for meta-learning experiments. In addition, queries to the underlying database can be made if specific information is desired. We also differ from previous experiment databases in that our database is designed at the instance level, where an instance is an example in a data set. We store the predictions of a learning algorithm trained on a specific training set for each instance in the test set. Data set level information can then be obtained by aggregating the results from the instances. The instance level information can be used for many tasks such as determining the diversity of a classifier or algorithmically determining the optimal subset of training instances for a learning algorithm. Comment: 7 pages, 1 figure, 6 tables

arXiv.org e-Print Archive

CiteSeerX

OpenML: networked science in machine learning

Author: Bischl Bernd
Torgo Luis
van Rijn Jan N.
Vanschoren Joaquin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/08/2014

Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and collaborate with others to tackle harder problems. We discuss how OpenML relates to other examples of networked science and what benefits it brings for machine learning research, individual scientists, as well as students and practitioners. Comment: 12 pages, 10 figures

arXiv.org e-Print Archive

CiteSeerX

Semantic descriptor for intelligence services

Author: De Meester Ben
Montpetit Marie-J.
Ramos Edgar
Schneider Timon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

The exposition and discovery of intelligence, especially for connected devices and autonomous systems, have become an important area of research towards an all-intelligent world. In this article, a semantic description of functions is proposed and used to provide intelligence services, mainly for networked devices. The semantic descriptors aim to provide interoperability between multiple domains' vocabularies, data models, and ontologies, so that device applications become able to deploy them autonomously once they are onboarded in the device or system platform. The proposed framework supports the discovery, onboarding, and updating of the services by providing descriptions of their execution environment, software dependencies, policies and data inputs required, as well as the outputs produced, to enable application decoupling from the AI functions.

Crossref

Ghent University Academic Bibliography

ASlib: A Benchmark Library for Algorithm Selection

Author: Bischl Bernd
Frechette Alexandre
Hoos Holger
Hutter Frank
Kerschke Pascal
Kotthoff Lars
Leyton-Brown Kevin
Lindauer Marius
Malitsky Yuri
Tierney Kevin
Vanschoren Joaquin
Publication date: 01/01/2016

The task of algorithm selection involves choosing an algorithm from a set of algorithms on a per-instance basis in order to exploit the varying performance of algorithms over a set of instances. The algorithm selection problem is attracting increasing attention from researchers and practitioners in AI. Years of fruitful applications in a number of domains have resulted in a large amount of data, but the community lacks a standard format or repository for this data. This situation makes it difficult to share and compare different approaches effectively, as is done in other, more established fields. It also unnecessarily hinders new researchers who want to work in this area. To address this problem, we introduce a standardized format for representing algorithm selection scenarios and a repository that contains a growing number of data sets from the literature. Our format has been designed to be able to express a wide variety of different scenarios. Demonstrating the breadth and power of our platform, we describe a set of example experiments that build and evaluate algorithm selection models through a common interface. The results display the potential of algorithm selection to achieve significant performance improvements across a broad range of problems and algorithms. Comment: Accepted to be published in Artificial Intelligence Journal

arXiv.org e-Print Archive

Repository TU/e

Crossref

Pure OAI Repository

Publications at Bielefeld University

Ontology of core data mining entities

Author: A Bernstein
A Golbraikh
A Karalic
B Smith
C Silla
C Vens
D Demšar
D Kocev
D Qi
D Young
DJ Hand
F Serban
G Madjarov
G Tsoumakas
GH Bakir
H Mannila
HP Kriegel
I Slavkov
J Vanschoren
K Button
Larisa Soldatova
LN Soldatova
M Courtot
M Ford
M Žáková
MA Avery
MF López
O Spjuth
P Robinson
Panče Panov
Q Yang
R Caruana
R Guha
RD King
RR Brinkman
Sašo Džeroski
T Dietterich
V Podpečan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/07/2014

In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

Crossref

Brunel University Research Archive

Experiment databases: A new way to share, organize and learn from experiments

Author: Blockeel Hendrik
Holmes Geoffrey
Pfahringer Bernhard
Vanschoren Joaquin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Thousands of machine learning research papers contain extensive experimental comparisons. However, the details of those experiments are often lost after publication, making it impossible to reuse these experiments in further research, or reproduce them to verify the claims made. In this paper, we present a collaboration framework designed to easily share machine learning experiments with the community, and automatically organize them in public databases. This enables immediate reuse of experiments for subsequent, possibly much broader investigation and offers faster and more thorough analysis based on a large set of varied results. We describe how we designed such an experiment database, currently holding over 650,000 classification experiments, and demonstrate its use by answering a wide range of interesting research questions and by verifying a number of recent studies.

Lirias

Repository TU/e

Crossref

Research Commons@Waikato

On benchmark experiments and visualization methods for the evaluation and interpretation of machine learning models

Author: Casalicchio Giuseppe
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 18/03/2019

Digitale Hochschulschriften der LMU