
    Filtering participants improves generalization in competitions and benchmarks

    We address the problem of selecting a winning algorithm in a challenge or benchmark. While evaluations of algorithms carried out by third-party organizers eliminate the inventor-evaluator bias, little attention has been paid to the risk that the organizers over-fit the winner's selection. In this paper, we carry out an empirical evaluation using the results of several challenges and benchmarks, evidencing this phenomenon. We show that a heuristic commonly used by organizers, pre-filtering participants using a trial run, reduces over-fitting. We formalize this method and derive a semi-empirical formula to determine the optimal number k of top participants to retain from the trial run.
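    As an illustration of the pre-filtering heuristic described in this abstract, the sketch below keeps only the top k participants from a trial (feedback) run before declaring the winner on the final run. The score arrays and the value of k are hypothetical, and the paper's semi-empirical formula for choosing k is not reproduced here.

```python
import numpy as np

def select_winner_with_filtering(trial_scores, final_scores, k):
    """Keep the top-k participants on the trial run, then pick the
    winner among them based on the final-run scores.

    trial_scores, final_scores: 1-D arrays indexed by participant.
    Higher scores are assumed to be better.
    """
    # Indices of the k best participants on the trial (feedback) phase.
    finalists = np.argsort(trial_scores)[::-1][:k]
    # The winner is the finalist with the best score on the final phase.
    return finalists[np.argmax(final_scores[finalists])]

# Hypothetical scores for 10 participants on the two phases.
rng = np.random.default_rng(0)
trial = rng.normal(size=10)
final = trial + rng.normal(scale=0.5, size=10)  # correlated but noisy
print(select_winner_with_filtering(trial, final, k=3))
```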

    Judging competitions and benchmarks: a candidate election approach

    Machine learning progress relies on algorithm benchmarks. We study the problem of declaring a winner, or ranking "candidate" algorithms, based on results obtained from "judges" (scores on various tasks). Inspired by social science and game theory on fair elections, we compare various ranking functions, ranging from simple score averaging to Condorcet methods. We devise novel empirical criteria to assess the quality of ranking functions, including generalization to new tasks and stability under judge or candidate perturbation. We conduct an empirical comparison on the results of 5 competitions and benchmarks (one artificially generated). While prior theoretical analyses indicate that no single ranking function satisfies all desired properties, our empirical study reveals that the classical "average rank" method fares well. However, some pairwise comparison methods can achieve better empirical results.
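    A minimal sketch of the classical "average rank" ranking function mentioned above, assuming a score matrix with one row per task ("judge") and one column per candidate algorithm; higher scores are taken to mean better performance, and the data are made up for illustration.

```python
import numpy as np
from scipy.stats import rankdata

def average_rank(scores):
    """Order candidates by their mean rank across tasks.

    scores: array of shape (n_tasks, n_candidates); higher is better.
    Returns candidate indices ordered from best to worst.
    """
    # Rank candidates within each task (rank 1 = best), averaging ties.
    per_task_ranks = np.vstack([rankdata(-row, method="average") for row in scores])
    mean_ranks = per_task_ranks.mean(axis=0)
    return np.argsort(mean_ranks)  # smallest mean rank first

# Hypothetical scores: 4 tasks (judges) x 3 candidate algorithms.
scores = np.array([[0.80, 0.70, 0.90],
                   [0.60, 0.65, 0.55],
                   [0.90, 0.85, 0.80],
                   [0.70, 0.75, 0.72]])
print(average_rank(scores))
```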

    CodaLab Competitions: An open source platform to organize scientific challenges

    CodaLab Competitions is an open-source web platform designed to help data scientists and research teams crowd-source the resolution of machine learning problems through the organization of competitions, also called challenges or contests. CodaLab Competitions provides useful features such as multiple phases, results and code submissions, multi-score leaderboards, and jobs running inside Docker containers. The platform is very flexible and can handle large-scale experiments by allowing organizers to upload large datasets and provide their own CPU or GPU compute workers.
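    Purely as an illustration of the code-submission and scoring pattern mentioned above, the sketch below shows a generic scoring program of the kind that could run inside a container: it reads a submitted prediction file and a reference file, computes a metric, and writes the result for the leaderboard. The file layout and metric are assumptions for the example, not the CodaLab specification.

```python
import sys
import numpy as np

def score(prediction_path, reference_path, output_path):
    """Compare a submitted prediction file against the reference labels
    and write the resulting score to a leaderboard output file."""
    y_pred = np.loadtxt(prediction_path)
    y_true = np.loadtxt(reference_path)
    accuracy = float(np.mean(y_pred == y_true))  # illustrative metric
    with open(output_path, "w") as f:
        f.write(f"accuracy: {accuracy:.4f}\n")

if __name__ == "__main__":
    # Paths would typically be supplied by the platform when it launches
    # the scoring job inside its Docker container (hypothetical layout).
    score(sys.argv[1], sys.argv[2], sys.argv[3])
```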

    Towards Automated Deep Learning: Analysis of the AutoDL challenge series 2019

    We present the design and results of recent competitions in Automated Deep Learning (AutoDL). In the AutoDL challenge series 2019, we organized 5 machine learning challenges: AutoCV, AutoCV2, AutoNLP, AutoSpeech, and AutoDL. Each of the first 4 challenges concerns a specific application domain, such as computer vision, natural language processing, or speech recognition. As of March 2020, the last challenge, AutoDL, is still ongoing and we only present its design; its results will be presented in future work, together with a detailed introduction of the winning solutions of each challenge. Some highlights of this work include: (1) a benchmark suite of baseline AutoML solutions, with emphasis on domains for which Deep Learning methods have had prior success (image, video, text, speech, etc.); (2) a novel "anytime learning" framework, which opens doors for further theoretical consideration; (3) a repository of around 100 datasets (from all the above domains), over half of which are released as public datasets to enable research on meta-learning; (4) analyses revealing that winning solutions generalize to new unseen datasets, validating progress towards universal AutoML.
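    The "anytime learning" setting mentioned in point (2) can be illustrated with a simple evaluation loop: the candidate model trains under a time budget and is scored every time it produces predictions, so that fast early progress is rewarded as well as final accuracy. The model interface and budget below are assumptions for the sketch, not the actual AutoDL protocol.

```python
import time

def anytime_evaluation(model, train_data, test_data, metric, time_budget=300.0):
    """Alternate training and prediction until the time budget runs out,
    recording (elapsed_time, score) checkpoints so a learning curve over
    time can be plotted or aggregated afterwards.

    `model` is assumed to expose partial_train() and predict() methods,
    and `test_data` to be a (features, labels) pair (hypothetical interface).
    """
    start = time.time()
    curve = []
    while time.time() - start < time_budget:
        model.partial_train(train_data)           # one increment of training
        predictions = model.predict(test_data[0])
        score = metric(test_data[1], predictions)
        curve.append((time.time() - start, score))
    return curve  # the curve can then be summarized, e.g. by its area
```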

    Aircraft Numerical "Twin": A Time Series Regression Competition

    This paper presents the design and analysis of a data science competition on a problem of time series regression from aeronautics data. For the purpose of performing predictive maintenance, aviation companies seek to create aircraft "numerical twins": programs capable of accurately predicting strains at strategic positions in various body parts of the aircraft. Given a number of input parameters (sensor data) recorded in sequence during the flight, the competition participants had to predict output values (gauges), also recorded sequentially during test flights but not during regular flights. The competition data included hundreds of complete flights. It was a code-submission competition with complete blind testing of algorithms. The results indicate that such a problem can be effectively solved with gradient boosted trees, after preprocessing and feature engineering. Deep learning methods did not prove as efficient.
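    A minimal sketch, assuming scikit-learn and pandas are available, of the kind of pipeline the abstract points to: rolling-window feature engineering on the sensor channels followed by a gradient boosted tree regressor predicting one gauge. The column names, window sizes, target, and hyperparameters are illustrative, not the competition's winning solution.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor

def make_features(sensors: pd.DataFrame, window: int = 32) -> pd.DataFrame:
    """Simple feature engineering: raw sensor values plus rolling
    statistics of each channel over the recent flight history."""
    feats = {c: sensors[c] for c in sensors.columns}
    for c in sensors.columns:
        roll = sensors[c].rolling(window, min_periods=1)
        feats[f"{c}_mean"] = roll.mean()
        feats[f"{c}_std"] = roll.std().fillna(0.0)
    return pd.DataFrame(feats)

# Hypothetical flight data: 1000 time steps, 5 sensor channels, 1 gauge.
rng = np.random.default_rng(0)
sensors = pd.DataFrame(rng.normal(size=(1000, 5)),
                       columns=[f"sensor_{i}" for i in range(5)])
gauge = sensors.sum(axis=1).rolling(10, min_periods=1).mean()  # toy target

X = make_features(sensors)
model = HistGradientBoostingRegressor(max_iter=200)
model.fit(X, gauge)
print(model.predict(X.iloc[:5]))
```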
