
    ASlib: A Benchmark Library for Algorithm Selection

    The task of algorithm selection involves choosing an algorithm from a set of algorithms on a per-instance basis in order to exploit the varying performance of algorithms over a set of instances. The algorithm selection problem is attracting increasing attention from researchers and practitioners in AI. Years of fruitful applications in a number of domains have produced a large amount of data, but the community lacks a standard format and repository for this data. This situation makes it difficult to share and compare different approaches effectively, as is done in other, more established fields, and it unnecessarily hinders new researchers who want to work in this area. To address this problem, we introduce a standardized format for representing algorithm selection scenarios and a repository that contains a growing number of data sets from the literature. Our format has been designed to express a wide variety of different scenarios. Demonstrating the breadth and power of our platform, we describe a set of example experiments that build and evaluate algorithm selection models through a common interface. The results show the potential of algorithm selection to achieve significant performance improvements across a broad range of problems and algorithms. Comment: Accepted for publication in the Artificial Intelligence Journal.
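    To make the per-instance selection idea concrete, here is a minimal sketch (not the ASlib format or interface; the data is synthetic): one runtime-regression model is trained per algorithm, and for each new instance the algorithm with the lowest predicted runtime is chosen.

    ```python
    # Minimal sketch of per-instance algorithm selection (not the ASlib API):
    # train one runtime-prediction model per algorithm, then pick the
    # algorithm with the lowest predicted runtime for each new instance.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    # Hypothetical scenario data: instance features and per-algorithm runtimes.
    X_train = rng.random((200, 10))   # 200 instances, 10 features each
    runtimes = rng.random((200, 3))   # observed runtimes of 3 algorithms

    # One regression model per algorithm maps instance features to runtime.
    models = [RandomForestRegressor(n_estimators=50, random_state=0)
              .fit(X_train, runtimes[:, a]) for a in range(runtimes.shape[1])]

    def select_algorithm(x):
        """Return the index of the algorithm with the lowest predicted runtime."""
        preds = [m.predict(x.reshape(1, -1))[0] for m in models]
        return int(np.argmin(preds))

    print(select_algorithm(rng.random(10)))  # index of the chosen algorithm
    ```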

    Systems for AutoML Research


    Enabling trade-offs between accuracy and computational cost: Adaptive algorithms to reduce time to clinical insight

    The efficacy of drug treatments depends on how tightly small molecules bind to their target proteins. Quantifying the strength of these interactions (the so-called 'binding affinity') is a grand challenge of computational chemistry, surmounting which could revolutionize drug design and provide the platform for patient-specific medicine. Recently, evidence from blind challenge predictions and retrospective validation studies has suggested that molecular dynamics (MD) can now achieve useful predictive accuracy (~1 kcal/mol). This accuracy is sufficient to greatly accelerate hit-to-lead and lead optimization. Translating these advances in predictive accuracy into clinical and/or industrial decision making requires that binding free energy results be turned around on reduced timescales without loss of accuracy. This demands advances in algorithms, scalable software systems, and intelligent and efficient utilization of supercomputing resources. This work is motivated by the real-world problem of providing insight from drug candidate data on as short a time scale as possible. Specifically, we reproduce results from a collaborative project between UCL and GlaxoSmithKline to study a congeneric series of drug candidates binding to the BRD4 protein, inhibitors of which have shown promising preclinical efficacy in pathologies ranging from cancer to inflammation. We demonstrate the use of a framework called HTBAC, designed to support the aforementioned requirements of accurate and rapid drug binding affinity calculations. HTBAC facilitates the execution of large numbers of simulations while supporting the adaptive execution of algorithms. Furthermore, HTBAC enables the selection of simulation parameters during runtime, which can, in principle, optimize the use of computational resources whilst producing results within a target uncertainty.
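    As an illustration of the adaptive allocation idea, the following hedged sketch (hypothetical code, not HTBAC's actual API; the free energies are synthetic) keeps launching simulation replicas until the standard error of the binding free energy estimate drops below a target uncertainty.

    ```python
    # Sketch of adaptive replica allocation: run more simulations only while
    # the estimate is still outside the target uncertainty. Illustrative
    # only; a real workflow would dispatch MD jobs to an HPC resource.
    import statistics
    import random

    random.seed(42)

    def run_replica():
        """Stand-in for one MD binding free energy calculation (hypothetical)."""
        return random.gauss(-8.5, 1.0)  # kcal/mol, synthetic noise

    TARGET_SEM = 0.25  # target uncertainty in kcal/mol (illustrative value)
    results = [run_replica() for _ in range(4)]  # minimum initial batch

    while statistics.stdev(results) / len(results) ** 0.5 > TARGET_SEM:
        results.append(run_replica())  # adaptively allocate more resources

    mean = statistics.fmean(results)
    sem = statistics.stdev(results) / len(results) ** 0.5
    print(f"dG = {mean:.2f} +/- {sem:.2f} kcal/mol after {len(results)} replicas")
    ```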

    Automated Machine Learning for Multi-Label Classification


    From Facility to Application Sensor Data: Modular, Continuous and Holistic Monitoring with DCDB

    Today's HPC installations are highly complex systems, and their complexity will only increase as we move to exascale and beyond. At each layer, from facilities to systems, from runtimes to applications, a wide range of tuning decisions must be made in order to achieve efficient operation. This, however, requires systematic and continuous monitoring of system and user data. While many insular solutions exist, a system for holistic and facility-wide monitoring is still lacking in the current HPC ecosystem. In this paper we introduce DCDB, a comprehensive monitoring system capable of integrating data from all system levels. It is designed as a modular and highly scalable framework based on a plugin infrastructure. All monitored data is aggregated in a distributed NoSQL data store for analysis and cross-system correlation. We demonstrate the performance and scalability of DCDB, and describe two use cases in the areas of energy management and characterization. Comment: Accepted at the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) 2019.
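    The plugin-based design can be illustrated with a short sketch. The interface below is hypothetical, not DCDB's actual plugin API: each plugin wraps one sensor, and a collector aggregates timestamped readings, which a real deployment would write to a distributed NoSQL store.

    ```python
    # Illustrative plugin-style monitoring design in the spirit of DCDB
    # (hypothetical interface and values, not DCDB's real implementation).
    import time
    from abc import ABC, abstractmethod

    class SensorPlugin(ABC):
        name: str

        @abstractmethod
        def read(self) -> float:
            """Return the current sensor value."""

    class CPUTempPlugin(SensorPlugin):
        name = "cpu_temp"
        def read(self) -> float:
            return 55.0  # stand-in for a real hardware counter read

    class PowerPlugin(SensorPlugin):
        name = "node_power"
        def read(self) -> float:
            return 310.0  # watts, synthetic value

    def collect(plugins, store):
        """Poll every plugin once and append timestamped readings to the store."""
        ts = time.time()
        for p in plugins:
            store.append((ts, p.name, p.read()))  # a real system would write
                                                  # to a distributed NoSQL store

    store: list = []
    collect([CPUTempPlugin(), PowerPlugin()], store)
    print(store)
    ```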

    Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges

    Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial-and-error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods (for example, based on resampling error estimation for supervised machine learning) can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization. This article is categorized under: Algorithmic Development > Statistics; Technologies > Machine Learning; Technologies > Prediction.
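    As a baseline for the methods surveyed, here is a minimal random-search HPO sketch using cross-validated accuracy as the objective; the search space and budget are illustrative. More advanced methods (e.g. Bayesian optimization, Hyperband, racing) replace the random proposal step and add model guidance or early stopping.

    ```python
    # Minimal random-search HPO sketch: sample configurations, score each by
    # cross-validation, and keep the best. Search space and budget are
    # illustrative choices, not recommendations from the paper.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=300, random_state=0)

    best_score, best_cfg = -np.inf, None
    for _ in range(20):  # budget: 20 random configurations
        cfg = {
            "n_estimators": int(rng.integers(10, 200)),
            "max_depth": int(rng.integers(2, 16)),
        }
        score = cross_val_score(RandomForestClassifier(**cfg, random_state=0),
                                X, y, cv=3).mean()
        if score > best_score:
            best_score, best_cfg = score, cfg

    print(best_cfg, round(best_score, 3))
    ```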

    Applications

    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine, for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production, and milling, for quality control during manufacturing processes; in traffic and logistics, for smart cities; and for mobile communications.

    Fundamentals

    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, through summarization and clustering, to the different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to their resource requirements and to how scalability can be enhanced on diverse computing architectures, ranging from embedded systems to large computing clusters.

    Fifty years of pulsar candidate selection: from simple filters to a new principled real-time classification approach

    Improving survey specifications are causing an exponential rise in pulsar candidate numbers and data volumes. We study the candidate filters used to mitigate these problems during the past fifty years. We find that some existing methods, such as applying constraints on the total number of candidates collected per observation, may have detrimental effects on the success of pulsar searches. Those methods immune to such effects are found to be ill-equipped to deal with the problems associated with increasing data volumes and candidate numbers, motivating the development of new approaches. We therefore present a new method designed for on-line operation. It selects promising candidates using a purpose-built tree-based machine learning classifier, the Gaussian Hellinger Very Fast Decision Tree (GH-VFDT), and a new set of features for describing candidates. The features have been chosen so as to i) maximise the separation between candidates arising from noise and those of probable astrophysical origin, and ii) be as survey-independent as possible. Using these features, our new approach can process millions of candidates in seconds (~1 million every 15 seconds), with high levels of pulsar recall (90%+). This technique is therefore applicable to the large volumes of data expected to be produced by the Square Kilometre Array (SKA). Use of this approach has assisted in the discovery of 20 new pulsars in data obtained during the LOFAR Tied-Array All-Sky Survey (LOTAAS). Comment: Accepted for publication in MNRAS, 20 pages, 8 figures. See http://www.jb.man.ac.uk/pulsar/Surveys.html for survey data, and https://dx.doi.org/10.6084/m9.figshare.3080389.v1 for our data.
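    The skew-insensitive splitting idea behind the GH-VFDT can be illustrated with the Hellinger distance criterion used in Hellinger distance decision trees. The sketch below is illustrative, not the authors' implementation, and the candidate counts are made up: the score depends only on the fractions of each class sent down each branch, so a heavy imbalance between noise and pulsar candidates does not swamp the criterion.

    ```python
    # Hellinger distance between the class-conditional distributions induced
    # by a binary split (higher = better class separation). Illustrative
    # sketch of the criterion, not the GH-VFDT streaming implementation.
    import math

    def hellinger_split_score(pos_left, pos_total, neg_left, neg_total):
        """Score a candidate split from per-class branch counts."""
        p_l, n_l = pos_left / pos_total, neg_left / neg_total  # left branch
        p_r, n_r = 1 - p_l, 1 - n_l                            # right branch
        return math.sqrt((math.sqrt(p_l) - math.sqrt(n_l)) ** 2 +
                         (math.sqrt(p_r) - math.sqrt(n_r)) ** 2)

    # A split sending 90% of pulsars left but only 5% of noise left scores
    # highly, regardless of how imbalanced the two classes are.
    print(hellinger_split_score(pos_left=90, pos_total=100,
                                neg_left=5_000, neg_total=100_000))  # ~0.98
    ```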