
Algorithmic statistics, prediction and machine learning

Author: Milovanov Alexey
Publication date: 17/09/2015

Algorithmic statistics considers the following problem: given a binary string $x$ (e.g., some experimental data), find a "good" explanation of this data. It uses algorithmic information theory to define formally what a good explanation is. In this paper we extend this framework in two directions. First, the explanations are not only interesting in themselves but are also used for prediction: we want to know what kind of data we may reasonably expect in similar situations (repeating the same experiment). We show that some kind of hierarchy can be constructed both in terms of algorithmic statistics and using the notion of a priori probability, and these two approaches turn out to be equivalent. Second, a more realistic approach, which goes back to machine learning theory, assumes that we have not a single data string $x$ but some set of "positive examples" $x_1,\ldots,x_l$ that all belong to some unknown set $A$, a property that we want to learn. We want this set $A$ to contain all positive examples and to be as small and simple as possible. We show how algorithmic statistics can be extended to cover this situation. Comment: 22 pages

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges

Authors: Becker Marc; Binder Martin; Bischl Bernd; Boulesteix Anne‐Laure; Coors Stefan; Deng Difan; Lang Michel; Lindauer Marius; Pielok Tobias; Richter Jakob; Thomas Janek; Ullmann Theresa
Publication venue: Hoboken, NJ : Wiley
Publication date: 01/01/2023

Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial-and-error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods (for example, based on resampling error estimation for supervised machine learning) can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization. This article is categorized under: Algorithmic Development > Statistics; Technologies > Machine Learning; Technologies > Prediction
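As a rough illustration of the simplest method named in this abstract, the following is a minimal sketch of random search driven by a resampling error estimate. It is not taken from the paper: the toy data generator, the 1-D k-nearest-neighbour classifier, the search space for `k`, and all function names are assumptions made purely for illustration.

```python
import random

def make_data(n, rng):
    # Hypothetical toy data: label is 1 if x > 0.5, with 10% label noise.
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    # Majority vote among the k nearest training points (k is odd, so no ties).
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(2 * votes > k)

def cv_error(data, k, folds=5):
    # Resampling estimate: misclassification rate pooled over all folds.
    errors = 0
    for f in range(folds):
        test = data[f::folds]
        train = [p for i, p in enumerate(data) if i % folds != f]
        errors += sum(knn_predict(train, x, k) != y for x, y in test)
    return errors / len(data)

def random_search(data, budget, rng):
    # Draw odd k uniformly from an assumed search space and keep the best.
    best_k, best_err = None, float("inf")
    for _ in range(budget):
        k = rng.randrange(1, 40, 2)
        err = cv_error(data, k)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err

rng = random.Random(0)
data = make_data(200, rng)
best_k, best_err = random_search(data, budget=10, rng=rng)
print(best_k, best_err)
```

Grid search would replace the random draw of `k` with a loop over a fixed list of candidates; the cross-validated error estimate would stay the same.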

Institutionelles Repositorium der Leibniz Universität Hannover

Alternative methods to quantify variable importance in ecology

Authors: Huettmann Falk; Oppel Steffen; Strobl Carolin
Publication date: 01/01/2009

Open Access LMU

Algorithmic Randomness as Foundation of Inductive Reasoning and Artificial Intelligence

Author: Hutter Marcus
Publication date: 01/01/2010

This article is a brief personal account of the past, present, and future of algorithmic randomness, emphasizing its role in inductive inference and artificial intelligence. It is written for a general audience interested in science and philosophy. Intuitively, randomness is a lack of order or predictability. If randomness is the opposite of determinism, then algorithmic randomness is the opposite of computability. Besides many other things, these concepts have been used to quantify Ockham's razor, solve the induction problem, and define intelligence. Comment: 9 LaTeX pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

The Australian National University

The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning

Authors: Corbett-Davies Sam; Goel Sharad
Publication date: 14/08/2018

The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes (like race, gender, and their proxies) are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area.
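To make definition (2) concrete, here is a minimal sketch of how classification parity could be checked by comparing false positive and false negative rates across groups. It is not from the paper: the function name, the record layout, and the tiny hand-made dataset are all hypothetical.

```python
from collections import defaultdict

def error_rates_by_group(records):
    # records: iterable of (group, y_true, y_pred) with binary labels.
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        if y_true == 1:
            c["pos"] += 1
            c["fn"] += int(y_pred == 0)
        else:
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            # A rate is undefined (None) when the group has no such cases.
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
    return rates

# Hypothetical decisions for two groups: (group, true outcome, prediction).
records = [
    ("a", 1, 1), ("a", 1, 0), ("a", 0, 0), ("a", 0, 1), ("a", 0, 0), ("a", 0, 0),
    ("b", 1, 1), ("b", 1, 1), ("b", 0, 1), ("b", 0, 0),
]
rates = error_rates_by_group(records)
fpr_gap = abs(rates["a"]["fpr"] - rates["b"]["fpr"])
print(rates, fpr_gap)
```

In this toy data the two groups have different false positive rates, so classification parity in the sense of definition (2) does not hold. The abstract's argument is that forcing such gaps to zero is not by itself a guarantee of equitable decisions.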

arXiv.org e-Print Archive