### Online Learning, Physics and Algorithms

In recent years, we have witnessed increasing cross-fertilization between computer science, statistics, optimization, and the statistical physics of learning; machine learning sits at the interface of these subjects. We start with an analysis in the statistical physics of learning, where we study properties of the loss landscape of simple neural network models using the computer science formalism of Constraint Satisfaction Problems. Some of the techniques we employ are probabilistic, while others have their roots in the study of disordered systems in the statistical physics literature.
After that, we focus mainly on online prediction problems, which were initially investigated in statistics but are now also very active areas of research in computer science and optimization, where they are studied in the adversarial case through the lens of (online) convex optimization. We are particularly interested in the cooperative setting, where we show that cooperation improves learning. More specifically, we give efficient algorithms and unify previous works under a simplified and more general framework.

### Moduli spaces of gauge theories in 3 dimensions

The objective of this thesis is to study the moduli spaces of pairs of mirror theories in 3 dimensions with N = 4 supersymmetry. The original conjecture of 3d mirror symmetry was motivated by the fact that in these pairs of theories the Higgs and Coulomb branches are swapped. After a brief introduction to supersymmetry, we will first focus on the Higgs branch. This will be investigated through the Hilbert series and the plethystic program.
The methods used for the Higgs branch are well known in the literature; the case of the Coulomb branch is more difficult, since it receives quantum corrections. We will explain how it is parametrized in terms of monopole operators and, having both the Higgs and Coulomb branches for theories with different gauge groups, we will be able to show how mirror symmetry works in the case of ADE theories. We will show in which cases these Yang-Mills vacua are equivalent to one-instanton moduli spaces.
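As a reference point for the plethystic program mentioned above, the central tool is the plethystic exponential; the standard definition (not specific to this thesis) for a function $f$ of fugacities with $f(0)=0$ is:

```latex
% Plethystic exponential: generating function of symmetric products
PE\big[f(t_1,\dots,t_n)\big] \;=\; \exp\!\left(\sum_{k=1}^{\infty}\frac{f(t_1^k,\dots,t_n^k)}{k}\right)
% e.g. a single generator of weight t gives PE[t] = 1/(1-t),
% the Hilbert series of the free polynomial ring C[x].
```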

### Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback

Endogeneity, i.e. the dependence of noise and covariates, is a common
phenomenon in real data due to omitted variables, strategic behaviours,
measurement errors, etc. In contrast, the existing analyses of stochastic online
linear regression with unbounded noise and linear bandits depend heavily on
exogeneity, i.e. the independence of noise and covariates. Motivated by this
gap, we study the over- and just-identified Instrumental Variable (IV)
regression, specifically Two-Stage Least Squares, for stochastic online
learning, and propose to use an online variant of Two-Stage Least Squares,
namely O2SLS. We show that O2SLS achieves $\mathcal O(d_{x}d_{z}\log^2 T)$
identification and $\widetilde{\mathcal O}(\gamma \sqrt{d_{z} T})$ oracle
regret after $T$ interactions, where $d_{x}$ and $d_{z}$ are the dimensions of
covariates and IVs, and $\gamma$ is the bias due to endogeneity. For
$\gamma=0$, i.e. under exogeneity, O2SLS exhibits $\mathcal O(d_{x}^2 \log^2
T)$ oracle regret, which is of the same order as that of the stochastic online
ridge regression. Then, we leverage O2SLS as an oracle to design OFUL-IV, a stochastic
linear bandit algorithm to tackle endogeneity. OFUL-IV yields
$\widetilde{\mathcal O}(\sqrt{d_{x}d_{z}T})$ regret that matches the regret
lower bound under exogeneity. For different datasets with endogeneity, we
experimentally demonstrate the efficiency of O2SLS and OFUL-IV.
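The batch 2SLS estimator underlying this approach can be computed online from running sufficient statistics. The sketch below is not the paper's O2SLS (it omits the regret analysis and any confidence-set machinery); it is a minimal online two-stage least squares on invented synthetic data, with an illustrative instrument matrix `Pi` and confounder `u`:

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_z, T = 2, 3, 20000
theta_true = np.array([1.0, -0.5])

# Running sufficient statistics for 2SLS, updated one sample at a time.
S_zz = 1e-6 * np.eye(d_z)      # small ridge for invertibility
S_zx = np.zeros((d_z, d_x))
S_zy = np.zeros(d_z)

Pi = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, 0.7]])  # instrument strength
for t in range(T):
    z = rng.normal(size=d_z)                              # instruments (exogenous)
    u = rng.normal()                                      # unobserved confounder
    x = Pi.T @ z + 0.8 * u + 0.1 * rng.normal(size=d_x)   # endogenous covariates
    y = x @ theta_true + u + 0.1 * rng.normal()           # noise correlated with x
    S_zz += np.outer(z, z)
    S_zx += np.outer(z, x)
    S_zy += z * y

# 2SLS estimate from the running sums (over-identified: d_z > d_x).
W = np.linalg.solve(S_zz, S_zx)                # first stage: project x on z
theta_hat = np.linalg.solve(W.T @ S_zx, W.T @ S_zy)
print(theta_hat)  # close to theta_true despite endogeneity
```

Ordinary least squares on the same stream would be biased by the confounder `u`; instrumenting through `z` removes that bias.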

### Cooperative Online Learning

In this preliminary (and unpolished) version of the paper, we study an
asynchronous online learning setting with a network of agents. At each time
step, some of the agents are activated, requested to make a prediction, and pay
the corresponding loss. Some feedback is then revealed to these agents and is
later propagated through the network. We consider the case of full, bandit, and
semi-bandit feedback. In particular, we construct a reduction to delayed
single-agent learning that applies to both the full and the bandit feedback
case and allows us to obtain regret guarantees for both settings. We complement
these results with a near-matching lower bound.
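To give a flavor of the delayed single-agent primitive that the reduction targets, here is a minimal sketch (not the paper's algorithm): exponential weights in which each round's feedback arrives only after a delay. The fixed delay, the Bernoulli loss model, and the learning-rate choice are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
K, T, delay = 5, 5000, 10
eta = np.sqrt(np.log(K) / (T * (1 + delay)))  # delay-aware learning rate

cum_loss_seen = np.zeros(K)   # losses whose feedback has already arrived
pending = []                  # queue of (arrival_time, loss_vector)
total, best = 0.0, np.zeros(K)
means = rng.uniform(0.2, 0.8, size=K)         # per-arm Bernoulli loss means

for t in range(T):
    # apply any feedback that has arrived by time t
    while pending and pending[0][0] <= t:
        cum_loss_seen += pending.pop(0)[1]
    w = np.exp(-eta * (cum_loss_seen - cum_loss_seen.min()))
    p = w / w.sum()
    losses = rng.binomial(1, means).astype(float)
    total += p @ losses       # expected loss of the randomized player
    best += losses            # cumulative loss of each fixed arm
    pending.append((t + delay, losses))

regret = total - best.min()
print(regret / T)             # per-round regret shrinks as T grows
```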

### Clustering of solutions in the symmetric binary perceptron

The geometrical features of the (non-convex) loss landscape of neural network
models are crucial in ensuring successful optimization and, most importantly,
the capability to generalize well. While minimizers' flatness consistently
correlates with good generalization, there has been little rigorous work
exploring the conditions for the existence of such minimizers, even in toy models.
Here we consider a simple neural network model, the symmetric perceptron, with
binary weights. Phrasing the learning problem as a constraint satisfaction
problem, the analogue of a flat minimizer becomes a large and dense cluster of
solutions, while the narrowest minimizers are isolated solutions. We perform
the first steps toward the rigorous proof of the existence of a dense cluster
in certain regimes of the parameters, by computing the first and second moment
upper bounds for the existence of pairs of arbitrarily close solutions.
Moreover, we present a non-rigorous derivation of the same bounds for sets of
$y$ solutions at fixed pairwise distances.
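For very small instances, the constraint-satisfaction view can be made concrete by brute force. The sketch below is purely illustrative (the paper's regimes are asymptotic in $n$, and the instance size and threshold here are invented): it enumerates the solutions of a symmetric binary perceptron, i.e. weights $w \in \{-1,+1\}^n$ with $|\langle w, \xi_\mu\rangle| \le \kappa\sqrt{n}$ for all patterns, and inspects their pairwise Hamming distances:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, m, kappa = 14, 8, 0.8
Xi = rng.normal(size=(m, n))   # random Gaussian patterns

# Enumerate all 2^n binary weight vectors and keep the satisfying ones.
solutions = []
for bits in itertools.product([-1.0, 1.0], repeat=n):
    w = np.array(bits)
    if np.all(np.abs(Xi @ w) <= kappa * np.sqrt(n)):
        solutions.append(w)

S = np.array(solutions)
print(len(S), "solutions")
if len(S) > 1:
    # Hamming distances between solutions: a dense cluster shows up as
    # many pairs at small distance.
    D = (n - S @ S.T) / 2
    print("min nonzero pairwise distance:", int(D[D > 0].min()))
```

Note the symmetry of the constraints: $w$ is a solution if and only if $-w$ is, so solutions come in antipodal pairs.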

### AdaStop: adaptive statistical testing for sound comparisons of Deep RL agents

Recently, the scientific community has questioned the statistical
reproducibility of many empirical results, especially in the field of machine
learning. To solve this reproducibility crisis, we propose a theoretically
sound methodology to compare the overall performance of multiple algorithms
with stochastic returns. We exemplify our methodology in Deep RL. Indeed, the
performance of one execution of a Deep RL algorithm is random. Therefore,
several independent executions are needed to accurately evaluate the overall
performance. When comparing several RL algorithms, a major question is how many
executions must be made and how we can ensure that the results of such a
comparison are theoretically sound. When comparing several algorithms at once,
the error of each comparison may accumulate and must be taken into account with
a multiple tests procedure to preserve low error guarantees. We introduce
AdaStop, a new statistical test based on multiple group sequential tests. When
comparing algorithms, AdaStop adapts the number of executions to stop as early
as possible while ensuring that we have enough information to distinguish
algorithms that perform better than the others in a statistically significant
way. We prove theoretically and empirically that AdaStop has a low probability
of making a (family-wise) error. Finally, we illustrate the effectiveness of
AdaStop in multiple Deep RL use cases, including toy examples and challenging
MuJoCo environments. AdaStop is the first statistical test fitted to this sort
of comparison: it is both a significant contribution to statistics and a major
contribution to computational studies performed in reinforcement learning
and in other domains. To summarize our contribution, we introduce AdaStop, a
formally grounded statistical tool to let anyone answer the practical question:
"Is my algorithm the new state-of-the-art?"
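To give a flavor of group sequential testing, here is a toy sketch; this is not the AdaStop procedure itself, and the stand-in agents, effect sizes, batch schedule, and crude Bonferroni error-budget split are all invented for illustration. Two agents' returns are compared batch by batch with a permutation test, stopping as soon as the evidence is strong enough:

```python
import numpy as np

rng = np.random.default_rng(3)

def run_agent(mean, n):
    # Stand-in for n independent training runs of an agent with noisy returns.
    return rng.normal(mean, 1.0, size=n)

alpha, batch, max_batches = 0.05, 10, 5
a, b = np.array([]), np.array([])
decision = "no difference detected"
for k in range(1, max_batches + 1):
    a = np.concatenate([a, run_agent(1.0, batch)])
    b = np.concatenate([b, run_agent(0.0, batch)])
    obs = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    # Two-sample permutation test on the difference of means.
    perms = np.array([
        abs(p[:len(a)].mean() - p[len(a):].mean())
        for p in (rng.permutation(pooled) for _ in range(2000))
    ])
    p_val = (1 + np.sum(perms >= obs)) / 2001
    if p_val <= alpha / max_batches:   # crude split of the error budget
        decision = f"difference detected after {len(a)} runs per agent"
        break
print(decision)
```

When the gap between agents is large, the loop stops after few batches; when it is small, the test keeps collecting runs up to the budget, which is the adaptive behaviour the abstract describes.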