Quantile Propagation for Wasserstein-Approximate Gaussian Processes
We develop a new approximate Bayesian inference method for Gaussian process models with factorized non-Gaussian likelihoods. Our method---dubbed Quantile Propagation (QP)---is similar to expectation propagation (EP) but minimizes the L_2 Wasserstein distance rather than the Kullback-Leibler (KL) divergence. We consider the case where likelihood factors are approximated by a Gaussian form. We show that QP matches quantile functions rather than moments as in EP, and that it has the same mean update but a smaller variance update than EP, thereby alleviating the over-estimation of the posterior variance exhibited by EP. Crucially, QP has the same favorable locality property as EP, and thereby admits an efficient algorithm. Experiments on classification and Poisson regression tasks demonstrate that QP outperforms both EP and variational Bayes.
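The key quantity here, the L_2 Wasserstein distance, has a closed form between one-dimensional Gaussians. The sketch below is only a minimal illustration of that distance (not the QP algorithm itself); the function name is ours:

```python
import math

def w2_gaussian(mu1, s1, mu2, s2):
    """Squared L2 Wasserstein distance between two 1-D Gaussians.

    In one dimension, W2 matches quantile functions: the optimal
    transport map sends the p-th quantile of one Gaussian to the
    p-th quantile of the other, which yields this closed form.
    """
    return (mu1 - mu2) ** 2 + (s1 - s2) ** 2

# Equal means: W2 penalizes the gap between standard deviations,
# not variances, which is why matching it tempers variance updates.
print(w2_gaussian(0.0, 1.0, 0.0, 2.0))  # 1.0
```

Unlike the KL divergence, this distance is symmetric in its two arguments.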
Approximate Inference for Non-parametric Bayesian Hawkes Processes and Beyond
The Hawkes process has been widely applied to modeling self-exciting events, including neuron spikes, earthquakes and tweets. To avoid designing parametric triggering kernels, the non-parametric Hawkes process has been proposed, in which the triggering kernel takes a non-parametric form. However, inference in such models suffers from poor scalability to large-scale datasets and sensitivity to uncertainty in the random finite samples. To deal with these issues, we employ Bayesian non-parametric Hawkes processes and propose two kinds of efficient approximate inference methods based on existing inference techniques. Although such techniques have served as the cornerstone of probabilistic methods based on Gaussian process priors, most of them approximately optimize standard divergence measures such as the Kullback-Leibler (KL) divergence, which lack basic desiderata for the task at hand and offer chiefly technical convenience. To improve on them, we further propose a more advanced Bayesian inference approach based on the Wasserstein distance, which is applicable to a wide range of models. Beyond these Bayesian methods, we also explore a robust frequentist estimation method. Efficient inference techniques for the Hawkes process will benefit all of its existing applications, from earthquake forecasting and finance to social media. Furthermore, the approximate inference techniques proposed in this thesis have the potential to be applied to other models to improve robustness and account for uncertainty.
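The self-exciting behavior the abstract describes can be made concrete with the conditional intensity of a Hawkes process. This sketch uses a standard exponential triggering kernel for illustration (the thesis itself treats non-parametric kernels); all parameter values are illustrative:

```python
import math

def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.0):
    """Conditional intensity of a Hawkes process with an exponential
    triggering kernel phi(s) = alpha * beta * exp(-beta * s).

    Each past event at t_i < t excites the process, temporarily
    raising the rate of further events; the effect decays at rate beta.
    """
    excitation = sum(alpha * beta * math.exp(-beta * (t - ti))
                     for ti in events if ti < t)
    return mu + excitation

events = [1.0, 1.5, 3.0]
print(hawkes_intensity(0.5, events))  # 0.5 (background rate only)
print(hawkes_intensity(3.1, events) > hawkes_intensity(0.5, events))  # True
```

Before any event has occurred, the intensity equals the background rate mu; just after an event it spikes and then decays back.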
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially in recent decades, so has our need for interpretable features. This thesis revolves around two paradigms to approach this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability can be seen as one of “parametrization selection”. We introduce a quantile-centric parametrization and show the advantages of our proposal in the context of regression, where it allows us to bridge the gap between classical generalized linear (mixed) models and increasingly popular quantile methods.
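The link between quantile methods and ordinary regression runs through the loss being minimized. As a hedged illustration (not the thesis's parametrization), the pinball loss below is the standard objective whose minimizer is a conditional quantile rather than a conditional mean:

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss at level tau in (0, 1).

    Minimizing its average over data estimates the tau-th conditional
    quantile, just as minimizing squared error estimates the mean.
    """
    u = y_true - y_pred
    return tau * u if u >= 0 else (tau - 1) * u

# At tau = 0.9, under-prediction costs nine times as much as
# over-prediction, pulling the fit up toward the 90th percentile.
print(pinball_loss(2.0, 1.0, 0.9))  # 0.9
print(pinball_loss(1.0, 2.0, 0.9))  # about 0.1
```

At tau = 0.5 the loss is proportional to the absolute error, recovering median regression.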
The second part of the thesis, concerned with topological learning, tackles the problem from a non-parametric perspective. As topology can be thought of as a way of characterizing data in terms of their connectivity structure, it allows us to represent complex and possibly high-dimensional data through a few features, such as the number of connected components, loops and voids. We illustrate how Topological Data Analysis, the emerging branch of statistics devoted to recovering topological structures in the data, can be exploited for both exploratory and inferential purposes, with a special emphasis on kernels that preserve the topological information in the data.
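The simplest of these connectivity features, the number of connected components, can be computed directly with a union-find pass over a point cloud at a fixed scale. This is only an illustrative sketch of the 0-dimensional picture (tracking the count as the scale grows is what a persistence computation does); the function is ours:

```python
from itertools import combinations

def connected_components(points, eps):
    """Number of connected components of the graph linking two 2-D
    points whenever they lie within distance eps of each other.

    This count at a single scale eps is the simplest topological
    summary; varying eps yields 0-dimensional persistence.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, p), (j, q) in combinations(enumerate(points), 2):
        if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= eps ** 2:
            parent[find(i)] = find(j)  # merge the two components

    return len({find(i) for i in range(len(points))})

# Two well-separated clusters: distinct at a small scale,
# merged into one component once eps exceeds the gap.
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0)]
print(connected_components(pts, 0.5))   # 2
print(connected_components(pts, 10.0))  # 1
```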
Finally, we show with an application how these two approaches can borrow strength from one another in the identification and description of brain activity through fMRI data from the ABIDE project.
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
Uncertainty quantification has been extensively used as a means to achieve efficient directed exploration in Reinforcement Learning (RL). However, state-of-the-art methods for continuous actions still suffer from high sample complexity requirements. Indeed, they either completely lack strategies for propagating the epistemic uncertainty throughout the updates, or they mix it with aleatoric uncertainty while learning the full return distribution (e.g., distributional RL). In this paper, we propose Wasserstein Actor-Critic (WAC), an actor-critic architecture inspired by the recent Wasserstein Q-Learning (WQL), which employs approximate Q-posteriors to represent the epistemic uncertainty and Wasserstein barycenters for uncertainty propagation across the state-action space. WAC enforces exploration in a principled way by guiding the policy learning process with the optimization of an upper bound of the Q-value estimates. Furthermore, we study some peculiar issues that arise when function approximation is coupled with uncertainty estimation, and propose a regularized loss for the uncertainty estimation. Finally, we evaluate our algorithm on standard MuJoCo tasks as well as a suite of continuous-action domains where exploration is crucial, in comparison with state-of-the-art baselines. Additional details and results can be found in the supplementary material of our arXiv preprint.
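WAC's actual architecture is more involved, but the two ingredients the abstract names can be sketched for one-dimensional Gaussian Q-posteriors, where the L2-Wasserstein barycenter has a closed form. The exploration coefficient c below is illustrative, not WAC's actual bound:

```python
def w2_barycenter(means, stds, weights):
    """L2-Wasserstein barycenter of one-dimensional Gaussians.

    For 1-D Gaussians the barycenter is again Gaussian, with
    weighted-average mean and weighted-average standard deviation,
    making uncertainty propagation across states cheap.
    """
    mu = sum(w * m for w, m in zip(weights, means))
    sigma = sum(w * s for w, s in zip(weights, stds))
    return mu, sigma

def optimistic_q(mu, sigma, c=2.0):
    """Optimistic upper bound used to direct exploration:
    mean plus a multiple of the epistemic standard deviation."""
    return mu + c * sigma

mu, sigma = w2_barycenter([1.0, 3.0], [0.5, 1.5], [0.5, 0.5])
print(mu, sigma)                # 2.0 1.0
print(optimistic_q(mu, sigma))  # 4.0
```

Note that, unlike moment averaging under the KL divergence, the Wasserstein barycenter averages standard deviations, so mixing equally uncertain posteriors does not shrink the propagated uncertainty.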
Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory
Risk-sensitive reinforcement learning (RL) has garnered significant attention
in recent years due to the growing interest in deploying RL agents in
real-world scenarios. A critical aspect of risk awareness involves modeling
highly rare risk events (rewards) that could potentially lead to catastrophic
outcomes. These infrequent occurrences present a formidable challenge for
data-driven methods aiming to capture such risky events accurately. While
risk-aware RL techniques do exist, their level of risk aversion heavily relies
on the precision of the state-action value function estimation when modeling
these rare occurrences. Our work proposes to enhance the resilience of RL
agents when faced with very rare and risky events by refining their
predictions of the extreme values of the state-action value function
distribution. To achieve this, we model the extreme values of the
state-action value function distribution as parameterized distributions,
drawing inspiration from the principles of extreme value theory (EVT). This
approach effectively addresses the issue of infrequent occurrence by leveraging
EVT-based parameterization. Importantly, we theoretically demonstrate the
advantages of employing these parameterized distributions in contrast to other
risk-averse algorithms. Our evaluations show that the proposed method outperforms other risk-averse RL algorithms on a diverse range of benchmark tasks, each encompassing distinct risk scenarios.
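The EVT-based parameterization the abstract refers to typically rests on the Generalized Pareto distribution (GPD), the limiting law of exceedances over a high threshold. As a hedged sketch (the function name and parameter values are ours, not the paper's), its quantile function shows how a fitted tail shape extrapolates to extreme outcomes:

```python
def gpd_quantile(p, sigma, xi):
    """p-th quantile of a Generalized Pareto distribution with
    scale sigma > 0 and shape xi != 0 (exceedances over a threshold).

    Fitting sigma and xi to the few observed tail samples lets an
    agent estimate extreme outcomes it has rarely, if ever, seen.
    """
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)

# A larger shape parameter xi means a heavier tail, pushing
# extreme quantiles further out for the same scale.
q_light = gpd_quantile(0.99, sigma=1.0, xi=0.1)
q_heavy = gpd_quantile(0.99, sigma=1.0, xi=0.5)
print(q_light < q_heavy)  # True
```

This is precisely why a parameterized tail helps: a handful of exceedances constrains (sigma, xi), and the closed form then yields any extreme quantile, instead of relying on empirical estimates that rare events leave badly undersampled.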