Beyond correlation: optimal transport metrics for characterizing representational stability and remapping in neurons encoding spatial memory
Introduction: Spatial representations in the entorhinal cortex (EC) and hippocampus (HPC) are fundamental to cognitive functions like navigation and memory. These representations, embodied in spatial field maps, dynamically remap in response to environmental changes. However, current methods, such as Pearson's correlation coefficient, struggle to capture the complexity of these remapping events, especially when fields do not overlap or transformations are non-linear. This limitation hinders our understanding and quantification of remapping, a key aspect of spatial memory function.
Methods: We propose a family of metrics based on the Earth Mover's Distance (EMD) as a versatile framework for characterizing remapping.
Results: The EMD provides a granular, noise-resistant, and rate-robust description of remapping. This approach enables the identification of specific cell types and the characterization of remapping in various scenarios, including disease models. Furthermore, the EMD's properties can be manipulated to identify spatially tuned cell types and to explore remapping as it relates to alternate forms of information such as spatiotemporal coding.
Discussion: We present a feasible, lightweight approach that complements traditional methods. Our findings underscore the potential of the EMD as a powerful tool for enhancing our understanding of remapping in the brain and its implications for spatial navigation, memory studies, and beyond.
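As a hedged illustration of the idea (not the authors' implementation), the sketch below compares Pearson correlation with the EMD, computed via SciPy's `wasserstein_distance`, on two toy 1-D rate maps whose place fields do not overlap; all firing rates here are made up.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two hypothetical 1-D firing-rate maps: the same place field shifted along the track.
positions = np.arange(10)
field_a = np.array([0, 0, 1, 4, 1, 0, 0, 0, 0, 0], dtype=float)
field_b = np.array([0, 0, 0, 0, 0, 0, 1, 4, 1, 0], dtype=float)  # shifted, non-overlapping

# Pearson correlation cannot express how far the field moved once the fields
# stop overlapping; it only registers that the maps disagree.
r = np.corrcoef(field_a, field_b)[0, 1]

# The EMD treats the maps as mass distributions and reports the transport cost,
# which grows smoothly with the shift even for disjoint fields (here, 4 bins).
emd = wasserstein_distance(positions, positions, field_a, field_b)

print(f"Pearson r = {r:.2f}, EMD = {emd:.2f}")
```

The EMD of 4.0 directly reads off the field shift, while the correlation is just a small negative number regardless of whether the shift was 4 bins or 400.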
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Safe Collaborative Filtering
Excellent tail performance is crucial for modern machine learning tasks, such
as algorithmic fairness, class imbalance, and risk-sensitive decision making,
as it ensures the effective handling of challenging samples within a dataset.
Tail performance is also a vital determinant of success for personalised
recommender systems to reduce the risk of losing users with low satisfaction.
This study introduces a "safe" collaborative filtering method that prioritises
recommendation quality for less-satisfied users rather than focusing on the
average performance. Our approach minimises the conditional value at risk
(CVaR), which represents the average risk over the tails of users' loss. To
overcome computational challenges for web-scale recommender systems, we develop
a robust yet practical algorithm that extends the most scalable method,
implicit alternating least squares (iALS). Empirical evaluation on real-world
datasets demonstrates the excellent tail performance of our approach while
maintaining competitive computational efficiency.
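To make the objective concrete, here is a minimal sketch (not the paper's iALS-based solver) of what CVaR measures over per-user losses; the loss values are hypothetical.

```python
import numpy as np

def cvar(losses, alpha):
    """Mean of the worst alpha-fraction of per-user losses (CVaR at level alpha).

    Illustrative only: the paper optimises this quantity inside an iALS-style
    solver; here we merely show what the objective measures.
    """
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # worst losses first
    k = max(1, int(np.ceil(alpha * losses.size)))            # size of the tail
    return losses[:k].mean()

# Toy per-user recommendation losses (hypothetical numbers).
per_user_loss = np.array([0.1, 0.2, 0.2, 0.3, 0.9, 1.5])

# With alpha = 1/3 the objective focuses on the two least-satisfied users.
print(cvar(per_user_loss, alpha=1/3))  # (1.5 + 0.9) / 2 = 1.2
```

At alpha = 1 the tail covers everyone and CVaR reduces to the ordinary average loss, which is the non-"safe" objective the abstract contrasts against.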
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
We consider the reinforcement learning (RL) problem with general utilities
which consists in maximizing a function of the state-action occupancy measure.
Beyond the standard cumulative reward RL setting, this problem includes as
particular cases constrained RL, pure exploration and learning from
demonstrations among others. For this problem, we propose a simpler single-loop
parameter-free normalized policy gradient algorithm. Implementing a recursive
momentum variance reduction mechanism, our algorithm achieves $\tilde{\mathcal{O}}(\epsilon^{-3})$ and $\tilde{\mathcal{O}}(\epsilon^{-2})$ sample complexities for $\epsilon$-first-order stationarity and $\epsilon$-global optimality respectively, under adequate assumptions. We further address the setting of large finite state-action spaces via linear function approximation of the occupancy measure and show a $\tilde{\mathcal{O}}(\epsilon^{-4})$ sample complexity for a simple policy gradient method with a linear regression subroutine.
Comment: 48 pages, 2 figures, ICML 2023; this paper was initially submitted on January 26th, 2023.
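A rough sketch of the flavor of algorithm described: normalized gradient steps driven by a STORM-style recursive momentum estimator, applied here to a toy stochastic surrogate objective. Everything below (the objective, step sizes, and momentum parameter) is illustrative, not the paper's method.

```python
import numpy as np

def grad_fn(theta, seed):
    # Stochastic gradient of a toy concave surrogate -(theta - 1)^2
    # (a hypothetical stand-in for a general-utility policy gradient).
    noise = np.random.default_rng(seed).standard_normal(theta.shape)
    return -2.0 * (theta - 1.0) + 0.1 * noise

rng = np.random.default_rng(0)
theta = np.zeros(2)
d = grad_fn(theta, int(rng.integers(1 << 30)))  # recursive momentum estimator
beta = 0.5

for t in range(200):
    theta_prev = theta.copy()
    eta = 1.0 / (t + 10)                                     # decaying step size
    theta = theta + eta * d / (np.linalg.norm(d) + 1e-12)    # normalized ascent step
    # STORM-style recursion: the same sample (seed) is evaluated at the new
    # and the old iterate, correcting the running gradient estimate.
    seed = int(rng.integers(1 << 30))
    d = grad_fn(theta, seed) + (1 - beta) * (d - grad_fn(theta_prev, seed))

print(theta)  # approaches the maximizer (1, 1)
```

The single loop and the absence of any inner estimation phase are what make this family of updates attractive at scale.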
Adaptive Robotic Information Gathering via Non-Stationary Gaussian Processes
Robotic Information Gathering (RIG) is a foundational research topic that
answers how a robot (team) collects informative data to efficiently build an
accurate model of an unknown target function under robot embodiment
constraints. RIG has many applications, including but not limited to autonomous
exploration and mapping, 3D reconstruction or inspection, search and rescue,
and environmental monitoring. A RIG system relies on a probabilistic model's
prediction uncertainty to identify critical areas for informative data
collection. Gaussian Processes (GPs) with stationary kernels have been widely
adopted for spatial modeling. However, real-world spatial data is typically
non-stationary -- different locations do not have the same degree of
variability. As a result, the prediction uncertainty does not accurately reveal
prediction error, limiting the success of RIG algorithms. We propose a family
of non-stationary kernels named Attentive Kernel (AK), which is simple, robust,
and can extend any existing kernel to a non-stationary one. We evaluate the new
kernel in elevation mapping tasks, where AK provides better accuracy and
uncertainty quantification over the commonly used stationary kernels and the
leading non-stationary kernels. The improved uncertainty quantification guides
the downstream informative planner to collect more valuable data around the
high-error area, further increasing prediction accuracy. A field experiment
demonstrates that the proposed method can guide an Autonomous Surface Vehicle
(ASV) to prioritize data collection in locations with significant spatial
variations, enabling the model to characterize salient environmental features.
Comment: International Journal of Robotics Research (IJRR). arXiv admin note: text overlap with arXiv:2205.0642
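The Attentive Kernel itself learns its attention weights with a neural network; as a hedged sketch of the underlying idea, the toy kernel below mixes two RBF base kernels with hand-coded, input-dependent weights, producing a non-stationary kernel. The weighting function is hypothetical, not the paper's parameterization.

```python
import numpy as np

def rbf(x1, x2, ell):
    return np.exp(-0.5 * (x1 - x2) ** 2 / ell ** 2)

LENGTHSCALES = np.array([0.1, 1.0])  # one wiggly and one smooth base kernel

def attention(x):
    # Hand-coded stand-in for the learned attention network: favour the short
    # lengthscale on the "rough" half of the domain (x < 0) and the long one
    # elsewhere. L2-normalized so k(x, x) = 1.
    w = np.array([np.exp(-x), np.exp(x)])
    return w / np.linalg.norm(w)

def attentive_kernel(x1, x2):
    # k(x, x') = sum_m w_m(x) w_m(x') k_m(x, x'): similarity is high only when
    # both inputs attend to the same base lengthscale.
    w1, w2 = attention(x1), attention(x2)
    ks = rbf(x1, x2, LENGTHSCALES)
    return float(np.sum(w1 * w2 * ks))

# Non-stationarity: the same separation |x - x'| = 0.5 yields low similarity
# in the rough region (x < 0) but high similarity in the smooth region.
print(attentive_kernel(-2.0, -1.5), attentive_kernel(2.0, 2.5))
```

A stationary kernel would assign both pairs the same value; letting the effective lengthscale vary with location is what lets prediction uncertainty track the locally varying error.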
Discrete sequential games with random payoffs
In this thesis we study games with random payoffs as a generalization of the standard concept of a game in game theory. We discuss possible optimality conditions for these types of games. In one of these approaches, based on the concept of an α-Nash equilibrium, we prove the existence of this generalization of the Nash equilibrium for the case where the payoff has only a finite number of realizations. We then apply these concepts to games played over multiple stages. In the practical part of the thesis we consider an application to competition among internet service providers, which we model by a generalized version of the Cournot model of duopoly. We compare the results of our optimal strategy with deterministic approaches to the same problem.
Department of Probability and Mathematical Statistics, Faculty of Mathematics and Physics
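As a toy illustration of the Cournot-style setting with random payoffs (hypothetical numbers, and expected-profit optimality rather than the thesis's α-Nash criterion), best-response iteration finds the equilibrium quantities when the demand intercept has finitely many realizations:

```python
import numpy as np

# Cournot duopoly with a random demand intercept a (finitely many realizations)
# and marginal cost C; price is P = a - (q1 + q2). Firms maximise *expected*
# profit, one natural optimality notion for games with random payoffs.
A_VALS = np.array([8.0, 10.0, 12.0])   # demand realizations (hypothetical)
A_PROBS = np.array([0.25, 0.5, 0.25])  # their probabilities
C = 1.0                                # marginal cost

def best_response(q_other):
    # argmax_q E[(a - q - q_other - C) * q] = (E[a] - C - q_other) / 2
    ea = A_PROBS @ A_VALS
    return max(0.0, (ea - C - q_other) / 2.0)

q1 = q2 = 0.0
for _ in range(100):                   # best-response dynamics converge here
    q1, q2 = best_response(q2), best_response(q1)

print(q1, q2)  # both converge to (E[a] - C) / 3 = 3.0
```

Under expected payoffs the randomness collapses into E[a]; richer criteria such as α-Nash equilibria keep the whole payoff distribution in play.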
Reinforcement learning in large, structured action spaces: A simulation study of decision support for spinal cord injury rehabilitation
Reinforcement learning (RL) has helped improve decision-making in several
applications. However, applying traditional RL is challenging in some
applications, such as rehabilitation of people with a spinal cord injury (SCI).
Among other factors, using RL in this domain is difficult because there are
many possible treatments (i.e., large action space) and few patients (i.e.,
limited training data). Treatments for SCIs have natural groupings, so we
propose two approaches to grouping treatments so that an RL agent can learn
effectively from limited data. One relies on domain knowledge of SCI
rehabilitation and the other learns similarities among treatments using an
embedding technique. We then use Fitted Q Iteration to train an agent that
learns optimal treatments. Through a simulation study designed to reflect the
properties of SCI rehabilitation, we find that both methods can help improve
the treatment decisions of physiotherapists, but the approach based on domain
knowledge offers better performance. Our findings provide a "proof of concept"
that RL can be used to help improve the treatment of those with an SCI and
indicate that continued efforts to gather data and apply RL to this domain are worthwhile.
Comment: 31 pages, 7 figures
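A minimal sketch of the grouping idea in a one-step (bandit-style) analogue of Fitted Q Iteration; the treatment groups, rewards, and sample sizes below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 12 treatments fall into 3 groups of 4, and reward depends
# mainly on the group -- exactly the structure grouping tries to exploit.
GROUP_OF = np.repeat(np.arange(3), 4)            # domain-knowledge grouping
GROUP_REWARD = np.array([0.2, 0.5, 0.8])         # latent mean reward per group

# A small logged batch: far too few samples to estimate 12 actions separately.
actions = np.tile(np.arange(12), 3)[:30]
rewards = GROUP_REWARD[GROUP_OF[actions]] + 0.1 * rng.standard_normal(30)

# One regression step at the *group* level: estimate Q per group from the
# pooled samples, then act greedily with any action from the best group.
q_group = np.array([rewards[GROUP_OF[actions] == g].mean() for g in range(3)])
best_action = int(np.argmax(q_group[GROUP_OF]))

print(q_group, GROUP_OF[best_action])  # the chosen action comes from group 2
```

Pooling 8-12 samples per group gives usable value estimates where 2-3 samples per individual treatment would not, which is the data-efficiency argument the abstract makes.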
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning
While distributional reinforcement learning (RL) has demonstrated empirical
success, the question of when and why it is beneficial has remained unanswered.
In this work, we provide one explanation for the benefits of distributional RL
through the lens of small-loss bounds, which scale with the instance-dependent
optimal cost. If the optimal cost is small, our bounds are stronger than those
from non-distributional approaches. As warmup, we show that learning the cost
distribution leads to small-loss regret bounds in contextual bandits (CB), and
we find that distributional CB empirically outperforms the state-of-the-art on
three challenging tasks. For online RL, we propose a distributional
version-space algorithm that constructs confidence sets using maximum
likelihood estimation, and we prove that it achieves small-loss regret in the
tabular MDPs and enjoys small-loss PAC bounds in latent variable models.
Building on similar insights, we propose a distributional offline RL algorithm
based on the pessimism principle and prove that it enjoys small-loss PAC
bounds, which exhibit a novel robustness property. For both online and offline
RL, our results provide the first theoretical benefits of learning
distributions even when we only need the mean for making decisions.
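As a toy sketch of the warm-up point, that the full distribution is learned but only its mean is used to act, here is a two-arm bandit where maximum likelihood fits the whole (categorical) cost distribution and decisions use its mean. Arms, probabilities, and the round-robin exploration are hypothetical, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two arms with Bernoulli costs; the distributional learner fits the whole cost
# distribution (empirical frequencies are the MLE for a categorical model) but
# still picks the arm with the smaller *mean* cost.
TRUE_P = np.array([0.1, 0.4])            # hypothetical cost probabilities
counts = np.zeros((2, 2))                # counts[arm, cost] for costs {0, 1}

for t in range(500):
    arm = t % 2                          # round-robin exploration for simplicity
    cost = rng.random() < TRUE_P[arm]
    counts[arm, int(cost)] += 1

mle_dist = counts / counts.sum(axis=1, keepdims=True)  # MLE cost distributions
mean_cost = mle_dist @ np.array([0.0, 1.0])            # means from distributions
print(mean_cost, int(mean_cost.argmin()))              # arm 0 has the smaller mean cost
```

The point of the small-loss analysis is that fitting `mle_dist`, rather than regressing on the mean directly, yields regret bounds that shrink with the optimal cost, even though only `mean_cost` enters the decision.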