
    Beyond correlation: optimal transport metrics for characterizing representational stability and remapping in neurons encoding spatial memory

Introduction: Spatial representations in the entorhinal cortex (EC) and hippocampus (HPC) are fundamental to cognitive functions like navigation and memory. These representations, embodied in spatial field maps, dynamically remap in response to environmental changes. However, current methods, such as Pearson's correlation coefficient, struggle to capture the complexity of these remapping events, especially when fields do not overlap or transformations are non-linear. This limitation hinders our understanding and quantification of remapping, a key aspect of spatial memory function.
Methods: We propose a family of metrics based on the Earth Mover's Distance (EMD) as a versatile framework for characterizing remapping.
Results: The EMD provides a granular, noise-resistant, and rate-robust description of remapping. This approach enables the identification of specific cell types and the characterization of remapping in various scenarios, including disease models. Furthermore, the EMD's properties can be manipulated to identify spatially tuned cell types and to explore remapping as it relates to alternate forms of information such as spatiotemporal coding.
Discussion: We present a feasible, lightweight approach that complements traditional methods. Our findings underscore the potential of the EMD as a powerful tool for enhancing our understanding of remapping in the brain and its implications for spatial navigation, memory studies, and beyond.
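
To make the proposed metric concrete, here is a minimal sketch of computing an EMD between two 2D firing-rate maps with the POT optimal-transport library; the toy maps and Euclidean ground cost are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
import ot  # Python Optimal Transport: pip install POT

def rate_map_emd(map_a: np.ndarray, map_b: np.ndarray) -> float:
    """Earth Mover's Distance between two 2D firing-rate maps."""
    ny, nx = map_a.shape
    # Bin-centre coordinates; the ground cost is Euclidean distance in bins.
    coords = np.array([(y, x) for y in range(ny) for x in range(nx)], float)
    cost = ot.dist(coords, coords, metric="euclidean")
    # Normalise each map to unit mass, as EMD compares distributions.
    a = map_a.ravel() / map_a.sum()
    b = map_b.ravel() / map_b.sum()
    return ot.emd2(a, b, cost)

# A field that shifts by four bins gets a small, graded distance, while
# Pearson correlation is already ~0 once the fields stop overlapping.
before = np.zeros((16, 16)); before[4:8, 4:8] = 1.0
after = np.zeros((16, 16)); after[4:8, 8:12] = 1.0
print(rate_map_emd(before, after))  # ~4.0 bins of displacement
```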

    Industry Herding in Crypto Assets

Peer reviewed postprint

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Safe Collaborative Filtering

Excellent tail performance is crucial for modern machine learning tasks, such as algorithmic fairness, class imbalance, and risk-sensitive decision making, as it ensures the effective handling of challenging samples within a dataset. Tail performance is also a vital determinant of success for personalised recommender systems, where it reduces the risk of losing users with low satisfaction. This study introduces a "safe" collaborative filtering method that prioritises recommendation quality for less-satisfied users rather than focusing on the average performance. Our approach minimises the conditional value at risk (CVaR), the average loss over the tail of the users' loss distribution. To overcome computational challenges for web-scale recommender systems, we develop a robust yet practical algorithm that extends the most scalable method, implicit alternating least squares (iALS). Empirical evaluation on real-world datasets demonstrates the excellent tail performance of our approach while maintaining competitive computational efficiency.
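
For intuition, here is a minimal sketch of the CVaR objective described above: the mean loss over the worst α-fraction of users. The toy losses and the plain empirical estimator are illustrative; the paper's contribution is an iALS-style algorithm that optimises this quantity at scale.

```python
import numpy as np

def cvar(user_losses: np.ndarray, alpha: float = 0.3) -> float:
    """Mean loss over the worst alpha-fraction of users (the alpha-tail)."""
    k = max(1, int(np.ceil(alpha * len(user_losses))))
    return np.sort(user_losses)[-k:].mean()

losses = np.array([0.1, 0.2, 0.2, 0.9, 1.5])
print(losses.mean())      # 0.58 -- average performance looks fine
print(cvar(losses, 0.4))  # 1.20 -- the tail tells a different story
```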

    Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

We consider the reinforcement learning (RL) problem with general utilities, which consists of maximizing a function of the state-action occupancy measure. Beyond the standard cumulative-reward RL setting, this problem includes constrained RL, pure exploration, and learning from demonstrations as particular cases, among others. For this problem, we propose a simpler single-loop parameter-free normalized policy gradient algorithm. Implementing a recursive momentum variance reduction mechanism, our algorithm achieves $\tilde{\mathcal{O}}(\epsilon^{-3})$ and $\tilde{\mathcal{O}}(\epsilon^{-2})$ sample complexities for $\epsilon$-first-order stationarity and $\epsilon$-global optimality, respectively, under adequate assumptions. We further address the setting of large finite state-action spaces via linear function approximation of the occupancy measure and show an $\tilde{\mathcal{O}}(\epsilon^{-4})$ sample complexity for a simple policy gradient method with a linear regression subroutine.
Comment: 48 pages, 2 figures, ICML 2023; this paper was initially submitted on January 26th, 2023
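
A rough sketch of the single-loop recipe the abstract describes: a normalized policy gradient step driven by a recursive-momentum (STORM-style) variance-reduced direction. `sample_batch` and `grad_est` are placeholders for trajectory sampling and an occupancy-measure gradient estimator; this is a schematic under those assumptions, not the paper's algorithm verbatim.

```python
import numpy as np

def normalized_pg_storm(theta, sample_batch, grad_est,
                        steps=1000, eta=0.01, beta=0.1):
    """theta: policy parameters; grad_est(theta, batch) -> gradient estimate."""
    d, theta_prev = None, None
    for _ in range(steps):
        batch = sample_batch(theta)          # fresh rollout data
        g = grad_est(theta, batch)
        if d is None:
            d = g
        else:
            # Recursive momentum: correct the previous direction using the
            # gradient difference evaluated on the SAME batch.
            d = g + (1.0 - beta) * (d - grad_est(theta_prev, batch))
        theta_prev = theta
        # Normalized step: only the direction matters, which is what makes
        # the method insensitive to problem-dependent step-size tuning.
        theta = theta + eta * d / (np.linalg.norm(d) + 1e-12)
    return theta
```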

    Adaptive Robotic Information Gathering via Non-Stationary Gaussian Processes

Robotic Information Gathering (RIG) is a foundational research topic that asks how a robot (or robot team) collects informative data to efficiently build an accurate model of an unknown target function under robot embodiment constraints. RIG has many applications, including but not limited to autonomous exploration and mapping, 3D reconstruction or inspection, search and rescue, and environmental monitoring. A RIG system relies on a probabilistic model's prediction uncertainty to identify critical areas for informative data collection. Gaussian Processes (GPs) with stationary kernels have been widely adopted for spatial modeling. However, real-world spatial data is typically non-stationary: different locations do not have the same degree of variability. As a result, the prediction uncertainty does not accurately reveal prediction error, limiting the success of RIG algorithms. We propose a family of non-stationary kernels named the Attentive Kernel (AK), which is simple and robust and can extend any existing kernel to a non-stationary one. We evaluate the new kernel in elevation mapping tasks, where the AK provides better accuracy and uncertainty quantification than the commonly used stationary kernels and the leading non-stationary kernels. The improved uncertainty quantification guides the downstream informative planner to collect more valuable data around high-error areas, further increasing prediction accuracy. A field experiment demonstrates that the proposed method can guide an Autonomous Surface Vehicle (ASV) to prioritize data collection in locations with significant spatial variations, enabling the model to characterize salient environmental features.
Comment: International Journal of Robotics Research (IJRR). arXiv admin note: text overlap with arXiv:2205.0642
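
The idea can be sketched in a few lines: mix fixed-lengthscale RBF kernels with input-dependent weights, which keeps the mixture a valid (positive semi-definite) kernel while letting the effective lengthscale vary over space. The hand-rolled `weight_fn` stands in for the small network the Attentive Kernel learns; this is a toy in the same spirit, not the paper's exact parameterization.

```python
import numpy as np

def rbf(x1, x2, ls):
    """Stationary RBF kernel matrix with lengthscale ls."""
    d2 = np.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def attentive_mixture(x1, x2, lengthscales, weight_fn):
    """k(x, x') = sum_m w_m(x) w_m(x') k_m(x, x'), a non-stationary kernel."""
    w1, w2 = weight_fn(x1), weight_fn(x2)  # (n, M) mixture weights per input
    K = np.zeros((x1.shape[0], x2.shape[0]))
    for m, ls in enumerate(lengthscales):
        K += np.outer(w1[:, m], w2[:, m]) * rbf(x1, x2, ls)
    return K

# Toy weights: one side of the domain leans on the short lengthscale,
# the other on the long one (in the paper these weights are learned).
def weight_fn(x):
    w_short = 1.0 / (1.0 + np.exp(-4.0 * (x[:, 0] - 0.5)))
    return np.column_stack([w_short, 1.0 - w_short])

x = np.random.rand(50, 2)
K = attentive_mixture(x, x, lengthscales=[0.05, 0.5], weight_fn=weight_fn)
```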

    Discrete sequential games with random payoffs

In this thesis we study games with random payoffs as a generalization of the standard concept of a game in game theory. We discuss possible optimality conditions for these types of games. One of these approaches uses the concept of an α-Nash equilibrium; for this generalization of the Nash equilibrium we prove existence in the case where the payoff has only a finite number of realizations. We then apply the optimality concepts developed for single-stage games to games with multiple stages. In the practical part of the thesis we consider an application to competition among internet providers, which we model with a generalized version of the Cournot model of duopoly. We compare the results of our optimal strategy with deterministic approaches to this problem.
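
To make the practical part concrete, here is a toy Cournot duopoly with a random demand intercept, solved by iterating best responses on expected profit. The uniform demand distribution, unit cost, and quantity grid are illustrative assumptions, not the thesis's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_profit(q_i, q_j, cost=1.0, n=10_000):
    """Firm i's expected profit with a random inverse-demand intercept A."""
    A = rng.uniform(8.0, 12.0, size=n)        # the random payoff source
    price = np.maximum(A - (q_i + q_j), 0.0)  # inverse demand P = A - Q
    return np.mean((price - cost) * q_i)

def best_response(q_j, grid=np.linspace(0.0, 6.0, 241)):
    return grid[np.argmax([expected_profit(q, q_j) for q in grid])]

q1 = q2 = 1.0
for _ in range(30):                           # best-response dynamics
    q1, q2 = best_response(q2), best_response(q1)
# With E[A] = 10 and c = 1 the classic deterministic answer is q* = 3,
# so the stochastic solution should land nearby.
print(q1, q2)
```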

    Reinforcement learning in large, structured action spaces: A simulation study of decision support for spinal cord injury rehabilitation

Reinforcement learning (RL) has helped improve decision-making in several applications. However, applying traditional RL is challenging in some applications, such as rehabilitation of people with a spinal cord injury (SCI). Among other factors, using RL in this domain is difficult because there are many possible treatments (i.e., a large action space) and few patients (i.e., limited training data). Treatments for SCIs have natural groupings, so we propose two approaches to grouping treatments so that an RL agent can learn effectively from limited data. One relies on domain knowledge of SCI rehabilitation, and the other learns similarities among treatments using an embedding technique. We then use Fitted Q Iteration to train an agent that learns optimal treatments. Through a simulation study designed to reflect the properties of SCI rehabilitation, we find that both methods can help improve the treatment decisions of physiotherapists, but the approach based on domain knowledge offers better performance. Our findings provide a "proof of concept" that RL can be used to help improve the treatment of those with an SCI and indicate that continued efforts to gather data and apply RL to this domain are worthwhile.
Comment: 31 pages, 7 figures
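
A minimal sketch of the pipeline the abstract describes: map each treatment to a group (by domain knowledge or a learned embedding), then run Fitted Q Iteration over the reduced action space. The dataset layout, group map, and extra-trees regressor are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(transitions, group_of, n_groups, n_iters=50, gamma=0.95):
    """transitions: (state, treatment, reward, next_state) tuples;
    group_of: dict mapping each treatment to a small group id."""
    S = np.array([t[0] for t in transitions], dtype=float)
    A = np.array([group_of[t[1]] for t in transitions], dtype=float)
    R = np.array([t[2] for t in transitions], dtype=float)
    S2 = np.array([t[3] for t in transitions], dtype=float)
    X = np.column_stack([S, A])
    q = None
    for _ in range(n_iters):
        if q is None:
            y = R  # first iteration: one-step rewards
        else:
            # Bellman backup: max over the handful of grouped actions,
            # which is what makes the small action space tractable.
            q_next = np.stack([
                q.predict(np.column_stack([S2, np.full(len(S2), a)]))
                for a in range(n_groups)], axis=1)
            y = R + gamma * q_next.max(axis=1)
        q = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X, y)
    return q  # greedy treatment group: argmax over a of q([s, a])
```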

    The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning

While distributional reinforcement learning (RL) has demonstrated empirical success, the question of when and why it is beneficial has remained unanswered. In this work, we provide one explanation for the benefits of distributional RL through the lens of small-loss bounds, which scale with the instance-dependent optimal cost. If the optimal cost is small, our bounds are stronger than those from non-distributional approaches. As a warmup, we show that learning the cost distribution leads to small-loss regret bounds in contextual bandits (CB), and we find that distributional CB empirically outperforms the state-of-the-art on three challenging tasks. For online RL, we propose a distributional version-space algorithm that constructs confidence sets using maximum likelihood estimation, and we prove that it achieves small-loss regret in tabular MDPs and enjoys small-loss PAC bounds in latent variable models. Building on similar insights, we propose a distributional offline RL algorithm based on the pessimism principle and prove that it enjoys small-loss PAC bounds, which exhibit a novel robustness property. For both online and offline RL, our results provide the first theoretical benefits of learning distributions even when we only need the mean for making decisions.
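
As a toy version of the warmup result, the sketch below fits a full cost distribution per arm by (Laplace-smoothed) maximum likelihood over discretized costs and acts on the fitted mean. Contexts are dropped for brevity, and the ε-greedy exploration is a crude stand-in for the paper's confidence-set-based algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)
n_arms, n_bins = 3, 11
bins = np.linspace(0.0, 1.0, n_bins)  # discretized cost support
counts = np.ones((n_arms, n_bins))    # Laplace-smoothed MLE counts

def fitted_means():
    probs = counts / counts.sum(axis=1, keepdims=True)
    return probs @ bins               # mean of each fitted distribution

def act(eps=0.1):
    if rng.random() < eps:            # exploration stand-in
        return int(rng.integers(n_arms))
    return int(np.argmin(fitted_means()))  # decide using only the mean

true_means = [0.2, 0.5, 0.8]
for _ in range(5000):
    a = act()
    cost = np.clip(rng.normal(true_means[a], 0.1), 0.0, 1.0)
    counts[a, np.argmin(np.abs(bins - cost))] += 1  # update the MLE

print(fitted_means())  # should recover roughly [0.2, 0.5, 0.8]
```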