
    Trustworthy Reinforcement Learning Against Intrinsic Vulnerabilities: Robustness, Safety, and Generalizability

    A trustworthy reinforcement learning algorithm should be competent in solving challenging real-world problems, including robustly handling uncertainties, satisfying safety constraints to avoid catastrophic failures, and generalizing to unseen scenarios during deployment. This survey overviews these main perspectives of trustworthy reinforcement learning in light of its intrinsic vulnerabilities in robustness, safety, and generalizability. In particular, we give rigorous formulations, categorize corresponding methodologies, and discuss benchmarks for each perspective. Moreover, we provide an outlook section to spur promising future directions, with a brief discussion of extrinsic vulnerabilities arising from human feedback. We hope this survey brings separate threads of study together in a unified framework and promotes the trustworthiness of reinforcement learning. (Comment: 36 pages, 5 figures)

    Clustering of Preferred Directions During Brain-Computer Interface Usage

    Brain-computer interfaces (BCIs) are proving to be viable clinical interventions for people with amyotrophic lateral sclerosis, amputations, and spinal cord injuries. To improve the viability of BCIs, it helps to have a thorough understanding of how the brain controls them. Neural activity during usage of certain BCIs behaves in a surprising and seemingly counterintuitive manner: the preferred directions (PDs) of neurons cluster together. We trained monkeys to reach to targets in a center-out task either using their arm or a BCI. We found that neurons’ PDs cluster similarly during training of the BCI decoder and usage of the BCI, but remain relatively unclustered when the monkeys use their arms. Modulation depths increase upon usage of the BCI, and narrowness of tuning tends to either increase or decrease rather than stay the same. In addition, the cluster direction can be predicted from per-target performance. A model in which two neurons’ PDs approach one another reveals how much modulation depths have to increase to maintain controllability. This thesis concludes with considerations of why this clustering might occur and whether or not it benefits BCI control.
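    Preferred directions of the kind discussed here are commonly estimated by fitting a cosine tuning curve to each neuron's firing rates across reach directions, and the degree of PD clustering can be summarized with a circular statistic. The minimal sketch below illustrates that general approach under simple assumptions (binned firing rates, target angles in radians); the function names and the simulated data are illustrative, not the thesis's actual analysis pipeline.

```python
import numpy as np

def fit_cosine_tuning(rates, angles):
    """Fit f(theta) = b0 + bx*cos(theta) + by*sin(theta) by least squares.

    rates:  (n_trials,) firing rates of one neuron
    angles: (n_trials,) reach or target directions in radians
    Returns the preferred direction (radians) and modulation depth.
    """
    X = np.column_stack([np.ones_like(angles), np.cos(angles), np.sin(angles)])
    b0, bx, by = np.linalg.lstsq(X, rates, rcond=None)[0]
    pd = np.arctan2(by, bx)      # direction of peak firing
    depth = np.hypot(bx, by)     # modulation depth of the tuning curve
    return pd, depth

def pd_clustering(pds):
    """Circular resultant vector length: 0 = PDs spread uniformly, 1 = fully clustered."""
    return np.abs(np.mean(np.exp(1j * np.asarray(pds))))

# Toy usage with simulated data (illustrative only)
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, size=200)
rates = 10 + 5 * np.cos(angles - 1.0) + rng.normal(0, 1, size=200)
pd, depth = fit_cosine_tuning(rates, angles)
print(pd, depth, pd_clustering([pd, pd + 0.1, pd - 0.2]))
```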

    Investigating the neural basis of learning using brain-computer interfaces

    Learning a new skill requires one to produce new patterns of activity among networks of neurons. This applies not only to physical skills, such as learning to play a new sport, but also to abstract skills, such as learning to play chess. An abstract skill that we can use to study the neural mechanisms of learning in general is controlling a brain-computer interface (BCI). BCIs were conceived of as assistive devices for people with paralysis, limb loss, or other neurological disorders, but they have also proven effective as tools to study the neural basis of sensory-motor control and learning. We tested the ability of subjects to generate the neural activity patterns required to control arbitrary BCI decoders. We found that subjects could more easily learn to control a decoder when they could use existing patterns of neural activity than when they needed to generate new patterns. We also analyzed the way in which subjects adapted their neural activity during learning, and found that the adaptation was consistent with the learning-related performance improvements and that the trial-to-trial variability of neural activity decreased as performance improved. We tested how specific properties of BCI decoders, which translate neural activity into movements of the effector, influence the ability to learn to control a BCI, by incorporating dimensionality reduction into a Kalman filter and assessing how performance related to the number of latent dimensions. We found that subjects could use a standard Kalman filter just as well as one that incorporates dimensionality reduction, and that as the dimensionality of the model increased, performance improved up to an asymptotic level. Lastly, we tested whether increasing the difficulty of a task would lead subjects to learn to demonstrate better BCI performance. We implemented an instructed-path task that required the animals to move a cursor along pre-defined paths, and we found that this task motivated one monkey to improve his performance. In all, these studies help to uncover what contributes to BCI control, and they help pave the way for transitioning BCIs from the lab to the clinic.
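    A Kalman filter decoder of the general kind mentioned above estimates the cursor state (e.g., velocity) from binned neural observations at each time step via a predict/update loop. The sketch below shows that loop under stated assumptions: the observations Y could be raw spike counts or latent factors produced by a separate dimensionality-reduction step, and the function and variable names are illustrative, not the decoder used in the thesis.

```python
import numpy as np

def kalman_decode(Y, A, W, C, Q, x0, P0):
    """Run a standard Kalman filter decoder over binned neural observations.

    Y:     (T, n_obs) observations per time bin (spike counts, or latent
           factors if dimensionality reduction is applied beforehand)
    A, W:  state dynamics matrix and process-noise covariance
    C, Q:  observation matrix and observation-noise covariance
    x0, P0: initial state estimate and its covariance
    Returns decoded states of shape (T, n_state), e.g. cursor velocities.
    """
    x, P = x0, P0
    states = []
    for y in Y:
        # Predict step: propagate the state estimate through the dynamics model
        x = A @ x
        P = A @ P @ A.T + W
        # Update step: correct the prediction with the new observation
        S = C @ P @ C.T + Q
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ (y - C @ x)
        P = (np.eye(len(x)) - K @ C) @ P
        states.append(x.copy())
    return np.array(states)
```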