22,579 research outputs found

    Consciousness is learning: predictive processing systems that learn by binding may perceive themselves as conscious

    Full text link
    Machine learning algorithms have achieved superhuman performance in specific complex domains. Yet learning online from few examples and efficiently generalizing across domains remains elusive. In humans such learning proceeds via declarative memory formation and is closely associated with consciousness. Predictive processing has been advanced as a principled Bayesian inference framework for understanding the cortex as implementing deep generative perceptual models for both sensory data and action control. However, predictive processing offers little direct insight into fast compositional learning or the mystery of consciousness. Here we propose that through implementing online learning by hierarchical binding of unpredicted inferences, a predictive processing system may flexibly generalize in novel situations by forming working memories for perceptions and actions from single examples, which can become short- and long-term declarative memories retrievable by associative recall. We argue that the contents of such working memories are unified yet differentiated, can be maintained by selective attention and are consistent with observations of masking, postdictive perceptual integration, and other paradigm cases of consciousness research. We describe how the brain could have evolved to use perceptual value prediction for reinforcement learning of complex action policies simultaneously implementing multiple survival and reproduction strategies. 'Conscious experience' is how such a learning system perceptually represents its own functioning, suggesting an answer to the meta problem of consciousness. Our proposal naturally unifies feature binding, recurrent processing, and predictive processing with global workspace, and, to a lesser extent, the higher order theories of consciousness.Comment: This version adds 5 figures (new) and only modifies the text to reference the figure

    Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery

    Full text link
    Robot assembly discovery is a challenging problem that lives at the intersection of resource allocation and motion planning. The goal is to combine a predefined set of objects to form something new while considering task execution with the robot-in-the-loop. In this work, we tackle the problem of building arbitrary, predefined target structures entirely from scratch using a set of Tetris-like building blocks and a robotic manipulator. Our novel hierarchical approach aims at efficiently decomposing the overall task into three feasible levels that benefit mutually from each other. On the high level, we run a classical mixed-integer program for global optimization of block-type selection and the blocks' final poses to recreate the desired shape. Its output is then exploited to efficiently guide the exploration of an underlying reinforcement learning (RL) policy. This RL policy draws its generalization properties from a flexible graph-based representation that is learned through Q-learning and can be refined with search. Moreover, it accounts for the necessary conditions of structural stability and robotic feasibility that cannot be effectively reflected in the previous layer. Lastly, a grasp and motion planner transforms the desired assembly commands into robot joint movements. We demonstrate our proposed method's performance on a set of competitive simulated RAD environments, showcase real-world transfer, and report performance and robustness gains compared to an unstructured end-to-end approach. Videos are available at https://sites.google.com/view/rl-meets-milp

    Towards practical reinforcement learning for tokamak magnetic control

    Full text link
    Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control. However, there are still significant drawbacks compared to traditional feedback control approaches for magnetic confinement. In this work, we address key drawbacks of the RL method; achieving higher control accuracy for desired plasma properties, reducing the steady-state error, and decreasing the required time to learn new tasks. We build on top of \cite{degrave2022magnetic}, and present algorithmic improvements to the agent architecture and training procedure. We present simulation results that show up to 65\% improvement in shape accuracy, achieve substantial reduction in the long-term bias of the plasma current, and additionally reduce the training time required to learn new tasks by a factor of 3 or more. We present new experiments using the upgraded RL-based controllers on the TCV tokamak, which validate the simulation results achieved, and point the way towards routinely achieving accurate discharges using the RL approach
    • …
    corecore