
    Active Inference: Demystified and Compared

    Active inference is a first-principles account of how autonomous agents operate in dynamic, nonstationary environments. This problem is also considered in reinforcement learning, but limited work exists comparing the two approaches on the same discrete-state environments. In this letter, we provide (1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in reinforcement learning, and (2) an explicit discrete-state comparison between active inference and reinforcement learning on an OpenAI gym baseline. We begin with a condensed overview of the active inference literature, in particular viewing the various natural behaviors of active inference agents through the lens of reinforcement learning. We show that by operating in a purely belief-based setting, active inference agents can carry out epistemic exploration, and account for uncertainty about their environment, in a Bayes-optimal fashion. Furthermore, we show that the reliance on an explicit reward signal in reinforcement learning is removed in active inference, where reward can simply be treated as another observation we have a preference over; even in the total absence of rewards, agent behaviors are learned through preference learning. We make these properties explicit by showing two scenarios in which active inference agents can infer behaviors in reward-free environments, compared against both Q-learning and Bayesian model-based reinforcement learning agents, by placing zero prior preference over rewards and instead learning prior preferences over the observations that correspond to reward. We conclude by noting that this formalism can be applied to more complex settings (e.g., robotic arm movement, Atari games) if appropriate generative models can be formulated. In short, we aim to demystify the behavior of active inference agents by presenting an accessible discrete state-space and time formulation, and demonstrate these behaviors in an OpenAI gym environment alongside reinforcement learning agents.
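
A minimal sketch of the policy-selection mechanism described above, assuming a generic discrete-state generative model rather than the paper's own agent: candidate policies are scored by expected free energy, whose risk term measures how far predicted observations stray from prior preferences (so a preferred "reward" observation replaces an explicit reward signal) and whose ambiguity term drives epistemic exploration. The matrices A, B, C and all numbers below are illustrative assumptions.

```python
import numpy as np

# Hypothetical discrete-state active inference setup (not the paper's code):
# A is the likelihood P(o|s), B[a] the transition P(s'|s,a), and C a
# log-preference over observations (e.g. a preferred "reward" observation).

def expected_free_energy(qs, policy, A, B, C):
    """Score one policy (a sequence of actions) by accumulated expected free energy."""
    G = 0.0
    for a in policy:
        qs = B[a] @ qs                      # predicted state belief under the policy
        qo = A @ qs                         # predicted observation distribution
        risk = np.sum(qo * (np.log(qo + 1e-16) - C))                      # KL[Q(o) || P(o)]: deviation from preferences
        ambiguity = -np.sum(qs * np.sum(A * np.log(A + 1e-16), axis=0))   # expected entropy of observations given states
        G += risk + ambiguity
    return G

# Toy 2-state, 2-observation, 2-action example with made-up numbers.
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])                  # P(o|s)
B = [np.array([[1.0, 1.0], [0.0, 0.0]]),    # action 0: move to state 0
     np.array([[0.0, 0.0], [1.0, 1.0]])]    # action 1: move to state 1
C = np.log(np.array([0.8, 0.2]))            # prefer observation 0
qs = np.array([0.5, 0.5])                   # current state belief

policies = [[0, 0], [0, 1], [1, 0], [1, 1]]
scores = np.array([expected_free_energy(qs, p, A, B, C) for p in policies])
probs = np.exp(-scores) / np.exp(-scores).sum()   # softmax over negative expected free energy
print(dict(zip(map(tuple, policies), np.round(probs, 3))))
```

Softmaxing the negative scores yields a distribution over policies, so actions that both resolve uncertainty and realize preferred observations become more probable, which is the belief-based behavior the abstract contrasts with reward-driven reinforcement learning.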

    Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels

    One important assumption underlying common classification models is the stationarity of the data. However, in real-world streaming applications, the data concept, indicated by the joint distribution of features and labels, is not stationary but drifts over time. Concept drift detection aims to detect such drifts and adapt the model so as to mitigate any deterioration in its predictive performance. Unfortunately, most existing concept drift detection methods rely on the strong and over-optimistic assumption that true labels are available immediately for all already classified instances. In this paper, a novel Hierarchical Hypothesis Testing framework with a Request-and-Reverify strategy is developed to detect concept drifts by requesting labels only when necessary. Two methods, namely Hierarchical Hypothesis Testing with Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed under this framework. In experiments with benchmark datasets, our methods demonstrate overwhelming advantages over state-of-the-art unsupervised drift detectors. More importantly, our methods even outperform DDM (the widely used supervised drift detector) when we use significantly fewer labels. Comment: Published as a conference paper at IJCAI 201
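
A hedged sketch of the request-and-reverify idea, not the HHT-CU/HHT-AG procedures themselves: a first, label-free layer monitors a statistic derived from classification uncertainty, and only when that layer flags a candidate drift does a second layer request true labels to confirm it. The window handling, tests, and thresholds below are assumptions; `model`, `request_labels`, and the reference statistics are placeholders supplied by the caller.

```python
import numpy as np

# Illustrative two-layer detector: layer 1 watches an unlabeled statistic
# (classifier uncertainty) and only when it looks suspicious does layer 2
# request true labels for the current window to confirm the drift.

def layer1_suspicious(ref_unc, cur_unc, threshold=2.5):
    """Flag a drift candidate if mean uncertainty shifts by > threshold standard errors."""
    se = np.sqrt(ref_unc.var() / len(ref_unc) + cur_unc.var() / len(cur_unc))
    return abs(cur_unc.mean() - ref_unc.mean()) > threshold * se

def layer2_confirm(model, X_cur, request_labels, ref_error, threshold=0.1):
    """Request labels for the current window only, then compare error rates."""
    y_cur = request_labels(X_cur)                  # the expensive step, used sparingly
    cur_error = np.mean(model.predict(X_cur) != y_cur)
    return cur_error - ref_error > threshold

def detect_drift(model, ref_unc, X_cur, cur_unc, request_labels, ref_error):
    if not layer1_suspicious(ref_unc, cur_unc):
        return False                               # no labels requested at all
    return layer2_confirm(model, X_cur, request_labels, ref_error)
```

The design point is simply that labels are only spent on windows the cheap test already finds suspicious, which is how the paper's framework reduces labeling cost relative to fully supervised detectors such as DDM.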

    A Real-Time Unsupervised Neural Network for the Low-Level Control of a Mobile Robot in a Nonstationary Environment

    This article introduces a real-time, unsupervised neural network that learns to control a two-degree-of-freedom mobile robot in a nonstationary environment. The neural controller, termed the neural NETwork MObile Robot Controller (NETMORC), combines associative learning and Vector Associative Map (VAM) learning to generate transformations between spatial and velocity coordinates. As a result, the controller learns the wheel velocities required to reach a target at an arbitrary distance and angle. The transformations are learned during an unsupervised training phase, during which the robot moves as a result of randomly selected wheel velocities. The robot learns the relationship between these velocities and the resulting incremental movements. Aside from being able to reach stationary or moving targets, the NETMORC structure also enables the robot to perform successfully in spite of disturbances in the environment, such as wheel slippage, or changes in the robot's plant, including changes in wheel radius, changes in inter-wheel distance, or changes in the internal time step of the system. Finally, the controller is extended to include a module that learns an internal odometric transformation, allowing the robot to reach targets when visual input is sporadic or unreliable. Sloan Fellowship (BR-3122), Air Force Office of Scientific Research (F49620-92-J-0499)
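
A rough sketch of the babbling-then-inverse-map idea described above, under assumed plant parameters and with ordinary linear least squares standing in for the paper's associative and vector-associative-map learning: the robot samples random wheel velocities, records the resulting incremental motion, and fits the inverse mapping used to command wheel velocities for a desired distance and heading change.

```python
import numpy as np

# Hypothetical differential-drive plant; wheel radius, inter-wheel distance,
# and time step are made up for illustration.
R, L, DT = 0.05, 0.30, 0.1          # wheel radius (m), inter-wheel distance (m), time step (s)

def forward(v_left, v_right):
    """Differential-drive kinematics: wheel speeds -> (distance, heading change) per step."""
    v = R * (v_left + v_right) / 2.0
    w = R * (v_right - v_left) / L
    return np.array([v * DT, w * DT])

# Unsupervised babbling phase: random wheel velocities and the observed motions.
rng = np.random.default_rng(0)
V = rng.uniform(-5.0, 5.0, size=(500, 2))          # sampled [v_left, v_right]
M = np.array([forward(vl, vr) for vl, vr in V])    # observed [d_dist, d_theta]

# Fit the inverse model M -> V (exact here because this toy plant is linear).
W, *_ = np.linalg.lstsq(M, V, rcond=None)

# Use it: wheel velocities that should advance 0.2 m while turning 0.3 rad.
target_motion = np.array([0.2, 0.3])
v_cmd = target_motion @ W
print(v_cmd, forward(*v_cmd))        # check that the commanded motion is reproduced
```

Because the inverse map is learned from the robot's own motion rather than from a fixed analytic model, refitting it on fresh babbling data also adapts to plant changes of the kind the abstract mentions, such as altered wheel radius or inter-wheel distance.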