
    Active Inference: Demystified and Compared

    Active inference is a first-principles account of how autonomous agents operate in dynamic, nonstationary environments. This problem is also considered in reinforcement learning, but limited work exists comparing the two approaches on the same discrete-state environments. In this letter, we provide (1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in reinforcement learning, and (2) an explicit discrete-state comparison between active inference and reinforcement learning on an OpenAI gym baseline. We begin with a condensed overview of the active inference literature, in particular viewing the various natural behaviors of active inference agents through the lens of reinforcement learning. We show that by operating in a pure belief-based setting, active inference agents can carry out epistemic exploration, and account for uncertainty about their environment, in a Bayes-optimal fashion. Furthermore, we show that the reliance on an explicit reward signal in reinforcement learning is removed in active inference, where reward can simply be treated as another observation we have a preference over; even in the total absence of rewards, agent behaviors are learned through preference learning. We make these properties explicit in two scenarios: active inference agents inferring behaviors in reward-free environments, compared to both Q-learning and Bayesian model-based reinforcement learning agents, and agents placing zero prior preferences over rewards while learning prior preferences over the observations corresponding to reward. We conclude by noting that this formalism can be applied to more complex settings (e.g., robotic arm movement, Atari games) if appropriate generative models can be formulated. In short, we aim to demystify the behavior of active inference agents by presenting an accessible discrete state-space and time formulation, and to demonstrate these behaviors in an OpenAI gym environment alongside reinforcement learning agents.
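    To make the discrete-state machinery concrete, the sketch below scores policies by expected free energy, split into a risk term (divergence of predicted observations from prior preferences) and an ambiguity term. The likelihood matrix A, transition tensor B, preference vector and policy set are placeholders for whatever generative model a task defines; this is an illustrative sketch of the standard discrete formulation, not the paper's exact implementation.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def expected_free_energy(qs, A, B, policies, log_C):
    """Expected free energy G for each policy (a policy is a sequence of actions).

    qs    : current posterior over hidden states, shape (S,)
    A     : likelihood matrix P(o|s), shape (O, S)
    B     : transition tensor P(s'|s, a), shape (S, S, num_actions)
    log_C : log prior preferences over observations, shape (O,)
    """
    H_A = -np.sum(A * np.log(A + 1e-16), axis=0)   # entropy of P(o|s) for each state
    G = np.zeros(len(policies))
    for i, policy in enumerate(policies):
        qs_pi = qs.copy()
        for a in policy:
            qs_pi = B[:, :, a] @ qs_pi             # predicted states under the policy
            qo_pi = A @ qs_pi                      # predicted observations

            # Risk: divergence of predicted observations from preferred observations
            risk = qo_pi @ (np.log(qo_pi + 1e-16) - log_C)
            # Ambiguity: expected observation entropy under the predicted states
            ambiguity = qs_pi @ H_A
            G[i] += risk + ambiguity
    return G

# Policies with lower expected free energy are more probable:
# q_pi = softmax(-expected_free_energy(qs, A, B, policies, log_C))
```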

    Deformable plate tectonic models of the southern North Atlantic

    Significant, poly-phase deformation occurred prior to, simultaneous with, and after the opening of the North Atlantic Ocean. Understanding this deformation history is essential for understanding the regional development and the mechanisms controlling rifting and subsequent failure or breakup. Here, we primarily use published constraints to construct deformable plate tectonic models for the southern North Atlantic from 200 Ma to present using GPlates. The aim of this work is to test both the capability of the GPlates deformable modelling approach and the reliability of published plate reconstructions. Overall, modelled crustal thickness values at 0 Ma produced from the deformable models show general, regional-scale similarities with values derived from the inversion of gravity data for crustal thickness. However, the deformable models typically underestimate thinning in marginal basins and overestimate crustal thickness in continental fragments compared to values from gravity inversion. This is possibly due to: 1) thinning occurring earlier than the 200 Ma start time modelled, 2) variations in the original crustal thickness, 3) depth-dependent stretching, 4) rigid blocks undergoing some degree of thinning, and 5) variations in the mesh density of the models. The results demonstrate that inclusion of micro-continental fragments, and locally defined limits of continental crust, generally produces results more akin to observations. One exception is the Grand Banks, where global GPlates models produce more realistic deformation, likely due to the inclusion of the exhumed domains continent-ward of the transition zone boundary. Results also indicate that Flemish Cap rotation is required to provide a reasonable fit between North America and Iberia, with the palaeo-position of the Flemish Cap likely to be the proto-Orphan sub-basins. Moreover, the East and West Orphan sub-basins formed separately due to the respective rotations of the Flemish Cap and the Orphan Knoll, which was likely associated with other continental fragments that subsequently contributed to the thicker crust forming the boundary between the East and West Orphan basins. The results also suggest a link between tectonic and magmatic processes. For example, the inclusion of an Orphan Knoll micro-continental block results in greater extension (higher beta factors) in the northern West Orphan Basin near the termination of the Charlie-Gibbs Fracture Zone, and the site of the Charlie-Gibbs Volcanic Province (CGVP). Thus, we infer that the CGVP was likely influenced by plate tectonic processes through the concentration of strain resulting from interaction in proximity to the transform system. Finally, marginal basins that were considered to be conjugate, and thus related, may only appear conjugate through later rotation of micro-continental blocks, and thus their genesis is not directly related.
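    For readers unfamiliar with the extension measure quoted above, the beta (stretching) factor is simply the ratio of initial to present crustal thickness, so thinner crust implies higher beta. The snippet below illustrates the kind of comparison the abstract describes; the 35 km reference thickness and the example profiles are hypothetical values, not numbers from the study.

```python
import numpy as np

def beta_factor(present_thickness_km, initial_thickness_km=35.0):
    """McKenzie-style stretching factor: initial over present crustal thickness.
    The 35 km reference thickness is an illustrative assumption, not a value
    taken from the study."""
    return initial_thickness_km / np.asarray(present_thickness_km)

# Hypothetical thickness profiles (km) across a marginal basin
modelled_thickness = np.array([22.0, 15.0, 10.0])   # from a deformable model
gravity_thickness = np.array([18.0, 11.0, 7.0])     # from gravity inversion
print(beta_factor(modelled_thickness))  # lower beta: the model underestimates thinning
print(beta_factor(gravity_thickness))   # higher beta implied by the thinner gravity-derived crust
```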

    Moment-based parameter estimation in binomial random intersection graph models

    Binomial random intersection graphs can be used as parsimonious statistical models of large and sparse networks, with one parameter for the average degree and another for transitivity, the tendency of neighbours of a node to be connected. This paper discusses the estimation of these parameters from a single observed instance of the graph, using moment estimators based on observed degrees and frequencies of 2-stars and triangles. The observed data set is assumed to be a subgraph induced by a set of $n_0$ nodes sampled from the full set of $n$ nodes. We prove the consistency of the proposed estimators by showing that the relative estimation error is small with high probability for $n_0 \gg n^{2/3} \gg 1$. As a byproduct, our analysis confirms that the empirical transitivity coefficient of the graph is with high probability close to the theoretical clustering coefficient of the model. Comment: 15 pages, 6 figures.
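    The observed quantities the estimators are built from are easy to compute from an induced subgraph. The sketch below tallies average degree, 2-stars and triangles and forms the empirical transitivity coefficient (3 × triangles / 2-stars); the mapping from these moments to the model's two parameters follows the estimators defined in the paper and is not reproduced here.

```python
import networkx as nx
from math import comb

def observed_moments(G_sub):
    """Average degree, 2-star count, triangle count and empirical transitivity
    of an observed induced subgraph: the raw ingredients of the moment
    estimators (the mapping to the model's parameters is left to the paper)."""
    degrees = [d for _, d in G_sub.degree()]
    avg_degree = sum(degrees) / G_sub.number_of_nodes()

    two_stars = sum(comb(d, 2) for d in degrees)          # paths of length two
    triangles = sum(nx.triangles(G_sub).values()) // 3    # nx counts each triangle once per incident node
    transitivity = 3 * triangles / two_stars if two_stars else 0.0
    return avg_degree, two_stars, triangles, transitivity

# Usage on a subgraph induced by n0 sampled nodes of a larger observed graph G:
# G_sub = G.subgraph(sampled_nodes)
# avg_deg, s2, tri, t = observed_moments(G_sub)
```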

    Learning General World Models in a Handful of Reward-Free Deployments

    Building generally capable agents is a grand challenge for deep reinforcement learning (RL). To approach this challenge practically, we outline two key desiderata: 1) to facilitate generalization, exploration should be task agnostic; 2) to facilitate scalability, exploration policies should collect large quantities of data without costly centralized retraining. Combining these two properties, we introduce the reward-free deployment efficiency setting, a new paradigm for RL research. We then present CASCADE, a novel approach for self-supervised exploration in this new setting. CASCADE seeks to learn a world model by collecting data with a population of agents, using an information theoretic objective inspired by Bayesian Active Learning. CASCADE achieves this by specifically maximizing the diversity of trajectories sampled by the population through a novel cascading objective. We provide theoretical intuition for CASCADE, which we show in a tabular setting improves upon naïve approaches that do not account for population diversity. We then demonstrate that CASCADE collects diverse task-agnostic datasets and learns agents that generalize zero-shot to novel, unseen downstream tasks on Atari, MiniGrid, Crafter and the DM Control Suite. Code and videos are available at https://ycxuyingchen.github.io/cascade/
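    As a rough illustration of the cascading idea, the toy below greedily grows a population of exploration behaviours, admitting at each step the candidate whose trajectory is least similar to those already selected. The state-histogram feature and minimum-distance score are stand-ins for CASCADE's information-theoretic objective, and NUM_STATES is a hypothetical tabular environment size; none of this is taken from the paper's implementation.

```python
import numpy as np

NUM_STATES = 16   # hypothetical tabular environment size

def state_histogram(trajectory):
    """Normalised state-visitation histogram of a trajectory (a list of state ids)."""
    h = np.zeros(NUM_STATES)
    for s in trajectory:
        h[s] += 1
    return h / max(len(trajectory), 1)

def cascading_selection(candidate_trajectories, population_size):
    """Greedily grow a population so that each new member's trajectory is as
    different as possible from those already selected: a toy stand-in for
    CASCADE's information-theoretic diversity objective."""
    feats = [state_histogram(t) for t in candidate_trajectories]
    chosen = [0]                                   # seed with an arbitrary first candidate
    remaining = list(range(1, len(candidate_trajectories)))
    while len(chosen) < population_size and remaining:
        # Score each remaining candidate by its distance to the closest chosen member
        best = max(remaining,
                   key=lambda i: min(np.linalg.norm(feats[i] - feats[j]) for j in chosen))
        remaining.remove(best)
        chosen.append(best)
    return chosen

# Example: pick 4 diverse exploration behaviours from 20 candidate rollouts
# population = cascading_selection(candidate_rollouts, population_size=4)
```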

    Modelling change with an integrated approach to manufacturing system design

    This paper proposes a model that integrates information from product, process and organisation domains with a view to helping manage these complex interrelationships with multiple layers of interaction. The model incorporates an integrated mechanism that simulates change effects during the design of complex manufacturing systems by populating a Multi-layered Domain Matrix (MDM) and applying a Change Prediction Model (CPM) propagation mechanism to interconnected elements.
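    To give a feel for matrix-based change propagation, the sketch below spreads an initiating change through a small dependency matrix whose entries are direct change likelihoods between elements. It uses a simple OR-combination of propagation paths over a few hops as a stand-in for the CPM mechanism; the four-element matrix and its likelihood values are hypothetical.

```python
import numpy as np

def propagate_change(likelihood, initiating, steps=3):
    """Spread an initiating change through a dependency matrix.

    likelihood[i, j] is the direct probability that a change in element j
    causes a change in element i; elements can span the product, process and
    organisation domains of an MDM. Paths are OR-combined over a fixed number
    of hops, a simplified stand-in for the CPM propagation mechanism."""
    reached = np.zeros(likelihood.shape[0])
    reached[initiating] = 1.0
    for _ in range(steps):
        # Probability that each element is NOT triggered by any affected element
        not_triggered = np.prod(1.0 - likelihood * reached[np.newaxis, :], axis=1)
        reached = np.maximum(reached, 1.0 - not_triggered)
    return reached

# Hypothetical 4-element slice of an MDM (values are illustrative only)
L = np.array([[0.0, 0.3, 0.0, 0.0],
              [0.5, 0.0, 0.2, 0.0],
              [0.4, 0.1, 0.0, 0.0],
              [0.0, 0.0, 0.6, 0.0]])
print(propagate_change(L, initiating=0))   # change risk reaching each element
```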

    Earnings Prediction with Deep Learning

    In the financial sector, a reliable forecast of the future financial performance of a company is of great importance for investors' investment decisions. In this paper we compare long short-term memory (LSTM) networks to temporal convolutional networks (TCNs) in the prediction of future earnings per share (EPS). The experimental analysis is based on quarterly financial reporting data and daily stock market returns. For a broad sample of US firms, we find that LSTMs outperform the naive persistent model with up to 30.0% more accurate predictions, while TCNs achieve an improvement of 30.8%. Both types of networks are at least as accurate as analysts and exceed them by up to 12.2% (LSTM) and 13.2% (TCN). Comment: 7 pages, 4 figures, 2 tables, submitted to KI202
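    A minimal version of the sequence model being compared might look like the sketch below: an LSTM reads a window of past quarterly features and regresses the next quarter's EPS, alongside a simple persistence baseline. The feature count, layer sizes and baseline definition are illustrative assumptions, not the architectures or baseline used in the paper.

```python
import torch
import torch.nn as nn

class EPSForecaster(nn.Module):
    """Minimal LSTM regressor for next-quarter EPS (illustrative only; feature
    count and layer sizes are assumptions, not the paper's architecture)."""
    def __init__(self, n_features=8, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, past_quarters, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # regress EPS from the last hidden state

def persistent_baseline(eps_history):
    """One common persistence baseline: predict the EPS reported four quarters
    ago (the paper's exact naive model may be defined differently)."""
    return eps_history[:, -4]              # eps_history: (batch, past_quarters)

# model = EPSForecaster()
# prediction = model(torch.randn(64, 20, 8))   # 64 firms, 20 quarters, 8 features
```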