
    Extreme State Aggregation Beyond MDPs

    We consider a reinforcement learning setup in which an agent interacts with an environment in observation-reward-action cycles, without any assumptions (in particular, no MDP assumption) on the environment. State aggregation, and more generally feature reinforcement learning, is concerned with mapping histories/raw states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with the same state-space size solve the original problem, as long as the solution can be approximately represented as a function of the reduced states. This implies an upper bound on the required state-space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.
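
    The aggregation idea above admits a compact illustration: if a map phi sends histories to a small set of reduced states, one can simply run a standard tabular MDP algorithm on those reduced states. The sketch below is a minimal, assumption-laden Python illustration of that recipe; the env/phi interfaces are made up for the example, and Q-learning merely stands in for a generic MDP solver. It is not code from the paper.

    ```python
    import random
    from collections import defaultdict

    def q_learning_on_aggregated(env, phi, n_actions, episodes=500,
                                 alpha=0.1, gamma=0.99, eps=0.1):
        """Tabular Q-learning run on aggregated states phi(history).

        `env` is assumed to expose reset() -> obs and
        step(action) -> (obs, reward, done); `phi` maps the full
        observation-reward-action history to a reduced state.
        Both interfaces are illustrative assumptions.
        """
        Q = defaultdict(lambda: [0.0] * n_actions)
        for _ in range(episodes):
            obs = env.reset()
            history = [obs]
            done = False
            while not done:
                s = phi(tuple(history))                  # reduced/aggregated state
                if random.random() < eps:
                    a = random.randrange(n_actions)
                else:
                    a = max(range(n_actions), key=lambda i: Q[s][i])
                obs, r, done = env.step(a)
                history += [a, r, obs]
                s_next = phi(tuple(history))
                target = r + (0.0 if done else gamma * max(Q[s_next]))
                Q[s][a] += alpha * (target - Q[s][a])    # MDP-style update on reduced states
        return Q
    ```

    Q-learning here is just a stand-in: the paper's guarantee concerns the value functions and policies of the associated MDP, not any particular solver.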

    Intelligence as inference or forcing Occam on the world

    We propose to perform the optimization task of Universal Artificial Intelligence (UAI) by learning a reference machine on which good programs are short. We also acknowledge that the choice of reference machine on which the UAI objective is based is arbitrary, and therefore we learn a machine suited to the environment we are in. This is based on viewing Occam’s razor as an imperative rather than as a proposition about the world. Since this principle cannot be true for all reference machines, we need to find a machine that makes it true. We want both good policies and the environment itself to have short implementations on the machine. Such a machine is learnt iteratively through a procedure that generalizes the principle underlying the Expectation-Maximization algorithm.
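
    The iterative procedure is described above only at the level of its EM-like structure, so the following Python skeleton is a hedged illustration of that alternation; every callable in it (search_programs, refit_machine, score) is a placeholder invented for the sketch, not something specified in the paper.

    ```python
    def learn_reference_machine(machine, search_programs, refit_machine,
                                score, n_iters=10):
        """Generic EM-style alternation (illustrative placeholders only).

        E-like step: find programs/policies that score well and are short
        on the current reference machine.
        M-like step: adjust the machine so that those programs (and the
        environment's description) become shorter on it.
        """
        for _ in range(n_iters):
            programs = search_programs(machine, score)   # good, short programs
            machine = refit_machine(machine, programs)   # make them shorter
        return machine
    ```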

    Melting Pot 2.0

    Multi-agent artificial intelligence research promises a path to developing intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence; it provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Each scenario pairs a physical environment (a "substrate") with a reference set of co-players (a "background population") to create a social situation with substantial interdependence between the individuals involved. For instance, some scenarios were inspired by institutional-economics-based accounts of natural resource management and public-good-provision dilemmas. Others were inspired by considerations from evolutionary biology, game theory, and artificial life. Melting Pot aims to cover a maximally diverse set of interdependencies and incentives. It includes the commonly studied extreme cases of perfectly competitive (zero-sum) motivations and perfectly cooperative (shared-reward) motivations, but does not stop there. As in real life, a clear majority of scenarios in Melting Pot have mixed incentives: they are neither purely competitive nor purely cooperative, and thus demand that successful agents be able to navigate the resulting ambiguity. Here we describe Melting Pot 2.0, which revises and expands on Melting Pot. We also introduce support for scenarios with asymmetric roles and explain how to integrate them into the evaluation protocol. This report also contains (1) details of all substrates and scenarios, and (2) a complete description of all baseline algorithms and results. Our intention is for it to serve as a reference for researchers using Melting Pot 2.0.
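
    As a reading aid, the evaluation protocol described above can be summarized in a few lines of Python. This is a hedged sketch only: run_episode, the agent interface, and the way roles are filled are assumptions made for the illustration and do not reflect the actual Melting Pot 2.0 API (which the report itself documents).

    ```python
    from statistics import mean
    from typing import Callable, List, Sequence

    def evaluate_focal_agents(
        focal_agents: Sequence,            # agents under evaluation
        background_population: Sequence,   # fixed, held-out co-players
        run_episode: Callable[[List], List[float]],  # simulates one episode on the substrate
        n_episodes: int = 100,
    ) -> float:
        """Mean focal return when paired with novel social partners.

        A scenario pairs a substrate (physical environment) with a
        background population; generalization is measured by the focal
        agents' performance against co-players they never trained with.
        """
        focal_returns = []
        for _ in range(n_episodes):
            players = list(focal_agents) + list(background_population)
            returns = run_episode(players)                 # per-player episode returns
            focal_returns.append(mean(returns[: len(focal_agents)]))
        return mean(focal_returns)
    ```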

    Effects of dietary macronutrient intake on insulin sensitivity and secretion and glucose and lipid metabolism in healthy, obese adolescents

    CONTEXT: Adolescent obesity is a serious public health concern. OBJECTIVE: The aim of the study was to determine whether obese adolescents can adapt metabolically to changes in dietary macronutrient intake. PATIENTS AND DESIGN: Using a random cross-over design, 13 healthy obese volunteers (six boys and seven girls; age, 14.7 +/- 0.3 yr; body mass index, 34 +/- 1 kg/m2; body fat, 42 +/- 1%) were studied twice after 7 d of isocaloric, isonitrogenous diets with 60% carbohydrate (CHO) and 25% fat (high CHO), or 30% CHO and 55% fat (low CHO). MAIN OUTCOME MEASURES AND METHODS: Glucose metabolism, insulin sensitivity, and first- and second-phase insulin secretory indices were measured by stable isotope techniques and the stable-label iv glucose tolerance test. The results were compared with those of previously studied lean adolescents. RESULTS: Obese adolescents increased first- and second-phase insulin secretory indices by 18% (P = 0.05) and 36% (P = 0.05), respectively, to maintain normoglycemia during the high-CHO diet, because they failed to increase insulin sensitivity as the lean adolescents did. Regardless of diet, in obese adolescents insulin sensitivity was half (P < 0.05), and first- and second-phase insulin secretory indices were twice (P < 0.01), the corresponding values in lean subjects. In obese adolescents, gluconeogenesis increased by 32% during the low-CHO (high-fat) diet (P < 0.01). CONCLUSION: In obese adolescents, insulin secretory demands were increased regardless of diet. Failure to increase insulin sensitivity while receiving a high-CHO diet required a further increase in insulin secretion, which may lead to earlier beta-cell failure. A low-CHO/high-fat diet resulted in increased gluconeogenesis, which may be a prelude to the increased glucose production and hyperglycemia observed in type 2 diabetes.