643 research outputs found

    A FIRST-OCCUPANCY REPRESENTATION FOR REINFORCEMENT LEARNING

    Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and which enable them to efficiently traverse their environments to reach rewarding states. The successor representation (SR), which measures the expected cumulative, discounted state occupancy under a fixed policy, enables efficient transfer to different reward structures in an otherwise constant Markovian environment and has been hypothesized to underlie aspects of biological behavior and neural activity. However, in the real world, rewards may only be available for consumption once, may shift location, or agents may simply aim to reach goal states as rapidly as possible without the constraint of artificially imposed task horizons. In such cases, the most behaviorally relevant representation would carry information about when the agent was likely to first reach states of interest, rather than how often it should expect to visit them over a potentially infinite time span. To reflect such demands, we introduce the first-occupancy representation (FR), which measures the expected temporal discount to the first time a state is accessed. We demonstrate that the FR facilitates exploration and the selection of efficient paths to desired states, allows the agent, under certain conditions, to plan provably optimal trajectories defined by a sequence of subgoals, and induces behavior similar to that of animals avoiding threatening stimuli.
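
    The contrast between the two representations in the abstract above can be illustrated with a minimal sketch. It assumes a small tabular Markov chain with a known policy-induced transition matrix, and it assumes the recursions follow directly from the stated definitions: the SR accumulates discounted occupancy on every visit, while the FR stops accumulating once the target state is reached. The function names, the toy ring environment, and the fixed-point iteration are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: SR vs. FR on a toy 3-state ring (0 -> 1 -> 2 -> 0).
# All names and the environment are hypothetical, chosen for illustration.


def successor_representation(P, gamma, iters=200):
    """Iterate M(s, g) = 1[s == g] + gamma * sum_k P[s][k] * M(k, g)."""
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        M = [[(1.0 if s == g else 0.0)
              + gamma * sum(P[s][k] * M[k][g] for k in range(n))
              for g in range(n)]
             for s in range(n)]
    return M


def first_occupancy_representation(P, gamma, iters=200):
    """Iterate F(s, g) = 1 if s == g, else gamma * sum_k P[s][k] * F(k, g).

    Assumed reading of the abstract: F(s, g) is the expected discount
    gamma**tau, where tau is the first time g is reached starting from s.
    """
    n = len(P)
    F = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        F = [[1.0 if s == g
              else gamma * sum(P[s][k] * F[k][g] for k in range(n))
              for g in range(n)]
             for s in range(n)]
    return F


# Deterministic ring: each state moves to the next one with probability 1.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
GAMMA = 0.5

SR = successor_representation(P, GAMMA)
FR = first_occupancy_representation(P, GAMMA)

for s in range(3):
    print("SR row", s, [round(x, 4) for x in SR[s]])
    print("FR row", s, [round(x, 4) for x in FR[s]])
```

    On this ring, the FR entry from state 0 to state 2 is the discount raised to the number of steps needed to get there (two steps, so 0.25 with a discount of 0.5), whereas the SR diagonal exceeds 1 because the chain keeps returning to each state.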

    Terrestrial Exoplanet Light Curves

    The phase or orbital light curves of extrasolar terrestrial planets in reflected or emitted light will contain information about their atmospheres and surfaces complementary to data obtained by other techniques such as spectroscopy. We show calculated light curves at optical and thermal infrared wavelengths for a variety of Earth-like and Earth-unlike planets. We also show that large satellites of Earth-sized planets are detectable, but may cause aliasing effects if the light curve is insufficiently sampled. Comment: To appear in Proceedings of the IAU Colloquium 200, Direct Imaging of Exoplanets; Science & Technology, Villefranche-sur-mer, France, October 2-7, 200

    Amortised learning by wake-sleep

    Models that employ latent variables to capture structure in observed data lie at the heart of many current unsupervised learning algorithms, but exact maximum-likelihood learning for powerful and flexible latent-variable models is almost always intractable. Thus, state-of-the-art approaches either abandon the maximum-likelihood framework entirely, or else rely on a variety of variational approximations to the posterior distribution over the latents. Here, we propose an alternative approach that we call amortised learning. Rather than computing an approximation to the posterior over latents, we use a wake-sleep Monte-Carlo strategy to learn a function that directly estimates the maximum-likelihood parameter updates. Amortised learning is possible whenever samples of latents and observations can be simulated from the generative model, treating the model as a “black box”. We demonstrate its effectiveness on a wide range of complex models, including those with latents that are discrete or supported on non-Euclidean spaces.

    A Unified Theory of Dual-Process Control

    Dual-process theories play a central role in both psychology and neuroscience, figuring prominently in fields ranging from executive control to reward-based learning to judgment and decision making. In each of these domains, two mechanisms appear to operate concurrently, one relatively high in computational complexity, the other relatively simple. Why is neural information processing organized in this way? We propose an answer to this question based on the notion of compression. The key insight is that dual-process structure can enhance adaptive behavior by allowing an agent to minimize the description length of its own behavior. We apply a single model based on this observation to findings from research on executive control, reward-based learning, and judgment and decision making, showing that seemingly diverse dual-process phenomena can be understood as domain-specific consequences of a single underlying set of computational principles.

    Minimum Description Length Control

    We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle. In this approach, which we term MDL-control (MDL-C), the agent learns the common structure among the tasks with which it is faced and then distills it into a simpler representation which facilitates faster convergence and generalization to new tasks. In doing so, MDL-C naturally balances adaptation to each task with epistemic uncertainty about the task distribution. We motivate MDL-C via formal connections between the MDL principle and Bayesian inference, derive theoretical performance guarantees, and demonstrate MDL-C's empirical effectiveness on both discrete and high-dimensional continuous control tasks.

    Physical map location of the peptide methionine sulfoxide reductase gene on the Escherichia coli chromosome

    This is the publisher's version, also available electronically from "http://jb.asm.org". No abstract available.

    The astorb database at Lowell Observatory

    The astorb database at Lowell Observatory is an actively curated catalog of all known asteroids in the Solar System. astorb has heritage dating back to the 1970s and has been publicly accessible since the 1990s. Work began in 2015 to modernize the underlying database infrastructure, operational software, and associated web applications. That effort involved the expansion of astorb to incorporate new data such as physical properties (e.g. albedo, colors, spectral types) from a variety of sources. The data in astorb are used to support a number of research tools hosted at https://asteroid.lowell.edu. Here we present a full description of the software tools, computational foundation, and data products upon which the astorb ecosystem has been built. (c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer reviewed

    The 2016 Reactivations of Main-Belt Comets 238P/Read and 288P/(300163) 2006 VW139

    We report observations of the reactivations of main-belt comets 238P/Read and 288P/(300163) 2006 VW139 that also track the evolution of each object's activity over several months in 2016 and 2017. We additionally identify and analyze archival SDSS data showing 288P to be active in 2000, meaning that both 238P and 288P have now each been confirmed to be active near perihelion on three separate occasions. From data obtained of 288P from 2012-2015 when it appeared inactive, we find best-fit R-band H,G phase function parameters of H_R=16.80+/-0.12 mag and G_R=0.18+/-0.11, corresponding to effective component radii of r_c=0.80+/-0.04 km, assuming a binary system with equally-sized components. Fitting linear functions to ejected dust masses inferred for 238P and 288P soon after their observed reactivations in 2016, we find an initial average net dust production rate of 0.7+/-0.3 kg/s and a best-fit start date of 2016 March 11 (when the object was at a true anomaly of -63 deg) for 238P, and an initial average net dust production rate of 5.6+/-0.7 kg/s and a best-fit start date of 2016 August 5 (when the object was at a true anomaly of -27 deg) for 288P. Applying similar analyses to archival data, we find similar start points for previous active episodes for both objects, suggesting that minimal mantle growth or ice recession occurred between the active episodes in question. Some changes in dust production rates between active episodes are detected, however. More detailed dust modeling is suggested to further clarify the process of activity evolution in main-belt comets. Comment: 21 pages, 9 figures, accepted by A

    Methionine sulfoxide reductase regulates brain catechol-O-methyl transferase activity

    This is the published version. Copyright 2014 Oxford University Press. Catechol-O-methyl transferase (COMT) plays a key role in the degradation of brain dopamine (DA). Specifically, low COMT activity results in higher DA levels in the prefrontal cortex (PFC), thereby reducing the vulnerability for attentional and cognitive deficits in both psychotic and healthy individuals. COMT activity is markedly reduced by a non-synonymous single-nucleotide polymorphism (SNP) that generates a valine-to-methionine substitution at residue 108/158, by means of as-yet incompletely understood post-translational mechanisms. One post-translational modification is methionine sulfoxide, which can be reduced by the methionine sulfoxide reductase (Msr) A and B enzymes. We used recombinant COMT proteins (Val/Met108) and mice (wild-type (WT) and MsrA knockout) to determine the effect of methionine oxidation on COMT activity and COMT interaction with Msr, through a combination of enzymatic activity and Western blot assays. Recombinant COMT activity is positively regulated by MsrA, especially under oxidative conditions, whereas brains of MsrA knockout mice exhibited lower COMT activity (as compared with their WT counterparts). These results suggest that COMT activity may be reduced by methionine oxidation, and point to Msr as a key molecular determinant for the modulation of COMT activity in the brain. The role of Msr in modulating cognitive functions in healthy individuals and schizophrenia patients is yet to be determined.