
    Q-Prop: Sample-efficient policy gradient with an off-policy critic

    Model-free deep reinforcement learning (RL) methods have been successful in a wide variety of simulated domains. However, a major obstacle facing deep RL in the real world is its high sample complexity. Batch policy gradient methods offer stable learning, but at the cost of high variance, which often requires large batches. TD-style methods, such as off-policy actor-critic and Q-learning, are more sample-efficient but biased, and often require costly hyperparameter sweeps to stabilize. In this work, we aim to develop methods that combine the stability of policy gradients with the efficiency of off-policy RL. We present Q-Prop, a policy gradient method that uses a Taylor expansion of the off-policy critic as a control variate. Q-Prop is both sample efficient and stable, and effectively combines the benefits of on-policy and off-policy methods. We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation. We show that conservative Q-Prop provides substantial gains in sample efficiency over trust region policy optimization (TRPO) with generalized advantage estimation (GAE), and improves stability over deep deterministic policy gradient (DDPG), the state-of-the-art on-policy and off-policy methods, on OpenAI Gym's MuJoCo continuous control environments.
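
The control-variate idea named in this abstract can be illustrated on a scalar toy problem. The sketch below is not Q-Prop itself: it only shows how a first-order Taylor expansion of a function, whose expectation is known in closed form, can be subtracted from a Monte Carlo estimate to lower its variance without changing its mean. The target function `exp`, the standard normal sampling distribution, and the name `taylor_control_variate_demo` are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def taylor_control_variate_demo(n_samples=20000, seed=0):
    """Estimate E[exp(X)] for X ~ N(0, 1), with and without a control variate."""
    rng = random.Random(seed)
    mu = 0.0  # expansion point, equal to the mean of X
    xs = [rng.gauss(mu, 1.0) for _ in range(n_samples)]

    def f(x):
        return math.exp(x)

    def g(x):
        # First-order Taylor expansion of f around mu (f'(mu) = exp(mu)).
        return math.exp(mu) + math.exp(mu) * (x - mu)

    # E[g(X)] is known analytically because E[X - mu] = 0.
    expected_g = math.exp(mu)

    plain = [f(x) for x in xs]
    # Subtract the control variate and add back its known expectation.
    corrected = [f(x) - g(x) + expected_g for x in xs]

    return {
        "plain_mean": statistics.fmean(plain),
        "plain_variance": statistics.variance(plain),
        "cv_mean": statistics.fmean(corrected),
        "cv_variance": statistics.variance(corrected),
    }


if __name__ == "__main__":
    for name, value in taylor_control_variate_demo().items():
        print(f"{name}: {value:.4f}")
```

Both estimators target the same quantity, exp(0.5); the corrected samples should show a smaller variance because the linear term absorbs part of the fluctuation of `exp(x)`.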

    Interpolated policy gradient: Merging on-policy and off-policy gradient estimation for deep reinforcement learning

    Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. This paper examines, both theoretically and empirically, approaches to merging on- and off-policy updates for deep reinforcement learning. Theoretical results show that off-policy updates with a value function estimator can be interpolated with on-policy policy gradient updates whilst still satisfying performance bounds. Our analysis uses control variate methods to produce a family of policy gradient algorithms, with several recently proposed algorithms being special cases of this family. We then provide an empirical comparison of these techniques with the remaining algorithmic details fixed, and show how different mixtures of off-policy gradient estimates with on-policy samples contribute to improvements in empirical performance. The final algorithm provides a generalization and unification of existing deep policy gradient techniques, has theoretical guarantees on the bias introduced by off-policy updates, and improves on the state-of-the-art model-free deep RL methods on a number of OpenAI Gym continuous control benchmarks.

    Financial intermediation and growth: causality and causes without outliers

    In a seminal paper, Levine et al. (J Monet Econ 46:31–77, 2000) provide cross-sectional evidence showing that financial development has a positive average impact on long-run growth, using a sample of 71 countries. We argue that the evidence is sensitive to the presence of outliers.

    Tamoxifen and the Raloxifene analog LY117018: their effects on arachidonic acid release from cells in culture and on prostaglandin I(2) production by rat liver cells

    BACKGROUND: Tamoxifen is being used successfully to treat breast cancer. However, tamoxifen also increases the risk of developing endometrial cancer in postmenopausal women. Raloxifene also decreases breast cancer in women at high risk and may carry a lower risk of developing cancer of the uterus. Tamoxifen has been shown to stimulate arachidonic acid release from rat liver cells. I have postulated that arachidonic acid release from cells may be associated with cancer chemoprevention. METHODS: Rat liver, rat glial, human colon carcinoma and human breast carcinoma cells were labelled with [(3)H] arachidonic acid. The release of the radiolabel from these cells during incubation with tamoxifen and the raloxifene analog LY117018 was measured. The prostaglandin I(2) produced during incubation of the rat liver cells with μM concentrations of tamoxifen and the raloxifene analog was quantitatively estimated. RESULTS: Tamoxifen is about 5 times more effective than LY117018 at releasing arachidonic acid from all the cells tested. In rat liver cells only tamoxifen stimulates basal prostaglandin I(2) production and that induced by lactacystin and 12-O-tetradecanoyl-phorbol-13-acetate. LY117018, however, blocks the tamoxifen-stimulated prostaglandin production. The stimulated prostaglandin I(2) production is rapid and not affected either by preincubation of the cells with actinomycin or by incubation with the estrogen antagonist ICI-182,780. CONCLUSIONS: Tamoxifen and the raloxifene analog, LY117018, may prevent estrogen-independent as well as estrogen-dependent breast cancer by stimulating phospholipase activity and initiating arachidonic acid release. The release of arachidonic acid and/or molecular reactions that accompany that release may initiate pathways that prevent tumor growth. Oxygenation of the intracellularly released arachidonic acid and its metabolic products may mediate some of the pharmacological actions of tamoxifen and raloxifene.

    Toxoplasma effectors targeting host signaling and transcription

    Early electron microscopy studies revealed the elaborate cellular features that define the unique adaptations of apicomplexan parasites. Among these were bulbous rhoptry (ROP) organelles and small, dense granules (GRAs), both of which are secreted during invasion of host cells. These early morphological studies were followed by the exploration of the cellular contents of these secretory organelles, revealing them to be comprised of highly divergent protein families with few conserved domains or predicted functions. In parallel, studies on host-pathogen interactions identified many host signaling pathways that were mysteriously altered by infection. It was only with the advent of forward and reverse genetic strategies that the connections between individual parasite effectors and the specific host pathways that they targeted finally became clear. The current repertoire of parasite effectors includes ROP kinases and pseudokinases that are secreted during invasion and that block host immune pathways. Similarly, many secretory GRA proteins alter host gene expression by activating host transcription factors, through modification of chromatin, or by inducing small noncoding RNAs. These effectors highlight novel mechanisms by which Toxoplasma has learned to harness host signaling to favor intracellular survival and will guide future studies designed to uncover the additional complexity of this intricate host-pathogen interaction.

    The Mirage of Action-Dependent Baselines in Reinforcement Learning

    Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the state and action and suggest that this significantly reduces variance and improves sample efficiency without introducing bias into the gradient estimates. To better understand this development, we decompose the variance of the policy gradient estimator and numerically show that learned state-action-dependent baselines do not in fact reduce variance over a state-dependent baseline in commonly tested benchmark domains. We confirm this unexpected result by reviewing the open-source code accompanying these prior papers, and show that subtle implementation decisions cause deviations from the methods presented in the papers and explain the source of the previously observed empirical gains. Furthermore, the variance decomposition highlights areas for improvement, which we demonstrate by illustrating a simple change to the typical value function parameterization that can significantly improve performance
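
The role of a state-dependent baseline mentioned in this abstract can be seen in a one-state toy problem, where the baseline reduces to a constant. The sketch below is an illustration under stated assumptions, not code from the paper: the Gaussian policy with unit variance, the quadratic reward, and the name `baseline_variance_demo` are all hypothetical. It compares the score-function gradient estimator with and without subtracting the expected reward.

```python
import random
import statistics


def baseline_variance_demo(n_samples=50000, seed=0):
    """Compare score-function gradient estimates with and without a baseline."""
    rng = random.Random(seed)
    theta = 0.0  # mean of the Gaussian policy a ~ N(theta, 1)

    def reward(action):
        # Hypothetical reward, maximized at action = 3.
        return -(action - 3.0) ** 2

    # Expected reward under the policy, E[-(a - 3)^2] = -((theta - 3)^2 + 1).
    # Being a constant, it leaves the gradient estimate unbiased.
    baseline = -((theta - 3.0) ** 2 + 1.0)

    without_baseline = []
    with_baseline = []
    for _ in range(n_samples):
        action = rng.gauss(theta, 1.0)
        score = action - theta  # d/dtheta of log N(action; theta, 1)
        r = reward(action)
        without_baseline.append(score * r)
        with_baseline.append(score * (r - baseline))

    return {
        "true_gradient": -2.0 * (theta - 3.0),
        "mean_without": statistics.fmean(without_baseline),
        "mean_with": statistics.fmean(with_baseline),
        "variance_without": statistics.variance(without_baseline),
        "variance_with": statistics.variance(with_baseline),
    }


if __name__ == "__main__":
    for name, value in baseline_variance_demo().items():
        print(f"{name}: {value:.4f}")
```

Both estimators average to roughly the true gradient, while the baseline version has a visibly smaller variance. Whether an action-dependent baseline improves on this further is the question the paper examines; the toy example does not address it.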

    Climate Change and Human Health Impacts in the United States: An Update on the Results of the U.S. National Assessment

    The health sector component of the first U.S. National Assessment, published in 2000, synthesized the anticipated health impacts of climate variability and change for five categories of health outcomes: impacts attributable to temperature, extreme weather events (e.g., storms and floods), air pollution, water- and food-borne diseases, and vector- and rodent-borne diseases. The Health Sector Assessment (HSA) concluded that climate variability and change are likely to increase morbidity and mortality risks for several climate-sensitive health outcomes, with the net impact uncertain. The objective of this study was to update the first HSA based on recent publications that address the potential impacts of climate variability and change in the United States for the five health outcome categories. The literature published since the first HSA supports the initial conclusions, with new data refining quantitative exposure–response relationships for several health end points, particularly for extreme heat events and air pollution. The United States continues to have a very high capacity to plan for and respond to climate change, although relatively little progress has been noted in the literature on implementing adaptive strategies and measures. Large knowledge gaps remain, resulting in a substantial need for additional research to improve our understanding of how weather and climate, both directly and indirectly, can influence human health. Filling these knowledge gaps will help better define the potential health impacts of climate change and identify specific public health adaptations to increase resilience

    Safety, tumor trafficking and immunogenicity of chimeric antigen receptor (CAR)-T cells specific for TAG-72 in colorectal cancer.

    Background: T cells engineered to express chimeric antigen receptors (CARs) have established efficacy in the treatment of B-cell malignancies, but their relevance in solid tumors remains undefined. Here we report results of the first human trials of CAR-T cells in the treatment of solid tumors performed in the 1990s. Methods: Patients with metastatic colorectal cancer (CRC) were treated in two phase 1 trials with first-generation retroviral transduced CAR-T cells targeting tumor-associated glycoprotein (TAG)-72 and including a CD3-zeta intracellular signaling domain (CART72 cells). In trials C-9701 and C-9702, CART72 cells were administered in escalating doses up to 10^10 total cells; in trial C-9701 CART72 cells were administered by intravenous infusion. In trial C-9702, CART72 cells were administered via direct hepatic artery infusion in patients with colorectal liver metastases. In both trials, a brief course of interferon-alpha (IFN-α) was given with each CART72 infusion to upregulate expression of TAG-72. Results: Fourteen patients were enrolled in C-9701 and nine in C-9702. CART72 manufacturing success rate was 100% with an average transduction efficiency of 38%. Ten patients were treated in C-9701 and six in C-9702. Symptoms consistent with low-grade cytokine release syndrome were observed in both trials without clear evidence of on-target/off-tumor toxicity. Detectable, but mostly short-term (≤14 weeks), persistence of CART72 cells was observed in blood; one patient had CART72 cells detectable at 48 weeks. Trafficking to tumor tissues was confirmed in a tumor biopsy from one of three patients. A subset of patients had 111Indium-labeled CART72 cells injected, and trafficking could be detected to liver, but T cells appeared largely excluded from large metastatic deposits. Tumor biomarkers carcinoembryonic antigen (CEA) and TAG-72 were measured in serum; there was a precipitous decline of TAG-72, but not CEA, in some patients due to induction of an interfering antibody to the TAG-72 binding domain of humanized CC49, reflecting an anti-CAR immune response. No radiologic tumor responses were observed. Conclusion: These findings demonstrate the relative safety of CART72 cells. The limited persistence supports the incorporation of co-stimulatory domains in the CAR design and the use of fully human CAR constructs to mitigate immunogenicity.

    Prenatal Treatment for Serious Neurological Sequelae of Congenital Toxoplasmosis: An Observational Prospective Cohort Study

    Background: The effectiveness of prenatal treatment to prevent serious neurological sequelae (SNSD) of congenital toxoplasmosis is not known. Methods and Findings: Congenital toxoplasmosis was prospectively identified by universal prenatal or neonatal screening in 14 European centres and children were followed for a median of 4 years. We evaluated determinants of postnatal death or SNSD defined by one or more of functional neurological abnormalities, severe bilateral visual impairment, or pregnancy termination for confirmed congenital toxoplasmosis. Two-thirds of the cohort received prenatal treatment (189/293; 65%). 23/293 (8%) fetuses developed SNSD of which nine were pregnancy terminations. Prenatal treatment reduced the risk of SNSD. The odds ratio for prenatal treatment, adjusted for gestational age at maternal seroconversion, was 0.24 (95% Bayesian credible intervals 0.07-0.71). This effect was robust to most sensitivity analyses. The number of infected fetuses needed to be treated to prevent one case of SNSD was three (95% Bayesian credible intervals 2-15) after maternal seroconversion at 10 weeks, and 18 (9-75) at 30 weeks of gestation. Pyrimethamine-sulphonamide treatment did not reduce SNSD compared with spiramycin alone (adjusted odds ratio 0.78, 0.21-2.95). The proportion of live-born infants with intracranial lesions detected postnatally who developed SNSD was 31.0% (17.0%-38.1%). Conclusion: The finding that prenatal treatment reduced the risk of SNSD in infected fetuses should be interpreted with caution because of the low number of SNSD cases and uncertainty about the timing of maternal seroconversion. As these are observational data, policy decisions about screening require further evidence from a randomized trial of prenatal screening and from cost-effectiveness analyses that take into account the incidence and prevalence of maternal infection.