437 research outputs found

    Extreme State Aggregation Beyond MDPs

    We consider a Reinforcement Learning setup where an agent interacts with an environment in observation-reward-action cycles without any (esp. MDP) assumptions on the environment. State aggregation and more generally feature reinforcement learning is concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs. Comment: 28 LaTeX pages, 8 theorems.
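As a rough illustration of the classic state-aggregation idea that the abstract builds on, the sketch below maps raw states to aggregated states, solves the small aggregated MDP by value iteration, and lifts the resulting policy back to the raw states. It is not the paper's construction for non-MDP environments. The function names, the uniform averaging over each group, and the four-state toy problem are all assumptions made for this example.

```python
def aggregate_mdp(P, R, phi, n_agg):
    """Build an aggregated MDP by averaging uniformly over each group (an assumption)."""
    n_actions = len(R[0])
    members = [[x for x in range(len(phi)) if phi[x] == s] for s in range(n_agg)]
    P_agg = [[[0.0] * n_agg for _ in range(n_actions)] for _ in range(n_agg)]
    R_agg = [[0.0] * n_actions for _ in range(n_agg)]
    for s, xs in enumerate(members):
        for a in range(n_actions):
            for x in xs:
                R_agg[s][a] += R[x][a] / len(xs)
                for x2, p in enumerate(P[x][a]):
                    P_agg[s][a][phi[x2]] += p / len(xs)
    return P_agg, R_agg


def q_value_iteration(P, R, gamma=0.9, sweeps=500):
    """Plain Q-value iteration on a finite MDP given as nested lists."""
    n, m = len(R), len(R[0])
    Q = [[0.0] * m for _ in range(n)]
    for _ in range(sweeps):
        V = [max(q) for q in Q]
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(m)] for s in range(n)]
    return Q


def lift_policy(Q, phi):
    """Act in raw state x with the greedy action of its aggregated state phi[x]."""
    return [max(range(len(Q[phi[x]])), key=lambda a: Q[phi[x]][a])
            for x in range(len(phi))]


# Hypothetical toy problem: raw states 0,1 form group 0; raw states 2,3 form group 1.
# Action 0 stays inside the current group, action 1 jumps to the other group.
phi = [0, 0, 1, 1]
P = [
    [[0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 0], [0, 0, 0, 1]],
    [[0, 0, 0, 1], [1, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 1, 0, 0]],
]
R = [[0, 0], [0, 0], [1, 0], [1, 0]]  # reward 1 only for staying in group 1

P_agg, R_agg = aggregate_mdp(P, R, phi, n_agg=2)
Q = q_value_iteration(P_agg, R_agg)
print(lift_policy(Q, phi))
```

In this toy problem the aggregation happens to be exact, so the lifted policy (jump from group 0, stay in group 1) is also optimal for the raw process. The abstract's claim concerns the harder case where the reduced process is not an MDP.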

    Universal knowledge-seeking agents for stochastic environments

    We define an optimal Bayesian knowledge-seeking agent, KL-KSA, designed for countable hypothesis classes of stochastic environments and whose goal is to gather as much information about the unknown world as possible. Although this agent works for arbitrary countable classes and priors, we focus on the especially interesting case where all stochastic computable environments are considered and the prior is based on Solomonoff’s universal prior. Among other properties, we show that KL-KSA learns the true environment in the sense that it learns to predict the consequences of actions it does not take. We show that it does not consider noise to be information and avoids taking actions leading to inescapable traps. We also present a variety of toy experiments demonstrating that KL-KSA behaves according to expectation.

    BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback

    In this paper, we study the problem of safe online learning to re-rank, where user feedback is used to improve the quality of displayed lists. Learning to rank has traditionally been studied in two settings. In the offline setting, rankers are typically learned from relevance labels created by judges. This approach has generally become standard in industrial applications of ranking, such as search. However, this approach lacks exploration and thus is limited by the information content of the offline training data. In the online setting, an algorithm can experiment with lists and learn from feedback on them in a sequential fashion. Bandit algorithms are well-suited for this setting but they tend to learn user preferences from scratch, which results in a high initial cost of exploration. This poses an additional challenge of safe exploration in ranked lists. We propose BubbleRank, a bandit algorithm for safe re-ranking that combines the strengths of both the offline and online settings. The algorithm starts with an initial base list and improves it online by gradually exchanging higher-ranked less attractive items for lower-ranked more attractive items. We prove an upper bound on the n-step regret of BubbleRank that degrades gracefully with the quality of the initial base list. Our theoretical findings are supported by extensive experiments on a large-scale real-world click dataset.
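The exchange mechanism described in the abstract can be sketched as a toy simulation: start from a base list, randomly swap neighbouring items when displaying it, and permanently promote a lower-ranked item once click comparisons favour it strongly enough. This is a simplified illustration loosely following that description, not the authors' algorithm. The function name `bubble_rerank`, the `click` callback, the confidence threshold, and the `delta` parameter are assumptions made for this example, and the click model ignores position bias.

```python
import math
import random
from collections import defaultdict


def bubble_rerank(base, click, rounds, delta=0.1, seed=0):
    """Toy re-ranker: refine `base` using pairwise click comparisons."""
    rng = random.Random(seed)
    base = list(base)
    s = defaultdict(int)  # s[(i, j)]: clicks on i minus clicks on j when compared
    n = defaultdict(int)  # n[(i, j)]: number of informative comparisons of i and j
    for t in range(rounds):
        h = t % 2  # alternate between pairs (0,1),(2,3),... and (1,2),(3,4),...
        shown = list(base)
        # Exploration: randomly exchange the two items of each neighbouring pair,
        # so the displayed list differs from the base list by local swaps only.
        for k in range(h, len(shown) - 1, 2):
            if rng.random() < 0.5:
                shown[k], shown[k + 1] = shown[k + 1], shown[k]
        clicks = {item: click(item) for item in shown}
        # Record only comparisons where exactly one item of the pair was clicked.
        for k in range(h, len(base) - 1, 2):
            i, j = base[k], base[k + 1]
            if clicks[i] != clicks[j]:
                s[(i, j)] += clicks[i] - clicks[j]
                s[(j, i)] += clicks[j] - clicks[i]
                n[(i, j)] += 1
                n[(j, i)] += 1
        # Permanently promote a lower item once the evidence clears the threshold.
        for k in range(len(base) - 1):
            i, j = base[k], base[k + 1]
            if s[(j, i)] > 2 * math.sqrt(n[(i, j)] * math.log(1 / delta)):
                base[k], base[k + 1] = j, i
    return base


# Hypothetical click model: only item 2 is ever clicked.
print(bubble_rerank([0, 1, 2], lambda item: int(item == 2), rounds=100))
```

With this deterministic click model, item 2 moves up one position at a time while items 0 and 1 keep their original relative order. Because only neighbouring items are ever exchanged, a poor decision costs at most one position per pair, which is the intuition behind the "safe" exploration the abstract mentions.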

    Corrosive-Abrasive Wear Induced by Soot in Boundary Lubrication Regime

    Soot is known to induce high wear in engine components. The mechanism by which soot induces wear is not well understood. Although several mechanisms have been suggested, there is still no consensus. This study aims to investigate the most likely mechanism responsible for soot-induced wear in the boundary lubrication regime. Results from this study have shown that previously suggested mechanisms such as abrasion and additive adsorption do not fully explain the high wear observed when soot is present. Based on the results obtained from tests conducted at varying temperature and soot levels, it has been proven that the corrosive–abrasive mechanism was responsible for high wear that occurred in boundary lubrication conditions

    Neuroscience in gambling policy and treatment: an interdisciplinary perspective

    Neuroscientific explanations of gambling disorder can help people make sense of their experiences and guide the development of psychosocial interventions. However, the societal perceptions and implications of these explanations are not always clear or helpful. Two workshops in 2013 and 2014 brought together multidisciplinary researchers aiming to improve the clinical and policy-related effects of neuroscience research on gambling. The workshops revealed that neuroscience can be used to improve identification of the dangers of products used in gambling. Additionally, there was optimism associated with the diagnostic and prognostic uses of neuroscience in problem gambling and the provision of novel tools (e.g., virtual reality) to assess the effectiveness of new policy interventions before their implementation. Other messages from these workshops were that neuroscientific models of decision making could provide a strong rationale for precommitment strategies and that interdisciplinary collaborations are needed to reduce the harms of gambling.

    Personalization Paradox in Behavior Change Apps: Lessons from a Social Comparison-Based Personalized App for Physical Activity

    Social comparison-based features are widely used in social computing apps. However, most existing apps are not grounded in social comparison theories and do not consider individual differences in social comparison preferences and reactions. This paper is among the first to automatically personalize social comparison targets. In the context of an m-health app for physical activity, we use artificial intelligence (AI) techniques of multi-armed bandits. Results from our user study (n=53) indicate that there is some evidence that motivation can be increased using the AI-based personalization of social comparison. The detected effects achieved small-to-moderate effect sizes, illustrating the real-world implications of the intervention for enhancing motivation and physical activity. In addition to design implications for social comparison features in social apps, this paper identified the personalization paradox, the conflict between user modeling and adaptation, as a key design challenge of personalized applications for behavior change. Additionally, we propose research directions to mitigate this personalization paradox.

    TranscriptomeBrowser: A Powerful and Flexible Toolbox to Explore Productively the Transcriptional Landscape of the Gene Expression Omnibus Database

    As public microarray repositories are constantly growing, we are facing the challenge of designing strategies to provide productive access to the available data. We used a modified version of the Markov clustering algorithm to systematically extract clusters of co-regulated genes from hundreds of microarray datasets stored in the Gene Expression Omnibus database (n = 1,484). This approach led to the definition of 18,250 transcriptional signatures (TS) that were tested for functional enrichment using the DAVID knowledgebase. Over-representation of functional terms was found in a large proportion of these TS (84%). We developed a Java application, TBrowser, that comes with an open plug-in architecture and whose interface implements a highly sophisticated search engine supporting several Boolean operators (http://tagc.univ-mrs.fr/tbrowser/). Users can search and analyze TS containing a list of identifiers (gene symbols or AffyIDs) or associated with a set of functional terms. As proof of principle, TBrowser was used to define breast cancer cell specific genes and to detect chromosomal abnormalities in tumors. Finally, taking advantage of our large collection of transcriptional signatures, we constructed a comprehensive map that summarizes gene-gene co-regulations observed through all the experiments performed on the HGU133A Affymetrix platform. We provide evidence that this map can extend our knowledge of cellular signaling pathways.

    The mothers, Omega-3 and mental health study

    Background: Major depressive disorder (MDD) during pregnancy and postpartum depression are associated with significant maternal and neonatal morbidity. While antidepressants are readily used in pregnancy, studies have raised concerns regarding neurobehavioral outcomes in exposed infants. Omega-3 fatty acid supplementation, most frequently from fish oil, has emerged as a possible treatment or prevention strategy for MDD in non-pregnant individuals, and may have beneficial effects in pregnant women. Although published observational studies in the psychiatric literature suggest that maternal docosahexaenoic acid (DHA) deficiency may lead to the development of MDD in pregnancy and postpartum, there are more intervention trials suggesting clinical benefit for supplementation with eicosapentaenoic acid (EPA) in MDD. Methods/Design: The Mothers, Omega-3 and Mental Health study is a double blind, placebo-controlled, randomized controlled trial to assess whether omega-3 fatty acid supplementation may prevent antenatal and postpartum depressive symptoms among pregnant women at risk for depression. We plan to recruit 126 pregnant women at less than 20 weeks gestation from prenatal clinics at two health systems in Ann Arbor, Michigan and the surrounding communities. We will follow them prospectively over the course of their pregnancies and up to 6 weeks postpartum. Enrolled participants will be randomized to one of three groups: a) EPA-rich fish oil supplement (1060 mg EPA plus 274 mg DHA); b) DHA-rich fish oil supplement (900 mg DHA plus 180 mg EPA); or c) a placebo. The primary outcome for this study is the Beck Depression Inventory (BDI) score at 6 weeks postpartum. We will need to randomize 126 women to have 80% power to detect a 50% reduction in participants' mean BDI scores with EPA or DHA supplementation compared with placebo. We will also gather information on secondary outcome measures which will include: omega-3 fatty acid concentrations in maternal plasma and cord blood, pro-inflammatory cytokine levels (IL-1β, IL-6, and TNF-α) in maternal and cord blood, need for and dosage of antidepressant medications, and obstetrical outcomes. Analyses will be by intent to treat. Discussion: This study compares the relative effectiveness of DHA and EPA at preventing depressive symptoms among pregnant women at risk. Trial registration: Clinical trial registration number: NCT00711971 (http://www.clinicaltrials.gov/ct2/show/NCT00981877).