437 research outputs found
Extreme State Aggregation Beyond MDPs
We consider a Reinforcement Learning setup where an agent interacts with an
environment in observation-reward-action cycles without any (esp. MDP)
assumptions on the environment. State aggregation and more generally feature
reinforcement learning is concerned with mapping histories/raw-states to
reduced/aggregated states. The idea behind both is that the resulting reduced
process (approximately) forms a small stationary finite-state MDP, which can
then be efficiently solved or learnt. We considerably generalize existing
aggregation results by showing that even if the reduced process is not an MDP,
the (q-)value functions and (optimal) policies of an associated MDP with the same
state-space size solve the original problem, as long as the solution can
approximately be represented as a function of the reduced states. This implies
an upper bound on the required state space size that holds uniformly for all RL
problems. It may also explain why RL algorithms designed for MDPs sometimes
perform well beyond MDPs.
Comment: 28 LaTeX pages. 8 Theorem
Universal knowledge-seeking agents for stochastic environments
We define an optimal Bayesian knowledge-seeking agent, KL-KSA, designed for countable hypothesis classes of stochastic environments and whose goal is to gather as much information about the unknown world as possible. Although this agent works for arbitrary countable classes and priors, we focus on the especially interesting case where all stochastic computable environments are considered and the prior is based on Solomonoff’s universal prior. Among other properties, we show that KL-KSA learns the true environment in the sense that it learns to predict the consequences of actions it does not take. We show that it does not consider noise to be information and avoids taking actions leading to inescapable traps. We also present a variety of toy experiments demonstrating that KL-KSA behaves according to expectation.
BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback
In this paper, we study the problem of safe online learning to re-rank, where
user feedback is used to improve the quality of displayed lists. Learning to
rank has traditionally been studied in two settings. In the offline setting,
rankers are typically learned from relevance labels created by judges. This
approach has generally become standard in industrial applications of ranking,
such as search. However, this approach lacks exploration and thus is limited by
the information content of the offline training data. In the online setting, an
algorithm can experiment with lists and learn from feedback on them in a
sequential fashion. Bandit algorithms are well-suited for this setting but they
tend to learn user preferences from scratch, which results in a high initial
cost of exploration. This poses an additional challenge of safe exploration in
ranked lists. We propose BubbleRank, a bandit algorithm for safe re-ranking
that combines the strengths of both the offline and online settings. The
algorithm starts with an initial base list and improves it online by gradually
exchanging higher-ranked less attractive items for lower-ranked more attractive
items. We prove an upper bound on the n-step regret of BubbleRank that degrades
gracefully with the quality of the initial base list. Our theoretical findings
are supported by extensive experiments on a large-scale real-world click
dataset.
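The exchange step described in the BubbleRank abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `bubble_rerank`, the toy click model (independent clicks with no position bias), and the confidence threshold are all assumptions made for illustration.

```python
import math
import random


def bubble_rerank(base_list, click_prob, steps, delta=0.05, seed=0):
    """Toy safe re-ranking: only adjacent items are ever exchanged.

    click_prob maps each item to a hypothetical click probability; clicks are
    simulated independently per item, with no position bias (an assumption).
    """
    rng = random.Random(seed)
    ranked = list(base_list)
    wins = {}  # wins[(a, b)]: rounds where a was clicked and b was not, as neighbors
    for t in range(steps):
        clicks = {item: rng.random() < click_prob[item] for item in ranked}
        # Alternate between pairs (0,1),(2,3),... and (1,2),(3,4),...
        for k in range(t % 2, len(ranked) - 1, 2):
            upper, lower = ranked[k], ranked[k + 1]
            if clicks[upper] != clicks[lower]:
                pair = (upper, lower) if clicks[upper] else (lower, upper)
                wins[pair] = wins.get(pair, 0) + 1
            up = wins.get((upper, lower), 0)
            low = wins.get((lower, upper), 0)
            n = up + low
            # Illustrative threshold (assumed): exchange only on strong evidence.
            if low - up > 2 * math.sqrt(n * math.log(1 / delta)):
                ranked[k], ranked[k + 1] = lower, upper
    return ranked


if __name__ == "__main__":
    probs = {"a": 0.1, "b": 0.3, "c": 0.6, "d": 0.9}
    print(bubble_rerank(["a", "b", "c", "d"], probs, steps=20000))
```

Because an item can move at most one position per exchange, and an exchange needs accumulated click evidence, the displayed list stays close to the base list early on. That is the sense in which this sketch is "safe".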
Molecular Analysis of Precursor Lesions in Familial Pancreatic Cancer
PMCID: PMC3553106. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Corrosive-Abrasive Wear Induced by Soot in Boundary Lubrication Regime
Soot is known to induce high wear in engine components. The mechanism by which soot induces wear is not well understood. Although several mechanisms have been suggested, there is still no consensus. This study aims to investigate the most likely mechanism responsible for soot-induced wear in the boundary lubrication regime. Results from this study have shown that previously suggested mechanisms such as abrasion and additive adsorption do not fully explain the high wear observed when soot is present. Based on the results obtained from tests conducted at varying temperature and soot levels, it has been proven that the corrosive–abrasive mechanism was responsible for high wear that occurred in boundary lubrication conditions.
Neuroscience in gambling policy and treatment: an interdisciplinary perspective
Neuroscientific explanations of gambling disorder can help people make sense of their experiences and guide the development of psychosocial interventions. However, the societal perceptions and implications of these explanations are not always clear or helpful. Two workshops in 2013 and 2014 brought together multidisciplinary researchers aiming to improve the clinical and policy-related effects of neuroscience research on gambling. The workshops revealed that neuroscience can be used to improve identification of the dangers of products used in gambling. Additionally, there was optimism associated with the diagnostic and prognostic uses of neuroscience in problem gambling and the provision of novel tools (e.g., virtual reality) to assess the effectiveness of new policy interventions before their implementation. Other messages from these workshops were that neuroscientific models of decision making could provide a strong rationale for precommitment strategies and that interdisciplinary collaborations are needed to reduce the harms of gambling.
Personalization Paradox in Behavior Change Apps: Lessons from a Social Comparison-Based Personalized App for Physical Activity
Social comparison-based features are widely used in social computing apps.
However, most existing apps are not grounded in social comparison theories and
do not consider individual differences in social comparison preferences and
reactions. This paper is among the first to automatically personalize social
comparison targets. In the context of an m-health app for physical activity, we
use multi-armed bandits, an artificial intelligence (AI) technique. Results
from our user study (n=53) indicate that there is some evidence that motivation
can be increased using the AI-based personalization of social comparison. The
detected effects achieved small-to-moderate effect sizes, illustrating the
real-world implications of the intervention for enhancing motivation and
physical activity. In addition to design implications for social comparison
features in social apps, this paper identified the personalization paradox, the
conflict between user modeling and adaptation, as a key design challenge of
personalized applications for behavior change. Additionally, we propose
research directions to mitigate this Personalization Paradox.
TranscriptomeBrowser: A Powerful and Flexible Toolbox to Explore Productively the Transcriptional Landscape of the Gene Expression Omnibus Database
As public microarray repositories are constantly growing, we are facing the challenge of designing strategies to provide productive access to the available data. We used a modified version of the Markov clustering algorithm to systematically extract clusters of co-regulated genes from hundreds of microarray datasets stored in the Gene Expression Omnibus database (n = 1,484). This approach led to the definition of 18,250 transcriptional signatures (TS) that were tested for functional enrichment using the DAVID knowledgebase. Over-representation of functional terms was found in a large proportion of these TS (84%). We developed a Java application, TBrowser, which comes with an open plug-in architecture and whose interface implements a highly sophisticated search engine supporting several Boolean operators (http://tagc.univ-mrs.fr/tbrowser/). Users can search and analyze TS containing a list of identifiers (gene symbols or AffyIDs) or associated with a set of functional terms. As proof of principle, TBrowser was used to define breast cancer cell specific genes and to detect chromosomal abnormalities in tumors. Finally, taking advantage of our large collection of transcriptional signatures, we constructed a comprehensive map that summarizes gene-gene co-regulations observed through all the experiments performed on the HGU133A Affymetrix platform. We provide evidence that this map can extend our knowledge of cellular signaling pathways.
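For readers unfamiliar with Markov clustering, the following is a minimal Python sketch of the standard algorithm (alternating expansion and inflation on a column-stochastic matrix). The abstract above refers to a modified version whose details are not given here, so this sketch should not be read as the TBrowser implementation; the function name `markov_cluster` and the parameter defaults are assumptions.

```python
def _normalize_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, iterations=50):
    """Standard Markov clustering on a non-negative adjacency matrix."""
    n = len(adjacency)
    # Add self-loops so that every column has a non-zero sum.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = _normalize_columns(m)
    for _ in range(iterations):
        m = _matmul(m, m)                                  # expansion
        m = [[v ** inflation for v in row] for row in m]   # inflation
        m = _normalize_columns(m)
    # Each row that keeps non-negligible mass acts as an attractor; the columns
    # it covers form one cluster. Identical clusters are merged by the set.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Two disconnected triangles: nodes 0-2 and nodes 3-5.
    graph = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(markov_cluster(graph))
```

In a gene-expression setting the nodes would be genes and the edge weights some measure of co-expression. Higher `inflation` values generally yield smaller, tighter clusters.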
The mothers, Omega-3 and mental health study
Background: Major depressive disorder (MDD) during pregnancy and postpartum depression are associated with significant maternal and neonatal morbidity. While antidepressants are readily used in pregnancy, studies have raised concerns regarding neurobehavioral outcomes in exposed infants. Omega-3 fatty acid supplementation, most frequently from fish oil, has emerged as a possible treatment or prevention strategy for MDD in non-pregnant individuals, and may have beneficial effects in pregnant women. Although published observational studies in the psychiatric literature suggest that maternal docosahexaenoic acid (DHA) deficiency may lead to the development of MDD in pregnancy and postpartum, there are more intervention trials suggesting clinical benefit for supplementation with eicosapentaenoic acid (EPA) in MDD.
Methods/Design: The Mothers, Omega-3 and Mental Health study is a double-blind, placebo-controlled, randomized controlled trial to assess whether omega-3 fatty acid supplementation may prevent antenatal and postpartum depressive symptoms among pregnant women at risk for depression. We plan to recruit 126 pregnant women at less than 20 weeks gestation from prenatal clinics at two health systems in Ann Arbor, Michigan and the surrounding communities. We will follow them prospectively over the course of their pregnancies and up to 6 weeks postpartum. Enrolled participants will be randomized to one of three groups: a) EPA-rich fish oil supplement (1060 mg EPA plus 274 mg DHA); b) DHA-rich fish oil supplement (900 mg DHA plus 180 mg EPA); or c) a placebo. The primary outcome for this study is the Beck Depression Inventory (BDI) score at 6 weeks postpartum. We will need to randomize 126 women to have 80% power to detect a 50% reduction in participants' mean BDI scores with EPA or DHA supplementation compared with placebo. We will also gather information on secondary outcome measures, which will include: omega-3 fatty acid concentrations in maternal plasma and cord blood, pro-inflammatory cytokine levels (IL-1β, IL-6, and TNF-α) in maternal and cord blood, need for and dosage of antidepressant medications, and obstetrical outcomes. Analyses will be by intent to treat.
Discussion: This study compares the relative effectiveness of DHA and EPA at preventing depressive symptoms among pregnant women at risk.
Trial registration: Clinical trial registration number: NCT00711971 (http://www.clinicaltrials.gov/ct2/show/NCT00981877)
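The power statement in the protocol above (126 women, 80% power) can be related to a standardized effect size with the usual normal-approximation formula for comparing two group means. The sketch below is illustrative only. It assumes equal allocation across the three arms (42 per arm), a two-sided significance level of 0.05, a pairwise comparison of one active arm with placebo, and no attrition. None of these is stated in the abstract, and the function name `detectable_effect_size` is hypothetical.

```python
from statistics import NormalDist


def detectable_effect_size(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized mean difference detectable between two equal groups.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2, solved for d.
    The alpha and power defaults are assumptions, not protocol values.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * (2 / n_per_group) ** 0.5


if __name__ == "__main__":
    # 126 participants split equally over three arms (assumed): 42 per arm.
    print(round(detectable_effect_size(126 // 3), 3))
```

Translating the resulting standardized difference into a percentage reduction in mean BDI score would additionally require the expected placebo-group mean and standard deviation, which the abstract does not report.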
- …