    Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

    Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, which estimates the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well suited to an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network's learned knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs. Comment: This paper is accepted by ECML-PKDD 201

    Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays

    During sleep and awake rest, the hippocampus replays sequences of place cells that have been activated during prior experiences. These have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna reinforcement learning algorithms use off-line replays to improve learning. Under a limited replay budget, a prioritized sweeping approach, which requires a model of the transitions to the predecessors, can be used to improve performance. We investigate whether such algorithms can explain the experimentally observed replays. We propose a neural network version of prioritized sweeping Q-learning, for which we developed a growing multiple expert algorithm, able to cope with multiple predecessors. The resulting architecture is able to improve the learning of simulated agents confronted with a navigation task. We predict that, in animals, learning the world model should occur during rest periods, and that the corresponding replays should be shuffled. Comment: Living Machines 2018 (Paris, France

    Revisiting the relationship between host attitudes and tourism development: A utility maximization approach

    Host attitudes toward tourists are critical to the sustainable development of the tourism industry. Although numerous studies have investigated host attitudes toward tourists and tourism development, theoretical support from an economic perspective is still underdeveloped in this field. By following social exchange theory and applying a utility maximization model, the current study not only explains Doxey’s Irridex model from an economic perspective but also complements the findings of the tourism area life cycle model proposed by Butler. Results show that the public resources at the destination, along with the ability of the local community to channel (foreign) tourism income into productivity advancement, influence the optimal level of tourism development in a destination.

    Context-aware Approach for Determining the Threshold Price in Name-Your-Own-Price Channels

    A key feature of a context-aware application is the ability to adapt to changes of context. Two approaches are widely used in this regard: context-action pair mapping, where developers match an action to execute for a particular context change, and adaptive learning, where a context-aware application refines its action over time based on the preceding action’s outcome. Both of these approaches have limitations that make them unsuitable in situations where a context-aware application has to deal with unknown context changes. In this paper we propose a framework where adaptation is carried out via concurrent multi-action evaluation of a dynamically created action space. This dynamic creation of the action space eliminates the need to rely on developers to create context-action pairs, and the concurrent multi-action evaluation reduces the adaptation time compared with the iterative approach used by adaptive learning techniques. Using our reference implementation of the framework, we show how it could be used to dynamically determine the threshold price in an e-commerce system that uses the name-your-own-price (NYOP) strategy.

    Electronic transport in polycrystalline graphene

    Most materials in available macroscopic quantities are polycrystalline. Graphene, a recently discovered two-dimensional form of carbon with strong potential for replacing silicon in future electronics, is no exception. There is growing evidence of the polycrystalline nature of graphene samples obtained using various techniques. Grain boundaries, intrinsic topological defects of polycrystalline materials, are expected to dramatically alter the electronic transport in graphene. Here, we develop a theory of charge carrier transmission through grain boundaries composed of a periodic array of dislocations in graphene based on the momentum conservation principle. Depending on the grain boundary structure we find two distinct transport behaviours - either high transparency, or perfect reflection of charge carriers over remarkably large energy ranges. First-principles quantum transport calculations are used to verify and further investigate this striking behaviour. Our study sheds light on the transport properties of large-area graphene samples. Furthermore, purposeful engineering of periodic grain boundaries with tunable transport gaps would allow for controlling charge currents without the need to introduce bulk band gaps in otherwise semimetallic graphene. The proposed approach can be regarded as a means towards building practical graphene electronics. Comment: accepted in Nature Material

    A multi-exon deletion within WWOX is associated with a 46,XY disorder of sex development

    Disorders of sex development (DSD) are congenital conditions where chromosomal, gonadal or genital development is atypical. In a significant proportion of 46,XY DSD cases it is not possible to identify a causative mutation, making genetic counseling difficult and potentially hindering optimal treatment. Here, we describe the analysis of a 46,XY DSD patient who presented at birth with ambiguous genitalia. Histological analysis of the surgically removed gonads showed bilateral undifferentiated gonadal tissue and immature testis, both containing malignant germ cells. We screened genomic DNA from this patient for deletions and duplications using an Illumina whole-genome SNP microarray. This analysis revealed a heterozygous deletion within the WWOX gene on chromosome 16, removing exons 6-8. Analysis of parental DNA showed that the deletion was inherited from the mother. cDNA analysis confirmed that the deletion maintained the reading frame, with exon 5 being spliced directly onto exon 9. This deletion is the first description of a germline rearrangement affecting the coding sequence of WWOX in humans. Previously described Wwox knockout mouse models showed gonadal abnormalities, supporting a role for WWOX in human gonad development.

    Training emergency services’ dispatchers to recognise stroke: an interrupted time-series analysis

    Background: Stroke is a time-dependent medical emergency in which early presentation to specialist care reduces death and dependency. Up to 70% of all stroke patients obtain first medical contact from the Emergency Medical Services (EMS). Identifying ‘true stroke’ from an EMS call is challenging, with over 50% of strokes being misclassified. The aim of this study was to evaluate the impact of a training package on the recognition of stroke by Emergency Medical Dispatchers (EMDs). Methods: This study took place in an ambulance service and a hospital in England using an interrupted time-series design. Suspected stroke patients were identified in one-week blocks, every three weeks over an 18-month period, during which time the training was implemented. Patients were included if they had a diagnosis of stroke (EMS or hospital). The effect of the intervention on the accuracy of dispatch diagnosis was investigated using binomial (grouped) logistic regression. Results: In the Pre-implementation period EMDs correctly identified 63% of stroke patients; this increased to 80% Post-implementation. This change was significant (p=0.003), reflecting an improvement in identifying stroke patients, relative to the Pre-implementation period, in both the During-implementation (OR=4.10 [95% CI 1.58 to 10.66]) and Post-implementation (OR=2.30 [95% CI 1.07 to 4.92]) periods. For patients with a final diagnosis of stroke who had been dispatched as stroke there was a marginally non-significant 2.8 minutes (95% CI −0.2 to 5.9 minutes, p=0.068) reduction between Pre- and Post-implementation periods from call to arrival of the ambulance at scene. Conclusions: This is the first study to develop, implement and evaluate the impact of a training package for EMDs with the aim of improving the recognition of stroke. Training led to a significant increase in the proportion of stroke patients dispatched as such by EMDs; a small reduction in time from call to arrival at scene by the ambulance also appeared likely. The training package has been endorsed by the UK Stroke Forum Education and Training, and is free to access on-line.

    Catastrophizing mediates the relationship between the personal belief in a just world and pain outcomes among chronic pain support group attendees

    Health-related research suggests the belief in a just world can act as a personal resource that protects against the adverse effects of pain and illness. However, little is currently known about how this belief, particularly in relation to one’s own life, might influence pain. Consistent with the suggestions of previous research, the present study undertook a secondary data analysis to investigate pain catastrophizing as a mediator of the relationship between the personal just world belief and chronic pain outcomes in a sample of chronic pain support group attendees. Partially supporting the hypotheses, catastrophizing was negatively correlated with the personal just world belief and mediated the relationship between this belief and pain and disability, but not distress. Suggestions for future research and intervention development are made.

    The International-Trade Network: Gravity Equations and Topological Properties

    This paper begins to explore the determinants of the topological properties of the international-trade network (ITN). We fit bilateral-trade flows using a standard gravity equation to build a "residual" ITN where trade-link weights are purged of geographical distance, size, border effects, trade agreements, and so on. We then compare the topological properties of the original and residual ITNs. We find that the residual ITN displays, unlike the original one, marked signatures of a complex system, and is characterized by a very different topological architecture. Whereas the original ITN is geographically clustered and organized around a few large-sized hubs, the residual ITN displays many small-sized but trade-oriented countries that, independently of their geographical position, either play the role of local hubs or attract large and rich countries in relatively complex trade-interaction patterns.

    “Even if You Know Everything You Can Forget”: Health Worker Perceptions of Mobile Phone Text-Messaging to Improve Malaria Case-Management in Kenya

    This paper presents the results of a qualitative study to investigate the perceptions and experiences of health workers involved in a cluster-randomized controlled trial of a novel intervention to improve health worker malaria case-management in 107 government health facilities in Kenya. The intervention involved sending text-messages about paediatric outpatient malaria case-management accompanied by “motivating” quotes to health workers’ mobile phones. Ten malaria messages were developed reflecting recommendations from the Kenyan national guidelines. Two messages were delivered per day for 5 working days and the process was repeated for 26 weeks (May to October 2009). The accompanying quotes were unique to each message. The intervention was delivered to 119 health workers and there were significant improvements in correct artemether-lumefantrine (AL) management both immediately after the intervention (November 2009) and 6 months later (May 2010). In-depth interviews with 24 health workers were undertaken to investigate the possible drivers of this change. The results suggest high acceptance of all components of the intervention, with the active delivery of information in an on-the-job setting, the ready availability of new and stored text messages and the perception of being kept ‘up to date’ as important factors influencing practice. Applying the construct of stages of change, we infer that in this intervention the SMS messages were operating primarily at the action and maintenance stages of behaviour change, achieving their effect by creating an enabling environment and providing a prompt to action for the implementation of case management practices that had already been accepted as the clinical norm by the health workers. Future trials testing the effectiveness of SMS reminders in creating an enabling environment for the establishment of new norms in clinical practice, as well as in providing a prompt to action for the implementation of the new case-management guidelines, are justified.