2,638 research outputs found

    Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups

    Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic alpha-beta search in games where good heuristic evaluations are difficult to obtain. In recent years, combining ideas from traditional minimax search in MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new way to use heuristic evaluations to guide the MCTS search by storing the two sources of information, estimated win rates and heuristic evaluations, separately. Rather than using the heuristic evaluations to replace the playouts, our technique backs them up implicitly during the MCTS simulations. These minimax values are then used to guide future simulations. We show that using implicit minimax backups leads to stronger play performance in Kalah, Breakthrough, and Lines of Action.
    Comment: 24 pages, 7 figures, 9 tables, expanded version of paper presented at IEEE Conference on Computational Intelligence and Games (CIG) 2014 conference

    Functional Haplotypes in the ADIPOQ Gene are Associated with Underweight, Immunosuppression and Viral Suppression in Kenyan HIV-1 Infected Antiretroviral Treatment Naive and Experienced Injection Substance Users

    BACKGROUND: Human immunodeficiency virus and injection substance use have an influence on genes and gene expression. These effects could be beneficial or detrimental in defining disease outcomes. Adiponectin gene is key in modulating metabolic and immunoregulatory functions. Understanding the effects of human immunodeficiency virus and injection substance use on the gene in the context of antiretroviral therapy is important for predicting disease outcomes.
    METHODS: This cross-sectional genetic study determined polymorphisms in the promoter region of adiponectin gene. Two variants were analyzed: rs2241766 and rs266729. Polymorphisms were associated with clinical markers of disease outcome; underweight, immunosuppression and viral suppression. The variants were genotyped via random fragment length polymorphism.
    RESULTS: GC haplotype was associated with higher odds of having underweight (OR, 2.21; 95% CI, 1.83-4.60; P=0.008 vs. OR, 2.30; 95% CI, 1.89-4.71; P=0.006) in antiretroviral treatment-naive and experienced injection substance users and immunosuppression (OR, 1.90; 95% CI, 1.67-3.98; P=0.041) in naive. Bonferroni correction revealed GC haplotype carriers only to have low body mass index in both naive (median, 14.8; IQR, 3.2 kg/m2; P<0.002) and experienced (median, 15.2; IQR, 3.2 kg/m2; P<0.002) injection substance users. Circulating total adiponectin levels were higher in naive (median, 19.5; IQR, 7.9 μg/ml) than experienced (median, 12.0; IQR, 4.4 μg/ml) injection substance users (P<0.0001). GC carriers presented with low serum adiponectin levels in both study groups.
    CONCLUSION: The study revealed haplotypes of adiponectin gene at loci rs2241766 and rs266729 that could determine disease outcomes in human immunodeficiency virus-1 antiretroviral treatment-naive and experienced injection substance users.

    Determinants of Non-Performing Loans in Uganda’s Commercial Banking Sector

    Over the past decade, Non-Performing Loans (NPLs) in Uganda’s commercial banking industry have exhibited a positive trend, in spite of the reforms undertaken in the industry. The continued increase in NPLs has not only affected credit growth, but also resulted in the collapse and closure of some commercial banks. Against this backdrop, it was necessary to understand the determinants of NPLs in Uganda’s commercial banking sector. To execute the study, quarterly data for the period 2002q1 to 2017q2 were analyzed using ARDL and bounds test techniques while controlling for both bank-specific and macroeconomic factors. The findings of the study indicate that NPLs increase with increases in lending rates, the real effective exchange rate, and the unemployment rate, while increases in return on assets and the GDP growth rate lower NPLs. Based on the findings, commercial banks are advised to diversify their asset portfolios by holding other income-earning assets, such as government bonds and equity, so as to reduce credit risk exposure. In addition, commercial banks need to focus more on internationally competitive sectors. Measures that reduce lending rates, promote GDP growth, and reduce unemployment would also serve to reduce NPLs.

    Irrational Axions as a Solution of The Strong CP Problem in an Eternal Universe

    We exhibit a novel solution of the strong CP problem, which does not involve any massless particles. The low energy effective Lagrangian of our model involves a discrete spacetime independent axion field which can be thought of as a parameter labeling a dense set of $\theta$ vacua. In the full theory this parameter is seen to be dynamical, and the model seeks the state of lowest energy, which has $\theta_{eff} = 0$. The processes which mediate transitions between $\theta$ vacua involve heavy degrees of freedom and are very slow. Consequently, we do not know whether our model can solve the strong CP problem in a universe which has been cool for only a finite time. We present several speculations about the cosmological evolution of our model.
    Comment: 12 pages

    The History and Risks of Reinforcement Learning and Human Feedback

    Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique to make large language models (LLMs) easier to use and more effective. A core piece of the RLHF process is the training and utilization of a model of human preferences that acts as a reward function for optimization. This approach, which operates at the intersection of many stakeholders and academic disciplines, remains poorly understood. RLHF reward models are often cited as being central to achieving performance, yet very few descriptors of capabilities, evaluations, training methods, or open-source models exist. Given this lack of information, further study and transparency is needed for learned RLHF reward models. In this paper, we illustrate the complex history of optimizing preferences, and articulate lines of inquiry to understand the sociotechnical context of reward models. In particular, we highlight the ontological differences between costs, rewards, and preferences at stake in RLHF's foundations, related methodological tensions, and possible research directions to improve general understanding of how reward models function.
    Comment: 14 pages, 3 figures