Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups
Monte Carlo Tree Search (MCTS) has improved the performance of game engines
in domains such as Go, Hex, and general game playing. MCTS has been shown to
outperform classic alpha-beta search in games where good heuristic evaluations
are difficult to obtain. In recent years, incorporating ideas from traditional
minimax search into MCTS has been shown to be advantageous in some domains, such
as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new
way to use heuristic evaluations to guide the MCTS search by storing the two
sources of information, estimated win rates and heuristic evaluations,
separately. Rather than using the heuristic evaluations to replace the
playouts, our technique backs them up implicitly during the MCTS simulations.
These minimax values are then used to guide future simulations. We show that
using implicit minimax backups leads to stronger play performance in Kalah,
Breakthrough, and Lines of Action.
Comment: 24 pages, 7 figures, 9 tables, expanded version of paper presented at the
IEEE Conference on Computational Intelligence and Games (CIG) 2014 conference
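The backup scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the negamax sign convention, the mixing weight `alpha`, and the exploration constant `c` are all assumptions introduced here for illustration. Each node keeps its playout statistics and its heuristic-derived minimax value in separate fields, and the minimax value is refreshed on the same pass that backs up the playout result.

```python
import math


class Node:
    """Tree node storing the two information sources separately (hypothetical layout)."""

    def __init__(self, heuristic, parent=None):
        # Static evaluation in [-1, 1], from the view of the player to move here.
        self.heuristic = heuristic
        self.parent = parent
        self.children = []
        self.visits = 0            # number of simulations through this node
        self.total_reward = 0.0    # summed playout results (player to move here)
        self.minimax = heuristic   # implicit minimax value, starts at the static eval
        if parent is not None:
            parent.children.append(self)


def backup(leaf, result):
    """Back up one playout result and refresh minimax values along the path.

    `result` is the playout outcome in [-1, 1] for the player to move at `leaf`.
    """
    node = leaf
    while node is not None:
        node.visits += 1
        node.total_reward += result
        if node.children:
            # Negamax: a child's value is negated to get the parent's view.
            node.minimax = max(-child.minimax for child in node.children)
        result = -result  # switch perspective one ply up
        node = node.parent


def score(parent, child, alpha=0.4, c=0.7):
    """UCB-style score that blends win-rate estimate and implicit minimax value.

    `alpha` and `c` are assumed tuning parameters, not values from the paper.
    """
    if child.visits == 0:
        return math.inf
    mean = -child.total_reward / child.visits          # parent's perspective
    blended = (1 - alpha) * mean + alpha * (-child.minimax)
    return blended + c * math.sqrt(math.log(parent.visits) / child.visits)


def select(parent, alpha=0.4, c=0.7):
    """Pick the child with the highest blended score."""
    return max(parent.children, key=lambda ch: score(parent, ch, alpha, c))
```

With `alpha = 0` the selection rule in this sketch reduces to plain UCB on win rates, and with `alpha = 1` it follows the backed-up heuristic values alone, which is one way to see how the two sources stay separable.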
Functional Haplotypes in the ADIPOQ Gene are Associated with Underweight, Immunosuppression and Viral Suppression in Kenyan HIV-1 Infected Antiretroviral Treatment Naive and Experienced Injection Substance Users
BACKGROUND: Human immunodeficiency virus and injection substance use have an influence on genes and gene expression. These effects could be beneficial or detrimental in defining disease outcomes. The adiponectin gene is key in modulating metabolic and immunoregulatory functions. Understanding the effects of human immunodeficiency virus and injection substance use on the gene in the context of antiretroviral therapy is important for predicting disease outcomes. METHODS: This cross-sectional genetic study determined polymorphisms in the promoter region of the adiponectin gene. Two variants were analyzed: rs2241766 and rs266729. Polymorphisms were tested for association with clinical markers of disease outcome: underweight, immunosuppression and viral suppression. The variants were genotyped via restriction fragment length polymorphism. RESULTS: The GC haplotype was associated with higher odds of being underweight (OR, 2.21; 95% CI, 1.83-4.60; P=0.008 vs. OR, 2.30; 95% CI, 1.89-4.71; P=0.006) in antiretroviral treatment-naive and -experienced injection substance users, and with higher odds of immunosuppression (OR, 1.90; 95% CI, 1.67-3.98; P=0.041) in naive users. Bonferroni correction revealed GC haplotype carriers only to have low body mass index in both naive (median, 14.8; IQR, 3.2 kg/m2; P<0.002) and experienced (median, 15.2; IQR, 3.2 kg/m2; P<0.002) injection substance users. Circulating total adiponectin levels were higher in naive (median, 19.5; IQR, 7.9 μg/ml) than in experienced (median, 12.0; IQR, 4.4 μg/ml) injection substance users (P<0.0001). GC carriers presented with low serum adiponectin levels in both study groups. CONCLUSION: The study revealed haplotypes of the adiponectin gene at loci rs2241766 and rs266729 that could determine disease outcomes in human immunodeficiency virus-1 antiretroviral treatment-naive and -experienced injection substance users.
Determinants of Non-Performing Loans in Uganda’s Commercial Banking Sector
Over the past decade, Non-Performing Loans (NPLs) in Uganda’s commercial banking industry have trended upward, in spite of the reforms undertaken in the industry. The continued increase in NPLs has not only affected credit growth, but also resulted in the collapse and closure of some commercial banks. Against this backdrop, it was necessary to understand the determinants of NPLs in Uganda’s commercial banking sector. To execute the study, quarterly data for the period 2002q1 to 2017q2 were analyzed using autoregressive distributed lag (ARDL) and bounds test techniques while controlling for both bank-specific and macroeconomic factors. The findings of the study indicate that NPLs increase with increases in lending rates, the real effective exchange rate and the unemployment rate, while increases in return on assets and the GDP growth rate lower NPLs. Based on the findings, commercial banks are advised to diversify their asset portfolios by holding other income-earning assets, such as government bonds and equity, so as to reduce credit risk exposure. In addition, commercial banks need to focus more on internationally competitive sectors. Measures that reduce lending rates, promote GDP growth and reduce unemployment would also serve to reduce NPLs.
Irrational Axions as a Solution of The Strong CP Problem in an Eternal Universe
We exhibit a novel solution of the strong CP problem, which does not involve
any massless particles. The low energy effective Lagrangian of our model
involves a discrete spacetime independent axion field which can be thought of
as a parameter labeling a dense set of vacua. In the full theory this
parameter is seen to be dynamical, and the model seeks the state of lowest
energy, which has . The processes which mediate transitions
between vacua involve heavy degrees of freedom and are very slow.
Consequently, we do not know whether our model can solve the strong CP problem
in a universe which has been cool for only a finite time. We present several
speculations about the cosmological evolution of our model.
Comment: 12 pages
The History and Risks of Reinforcement Learning and Human Feedback
Reinforcement learning from human feedback (RLHF) has emerged as a powerful
technique to make large language models (LLMs) easier to use and more
effective. A core piece of the RLHF process is the training and utilization of
a model of human preferences that acts as a reward function for optimization.
This approach, which operates at the intersection of many stakeholders and
academic disciplines, remains poorly understood. RLHF reward models are often
cited as being central to achieving performance, yet very few descriptors of
capabilities, evaluations, training methods, or open-source models exist. Given
this lack of information, further study and transparency are needed for learned
RLHF reward models. In this paper, we illustrate the complex history of
optimizing preferences, and articulate lines of inquiry to understand the
sociotechnical context of reward models. In particular, we highlight the
ontological differences between costs, rewards, and preferences at stake in
RLHF's foundations, related methodological tensions, and possible research
directions to improve general understanding of how reward models function.
Comment: 14 pages, 3 figures