Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups
Monte Carlo Tree Search (MCTS) has improved the performance of game engines
in domains such as Go, Hex, and general game playing. MCTS has been shown to
outperform classic alpha-beta search in games where good heuristic evaluations
are difficult to obtain. In recent years, combining ideas from traditional
minimax search in MCTS has been shown to be advantageous in some domains, such
as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new
way to use heuristic evaluations to guide the MCTS search by storing the two
sources of information, estimated win rates and heuristic evaluations,
separately. Rather than using the heuristic evaluations to replace the
playouts, our technique backs them up implicitly during the MCTS simulations.
These minimax values are then used to guide future simulations. We show that
using implicit minimax backups leads to stronger play performance in Kalah,
Breakthrough, and Lines of Action.
Comment: 24 pages, 7 figures, 9 tables, expanded version of paper presented at
IEEE Conference on Computational Intelligence and Games (CIG) 2014 conference
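A minimal sketch of how the idea in the abstract above could look in code, not the authors' implementation. It assumes a negamax sign convention (every value in a node is from the viewpoint of the player who moved into it), a linear blend of win rate and backed-up heuristic weighted by a hypothetical parameter `alpha`, and a UCB-style exploration term with constant `c`. The abstract does not specify any of these details, and all names (`Node`, `select_child`, `backup`) are illustrative.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Static heuristic evaluation of this position, assumed to lie in [-1, 1],
    # from the viewpoint of the player who moved into this node.
    heuristic: float
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total_reward: float = 0.0                 # sum of playout rewards (win-rate source)
    minimax_value: float = field(default=0.0, init=False)  # backed-up heuristic source

    def __post_init__(self) -> None:
        # Until children are evaluated, the minimax value is the node's own heuristic.
        self.minimax_value = self.heuristic

    def mean_reward(self) -> float:
        return self.total_reward / self.visits if self.visits else 0.0


def select_child(parent: Node, alpha: float = 0.4, c: float = 1.0) -> Node:
    """Pick the child maximising a blend of win rate and minimax value (assumed rule)."""
    def score(child: Node) -> float:
        if child.visits == 0:
            return math.inf                   # try unvisited children first
        blended = (1 - alpha) * child.mean_reward() + alpha * child.minimax_value
        return blended + c * math.sqrt(math.log(parent.visits) / child.visits)
    return max(parent.children, key=score)


def backup(path: List[Node], reward: float) -> None:
    """Update both information sources along a root-to-leaf path.

    `reward` is the playout result from the viewpoint of the player who
    moved into the leaf; the sign flips at each level on the way up.
    """
    for node in reversed(path):
        node.visits += 1
        node.total_reward += reward
        if node.children:
            # Children hold values from the mover's viewpoint at this node,
            # so the mover takes the max and the node stores its negation.
            node.minimax_value = -max(ch.minimax_value for ch in node.children)
        reward = -reward
```

The two sources stay in separate fields and are combined only at selection time, so playouts are still run and the heuristic merely biases which branches get simulated.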
Novel online data allocation for hybrid memories on tele-health systems
The development of wearable devices such as Body Sensor Networks (BSNs) has greatly improved the capability of the tele-health industry. Large amounts of data are collected from every local BSN in real time. These data are processed by embedded systems, including smartphones and tablets, and then transferred to distributed storage systems for further processing. Traditional on-chip SRAMs cause critical power leakage issues and occupy relatively large chip areas. Therefore, hybrid memories, which combine volatile memories with non-volatile memories, are widely adopted to reduce latency and energy cost on multi-core systems. However, most current work addresses static data allocation for hybrid memories, and those mechanisms cannot improve data placement in real time. Hence, we propose online data allocation for hybrid memories on embedded tele-health systems. In this paper, we present dynamic programming and heuristic approaches. Considering the difference between profiled data access and actual data access, the proposed algorithms use a feedback mechanism to improve the accuracy of data allocation during runtime. Experimental results demonstrate that, compared to greedy approaches, the proposed algorithms achieve 20%-40% performance improvement across different benchmarks.
This work is supported by NSF CNS-1457506 and NSF CNS-1359557.
Chen, L.; Qiu, M.; Dai, W.; Hassan Mohamed, H. (2017). Novel online data allocation for hybrid memories on tele-health systems. Microprocessors and Microsystems. 52:391-400. https://doi.org/10.1016/j.micpro.2016.08.003
Combination Strategies for Semantic Role Labeling
This paper introduces and analyzes a battery of inference models for the
problem of semantic role labeling: one based on constraint satisfaction, and
several strategies that model the inference as a meta-learning problem using
discriminative classifiers. These classifiers are developed with a rich set of
novel features that encode proposition and sentence-level information. To our
knowledge, this is the first work that: (a) performs a thorough analysis of
learning-based inference models for semantic role labeling, and (b) compares
several inference strategies in this context. We evaluate the proposed
inference strategies in the framework of the CoNLL-2005 shared task using only
automatically-generated syntactic information. The extensive experimental
evaluation and analysis indicate that all the proposed inference strategies
are successful (they all outperform the current best results reported in the
CoNLL-2005 evaluation exercise), but each of the proposed approaches has its
advantages and disadvantages. Several important traits of a state-of-the-art
SRL combination strategy emerge from this analysis: (i) individual models
should be combined at the granularity of candidate arguments rather than at the
granularity of complete solutions; (ii) the best combination strategy uses an
inference model based on learning; and (iii) the learning-based inference
benefits from max-margin classifiers and global feedback.
Power management optimisation for hybrid electric systems using reinforcement learning and adaptive dynamic programming
This paper presents an online learning scheme based on reinforcement learning and adaptive dynamic programming for the power management of hybrid electric systems. Current methods for power management are conservative and unable to fully account for variations in the system due to changes in the health and operational conditions. These conservative schemes result in less efficient use of available power sources, increasing the overall system costs and heightening the risk of failure due to the variations. The proposed scheme is able to compensate for modelling uncertainties and the gradual system variations by adapting its performance function using the observed system measurements as reinforcement signals. The reinforcement signals are nonlinear, and consequently neural networks are employed in the implementation of the scheme. Simulation results for the power management of an autonomous hybrid system show improved system performance using the proposed scheme as compared with a conventional offline dynamic programming approach.
Issues on Stability of ADP Feedback Controllers for Dynamical Systems
This paper traces the development of neural-network (NN)-based feedback controllers that are derived from the principle of adaptive/approximate dynamic programming (ADP) and discusses their closed-loop stability. Different versions of NN structures in the literature, which embed mathematical mappings related to solutions of the ADP-formulated problems called “adaptive critics” or “action-critic” networks, are discussed. The distinction between the two classes of ADP applications is pointed out. Furthermore, papers on “model-free” development and model-based neurocontrollers are reviewed in terms of their contributions to stability issues. Recent literature suggests that work on ADP-based feedback controllers with assured stability is growing in diverse forms.