An Approximate Subgame-Perfect Equilibrium Computation Technique for Repeated Games
This paper presents a technique for approximating, up to any precision, the
set of subgame-perfect equilibria (SPE) in discounted repeated games. The
process starts with a single hypercube approximation of the set of SPE. Then
the initial hypercube is gradually partitioned into a set of smaller adjacent
hypercubes, while those hypercubes that cannot contain any point belonging to
the set of SPE are simultaneously withdrawn.
Whether a given hypercube can contain an equilibrium point is verified by an
appropriate mathematical program. Three different formulations of the algorithm
for both approximately computing the set of SPE payoffs and extracting players'
strategies are then proposed: the first two do not assume any external
coordination between players, while the third assumes a certain level of
coordination during game play for convexifying the set of continuation payoffs
after any repeated-game history.
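The partition-and-prune loop described above can be illustrated with a toy refinement routine in which a simple geometric membership test stands in for the paper's mathematical program; all names and the target set below are illustrative assumptions, not the paper's construction:

```python
def split(box):
    """Halve a box along its widest dimension."""
    lo, hi = box
    d = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[d] + hi[d]) / 2.0
    hi_left = list(hi); hi_left[d] = mid
    lo_right = list(lo); lo_right[d] = mid
    return (lo, hi_left), (lo_right, hi)

def approximate(initial_box, may_contain, rounds):
    """Each round splits every surviving box and withdraws the halves
    that the feasibility oracle proves cannot contain a target point."""
    boxes = [initial_box]
    for _ in range(rounds):
        boxes = [half for box in boxes for half in split(box)
                 if may_contain(half)]
    return boxes

# Toy target set standing in for the (unknown) SPE payoff set: the quarter
# disc x^2 + y^2 <= 1 inside the initial box [0,1]^2. A box in the positive
# quadrant intersects the disc iff its corner nearest the origin does.
def may_contain(box):
    lo, _ = box
    return lo[0] ** 2 + lo[1] ** 2 <= 1.0

boxes = approximate(([0.0, 0.0], [1.0, 1.0]), may_contain, 6)
```

After six rounds the surviving boxes form an outer approximation of the quarter disc whose total area shrinks toward pi/4 as the refinement deepens, mirroring how the paper's hypercubes converge on the SPE payoff set.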
Special attention is paid to the question of extracting players' strategies
and their representability in the form of finite automata, an important feature
for artificial agent systems.
Comment: 26 pages, 13 figures, 1 table
Semiconductor-enriched single wall carbon nanotube networks applied to field effect transistors
Substantial progress on field effect transistors (FETs) consisting of
semiconducting single wall carbon nanotubes (s-SWNTs) without detectable traces
of metallic nanotubes and impurities is reported. Nearly perfect removal of
metallic nanotubes is confirmed by optical absorption, Raman measurements, and
electrical measurements. This outstanding result was made possible in
particular by ultracentrifugation (150 000 g) of solutions prepared from SWNT
powders using polyfluorene as an extracting agent in toluene. Such processable
s-SWNT solutions were applied to realize FETs embodying randomly or
preferentially oriented nanotube networks prepared by spin coating or
dielectrophoresis. Devices exhibit stable p-type semiconductor behavior in air
with very promising characteristics: the on/off current ratio is 10^5, the
on-current level is around 10 μA, and the estimated hole mobility is larger
than 2 cm^2/(V s).
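Hole mobilities of this kind are commonly estimated with the standard linear-regime transconductance method. A minimal sketch of that calculation follows; every device parameter below (channel geometry, oxide thickness, transconductance) is a hypothetical placeholder, not a value from the paper:

```python
def mobility_cm2_per_Vs(g_m, L, W, C_ox, V_ds):
    """Linear-regime FET mobility: mu = g_m * L / (W * C_ox * |V_ds|).
    Inputs in SI units; result converted to cm^2/(V s)."""
    mu_si = g_m * L / (W * C_ox * abs(V_ds))  # m^2 / (V s)
    return mu_si * 1e4                        # -> cm^2 / (V s)

# Placeholder geometry: 10 um x 10 um channel over a 300 nm SiO2 back gate.
eps0, eps_r, t_ox = 8.854e-12, 3.9, 300e-9
C_ox = eps0 * eps_r / t_ox                    # gate capacitance, F/m^2
mu = mobility_cm2_per_Vs(g_m=1e-6, L=10e-6, W=10e-6, C_ox=C_ox, V_ds=-1.0)
```

The parallel-plate estimate of C_ox used here ignores the quantum capacitance and sparse-network corrections that matter for real nanotube films, so it should be read only as the textbook form of the extraction.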
Feature Reinforcement Learning: Part I: Unstructured MDPs
General-purpose, intelligent, learning agents cycle through sequences of
observations, actions, and rewards that are complex, uncertain, unknown, and
non-Markovian. On the other hand, reinforcement learning is well-developed for
small finite state Markov decision processes (MDPs). Up to now, extracting the
right state representations out of bare observations, that is, reducing the
general agent setup to the MDP framework, is an art that involves significant
effort by designers. The primary goal of this work is to automate the reduction
process and thereby significantly expand the scope of many existing
reinforcement learning algorithms and the agents that employ them. Before we
can think of mechanizing this search for suitable MDPs, we need a formal
objective criterion. The main contribution of this article is to develop such a
criterion. I also integrate the various parts into one learning algorithm.
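The article develops its own formal criterion in the text; purely as a hedged illustration of what such an objective can look like (a generic stand-in, not the article's definition), one can score a candidate state map `phi` by the sequential code length of its induced predictions: maps that make the process look Markovian compress the data better.

```python
import math
from collections import defaultdict

def cost(phi, obs, num_obs):
    """Sequential code length (bits) of o_{t+1} given s_t = phi(o_t),
    using Laplace-smoothed transition counts; lower = more Markovian."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    bits = 0.0
    for o, o_next in zip(obs, obs[1:]):
        s = phi(o)
        p = (counts[s][o_next] + 1) / (totals[s] + num_obs)
        bits += -math.log2(p)
        counts[s][o_next] += 1
        totals[s] += 1
    return bits

# Deterministically alternating observations: the identity map makes the
# next observation predictable from the state, while merging everything
# into one state destroys predictability (about one bit per step).
obs = [0, 1] * 200
good = cost(lambda o: o, obs, num_obs=2)  # identity map: cheap to code
bad = cost(lambda o: 0, obs, num_obs=2)   # merge-all map: expensive
```

Comparing `good` and `bad` shows the intended behavior of an objective of this kind: it separates state maps that expose Markovian structure from those that throw it away, which is the selection problem the article's criterion formalizes.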
Extensions to more realistic dynamic Bayesian networks are developed in Part
II. The role of POMDPs is also considered there.
Comment: 24 LaTeX pages, 5 diagrams
Learning to Extract Coherent Summary via Deep Reinforcement Learning
Coherence plays a critical role in producing a high-quality summary from a
document. In recent years, neural extractive summarization has become
increasingly attractive. However, most existing models ignore the coherence of
the summary when extracting sentences. As an effort towards extracting coherent
summaries, we propose a neural coherence model to capture the cross-sentence
semantic and syntactic coherence patterns. The proposed neural coherence model
obviates the need for feature engineering and can be trained in an end-to-end
fashion using unlabeled data. Empirical results show that the proposed neural
coherence model can efficiently capture the cross-sentence coherence patterns.
Using the combined output of the neural coherence model and ROUGE package as
the reward, we design a reinforcement learning method to train a proposed
neural extractive summarizer which is named Reinforced Neural Extractive
Summarization (RNES) model. The RNES model learns to optimize coherence and
informative importance of the summary simultaneously. Experimental results show
that the proposed RNES outperforms existing baselines and achieves
state-of-the-art performance in terms of ROUGE on the CNN/Daily Mail dataset. The
qualitative evaluation indicates that summaries produced by RNES are more
coherent and readable.
Comment: 8 pages, 1 figure, presented at AAAI-201
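The training idea above can be sketched with plain REINFORCE: a Bernoulli selection policy over sentences, trained on a reward mixing an informativeness term and a coherence term. Everything below is a hypothetical toy — `rouge_like` and `coherence_like` stand in for the ROUGE package and the neural coherence model, and the logit vector stands in for the neural extractor:

```python
import math
import random

random.seed(0)
N = 4                    # sentences in a toy document
SALIENT = {0, 1}         # sentences a reference summary would cover (toy)

def rouge_like(sel):     # stand-in informativeness term, not real ROUGE
    return len(set(sel) & SALIENT) - 0.6 * len(sel)

def coherence_like(sel): # stand-in coherence term: rewards adjacent picks
    return sum(0.5 for a, b in zip(sel, sel[1:]) if b == a + 1)

def reward(sel):
    return rouge_like(sel) + coherence_like(sel)

logits, baseline, lr = [0.0] * N, 0.0, 0.1
for _ in range(3000):
    probs = [1 / (1 + math.exp(-x)) for x in logits]
    picks = [i for i, p in enumerate(probs) if random.random() < p]
    r = reward(picks)
    adv = r - baseline                      # advantage vs running baseline
    baseline += 0.05 * adv
    for i, p in enumerate(probs):           # REINFORCE: adv * grad log pi
        logits[i] += lr * adv * ((1.0 if i in picks else 0.0) - p)

final = [1 / (1 + math.exp(-x)) for x in logits]
```

After training, the policy concentrates on the salient, adjacent sentences 0 and 1, illustrating how a combined informativeness-plus-coherence reward shapes the extractor without any direct supervision on which sentences to pick.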
