Laplace's rule of succession in information geometry
Laplace's "add-one" rule of succession modifies the observed frequencies in a
sequence of heads and tails by adding one to the observed counts. This improves
prediction by avoiding zero probabilities and corresponds to a uniform Bayesian
prior on the parameter. The canonical Jeffreys prior corresponds to the
"add-one-half" rule. We prove that, for exponential families of distributions,
such Bayesian predictors can be approximated by taking the average of the
maximum likelihood predictor and the sequential normalized maximum
likelihood predictor from information theory. Thus in this case it is possible
to approximate Bayesian predictors without the cost of integrating or sampling
in parameter space.
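As a small illustration of the "add-one" and "add-one-half" rules mentioned in this abstract, the sketch below computes the predicted probability of heads for a coin after observing some counts. The function name `additive_predictor` and the parameter `alpha` are assumptions made for this example and do not come from the paper. The sketch covers only the additive rules, not the averaged maximum likelihood and sequential normalized maximum likelihood predictor the paper studies.

```python
from fractions import Fraction


def additive_predictor(heads, tails, alpha):
    """Probability that the next toss is heads under the add-alpha rule.

    alpha = 1 is Laplace's add-one rule (uniform prior);
    alpha = 1/2 is the add-one-half rule (Jeffreys prior).
    """
    alpha = Fraction(alpha)
    return (heads + alpha) / (heads + tails + 2 * alpha)


# After three heads and no tails, the raw frequency predicts heads with
# probability 1, which leaves zero probability for tails.
frequency = Fraction(3, 3)
laplace = additive_predictor(3, 0, 1)                # (3 + 1) / (3 + 2)
jeffreys = additive_predictor(3, 0, Fraction(1, 2))  # (3 + 1/2) / (3 + 1)
```

Both smoothed predictors stay strictly below 1, so an unseen outcome keeps nonzero probability.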
Adaptive Regret Minimization in Bounded-Memory Games
Online learning algorithms that minimize regret provide strong guarantees in
situations that involve repeatedly making decisions in an uncertain
environment, e.g. a driver deciding what route to drive to work every day.
While regret minimization has been extensively studied in repeated games, we
study regret minimization for a richer class of games called bounded memory
games. In each round of a two-player bounded memory-m game, both players
simultaneously play an action, observe an outcome and receive a reward. The
reward may depend on the last m outcomes as well as the actions of the players
in the current round. The standard notion of regret for repeated games is no
longer suitable because actions and rewards can depend on the history of play.
To account for this generality, we introduce the notion of k-adaptive regret,
which compares the reward obtained by playing actions prescribed by the
algorithm against a hypothetical k-adaptive adversary with the reward obtained
by the best expert in hindsight against the same adversary. Roughly, a
hypothetical k-adaptive adversary adapts her strategy to the defender's actions
exactly as the real adversary would within each window of k rounds. Our
definition is parametrized by a set of experts, which can include both fixed
and adaptive defender strategies.
We investigate the inherent complexity of and design algorithms for adaptive
regret minimization in bounded memory games of perfect and imperfect
information. We prove a hardness result showing that, with imperfect
information, any k-adaptive regret minimizing algorithm (with fixed strategies
as experts) must be inefficient unless NP=RP even when playing against an
oblivious adversary. In contrast, for bounded memory games of perfect and
imperfect information we present approximate 0-adaptive regret minimization
algorithms against an oblivious adversary running in time n^{O(1)}.
Comment: Full Version. GameSec 2013 (Invited Paper
An efficient algorithm for learning with semi-bandit feedback
We consider the problem of online combinatorial optimization under
semi-bandit feedback. The goal of the learner is to sequentially select its
actions from a combinatorial decision set so as to minimize its cumulative
loss. We propose a learning algorithm for this problem based on combining the
Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss
estimation procedure called Geometric Resampling (GR). Contrary to previous
solutions, the resulting algorithm can be efficiently implemented for any
decision set where efficient offline combinatorial optimization is possible at
all. Assuming that the elements of the decision set can be described with
d-dimensional binary vectors with at most m non-zero entries, we show that the
expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a
side result, we also improve the best known regret bounds for FPL in the full
information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m)
over previous bounds for this algorithm.
Comment: submitted to ALT 201
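The combination of Follow-the-Perturbed-Leader with Geometric Resampling described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the decision set is taken to be all subsets of exactly m out of d components (so the offline optimizer is a sort), the perturbations are exponential, and the names `fpl_action`, `geometric_resampling`, `run_fpl_gr`, `eta`, and `max_draws` are hypothetical. The idea illustrated is that the number of fresh perturbed draws needed until a played component is selected again serves as an estimate of the inverse of its selection probability.

```python
import random


def fpl_action(cum_est, eta, m, rng):
    """Pick the m components with the smallest perturbed estimated loss."""
    # Assumption: i.i.d. exponential perturbations, one per component.
    scores = [eta * est - rng.expovariate(1.0) for est in cum_est]
    order = sorted(range(len(cum_est)), key=scores.__getitem__)
    return set(order[:m])


def geometric_resampling(cum_est, eta, m, rng, played, max_draws):
    """For each played component, count fresh FPL draws until it is picked again."""
    waits = {i: max_draws for i in played}  # capped if never picked again
    pending = set(played)
    for k in range(1, max_draws + 1):
        if not pending:
            break
        hits = pending & fpl_action(cum_est, eta, m, rng)
        for i in hits:
            waits[i] = k
        pending -= hits
    return waits


def run_fpl_gr(losses, m, eta, max_draws, seed=0):
    """Play FPL over a loss sequence, updating estimates from semi-bandit feedback."""
    rng = random.Random(seed)
    cum_est = [0.0] * len(losses[0])
    total = 0.0
    for loss in losses:
        played = fpl_action(cum_est, eta, m, rng)
        total += sum(loss[i] for i in played)
        waits = geometric_resampling(cum_est, eta, m, rng, played, max_draws)
        for i in played:  # only the played components are observed
            cum_est[i] += waits[i] * loss[i]
    return total, cum_est
```

Calling `run_fpl_gr(losses, m=2, eta=0.5, max_draws=20)` on a list of per-round loss vectors returns the cumulative loss suffered and the final loss estimates. No integration over the action distribution is needed, only repeated calls to the offline optimizer.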
Effects of degenerate orbitals on the Hubbard model
Stability of a metallic state in the two-orbital Hubbard model at
half-filling is investigated. We clarify how spin and orbital fluctuations are
enhanced to stabilize the formation of quasi-particles by combining dynamical
mean field theory with quantum Monte Carlo simulations. These analyses shed
some light on why the metallic phase is particularly stable when the
intra- and inter-band Coulomb interactions are nearly equal.
Comment: 3 pages, To appear in JPSJ Vol. 72, No. 5 200
Microscopic Approach to Magnetism and Superconductivity of f-Electron Systems with Filled Skutterudite Structure
In order to gain a deep insight into f-electron properties of filled
skutterudite compounds from a microscopic viewpoint, we investigate the
multiorbital Anderson model including Coulomb interactions, spin-orbit
coupling, and crystalline electric field effect. For each case of
n = 1 to 13, where n is the number of f electrons per rare-earth ion, the
model is analyzed by using the numerical renormalization group (NRG) method to
evaluate the magnetic susceptibility and entropy of the f electrons. As a
further step toward a simplified model that can be treated even in a
periodic system, we also analyze the Anderson model constructed based on the
j-j coupling scheme by using the NRG method. Then, we construct an orbital
degenerate Hubbard model based on the j-j coupling scheme to investigate
the mechanism of superconductivity of filled skutterudites. In the 2-site
model, we carefully evaluate the superconducting pair susceptibility for the
case of n = 2 and find that the susceptibility for the off-site Cooper pair is
clearly enhanced only in a transition region in which the singlet and triplet
ground states are interchanged.
Comment: 14 pages, 11 figures, Typeset with jpsj2.cl