Transmit without regrets: Online optimization in MIMO-OFDM cognitive radio systems
In this paper, we examine cognitive radio systems that evolve dynamically
over time due to changing user and environmental conditions. To combine the
advantages of orthogonal frequency division multiplexing (OFDM) and
multiple-input, multiple-output (MIMO) technologies, we consider a MIMO-OFDM
cognitive radio network where wireless users with multiple antennas communicate
over several non-interfering frequency bands. As the network's primary users
(PUs) come and go in the system, the communication environment changes
constantly (and, in many cases, randomly). Accordingly, the network's
unlicensed, secondary users (SUs) must adapt their transmit profiles "on the
fly" in order to maximize their data rate in a rapidly evolving environment
over which they have no control. In this dynamic setting, static solution
concepts (such as Nash equilibrium) are no longer relevant, so we focus on
dynamic transmit policies that lead to no regret: specifically, we consider
policies that perform at least as well as (and typically outperform) even the
best fixed transmit profile in hindsight. Drawing on the method of matrix
exponential learning and online mirror descent techniques, we derive a
no-regret transmit policy for the system's SUs which relies only on local
channel state information (CSI). Using this method, the system's SUs are able
to track their individually evolving optimum transmit profiles remarkably well,
even under rapidly (and randomly) changing conditions. Importantly, the
proposed augmented exponential learning (AXL) policy leads to no regret even if
the SUs' channel measurements are subject to arbitrarily large observation
errors (the imperfect CSI case), thus ensuring the method's robustness in the
presence of uncertainties.Comment: 25 pages, 3 figures, to appear in the IEEE Journal on Selected Areas
in Communication
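The exponential learning update can be illustrated in its simplest, diagonal form: an SU allocating a power budget across K non-interfering bands by exponentiating cumulated (possibly noisy) rate gradients. This is a minimal sketch of the general idea, not the paper's full matrix-valued AXL policy; the function name, step size, and gradient values below are illustrative assumptions.

```python
import math

def exponential_learning_power(rate_gradients, power_budget=1.0, eta=0.5):
    """Diagonal sketch of exponential learning: scores accumulate
    observed rate gradients per band; powers are the exponential map
    of the scores, normalized to the total power budget."""
    K = len(rate_gradients[0])
    scores = [0.0] * K
    powers = [power_budget / K] * K
    for grad in rate_gradients:  # one (possibly noisy) gradient per round
        scores = [y + eta * g for y, g in zip(scores, grad)]
        z = sum(math.exp(y) for y in scores)
        powers = [power_budget * math.exp(y) / z for y in scores]
    return powers

# Toy run: band 0 consistently offers the larger marginal rate,
# so the policy concentrates nearly all power there over time.
p = exponential_learning_power([[1.0, 0.2]] * 20)
```

The normalization keeps the allocation on the power simplex at every round, which is what makes the update a mirror-descent step with an entropic regularizer.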
Sparse Stochastic Bandits
In the classical multi-armed bandit problem, d arms are available to the
decision maker who pulls them sequentially in order to maximize his cumulative
reward. Guarantees can be obtained on a relative quantity called regret, which
scales linearly with d (or with sqrt(d) in the minimax sense). We here consider
the sparse case of this classical problem in the sense that only a small number
of arms, namely s < d, have a positive expected reward. We are able to leverage
this additional assumption to provide an algorithm whose regret scales with s
instead of d. Moreover, we prove that this algorithm is optimal by providing a
matching lower bound - at least for a wide and pertinent range of parameters
that we determine - and by evaluating its performance on simulated data.
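The classical setting the abstract starts from can be sketched with a plain UCB1 simulation on a sparse-style instance where only one arm has positive mean; this is the baseline algorithm, not the paper's sparsity-adapted one, and the arm means and horizon below are illustrative.

```python
import math
import random

def ucb1(means, horizon, rng):
    """UCB1 on Bernoulli arms: pull each arm once, then repeatedly pull
    the arm maximizing empirical mean + confidence radius."""
    d = len(means)
    counts = [0] * d
    sums = [0.0] * d
    for t in range(horizon):
        if t < d:
            arm = t  # initial pull of every arm
        else:
            arm = max(range(d), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t + 1) / counts[i]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

rng = random.Random(0)
# Sparse-style instance: only one of five arms has positive mean (s=1, d=5).
counts = ucb1([0.5, 0.0, 0.0, 0.0, 0.0], horizon=500, rng=rng)
```

Even here, each zero-mean arm still gets a logarithmic number of exploratory pulls, which is the per-arm cost that makes the regret of generic algorithms scale with d rather than s.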
Approximate Convex Optimization by Online Game Playing
Lagrangian relaxation and approximate optimization algorithms have received
much attention in the last two decades. Typically, the running time of these
methods to obtain an ε-approximate solution is proportional to
1/ε^2. Recently, Bienstock and Iyengar, following Nesterov,
gave an algorithm for fractional packing linear programs which runs in
O*(1/ε) iterations. The latter algorithm requires solving a
convex quadratic program every iteration - an optimization subroutine which
dominates the theoretical running time.
We give an algorithm for convex programs with strictly convex constraints
which runs in time proportional to 1/ε. The algorithm does NOT
require solving any quadratic program, but uses gradient steps and elementary
operations only. Problems which have strictly convex constraints include
maximum entropy frequency estimation, portfolio optimization with loss risk
constraints, and various computational problems in signal processing.
As a side product, we also obtain a simpler version of Bienstock and
Iyengar's result for general linear programming, with similar running time.
We derive these algorithms using a new framework for deriving convex
optimization algorithms from online game playing algorithms, which may be of
independent interest.
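The core primitive behind such frameworks, an online player whose cumulative loss approaches that of the best fixed decision in hindsight, can be illustrated with plain online gradient descent on a stream of convex losses. This is a toy sketch of the online-learning building block, not the paper's algorithm; the loss family, step size, and domain are illustrative assumptions.

```python
def ogd_regret(targets, eta=0.1, lo=0.0, hi=1.0):
    """Online gradient descent on losses f_t(x) = (x - z_t)^2 over [lo, hi].
    Returns (player's total loss, total loss of the best fixed x in hindsight)."""
    x, total = 0.5 * (lo + hi), 0.0
    for z in targets:
        total += (x - z) ** 2
        x -= eta * 2 * (x - z)          # gradient step on f_t
        x = min(hi, max(lo, x))         # projection back onto [lo, hi]
    best = sum(targets) / len(targets)  # minimizer of the summed squared losses
    best_total = sum((best - z) ** 2 for z in targets)
    return total, best_total

# Two-phase stream: the online player tracks each phase, while the best
# fixed point must compromise between them.
player, best = ogd_regret([0.9] * 50 + [0.1] * 50)
```

Turning such a low-regret player against the constraints of a convex program is exactly the game-playing reduction the abstract refers to: the player's average iterate becomes an approximately feasible, approximately optimal solution.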
Decomposition Strategies for Constructive Preference Elicitation
We tackle the problem of constructive preference elicitation, that is the
problem of learning user preferences over very large decision problems,
involving a combinatorial space of possible outcomes. In this setting, the
suggested configuration is synthesized on-the-fly by solving a constrained
optimization problem, while the preferences are learned iteratively by
interacting with the user. Previous work has shown that Coactive Learning is a
suitable method for learning user preferences in constructive scenarios. In
Coactive Learning the user provides feedback to the algorithm in the form of an
improvement to a suggested configuration. When the problem involves many
decision variables and constraints, this type of interaction poses a
significant cognitive burden on the user. We propose a decomposition technique
for large preference-based decision problems relying exclusively on inference
and feedback over partial configurations. This has the clear advantage of
drastically reducing the user cognitive load. Additionally, part-wise inference
can be (up to exponentially) less computationally demanding than inference over
full configurations. We discuss the theoretical implications of working with
parts and present promising empirical results on one synthetic and two
realistic constructive problems.
Comment: Accepted at the Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI-18).
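The Coactive Learning interaction loop described above can be sketched as a preference perceptron: suggest the configuration maximizing the current linear utility estimate, receive an improved configuration from the user, and update the weights by the feature difference. This is a minimal sketch of the standard preference-perceptron update, not the paper's decomposition method; the candidate configurations, feature map, and simulated user below are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def preference_perceptron(rounds, utility, candidates, features):
    """Coactive-learning sketch: suggest the best configuration under the
    current weight vector w; the (simulated) user returns an improvement;
    update w by the feature difference of improved vs. suggested."""
    w = [0.0] * len(features(candidates[0]))
    for _ in range(rounds):
        suggested = max(candidates, key=lambda y: dot(w, features(y)))
        improved = max(candidates, key=utility)  # stand-in for user feedback
        w = [wi + fi - gi for wi, fi, gi in
             zip(w, features(improved), features(suggested))]
    return w

# Toy setup: the feature vector is the configuration itself, and the
# simulated user prefers configurations with the second feature set.
cands = [(1, 0), (0, 1), (1, 1)]
w = preference_perceptron(5, utility=lambda y: y[1], candidates=cands,
                          features=lambda y: list(y))
```

The decomposition idea in the abstract replaces the full configurations here with partial ones, so each round of feedback concerns only a small slice of the decision variables.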