Smooth Convex Optimization using Sub-Zeroth-Order Oracles
We consider the problem of minimizing a smooth, Lipschitz, convex function
over a compact, convex set using sub-zeroth-order oracles: an oracle that
outputs the sign of the directional derivative for a given point and a given
direction, an oracle that compares the function values for a given pair of
points, and an oracle that outputs a noisy function value for a given point. We
show that the sample complexity of optimization using these oracles is
polynomial in the relevant parameters. The optimization algorithm that we
provide for the comparator oracle is the first algorithm with a known rate of
convergence that is polynomial in the number of dimensions. We also give an
algorithm for the noisy-value oracle and bound its regret in terms of the
number of dimensions and the number of queries (ignoring other factors and
logarithmic dependencies).
Comment: Extended version of the accepted paper in the 35th AAAI Conference on
Artificial Intelligence 2021. 19 pages including supplementary material.
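To make the three oracle types concrete, here is a minimal sketch that instantiates them for the toy smooth convex function f(x) = ||x||^2. The function, the Gaussian noise model, and all names below are illustrative assumptions, not the paper's construction or its query-efficient algorithms.

```python
import random

def f(x):
    return sum(xi * xi for xi in x)

def grad_f(x):
    return [2.0 * xi for xi in x]

def sign_oracle(x, d):
    """Sign of the directional derivative of f at x along direction d."""
    s = sum(gi * di for gi, di in zip(grad_f(x), d))
    return (s > 0) - (s < 0)

def comparator_oracle(x, y):
    """True iff f(x) <= f(y): compares function values without revealing them."""
    return f(x) <= f(y)

def noisy_value_oracle(x, rng, sigma=0.01):
    """f(x) corrupted by zero-mean Gaussian noise."""
    return f(x) + rng.gauss(0.0, sigma)

rng = random.Random(0)
s = sign_oracle([1.0, 0.0], [1.0, 0.0])        # <grad f, d> = 2 > 0, so +1
c = comparator_oracle([0.5, 0.0], [1.0, 0.0])  # 0.25 <= 1.0, so True
v = noisy_value_oracle([1.0, 0.0], rng)        # approximately 1.0
```

Note how each oracle reveals strictly less than a gradient or an exact function value, which is what makes the polynomial sample-complexity results nontrivial.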
Identity Concealment Games: How I Learned to Stop Revealing and Love the Coincidences
In an adversarial environment, a hostile player performing a task may behave
like a non-hostile one in order not to reveal its identity to an opponent. To
model such a scenario, we define identity concealment games: zero-sum
stochastic reachability games whose objective is the concealment of the
player's identity. To measure the identity concealment of the player, we introduce
the notion of an average player. The average player's policy represents the
expected behavior of a non-hostile player. We show that there exists an
equilibrium policy pair for every identity concealment game and give the
optimality equations to synthesize an equilibrium policy pair. If the player's
opponent follows a non-equilibrium policy, the player can hide its identity
better. For this reason, we study how the hostile player may learn the
opponent's policy. Since learning via exploration policies would quickly reveal
the hostile player's identity to the opponent, we consider the problem of
learning a near-optimal policy for the hostile player using the game runs
collected under the average player's policy. Consequently, we propose an
algorithm that provably learns a near-optimal policy and give an upper bound on
the number of sample runs to be collected.
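The learning-from-collected-runs step can be illustrated with a maximum-likelihood estimate of a stationary policy from sampled state-action trajectories. The toy states, actions, and runs below are illustrative assumptions; the paper's algorithm and its sample-complexity guarantee are not reproduced here.

```python
from collections import Counter, defaultdict

# Game runs collected under the average player's policy:
# each run is a sequence of (state, action) pairs.
runs = [
    [("s0", "a"), ("s1", "b"), ("s0", "a")],
    [("s0", "a"), ("s1", "a"), ("s0", "b")],
    [("s0", "a"), ("s1", "b")],
]

# Count how often each action is taken in each state.
counts = defaultdict(Counter)
for run in runs:
    for state, action in run:
        counts[state][action] += 1

# Empirical (maximum-likelihood) policy estimate.
policy_hat = {
    state: {a: n / sum(c.values()) for a, n in c.items()}
    for state, c in counts.items()
}
# policy_hat["s0"]["a"] = 4/5, policy_hat["s1"]["b"] = 2/3
```

In the abstract's setting, estimates like these feed into planning for the hostile player without the identity-revealing cost of active exploration.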
Optimal Deceptive and Reference Policies for Supervisory Control
The use of deceptive strategies is important for an agent that attempts not
to reveal his intentions in an adversarial environment. We consider a setting
in which a supervisor provides a reference policy and expects an agent to
follow the reference policy and perform a task. The agent may instead follow a
different, deceptive policy to achieve a different task. We model the
environment and the behavior of the agent with a Markov decision process,
represent the tasks of the agent and the supervisor with linear temporal logic
formulae, and study the synthesis of optimal deceptive policies for such
agents. We also study the synthesis of optimal reference policies that prevents
deceptive strategies of the agent and achieves the supervisor's task with high
probability. We show that the synthesis of deceptive policies has a convex
optimization problem formulation, while the synthesis of reference policies
requires solving a nonconvex optimization problem.Comment: 20 page
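The convexity of policy synthesis in MDPs typically comes from the standard occupancy-measure change of variables, under which one optimizes over state-action visitation frequencies rather than policies directly. This toy sketch (numbers and names are illustrative assumptions, not from the paper) shows the final step of that reduction: recovering a stationary policy from an occupancy measure.

```python
# x[s][a]: expected discounted number of times action a is taken in state s,
# e.g. obtained by solving a convex program over occupancy measures.
x = {
    "s0": {"follow": 0.6, "deviate": 0.2},
    "s1": {"follow": 0.3, "deviate": 0.9},
}

# A stationary policy is recovered by normalizing per state:
# pi(a | s) = x[s][a] / sum_a' x[s][a'].
policy = {
    s: {a: v / sum(xa.values()) for a, v in xa.items()}
    for s, xa in x.items()
}
# policy["s0"]["follow"] == 0.75, policy["s1"]["deviate"] == 0.75
```

Because the linear flow constraints and many objectives are convex in x, the deceptive-policy problem stays convex, while optimizing over the supervisor's reference policy couples the variables and breaks this structure.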
Alternating Direction Method of Multipliers for Decomposable Saddle-Point Problems
Saddle-point problems appear in various settings including machine learning,
zero-sum stochastic games, and regression problems. We consider decomposable
saddle-point problems and study an extension of the alternating direction
method of multipliers to such saddle-point problems. Instead of solving the
original saddle-point problem directly, this algorithm solves smaller
saddle-point problems by exploiting the decomposable structure. We show the
convergence of this algorithm for convex-concave saddle-point problems under a
mild assumption. We also provide a sufficient condition for which the
assumption holds. We demonstrate the convergence properties of the saddle-point
alternating direction method of multipliers with numerical examples on a power
allocation problem in communication channels and a network routing problem with
adversarial costs.
Comment: Accepted to the 58th Annual Allerton Conference on Communication, Control, and Computing.
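The paper's saddle-point extension of ADMM is not reproduced here; as a generic illustration of the convex-concave saddle-point structure the abstract refers to, this sketch runs plain simultaneous gradient descent-ascent on a toy problem. The function and step size are assumptions chosen so the iteration converges.

```python
# f(x, y) = x^2 + x*y - y^2 is convex in x and concave in y,
# with unique saddle point (0, 0).
x, y = 1.0, -1.0
step = 0.05
for _ in range(2000):
    gx = 2.0 * x + y     # df/dx
    gy = x - 2.0 * y     # df/dy
    x -= step * gx       # descend in the minimizing variable
    y += step * gy       # ascend in the maximizing variable
# (x, y) converges to the saddle point (0, 0)
```

The decomposable setting in the abstract replaces these single gradient steps with solutions of smaller saddle-point subproblems, one per block of the decomposition.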
A novel algorithm for DC analysis of piecewise-linear circuits: popcorn
A fast and convergent iteration method for piecewise-linear
analysis of nonlinear resistive circuits is presented. Most of the existing
algorithms are applicable only to a limited class of circuits. In general,
they are either not convergent or too slow for large circuits. The new algorithm presented in the paper is much more efficient than the existing
ones and can be applied to any piecewise-linear circuit. It is based on the
piecewise-linear version of the Newton-Raphson algorithm. As opposed
to the Newton-Raphson method, the new algorithm is globally convergent
from an arbitrary starting point. It is simple to understand and it can
be easily programmed. Some numerical examples are given in order to
demonstrate the effectiveness of the proposed algorithm in terms of the
amount of computation.
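The flavor of piecewise-linear DC analysis can be shown on a one-node toy circuit: a 5 V source, a 1 kOhm series resistor, and a piecewise-linear diode (open below 0.7 V, 10 Ohm slope above). The circuit values are illustrative assumptions, and the iteration below, which alternates region detection with an exact linear solve per region, is a generic sketch, not the paper's algorithm itself.

```python
V_SRC, R = 5.0, 1000.0       # source voltage and series resistance
V_KNEE, R_ON = 0.7, 10.0     # diode knee voltage and on-region slope

def region(v):
    """Which linear segment of the diode characteristic v lies in."""
    return "on" if v >= V_KNEE else "off"

def solve_region(r):
    """Exact solution of the node equation within one linear region."""
    if r == "off":
        # Diode open: no current flows, node sits at the source voltage.
        return V_SRC
    # KCL: (v - V_SRC)/R + (v - V_KNEE)/R_ON = 0, solved for v.
    return (V_SRC / R + V_KNEE / R_ON) / (1.0 / R + 1.0 / R_ON)

v = 0.0                      # arbitrary starting point
for _ in range(10):
    v_new = solve_region(region(v))
    if region(v_new) == region(v) and abs(v_new - v) < 1e-12:
        break                # region is consistent with its own solution
    v = v_new
# v is the DC operating point, about 0.743 V
```

Within each region the circuit is linear, so each step is a direct solve; the iteration only searches over regions, which is what makes such methods fast on large piecewise-linear circuits.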
Computer aided frequency planning for the radio and tv broadcasts
The frequency planning of the VHF and UHF
broadcasts in Turkey is described. The planning is done
with the aid of computer databases and a digital terrain
map. A frequency offset is applied whenever applicable
to increase the channel capacity. The offset assignment
is done through a simulated annealing algorithm.
The international rules and regulations concerning Turkey
are also considered.
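Offset assignment by simulated annealing can be sketched on a toy network: give each transmitter one of three offsets so that transmitters close enough to interfere avoid sharing one. The network, cost function, and cooling schedule below are illustrative assumptions, not the planning data described above.

```python
import math
import random

transmitters = ["A", "B", "C", "D"]
# Pairs close enough to interfere (a ring layout for illustration).
interferes = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
OFFSETS = [0, 1, 2]

def conflicts(assign):
    """Number of interfering pairs that share an offset."""
    return sum(assign[u] == assign[v] for u, v in interferes)

rng = random.Random(0)
assign = {t: rng.choice(OFFSETS) for t in transmitters}
cur_cost = conflicts(assign)
best, best_cost = dict(assign), cur_cost
temp = 2.0
for _ in range(4000):
    t = rng.choice(transmitters)
    old = assign[t]
    assign[t] = rng.choice(OFFSETS)      # propose a random re-assignment
    new_cost = conflicts(assign)
    # Accept improving moves always, worsening moves with Boltzmann probability.
    if new_cost <= cur_cost or rng.random() < math.exp((cur_cost - new_cost) / temp):
        cur_cost = new_cost
        if cur_cost < best_cost:
            best, best_cost = dict(assign), cur_cost
    else:
        assign[t] = old                  # reject and revert the move
    temp = max(0.01, temp * 0.999)       # geometric cooling
```

Accepting occasional worsening moves lets the search escape local minima, which is why annealing suits combinatorial assignment problems like this one.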
Differential Privacy in Cooperative Multiagent Planning
Privacy-aware multiagent systems must protect agents' sensitive data while
simultaneously ensuring that agents accomplish their shared objectives. Towards
this goal, we propose a framework to privatize inter-agent communications in
cooperative multiagent decision-making problems. We study sequential
decision-making problems formulated as cooperative Markov games with
reach-avoid objectives. We apply a differential privacy mechanism to privatize
agents' communicated symbolic state trajectories, and then we analyze tradeoffs
between the strength of privacy and the team's performance. For a given level
of privacy, this tradeoff is shown to depend critically upon the total
correlation among agents' state-action processes. We synthesize policies that
are robust to privacy by reducing the value of the total correlation. Numerical
experiments demonstrate that the team's performance under these policies
decreases by only 3 percent when comparing private versus non-private
implementations of communication. By contrast, the team's performance decreases
by roughly 86 percent when using baseline policies that ignore total
correlation and only optimize team performance.
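One simple way to privatize a symbolic state trajectory is k-ary randomized response applied independently to each symbol: keep the true symbol with probability e^eps / (e^eps + k - 1), otherwise report one of the other k - 1 symbols uniformly. This generic local differential privacy mechanism and the symbol alphabet below are illustrative stand-ins, not the specific mechanism or models used in the paper.

```python
import math
import random

SYMBOLS = ["safe", "goal", "obstacle", "unknown"]

def randomized_response(symbol, eps, rng):
    """Report `symbol` truthfully with the DP-calibrated probability."""
    k = len(SYMBOLS)
    p_keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < p_keep:
        return symbol
    # Otherwise report a uniformly random other symbol.
    others = [s for s in SYMBOLS if s != symbol]
    return rng.choice(others)

rng = random.Random(0)
trajectory = ["safe", "safe", "goal"]
private = [randomized_response(s, eps=1.0, rng=rng) for s in trajectory]
```

Smaller eps makes every symbol's output distribution closer to uniform, which strengthens privacy but degrades the team's coordination, the tradeoff the abstract quantifies via total correlation.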
Formal Methods for Autonomous Systems
Formal methods refer to rigorous, mathematical approaches to system
development and have played a key role in establishing the correctness of
safety-critical systems. The main building blocks of formal methods are models
and specifications, which are analogous to behaviors and requirements in system
design and give us the means to verify and synthesize system behaviors with
formal guarantees.
This monograph provides a survey of the current state of the art on
applications of formal methods in the autonomous systems domain. We consider
correct-by-construction synthesis under various formulations, including
closed-system, reactive, and probabilistic settings. Beyond synthesizing systems in
known environments, we address the concept of uncertainty and bound the
behavior of systems that employ learning using formal methods. Further, we
examine the synthesis of systems with monitoring, a mitigation technique for
ensuring that once a system deviates from expected behavior, it knows a way of
returning to normalcy. We also show how to overcome some limitations of formal
methods themselves with learning. We conclude with future directions for formal
methods in reinforcement learning, uncertainty, privacy, explainability of
formal methods, and regulation and certification