66 research outputs found
On Neuron Mechanisms Used to Resolve Mental Problems of Identification and Learning in Sensorium
The paper considers some possible neuron mechanisms that do not contradict biological data.
They are represented in terms of the notion of an elementary sensorium discussed in the authors' previous
works. Such mechanisms resolve problems of two large classes: those in which identification mechanisms are used,
and those in which sensory learning mechanisms are applied along with identification.
On the convergence of iterative voting: how restrictive should restricted dynamics be?
We study convergence properties of iterative voting procedures. Such procedures are defined by a voting rule and a (restricted) iterative process, where at each step one agent can modify his vote towards a better outcome for himself. It is already known that if the iteration dynamics (the manner in which voters are allowed to modify their votes) are unrestricted, then the voting process may not converge. For most common voting rules this may be observed even under the best response dynamics limitation. It is therefore important to investigate whether and which natural restrictions on the dynamics of iterative voting procedures can guarantee convergence. To this end, we provide two general conditions on the dynamics based on iterative myopic improvements, each of which is sufficient for convergence. We then identify several classes of voting rules (including Positional Scoring Rules, Maximin, Copeland and Bucklin), along with their corresponding iterative processes, for which at least one of these conditions holds.
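As an illustration of the kind of restricted dynamics the abstract above describes, here is a minimal sketch for the Plurality rule only. It assumes one particular restriction (a voter may switch only to a candidate who thereby becomes the winner and whom the voter strictly prefers to the current winner) and lexicographic tie-breaking. The function names and the example profile are hypothetical, and this is not claimed to be either of the paper's two conditions.

```python
from collections import Counter


def plurality_winner(votes, candidates):
    """Plurality winner; ties are broken by position in `candidates` (assumption)."""
    counts = Counter(votes)
    return max(candidates, key=lambda c: (counts[c], -candidates.index(c)))


def iterative_plurality(prefs, candidates, max_steps=1000):
    """Run myopic improvement steps starting from the truthful profile.

    prefs[i] is voter i's ranking, most preferred candidate first.
    Returns (final_votes, winner, steps_taken, converged).
    """
    votes = [ranking[0] for ranking in prefs]
    for step in range(max_steps):
        winner = plurality_winner(votes, candidates)
        moved = False
        for i, ranking in enumerate(prefs):
            # Candidates voter i strictly prefers to the current winner, best first.
            for c in ranking[:ranking.index(winner)]:
                if c == votes[i]:
                    continue  # already voting for c, so nothing would change
                trial = votes[:i] + [c] + votes[i + 1:]
                # Restricted move: only allowed if c becomes the new winner.
                if plurality_winner(trial, candidates) == c:
                    votes = trial
                    moved = True
                    break
            if moved:
                break  # one agent moves per step
        if not moved:
            return votes, winner, step, True
    # Guard: give up after max_steps rather than assume convergence.
    return votes, plurality_winner(votes, candidates), max_steps, False


candidates = ["a", "b", "c"]
prefs = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["b", "a", "c"],
    ["c", "b", "a"],
]
votes, winner, steps, converged = iterative_plurality(prefs, candidates)
print(votes, winner, steps, converged)
```

In this example profile the truthful outcome is a tie between "a" and "b" that is broken in favour of "a"; the last voter then switches from "c" to "b", after which no voter has an allowed improving move.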
Reaching Consensus Under a Deadline
Committee decisions are complicated by a deadline, e.g., the next start of a
budget, or the beginning of a semester. In committee hiring decisions, it may
be that if no candidate is supported by a strong majority, the default is to
hire no one - an option that may cost dearly. As a result, committee members
might prefer to agree on a reasonable, if not necessarily the best, candidate,
to avoid unfilled positions. In this paper, we propose a model for the above
scenario, Consensus Under a Deadline (CUD), based on a time-bounded iterative
voting process. We provide convergence guarantees and an analysis of the
quality of the final decision. An extensive experimental study demonstrates
more subtle features of CUDs, e.g., the difference between two simple types of
committee member behavior, lazy vs. proactive voters. Finally, a user study
examines the differences between the behavior of rational voting bots and real
voters, concluding that it may often be best to have bots play on the voters'
behalf.
Adaptive Discounting of Training Time Attacks
Among the most insidious attacks on Reinforcement Learning (RL) solutions are
training-time attacks (TTAs) that create loopholes and backdoors in the learned
behaviour. Not limited to a simple disruption, constructive TTAs (C-TTAs) are
now available, where the attacker forces a specific, target behaviour upon a
training RL agent (victim). However, even state-of-the-art C-TTAs focus on
target behaviours that could be naturally adopted by the victim if not for a
particular feature of the environment dynamics, which C-TTAs exploit. In this
work, we show that a C-TTA is possible even when the target behaviour is
un-adoptable due to both environment dynamics and non-optimality with
respect to the victim's objective(s). To find efficient attacks in this context,
we develop a specialised flavour of the DDPG algorithm, which we term
gammaDDPG, that learns this stronger version of C-TTA. gammaDDPG dynamically
alters the attack policy planning horizon based on the victim's current
behaviour. This improves effort distribution throughout the attack timeline and
reduces the effect of uncertainty the attacker has about the victim. To
demonstrate the features of our method and better relate the results to prior
research, we borrow a 3D grid domain from a state-of-the-art C-TTA for our
experiments. Code is available at "bit.ly/github-rb-gDDPG". Comment: 19 pages, 7 figures
Security Games with Information Leakage: Modeling and Computation
Most models of Stackelberg security games assume that the attacker only knows
the defender's mixed strategy, but is not able to observe (even partially) the
instantiated pure strategy. Such partial observation of the deployed pure
strategy -- an issue we refer to as information leakage -- is a significant
concern in practical applications. While previous research on patrolling games
has considered the attacker's real-time surveillance, our settings, and therefore
our models and techniques, are fundamentally different. More specifically, after
describing the information leakage model, we start with an LP formulation to
compute the defender's optimal strategy in the presence of leakage. Perhaps
surprisingly, we show that a key subproblem to solve this LP (more precisely,
the defender oracle) is NP-hard even for the simplest of security game models.
We then approach the problem from three possible directions: efficient
algorithms for restricted cases, approximation algorithms, and heuristic
algorithms for sampling that improve upon the status quo. Our experiments
confirm the necessity of handling information leakage and the advantage of our
algorithms.
Cultivating Desired Behaviour: Policy Teaching Via Environment-Dynamics Tweaks
In this paper we study, for the first time explicitly, the implications of endowing an interested party (i.e., a teacher) with the ability to modify the underlying dynamics of the environment in order to encourage an agent to learn to follow a specific policy. We introduce a cost function that the teacher can use to balance the modifications it makes to the underlying environment dynamics against the learner's performance relative to some ideal, desired policy. We formulate the teacher's problem of determining optimal environment changes as a planning and control problem, and empirically validate the effectiveness of our model.
Protecting elections by recounting ballots
The complexity of voting manipulation is a prominent topic in computational social choice. In this work, we consider a two-stage voting manipulation scenario. First, a malicious party (an attacker) attempts to manipulate the election outcome in favor of a preferred candidate by changing the vote counts in some of the voting districts. Afterwards, another party (a defender), which cares about the voters' wishes, demands a recount in a subset of the manipulated districts, restoring their vote counts to their original values. We investigate the resulting Stackelberg game for the case where votes are aggregated using two variants of the Plurality rule, and obtain an almost complete picture of the complexity landscape, both from the attacker's and from the defender's perspective.
Approximating Mixed Nash Equilibria using Smooth Fictitious Play in Simultaneous Auctions
We investigate equilibrium strategies for bidding agents that participate in multiple, simultaneous second-price auctions with perfect substitutes. For this setting, previous research has shown that it is a best response for a bidder to participate in as many such auctions as are available, provided that other bidders only participate in a single auction. In contrast, in this paper we consider equilibrium behaviour where all bidders participate in multiple auctions. For this new setting we consider mixed-strategy Nash equilibria where bidders can bid high in one auction and low in all others. By discretising the bid space, we are able to use smooth fictitious play to compute approximate solutions. Specifically, we find that the results do indeed converge to ε-Nash mixed equilibria and, therefore, we are able to locate equilibrium strategies in such complex games where no known solutions previously existed.
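To make the smooth fictitious play idea concrete, the following is a minimal sketch on a small two-player matrix game (matching pennies), not on the paper's discretised auction model. It assumes a logit (softmax) smoothed best response and a deterministic belief update. The temperature, the step count, and all function names are illustrative assumptions.

```python
import math


def softmax(values, temperature):
    """Logit choice probabilities; subtracting the max keeps exp() stable."""
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def expected_payoffs(payoff, opponent_mix):
    """Expected payoff of each own action against the opponent's mixed strategy."""
    return [sum(p * q for p, q in zip(row, opponent_mix)) for row in payoff]


def smooth_fictitious_play(row_payoff, col_payoff, steps=20000, temperature=0.5):
    """Return the empirical mixed strategies (x, y) after `steps` updates.

    row_payoff[i][j]: row player's payoff for row action i, column action j.
    col_payoff[j][i]: column player's payoff for column action j, row action i.
    """
    n, m = len(row_payoff), len(col_payoff)
    # Start from deliberately lopsided beliefs (assumption, for illustration).
    x = [1.0] + [0.0] * (n - 1)
    y = [1.0] + [0.0] * (m - 1)
    for t in range(1, steps + 1):
        # Each player responds smoothly to the opponent's empirical play so far.
        p = softmax(expected_payoffs(row_payoff, y), temperature)
        q = softmax(expected_payoffs(col_payoff, x), temperature)
        # Running-average update of the empirical strategies.
        x = [xi + (pi - xi) / (t + 1) for xi, pi in zip(x, p)]
        y = [yi + (qi - yi) / (t + 1) for yi, qi in zip(y, q)]
    return x, y


def epsilon(row_payoff, col_payoff, x, y):
    """Largest payoff gain any player could get by deviating to a pure action."""
    u_row = expected_payoffs(row_payoff, y)
    u_col = expected_payoffs(col_payoff, x)
    gain_row = max(u_row) - sum(a * b for a, b in zip(x, u_row))
    gain_col = max(u_col) - sum(a * b for a, b in zip(y, u_col))
    return max(gain_row, gain_col)


# Matching pennies: the row player wants to match, the column player to mismatch.
row_payoff = [[1, -1], [-1, 1]]
col_payoff = [[-1, 1], [1, -1]]
x, y = smooth_fictitious_play(row_payoff, col_payoff)
print(x, y, epsilon(row_payoff, col_payoff, x, y))
```

The `epsilon` helper measures how far the resulting pair of mixed strategies is from an exact Nash equilibrium, which is the sense in which an ε-Nash solution is approximate. In an auction setting the action lists would instead be the discretised bids, which this sketch does not attempt to model.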