66 research outputs found

    On Neuron Mechanisms Used to Resolve Mental Problems of Identification and Learning in Sensorium

    The paper considers some possible neuron mechanisms that do not contradict biological data. They are represented in terms of the notion of an elementary sensorium discussed in the authors' previous works. Such mechanisms resolve problems of two large classes: those where identification mechanisms are used, and those where sensory learning mechanisms are applied along with identification.

    On the convergence of iterative voting: how restrictive should restricted dynamics be?

    We study convergence properties of iterative voting procedures. Such procedures are defined by a voting rule and a (restricted) iterative process, where at each step one agent can modify his vote towards a better outcome for himself. It is already known that if the iteration dynamics (the manner in which voters are allowed to modify their votes) are unrestricted, then the voting process may not converge. For most common voting rules this may be observed even under the best response dynamics limitation. It is therefore important to investigate whether and which natural restrictions on the dynamics of iterative voting procedures can guarantee convergence. To this end, we provide two general conditions on the dynamics based on iterative myopic improvements, each of which is sufficient for convergence. We then identify several classes of voting rules (including Positional Scoring Rules, Maximin, Copeland and Bucklin), along with their corresponding iterative processes, for which at least one of these conditions holds.
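
The iterative process described in this abstract can be illustrated with a minimal sketch. It assumes the Plurality rule, ties broken by candidate order, truthful initial ballots, and one particular restricted dynamic: a voter may switch only to a candidate who thereby becomes the winner and whom the voter strictly prefers to the current winner. The function names and the `max_steps` cap are hypothetical and are not taken from the paper.

```python
from collections import Counter


def plurality_winner(votes, candidates):
    """Return the plurality winner; ties go to the earlier candidate in `candidates`."""
    counts = Counter(votes)
    return max(candidates, key=lambda c: (counts[c], -candidates.index(c)))


def iterate_votes(prefs, candidates, max_steps=100):
    """Run iterative plurality voting from truthful ballots.

    prefs[i] is voter i's ranking, most preferred candidate first.
    Returns (final_votes, winner, converged).
    """
    votes = [ranking[0] for ranking in prefs]  # truthful starting profile
    for _ in range(max_steps):
        winner = plurality_winner(votes, candidates)
        moved = False
        for i, ranking in enumerate(prefs):
            # Candidates voter i strictly prefers to the current winner,
            # in the voter's own preference order.
            better = ranking[:ranking.index(winner)]
            for target in better:
                trial = votes[:i] + [target] + votes[i + 1:]
                # Restricted move: only switch if the target becomes the winner.
                if plurality_winner(trial, candidates) == target:
                    votes, moved = trial, True
                    break
            if moved:
                break  # one agent moves per step
        if not moved:
            return votes, winner, True  # no voter has an improving move
    return votes, plurality_winner(votes, candidates), False
```

For example, with candidates `["a", "b", "c"]`, two voters ranking a > b > c, two ranking b > c > a, and one ranking c > b > a, the truthful profile elects a on the tie-break; the last voter then switches to b, after which no voter has an improving move.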

    Reaching Consensus Under a Deadline

    Committee decisions are complicated by a deadline, e.g., the next start of a budget, or the beginning of a semester. In committee hiring decisions, it may be that if no candidate is supported by a strong majority, the default is to hire no one, an option that may cost dearly. As a result, committee members might prefer to agree on a reasonable, if not necessarily the best, candidate, to avoid unfilled positions. In this paper, we propose a model for this scenario, Consensus Under a Deadline (CUD), based on a time-bounded iterative voting process. We provide convergence guarantees and an analysis of the quality of the final decision. An extensive experimental study demonstrates more subtle features of CUDs, e.g., the difference between two simple types of committee member behavior, lazy vs. proactive voters. Finally, a user study examines the differences between the behavior of rational voting bots and real voters, concluding that it may often be best to have bots play on the voters' behalf.

    Adaptive Discounting of Training Time Attacks

    Among the most insidious attacks on Reinforcement Learning (RL) solutions are training-time attacks (TTAs) that create loopholes and backdoors in the learned behaviour. Not limited to a simple disruption, constructive TTAs (C-TTAs) are now available, where the attacker forces a specific, target behaviour upon a training RL agent (victim). However, even state-of-the-art C-TTAs focus on target behaviours that could be naturally adopted by the victim if not for a particular feature of the environment dynamics, which C-TTAs exploit. In this work, we show that a C-TTA is possible even when the target behaviour is un-adoptable due to both environment dynamics and non-optimality with respect to the victim objective(s). To find efficient attacks in this context, we develop a specialised flavour of the DDPG algorithm, which we term gammaDDPG, that learns this stronger version of C-TTA. gammaDDPG dynamically alters the attack policy planning horizon based on the victim's current behaviour. This improves effort distribution throughout the attack timeline and reduces the effect of uncertainty the attacker has about the victim. To demonstrate the features of our method and better relate the results to prior research, we borrow a 3D grid domain from a state-of-the-art C-TTA for our experiments. Code is available at "bit.ly/github-rb-gDDPG". Comment: 19 pages, 7 figures

    Security Games with Information Leakage: Modeling and Computation

    Most models of Stackelberg security games assume that the attacker only knows the defender's mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy, an issue we refer to as information leakage, is a significant concern in practical applications. While previous research on patrolling games has considered the attacker's real-time surveillance, our setting, and therefore our models and techniques, are fundamentally different. More specifically, after describing the information leakage model, we start with an LP formulation to compute the defender's optimal strategy in the presence of leakage. Perhaps surprisingly, we show that a key subproblem to solve this LP (more precisely, the defender oracle) is NP-hard even for the simplest of security game models. We then approach the problem from three possible directions: efficient algorithms for restricted cases, approximation algorithms, and heuristic algorithms for sampling that improve upon the status quo. Our experiments confirm the necessity of handling information leakage and the advantage of our algorithms.

    Cultivating Desired Behaviour: Policy Teaching Via Environment-Dynamics Tweaks

    In this paper we study, for the first time explicitly, the implications of endowing an interested party (i.e. a teacher) with the ability to modify the underlying dynamics of the environment, in order to encourage an agent to learn to follow a specific policy. We introduce a cost function which the teacher can use to balance the modifications it makes to the underlying environment dynamics against the learner's performance compared to some ideal, desired policy. We formulate the teacher's problem of determining optimal environment changes as a planning and control problem, and empirically validate the effectiveness of our model.

    Protecting elections by recounting ballots

    Complexity of voting manipulation is a prominent topic in computational social choice. In this work, we consider a two-stage voting manipulation scenario. First, a malicious party (an attacker) attempts to manipulate the election outcome in favor of a preferred candidate by changing the vote counts in some of the voting districts. Afterwards, another party (a defender), which cares about the voters' wishes, demands a recount in a subset of the manipulated districts, restoring their vote counts to their original values. We investigate the resulting Stackelberg game for the case where votes are aggregated using two variants of the Plurality rule, and obtain an almost complete picture of the complexity landscape, both from the attacker's and from the defender's perspective.

    Approximating Mixed Nash Equilibria using Smooth Fictitious Play in Simultaneous Auctions

    We investigate equilibrium strategies for bidding agents that participate in multiple, simultaneous second-price auctions with perfect substitutes. For this setting, previous research has shown that it is a best response for a bidder to participate in as many such auctions as there are available, provided that other bidders only participate in a single auction. In contrast, in this paper we consider equilibrium behaviour where all bidders participate in multiple auctions. For this new setting we consider mixed-strategy Nash equilibria where bidders can bid high in one auction and low in all others. By discretising the bid space, we are able to use smooth fictitious play to compute approximate solutions. Specifically, we find that the results do indeed converge to ε-Nash mixed equilibria and, therefore, we are able to locate equilibrium strategies in such complex games where no known solutions previously existed.