
    Iterated Regret Minimization in Game Graphs

    Iterated regret minimization was recently introduced by J.Y. Halpern and R. Pass for classical strategic games. For many games of interest, this solution concept provides solutions that are judged more reasonable than those offered by traditional game-theoretic concepts such as Nash equilibrium. Although computing iterated regret on an explicit matrix game is conceptually and computationally easy, nothing is known about computing it on games whose matrices are defined implicitly using game trees, game DAGs or, more generally, game graphs. In this paper, we investigate iterated regret minimization for infinite-duration two-player quantitative non-zero-sum games played on graphs. We consider reachability objectives that are not necessarily antagonistic. Edges are weighted by integers, one for each player, and the payoffs are defined by the sum of the weights along the paths. Depending on the class of graphs, we give either polynomial or pseudo-polynomial time algorithms to compute a strategy that minimizes the regret for a fixed player. We finally give algorithms to compute the strategies of the two players that minimize the iterated regret for trees, and for graphs with strictly positive weights only.
    Comment: 19 pages. Bug in introductory example fixed.
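
The explicit matrix case that the abstract calls easy can be sketched as follows. This is a minimal illustration of iterated regret minimization over pure strategies, not the graph algorithms of the paper. The function names (`max_regrets`, `iterated_regret_minimization`, `travelers_dilemma`) are hypothetical, and the Traveler's Dilemma with claims 2 to 100 and penalty 2 is assumed here only as a familiar test game.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]


def max_regrets(payoff: Matrix, own: List[int], other: List[int]) -> Dict[int, int]:
    """Worst-case regret of each action in `own` against the actions in `other`.

    payoff[a][b] is the payoff for playing a when the opponent plays b.
    """
    # Best payoff achievable against each opponent action, using actions in `own`.
    best = {b: max(payoff[a][b] for a in own) for b in other}
    return {a: max(best[b] - payoff[a][b] for b in other) for a in own}


def iterated_regret_minimization(u1: Matrix, u2: Matrix) -> Tuple[List[int], List[int]]:
    """Return the row and column indices that survive iterated deletion.

    u1[r][c] and u2[r][c] are the payoffs of players 1 and 2 when player 1
    plays row r and player 2 plays column c.
    """
    rows = list(range(len(u1)))
    cols = list(range(len(u1[0])))
    # Re-index player 2's payoffs as [own action][opponent action].
    u2_t = [[u2[r][c] for r in rows] for c in cols]
    while True:
        reg1 = max_regrets(u1, rows, cols)
        reg2 = max_regrets(u2_t, cols, rows)
        # Keep only the actions whose worst-case regret is minimal.
        keep_rows = [r for r in rows if reg1[r] == min(reg1.values())]
        keep_cols = [c for c in cols if reg2[c] == min(reg2.values())]
        if keep_rows == rows and keep_cols == cols:
            return rows, cols  # fixpoint: nothing more to delete
        rows, cols = keep_rows, keep_cols


def travelers_dilemma(low: int = 2, high: int = 100, penalty: int = 2):
    """Build the payoff matrices of a Traveler's Dilemma (illustrative game)."""
    claims = list(range(low, high + 1))

    def pay(mine: int, theirs: int) -> int:
        if mine == theirs:
            return mine
        return mine + penalty if mine < theirs else theirs - penalty

    u1 = [[pay(a, b) for b in claims] for a in claims]
    u2 = [[pay(b, a) for b in claims] for a in claims]
    return claims, u1, u2


claims, u1, u2 = travelers_dilemma()
rows, cols = iterated_regret_minimization(u1, u2)
print([claims[r] for r in rows], [claims[c] for c in cols])
```

On this game the first round keeps the claims 96 to 100 and the second round keeps only 97, so the script prints `[97] [97]`. Each round costs time proportional to the size of the matrix, which is why the difficulty only appears once the matrix is given implicitly by a graph.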

    Positive multi-criteria models in agriculture for energy and environmental policy analysis

    Environmental consciousness and accompanying actions have been paralleled by the evolution of multi-criteria methods, which have provided tools to assist policy makers in discovering compromises in order to muddle through. This paper recalls the development of multi-criteria methods in agriculture, focusing on their contribution to producing input or output functions useful for environmental and/or energy policy. Response curves generated by multi-criteria (MC) models can predict farmers’ response to market and policy parameters more accurately than models based on classic profit-maximizing behavior. Concrete examples from recent literature illustrate the above statements, and ideas for further research are provided.
    Keywords: multi-criteria models, interval programming, supply curves, bio-energy, policy analysis

    Private Multiplicative Weights Beyond Linear Queries

    A wide variety of fundamental data analyses in machine learning, such as linear and logistic regression, require minimizing a convex function defined by the data. Since the data may contain sensitive information about individuals, and these analyses can leak that sensitive information, it is important to be able to solve convex minimization in a privacy-preserving way. A series of recent results show how to accurately solve a single convex minimization problem in a differentially private manner. However, the same data is often analyzed repeatedly, and little is known about solving multiple convex minimization problems with differential privacy. For simpler data analyses, such as linear queries, there are remarkable differentially private algorithms, such as the private multiplicative weights mechanism (Hardt and Rothblum, FOCS 2010), that accurately answer exponentially many distinct queries. In this work, we extend these results to the case of convex minimization and show how to give accurate and differentially private solutions to *exponentially many* convex minimization problems on a sensitive dataset.

    Trend Detection based Regret Minimization for Bandit Problems

    We study a variation of the classical multi-armed bandits problem. In this problem, the learner has to make a sequence of decisions, picking from a fixed set of choices. In each round, she receives as feedback only the loss incurred from the chosen action. Conventionally, this problem has been studied when the losses of the actions are drawn from an unknown distribution or when they are adversarial. In this paper, we study this problem when the losses of the actions also satisfy certain structural properties and, in particular, exhibit a trend structure. When this is true, we show that using \textit{trend detection}, we can achieve regret of order $\tilde{O}(N \sqrt{TK})$ with respect to a switching strategy for the version of the problem where a single action is chosen in each round, and $\tilde{O}(Nm \sqrt{TK})$ when $m$ actions are chosen each round. This guarantee is a significant improvement over the conventional benchmark. Our approach can, as a framework, be applied in combination with various well-known bandit algorithms, like Exp3. For both versions of the problem, we give regret guarantees also for the \textit{anytime} setting, i.e. when the length of the choice-sequence is not known in advance. Finally, we pinpoint the advantages of our method by comparing it to some other well-known strategies.
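
For readers unfamiliar with Exp3, which the abstract names as one algorithm the framework can be combined with, the following is a minimal sketch of the plain algorithm in its usual reward form (a loss in [0, 1] can be turned into a reward as 1 minus the loss). It does not implement the paper's trend detection. The function name `exp3`, the `reward_fn` callback, the default `gamma` and the toy reward function are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List


def exp3(num_arms: int, horizon: int, reward_fn: Callable[[int, int], float],
         gamma: float = 0.1, seed: int = 0) -> List[int]:
    """Run Exp3 for `horizon` rounds and return how often each arm was pulled.

    reward_fn(t, arm) must return a reward in [0, 1]. Only the reward of the
    pulled arm is queried, which models bandit feedback.
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    counts = [0] * num_arms
    for t in range(horizon):
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / num_arms for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        # Importance-weighted estimate: unbiased for the pulled arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged.
        top = max(weights)
        weights = [w / top for w in weights]
        counts[arm] += 1
    return counts


# Toy environment (illustrative): arm 0 always pays 1, the others pay 0.
counts = exp3(num_arms=3, horizon=2000, reward_fn=lambda t, arm: 1.0 if arm == 0 else 0.0)
print(counts)
```

In this toy run the pull counts concentrate on arm 0 while the exploration term `gamma / num_arms` keeps every arm sampled occasionally. In the anytime setting mentioned in the abstract the horizon is unknown, so a fixed `gamma` tuned to the horizon is not available; a common workaround is a doubling schedule, which this sketch leaves out.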