
13 research outputs found

Decentralized Exploration in Multi-Armed Bandits

Authors: Alami Réda, Féraud Raphaël, Laroche Romain
Publication date: 13/05/2019

We consider the decentralized exploration problem: a set of players collaborate to identify the best arm by asynchronously interacting with the same stochastic environment. The objective is to ensure privacy in the best arm identification problem between asynchronous, collaborative, and thrifty players. In the context of a digital service, we advocate that this decentralized approach allows a good balance between the interests of users and those of service providers: the providers optimize their services, while protecting the privacy of the users and saving resources. We define the privacy level as the amount of information an adversary could infer by intercepting the messages concerning a single user. We provide a generic algorithm, Decentralized Elimination, which uses any best arm identification algorithm as a subroutine. We prove that this algorithm ensures privacy, with a low communication cost, and that in comparison to the lower bound of the best arm identification problem, its sample complexity suffers from a penalty depending on the inverse of the probability of the most frequent players. Then, thanks to the genericity of the approach, we extend the proposed algorithm to non-stationary bandits. Finally, experiments illustrate and complete the analysis.

arXiv.org e-Print Archive

Machine Learning for Ad Publishers in Real Time Bidding

Author: Refaei Afshar Reza
Publication venue: Eindhoven University of Technology
Publication date: 17/03/2022

Pure OAI Repository

Notes on Randomized Algorithms

Author: Aspnes James
Publication date: 04/03/2020

Lecture notes for the Yale Computer Science course CPSC 469/569 Randomized Algorithms. Suitable for use as a supplementary text for an introductory graduate or advanced undergraduate course on randomized algorithms. Discusses tools from probability theory, including random variables and expectations, union bound arguments, concentration bounds, applications of martingales and Markov chains, and the Lovász Local Lemma. Algorithmic topics include analysis of classic randomized algorithms such as Quicksort and Hoare's FIND, randomized tree data structures, hashing, Markov chain Monte Carlo sampling, randomized approximate counting, derandomization, quantum computing, and some examples of randomized distributed algorithms.

arXiv.org e-Print Archive

Pessimistic Bayesianism for conservative optimization and imitation

Author: Cohen Michael
Publication date: 25/07/2023

Subject to several assumptions, sufficiently advanced reinforcement learners would likely face an incentive and likely have an ability to intervene in the provision of their reward, with catastrophic consequences. In this thesis, I develop a theory of pessimism and show how it can produce safe advanced artificial agents. Not only do I demonstrate that the assumptions mentioned above can be avoided; I prove theorems which demonstrate safety. First, I develop an idealized pessimistic reinforcement learner. For any given novel event that a mentor would never cause, a sufficiently pessimistic reinforcement learner trained with the help of that mentor would probably avoid causing it. This result is without precedent in the literature. Next, on similar principles, I develop an idealized pessimistic imitation learner. If the probability of an event when the demonstrator acts can be bounded above, then the probability can be bounded above when the imitator acts instead; this kind of result is unprecedented when the imitator learns online and the environment never resets. In an environment that never resets, no one has previously demonstrated, to my knowledge, that an imitation learner even exists. Finally, both of the agents above demand more efficient algorithms for high-quality uncertainty quantification, so I have developed a new kernel for Gaussian process modelling that allows for log-linear time complexity and linear space complexity, instead of a naïve cubic time complexity and quadratic space complexity. This is not the first Gaussian process with this time complexity (inducing points methods have linear complexity), but we do outperform such methods significantly on regression benchmarks, as one might expect given the much higher dimensionality of our kernel. This thesis shows the viability of pessimism with respect to well-quantified epistemic uncertainty as a path to safe artificial agency.

Oxford University Research Archive

Deep Learning Applications in Industrial and Systems Engineering

Author: Harvey Winthrop
Publication venue: ScholarWorks@UARK
Publication date: 01/08/2022

Deep learning, the use of large neural networks to perform machine learning, has transformed the world. As the capabilities of deep models continue to grow, deep learning is becoming an increasingly valuable and practical tool for industrial engineering. With its wide applicability, deep learning can be turned to many industrial engineering tasks, including optimization, heuristic search, and functional approximation. In this dissertation, the major concepts and paradigms of deep learning are reviewed, and three industrial engineering projects applying these methods are described. The first applies a deep convolutional network to the task of absolute aerial geolocalization (the regression of real geographic coordinates from aerial photos), showing promising results. Next, continuing on this work, the features and characteristics of the deep aerial geolocalization model are further studied, with implications for future applications and methodological improvements. Lastly, a deep learning model is developed and applied to a difficult rare event problem of predicting failure times in oil and natural gas wells from process and site data. Practical details of applying deep learning to this sort of data are discussed, and methodological principles are proposed.

ScholarWorks@UARK

UARK (University of Arkansas)

Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem

Authors: Allesiardo Robin, Féraud Raphaël, Maillard Odalric-Ambrym
Publication venue: HAL CCSD
Publication date: 07/09/2016

We consider a non-stationary formulation of the stochastic multi-armed bandit where the rewards are no longer assumed to be identically distributed. For the best-arm identification task, we introduce a version of SUCCESSIVE ELIMINATION based on random shuffling of the K arms. We prove that under a novel and mild assumption on the mean gap ∆, this simple but powerful modification achieves the same guarantees in terms of sample complexity and cumulative regret as its original version, but in a much wider class of problems, as it is no longer constrained to stationary distributions. We also show that the original SUCCESSIVE ELIMINATION fails to have controlled regret in this more general scenario, thus showing the benefit of shuffling. We then remove our mild assumption and adapt the algorithm to the best-arm identification task with switching arms. We adapt the definition of the sample complexity for that case and prove that, against an optimal policy with N − 1 switches of the optimal arm, this new algorithm achieves an expected sample complexity of O(∆^{−2} sqrt(N K δ^{−1} log(K/δ))), where δ is the probability of failure of the algorithm, and an expected cumulative regret of O(∆^{−1} sqrt(N T K log(T K))) after T time steps.
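As an illustration of the idea summarized in this abstract, the following is a minimal sketch of a successive-elimination loop in which the remaining arms are played in a freshly shuffled order each round. It is not the authors' algorithm as published: the function name, the `pull(arm, t)` callback, the Hoeffding-style confidence radius, and its constants are all assumptions made for this example, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def successive_elimination_shuffled(pull, n_arms, delta=0.05,
                                    max_rounds=10_000, rng=None):
    """Return the index of the arm believed to be best.

    pull(arm, t) is a hypothetical environment callback returning a
    reward in [0, 1] for playing `arm` during round `t`.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    rng = rng or random.Random()
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    for t in range(1, max_rounds + 1):
        # Shuffle the play order so that no arm is systematically
        # sampled at the same position within a round.
        rng.shuffle(active)
        for arm in active:
            sums[arm] += pull(arm, t)
        # Every active arm has now been played exactly t times.
        # Assumed Hoeffding-style radius; the constants are illustrative.
        radius = math.sqrt(math.log(4 * n_arms * t * t / delta) / (2 * t))
        best_mean = max(sums[a] / t for a in active)
        # Keep only arms whose upper bound still reaches the leader.
        active = [a for a in active if sums[a] / t + 2 * radius >= best_mean]
        if len(active) == 1:
            break
    return max(active, key=lambda a: sums[a] / t)


if __name__ == "__main__":
    # Bernoulli arms with hypothetical means, for demonstration only.
    means = [0.2, 0.5, 0.9]
    env = random.Random(1)
    best = successive_elimination_shuffled(
        lambda arm, t: 1.0 if env.random() < means[arm] else 0.0, len(means))
    print("selected arm:", best)
```

Because all active arms share the same play count after each round, a single radius can be used for every comparison; the shuffle only changes the order of plays inside a round.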

HAL-CentraleSupelec

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL-Rennes 1

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Author: Tauman Kalai Yael
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)
Publication date: 01/01/2023

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Dagstuhl Research Online Publication Server

Using MapReduce Streaming for Distributed Life Simulation on the Cloud

Author: Radenski Atanas
Publication venue: Chapman University Digital Commons
Publication date: 01/01/2013

Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
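To make the map/shuffle/reduce pattern mentioned in this abstract concrete, here is a minimal in-memory sketch of one generation of discrete Conway's Life expressed as a mapper and a reducer. It is only an illustration of the general pattern: it does not reproduce the authors' streaming scripts or their strip partitioning optimization, and the names `mapper`, `reducer`, and `life_step` are hypothetical.

```python
from collections import defaultdict


def mapper(cell):
    """Emit (key, value) pairs for one live cell.

    Value 0 marks the cell itself as alive; value 1 is a neighbour
    contribution sent to each of the eight surrounding cells.
    """
    x, y = cell
    yield cell, 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                yield (x + dx, y + dy), 1


def reducer(cell, values):
    """Decide whether `cell` is alive in the next generation."""
    values = list(values)
    alive = 0 in values          # the cell emitted its own marker
    neighbours = sum(values)     # markers are 0, so only neighbours count
    return neighbours == 3 or (alive and neighbours == 2)


def life_step(live_cells):
    """Run map, an in-memory shuffle (group by key), then reduce."""
    groups = defaultdict(list)
    for cell in live_cells:
        for key, value in mapper(cell):
            groups[key].append(value)
    return {cell for cell, values in groups.items() if reducer(cell, values)}


if __name__ == "__main__":
    blinker = {(0, 1), (1, 1), (2, 1)}
    print(sorted(life_step(blinker)))
```

In a real streaming job the grouping done here by the dictionary would be performed by the framework's shuffle phase between the map and reduce processes.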

Chapman University Digital Commons