4,982 research outputs found
Sample Efficient Bayesian Reinforcement Learning
Artificial Intelligence (AI) has been an active field of research for over a century now. The research field of AI may be grouped into the various tasks expected of an intelligent agent, two major ones being learning and inference, and planning. The act of storing new knowledge is known as learning, while inference refers to the act of extracting conclusions given the agent’s limited knowledge base. The two are tightly knit by the design of that knowledge base. The process of deciding long-term actions or plans given the agent’s current knowledge is called planning. Reinforcement Learning (RL) brings together these two tasks by posing a seemingly benign question: “How to act optimally in an unknown environment?” This requires the agent to learn about its environment as well as plan actions given its current knowledge about it. In RL, the environment can be represented by a mathematical model, and we associate an intrinsic value with the actions that the agent may choose. In this thesis, we present a novel Bayesian algorithm for the problem of RL. Bayesian RL is a widely explored area of research but is constrained by scalability and performance issues. We take first steps towards a rigorous analysis of these types of algorithms. Bayesian algorithms are characterized by the belief that they maintain over their unknowns, which is updated based on the collected evidence. This is different from the traditional approach in RL in terms of problem formulation and formal guarantees. Our novel algorithm combines aspects of planning and learning due to its inherent Bayesian formulation. It does so in a more scalable fashion, with formal PAC guarantees. We also give insights on the application of the Bayesian framework for the estimation of model and value, in a joint work on Bayesian backward induction for RL.
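As an illustration of the idea that a Bayesian agent maintains a belief over its unknowns and updates it from evidence, the following is a minimal sketch, not the algorithm presented in the thesis. It assumes a small discrete environment and a Dirichlet belief over the transition probabilities; the class name `DirichletTransitionBelief` and its parameters are hypothetical.

```python
from collections import defaultdict


class DirichletTransitionBelief:
    """Hypothetical belief over unknown transition probabilities P(s' | s, a).

    Assumption: states are the integers 0..n_states-1 and every
    (state, action) pair starts from a symmetric Dirichlet prior.
    """

    def __init__(self, n_states, prior_count=1.0):
        self.n_states = n_states
        # counts[(s, a)][s'] is the Dirichlet parameter for next state s'.
        self.counts = defaultdict(lambda: [prior_count] * n_states)

    def update(self, state, action, next_state):
        # Conjugate update: observing one transition adds one pseudo-count.
        self.counts[(state, action)][next_state] += 1.0

    def posterior_mean(self, state, action):
        # Expected transition probabilities under the current belief.
        alpha = self.counts[(state, action)]
        total = sum(alpha)
        return [a / total for a in alpha]


belief = DirichletTransitionBelief(n_states=3)
for observed_next in [1, 1, 2]:
    belief.update(state=0, action=0, next_state=observed_next)

mean = belief.posterior_mean(0, 0)
print(mean)
```

With the uniform prior, the belief for an unvisited state-action pair stays uniform, while observed transitions shift probability mass towards the next states actually seen.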
Security, Privacy, and Transparency Guarantees for Machine Learning Systems
Machine learning (ML) is transforming a wide range of applications, promising to bring immense economic and social benefits. However, it also raises substantial security, privacy, and transparency challenges. ML workloads push companies toward aggressive data collection and loose data access policies, placing troves of sensitive user information at risk if the company is hacked. ML also introduces new attack vectors, such as adversarial example attacks, which can completely nullify models’ accuracy under attack. Finally, ML models make complex data-driven decisions, which are opaque to end-users and difficult for programmers to inspect. In this dissertation we describe three systems we developed. Each system addresses one dimension of these challenges by combining new practical systems techniques with rigorous theory to achieve a guaranteed level of protection and make systems easier to understand. First, we present Sage, a differentially private ML platform that enforces a meaningful protection semantic for the troves of personal information amassed by today’s companies. Second, we describe PixelDP, a defense against adversarial examples that leverages differential privacy theory to provide a guaranteed level of accuracy under attack. Third, we introduce Sunlight, a tool to enhance the transparency of opaque targeting services, using rigorous causal inference theory to explain targeting decisions to end-users.
Fundamental Approaches to Software Engineering
This open access book constitutes the proceedings of the 24th International Conference on Fundamental Approaches to Software Engineering, FASE 2021, which took place from March 27 to April 1, 2021, and was held as part of the Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg but changed to an online format due to the COVID-19 pandemic. The 16 full papers presented in this volume were carefully reviewed and selected from 52 submissions. The book also contains 4 Test-Comp contributions.
Machine Unlearning: A Survey
Machine learning has attracted widespread attention and evolved into an
enabling technology for a wide range of highly successful applications, such as
intelligent computer vision, speech recognition, medical diagnosis, and more.
Yet a special need has arisen where, due to privacy, usability, and/or the
right to be forgotten, information about some specific samples needs to be
removed from a model, a process called machine unlearning. This emerging technology has
drawn significant interest from both academia and industry due to its
innovation and practicality. At the same time, this ambitious problem has led
to numerous research efforts aimed at confronting its challenges. To the best
of our knowledge, no study has analyzed this complex topic or compared the
feasibility of existing unlearning solutions in different kinds of scenarios.
Accordingly, with this survey, we aim to capture the key concepts of unlearning
techniques. The existing solutions are classified and summarized based on their
characteristics within an up-to-date and comprehensive review of each
category's advantages and limitations. The survey concludes by highlighting
some of the outstanding issues with unlearning techniques, along with some
feasible directions for new research opportunities.
New Data Protection Abstractions for Emerging Mobile and Big Data Workloads
Two recent shifts in computing are challenging the effectiveness of traditional approaches to data protection. First, emerging machine learning workloads have complex access patterns and unique leakage characteristics that are not well supported by existing protection approaches. Second, mobile operating systems do not provide sufficient support for fine-grained data protection tools, forcing users to rely on individual applications to correctly manage and protect data. My thesis is that these emerging workloads have unique characteristics that we can leverage to build new, more effective data protection abstractions.
This dissertation presents two new data protection systems for machine learning workloads and a new system for fine-grained data management and protection on mobile devices. The first is Sage, a differentially private machine learning platform addressing the two primary challenges of differential privacy: running out of budget and the privacy-utility tradeoff. The second system, Pyramid, is the first selective data system. Pyramid leverages count featurization to reduce the amount of data exposed while training classification models by two orders of magnitude. The final system, Pebbles, provides users with logical data objects as a new fine-grained data management and protection primitive, allowing data management at a higher level of abstraction. Pebbles leverages high-level storage abstractions in mobile operating systems to discover user-recognizable application-level data objects in unmodified mobile applications.
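To make the term count featurization concrete, here is a minimal sketch of the general technique, not Pyramid's implementation. It assumes binary labels and a single categorical feature; the function names `build_count_table` and `featurize` and the `smoothing` parameter are hypothetical. The raw feature value is replaced by a smoothed estimate of the positive-label rate computed from per-value counts, so a model can train on the compact count table rather than on the raw records.

```python
from collections import defaultdict


def build_count_table(values, labels):
    """Count negative/positive labels seen for each categorical value.

    Assumption: labels are 0 or 1.
    """
    table = defaultdict(lambda: [0, 0])  # [negatives, positives]
    for value, label in zip(values, labels):
        table[value][label] += 1
    return table


def featurize(value, table, smoothing=1.0):
    """Replace a raw value with its smoothed positive-label rate."""
    # .get() avoids creating an entry for values never seen in training.
    negatives, positives = table.get(value, (0, 0))
    return (positives + smoothing) / (negatives + positives + 2 * smoothing)


values = ["a", "a", "b", "a"]
labels = [1, 0, 1, 1]
table = build_count_table(values, labels)

feature_a = featurize("a", table)
feature_unseen = featurize("c", table)
print(feature_a, feature_unseen)
```

Unseen values fall back to the smoothed prior of 0.5, which is one simple design choice among several.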
Trusted content-based publish/subscribe trees
Publish/Subscribe systems hold strong assumptions about the expected behaviour of clients and routers, as it is assumed they all abide by the matching and routing protocols. Assumptions of implicit trust between the components of the publish/subscribe infrastructure are acceptable where the underlying event distribution service is under the control of a single or multiple co-operating administrative entities and contracts between clients and these authorities exist; however, there are application contexts where these presumptions do not hold. In such environments, for example ad hoc networks, there is the possibility of selfish and malicious behaviour that can lead to disruption of the routing and matching algorithms.
The most commonly researched approach to security in publish/subscribe systems is role-based access control (RBAC). RBAC is suitable for ensuring confidentiality, but due to the assumption of strong identities associated with well-defined roles and the absence of monitoring systems to allow for adaptable policies in response to the changing behaviour of clients, it is not appropriate for environments where: identities cannot be assigned to roles in the absence of a trusted administrative entity; long-lived identities of entities do not exist; and the threat model consists of highly adaptable malicious and selfish entities.
Motivated by recent work in the application of trust and reputation to Peer-to-Peer networks, where past behaviour is used to generate trust opinions that inform future transactions, we propose an approach where the publish/subscribe infrastructure is constructed and re-configured with respect to the trust preferences of clients and routers. In this thesis, we show how Publish/Subscribe trees (PSTs) can be constructed with respect to the trust preferences of publishers and subscribers, and the overhead costs of event dissemination. Using social welfare theory, it is shown that individual trust preferences over clients and routers, which are informed by a variety of trust sources, can be aggregated to give a social preference over the set of feasible PSTs. By combining this and the existing work on PST overheads, the Maximum Trust PST with Overhead Budget problem is defined and is shown to be NP-complete. An exhaustive search algorithm is proposed that is shown to be suitable only for very small problem sizes. To improve scalability, a faster tabu search algorithm is presented, which is shown to scale to larger problem instances and gives good approximations of the optimal solutions.
The research contributions of this work are: the use of social welfare theory to provide a mechanism to establish the trustworthiness of PSTs; the finding that individual trust is not interpersonally comparable, as is considered to be the case in much of the trust literature; the Maximum Trust PST with Overhead Budget problem; and algorithms to solve this problem.