Search CORE

4,983 research outputs found

Sample Efficient Bayesian Reinforcement Learning

Author: Grover Divya
Publication venue
Publication date: 01/01/2020
Field of study

Artificial Intelligence (AI) has been an active field of research for over a century now. The research field of AI may be grouped into various tasks that are expected from an intelligent agent; two major ones being learning & inference and planning. The act of storing new knowledge is known as learning while inference refers to the act to extracting conclusions given agent’s limited knowledge base. They are tightly knit by the design of its knowledge base. The process of deciding long-term actions or plans given its current knowledge is called planning.Reinforcement Learning (RL) brings together these two tasks by posing a seemingly benign question “How to act optimally in an unknown environment?”. This requires the agent to learn about its environment as well as plan actions given its current knowledge about it. In RL, the environment can be represented by a mathematical model and we associate an intrinsic value to the actions that the agent may choose.In this thesis, we present a novel Bayesian algorithm for the problem of RL. Bayesian RL is a widely explored area of research but is constrained by scalability and performance issues. We provide first steps towards rigorous analysis of these types of algorithms. Bayesian algorithms are characterized by the belief that they maintain over their unknowns; which is updated based on the collected evidence. This is different from the traditional approach in RL in terms of problem formulation and formal guarantees. Our novel algorithm combines aspects of planning and learning due to its inherent Bayesian formulation. It does so in a more scalable fashion, with formal PAC guarantees. We also give insights on the application of Bayesian framework for the estimation of model and value, in a joint work on Bayesian backward induction for RL

Chalmers Research

Recommended from our members

Security, Privacy, and Transparency Guarantees for Machine Learning Systems

Author: Lecuyer Mathias
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019
Field of study

Machine learning (ML) is transforming a wide range of applications, promising to bring immense economic and social benefits. However, it also raises substantial security, privacy, and transparency challenges. ML workloads indeed push companies toward aggressive data collection and loose data access policies, placing troves of sensitive user information at risk if the company is hacked. ML also introduces new attack vectors, such as adversarial example attacks, which can completely nullify models’ accuracy under attack. Finally, ML models make complex data-driven decisions, which are opaque to the end-users, and difficult to inspect for programmers. In this dissertation we describe three systems we developed. Each system addresses a dimension of the previous challenges, by combining new practical systems techniques with rigorous theory to achieve a guaranteed level of protection, and make systems easier to understand. First we present Sage, a differentially private ML platform that enforces a meaningful protection semantic for the troves of personal information amassed by today’s companies. Second we describe PixelDP, a defense against adversarial examples that leverages differential privacy theory to provide a guaranteed level of accuracy under attack. Third we introduce Sunlight, a tool to enhance the transparency of opaque targeting services, using rigorous causal inference theory to explain targeting decisions to end-users

Columbia University Academic Commons

Fundamental Approaches to Software Engineering

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

This open access book constitutes the proceedings of the 24th International Conference on Fundamental Approaches to Software Engineering, FASE 2021, which took place during March 27–April 1, 2021, and was held as part of the Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg but changed to an online format due to the COVID-19 pandemic. The 16 full papers presented in this volume were carefully reviewed and selected from 52 submissions. The book also contains 4 Test-Comp contributions

OAPEN Library

Machine Unlearning: A Survey

Author: Xu Heng
Yu Philip S.
Zhang Lefeng
Zhou Wanlei
Zhu Tianqing
Publication venue
Publication date: 06/06/2023
Field of study

Machine learning has attracted widespread attention and evolved into an enabling technology for a wide range of highly successful applications, such as intelligent computer vision, speech recognition, medical diagnosis, and more. Yet a special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning. This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality. At the same time, this ambitious problem has led to numerous research efforts aimed at confronting its challenges. To the best of our knowledge, no study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios. Accordingly, with this survey, we aim to capture the key concepts of unlearning techniques. The existing solutions are classified and summarized based on their characteristics within an up-to-date and comprehensive review of each category's advantages and limitations. The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities

arXiv.org e-Print Archive

Recommended from our members

New Data Protection Abstractions for Emerging Mobile and Big Data Workloads

Author: Spahn Riley Burns
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Field of study

Two recent shifts in computing are challenging the effectiveness of traditional approaches to data protection. Emerging machine learning workloads have complex access patterns and unique leakage characteristics that are not well supported by existing protection approaches. Second, mobile operating systems do not provide sufficient support for fine grained data protection tools forcing users to rely on individual applications to correctly manage and protect data. My thesis is that these emerging workloads have unique characteristics that we can leverage to build new, more effective data protection abstractions. This dissertation presents two new data protection systems for machine learning work-loads and a new system for fine grained data management and protection on mobile devices. First is Sage, a differentially private machine learning platform addressing the two primary challenges of differential privacy: running out of budget and the privacy utility tradeoff. The second system, Pyramid, is the first selective data system. Pyramid leverages count featurization to reduce the amount of data exposed while training classification models by two orders of magnitude. The final system, Pebbles, provides users with logical data objects as a new fine grained data management and protection primitive allowing data management at a higher level of abstraction. Pebbles, leverages high level storage abstractions in mobile operating systems to discover user recognizable application level data objects in unmodified mobile applications

Columbia University Academic Commons

Trusted content-based publish/subscribe trees

Author: Naicken Stephen Murugapa
Publication venue
Publication date: 10/03/2012
Field of study

Publish/Subscribe systems hold strong assumptions of the expected behaviour of clients and routers, as it is assumed they all abide by the matching and routing protocols. Assumptions of implicit trust between the components of the publish/subscribe infrastructure are acceptable where the underlying event distribution service is under the control of a single or multiple co-operating administrative entities and contracts between clients and these authorities exist, however there are application contexts where these presumptions do not hold. In such environments, such as ad hoc networks, there is the possibility of selfish and malicious behaviour that can lead to disruption of the routing and matching algorithms. The most commonly researched approach to security in publish/subscribe systems is role-based access control (RBAC). RBAC is suitable for ensuring confidentiality, but due to the assumption of strong identities associated with well defined roles and the absence of monitoring systems to allow for adaptable policies in response to the changing behaviour of clients, it is not appropriate for environments where: identities can not be assigned to roles in the absence of a trusted administrative entity; long-lived identities of entities do not exist; and where the threat model consists of highly adaptable malicious and selfish entities. Motivated by recent work in the application of trust and reputation to Peer-to-Peer networks, where past behaviour is used to generate trust opinions that inform future transactions, we propose an approach where the publish/subscribe infrastructure is constructed and re-configured with respect to the trust preferences of clients and routers. In this thesis, we show how Publish/Subscribe trees (PSTs) can be constructed with respect to the trust preferences of publishers and subscribers, and the overhead costs of event dissemination. Using social welfare theory, it is shown that individual trust preferences over clients and routers, which are informed by a variety of trust sources, can be aggregated to give a social preference over the set of feasible PSTs. By combining this and the existing work on PST overheads, the Maximum Trust PST with Overhead Budget problem is defined and is shown to be in NP-complete. An exhaustive search algorithm is proposed that is shown to be suitable only for very small problem sizes. To improve scalability, a faster tabu search algorithm is presented, which is shown to scale to larger problem instances and gives good approximations of the optimal solutions. The research contributions of this work are: the use of social welfare theory to provide a mechanism to establish the trustworthiness of PSTs; the finding that individual trust is not interpersonal comparable as is considered to be the case in much of the trust literature; the Maximum Trust PST with Overhead Budget problem; and algorithms to solve this problem

Sussex Research Online