A Utility-Theoretic Approach to Privacy in Online Services
Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by personalizing services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing activity may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy on the part of users, providers, and government agencies acting on behalf of citizens may limit the access of services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can find a provably near-optimal solution to the utility-privacy tradeoff in an efficient manner. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey aimed at eliciting people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.
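The utility-privacy tradeoff described above can be illustrated with a minimal sketch (not the paper's actual formulation): greedily share the personal attributes with the best utility-per-privacy-cost ratio until a privacy budget is exhausted. The attribute names and numbers below are invented for illustration.

```python
# Hypothetical utility-privacy tradeoff: each attribute has an assumed
# utility gain for search quality and a privacy cost for sharing it.

def select_attributes(attributes, budget):
    """attributes: dict name -> (utility_gain, privacy_cost)."""
    chosen, spent = [], 0.0
    # Rank by utility gained per unit of privacy cost, best first.
    ranked = sorted(attributes.items(),
                    key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
    for name, (gain, cost) in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen

prefs = {"city": (0.30, 1.0), "age_band": (0.10, 0.5),
         "full_history": (0.50, 5.0), "last_query": (0.25, 1.5)}
print(select_attributes(prefs, budget=3.0))
```

Under these invented numbers, the cheap, informative attributes are shared while the expensive full history is withheld, mirroring the paper's finding that a small amount of user information already buys significant personalization.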
New perspectives and applications for greedy algorithms in machine learning
Approximating probability densities is a core problem in Bayesian statistics, where inference involves the computation of a posterior distribution. Variational Inference (VI) is a technique for approximating posterior distributions through optimization: it involves specifying a set of tractable densities, out of which the final approximation is chosen. While VI is traditionally motivated by the goal of tractability, the focus of this dissertation is to use Bayesian approximation to obtain parsimonious distributions. With this goal in mind, we develop greedy algorithm variants and study their theoretical properties by establishing novel connections between the resulting optimization problems in parsimonious VI and traditional studies in the discrete optimization literature. Specific realizations lead to efficient solutions for many sparse probabilistic models, such as sparse regression, sparse PCA, and sparse collective matrix factorization (CMF). For cases where existing results are insufficient to provide acceptable approximation guarantees, we extend the optimization results for some large-scale algorithms to a much larger class of functions. The developed methods are applied to both simulated and real-world datasets, including high-dimensional functional Magnetic Resonance Imaging (fMRI) data, and to the real-world tasks of interpreting data exploration and model predictions.
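One of the sparse models mentioned above, sparse regression, admits a classic greedy treatment that conveys the flavor of such algorithms (this is a standard orthogonal-matching-pursuit sketch, not the dissertation's method): repeatedly add the feature most correlated with the residual, then refit by least squares on the selected support.

```python
import numpy as np

def greedy_sparse_regression(X, y, k):
    """Greedy forward selection (OMP-style) of k features."""
    n, d = X.shape
    support, residual = [], y.copy()
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        scores = np.abs(X.T @ residual)
        scores[support] = -np.inf          # never re-pick a chosen feature
        support.append(int(np.argmax(scores)))
        # Refit on the selected support and update the residual.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    w = np.zeros(d)
    w[support] = coef
    return w, sorted(support)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7]          # true support {2, 7}, no noise
w, support = greedy_sparse_regression(X, y, k=2)
print(support)
```

On this noiseless toy instance the greedy procedure recovers the true support exactly; the dissertation's contribution is precisely to characterize when such greedy steps come with approximation guarantees.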
Submodularity in Action: From Machine Learning to Signal Processing Applications
Submodularity is a discrete domain functional property that can be
interpreted as mimicking the role of the well-known convexity/concavity
properties in the continuous domain. Submodular functions exhibit strong
structure that leads to efficient optimization algorithms with provable
near-optimality guarantees. These characteristics, namely, efficiency and
provable performance bounds, are of particular interest for signal processing
(SP) and machine learning (ML) practitioners as a variety of discrete
optimization problems are encountered in a wide range of applications.
Conventionally, two general approaches exist to solve discrete problems:
relaxation into the continuous domain to obtain an approximate solution, or
development of a tailored algorithm that applies directly in the
discrete domain. In both approaches, worst-case performance guarantees are
often hard to establish; moreover, the resulting algorithms are often too
complex to be practical for large-scale problems. In this paper, we show how certain
scenarios lend themselves to exploiting submodularity so as to construct
scalable solutions with provable worst-case performance guarantees. We
introduce a variety of submodular-friendly applications, and elucidate the
relation of submodularity to convexity and concavity which enables efficient
optimization. With a mixture of theory and practice, we present different
flavors of submodularity accompanying illustrative real-world case studies from
modern SP and ML. In all cases, optimization algorithms are presented, along
with hints on how optimality guarantees can be established.
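A concrete instance of the efficiency-plus-guarantees theme in this abstract is the classic greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, which attains a (1 - 1/e) worst-case approximation (Nemhauser, Wolsey, and Fisher, 1978). The small coverage instance below is invented for illustration.

```python
# Greedy maximum coverage: set coverage is monotone and submodular, so
# picking the set with the largest marginal gain at each step is
# provably near-optimal.

def greedy_max_cover(sets, k):
    """Pick k sets greedily to maximize the number of covered elements."""
    chosen, covered = [], set()
    for _ in range(k):
        # Marginal gain of each remaining set given what is already covered.
        best = max((s for s in sets if s not in chosen),
                   key=lambda s: len(sets[s] - covered))
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 7}}
chosen, covered = greedy_max_cover(sets, k=2)
print(chosen, len(covered))
```

Note how the second pick is scored by its *marginal* gain over the elements already covered, not its raw size; this diminishing-returns structure is exactly what submodularity formalizes.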
Active planning for underwater inspection and the benefit of adaptivity
We discuss the problem of inspecting an underwater structure, such as a submerged ship hull, with an autonomous underwater vehicle (AUV). Unlike a large body of prior work, we focus on planning the views of the AUV to improve the quality of the inspection, rather than maximizing the accuracy of a given data stream. We formulate the inspection planning problem as an extension to Bayesian active learning, and we show connections to recent theoretical guarantees in this area. We rigorously analyze the benefit of adaptive re-planning for such problems, and we prove that the potential benefit of adaptivity can be reduced from an exponential to a constant factor by changing the problem from cost minimization with a constraint on information gain to variance reduction with a constraint on cost. Such analysis allows the use of robust, non-adaptive planning algorithms that perform competitively with adaptive algorithms. Based on our analysis, we propose a method for constructing 3D meshes from sonar-derived point clouds, and we introduce uncertainty modeling through non-parametric Bayesian regression. Finally, we demonstrate the benefit of active inspection planning using sonar data from ship hull inspections with the Bluefin-MIT Hovering AUV.
Funding: United States Office of Naval Research (ONR Grants N00014-09-1-0700 and N00014-07-1-00738); National Science Foundation (NSF Grants 0831728, CCR-0120778, and CNS-1035866).
Bayesian batch active learning as sparse subset approximation
Leveraging the wealth of unlabeled data produced in recent years provides
great potential for improving supervised models. When the cost of acquiring
labels is high, probabilistic active learning methods can be used to greedily
select the most informative data points to be labeled. However, for many
large-scale problems standard greedy procedures become computationally
infeasible and suffer from negligible model change. In this paper, we introduce
a novel Bayesian batch active learning approach that mitigates these issues.
Our approach is motivated by approximating the complete data posterior of the
model parameters. While naive batch construction methods result in correlated
queries, our algorithm produces diverse batches that enable efficient active
learning at scale. We derive interpretable closed-form solutions akin to
existing active learning procedures for linear models, and generalize to
arbitrary models using random projections. We demonstrate the benefits of our
approach on several large-scale regression and classification tasks.
Comment: NeurIPS 201
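The greedy baseline that this abstract contrasts with can be sketched for a Bayesian linear model (this is a generic uncertainty-sampling loop, not the paper's sparse-subset batch method): repeatedly query the unlabeled point with the largest posterior predictive variance, then update the posterior with it. The candidate pool below is invented.

```python
import numpy as np

def greedy_active_selection(X_pool, k, noise=0.1, prior=1.0):
    """Sequentially pick the k highest-predictive-variance points."""
    d = X_pool.shape[1]
    precision = np.eye(d) / prior          # posterior precision of weights
    picked = []
    for _ in range(k):
        cov = np.linalg.inv(precision)
        # Predictive variance of each candidate: x^T Sigma x.
        var = np.einsum("ij,jk,ik->i", X_pool, cov, X_pool)
        var[picked] = -np.inf              # do not query a point twice
        i = int(np.argmax(var))
        picked.append(i)
        # Bayesian update: the chosen point sharpens the posterior.
        x = X_pool[i]
        precision += np.outer(x, x) / noise
    return picked

X_pool = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.5, 0.5]])
print(greedy_active_selection(X_pool, k=2))
```

Because the posterior sharpens along directions already queried, the second pick automatically lands in an orthogonal direction; this is the diversity effect that naive batch construction (scoring all points against the same fixed posterior) loses, and which the paper's batch method recovers at scale.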