Differentially Private Model Selection with Penalized and Constrained Likelihood
In statistical disclosure control, the goal of data analysis is twofold: The
released information must provide accurate and useful statistics about the
underlying population of interest, while minimizing the potential for an
individual record to be identified. In recent years, the notion of differential
privacy has received much attention in theoretical computer science, machine
learning, and statistics. It provides a rigorous and strong notion of
protection for individuals' sensitive information. A fundamental question is
how to incorporate differential privacy into traditional statistical inference
procedures. In this paper we study model selection in multivariate linear
regression under the constraint of differential privacy. We show that model
selection procedures based on penalized least squares or likelihood can be made
differentially private by a combination of regularization and randomization,
and propose two algorithms to do so. We show that our private procedures are
consistent under essentially the same conditions as the corresponding
non-private procedures. We also find that under differential privacy, the
procedure becomes more sensitive to the tuning parameters. We illustrate and
evaluate our method using simulation studies and two real data examples.
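To make the "randomization" ingredient concrete, here is a minimal sketch of one standard way to pick a model privately: the exponential mechanism applied to penalized scores. It is a generic illustration, not a reproduction of the two algorithms proposed in the paper. The candidate models, their penalized losses, and the sensitivity bound of 1.0 are hypothetical values chosen for the example; in practice the sensitivity has to be derived, for instance by bounding or clipping the data.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity):
    """Exponential-mechanism probabilities over candidates (higher score = better)."""
    scale = epsilon / (2.0 * sensitivity)
    top = max(scores)  # subtract the max so the exponentials cannot overflow
    weights = [math.exp(scale * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def private_select(scores, epsilon, sensitivity, rng):
    """Sample one candidate index according to the exponential mechanism."""
    probs = selection_probabilities(scores, epsilon, sensitivity)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


# Hypothetical penalized losses (e.g. residual sum of squares plus a penalty).
penalized_loss = {"x1": 12.4, "x1+x2": 9.1, "x1+x2+x3": 9.6}
names = list(penalized_loss)
scores = [-penalized_loss[name] for name in names]  # lower loss -> higher score

# ASSUMPTION: a sensitivity of 1.0 is taken as given, not derived here.
probs = selection_probabilities(scores, epsilon=1.0, sensitivity=1.0)
chosen = names[private_select(scores, 1.0, 1.0, random.Random(0))]
```

A smaller `epsilon` flattens the probabilities toward uniform, so the selected model depends less on the data and more on chance. This is one way to see why tuning matters more in the private setting.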
Learning without Recall by Random Walks on Directed Graphs
We consider a network of agents that aim to learn some unknown state of the
world using private observations and exchange of beliefs. At each time, agents
observe private signals generated based on the true unknown state. Each agent
might not be able to distinguish the true state based only on her private
observations. This occurs when some other states are observationally equivalent
to the true state from the agent's perspective. To overcome this shortcoming,
agents must communicate with each other to benefit from local observations. We
propose a model where each agent selects one of her neighbors randomly at each
time. Then, she refines her opinion using her private signal and the prior of
that particular neighbor. The proposed rule can be thought of as a Bayesian
agent who cannot recall the priors based on which other agents make inferences.
This learning without recall approach preserves some aspects of the Bayesian
inference while being computationally tractable. By establishing a
correspondence with a random walk on the network graph, we prove that under the
described protocol, agents learn the truth exponentially fast in the almost
sure sense. The asymptotic rate is expressed as the sum of the relative
entropies between the signal structures of every agent weighted by the
stationary distribution of the random walk.
Comment: 6 pages, To Appear in Conference on Decision and Control 201
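The update rule described in this abstract can be sketched as a small simulation. Each agent draws a private signal, picks one neighbor at random, and combines its own signal likelihood with that neighbor's previous belief. Beliefs are tracked as log-ratios between two states, so each agent's value is a sum of log-likelihood ratios collected along a random walk on the graph. The three-agent ring with self-loops, the Bernoulli signal probabilities, and the synchronous update are assumptions made for the example, not details taken from the paper.

```python
import math
import random


def simulate(steps, seed=0):
    """Run the random-neighbor belief update; the true state is state 0."""
    rng = random.Random(seed)
    # ASSUMPTION: a directed ring with self-loops (strongly connected).
    neighbors = {0: [0, 1], 1: [1, 2], 2: [2, 0]}
    # P(signal = 1 | state) for states (0, 1). Only agent 0 is informative;
    # agents 1 and 2 cannot tell the two states apart on their own.
    p_one = {0: (0.2, 0.8), 1: (0.5, 0.5), 2: (0.5, 0.5)}
    # log(belief in state 1 / belief in state 0); 0.0 is a uniform prior.
    log_ratio = {i: 0.0 for i in neighbors}
    for _ in range(steps):
        previous = dict(log_ratio)  # every agent reads last round's beliefs
        for i in neighbors:
            signal = 1 if rng.random() < p_one[i][0] else 0  # drawn under state 0
            like0 = p_one[i][0] if signal else 1 - p_one[i][0]
            like1 = p_one[i][1] if signal else 1 - p_one[i][1]
            j = rng.choice(neighbors[i])  # one randomly selected neighbor
            log_ratio[i] = math.log(like1 / like0) + previous[j]
    return log_ratio


def belief_in_true_state(log_ratio):
    """Convert a log-ratio back into the probability assigned to state 0."""
    return 1.0 / (1.0 + math.exp(log_ratio))


final = simulate(steps=500)
beliefs = {i: belief_in_true_state(v) for i, v in final.items()}
```

In this setup the uninformative agents learn only through the beliefs they copy from neighbors, which is the situation the abstract describes for observationally equivalent states.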
Private Estimation and Inference in High-Dimensional Regression with FDR Control
This paper presents novel methodologies for conducting practical
differentially private (DP) estimation and inference in high-dimensional linear
regression. We start by proposing a differentially private Bayesian Information
Criterion (BIC) for selecting the unknown sparsity parameter in DP-Lasso,
eliminating the need for prior knowledge of model sparsity, a requisite in the
existing literature. Then we propose a differentially private debiased LASSO
algorithm that enables privacy-preserving inference on regression parameters.
Our proposed method enables accurate and private inference on the regression
parameters by leveraging the inherent sparsity of high-dimensional linear
regression models. Additionally, we address the issue of multiple testing in
high-dimensional linear regression by introducing a differentially private
multiple testing procedure that controls the false discovery rate (FDR). This
allows for accurate and privacy-preserving identification of significant
predictors in the regression model. Through extensive simulations and real data
analysis, we demonstrate the efficacy of our proposed methods in conducting
inference for high-dimensional linear models while safeguarding privacy and
controlling the FDR.
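For readers unfamiliar with FDR control, the classical non-private baseline is the Benjamini-Hochberg step-up procedure, sketched below. This is not the private procedure proposed in the paper, which the abstract does not specify; the p-values and the target level are made up for the example.

```python
def benjamini_hochberg(p_values, q):
    """Return sorted indices of hypotheses rejected at FDR level q (non-private)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    # Step-up rule: find the largest rank k with p_(k) <= k * q / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    # Reject the cutoff smallest p-values.
    return sorted(order[:cutoff])


# Hypothetical p-values for eight predictors.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(p_values, q=0.05)
```

A private variant would have to randomize this selection step, since the set of rejected hypotheses itself reveals information about the data.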
An Account of Opinion Implicatures
While previous sentiment analysis research has concentrated on the
interpretation of explicitly stated opinions and attitudes, this work initiates
the computational study of a type of opinion implicature (i.e.,
opinion-oriented inference) in text. This paper describes a rule-based
framework for representing and analyzing opinion implicatures which we hope
will contribute to deeper automatic interpretation of subjective language. In
the course of understanding implicatures, the system recognizes implicit
sentiments (and beliefs) toward various events and entities in the sentence,
often attributed to different sources (holders) and of mixed polarities; thus,
it produces a richer interpretation than is typical in opinion analysis.
Comment: 50 Pages. Submitted to the journal, Language Resources and Evaluatio
Enabling Social Applications via Decentralized Social Data Management
An unprecedented information wealth produced by online social networks,
further augmented by location/collocation data, is currently fragmented across
different proprietary services. Combined, it can accurately represent the
social world and enable novel socially-aware applications. We present
Prometheus, a socially-aware peer-to-peer service that collects social
information from multiple sources into a multigraph managed in a decentralized
fashion on user-contributed nodes, and exposes it through an interface
implementing non-trivial social inferences while complying with user-defined
access policies. Simulations and experiments on PlanetLab with emulated
application workloads show the system exhibits good end-to-end response time,
low communication overhead and resilience to malicious attacks.
Comment: 27 pages, single ACM column, 9 figures, accepted in Special Issue of
Foundations of Social Computing, ACM Transactions on Internet Technolog