
19,545 research outputs found

Economic Analysis and Statistical Disclosure Limitation

Author: Abowd John M.
Schmutte Ian M
Publication venue: DigitalCommons@ILR
Publication date: 13/08/2015

This paper explores the consequences for economic research of methods used by data publishers to protect the privacy of their respondents. We review the concept of statistical disclosure limitation for an audience of economists who may be unfamiliar with these methods. We characterize what it means for statistical disclosure limitation to be ignorable. When it is not ignorable, we consider the effects of statistical disclosure limitation for a variety of research designs common in applied economic research. Because statistical agencies do not always report the methods they use to protect confidentiality, we also characterize settings in which statistical disclosure limitation methods are discoverable; that is, they can be learned from the released data. We conclude with advice for researchers, journal editors, and statistical agencies.

DigitalCommons@ILR

Differentially Private Exponential Random Graphs

Author: A. Goldenberg
A. Hout
C. Dwork
C.J. Geyer
D.R. Hunter
G. Robins
L. Michell
M. Morris
M. Pearson
M.S. Handcock
O. Frank
P.S. Bearman
S.M. Goodreau
T.A.B. Snijders
V. Karwa
Y.M.J. Woo
Publication date: 01/01/2014

We propose methods to release and analyze synthetic graphs in order to protect the privacy of individual relationships captured by the social network. The proposed techniques aim at fitting and estimating a wide class of exponential random graph models (ERGMs) in a differentially private manner, and thus offer rigorous privacy guarantees. More specifically, we use the randomized response mechanism to release networks under $\epsilon$-edge differential privacy. To maintain utility for statistical inference, treating the original graph as missing, we propose a way to use likelihood-based inference and Markov chain Monte Carlo (MCMC) techniques to fit ERGMs to the produced synthetic networks. We demonstrate the usefulness of the proposed techniques on a real data example.
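As a rough illustration of the randomized response idea mentioned in this abstract, the sketch below flips the presence or absence of each potential edge independently with probability 1 / (1 + e^ε), a standard choice that yields ε-edge differential privacy. It is an assumption-laden sketch, not the authors' implementation: the function names, the undirected-graph representation, and the single symmetric flip probability are all illustrative choices, and the likelihood-based ERGM fitting step is not shown.

```python
import math
import random
from itertools import combinations


def flip_probability(epsilon):
    """Probability of flipping one dyad so that keep/flip odds equal e^epsilon."""
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response_graph(n_nodes, edges, epsilon, rng=None):
    """Release a noisy copy of an undirected graph on nodes 0..n_nodes-1.

    Hypothetical helper: every unordered node pair (dyad) has its edge
    indicator flipped independently with probability flip_probability(epsilon).
    """
    rng = rng or random.Random()
    p_flip = flip_probability(epsilon)
    true_edges = {frozenset(e) for e in edges}
    released = set()
    for u, v in combinations(range(n_nodes), 2):
        present = frozenset((u, v)) in true_edges
        if rng.random() < p_flip:
            present = not present  # flip this dyad
        if present:
            released.add((u, v))
    return released


if __name__ == "__main__":
    # Toy 5-node graph; smaller epsilon means more flips and stronger privacy.
    noisy = randomized_response_graph(5, [(0, 1), (1, 2), (3, 4)], epsilon=1.0)
    print(sorted(noisy))
```

Because every dyad is perturbed, the released graph is typically much denser than a sparse original when ε is small, which is one reason the abstract treats the original graph as missing data rather than analyzing the noisy graph directly.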

arXiv.org e-Print Archive

Crossref

Research Online

Avoiding disclosure of individually identifiable health information: a literature review

Author: Borton Joshua
Fernandes-Huessy Johannes
Gonzalez Claudia
Hair Elizabeth
Holden Craig
Mulcahy Tim
Prada Sergio I

Achieving data and information dissemination without harming anyone is a central task of any entity in charge of collecting data. In this article, the authors examine the literature on data and statistical confidentiality. Rather than comparing the theoretical properties of specific methods, they emphasize the main themes that emerge from the ongoing discussion among scientists regarding how best to achieve the appropriate balance between data protection, data utility, and data dissemination. They cover the literature on de-identification and reidentification methods with emphasis on health care data. The authors also discuss the benefits and limitations of the most common access methods. Although there is abundant theoretical and empirical research, their review reveals a lack of consensus on fundamental questions for empirical practice: how to assess disclosure risk, how to choose among disclosure methods, how to assess reidentification risk, and how to measure utility loss.

Keywords: public use files, disclosure avoidance, reidentification, de-identification, data utility

Research Papers in Economics

Data Breaches in Higher Education Institutions

Author: Mello Samantha
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2018

UNH Scholars' Repository

Quantifying the invisible audience in social networks

Author: Michael S. Bernstein
Publication venue: Stanford HCI Group
Publication date: 03/05/2013

This paper combines survey and large-scale log data to examine how well users’ perceptions of their audience match their actual audience on Facebook.

Abstract: When you share content in an online social network, who is listening? Users have scarce information about who actually sees their content, making their audience seem invisible and difficult to estimate. However, understanding this invisible audience can impact both science and design, since perceived audiences influence content production and self-presentation online. In this paper, we combine survey and large-scale log data to examine how well users’ perceptions of their audience match their actual audience on Facebook. We find that social media users consistently underestimate their audience size for their posts, guessing that their audience is just 27% of its true size. Qualitative coding of survey responses reveals folk theories that attempt to reverse-engineer audience size using feedback and friend count, though none of these approaches are particularly accurate. We analyze audience logs for 222,000 Facebook users’ posts over the course of one month and find that publicly visible signals (friend count, likes, and comments) vary widely and do not strongly indicate the audience of a single post. Despite the variation, users typically reach 61% of their friends each month. Together, our results begin to reveal the invisible undercurrents of audience attention and behavior in online social networks.

Authored by Michael S. Bernstein, Eytan Bakshy, Moira Burke and Brian Karrer

Analysis and Policy Observatory (APO)