21,820 research outputs found

    Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk

    It is well recognised that data mining and statistical analysis pose a serious threat to privacy. This is true for financial, medical, criminal and marketing research. Numerous techniques have been proposed to protect privacy, including restriction and data modification. Recently proposed privacy models such as differential privacy and k-anonymity have received a lot of attention, and for the latter there are now several improvements of the original scheme, each removing some security shortcomings of the previous one. However, the challenge lies in evaluating and comparing the privacy provided by various techniques. In this paper we propose a novel entropy-based security measure that can be applied to any generalisation, restriction or data modification technique. We use our measure to empirically evaluate and compare a few popular methods, namely query restriction, sampling and noise addition. Comment: 20 pages, 4 figures
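
The abstract does not define its measure, so the following is only a minimal sketch of the general idea behind entropy as a disclosure-risk indicator, not the paper's method. It assumes the risk is judged by the Shannon entropy of the set of values an intruder still considers possible for a confidential attribute; the function names and the toy candidate sets are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def intruder_uncertainty(candidate_values):
    """Entropy of the empirical distribution over the values an intruder
    still considers possible (hypothetical helper, not from the paper)."""
    counts = Counter(candidate_values)
    total = sum(counts.values())
    return shannon_entropy(c / total for c in counts.values())


# Toy example: before any query the intruder sees four equally likely values;
# after a query answer, only two remain plausible.
before = intruder_uncertainty(["20k", "30k", "40k", "50k"])
after = intruder_uncertainty(["30k", "40k"])
entropy_loss = before - after  # bits of uncertainty removed by the release
```

Under this assumption, a larger entropy loss means the release has told the intruder more, and zero remaining entropy means the confidential value is fully disclosed.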

    A New Method for Protecting Interrelated Time Series with Bayesian Prior Distributions and Synthetic Data

    Organizations disseminate statistical summaries of administrative data via the Web for unrestricted public use. They balance the trade-off between confidentiality protection and inference quality. Recent developments in disclosure avoidance techniques include the incorporation of synthetic data, which capture the essential features of underlying data by releasing altered data generated from a posterior predictive distribution. The United States Census Bureau collects millions of interrelated time series micro-data that are hierarchical and contain many zeros and suppressions. Rule-based disclosure avoidance techniques often require the suppression of count data for small magnitudes and the modification of data based on a small number of entities. Motivated by this problem, we use zero-inflated extensions of Bayesian Generalized Linear Mixed Models (BGLMM) with privacy-preserving prior distributions to develop methods for protecting and releasing synthetic data from time series about thousands of small groups of entities without suppression based on the magnitudes or number of entities. We find that as the prior distributions of the variance components in the BGLMM become more precise toward zero, confidentiality protection increases and inference quality deteriorates. We evaluate our methodology using a strict privacy measure, empirical differential privacy, and a newly defined risk measure, Probability of Range Identification (PoRI), which directly measures attribute disclosure risk. We illustrate our results with the U.S. Census Bureau’s Quarterly Workforce Indicators.

    Time-series cross-sectional environmental performance and disclosure relationship: specific evidence from a less-developed country

    This paper relies on the ‘vulnerability and exploitability’ framework to offer new insights into legitimacy theory and voluntary disclosure theory using specific empirical evidence from the Nigerian oil and gas industry. The study connects the voluntary and legitimizing disclosure behaviors, regarding carbon emission due to gas flaring, of dominant companies in the Nigerian upstream petroleum sector to the vulnerability and exploitability of Nigeria as a less developed country. The hypothesized relations between gas flaring-related environmental performance and two forms of its disclosure (volume and substance) are estimated and tested using Prais-Winsten regression with Panel Corrected Standard Errors (PCSE). While the paper uses Data Envelopment Analysis (DEA) to measure gas flaring-related carbon performance, the two forms of gas flaring-related disclosures are measured using content analysis. We document significant positive and negative associations between gas flaring-related carbon emission performance, on one hand, and the volumetric disclosure and disclosure substance on the other hand. These results imply that while the positive relation confirms the vulnerable nature of Nigeria as a less developed country, the negative relation is linked to the country’s exploitability. It is also empirically established that environmental performance is one of the key factors responsible for the undulating trend in the volume of environmental disclosures by large corporations operating in less-developed countries.

    Scather: programming with multi-party computation and MapReduce

    We present a prototype of a distributed computational infrastructure, an associated high level programming language, and an underlying formal framework that allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms that have privacy-preserving properties. Our architecture allows a data analyst unfamiliar with MPC to: (1) author an analysis algorithm that is agnostic with regard to data privacy policies, (2) use an automated process to derive algorithm implementation variants that have different privacy and performance properties, and (3) compile those implementation variants so that they can be deployed on an infrastructure that allows computations to take place locally within each participant’s MapReduce cluster as well as across all the participants’ clusters using an MPC protocol. We describe implementation details of the architecture, discuss and demonstrate how the formal framework enables the exploration of tradeoffs between the efficiency and privacy properties of an analysis algorithm, and present two example applications that illustrate how such an infrastructure can be utilized in practice. This work was supported in part by NSF Grants #1430145, #1414119, #1347522, and #1012798.

    GEM: a Distributed Goal Evaluation Algorithm for Trust Management

    Trust management is an approach to access control in distributed systems where access decisions are based on policy statements issued by multiple principals and stored in a distributed manner. In trust management, the policy statements of a principal can refer to other principals' statements; thus, the process of evaluating an access request (i.e., a goal) consists of finding a "chain" of policy statements that allows the access to the requested resource. Most existing goal evaluation algorithms for trust management either rely on a centralized evaluation strategy, which consists of collecting all the relevant policy statements in a single location (and therefore they do not guarantee the confidentiality of intensional policies), or do not detect the termination of the computation (i.e., when all the answers of a goal are computed). In this paper we present GEM, a distributed goal evaluation algorithm for trust management systems that relies on function-free logic programming for the specification of policy statements. GEM detects termination in a completely distributed way without disclosing intensional policies, thereby preserving their confidentiality. We demonstrate that the algorithm terminates and is sound and complete with respect to the standard semantics for logic programs. Comment: To appear in Theory and Practice of Logic Programming (TPLP)
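
To make the notion of a "chain" of policy statements concrete, here is a minimal sketch of the centralized strategy that the abstract contrasts GEM with: all statements are gathered in one place and evaluated bottom-up to a fixpoint. This is not GEM itself, it deliberately ignores distribution and confidentiality, and it assumes policies are already ground Horn clauses; the principals, predicates, and the `derive` helper are hypothetical.

```python
def derive(rules):
    """Naive bottom-up evaluation of ground Horn clauses.

    Each rule is (head, body): head holds once every atom in body holds.
    The loop stops at a fixpoint, which is reached because the set of
    known atoms only grows and the rule set is finite.
    """
    known = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(atom in known for atom in body):
                known.add(head)
                changed = True
    return known


# Hypothetical policy statements issued by three principals.
rules = [
    ("hospital:doctor(alice)", []),                          # a fact
    ("lab:staff(alice)", ["hospital:doctor(alice)"]),        # lab trusts hospital
    ("archive:may_read(alice)", ["lab:staff(alice)"]),       # archive trusts lab
    ("archive:may_read(bob)", ["lab:staff(bob)"]),           # no supporting chain
]

granted = derive(rules)
```

Here the goal `archive:may_read(alice)` succeeds through the chain hospital, lab, archive, while the goal for `bob` fails because no statement supports `lab:staff(bob)`.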

    A Cobalt-Containing Eukaryotic Nitrile Hydratase

    Nitrile hydratase (NHase), an industrially important enzyme that catalyzes the hydration of nitriles to their corresponding amides, has only been characterized from prokaryotic microbes. The putative NHase from the eukaryotic unicellular choanoflagellate organism Monosiga brevicollis (MbNHase) was heterologously expressed in Escherichia coli. The resulting enzyme expressed as a single polypeptide with fused α- and β-subunits linked by a seventeen-histidine region. Size-exclusion chromatography indicated that MbNHase exists primarily as an (αβ)2 homodimer in solution, analogous to the α2β2 homotetramer architecture observed for prokaryotic NHases. The NHase enzyme contained its full complement of Co(III) and was fully functional without the co-expression of an activator protein or E. coli GroES/EL molecular chaperones. A homology model of MbNHase was developed, identifying Cys400, Cys403, and Cys405 as active site ligands. The results presented here provide the first experimental data for a mature and active eukaryotic NHase with fused subunits. Since this new member of the NHase family is expressed from a single gene without the requirement of an activator protein, it represents an alternative biocatalyst for industrial syntheses of important amide compounds.