1,532 research outputs found

    EsPRESSo: Efficient Privacy-Preserving Evaluation of Sample Set Similarity

    Full text link
    Electronic information is increasingly often shared among entities without complete mutual trust. To address related security and privacy issues, a few cryptographic techniques have emerged that support privacy-preserving information sharing and retrieval. One interesting open problem in this context involves two parties that need to assess the similarity of their datasets, but are reluctant to disclose their actual content. This paper presents an efficient and provably-secure construction supporting the privacy-preserving evaluation of sample set similarity, where similarity is measured as the Jaccard index. We present two protocols: the first securely computes the (Jaccard) similarity of two sets, and the second approximates it, using MinHash techniques, with lower complexities. We show that our novel protocols are attractive in many compelling applications, including document/multimedia similarity, biometric authentication, and genetic tests. In the process, we demonstrate that our constructions are appreciably more efficient than prior work.Comment: A preliminary version of this paper was published in the Proceedings of the 7th ESORICS International Workshop on Digital Privacy Management (DPM 2012). This is the full version, appearing in the Journal of Computer Securit

    RANDOMIZATION BASED PRIVACY PRESERVING CATEGORICAL DATA ANALYSIS

    Get PDF
    The success of data mining relies on the availability of high quality data. To ensure quality data mining, effective information sharing between organizations becomes a vital requirement in today’s society. Since data mining often involves sensitive infor- mation of individuals, the public has expressed a deep concern about their privacy. Privacy-preserving data mining is a study of eliminating privacy threats while, at the same time, preserving useful information in the released data for data mining. This dissertation investigates data utility and privacy of randomization-based mod- els in privacy preserving data mining for categorical data. For the analysis of data utility in randomization model, we first investigate the accuracy analysis for associ- ation rule mining in market basket data. Then we propose a general framework to conduct theoretical analysis on how the randomization process affects the accuracy of various measures adopted in categorical data analysis. We also examine data utility when randomization mechanisms are not provided to data miners to achieve better privacy. We investigate how various objective associ- ation measures between two variables may be affected by randomization. We then extend it to multiple variables by examining the feasibility of hierarchical loglinear modeling. Our results provide a reference to data miners about what they can do and what they can not do with certainty upon randomized data directly without the knowledge about the original distribution of data and distortion information. Data privacy and data utility are commonly considered as a pair of conflicting re- quirements in privacy preserving data mining applications. In this dissertation, we investigate privacy issues in randomization models. In particular, we focus on the attribute disclosure under linking attack in data publishing. We propose efficient so- lutions to determine optimal distortion parameters such that we can maximize utility preservation while still satisfying privacy requirements. We compare our randomiza- tion approach with l-diversity and anatomy in terms of utility preservation (under the same privacy requirements) from three aspects (reconstructed distributions, accuracy of answering queries, and preservation of correlations). Our empirical results show that randomization incurs significantly smaller utility loss

    Programmeerimiskeeled turvalise ühisarvutuse rakenduste arendamiseks

    Get PDF
    Turvaline ühisarvutus on tehnoloogia, mis lubab mitmel sõltumatul osapoolel oma andmeid koos töödelda neis olevaid saladusi avalikustamata. Kui andmed on esitatud krüpteeritud kujul, tähendab see, et neid ei dekrüpteerita arvutuse käigus kordagi. Turvalise ühisarvutuse teoreetilised konstruktsioonid on teada olnud juba alates kaheksakümnendatest, kuid esimesed praktilised teostused ja rakendused, mis päris andmeid töötlesid, ilmusid alles natuke enam kui kümme aastat tagasi. Nüüdseks on turvalist ühisarvutust kasutatud mitmes praktilises rakenduses ning sellest on kujunenud oluline andmekaitsetehnoloogia. Turvalise ühisarvutuse rakenduste arendamine on keerukas. Vahendid, mis aitavad kaasa arendusprotsessile, on veel väga uued, ning raamistikud on sageli liiga aeglased praktiliste rakenduste jaoks. Rakendusi on endiselt võimelised arendama ainult krüptograafiaeksperdid. Käesoleva töö eesmärk on teha turvalise ühisarvutuse raamistikke paremaks ning muuta ühisarvutusrakenduste arendamist kergemaks. Väidame, et valdkon- naspetsiifiliste programmeerimiskeelte kasutamine võimaldab turvalise ühisarvu- tuse rakenduste ja raamistike ehitamist, mis on samaaegselt lihtsasti kasutatavad, hea jõudlusega, hooldatavad, usaldusväärsed ja võimelised suuri andmemahtusid töötlema. Peamise tulemusena esitleme kahte uut programmeerimiskeelt, mis on mõeldud turvalise ühisarvutuse jaoks. SecreC 2 on mõeldud turvalise ühisarvutuse rakendus- te arendamise lihtsustamiseks ja aitab kaasa sellele, et rakendused oleks turvalised ja efektiivsed. Teine keel on loodud turvalise ühisarvutuse protokollide arenda- miseks ning selle eesmärk on turvalise ühisarvutuse raamistikke paremaks muuta. Protokollide keel teeb raamistikke kiiremaks ja usaldusväärsemaks ning lihtsustab protokollide arendamist ja haldamist. Kirjeldame mõlemad keeled nii formaalselt kui mitteformaalselt. Näitame, kuidas mitmed rakendused ja prototüübid saavad neist keeltest kasu.Secure multi-party computation is a technology that allows several independent parties to cooperatively process their private data without revealing any secrets. If private inputs are given in encrypted form then the results will also be encrypted, and at no stage during processing are values ever decrypted. As a theoretical concept, the technology has been around since the 1980s, but the first practical implementations arose a bit more than a decade ago. Since then, secure multi-party computation has been used in practical applications, and has been established as an important method of data protection. Developing applications that use secure multi-party computation is challenging. The tools that help with development are still very young and the frameworks are often too slow for practical applications. Currently only experts in cryptography are able to develop secure multi-party applications. In this thesis we look how to improve secure multy-party computation frame- works and make the applications easier to develop. We claim that domain-specific programming languages enable to build secure multi-party applications and frame- works that are at the same time usable, efficient, maintainable, trustworthy, and practically scalable. The contribution of this thesis is the introduction of two new programming languages for secure multi-party computation. The SecreC 2 language makes secure multi-party computation application development easier, ensuring that the applications are secure and enabling them to be efficient. The second language is for developing low-level secure computation protocols. This language was created for improving secure multi-party computation frameworks. It makes the frameworks faster and more trustworthy, and protocols easier to develop and maintain. We give give both a formal and an informal overview of the two languages and see how they benefit multi-party applications and prototypes
    corecore