220 research outputs found

    Approach for GDPR Compliant Detection of COVID-19 Infection Chains

    Full text link
    While prospect of tracking mobile devices' users is widely discussed all over European countries to counteract COVID-19 propagation, we propose a Bloom filter based construction providing users' location privacy and preventing mass surveillance. We apply a solution based on Bloom filters data structure that allows a third party, a government agency, to perform some privacy-preserving set relations on a mobile telco's access logfile. By computing set relations, the government agency, given the knowledge of two identified persons, has an instrument that provides a (possible) infection chain from the initial to the final infected user no matter at which location on a worldwide scale they are. The benefit of our approach is that intermediate possible infected users can be identified and subsequently contacted by the agency. With such approach, we state that solely identities of possible infected users will be revealed and location privacy of others will be preserved. To this extent, it meets General Data Protection Regulation (GDPR)requirements in this area

    Yksityisyyden turvaavia protokollia verkkoliikenteen suojaamiseen

    Get PDF
    Digital technologies have become an essential part of our lives. In many parts of the world, activities such as socializing, providing health care, leisure and education are entirely or partially relying on the internet. Moreover, the COVID-19 world pandemic has also contributed significantly to our dependency on the on-line world. While the advancement of the internet brings many advantages, there are also disadvantages such as potential loss of privacy and security. While the users enjoy surfing on the web, service providers may collect a variety of information about their users, such as the users’ location, gender, and religion. Moreover, the attackers may try to violate the users’ security, for example, by infecting the users’ devices with malware. In this PhD dissertation, to provide means to protect networking we propose several privacy-preserving protocols. Our protocols empower internet users to get a variety of services, while at the same time ensuring users’ privacy and security in the digital world. In other words, we design our protocols such that the users only share the amount of information with the service providers that is absolutely necessary to gain the service. Moreover, our protocols only add minimal additional time and communication costs, while leveraging cryptographic schemes to ensure users’ privacy and security. The dissertation contains two main themes of protocols: privacy-preserving set operations and privacy-preserving graph queries. These protocols can be applied to a variety of application areas. We delve deeper into three application areas: privacy-preserving technologies for malware protection, protection of remote access, and protecting minors.Digitaaliteknologiasta on tullut oleellinen osa ihmisten elämää. Monissa osissa maailmaa sellaiset toiminnot kuten terveydenhuolto, vapaa-ajan vietto ja opetus ovat osittain tai kokonaan riippuvaisia internetistä. Lisäksi COVID-19 -pandemia on lisännyt ihmisten riippuvuutta tietoverkoista. Vaikkakin internetin kehittyminen on tuonut paljon hyvää, se on tuonut mukanaan myös haasteita yksityisyydelle ja tietoturvalle. Käyttäjien selatessa verkkoa palveluntarjoajat voivat kerätä käyttäjästä monenlaista tietoa, kuten esimerkiksi käyttäjän sijainnin, sukupuolen ja uskonnon. Lisäksi hyökkääjät voivat yrittää murtaa käyttäjän tietoturvan esimerkiksi asentamalla hänen koneelleen haittaohjelmia. Tässä väitöskirjassa esitellään useita turvallisuutta suojaavia protokollia tietoverkossa tapahtuvan toiminnan turvaamiseen. Nämä protokollat mahdollistavat internetin käytön monilla tavoilla samalla kun ne turvaavat käyttäjän yksityisyyden ja tietoturvan digitaalisessa maailmassa. Toisin sanoen nämä protokollat on suunniteltu siten, että käyttäjät jakavat palveluntarjoajille vain sen tiedon, joka on ehdottoman välttämätöntä palvelun tuottamiseksi. Protokollat käyttävät kryptografisia menetelmiä käyttäjän yksityisyyden sekä tietoturvan varmistamiseksi, ja ne hidastavat kommunikaatiota mahdollisimman vähän. Tämän väitöskirjan sisältämät protokollat voidaan jakaa kahteen eri teemaan: protokollat yksityisyyden suojaaville joukko-operaatioille ja protokollat yksityisyyden suojaaville graafihauille. Näitä protokollia voidaan soveltaa useilla aloilla. Näistä aloista väitöskirjassa käsitellään tarkemmin haittaohjelmilta suojautumista, etäyhteyksien suojaamista ja alaikäisten suojelemista

    Differential Privacy in Distributed Settings

    Get PDF

    Secure and Efficient Comparisons between Untrusted Parties

    Get PDF
    A vast number of online services is based on users contributing their personal information. Examples are manifold, including social networks, electronic commerce, sharing websites, lodging platforms, and genealogy. In all cases user privacy depends on a collective trust upon all involved intermediaries, like service providers, operators, administrators or even help desk staff. A single adversarial party in the whole chain of trust voids user privacy. Even more, the number of intermediaries is ever growing. Thus, user privacy must be preserved at every time and stage, independent of the intrinsic goals any involved party. Furthermore, next to these new services, traditional offline analytic systems are replaced by online services run in large data centers. Centralized processing of electronic medical records, genomic data or other health-related information is anticipated due to advances in medical research, better analytic results based on large amounts of medical information and lowered costs. In these scenarios privacy is of utmost concern due to the large amount of personal information contained within the centralized data. We focus on the challenge of privacy-preserving processing on genomic data, specifically comparing genomic sequences. The problem that arises is how to efficiently compare private sequences of two parties while preserving confidentiality of the compared data. It follows that the privacy of the data owner must be preserved, which means that as little information as possible must be leaked to any party participating in the comparison. Leakage can happen at several points during a comparison. The secured inputs for the comparing party might leak some information about the original input, or the output might leak information about the inputs. In the latter case, results of several comparisons can be combined to infer information about the confidential input of the party under observation. Genomic sequences serve as a use-case, but the proposed solutions are more general and can be applied to the generic field of privacy-preserving comparison of sequences. The solution should be efficient such that performing a comparison yields runtimes linear in the length of the input sequences and thus producing acceptable costs for a typical use-case. To tackle the problem of efficient, privacy-preserving sequence comparisons, we propose a framework consisting of three main parts. a) The basic protocol presents an efficient sequence comparison algorithm, which transforms a sequence into a set representation, allowing to approximate distance measures over input sequences using distance measures over sets. The sets are then represented by an efficient data structure - the Bloom filter -, which allows evaluation of certain set operations without storing the actual elements of the possibly large set. This representation yields low distortion for comparing similar sequences. Operations upon the set representation are carried out using efficient, partially homomorphic cryptographic systems for data confidentiality of the inputs. The output can be adjusted to either return the actual approximated distance or the result of an in-range check of the approximated distance. b) Building upon this efficient basic protocol we introduce the first mechanism to reduce the success of inference attacks by detecting and rejecting similar queries in a privacy-preserving way. This is achieved by generating generalized commitments for inputs. This generalization is done by treating inputs as messages received from a noise channel, upon which error-correction from coding theory is applied. This way similar inputs are defined as inputs having a hamming distance of their generalized inputs below a certain predefined threshold. We present a protocol to perform a zero-knowledge proof to assess if the generalized input is indeed a generalization of the actual input. Furthermore, we generalize a very efficient inference attack on privacy-preserving sequence comparison protocols and use it to evaluate our inference-control mechanism. c) The third part of the framework lightens the computational load of the client taking part in the comparison protocol by presenting a compression mechanism for partially homomorphic cryptographic schemes. It reduces the transmission and storage overhead induced by the semantically secure homomorphic encryption schemes, as well as encryption latency. The compression is achieved by constructing an asymmetric stream cipher such that the generated ciphertext can be converted into a ciphertext of an associated homomorphic encryption scheme without revealing any information about the plaintext. This is the first compression scheme available for partially homomorphic encryption schemes. Compression of ciphertexts of fully homomorphic encryption schemes are several orders of magnitude slower at the conversion from the transmission ciphertext to the homomorphically encrypted ciphertext. Indeed our compression scheme achieves optimal conversion performance. It further allows to generate keystreams offline and thus supports offloading to trusted devices. This way transmission-, storage- and power-efficiency is improved. We give security proofs for all relevant parts of the proposed protocols and algorithms to evaluate their security. A performance evaluation of the core components demonstrates the practicability of our proposed solutions including a theoretical analysis and practical experiments to show the accuracy as well as efficiency of approximations and probabilistic algorithms. Several variations and configurations to detect similar inputs are studied during an in-depth discussion of the inference-control mechanism. A human mitochondrial genome database is used for the practical evaluation to compare genomic sequences and detect similar inputs as described by the use-case. In summary we show that it is indeed possible to construct an efficient and privacy-preserving (genomic) sequences comparison, while being able to control the amount of information that leaves the comparison. To the best of our knowledge we also contribute to the field by proposing the first efficient privacy-preserving inference detection and control mechanism, as well as the first ciphertext compression system for partially homomorphic cryptographic systems

    Privacy-Preserving Ad-Hoc Equi-Join on Outsourced Data

    Get PDF
    In IT outsourcing, a user may delegate the data storage and query processing functions to a third-party server that is not completely trusted. This gives rise to the need to safeguard the privacy of the database as well as the user queries over it. In this article, we address the problem of running ad hoc equi-join queries directly on encrypted data in such a setting. Our contribution is the first solution that achieves constant complexity per pair of records that are evaluated for the join. After formalizing the privacy requirements pertaining to the database and user queries, we introduce a cryptographic construct for securely joining records across relations. The construct protects the database with a strong encryption scheme. Moreover, information disclosure after executing an equi-join is kept to the minimum—that two input records combine to form an output record if and only if they share common join attribute values. There is no disclosure on records that are not part of the join result. Building on this construct, we then present join algorithms that optimize the join execution by eliminating the need to match every record pair from the input relations. We provide a detailed analysis of the cost of the algorithms and confirm the analysis through extensive experiments with both synthetic and benchmark workloads. Through this evaluation, we tease out useful insights on how to configure the join algorithms to deliver acceptable execution time in practice.</jats:p

    A New Approach for Homomorphic Encryption with Secure Function Evaluation on Genomic Data

    Get PDF
    Additively homomorphic encryption is a public-key primitive allowing a sum to be computed on encrypted values. Although limited in functionality, additive schemes have been an essential tool in the private function evaluation toolbox for decades. They are typically faster and more straightforward to implement relative to their fully homomorphic counterparts, and more efficient than garbled circuits in certain applications. This thesis presents a novel method for extending the functionality of additively homomorphic encryption to allow the private evaluation of functions of restricted domain. Provided the encrypted sum falls within the restricted domain, the function can be homomorphically evaluated “for free” in a single public-key operation. We will describe an algorithm for encoding private functions into the public-keys of two well-known additive cryptosystems. We extend this scheme to an application in the field of pharmacogenomics called Similar Patient Query. With the advent of human genome project, there is a tremendous availability of genomic data opening the door for a possibility of many advances in the field of medicine. Precision medicine is one such application where a patient is administered drugs based on their genetic makeup. If the genomic data is not kept private, it can lead to several information frauds, so it needs to be encrypted. To tap the full potential of the encrypted genomic data, we need to perform computations on it without compromising its security. For SPQ, we pick a query genome and compare it across a hospital data base, to find patients similar to that of the query and use the information to apply precision medicine, all of this is carried out under privacy preserving settings in the presence of a semi-honest adversary in a single transaction

    Approximate Data Analytics Systems

    Get PDF
    Today, most modern online services make use of big data analytics systems to extract useful information from the raw digital data. The data normally arrives as a continuous data stream at a high speed and in huge volumes. The cost of handling this massive data can be significant. Providing interactive latency in processing the data is often impractical due to the fact that the data is growing exponentially and even faster than Moore’s law predictions. To overcome this problem, approximate computing has recently emerged as a promising solution. Approximate computing is based on the observation that many modern applications are amenable to an approximate, rather than the exact output. Unlike traditional computing, approximate computing tolerates lower accuracy to achieve lower latency by computing over a partial subset instead of the entire input data. Unfortunately, the advancements in approximate computing are primarily geared towards batch analytics and cannot provide low-latency guarantees in the context of stream processing, where new data continuously arrives as an unbounded stream. In this thesis, we design and implement approximate computing techniques for processing and interacting with high-speed and large-scale stream data to achieve low latency and efficient utilization of resources. To achieve these goals, we have designed and built the following approximate data analytics systems: • StreamApprox—a data stream analytics system for approximate computing. This system supports approximate computing for low-latency stream analytics in a transparent way and has an ability to adapt to rapid fluctuations of input data streams. In this system, we designed an online adaptive stratified reservoir sampling algorithm to produce approximate output with bounded error. • IncApprox—a data analytics system for incremental approximate computing. This system adopts approximate and incremental computing in stream processing to achieve high-throughput and low-latency with efficient resource utilization. In this system, we designed an online stratified sampling algorithm that uses self-adjusting computation to produce an incrementally updated approximate output with bounded error. • PrivApprox—a data stream analytics system for privacy-preserving and approximate computing. This system supports high utility and low-latency data analytics and preserves user’s privacy at the same time. The system is based on the combination of privacy-preserving data analytics and approximate computing. • ApproxJoin—an approximate distributed joins system. This system improves the performance of joins — critical but expensive operations in big data systems. In this system, we employed a sketching technique (Bloom filter) to avoid shuffling non-joinable data items through the network as well as proposed a novel sampling mechanism that executes during the join to obtain an unbiased representative sample of the join output. Our evaluation based on micro-benchmarks and real world case studies shows that these systems can achieve significant performance speedup compared to state-of-the-art systems by tolerating negligible accuracy loss of the analytics output. In addition, our systems allow users to systematically make a trade-off between accuracy and throughput/latency and require no/minor modifications to the existing applications
    • …
    corecore