
    Privacy in the Genomic Era

    Genome sequencing technology has advanced at a rapid pace, and it is now possible to generate highly detailed genotypes inexpensively. The collection and analysis of such data have the potential to support various applications, including personalized medical services. While the benefits of the genomics revolution are trumpeted by the biomedical community, the increased availability of such data has major implications for personal privacy, notably because the genome has certain essential features, which include (but are not limited to) (i) an association with traits and certain diseases, (ii) identification capability (e.g., forensics), and (iii) revelation of family relationships. Moreover, direct-to-consumer DNA testing increases the likelihood that genome data will be made available in less regulated environments, such as the Internet and for-profit companies. The problem of genome data privacy thus resides at the crossroads of computer science, medicine, and public policy. While computer scientists have addressed data privacy for various data types, less attention has been dedicated to genomic data. Thus, the goal of this paper is to provide a systematization of knowledge for the computer science community. In doing so, we address some of the (sometimes erroneous) beliefs of this field and report on a survey we conducted about genome data privacy with biomedical specialists. Then, after characterizing the genome privacy problem, we review the state of the art regarding privacy attacks on genomic data and strategies for mitigating such attacks, and we contextualize these attacks from the perspective of medicine and public policy. The paper concludes with an enumeration of the challenges for genome data privacy and presents a framework to systematize the analysis of threats and the design of countermeasures as the field moves forward.

    GenePING: secure, scalable management of personal genomic data

    BACKGROUND: Patient genomic data are rapidly becoming part of clinical decision making. Within a few years, full genome expression profiling and genotyping will be affordable enough to perform on every individual. The management of such sizeable, yet fine-grained, data in compliance with privacy laws and best practices presents significant security and scalability challenges. RESULTS: We present the design and implementation of GenePING, an extension to the PING personal health record system that supports secure storage of large, genome-sized datasets, as well as efficient sharing and retrieval of individual datapoints (e.g., SNPs, rare mutations, gene expression levels). Even with full access to the raw GenePING storage, an attacker cannot discover any stored genomic datapoint on any single patient. Given a large enough number of patient records, an attacker cannot discover which data correspond to which patient, or even the size of a given patient's record. The computational overhead of GenePING's security features is a small constant, making the system usable, even in emergency care, on today's hardware. CONCLUSION: GenePING is the first personal health record management system to support the efficient and secure storage and sharing of large genomic datasets. GenePING is available online at , licensed under the LGPL.
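
    The abstract does not spell out GenePING's storage mechanism, so the following is only a minimal sketch, under assumed design choices, of how security properties like these can be obtained: each datapoint is encrypted under a key derived from a patient-held secret, and storage keys are blinded with a keyed hash so that raw storage reveals neither values nor which records belong to which patient. The function and variable names (store_datapoint, master_secret, etc.) are illustrative assumptions, not the actual GenePING API.

        # Hypothetical sketch; not the actual GenePING implementation.
        # Each datapoint is encrypted under a key derived from a patient-held
        # secret, and the storage index is a keyed hash of the datapoint label,
        # so raw storage reveals neither values nor patient/record linkage.
        import hashlib
        import hmac
        import os

        from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

        store = {}  # stand-in for the server-side key/value storage


        def _derive_key(master_secret: bytes, label: str) -> bytes:
            """Per-datapoint encryption key derived from the patient-held secret."""
            return hmac.new(master_secret, b"key:" + label.encode(), hashlib.sha256).digest()


        def _blind_label(master_secret: bytes, label: str) -> str:
            """Keyed hash of the datapoint name, so storage keys are unlinkable."""
            return hmac.new(master_secret, b"index:" + label.encode(), hashlib.sha256).hexdigest()


        def store_datapoint(master_secret: bytes, label: str, value: bytes) -> None:
            key = _derive_key(master_secret, label)
            nonce = os.urandom(12)
            store[_blind_label(master_secret, label)] = nonce + AESGCM(key).encrypt(nonce, value, None)


        def fetch_datapoint(master_secret: bytes, label: str) -> bytes:
            blob = store[_blind_label(master_secret, label)]
            return AESGCM(_derive_key(master_secret, label)).decrypt(blob[:12], blob[12:], None)


        patient_secret = os.urandom(32)
        store_datapoint(patient_secret, "rs7412", b"CC")
        assert fetch_datapoint(patient_secret, "rs7412") == b"CC"

    In this sketch, retrieving a single datapoint costs one key derivation and one AEAD decryption regardless of record size, which is consistent with the constant-overhead property the abstract claims, though GenePING's actual mechanism may differ.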

    FinBook: literary content as digital commodity

    This short essay explains the significance of the FinBook intervention, and invites the reader to participate. We have associated each chapter within this book with a financial robot (FinBot), and created a market whereby book content will be traded with financial securities. As human labour increasingly consists of unstable and uncertain work practices, and as algorithms replace people on the virtual trading floors of the world's markets, we see members of society taking advantage of FinBots to invest and make extra funds. Bots of all kinds are making financial decisions for us, searching online on our behalf to help us invest and to consume products and services. Our contribution to this compilation is to turn the collection of chapters in this book into a dynamic investment portfolio, and thereby play out what might happen to the process of buying and consuming literature in the not-so-distant future. By attaching identities (through QR codes) to each chapter, we create a market in which the chapter can ‘perform’. Our FinBots will trade based on features extracted from the authors’ words in this book: the political, ethical and cultural values embedded in the work, and the extent to which the FinBots share authors’ concerns; and the performance of chapters amongst those human and non-human actors that make up the market and readership. In short, the FinBook model turns our work and the work of our co-authors into an investment portfolio, mediated by the market and the attention of readers. By creating a digital economy specifically around the content of online texts, our chapter and the FinBook platform aim to challenge the reader to consider how their personal values align them with individual articles, and how these become contested as they perform different value judgements about the financial performance of each chapter and the book as a whole. At the same time, by introducing ‘autonomous’ trading bots, we also explore the differing ‘network’ affordances of paper-based books, whose scarcity derives from their analogue form, and digital books, whose uniqueness is achieved through encryption. We thereby speak to wider questions about the conditions of an aggressive market in which algorithms subject cultural and intellectual items – books – to economic parameters, and the increasing ubiquity of data bots as actors in our social, political, economic and cultural lives. We understand that our marketization of literature may be an uncomfortable juxtaposition against the conventionally imagined way a book is created, enjoyed and shared: it is intended to be.

    Fine-Grained Provenance And Applications To Data Analytics Computation

    Data provenance tools seek to facilitate reproducible data science and auditable data analyses by capturing the analytics steps used in generating data analysis results. However, analysts must choose among workflow provenance systems, which allow arbitrary code but only track provenance at the granularity of files; provenance APIs, which provide tuple-level provenance but incur overhead in all computations; and database provenance tools, which track tuple-level provenance through relational operators and support optimization, but cover only a limited subset of data science tasks. None of these solutions is well suited for tracing errors introduced during common ETL, record alignment, and matching tasks over data types such as strings, images, etc. Additionally, we need a provenance archival layer to store and manage the tracked fine-grained provenance and enable future sophisticated reasoning about why individual output results appear or fail to appear. For reproducibility and auditing, the provenance archival system should be tamper-resistant. On the other hand, the provenance collected over time, or within the same query computation, tends to be partially repeated (i.e., the same operation with the same input records at an intermediate computation step), so we desire efficient provenance storage that compresses repeated results. We address these challenges with novel formalisms and algorithms, implemented in the PROVision system, for reconstructing fine-grained provenance for a broad class of ETL-style workflows. We extend database-style provenance techniques to capture equivalences, support optimizations, and enable lazy evaluation. We develop solutions for storing fine-grained provenance in relational storage systems while both compressing and protecting it via cryptographic hashes. We experimentally validate our proposed solutions using both scientific and OLAP workloads.
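
    The abstract says fine-grained provenance is stored in relational systems while being compressed (repeated sub-results collapse) and protected via cryptographic hashes. A minimal sketch of that general idea, under assumed names and schema rather than PROVision's actual design, is to content-address each provenance record by a hash of the operation and its inputs: identical steps deduplicate to a single row, and any tampering is detectable by re-hashing.

        # Hypothetical sketch; not PROVision's actual schema.
        # Content-addressing each provenance record by a hash of
        # (operation, input ids, output) deduplicates repeated steps
        # and makes tampering with stored records detectable.
        import hashlib
        import json
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE provenance (id TEXT PRIMARY KEY, record TEXT)")


        def record_step(operation: str, input_ids: list, output: str) -> str:
            """Store one provenance record; identical steps collapse to one row."""
            record = json.dumps({"op": operation, "inputs": sorted(input_ids), "out": output},
                                sort_keys=True)
            rec_id = hashlib.sha256(record.encode()).hexdigest()
            conn.execute("INSERT OR IGNORE INTO provenance VALUES (?, ?)", (rec_id, record))
            return rec_id


        def verify(rec_id: str) -> bool:
            """Tamper check: the stored record must still hash to its id."""
            (record,) = conn.execute(
                "SELECT record FROM provenance WHERE id = ?", (rec_id,)).fetchone()
            return hashlib.sha256(record.encode()).hexdigest() == rec_id


        extract_id = record_step("extract", [], "customers.csv")
        match_id = record_step("match", [extract_id], "aligned_rows")
        repeat_id = record_step("match", [extract_id], "aligned_rows")  # repeated step: no new row
        assert match_id == repeat_id and verify(match_id)

    Because each record's identifier folds in the identifiers of its inputs, this sketch also forms a Merkle-style lineage: verifying one output record implicitly checks the recorded history behind it. Again, this is an illustration of the stated storage goals, not the paper's actual algorithms.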

    Intelligent Computing for Big Data

    Recent advances in artificial intelligence have the potential to further develop current big data research. The Special Issue on ‘Intelligent Computing for Big Data’ highlighted a number of recent studies related to the use of intelligent computing techniques in the processing of big data for text mining, autism diagnosis, behaviour recognition, and blockchain-based storage.