UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning
Training deep neural networks often forces users to work in a distributed or
outsourced setting, accompanied by privacy concerns. Split learning aims to
address this concern by distributing the model among a client and a server. The
scheme supposedly provides privacy, since the server cannot see the clients'
models and inputs. We show that this is not true via two novel attacks. (1) We
show that an honest-but-curious split learning server, equipped only with the
knowledge of the client neural network architecture, can recover the input
samples and obtain a functionally similar model to the client model, without
being detected. (2) We show that if the client keeps hidden only the output
layer of the model to "protect" the private labels, the honest-but-curious
server can infer the labels with perfect accuracy. We test our attacks using
various benchmark datasets and against proposed privacy-enhancing extensions to
split learning. Our results show that plaintext split learning can pose serious
risks, ranging from data (input) privacy to intellectual property (model
parameters), and provide no more than a false sense of security.
Comment: Proceedings of the 21st Workshop on Privacy in the Electronic Society (WPES '22), November 7, 2022, Los Angeles, CA, US
SplitGuard: Detecting and Mitigating Training-Hijacking Attacks in Split Learning
Distributed deep learning frameworks, such as split learning, have recently been proposed to enable a group of participants to collaboratively train a deep neural network without sharing their raw data. Split learning in particular achieves this goal by dividing a neural network between a client and a server so that the client computes the initial set of layers, and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to steal the client's private data: the server can direct the client model towards learning a task of its choice. With a concrete example already proposed, such training-hijacking attacks present a significant risk for the data privacy of split learning clients.
In this paper, we propose SplitGuard, a method by which a split learning client can detect whether it is being targeted by a training-hijacking attack. We experimentally evaluate its effectiveness and discuss in detail various points related to its use. We conclude that SplitGuard can effectively detect training-hijacking attacks while minimizing the amount of information recovered by the adversaries.
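The client/server division described in the abstract above can be illustrated with a minimal sketch. It is not taken from the SplitGuard paper: the layer shapes, weights, and the function names `client_forward` and `server_forward` are assumptions chosen only to show that the server receives intermediate activations rather than the raw input.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per output unit


def linear(x: Vector, w: Matrix, b: Vector) -> Vector:
    """Dense layer: one dot product plus bias per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + bj for row, bj in zip(w, b)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


# Hypothetical client-side parameters (initial layers, kept by the client).
CLIENT_W: Matrix = [[1.0, 0.0], [0.0, -1.0]]
CLIENT_B: Vector = [0.0, 0.0]

# Hypothetical server-side parameters (remaining layers, kept by the server).
SERVER_W: Matrix = [[2.0, 3.0]]
SERVER_B: Vector = [0.5]


def client_forward(x: Vector) -> Vector:
    """Runs on the client; only this output is sent to the server."""
    return relu(linear(x, CLIENT_W, CLIENT_B))


def server_forward(activations: Vector) -> Vector:
    """Runs on the server; it never sees the raw input x."""
    return linear(activations, SERVER_W, SERVER_B)


private_input = [1.0, 2.0]
smashed = client_forward(private_input)   # what crosses the network
prediction = server_forward(smashed)
```

During training the server would also send gradients for `smashed` back to the client; that return channel is what a training-hijacking server manipulates, according to the abstract.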
De novo ChIP-seq analysis
Methods for the analysis of chromatin immunoprecipitation sequencing (ChIP-seq) data start by aligning the short reads to a reference genome. While often successful, they are not appropriate for cases where a reference genome is not available. Here we develop methods for de novo analysis of ChIP-seq data. Our methods combine de novo assembly with statistical tests, enabling motif discovery without the use of a reference genome. We validate the performance of our method using human and mouse data. Analysis of fly data indicates that our method outperforms alignment-based methods that utilize closely related species.
PathCase-SB architecture and database design
Background: Integration of metabolic pathways resources and regulatory metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation in metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulatory metabolic networks and existing models, and (b) building tools to help with modeling and analysis are desirable and intellectually challenging computational tasks.
Description: PathCase Systems Biology (PathCase-SB) is built and released. The PathCase-SB database provides data and API for multiple user interfaces and software tools. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate data of selected biological data sources on the web (currently, BioModels database and KEGG), and to provide more powerful and/or new capabilities via the new web-based integrative framework. This paper describes architecture and database design issues encountered in PathCase-SB's design and implementation, and presents the current design of PathCase-SB's architecture and database.
Conclusions: PathCase-SB architecture and database provide a highly extensible and scalable environment with easy and fast (real-time) access to the data in the database. PathCase-SB itself is already being used by researchers across the world.
AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data.
Detecting multiplets in single nucleus (sn)ATAC-seq data is challenging due to data sparsity and limited dynamic range. AMULET (ATAC-seq MULtiplet Estimation Tool) enumerates regions with greater than two uniquely aligned reads across the genome to effectively detect multiplets. We evaluate the method by generating snATAC-seq data from human blood and pancreatic islet samples. Compared to alternatives, AMULET has high precision, estimated via donor-based multiplexing, and high recall, estimated via simulated multiplets, and it identifies multiplets most effectively when a read depth of 25K median valid reads per nucleus is achieved.
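The counting idea in the abstract above (regions covered by more than two uniquely aligned reads) can be sketched as a sweep over read intervals. This is an illustration only, not the AMULET implementation: the function name `count_regions_over_two`, the half-open `(start, end)` representation, and the single-chromosome, single-barcode input are all assumptions, and the abstract does not state how the count is turned into a multiplet call.

```python
from typing import List, Tuple


def count_regions_over_two(reads: List[Tuple[int, int]]) -> int:
    """Count maximal regions where more than two reads overlap.

    `reads` holds half-open (start, end) intervals for one cell barcode
    on one chromosome (an assumed input format).
    """
    events = []
    for start, end in reads:
        events.append((start, 1))   # a read begins: depth goes up
        events.append((end, -1))    # a read ends: depth goes down

    # At equal positions, process ends (-1) before starts (+1) so that
    # reads that merely touch are not counted as overlapping.
    events.sort()

    depth = 0
    regions = 0
    in_region = False
    for _, delta in events:
        depth += delta
        if depth > 2 and not in_region:
            regions += 1
            in_region = True
        elif depth <= 2 and in_region:
            in_region = False
    return regions


example_reads = [(0, 10), (5, 15), (8, 20), (100, 110), (101, 111), (102, 112)]
n_regions = count_regions_over_two(example_reads)
```

The intuition is that a single diploid nucleus should rarely yield more than two overlapping reads at one locus, so a barcode with many such regions is a candidate multiplet.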
Potpourri: an epistasis test prioritization algorithm via diverse SNP selection
Genome-wide association studies (GWAS) explain a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps to close this gap. Unfortunately, the sheer number of loci combinations and hypotheses to test makes this prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely epistatic single nucleotide polymorphism (SNP) pairs to limit the number of tests. However, they still suffer from very low precision. It was shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we propose that an algorithm that pairs SNPs from such diverse regions and ranks them can improve prediction power. We propose an epistasis test prioritization algorithm that optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. The SNP pairs from these regions are then further ranked w.r.t. their co-coverage of the case cohort. We compare our algorithm with the state of the art on three GWAS and show that we (1) substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (2) decrease the number of tests by 25-fold, and (3) decrease the runtime by 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the performance (up to 0.8). Potpourri is available at http://ciceklab.cs.bilkent.edu.tr/potpourri
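The abstract above does not give the submodular objective Potpourri optimizes. As a generic illustration of the technique, the sketch below runs the standard greedy algorithm on a coverage objective (the number of case samples covered by the chosen regions), which is a textbook monotone submodular function. The function name `greedy_max_coverage`, the region labels, and the objective itself are assumptions, not the paper's method.

```python
from typing import Dict, List, Set, Tuple


def greedy_max_coverage(
    region_to_cases: Dict[str, Set[int]], k: int
) -> Tuple[List[str], Set[int]]:
    """Greedily pick up to k regions maximizing the number of covered cases.

    Each step adds the region with the largest marginal gain, i.e. the most
    case samples not yet covered. Ties are broken by region name.
    """
    candidates = dict(region_to_cases)
    selected: List[str] = []
    covered: Set[int] = set()

    for _ in range(k):
        best, best_gain = None, 0
        for region in sorted(candidates):
            gain = len(candidates[region] - covered)  # marginal gain
            if gain > best_gain:
                best, best_gain = region, gain
        if best is None:  # no remaining region adds anything new
            break
        selected.append(best)
        covered |= candidates.pop(best)

    return selected, covered


# Hypothetical regions mapped to the case-sample IDs they cover.
regions = {"r1": {1, 2, 3}, "r2": {3, 4}, "r3": {1, 2}}
chosen, covered_cases = greedy_max_coverage(regions, k=2)
```

Here `r3` is skipped because it adds nothing beyond `r1`, which mirrors the abstract's emphasis on selecting complementary rather than redundant regions.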
A tool for detecting complementary single nucleotide polymorphism pairs in genome-wide association studies for epistasis testing
Detecting interacting loci pairs has been instrumental in understanding disease etiology when single locus associations do not fully account for the underlying heritability. However, the number of loci to test is prohibitively large. Epistasis test prioritization algorithms rank likely epistatic single nucleotide polymorphism (SNP) pairs to limit the number of statistical tests. Potpourri detects epistatic SNP pairs by diversifying the selected SNPs' genomic regions and investigating their co-occurrence patterns over the case cohort. It can also take as input and further prioritize SNPs in regulatory or coding regions. The program identifies and returns a list of prioritized SNP pairs for epistasis testing. This article describes how to use the program and the details of the input and output data.