56 research outputs found

UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning

Author: Cicek A. Ercument
Erdogan Ege
Kupcu Alptekin
Publication venue: Association for Computing Machinery (ACM)
Publication date: 23/08/2021

Training deep neural networks often forces users to work in a distributed or outsourced setting, accompanied by privacy concerns. Split learning aims to address this concern by distributing the model among a client and a server. The scheme supposedly provides privacy, since the server cannot see the clients' models and inputs. We show that this is not true via two novel attacks. (1) We show that an honest-but-curious split learning server, equipped only with the knowledge of the client neural network architecture, can recover the input samples and obtain a functionally similar model to the client model, without being detected. (2) We show that if the client keeps hidden only the output layer of the model to "protect" the private labels, the honest-but-curious server can infer the labels with perfect accuracy. We test our attacks using various benchmark datasets and against proposed privacy-enhancing extensions to split learning. Our results show that plaintext split learning can pose serious risks, ranging from data (input) privacy to intellectual property (model parameters), and provide no more than a false sense of security.

Comment: Proceedings of the 21st Workshop on Privacy in the Electronic Society (WPES '22), November 7, 2022, Los Angeles, CA, US

arXiv.org e-Print Archive

Cryptology ePrint Archive

SplitGuard: Detecting and Mitigating Training-Hijacking Attacks in Split Learning

Author: A. Ercument Cicek
Alptekin Kupcu
Ege Erdogan
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 23/08/2021

Distributed deep learning frameworks, such as split learning, have recently been proposed to enable a group of participants to collaboratively train a deep neural network without sharing their raw data. Split learning in particular achieves this goal by dividing a neural network between a client and a server so that the client computes the initial set of layers, and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to steal the client's private data: the server can direct the client model towards learning a task of its choice. With a concrete example already proposed, such training-hijacking attacks present a significant risk for the data privacy of split learning clients. In this paper, we propose SplitGuard, a method by which a split learning client can detect whether it is being targeted by a training-hijacking attack or not. We experimentally evaluate its effectiveness, and discuss in detail various points related to its use. We conclude that SplitGuard can effectively detect training-hijacking attacks while minimizing the amount of information recovered by the adversaries.
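
The client/server division described in this abstract can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the authors' code: the layer sizes, the weights, and the names `dense`, `client_forward`, and `server_forward` are all hypothetical. It shows that the server receives intermediate activations rather than the raw input; it does not implement SplitGuard or any attack.

```python
import math


def dense(inputs, weights, biases):
    """Apply one fully connected layer followed by a tanh activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical toy parameters chosen for illustration (not from the paper).
CLIENT_LAYER = ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1])  # 2 inputs -> 2 units
SERVER_LAYER = ([[1.0, -1.0]], [0.0])                   # 2 inputs -> 1 unit


def client_forward(raw_input):
    """Client side: runs the initial layers on its private data."""
    return dense(raw_input, *CLIENT_LAYER)


def server_forward(activations):
    """Server side: runs the remaining layers on the client's activations."""
    return dense(activations, *SERVER_LAYER)


def full_forward(raw_input):
    """End-to-end prediction: only the activations cross the boundary."""
    return server_forward(client_forward(raw_input))
```

In a real deployment the two halves would run on different machines and exchange activations and gradients over a network; here they are plain functions so the data flow is easy to follow.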

arXiv.org e-Print Archive

Cryptology ePrint Archive

De novo ChIP-seq analysis

Author: Bar-Joseph Ziv
Cicek A. Ercument
He Xin
Le Hai-Son
Schulz Marcel H.
Wang Yuhao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Methods for the analysis of chromatin immunoprecipitation sequencing (ChIP-seq) data start by aligning the short reads to a reference genome. While often successful, they are not appropriate for cases where a reference genome is not available. Here we develop methods for de novo analysis of ChIP-seq data. Our methods combine de novo assembly with statistical tests enabling motif discovery without the use of a reference genome. We validate the performance of our method using human and mouse data. Analysis of fly data indicates that our method outperforms alignment-based methods that utilize closely related species.

DSpace@MIT

Crossref

Bilkent University Institutional Repository

Springer - Publisher Connector

PubMed Central

MPG.PuRe

PathCase-SB architecture and database design

Author: Cakmak Ali
Cheng En
Cicek A Ercument
Coskun Sarp A
Das Mitali
Lai Nicola
Ozsoyoglu Gultekin
Ozsoyoglu Z Meral
Qi Xinjian
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Integration of metabolic pathways resources and regulatory metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation in metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulatory metabolic networks and existing models, and (b) building tools to help with modeling and analysis are desirable and intellectually challenging computational tasks.

Description: PathCase Systems Biology (PathCase-SB) is built and released. The PathCase-SB database provides data and API for multiple user interfaces and software tools. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate data of selected biological data sources on the web (currently, BioModels database and KEGG), and to provide more powerful and/or new capabilities via the new web-based integrative framework. This paper describes architecture and database design issues encountered in PathCase-SB's design and implementation, and presents the current design of PathCase-SB's architecture and database.

Conclusions: PathCase-SB architecture and database provide a highly extensible and scalable environment with easy and fast (real-time) access to the data in the database. PathCase-SB itself is already being used by researchers across the world.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Cagliari

AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data.

Author: Banchereau Jacques
Cicek A Ercument
Conrad Daniel N
Eroglu Alper
Gartner Zev J
Kuchel George A
Kursawe Romy
Lawlor Nathan
Marches Radu
McGinnis Christopher S
Nehar-Belaid Djamel
Stitzel Michael L
Thibodeau Asa
Ucar Duygu
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/09/2021

Detecting multiplets in single nucleus (sn)ATAC-seq data is challenging due to data sparsity and limited dynamic range. AMULET (ATAC-seq MULtiplet Estimation Tool) enumerates regions with greater than two uniquely aligned reads across the genome to effectively detect multiplets. We evaluate the method by generating snATAC-seq data in human blood and pancreatic islet samples. AMULET has high precision, estimated via donor-based multiplexing, and high recall, estimated via simulated multiplets, compared to alternatives and identifies multiplets most effectively when a certain read depth of 25K median valid reads per nucleus is achieved.
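
The counting step mentioned in this abstract (regions with more than two uniquely aligned reads) can be sketched as a sweep over read intervals. This is an illustrative sketch only, not the AMULET implementation: the function name `count_high_coverage_regions`, the half-open `(start, end)` interval format, and the single-chromosome input are assumptions. The abstract does not say how the count is turned into a multiplet call, so that step is left out.

```python
from collections import defaultdict


def count_high_coverage_regions(reads, max_expected=2):
    """Count maximal regions covered by more than `max_expected` reads.

    `reads` is a list of half-open (start, end) intervals for one nucleus
    on one chromosome (a hypothetical input format).
    """
    # Net change in read depth at each coordinate.
    delta_at = defaultdict(int)
    for start, end in reads:
        delta_at[start] += 1
        delta_at[end] -= 1

    depth = 0
    regions = 0
    in_region = False
    for pos in sorted(delta_at):
        depth += delta_at[pos]
        if depth > max_expected:
            if not in_region:
                # Entering a new stretch with depth above the threshold.
                regions += 1
                in_region = True
        else:
            in_region = False
    return regions
```

For example, `count_high_coverage_regions([(0, 10), (5, 15), (8, 12)])` finds a single region, positions 8 to 10, where three reads overlap.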

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

Potpourri: an epistasis test prioritization algorithm via diverse SNP selection

Author: Caylak Gizem
Cicek A. Ercument
Taştan Öznur
Publication venue: Mary Ann Liebert Inc
Publication date: 01/04/2021

Genome-wide association studies (GWAS) explain a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps to close this gap. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the process prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely epistatic single nucleotide polymorphism (SNP) pairs to limit the number of tests. However, they still suffer from very low precision. It was shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we propose that an algorithm that pairs SNPs from such diverse regions and ranks them can improve prediction power. We propose an epistasis test prioritization algorithm that optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. The SNP pairs from these regions are then further ranked w.r.t. their co-coverage of the case cohort. We compare our algorithm with the state of the art on three GWAS and show that (1) we substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (2) decrease the number of tests by 25-fold, and (3) decrease the runtime by 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the performance (up to 0.8). Potpourri is available at http://ciceklab.cs.bilkent.edu.tr/potpourri
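
The idea of optimizing a submodular set function to pick a diverse, complementary set can be illustrated with the standard greedy heuristic on a coverage objective. This is a sketch under loud assumptions and not Potpourri's actual objective or code: set coverage is swapped in as a stand-in submodular function, and the name `greedy_select` and the mapping from region names to sets of covered case-sample IDs are hypothetical.

```python
def greedy_select(candidates, k):
    """Greedily pick up to `k` regions by marginal coverage gain.

    `candidates` maps a region name to the set of items it covers
    (a hypothetical stand-in for the paper's objective).
    Returns the selected names in pick order and the covered items.
    """
    selected = []
    covered = set()
    remaining = dict(candidates)
    for _ in range(k):
        # Region that adds the most items not yet covered.
        best = max(
            remaining,
            key=lambda name: len(remaining[name] - covered),
            default=None,
        )
        if best is None or not (remaining[best] - covered):
            break  # nothing left, or no candidate adds new coverage
        selected.append(best)
        covered |= remaining.pop(best)
    return selected, covered
```

With `{"r1": {1, 2, 3}, "r2": {3, 4, 6}, "r3": {1, 2}, "r4": {5}}` and `k=2`, the sketch picks `r1` and then `r2`: `r3` is skipped because it adds nothing beyond `r1`, which is the diversity effect the abstract describes.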

Sabanci University Research Database

A tool for detecting complementary single nucleotide polymorphism pairs in genome-wide association studies for epistasis testing

Author: Caylak Gizem
Cicek A. Ercument
Taştan Öznur
Publication venue: Mary Ann Liebert Inc
Publication date: 01/04/2021

Detecting interacting loci pairs has been instrumental in understanding disease etiology when single locus associations do not fully account for the underlying heritability. However, the number of loci to test is prohibitively large. Epistasis test prioritization algorithms rank likely epistatic single nucleotide polymorphism (SNP) pairs to limit the number of statistical tests. Potpourri detects epistatic SNP pairs by diversifying the selected SNPs' genomic regions and investigating their co-occurrence patterns over the case cohort. It can also input and further prioritize SNPs in regulatory or coding regions. The program identifies and returns a list of prioritized SNP pairs for epistasis testing. This article describes how to use the program and the details of the input and output data.

Sabanci University Research Database