65,048 research outputs found
Supporting Regularized Logistic Regression Privately and Efficiently
As one of the most popular statistical and machine learning models, logistic
regression with regularization has found wide adoption in biomedicine, social
sciences, information technology, and so on. These domains often involve data
of human subjects that are contingent upon strict privacy regulations.
Increasing concerns over data privacy make it more and more difficult to
coordinate and conduct large-scale collaborative studies, which typically rely
on cross-institution data sharing and joint analysis. Our work here focuses on
safeguarding regularized logistic regression, a widely-used machine learning
model in various disciplines while at the same time has not been investigated
from a data security and privacy perspective. We consider a common use scenario
of multi-institution collaborative studies, such as in the form of research
consortia or networks as widely seen in genetics, epidemiology, social
sciences, etc. To make our privacy-enhancing solution practical, we demonstrate
a non-conventional and computationally efficient method leveraging distributing
computing and strong cryptography to provide comprehensive protection over
individual-level and summary data. Extensive empirical evaluation on several
studies validated the privacy guarantees, efficiency and scalability of our
proposal. We also discuss the practical implications of our solution for
large-scale studies and applications from various disciplines, including
genetic and biomedical studies, smart grid, network analysis, etc
The Quantum Frontier
The success of the abstract model of computation, in terms of bits, logical
operations, programming language constructs, and the like, makes it easy to
forget that computation is a physical process. Our cherished notions of
computation and information are grounded in classical mechanics, but the
physics underlying our world is quantum. In the early 80s researchers began to
ask how computation would change if we adopted a quantum mechanical, instead of
a classical mechanical, view of computation. Slowly, a new picture of
computation arose, one that gave rise to a variety of faster algorithms, novel
cryptographic mechanisms, and alternative methods of communication. Small
quantum information processing devices have been built, and efforts are
underway to build larger ones. Even apart from the existence of these devices,
the quantum view on information processing has provided significant insight
into the nature of computation and information, and a deeper understanding of
the physics of our universe and its connections with computation.
We start by describing aspects of quantum mechanics that are at the heart of
a quantum view of information processing. We give our own idiosyncratic view of
a number of these topics in the hopes of correcting common misconceptions and
highlighting aspects that are often overlooked. A number of the phenomena
described were initially viewed as oddities of quantum mechanics. It was
quantum information processing, first quantum cryptography and then, more
dramatically, quantum computing, that turned the tables and showed that these
oddities could be put to practical effect. It is these application we describe
next. We conclude with a section describing some of the many questions left for
future work, especially the mysteries surrounding where the power of quantum
information ultimately comes from.Comment: Invited book chapter for Computation for Humanity - Information
Technology to Advance Society to be published by CRC Press. Concepts
clarified and style made more uniform in version 2. Many thanks to the
referees for their suggestions for improvement
Controlled Data Sharing for Collaborative Predictive Blacklisting
Although sharing data across organizations is often advocated as a promising
way to enhance cybersecurity, collaborative initiatives are rarely put into
practice owing to confidentiality, trust, and liability challenges. In this
paper, we investigate whether collaborative threat mitigation can be realized
via a controlled data sharing approach, whereby organizations make informed
decisions as to whether or not, and how much, to share. Using appropriate
cryptographic tools, entities can estimate the benefits of collaboration and
agree on what to share in a privacy-preserving way, without having to disclose
their datasets. We focus on collaborative predictive blacklisting, i.e.,
forecasting attack sources based on one's logs and those contributed by other
organizations. We study the impact of different sharing strategies by
experimenting on a real-world dataset of two billion suspicious IP addresses
collected from Dshield over two months. We find that controlled data sharing
yields up to 105% accuracy improvement on average, while also reducing the
false positive rate.Comment: A preliminary version of this paper appears in DIMVA 2015. This is
the full version. arXiv admin note: substantial text overlap with
arXiv:1403.212
Privacy-Friendly Collaboration for Cyber Threat Mitigation
Sharing of security data across organizational boundaries has often been
advocated as a promising way to enhance cyber threat mitigation. However,
collaborative security faces a number of important challenges, including
privacy, trust, and liability concerns with the potential disclosure of
sensitive data. In this paper, we focus on data sharing for predictive
blacklisting, i.e., forecasting attack sources based on past attack
information. We propose a novel privacy-enhanced data sharing approach in which
organizations estimate collaboration benefits without disclosing their
datasets, organize into coalitions of allied organizations, and securely share
data within these coalitions. We study how different partner selection
strategies affect prediction accuracy by experimenting on a real-world dataset
of 2 billion IP addresses and observe up to a 105% prediction improvement.Comment: This paper has been withdrawn as it has been superseded by
arXiv:1502.0533
- …