Homomorphic Autocomplete
With the rapid progress in fully homomorphic encryption (FHE) and somewhat homomorphic encryption (SHE) schemes, we are witnessing renewed efforts to revisit privacy-preserving protocols. Several works have already appeared in the literature that provide solutions to these problems by employing FHE or SHE techniques. These applications range from cloud computing to computation over confidential patient data to machine learning problems such as classifying privatized data. One application where privacy is a major concern is web search, a task carried out daily by billions of users around the world. In this work, we focus on a more surmountable yet essential version of the search problem: autocomplete. Using an SHE scheme, we propose concrete solutions to the homomorphic autocomplete problem. To investigate real-life viability, we tackle a number of obstacles on the way to a practical implementation, such as communication and computational efficiency.
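The cleartext functionality being outsourced here, ranked prefix completion against a dictionary, can be sketched as follows. The dictionary, weights, and function names are illustrative assumptions, not taken from the paper, and the actual homomorphic evaluation over SHE ciphertexts is elided entirely.

```python
# Toy cleartext autocomplete: the functionality that the paper evaluates
# under SHE. Dictionary contents and weights are illustrative assumptions.

def autocomplete(prefix, dictionary, k=3):
    """Return the top-k dictionary terms starting with `prefix`,
    ranked by a popularity weight (higher is better)."""
    matches = [(w, term) for term, w in dictionary.items()
               if term.startswith(prefix)]
    matches.sort(reverse=True)           # highest weight first
    return [term for _, term in matches[:k]]

# A tiny weighted dictionary (term -> popularity weight).
dictionary = {
    "homomorphic": 10,
    "homomorphism": 7,
    "homogeneous": 5,
    "homepage": 12,
}

print(autocomplete("homo", dictionary))
# -> ['homomorphic', 'homomorphism', 'homogeneous']
```

The homomorphic versions in the paper must perform the prefix comparison and selection obliviously over encrypted queries, which is where the communication and computation trade-offs arise.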
From task structures to world models: What do LLMs know?
In what sense does a large language model have knowledge? The answer to this
question extends beyond the capabilities of a particular AI system, and
challenges our assumptions about the nature of knowledge and intelligence. We
answer by granting LLMs "instrumental knowledge": knowledge defined by a
certain set of abilities. We then ask how such knowledge is related to the more
ordinary, "worldly" knowledge exhibited by human agents, and explore this in
terms of the degree to which instrumental knowledge can be said to incorporate
the structured world models of cognitive science. We discuss ways LLMs could
recover degrees of worldly knowledge, and suggest such recovery will be
governed by an implicit, resource-rational tradeoff between world models and
task demands.
Flattening NTRU for Evaluation Key Free Homomorphic Encryption
We propose a new FHE scheme, F-NTRU, that adopts the flattening technique proposed in GSW to derive an NTRU-based scheme that (like GSW) does not require evaluation keys or key switching. Our scheme eliminates the decisional small polynomial ratio (DSPR) assumption and relies only on the standard R-LWE assumption. It uses wide key distributions and hence is immune to the subfield lattice attack. In practice, our scheme achieves competitive timings compared to existing schemes: a homomorphic multiplication takes on the order of milliseconds, without amortization, at the evaluated depth levels. Furthermore, our scheme features small ciphertexts, on the order of kilobytes, and eliminates the need for storing and managing costly evaluation keys.
In addition, we present a slightly modified version of F-NTRU that is capable of supporting integer operations with a very large message space, along with noise analysis for all cases. The assurance gained by using wide key distributions, together with the message-space flexibility of the scheme (bits, binary polynomials, and integers with a large message space), allows the use of the proposed scheme in a wide array of applications.
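The GSW-style flattening the scheme borrows rests on bit decomposition against a powers-of-two gadget: replacing a large value by its 0/1 bit vector keeps homomorphic products small without evaluation keys. A minimal integer sketch (parameters illustrative; the actual scheme applies this to rows of ciphertext matrices over a polynomial ring):

```python
# Minimal sketch of the "flattening" primitive: bit decomposition against a
# powers-of-two gadget. Toy integer parameters; F-NTRU applies the analogous
# operation to polynomial-matrix ciphertexts.

def bit_decomp(x, ell):
    """Little-endian bit decomposition of 0 <= x < 2**ell."""
    return [(x >> i) & 1 for i in range(ell)]

def gadget(ell):
    """Powers-of-two gadget vector (1, 2, 4, ..., 2**(ell-1))."""
    return [1 << i for i in range(ell)]

q = 257                    # toy modulus
ell = q.bit_length()       # bits needed to represent values mod q

x = 200
bits = bit_decomp(x, ell)

# Recomposition identity: <BitDecomp(x), g> = x. Because the decomposed
# vector is 0/1-valued, its inner product with any vector v is bounded by
# ell * max|v_i| -- this bounded growth is what controls noise without
# evaluation keys or key switching.
recomposed = sum(b * g for b, g in zip(bits, gadget(ell)))
print(recomposed)  # 200
```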
Comparison between Subfield and Straightforward Attacks on NTRU
Recently, in two independent papers, Albrecht, Bai and Ducas and Cheon, Jeong and Lee presented two very similar attacks that allow breaking NTRU with larger parameters and the GGH Multilinear Map without zero encodings. They proposed an algorithm for recovering the NTRU secret key given the public key, which applies to large NTRU moduli, and in particular to fully homomorphic encryption schemes based on NTRU. Fortunately, these attacks do not endanger the security of the NTRUEncrypt scheme, but they shed new light on the hardness of this problem. The basic idea of both attacks is to decrease the dimension of the NTRU lattice by using the multiplication matrix of the norm (resp. trace) of the public key in some subfield instead of the public key itself. Since the degree of the subfield is smaller, the dimension of the lattice decreases, and lattice reduction algorithms perform better.
Here, we revisit the attacks on NTRU and propose another variant that is simpler and outperforms both of these attacks in practice. It allows breaking several concrete instances of YASHE, an NTRU-based FHE scheme, but it is not as efficient as the hybrid method of Howgrave-Graham on concrete parameters of NTRU. Instead of using the norm and trace, we propose to use multiplication by the public key in some subring and show that this choice leads to better attacks. We can then show that for power-of-two cyclotomic fields, the time complexity is polynomial. Finally, we show that, under heuristics, straightforward lattice reduction is even more efficient, allowing us to extend this result to fields without non-trivial subfields, such as NTRU Prime. We stress that the improvement in the analysis applies even for relatively small moduli, though if the secret is sparse, it may not be the fastest attack. We also derive a tight estimation of security for the (Ring-)LWE and NTRU assumptions.
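The subfield descent at the heart of these attacks is easy to see concretely. In K = Q[x]/(x^n + 1) with n a power of two, the relative norm to the index-2 subfield is h(x)·h(−x), which is fixed by the automorphism x → −x and therefore lies in the subring generated by x², halving the lattice dimension. A sketch at toy size (pure-Python polynomial arithmetic; the coefficients are illustrative, nothing like real NTRU parameters):

```python
# Sketch of the subfield descent: in K = Q[x]/(x^n + 1), n a power of two,
# the relative norm h(x) * h(-x) lies in the subfield generated by x^2, so
# its odd-degree coefficients vanish. Toy degree only.

def polymul_mod(a, b):
    """Multiply two degree-<n polynomials modulo x^n + 1 (negacyclic)."""
    n = len(a)
    conv = [0] * (2 * n)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            conv[i + j] += ai * bj
    return [conv[i] - conv[i + n] for i in range(n)]  # reduce via x^n = -1

def relative_norm(h):
    """h(x) * h(-x): relative norm down the index-2 subfield."""
    sigma_h = [(-1) ** i * c for i, c in enumerate(h)]  # h(-x)
    return polymul_mod(h, sigma_h)

h = [3, 1, 4, 1]          # 3 + x + 4x^2 + x^3 in Q[x]/(x^4 + 1)
N = relative_norm(h)
print(N)                   # [-5, 0, 24, 0]: odd coefficients are zero
```

Feeding such norms (rather than h itself) to lattice reduction is what shrinks the lattice; the variant in this paper works with multiplication in a subring instead.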
Tools for interfacing, extracting, and analyzing neural signals using wide-field fluorescence imaging and optogenetics in awake behaving mice
Imaging of multiple cells has rapidly multiplied the rate of data acquisition as well as our knowledge of the complex dynamics within the mammalian brain. Genetically encoded calcium sensors deliver a substantial boost in signal strength, and in combination with equally critical advances in the size, speed, and sensitivity of the image sensors available in scientific cameras, they enable high-throughput detection of neural activity in intact, behaving animals using traditional wide-field fluorescence microscopy. However, the tremendous increase in data flow presents challenges for the processing, analysis, and storage of captured video, and prompts a reexamination of the traditional routines used to process data in neuroscience, demanding improvements in both the hardware and software used to process, analyze, and store captured video. This project demonstrates the ease with which a dependable and affordable wide-field fluorescence imaging system can be assembled and integrated with the behavior control and monitoring systems found in a typical neuroscience laboratory.
An open-source MATLAB toolbox is employed to efficiently analyze and visualize large imaging data sets in a manner that is both interactive and fully automated. This software package provides a library of image pre-processing routines optimized for batch processing of continuous functional fluorescence video, and additionally automates a fast unsupervised ROI detection and signal extraction routine. Further, an extension of this toolbox is described that uses GPU programming to process streaming video, enabling on-line identification, segmentation, and extraction of neural activity signals, with algorithms that improve signal specificity and image quality at the single-cell level in a behaving animal. This project describes the strategic ingredients for transforming a large bulk flow of raw continuous video into proportionally informative images and knowledge.
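The unsupervised ROI detection step can be illustrated in its simplest form: threshold a frame a few standard deviations above its mean and take the centroid of the supra-threshold pixels. The toolbox's actual routines are far more elaborate (and MATLAB/GPU-based); everything below is a synthetic NumPy stand-in.

```python
# Minimal illustration of unsupervised ROI detection: threshold a frame
# well above its mean intensity, then locate the centroid of the
# supra-threshold pixels. Synthetic data; the real toolbox uses
# batch-oriented, more sophisticated algorithms.
import numpy as np

rng = np.random.default_rng(0)
frame = rng.normal(0.0, 0.1, size=(64, 64))   # simulated sensor noise
frame[10:15, 20:25] += 10.0                   # one bright 5x5 "cell"

threshold = frame.mean() + 5 * frame.std()
mask = frame > threshold                      # candidate ROI pixels

rows, cols = np.nonzero(mask)
centroid = (rows.mean(), cols.mean())
print(mask.sum(), centroid)                   # 25 pixels centered near (12, 22)
```

A streaming variant would apply the same per-frame logic to chunks of video on the GPU, which is the role the toolbox extension described above plays.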
Adversarial inference and manipulation of machine learning models
Machine learning (ML) has established itself as a core component of various critical applications. However, with the increasing adoption of ML models, multiple attacks have emerged targeting different stages of the ML pipeline. Abstractly, the ML pipeline is divided into three phases: training, updating, and inference. In this thesis, we evaluate the privacy, security, and accountability risks of these three stages. Firstly, we explore the inference phase, where the adversary can only access the target model after deployment. In this setting, we explore one of the most severe attacks against ML models, namely the membership inference attack (MIA). We relax all of the MIA's key assumptions, showing that such attacks are broadly applicable at low cost and thereby pose a more severe risk than previously thought. Secondly, we study the updating phase. To that end, we propose a new attack surface against ML models, namely the change in the output of an ML model before and after it is updated. We then introduce four attacks, including data reconstruction attacks, against this setting. Thirdly, we explore the training phase, where the adversary interferes with the target model's training. In this setting, we propose the model hijacking attack, in which the adversary hijacks the target model to perform an illicit task of the adversary's own choosing. Finally, we propose different defense mechanisms to mitigate the identified risks.
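The intuition behind membership inference is a confidence gap: an overfit model is more confident on its training members than on unseen points, so a simple threshold on the model's top confidence already beats random guessing. The sketch below is a synthetic illustration of that baseline only; the data, "model" confidences, and threshold are invented, and the thesis's contribution is precisely to relax the much stronger assumptions such attacks were thought to need.

```python
# Toy confidence-threshold membership inference. Confidences are simulated
# stand-ins for a model's top-class probability: members cluster near 1.0,
# non-members lower. Predicting "member" when confidence exceeds a
# threshold already beats the 0.5 random-guess baseline.
import numpy as np

rng = np.random.default_rng(1)

member_conf = np.clip(rng.normal(0.95, 0.04, 1000), 0, 1)     # training points
nonmember_conf = np.clip(rng.normal(0.70, 0.15, 1000), 0, 1)  # unseen points

threshold = 0.9
guesses_members = member_conf > threshold        # ideally all True
guesses_nonmembers = nonmember_conf > threshold  # ideally all False

# Balanced attack accuracy over an equal mix of members and non-members.
accuracy = 0.5 * (guesses_members.mean() + (1 - guesses_nonmembers.mean()))
print(f"attack accuracy: {accuracy:.2f}")        # well above 0.5
```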
Crowd-powered systems
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 217-237). Crowd-powered systems combine computation with human intelligence, drawn from large groups of people connecting and coordinating online. These hybrid systems enable applications and experiences that neither crowds nor computation could support alone. Unfortunately, crowd work is error-prone and slow, making it difficult to incorporate crowds as first-order building blocks in software systems. I introduce computational techniques that decompose complex tasks into simpler, verifiable steps to improve quality, and optimize work to return results in seconds. These techniques develop crowdsourcing as a platform so that it is reliable and responsive enough to be used in interactive systems. This thesis develops these ideas through a series of crowd-powered systems. The first, Soylent, is a word processor that uses paid micro-contributions to aid writing tasks such as text shortening and proofreading. Using Soylent is like having access to an entire editorial staff as you write. The second system, Adrenaline, is a camera that uses crowds to help amateur photographers capture the exact right moment for a photo. It finds the best smile and catches subjects in mid-air jumps, all in realtime. Moving beyond generic knowledge and paid crowds, I introduce techniques to motivate a social network that has specific expertise, and techniques to data mine crowd activity traces in support of a large number of uncommon user goals. These systems point to a future where social and crowd intelligence are central elements of interaction, software, and computation. By Michael Scott Bernstein. Ph.D.
Management und IT: Proceedings of the AKWI conference, 16-18 September 2012, at Hochschule Pforzheim
Business informatics (Wirtschaftsinformatik) deals with all topics found at the interface between computer science and business administration. Building on knowledge and understanding of business concepts and applications, business informatics is concerned in particular with developing, introducing, and operating IT systems for business practice. A scientific conference entitled "Management und IT" takes such a description of business informatics as its starting point.
Constructive Reasoning for Semantic Wikis
One of the main design goals of social software, such as wikis, is to
support and facilitate interaction and collaboration. This dissertation
explores challenges that arise from extending social software with
advanced facilities such as reasoning and semantic annotations and
presents tools in form of a conceptual model, structured tags, a rule
language, and a set of novel forward chaining and reason maintenance
methods for processing such rules that help to overcome the
challenges.
Wikis and semantic wikis have usually been developed in an ad hoc manner, without much thought about the underlying concepts. A conceptual model suitable for a semantic wiki that takes advanced features such as annotations and reasoning into account is proposed. Moreover, so-called structured tags are proposed as a semi-formal knowledge representation step between informal and formal annotations.
The focus of rule languages for the Semantic Web has been predominantly on expert users and on the interplay of rule languages and ontologies. KWRL, the KiWi Rule Language, is proposed as a rule language for a semantic wiki that is easily understandable for users, as it is aware of the conceptual model of a wiki and is inconsistency-tolerant, and that can be efficiently evaluated, as it builds upon Datalog concepts.
The requirement for fast response times in interactive software translates in our work to bottom-up evaluation (materialization) of rules (views) ahead of time, that is, when rules or data change, not when they are queried. Materialized views have to be updated when data or rules change. While incremental view maintenance was intensively studied in the past and literature on the subject is abundant, the existing methods have surprisingly many disadvantages: they do not provide all the information desirable for explaining derived information, they require evaluation of possibly substantially larger Datalog programs with negation, they recompute the whole extension of a predicate even if only a small part of it is affected by a change, and they require adaptation to handle general rule changes.
A particular contribution of this dissertation is a set of forward chaining and reason maintenance methods with a simple declarative description that are efficient and that derive and maintain the information necessary for reason maintenance and explanation. The reasoning methods and most of the reason maintenance methods are described in terms of a set of extended immediate consequence operators, the properties of which are proven in the classical logic programming framework. In contrast to existing methods, the reason maintenance methods in this dissertation work by evaluating the original Datalog program, introducing no negation that is not present in the input program, and recomputing only the affected part of a predicate's extension. Moreover, our methods directly handle changes in both data and rules; a rule change does not need to be handled as a special case.
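The bottom-up (materialization) evaluation underlying these methods can be sketched as repeated application of the immediate consequence operator T_P until a fixpoint is reached. The rule encoding and relation names below are illustrative, not the dissertation's formalism:

```python
# Naive bottom-up Datalog evaluation: apply the immediate consequence
# operator T_P until a fixpoint. Rules are (head, body) pairs of atoms
# (predicate, args); single uppercase letters are variables.

def apply_rules(facts, rules):
    """One application of T_P: derive every fact some rule yields from `facts`."""
    derived = set(facts)
    for head, body in rules:
        def match(bindings, atoms):
            if not atoms:                        # body fully matched: emit head
                pred, args = head
                derived.add((pred, tuple(bindings.get(a, a) for a in args)))
                return
            pred, args = atoms[0]
            for fp, fargs in facts:              # join atom against known facts
                if fp != pred or len(fargs) != len(args):
                    continue
                b = dict(bindings)
                ok = True
                for a, v in zip(args, fargs):
                    if a.isupper():              # variable: bind consistently
                        if b.setdefault(a, v) != v:
                            ok = False
                            break
                    elif a != v:                 # constant mismatch
                        ok = False
                        break
                if ok:
                    match(b, atoms[1:])
        match({}, body)
    return derived

# ancestor(X,Y) :- parent(X,Y).
# ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z).
rules = [
    (("ancestor", ("X", "Y")), [("parent", ("X", "Y"))]),
    (("ancestor", ("X", "Z")), [("parent", ("X", "Y")), ("ancestor", ("Y", "Z"))]),
]
facts = {("parent", ("ann", "bob")), ("parent", ("bob", "cal"))}

while True:                                      # iterate T_P to the least fixpoint
    new = apply_rules(facts, rules)
    if new == facts:
        break
    facts = new

print(("ancestor", ("ann", "cal")) in facts)     # True
```

The incremental methods of this dissertation avoid re-running this loop from scratch on a change, recomputing only the affected part of each predicate's extension.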
A framework of support graphs, a data structure inspired by justification
graphs of classical reason maintenance, is proposed. Support
graphs enable a unified description and a formal comparison of the
various reasoning and reason maintenance methods and define a notion
of a derivation such that the number of derivations of an atom is
always finite even in the recursive Datalog case.
A practical approach to implementing reasoning, reason maintenance,
and explanation in the KiWi semantic platform is also investigated. It
is shown how an implementation may benefit from using a graph
database instead of or along with a relational database.