Synthesis of a simple self-stabilizing system
With the increasing importance of distributed systems as a computing
paradigm, a systematic approach to their design is needed. Although the area of
formal verification has made enormous advances towards this goal, the resulting
functionalities are limited to detecting problems in a particular design. By
means of a classical example, we illustrate a simple template-based approach to
computer-aided design of distributed systems based on lifting the well-known
technique of bounded model checking to the synthesis setting.
Comment: In Proceedings SYNT 2014, arXiv:1407.493
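The bounded model checking technique leveraged above can be illustrated with a toy explicit-state analogue. Real BMC encodes the k-step unrolling of a transition relation as a SAT/SMT formula; the state space, transition function, and bound below are invented purely for illustration:

```python
# Toy bounded model check: explore all states reachable within k steps
# of an explicit transition relation and test a safety property.
# Real BMC encodes the k-step unrolling as a SAT/SMT formula; this
# explicit-state version only illustrates the bounded-exploration idea.

def bounded_check(initial, step, is_bad, k):
    """Return a counterexample trace of length <= k, or None if safe up to k."""
    frontier = [(s, [s]) for s in initial]
    seen = set(initial)
    for _ in range(k):
        next_frontier = []
        for state, trace in frontier:
            for s2 in step(state):
                if is_bad(s2):
                    return trace + [s2]
                if s2 not in seen:
                    seen.add(s2)
                    next_frontier.append((s2, trace + [s2]))
        frontier = next_frontier
    return None

# Example: a counter mod 8 that must never reach 5.
trace = bounded_check(
    initial=[0],
    step=lambda s: [(s + 1) % 8],
    is_bad=lambda s: s == 5,
    k=10,
)
print(trace)  # [0, 1, 2, 3, 4, 5]
```

Synthesis turns this check around: instead of verifying one fixed design, a template with unknown parameters is checked per candidate instantiation until one admits no counterexample.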
Unification on Compressed Terms
First-order term unification is an essential concept in areas like functional and
logic programming, automated deduction, deductive databases, artificial intelligence,
information retrieval, compiler design, etc. We build upon recent
developments in grammar-based compression mechanisms for terms and investigate
algorithms for first-order unification and matching on compressed
terms.
We prove that the first-order unification of compressed terms is decidable
in polynomial time, and also that a compressed representation of the most
general unifier can be computed in polynomial time. Furthermore, we present
a polynomial time algorithm for first-order matching on compressed terms.
Both algorithms represent an improvement in time complexity over previous
results [GGSS09, GGSS08].
We use several known results on the tree grammars used for compression,
called singleton tree grammars (STGs), like polynomial time computability
of several subalgorithms: certain grammar extensions, deciding equality of
represented terms, and generating their preorder traversal. An innovation is
a specialized notion of depth for STGs which shows that unifiers can be
represented in polynomial space.
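For readers unfamiliar with the baseline problem, classic first-order unification on explicitly represented terms can be sketched as follows. The term encoding, variable naming convention, and substitution representation are our own illustrative choices; this is the uncompressed setting that the compressed-term algorithms above improve on:

```python
# Textbook first-order unification on explicitly represented terms.
# Variables are strings starting with '?'; applications are tuples
# (function_symbol, arg1, ..., argn).

def is_var(t):
    return isinstance(t, str) and t.startswith('?')

def walk(t, subst):
    # Follow variable bindings until a non-bound term is reached.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    # Occurs check: does variable v appear inside t under subst?
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, subst) for a in t[1:])
    return False

def unify(s, t, subst=None):
    """Return a most general unifier as a dict, or None on failure."""
    if subst is None:
        subst = {}
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if is_var(t):
        return unify(t, s, subst)
    if isinstance(s, tuple) and isinstance(t, tuple) \
            and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

# f(?x, g(?y)) unifies with f(a, g(?x)).
mgu = unify(('f', '?x', ('g', '?y')), ('f', 'a', ('g', '?x')))
print(mgu)  # {'?x': 'a', '?y': 'a'}
```

Without sharing, iterating such substitutions can blow up the unifier to exponential size, which is precisely what the compressed (STG) representation of the most general unifier avoids.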
Unification and Matching on Compressed Terms
Term unification plays an important role in many areas of computer science,
especially in those related to logic. The universal mechanism of grammar-based
compression for terms, in particular the so-called Singleton Tree Grammars
(STG), has recently drawn considerable attention. Using STGs, terms of
exponential size and height can be represented in linear space. Furthermore,
the term representation by directed acyclic graphs (dags) can be efficiently
simulated. The present paper is the result of an investigation on term
unification and matching when the terms given as input are represented using
different compression mechanisms for terms such as dags and Singleton Tree
Grammars. We describe a polynomial time algorithm for context matching with
dags, when the number of different context variables is fixed for the problem.
For the same problem, NP-completeness is obtained when the terms are
represented using the more general formalism of Singleton Tree Grammars. For
first-order unification and matching polynomial time algorithms are presented,
each of them improving previous results for those problems.
Comment: This paper is posted at the Computing Research Repository (CoRR) as
part of the process of submission to the journal ACM Transactions on
Computational Logic (TOCL)
Variants of unification considering compression and context variables
Term unification is a basic operation in several areas of computer science, especially in those related to logic. Generally speaking, it consists in solving equations over expressions called terms. Depending on the kind of variables allowed to occur in the terms and on the conditions under which two terms are considered equal, several frameworks of unification, such as first-order unification, higher-order unification, syntactic unification, and unification modulo theories, can be distinguished.
Moreover, other variants of term unification arise when we consider nontrivial representations for terms. In this thesis we study variants of the classic first-order syntactic term unification problem resulting from the introduction of context variables, i.e. variables standing for contexts, and/or assuming that the input is given in some kind of compressed representation.
More specifically, we focus on two such compressed representations: the well-known Directed Acyclic Graphs (DAGs) and Singleton Tree Grammars (STGs). Just as DAGs allow compression by sharing repeated instances of a subterm within a term, STGs are a grammar-based compression mechanism based on the reuse of repeated (multi)contexts. An interesting property of the STG formalism is that many operations on terms can be performed efficiently directly on their compressed representation, thus exploiting the compression also when computing with the represented terms.
Although finding a minimal STG representation of a term is computationally difficult, this limitation has been successfully overcome in practice, and several STG compressors for terms are available. The STG representation has been applied in XML processing and the analysis of Term Rewrite Systems. Moreover, STGs are a very useful concept for the analysis of unification problems since, roughly speaking, they allow solutions to be represented in a succinct but still efficiently verifiable form.
In the first part of this thesis we study the first-order unification and matching problems under the assumption that the input terms are represented with singleton tree grammars. We present polynomial time algorithms for these problems, as well as implementation proposals and experimental results. These results are used later in the thesis in the analysis of unification and matching problems with context variables and compressed input.
In the rest of the thesis we focus on variants of unification with context variables, which are a particular case of higher-order variables. More concretely, in the second part we study the particular case of context unification in which only a single context variable is allowed in the input, known as one context unification. We show that this problem can be solved in nondeterministic polynomial time, not only when the input terms are represented explicitly, but also when they are represented with directed acyclic graphs or singleton tree grammars. We also present a recent partial result towards a polynomial time algorithm for one context unification.
At the end of the thesis we study the matching problem for inputs with context variables, a particular case of unification where only one of the two terms in each equation contains variables. In contrast to the general problem, context matching has been classified as NP-complete. We show that context matching remains NP-complete even when grammars are used as the term representation formalism, implying that adding compression does not cause a drastic increase in the complexity of the problem. We also prove that the restriction of context matching in which the number of context variables is bounded by a fixed constant of the problem can be solved in polynomial time, even when directed acyclic graphs are used as the term representation formalism.
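The kind of sharing that makes DAG (and, more generally, STG) compression effective can be sketched with a minimal hash-consing construction. The class and the example chain of terms below are illustrative assumptions, not code from the thesis:

```python
# Minimal hash-consing: build terms as a DAG so that each distinct
# subterm is stored exactly once. The chain t_{i+1} = f(t_i, t_i)
# denotes a tree with 2**n leaves but needs only n+1 DAG nodes.

class Dag:
    def __init__(self):
        self.nodes = {}          # (symbol, child_ids) -> node id

    def make(self, symbol, *children):
        key = (symbol, children)
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]

dag = Dag()
t = dag.make('a')
for _ in range(30):
    t = dag.make('f', t, t)     # doubles the represented tree each step

print(len(dag.nodes))  # 31 nodes for a tree with 2**30 leaves
```

STGs go one step further: besides sharing whole subterms, their grammar rules can also share repeated contexts (terms with a hole), which DAGs cannot express.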
TAPAS: Tricks to Accelerate (encrypted) Prediction As a Service
Machine learning methods are widely used for a variety of prediction
problems. \emph{Prediction as a service} is a paradigm in which service
providers with technological expertise and computational resources may perform
predictions for clients. However, data privacy severely restricts the
applicability of such services, unless measures to keep client data private
(even from the service provider) are designed. Equally important is to minimize
the amount of computation and communication required between client and server.
Fully homomorphic encryption offers a possible way out, whereby clients may
encrypt their data, on which the server may then perform arithmetic
computations. The main drawback of using fully homomorphic encryption is the
amount of time required to evaluate large machine learning models on encrypted
data. We combine ideas from the machine learning literature, particularly work
on binarization and sparsification of neural networks, together with
algorithmic tools to speed up and parallelize computation on encrypted data.
Comment: Accepted at the International Conference on Machine Learning (ICML), 201
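The binarization idea the abstract alludes to can be sketched for a single linear layer, in a BinaryConnect-style scheme with one per-layer scale. This is a generic illustration of the technique, not the specific TAPAS construction, and the homomorphic-encryption machinery is omitted:

```python
# Minimal weight binarization for one linear layer: each weight becomes
# sign(w), with a single scale alpha equal to the mean absolute weight.
# Binarized layers need only additions/subtractions plus one scaling,
# the kind of arithmetic that stays cheap on encrypted data.

def binarize(weights):
    """Return (alpha, signs) approximating the real-valued weight vector."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs

def binary_dot(alpha, signs, x):
    # Sum of signed inputs, scaled once at the end.
    return alpha * sum(s * xi for s, xi in zip(signs, x))

w = [0.5, -0.3, 0.2]
alpha, signs = binarize(w)
print(signs)                                  # [1, -1, 1]
print(binary_dot(alpha, signs, [1.0, 2.0, 3.0]))
```

The approximation trades a little accuracy (here the exact dot product is 0.5, the binarized one 2/3) for a drastic reduction in the cost of each encrypted multiplication.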
Data Generation for Neural Programming by Example
Programming by example is the problem of synthesizing a program from a small
set of input / output pairs. Recent works applying machine learning methods to
this task show promise, but are typically reliant on generating synthetic
examples for training. A particular challenge lies in generating meaningful
sets of inputs and outputs, which well-characterize a given program and
accurately demonstrate its behavior. When the examples used for testing are
generated by the same method as the training data, the performance of a model
may partly reflect this similarity. In this paper we introduce a novel
approach using an SMT solver to synthesize inputs which cover a diverse set of
behaviors for a given program. We carry out a case study comparing this method
to existing synthetic data generation procedures in the literature, and find
that data generated using our approach improves both the discriminatory power
of example sets and the ability of trained machine learning models to
generalize to unfamiliar data.
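The goal of covering a diverse set of behaviors can be illustrated with a toy stand-in in which plain enumeration replaces the SMT solver. The clamp program and the branch-signature heuristic below are invented for illustration only:

```python
# Toy behavior-covering example generation. The paper drives an SMT
# solver toward unexplored behaviors; here plain enumeration stands in:
# greedily keep inputs whose branch signature reveals a program path
# not yet demonstrated by the example set.

def program(x):
    # Toy target program: clamp x to the range [0, 10].
    if x < 0:
        return 0, 'low'
    if x > 10:
        return 10, 'high'
    return x, 'mid'

def cover(candidates):
    examples, seen_paths = [], set()
    for x in candidates:
        y, path = program(x)
        if path not in seen_paths:
            seen_paths.add(path)
            examples.append((x, y))
    return examples

print(cover(range(-5, 20)))  # [(-5, 0), (0, 0), (11, 10)]
```

Three examples here pin down all three branches, whereas examples drawn from a narrow input distribution might exercise only one.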
Distributed Vector-OLE: Improved Constructions and Implementation
We investigate concretely efficient protocols for distributed oblivious linear evaluation over vectors (Vector-OLE). Boyle et al. (CCS 2018) proposed a protocol for secure distributed pseudorandom Vector-OLE generation using sublinear communication, but they did not provide an implementation. Their construction is based on a variant of the LPN assumption and assumes a distributed key generation protocol for single-point Function Secret Sharing (FSS), as well as an efficient batching scheme to obtain multi-point FSS. We show that this requirement can be relaxed, resulting in a weaker variant of FSS, for which we give an efficient protocol. This allows us to use efficient probabilistic batch codes that were also recently used for batched PIR by Angel et al. (S&P 2018). We construct a full Vector-OLE generator from our protocols, and compare it experimentally with alternative approaches. Our implementation parallelizes very well, and has low communication overhead in practice. For generating a VOLE of size , our implementation only takes s on 32 cores
Secure and Scalable Document Similarity on Distributed Databases: Differential Privacy to the Rescue
Privacy-preserving collaborative data analysis enables richer models than what each party can learn with their own data. Secure Multi-Party Computation (MPC) offers a robust cryptographic approach to this problem, and in fact several protocols have been proposed for various data analysis and machine learning tasks. In this work, we focus on secure similarity computation between text documents, and the application to k-nearest neighbors (k-NN) classification. Due to its non-parametric nature, k-NN presents scalability challenges in the MPC setting. Previous work addresses these by introducing non-standard assumptions about the abilities of an attacker, for example by relying on non-colluding servers. In this work, we tackle the scalability challenge from a different angle, and instead introduce a secure preprocessing phase that reveals differentially private (DP) statistics about the data. This allows us to exploit the inherent sparsity of text data and significantly speed up all subsequent classifications.
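The differentially private statistics released in the preprocessing phase can be illustrated with the standard Laplace mechanism. The choice of statistic, a per-term count of sensitivity 1, is our assumption for illustration, not the paper's exact protocol:

```python
import random

# Standard Laplace mechanism for releasing counts with differential
# privacy: add noise of scale sensitivity/eps. Releasing a per-term
# document count (sensitivity 1) is an illustrative stand-in for the
# DP statistics of the secure preprocessing phase.

def laplace(scale, rng=random):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(true_count, eps, rng=random):
    """Release a count with eps-differential privacy (sensitivity 1)."""
    return true_count + laplace(1.0 / eps, rng)

print(dp_count(120, eps=0.5))  # 120 plus Laplace noise of scale 2
```

Smaller eps means stronger privacy but noisier statistics, so the preprocessing trades a quantified privacy loss for the sparsity information that accelerates classification.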
Blind Justice: Fairness with Encrypted Sensitive Attributes
Recent work has explored how to train machine learning models which do not
discriminate against any subgroup of the population as determined by sensitive
attributes such as gender or race. To avoid disparate treatment, sensitive
attributes should not be considered. On the other hand, in order to avoid
disparate impact, sensitive attributes must be examined, e.g., in order to
learn a fair model, or to check if a given model is fair. We introduce methods
from secure multi-party computation which allow us to avoid both. By encrypting
sensitive attributes, we show how an outcome-based fair model may be learned,
checked, or have its outputs verified and held to account, without users
revealing their sensitive attributes.
Comment: published at ICML 201
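The secure-computation substrate such methods build on can be illustrated with minimal additive secret sharing over Z_p. This is a generic MPC building block, not the paper's specific protocol; the modulus and party count below are arbitrary choices:

```python
import random

# Minimal additive secret sharing over Z_p: a user splits a sensitive
# attribute into random shares, one per server. No single share reveals
# anything about the value, yet the servers can jointly compute on it.

P = 2_147_483_647  # a Mersenne prime used as the modulus

def share(secret, n_parties, rng=random):
    shares = [rng.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

s = share(1, 3)          # e.g. a sensitive attribute encoded as 0/1
print(reconstruct(s))    # 1

# Linearity: parties can add two shared values locally, share by share,
# which is what makes gradient-style model training possible under MPC.
a, b = share(5, 3), share(7, 3)
summed = [(x + y) % P for x, y in zip(a, b)]
print(reconstruct(summed))  # 12
```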