Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments
Decentralized systems are a subset of distributed systems where multiple
authorities control different components and no authority is fully trusted by
all. This implies that any component in a decentralized system is potentially
adversarial. We review fifteen years of research on decentralization and
privacy, and provide an overview of key systems, as well as key insights for
designers of future systems. We show that decentralized designs can enhance
privacy, integrity, and availability but also require careful trade-offs in
terms of system complexity, properties provided, and degree of
decentralization. These trade-offs need to be understood and navigated by
designers. We argue that a combination of insights from cryptography,
distributed systems, and mechanism design, aligned with the development of
adequate incentives, is necessary to build scalable and successful
privacy-preserving decentralized systems.
Towards causal federated learning: a federated approach to learning representations using causal invariance
Federated Learning is an emerging privacy-preserving distributed machine learning approach to building a shared model by performing distributed training locally on participating devices (clients) and aggregating the local models into a global one. As this approach prevents data collection and aggregation, it helps in reducing associated privacy risks to a great extent.
However, the data samples across participating clients are usually not independent and identically distributed (non-i.i.d.), and out-of-distribution (OOD) generalization of the learned models can be poor. Beyond this challenge, federated learning remains vulnerable to security attacks in which a few malicious participants try to insert backdoors, degrade the aggregated model, or infer the data owned by other participants. In this work, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup, and we analyse empirically how it improves the out-of-distribution (OOD) accuracy as well as the privacy of the final learned model. Although federated learning allows participants to contribute their local data without revealing it, it faces issues in data security and in accurately paying participants for quality data contributions. In this report, we also propose an EOS Blockchain design and workflow to establish data security and a novel validation-error-based metric by which we qualify gradient uploads for payment, and we implement a small example of our Blockchain Causal Federated Learning model to analyze its performance with respect to robustness, privacy, and fairness in incentivization.
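The aggregation step that the federated learning abstract describes (local models combined into a global one) can be illustrated with a minimal sketch. It assumes plain sample-count-weighted averaging of flat parameter vectors; the function name `federated_average` and its arguments are hypothetical, and this is not the causal-invariance method or the payment metric proposed in the report.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine client parameter vectors into one global vector.

    Each client's parameters are weighted by its local sample count
    (an assumed weighting; the abstract does not specify one).
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        # Add this client's contribution, scaled by its share of the data.
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total
    return global_weights


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only the parameter vectors and sample counts leave the clients in this sketch; the raw training data never does, which is the privacy argument the abstract makes.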
From Social Data Mining to Forecasting Socio-Economic Crisis
Socio-economic data mining has a great potential in terms of gaining a better
understanding of problems that our economy and society are facing, such as
financial instability, shortages of resources, or conflicts. Without
large-scale data mining, progress in these areas seems hard or impossible.
Therefore, a suitable, distributed data mining infrastructure and research
centers should be built in Europe. It also appears appropriate to build a
network of Crisis Observatories. They can be imagined as laboratories devoted
to the gathering and processing of enormous volumes of data on both natural
systems such as the Earth and its ecosystem, as well as on human
techno-socio-economic systems, so as to gain early warnings of impending
events. Reality mining provides the chance to adapt more quickly and more
accurately to changing situations. Further opportunities arise by individually
customized services, which however should be provided in a privacy-respecting
way. This requires the development of novel ICT (such as a self-organizing
Web), but most likely new legal regulations and suitable institutions as well.
As long as such regulations are lacking on a world-wide scale, it is in the
public interest that scientists explore what can be done with the huge data
available. Big data do have the potential to change or even threaten democratic
societies. The same applies to sudden and large-scale failures of ICT systems.
Therefore, dealing with data must be done with a large degree of responsibility
and care. Self-interests of individuals, companies or institutions have limits,
where the public interest is affected, and public interest is not a sufficient
justification to violate human rights of individuals. Privacy, like
confidentiality, is a high-value good, and damaging it would have serious side
effects for society.
Comment: 65 pages, 1 figure, Visioneer White Paper, see
http://www.visioneer.ethz.c
Crowdsourcing atop blockchains
Traditional crowdsourcing systems, such as Amazon's Mechanical Turk (MTurk), have achieved great economic success but rely entirely on third-party platforms that sit between requesters and workers to provide basic utilities. These third parties have to be fully trusted to assist payments, resolve disputes, protect data privacy, manage user authentication, keep the service online, etc. Nevertheless, numerous real-world incidents show how difficult it is to completely trust these platforms in practice, so reducing this over-reliance becomes desirable.
In contrast to the arguably vulnerable centralized approaches, a public blockchain is a distributed and transparent global consensus computer that is highly robust. The blockchain is usually managed and replicated collectively by a large-scale peer-to-peer network, which makes it much more reasonable to trust for correctness and availability. It therefore becomes enticing to build novel crowdsourcing applications atop blockchains to reduce the over-trust in third-party platforms.
However, this new fascinating technology also brings about new challenges, which were never that severe in the conventional centralized setting. The most serious issue is that the blockchain is usually maintained in the public Internet environment with a broader attack surface open to anyone. This not only causes serious privacy and security issues, but also allows the adversaries to exploit the attack surface to hamper more basic utilities. Worse still, most existing blockchains support only light on-chain computations, and the smart contract executed atop the decentralized consensus computer must be simple, which incurs serious feasibility problems. In reality, the privacy/security issue and the feasibility problem even restrain each other and create serious tensions to hinder the broader adoption of blockchain.
The dissertation works through the non-trivial challenges of realizing secure yet still practical decentralization (for urgent crowdsourcing use-cases), and lays down the foundation for this line of research. In sum, it makes the following major contributions.
First, it identifies the security requirements of decentralized knowledge crowdsourcing (e.g., data privacy), and initiates the research of private decentralized crowdsourcing. In particular, the confidentiality of solicited data is indispensable to prevent free-riders from pirating others' submissions, thus ensuring the quality of solicited knowledge. To this end, a generic private decentralized crowdsourcing framework is designed, analyzed, and implemented.
Furthermore, this dissertation leverages concretely efficient cryptographic design to reduce the cost of the above generic framework. It focuses on decentralizing the special use-case of Amazon MTurk, and conducts multiple specific-purpose optimizations to remove needless generality to squeeze performance. The implementation atop Ethereum demonstrates a handling cost even lower than MTurk.
In addition, it focuses on decentralized crowdsourcing of computing power for specific machine learning tasks. It lets a requester place deposits in the blockchain to recruit workers for a designated (randomized) program. The workers earn their payments if and only if they contribute their resources to compute correctly. For these goals, a simple yet still useful incentive mechanism is developed atop the blockchain to deter rational workers from cheating.
Finally, the research initiates the first systematic study on crowdsourcing blockchains' full nodes to assist superlight clients (e.g., mobile phones and IoT devices) in reading the blockchain's records. This dissertation presents a novel generic solution through the lens of game-theoretic treatments, which solves the long-standing open problem of designing generic superlight clients for all blockchains.
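The crowdsourcing abstract above argues that submissions must stay confidential so that free-riders cannot copy other workers' answers from a public ledger. One common way to illustrate that idea is a hash-based commit-reveal scheme, sketched below. This is a swapped-in textbook technique shown for intuition only, not the dissertation's construction; the function names `commit` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(answer: bytes) -> Tuple[bytes, bytes]:
    """Return (commitment, nonce) for an answer.

    Only the commitment would be published at first; the fixed-length
    random nonce keeps observers from guessing the answer by hashing
    candidate values.
    """
    nonce = secrets.token_bytes(32)
    commitment = hashlib.sha256(nonce + answer).digest()
    return commitment, nonce


def verify(commitment: bytes, nonce: bytes, answer: bytes) -> bool:
    """Check that a revealed (nonce, answer) pair matches the commitment."""
    expected = hashlib.sha256(nonce + answer).digest()
    return hmac.compare_digest(commitment, expected)


if __name__ == "__main__":
    c, n = commit(b"label: cat")
    print(verify(c, n, b"label: cat"))  # the honest reveal is accepted
    print(verify(c, n, b"label: dog"))  # a changed answer is rejected
```

In such a scheme every worker posts a commitment before anyone reveals, so an answer copied after the reveal phase starts cannot be matched to an earlier commitment.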
Cryptographic protocols for privacy-aware and secure e-commerce
This thesis is about the research and development of privacy-enhancing techniques that give consumers of electronic commerce services control over how much private information they share with service providers. We make use of known and newly developed technologies to protect users against excessive data collection by service providers in specific applications. Namely, we use a novel identity-based dynamic threshold signature scheme and a novel key management scheme to implement a group size accreditation mechanism that reveals nothing about the group members except the size of the group, in order to support group discounts. Next, we use a novel construction based on blind signatures, zero-knowledge proofs, and generalization techniques to implement a privacy-preserving loyalty program. Finally, we use multiparty computation protocols to implement implicit authentication mechanisms that do not disclose private information about the users to the service providers.
From Controlled Data-Center Environments to Open Distributed Environments: Scalable, Efficient, and Robust Systems with Extended Functionality
The past two decades have witnessed several paradigm shifts in computing environments. They start with cloud computing, which offers on-demand allocation of storage, network, compute, and memory resources, as well as other services, in a pay-as-you-go billing model. They end with the rise of permissionless blockchain technology, a decentralized computing paradigm with lower trust assumptions and a limitless number of participants. Unlike in the cloud, where all the computing resources are owned by some trusted cloud provider, permissionless blockchains allow computing resources owned by possibly malicious parties to join and leave their network without obtaining permission from a centralized trusted authority. Still, in the presence of malicious parties, permissionless blockchain networks can perform general computations and make progress. Cloud computing is powered by geographically distributed data centers controlled and managed by trusted cloud service providers, and it promises theoretically infinite computing resources. Permissionless blockchains, on the other hand, are powered by open networks of geographically distributed computing nodes owned by entities that are not necessarily known or trusted. This paradigm shift requires a reconsideration of distributed data management protocols and distributed system designs that assume low latency across system components, inelastic computing resources, or fully trusted computing resources.
In this dissertation, we propose new system designs and optimizations that address the scalability and efficiency of distributed data management systems in cloud environments. We also propose several protocols and new programming paradigms to extend the functionality and enhance the robustness of permissionless blockchains. The work presented spans global-scale transaction processing, large-scale stream processing, atomic transaction processing across permissionless blockchains, and extending the functionality and the use-cases of permissionless blockchains. In all these directions, the focus is on rethinking system and protocol designs to account for novel cloud and permissionless blockchain assumptions. For global-scale transaction processing, we propose GPlacer, a placement optimization framework that decides replica placement for fully and partially geo-replicated databases. For large-scale stream processing, we propose Cache-on-Track (CoT), an adaptive and elastic client-side cache that addresses server-side load imbalances that occur in large-scale distributed storage layers. In permissionless blockchain transaction processing, we propose AC3WN, the first correct cross-chain commitment protocol that guarantees atomicity of cross-chain transactions. We also propose TXSC, a transactional smart contract programming framework. TXSC provides smart contract developers with transaction primitives. These primitives allow developers to write smart contracts without the need to reason about the anomalies that can arise from concurrent smart contract function executions. In addition, we propose a forward-looking architecture that unifies permissioned and permissionless blockchains and exploits the running infrastructure of permissionless blockchains to build global asset management systems.
Behavioral types in programming languages
A recent trend in programming language research is to use behavioral type theory to ensure various correctness properties of large-scale, communication-intensive systems. Behavioral types encompass concepts such as interfaces, communication protocols, contracts, and choreography. The successful application of behavioral types requires a solid understanding of several practical aspects, from their representation in a concrete programming language, to their integration with other programming constructs such as methods and functions, to design and monitoring methodologies that take behaviors into account. This survey provides an overview of the state of the art of these aspects, which we summarize as the pragmatics of behavioral types.
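To make the idea of a communication protocol treated as a behavior more concrete, the following sketch shows a small runtime monitor that rejects actions performed out of protocol order. It is only an illustration of the monitoring side mentioned in the abstract: behavioral type systems typically enforce such properties statically, and the class names and the `REQUEST_REPLY` protocol are hypothetical.

```python
from typing import Dict, Tuple


class ProtocolViolation(Exception):
    """Raised when an action is not permitted in the current state."""


class ProtocolMonitor:
    """Tracks a protocol as a finite-state machine over named actions."""

    def __init__(self, transitions: Dict[Tuple[str, str], str],
                 start: str, end: str) -> None:
        self._transitions = transitions
        self._state = start
        self._end = end

    def step(self, action: str) -> None:
        """Advance the protocol, or raise if the action is out of order."""
        key = (self._state, action)
        if key not in self._transitions:
            raise ProtocolViolation(
                f"action {action!r} not allowed in state {self._state!r}")
        self._state = self._transitions[key]

    @property
    def finished(self) -> bool:
        """True once the protocol has reached its final state."""
        return self._state == self._end


# Hypothetical two-step protocol: send a request, then receive a reply.
REQUEST_REPLY = {
    ("start", "send_request"): "awaiting_reply",
    ("awaiting_reply", "recv_reply"): "done",
}

if __name__ == "__main__":
    monitor = ProtocolMonitor(REQUEST_REPLY, start="start", end="done")
    monitor.step("send_request")
    monitor.step("recv_reply")
    print(monitor.finished)
```

A static behavioral type checker would reject a program that receives before sending at compile time; the monitor above only detects the same mistake when it happens.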