42 research outputs found

    P2KMV: A Privacy-preserving Counting Sketch for Efficient and Accurate Set Intersection Cardinality Estimations

    In this paper, we propose P2KMV, a novel privacy-preserving counting sketch based on the k minimum values algorithm. With P2KMV, we offer a versatile privacy-enhancing technology for obtaining statistics, following the principle of data minimization and aiming for the sweet spot between privacy, accuracy, and computational efficiency. As our main contribution, we develop methods to perform set operations, which facilitate cardinality estimates under strong privacy requirements. Most notably, we propose an efficient, privacy-preserving algorithm to estimate the set intersection cardinality. P2KMV provides plausible deniability for all data items contained in the sketch. We discuss the algorithm's privacy guarantees as well as the accuracy of the obtained estimates. An experimental evaluation confirms our analytical expectations and provides insights regarding parameter choices.
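
    The abstract above includes no pseudocode, so as background here is a minimal Python sketch of the plain, non-private k minimum values (KMV) estimator that P2KMV builds on. All names are illustrative, and P2KMV's actual privacy machinery (noise, plausible deniability) is not shown.

        import hashlib

        def kmv_estimate(items, k=256):
            """Plain (non-private) K Minimum Values cardinality estimator.

            Each item is hashed to a value in (0, 1); the sketch keeps the k
            smallest distinct hash values. If the k-th smallest is m_k, the
            number of distinct items is estimated as (k - 1) / m_k.
            """
            hashes = set()
            for item in items:
                digest = hashlib.sha256(str(item).encode()).digest()
                hashes.add(int.from_bytes(digest[:8], "big") / 2.0**64)
            smallest = sorted(hashes)[:k]
            if len(smallest) < k:
                return len(smallest)  # fewer than k distinct items: exact count
            return (k - 1) / smallest[-1]

        # Example: estimate the number of distinct users in a repetitive stream.
        stream = [f"user-{i % 50_000}" for i in range(200_000)]
        print(round(kmv_estimate(stream)))  # close to 50,000

    Because the k smallest hash values of a union of sets are computable from the individual sketches, estimators of this family extend naturally to set operations; P2KMV's contribution is performing such estimates under strong privacy requirements.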

    On the Security of the K Minimum Values (KMV) Sketch

    Data sketches are widely used to accelerate operations in big data analytics. For example, algorithms use sketches to compute the cardinality of a set or the similarity between two sets. Sketches achieve significant reductions in computing time and storage requirements by providing probabilistic estimates rather than exact values. In many applications an estimate is sufficient, so accuracy can be traded for computational complexity; this enables the use of probabilistic sketches. However, probabilistic data structures may create security issues, because an attacker may manipulate the data in such a way that the sketches produce an incorrect estimate. For example, an attacker could inflate the estimated number of distinct users to increase their revenue or popularity. Recent works have shown that an attacker can manipulate HyperLogLog, a sketch widely used for cardinality estimation, with no knowledge of its implementation details. This paper considers the security of K Minimum Values (KMV), a sketch that is also widely used to implement both cardinality and similarity estimates. The vulnerabilities are characterized at an implementation-independent level, with attacks formulated as part of a novel adversary model that manipulates the similarity estimate. Analysis and simulation suggest that KMV is vulnerable to attacks that can either increase or reduce the estimate. Executing the attacks against the KMV implementation in the Apache DataSketches library validates these scenarios, with excellent agreement between theory and experiment.

    Pedro Reviriego acknowledges the support of the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T, funded by the Spanish Ministry of Economy and Competitiveness, and of the Madrid Community research project TAPIR-CM under Grant P2018/TCS-4496.
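
    To make the manipulation threat concrete, the following toy Python sketch illustrates one generic strategy; it is not the paper's specific attack, and it assumes (unlike the implementation-independent setting studied in the paper) that the attacker knows the hash function: search offline for inputs that hash to very small values, so that inserting them displaces legitimate minima in a KMV sketch.

        import hashlib

        def small_hash_inputs(count=64, budget=500_000, prefix="crafted"):
            """Hypothetical attack sketch: brute-force inputs with tiny hashes.

            If the k smallest hash values in a KMV sketch are occupied by
            crafted items, cardinality and similarity estimates derived from
            those minima can be pushed up or down by the attacker.
            """
            scored = []
            for i in range(budget):
                item = f"{prefix}-{i}"
                digest = hashlib.sha256(item.encode()).digest()
                scored.append((int.from_bytes(digest[:8], "big"), item))
            scored.sort()
            return [item for _, item in scored[:count]]

        # Inserting these items into both of two sets inflates their estimated
        # similarity, since the shared crafted minima dominate both sketches.
        print(small_hash_inputs(count=4, budget=10_000))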

    Hypergraph Partitioning in the Cloud

    The thesis investigates the partitioning and load balancing problem, which has many applications in High Performance Computing (HPC). The application to be partitioned is described with a graph or hypergraph. The latter is of greater interest because hypergraphs, compared to graphs, have a more general structure and can model more complex relationships between groups of objects, such as non-symmetric dependencies. Optimal graph and hypergraph partitioning is known to be NP-hard, but good polynomial-time heuristic algorithms have been proposed.

    In this thesis, we propose two multi-level hypergraph partitioning algorithms based on rough set clustering techniques. The first is a serial algorithm that obtains high-quality partitionings and improves the partitioning cut by up to 71% compared to state-of-the-art serial hypergraph partitioning algorithms. Because the capacity of serial algorithms is limited by the rapid growth of problem sizes in distributed applications, we also propose a parallel hypergraph partitioning algorithm. Given the generality of the hypergraph model, designing a parallel algorithm is difficult, and the available parallel hypergraph algorithms offer less scalability than their graph counterparts; the issue is twofold: the parallel algorithm and the complexity of the hypergraph structure. Our parallel algorithm provides a trade-off between global and local vertex clustering decisions. By employing novel techniques and approaches, it achieves better scalability than the state-of-the-art parallel hypergraph partitioner in the Zoltan tool on a set of benchmarks, especially those with irregular structure.

    Furthermore, recent advances in cloud computing and the services it provides have led to a trend of moving HPC and large-scale distributed applications into the cloud. Despite its advantages, some aspects of the cloud, such as limited network resources, present a challenge to running communication-intensive applications and make them non-scalable in the cloud. While hypergraph partitioning is proposed as a solution for decreasing the communication overhead within parallel distributed applications, it can also offer advantages for running these applications in the cloud. The partitioning is usually done as a pre-processing step before running the parallel application. As parallel hypergraph partitioning is itself a communication-intensive operation, running it in the cloud is hard and suffers from poor scalability. The thesis therefore also investigates the scalability of parallel hypergraph partitioning algorithms in the cloud and the challenges involved, and proposes solutions to improve the cost/performance ratio of running the partitioning problem in the cloud.

    Our algorithms are implemented as a new hypergraph partitioning package within Zoltan, an open-source, Linux-based toolkit for parallel partitioning, load balancing, and data management developed at Sandia National Laboratories. The algorithms are known as FEHG and PFEHG.
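
    For context on the cut metric behind the 71% figure, the Python sketch below computes the standard connectivity-minus-one cut of a hypergraph partitioning, in which a hyperedge spanning p blocks contributes p - 1 to the cut. Whether FEHG optimizes exactly this variant or the plain hyperedge cut is an assumption here; the abstract does not say.

        def hyperedge_cut(hyperedges, part):
            """Connectivity-1 cut: sum over hyperedges of (blocks spanned - 1).

            hyperedges: iterable of vertex lists; part: dict vertex -> block id.
            """
            return sum(len({part[v] for v in edge}) - 1 for edge in hyperedges)

        # Example: two hyperedges over four vertices, split into two blocks.
        edges = [["a", "b", "c"], ["c", "d"]]
        blocks = {"a": 0, "b": 0, "c": 1, "d": 1}
        print(hyperedge_cut(edges, blocks))  # first edge spans 2 blocks -> cut 1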

    Annales Mathematicae et Informaticae (44.)


    An assessment of the component-based view for metaheuristic research.

    Master's degree. University of KwaZulu-Natal, Durban.

    Several authors have recently pointed to a crisis within the metaheuristic research field, particularly the proliferation of metaphor-inspired metaheuristics. Common problems identified include the use of non-standard terminology, poor experimental practices, and, most importantly, the introduction of purportedly new algorithms that are only superficially different from existing ones. These issues make similarity and performance analysis, classification, and metaheuristic generation difficult for both practitioners and researchers. A component-based view of metaheuristics has recently been promoted to deal with these problems; it argues that metaheuristics are best understood in terms of their constituents, or components. This dissertation presents three papers that are thematically centred on this view.

    The central problem for the component-based view is the identification of the components of a metaheuristic. The first paper proposes the use of taxonomies to guide this identification. We developed a general and rigorous method, TAXONOG-IMC, that takes an appropriate taxonomy as input and guides the user in identifying components. The method is described in detail, an example application is given, and an analysis of its usefulness is provided; the analysis shows that the method is effective and yields insights that are not possible without proper identification of the components. The second paper argues for formal, mathematically sound representations of metaheuristics, introducing and defending a formal representation that leverages the component-based view. The third paper demonstrates that a representation technique based on the component-based view can provide the basis for a similarity measure; it presents a method for measuring the similarity of two metaheuristic algorithms based on their representations as signal flow diagrams. Our findings indicate that the component-based view of metaheuristics provides valuable insights and allows for more robust analysis, classification, and comparison.

    Breaking and Fixing Secure Similarity Approximations: Dealing with Adversarially Perturbed Inputs

    Computing similarity between data is a fundamental problem in information retrieval and data mining. To address the relevant performance and scalability challenges, approximation methods are employed for large-scale similarity computation. A common characteristic of all privacy-preserving approximation protocols based on sketching is that the sketching is performed locally and is based on common randomness. In the semi-honest model, the input to the sketching algorithm is independent of the common randomness. We, however, consider a new threat model where a party is allowed to use the common randomness to perturb her input (1) offline and (2) before the execution of any secure protocol, so as to steer the approximation result to a maliciously chosen output. We formally define perturbation attacks under this adversarial model and propose two attacks on the well-studied techniques of minhash and cosine sketching. We demonstrate the power of perturbation attacks by measuring their success on synthetic and real data. To mitigate such perturbation attacks, we propose a server-aided architecture, where an additional party, the server, assists in the secure similarity approximation by handling the common randomness as private data. We revise and introduce the necessary secure protocols to apply the minhash and cosine sketching techniques in the server-aided architecture. Our implementation demonstrates that this new design can mitigate offline perturbation attacks without sacrificing the efficiency and scalability of the reconstruction protocol.
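
    As background for the minhash attack, the sketch below shows plain minhash similarity estimation in Python; the shared per-position seeds play the role of the protocol's common randomness. In the paper's threat model, a party who holds these seeds can perturb her own set offline to steer the collision rate, which is why the proposed server-aided design withholds the randomness from the parties. Names are illustrative.

        import hashlib

        def _h(seed, x):
            """Seeded hash of item x; the seeds are the common randomness."""
            digest = hashlib.sha256(f"{seed}:{x}".encode()).digest()
            return int.from_bytes(digest[:8], "big")

        def minhash_signature(items, num_hashes=128):
            """One minimum per seeded hash function, computed locally."""
            return [min(_h(seed, x) for x in items) for seed in range(num_hashes)]

        def jaccard_estimate(sig_a, sig_b):
            """The fraction of colliding minima estimates |A ∩ B| / |A ∪ B|."""
            return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

        a = {f"doc-{i}" for i in range(1000)}
        b = {f"doc-{i}" for i in range(500, 1500)}  # true Jaccard = 1/3
        print(jaccard_estimate(minhash_signature(a), minhash_signature(b)))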

    Cryptographic protocols for privacy-aware and secure e-commerce

    This thesis is about the research and development of privacy-enhancing technologies that give consumers of electronic commerce services control over how much private information they share with service providers. We use both known and newly developed technologies to protect users against excessive data collection by service providers in specific applications. Namely, we use a novel identity-based dynamic threshold signature scheme and a novel key management scheme to implement a group-size accreditation mechanism, which reveals nothing about group members except the size of the group, to support group discounts. Next, we use a novel construction based on blind signatures, zero-knowledge proofs, and generalization techniques to implement a privacy-preserving loyalty program. Finally, we use multiparty computation protocols to implement implicit authentication mechanisms that do not disclose private information about the users to the service providers.

    Annales Mathematicae et Informaticae 2015


    Algebraic Structures of Neutrosophic Triplets, Neutrosophic Duplets, or Neutrosophic Multisets

    Neutrosophy (1995) is a new branch of philosophy that studies triads of the form (<A>, <neutA>, <antiA>), where <A> is an entity (i.e., element, concept, idea, theory, logical proposition, etc.), <antiA> is the opposite of <A>, while <neutA> is the neutral (or indeterminate) between them, i.e., neither <A> nor <antiA>. Based on neutrosophy, the neutrosophic triplets were founded; they have a similar form (x, neut(x), anti(x)) and satisfy several axioms, for each element x in a given set.

    This collective book presents original research papers by many neutrosophic researchers from around the world, reporting on the state of the art and recent advancements in neutrosophic triplets, neutrosophic duplets, neutrosophic multisets, and their algebraic structures, which were defined in 2016 and have since gained interest from researchers worldwide. Connections between classical algebraic structures and neutrosophic triplet/duplet/multiset structures are also studied, along with numerous neutrosophic applications in various fields, such as multi-criteria decision making, image segmentation, medical diagnosis, fault diagnosis, data clustering, neutrosophic probability, human resource management, strategic planning, forecasting models, multi-granulation, supplier selection problems, typhoon disaster evaluation, skin lesion detection, and mining algorithms for big data analysis.
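
    To make the triplet axioms concrete, here is a small Python illustration (ours, not from the book) that brute-forces neutrosophic triplets (x, neut(x), anti(x)) in (Z_n, · mod n), using the defining equations x·neut(x) = x and x·anti(x) = neut(x), with neut(x) different from the classical identity 1.

        def neutrosophic_triplets(n):
            """All neutrosophic triplets (x, neut, anti) in (Z_n, * mod n).

            neut must satisfy x*neut == x while differing from the classical
            identity 1, and anti must satisfy x*anti == neut. Multiplication
            mod n is commutative, so the two-sided axioms reduce to these.
            """
            triplets = []
            for x in range(n):
                for neut in range(n):
                    if neut != 1 and (x * neut) % n == x:
                        for anti in range(n):
                            if (x * anti) % n == neut:
                                triplets.append((x, neut, anti))
            return triplets

        # Example: in (Z_10, *), 4*6 = 24 ≡ 4 and 4*4 = 16 ≡ 6 (mod 10),
        # so (4, 6, 4) is a neutrosophic triplet.
        print([t for t in neutrosophic_triplets(10) if t[0] == 4])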

    Applications

    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production, and milling for quality control during manufacturing processes; and in traffic and logistics for smart cities and for mobile communications.