53 research outputs found

    Indexing structures for the PLS blockchain

    Get PDF
    © The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, to view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.This paper studies known indexing structures from a new point of view: minimisation of data exchange between an IoT device acting as a blockchain client and the blockchain server running a protocol suite that includes two Guy Fawkes protocols, PLS and SLVP. The PLS blockchain is not a cryptocurrency instrument; it is an immutable ledger offering guaranteed non-repudiation to low-power clients without use of public key crypto. The novelty of the situation is in the fact that every PLS client has to obtain a proof of absence in all blocks of the chain to which its counterparty does not contribute, and we show that it is possible without traversing the block’s Merkle tree. We obtain weight statistics of a leaf path on a sparse Merkle tree theoretically, as our ground case. Using the theory we quantify the communication cost of a client interacting with the blockchain. We show that large savings can be achieved by providing a bitmap index of the tree compressed using Tunstall’s method. We further show that even in the case of correlated access, as in two IoT devices posting messages for each other in consecutive blocks, it is possible to prevent compression degradation by re-randomising the IDs using a pseudorandom bijective function. We propose a low-cost function of this kind and evaluate its quality by simulation, using the avalanche criterion.Peer reviewedFinal Published versio

    Auditable and performant Byzantine consensus for permissioned ledgers

    Get PDF
    Permissioned ledgers allow users to execute transactions against a data store, and retain proof of their execution in a replicated ledger. Each replica verifies the transactions’ execution and ensures that, in perpetuity, a committed transaction cannot be removed from the ledger. Unfortunately, this is not guaranteed by today’s permissioned ledgers, which can be re-written if an arbitrary number of replicas collude. In addition, the transaction throughput of permissioned ledgers is low, hampering real-world deployments, by not taking advantage of multi-core CPUs and hardware accelerators. This thesis explores how permissioned ledgers and their consensus protocols can be made auditable in perpetuity; even when all replicas collude and re-write the ledger. It also addresses how Byzantine consensus protocols can be changed to increase the execution throughput of complex transactions. This thesis makes the following contributions: 1. Always auditable Byzantine consensus protocols. We present a permissioned ledger system that can assign blame to individual replicas regardless of how many of them misbehave. This is achieved by signing and storing consensus protocol messages in the ledger and providing clients with signed, universally-verifiable receipts. 2. Performant transaction execution with hardware accelerators. Next, we describe a cloud-based ML inference service that provides strong integrity guarantees, while staying compatible with current inference APIs. We change the Byzantine consensus protocol to execute machine learning (ML) inference computation on GPUs to optimize throughput and latency of ML inference computation. 3. Parallel transactions execution on multi-core CPUs. Finally, we introduce a permissioned ledger that executes transactions, in parallel, on multi-core CPUs. We separate the execution of transactions between the primary and secondary replicas. The primary replica executes transactions on multiple CPU cores and creates a dependency graph of the transactions that the backup replicas utilize to execute transactions in parallel.Open Acces

    Privacy, Access Control, and Integrity for Large Graph Databases

    Get PDF
    Graph data are extensively utilized in social networks, collaboration networks, geo-social networks, and communication networks. Their growing usage in cyberspaces poses daunting security and privacy challenges. Data publication requires privacy-protection mechanisms to guard against information breaches. In addition, access control mechanisms can be used to allow controlled sharing of data. Provision of privacy-protection, access control, and data integrity for graph data require a holistic approach for data management and secure query processing. This thesis presents such an approach. In particular, the thesis addresses two notable challenges for graph databases, which are: i) how to ensure users\u27 privacy in published graph data under an access control policy enforcement, and ii) how to verify the integrity and query results of graph datasets. To address the first challenge, a privacy-protection framework under role-based access control (RBAC) policy constraints is proposed. The design of such a framework poses a trade-off problem, which is proved to be NP-complete. Novel heuristic solutions are provided to solve the constraint problem. To the best of our knowledge, this is the first scheme that studies the trade-off between RBAC policy constraints and privacy-protection for graph data. To address the second challenge, a cryptographic security model based on Hash Message Authentic Codes (HMACs) is proposed. The model ensures integrity and completeness verification of data and query results under both two-party and third-party data distribution environments. Unique solutions based on HMACs for integrity verification of graph data are developed and detailed security analysis is provided for the proposed schemes. Extensive experimental evaluations are conducted to illustrate the performance of proposed algorithms

    Stochastic Tree Models for Macroevolution: Development, Validation and Application

    Get PDF
    Phylogenetic trees capture the relationships between species and can be investigated by morphological and/or molecular data. When focusing on macroevolution, one considers the large-scale history of life with evolutionary changes affecting a single species of the entire clade leading to the enormous diversity of species obtained today. One major problem of biology is the explanation of this biodiversity. Therefore, one may ask which kind of macroevolutionary processes have given rise to observable tree shapes or patterns of species distribution which refers to the appearance of branching orders and time periods. Thus, with an increasing number of known species in the context of phylogenetic studies, testing hypotheses about evolution by analyzing the tree shape of the resulting phylogenetic trees became matter of particular interest. The attention of using those reconstructed phylogenies for studying evolutionary processes increased during the last decades. Many paleontologists (Raup et al., 1973; Gould et al., 1977; Gilinsky and Good, 1989; Nee, 2004) tried to describe such patterns of macroevolution by using models for growing trees. Those models describe stochastic processes to generate phylogenetic trees. Yule (1925) was the first who introduced such a model, the Equal Rate Markov (ERM) model, in the context of biological branching based on a continuous-time, uneven branching process. In the last decades, further dynamical models were proposed (Yule, 1925; Aldous, 1996; Nee, 2006; Rosen, 1978; Ford, 2005; Hernández-García et al., 2010) to address the investigation of tree shapes and hence, capture the rules of macroevolutionary forces. A common model, is the Aldous\\\'' Branching (AB) model, which is known for generating trees with a similar structure of \\\"real\\\" trees. To infer those macroevolutionary forces structures, estimated trees are analyzed and compared to simulated trees generated by models. There are a few drawbacks on recent models such as a missing biological motivation or the generated tree shape does not fit well to one observed in empirical trees. The central aim of this thesis is the development and study of new biologically motivated approaches which might help to better understand or even discover biological forces which lead to the huge diversity of organisms. The first approach, called age model, can be defined as a stochastic procedure which describes the growth of binary trees by an iterative stochastic attachment of leaves, similar to the ERM model. At difference with the latter, the branching rate at each clade is no longer constant, but decreasing in time, i.e., with the age. Thus, species involved in recent speciation events have a tendency to speciate again. The second introduced model, is a branching process which mimics the evolution of species driven by innovations. The process involves a separation of time scales. Rare innovation events trigger rapid cascades of diversification where a feature combines with previously existing features. The model is called innovation model. Three data sets of estimated phylogenetic trees are used to analyze and compare the produced tree shape of the new growth models. A tree shape statistic considering a variety of imbalance measurements is performed. Results show that simulated trees of both growth models fit well to the tree shape observed in real trees. In a further study, a likelihood analysis is performed in order to rank models with respect to their ability to explain observed tree shapes. Results show that the likelihoods of the age model and the AB model are clearly correlated under the trees in the databases when considering small and medium-sized trees with up to 19 leaves. For a data set, representing of phylogenetic trees of protein families, the age model outperforms the AB model. But for another data set, representing phylogenetic trees of species, the AB model performs slightly better. To support this observation a further analysis using larger trees is necessary. But an exact computation of likelihoods for large trees implies a huge computational effort. Therefore, an efficient method for likelihood estimation is proposed and compared to the estimation using a naive sampling strategy. Nevertheless, both models describe the tree generation process in a way which is easy to interpret biologically. Another interesting field of research in biology is the coevolution between species. This is the interaction of species across groups such that the evolution of a species from one group can be triggered by a species from another group. Most prominent examples are systems of host species and their associated parasites. One problem is the reconciliation of the common history of both groups of species and to predict the associations between ancestral hosts and their parasites. To solve this problem some algorithmic methods have been developed in recent years. But only a few host parasite systems have been analyzed in sufficient detail which makes an evaluation of these methods complex. Within the scope of coevolution, the proposed age model is applied to the generation of cophylogenies to evaluate such host parasite reconciliation methods. The presented age model as well as the innovation model produce tree shapes which are similar to obtained tree structures of estimated trees. Both models describe an evolutionary dynamics and might provide a further opportunity to infer macroevolutionary processes which lead to the biodiversity which can be obtained today. Furthermore with the application of the age model in the context of coevolution by generating a useful benchmark set of cophylogenies is a first step towards systematic studies on evaluating reconciliation methods

    Towards a New Generation of Permissioned Blockchain Systems

    Get PDF
    With the release of Satoshi Nakamoto's Bitcoin system in 2008 a new decentralized computation paradigm, known as blockchain, was born. Bitcoin promised a trading network for virtual coins, publicly available for anyone to participate in but owned by nobody. Any participant could propose a transaction and a lottery mechanism decided in which order these transactions would be recorded in a ledger with an elegant mechanism to prevent double spending. The remarkable achievement of Nakamoto's protocol was that participants did not have to trust each other to behave correctly for it to work. As long as more than half of the network participants adhered to the correct code, the recorded transactions on the ledger would both be valid and immutable. Ethereum, as the next major blockchain to appear, improved on the initial idea by introducing smart contracts, which are decentralized Turing-complete stored procedures, thus making blockchain technology interesting for the enterprise setting. However, its intrinsically public data and prohibitive energy costs needed to be overcome. This gave rise to a new type of systems called permissioned blockchains. With these, access to the ledger is restricted and trust assumptions about malicious behaviour have been weakened, allowing more efficient consensus mechanisms to find a global order of transactions. One of the most popular representatives of this kind of blockchain is Hyperledger Fabric. While it is much faster and more energy efficient than permissionless blockchains, it has to compete with conventional distributed databases in the enterprise sector. This thesis aims to mitigate Fabric's three major shortcomings. First, compared to conventional database systems, it is still far too slow. This thesis shows how the performance can be increased by a factor of seven by redesigning the transaction processing pipeline and introducing more efficient data structures. Second, we present a novel solution to Fabric's intrinsic problem of a low throughput for workloads with transactions that access the same data. This is achieved by analyzing the dependencies of transactions and selectively re-executing transactions when a conflict is detected. Third, this thesis tackles the preservation of private data. Even though access to the blockchain as a whole can be restricted, in a setting where multiple enterprises collaborate this is not sufficient to protect sensitive proprietary data. Thus, this thesis introduces a new privacy-preserving blockchain protocol based on network sharding and targeted data dissemination. It also introduces an additional layer of abstraction for the creation of transactions and interaction with data on the blockchain. This allows developers to write applications without the need for low-level knowledge of the internal data structure of the blockchain system. In summary, this thesis addresses the shortcomings of the current generation of permission blockchain systems

    LNCS

    Get PDF
    This paper studies the concrete security of PRFs and MACs obtained by keying hash functions based on the sponge paradigm. One such hash function is KECCAK, selected as NIST’s new SHA-3 standard. In contrast to other approaches like HMAC, the exact security of keyed sponges is not well understood. Indeed, recent security analyses delivered concrete security bounds which are far from existing attacks. This paper aims to close this gap. We prove (nearly) exact bounds on the concrete PRF security of keyed sponges using a random permutation. These bounds are tight for the most relevant ranges of parameters, i.e., for messages of length (roughly) l ≤ min{2n/4, 2r} blocks, where n is the state size and r is the desired output length; and for l ≤ q queries (to the construction or the underlying permutation). Moreover, we also improve standard-model bounds. As an intermediate step of independent interest, we prove tight bounds on the PRF security of the truncated CBC-MAC construction, which operates as plain CBC-MAC, but only returns a prefix of the output

    UNDERSTAND, DETECT, AND BLOCK MALWARE DISTRIBUTION FROM A GLOBAL VIEWPOINT

    Get PDF
    Malware still is a vital security threat. Adversaries continue to distribute various types of malicious programs to victims around the world. In this study, we try to understand the strategies the miscreants take to distribute malware, develop systems to detect malware delivery and explore the benefit of a transparent platform for blocking malware distribution in advance. At the first part of the study, to understand the malware distribution, we conduct several measurements. We initiate the study by investigating the dynamics of malware delivery. We share several findings including the downloaders responsible for the malware delivery and the high ratio of signed malicious downloaders. We further look into the problem of signed malware. To successfully distribute malware, the attacker exploits weaknesses in the code-signing PKI, which falls into three categories: inadequate client-side protections, publisher-side key mismanagement, and CA-side verification failures. We propose an algorithm to identify malware that exploits those weaknesses and to classify to the corresponding weakness. Using the algorithm, We conduct a systematic study of the weaknesses of code-signing PKI on a large scale. Then, we move to the problem of revocation. Certificate revocation is the primary defense against the abuse in code-signing PKI. We identify the effective revocation process, which includes the discovery of compromised certificates, the revocation date setting, and the dissemination of revocation information; moreover, we systematically measure the problems in the revocation process and new threats introduced by these problems. For the next part, we explore two different approaches to detect the malware distribution. We study the executable files known as downloader Trojans or droppers, which are the core of the malware delivery techniques. The malware delivery networks instruct these downloaders across the Internet to access a set of DNS domain address to retrieve payloads. We first focus on the downloaded by relationship between a downloader and a payload recorded by different sensors and introduce the downloader graph abstraction. The downloader graph captures the download activities across end hosts and exposes large parts of the malware download activity, which may otherwise remain undetected, by connecting the dots. By combining telemetry from anti-virus and intrusion-prevention systems, we perform a large-scale analysis on 19 million downloader graphs from 5 million real hosts. The analysis revealed several strong indicators of malicious activity, such as the slow growth rate and the high diameter. Moreover, we observed that, besides the local indicators, taking into account the global properties boost the performance in distinguishing between malicious and benign download activity. For example, the file prevalence (i.e., the number of hosts a file appears on) and download patterns (e.g., number of files downloaded per domain) are different from malicious to benign download activities. Next, we target the silent delivery campaigns, which is the critical method for quickly delivering malware or potentially unwanted programs (PUPs) to a large number of hosts at scale. Such large-scale attacks require coordination activities among multiple hosts involved in malicious activity. We developed Beewolf, a system for detecting silent delivery campaigns from Internet-wide records of download events. We exploit the behavior of downloaders involved in campaigns for this system: they operate in lockstep to retrieve payloads. We utilize Beewolf to identify these locksteps in an unsupervised and deterministic fashion at scale. Moreover, the lockstep detection exposes the indirect relationships among the downloaders. We investigate the indirect relationships and present novel findings such as the overlap between the malware and PUP ecosystem. The two different studies revealed the problems caused by the opaque software distribution ecosystem and the importance of the global properties in detecting malware distribution. To address both of these findings, we propose a transparent platform for software distribution called Download Transparency. Transparency guarantees openness and accountability of the data, however, itself does not provide any security guarantees. Although there exists an anecdotal example showing the benefit of transparency, it is still not clear how beneficial it is to security. In the last part of this work, we explore the benefit of transparency in the domain of downloads. To measure the performance, we designed the participants and the policies they might take when utilizing the platform. We then simulate different policies with five years of download events and measure the block performance. The results suggest that the Download Transparency can help to block a significant part of the malware distribution before the community can flag it as malicious
    • …
    corecore