    StAN: exploiting shared interests without disclosing them in gossip-based publish/subscribe

    Publish/subscribe mechanisms for scalable event dissemination are a core component of many distributed systems, ranging from EAI middleware to news dissemination on the Internet. Hence, a lot of research has been done on overlay networks for efficient decentralized topic-based routing. Specifically, in gossip-based dissemination, bringing nodes with shared interests closer together in the overlay makes dissemination more efficient. Unfortunately, this usually requires fully disclosing interests to nearby nodes and impacts reliability due to clustering. In this paper we address this by starting with multiple overlays, one for each subscribed topic, that then separately self-organize to share a large number of physical connections, thereby reducing message traffic and maintenance overhead. This is achieved without a node ever disclosing an interest to another node that does not share it, and without impacting the robustness of the overlay. Besides presenting the overlay maintenance protocol, we evaluate it using simulation to validate our results.
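
    The interest-hiding property can be illustrated with a toy matching step: a node advertises salted digests of its subscriptions, so a peer can recognize only the topics it already subscribes to itself. This is a minimal sketch of the idea, assuming a simple hash-based exchange; the function names and message format are invented for illustration and are not StAN's actual mechanism (and, with guessable topic names, a real protocol would need stronger protection than a plain salted hash).

        import hashlib, os

        # Node A publishes salted digests of its topics instead of the names.
        # Node B can test membership only for topics it also holds; A's other
        # interests stay hidden behind the hash.

        def digest(salt: bytes, topic: str) -> bytes:
            return hashlib.sha256(salt + topic.encode()).digest()

        def advertise(my_topics: set[str]) -> tuple[bytes, set[bytes]]:
            """Publish salted digests of our subscriptions, not topic names."""
            salt = os.urandom(16)
            return salt, {digest(salt, t) for t in my_topics}

        def shared_topics(my_topics: set[str], salt: bytes,
                          peer_digests: set[bytes]) -> set[str]:
            """Recover only the topics we also subscribe to ourselves."""
            return {t for t in my_topics if digest(salt, t) in peer_digests}

        # B learns it shares {"sports"} with A, and nothing about "finance".
        salt, adv = advertise({"sports", "finance"})
        print(shared_topics({"sports", "weather"}, salt, adv))  # {'sports'}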

    Scaling up publish/subscribe overlays using interest correlation for link sharing

    Topic-based publish/subscribe is at the core of many distributed systems, ranging from application integration middleware to news dissemination. Therefore, much research has been dedicated to publish/subscribe architectures and protocols, and in particular to the design of overlay networks for decentralized topic-based routing and efficient message dissemination. Nonetheless, existing systems fail to take full advantage of shared interests when disseminating information, hence suffering from high maintenance and traffic costs, or construct overlays that cope poorly with the scale and dynamism of large networks. In this paper we present StaN, a decentralized protocol that optimizes the properties of gossip-based overlay networks for topic-based publish/subscribe by sharing a large number of physical connections without disrupting their logical properties. StaN relies only on local knowledge and operates by leveraging common interests among participants to improve global resource usage and promote topic and event scalability. The experimental evaluation under two real workloads, both via a real deployment and through simulation, shows that StaN provides an attractive infrastructure for scalable topic-based publish/subscribe.
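
    The link-sharing effect itself is easy to picture: edges of several per-topic overlays that point to the same peer collapse onto a single physical connection, so per-connection maintenance (heartbeats, failure detection) is paid once per peer rather than once per topic. The sketch below is a minimal bookkeeping structure under that assumption; it is illustrative and not StaN's internal representation.

        from collections import defaultdict

        # Several logical (per-topic) overlay edges to the same peer are
        # multiplexed over one physical connection.

        class LinkTable:
            def __init__(self):
                self.topics_on_link = defaultdict(set)  # peer -> topics on it

            def attach(self, peer: str, topic: str):
                self.topics_on_link[peer].add(topic)

            def physical_links(self) -> int:
                return len(self.topics_on_link)

            def logical_links(self) -> int:
                return sum(len(ts) for ts in self.topics_on_link.values())

        table = LinkTable()
        overlays = {"t1": ["a", "b"], "t2": ["a", "c"], "t3": ["a", "b"]}
        for topic, peers in overlays.items():
            for p in peers:
                table.attach(p, topic)
        # 6 logical overlay edges collapse onto 3 physical connections.
        print(table.logical_links(), table.physical_links())  # 6 3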

    Outcomes from elective colorectal cancer surgery during the SARS-CoV-2 pandemic

    This study aimed to describe the change in surgical practice and the impact of SARS-CoV-2 on mortality after surgical resection of colorectal cancer during the initial phases of the SARS-CoV-2 pandemic.

    Implementing a linear algebra approach to data processing

    Data analysis is among the main strategies of our time for enterprises to take advantage of the vast amounts of data their systems generate and store every day. The standard relational database model is thus constantly challenged to cope with quantitative operations over a traditionally qualitative, relational model. A novel approach to the semantics of data is based on (typed) linear algebra (LA), rather than relational algebra, bridging the gap between data dimensions and data measures in a unified way. This also bears the promise of increased parallelism, as most operations in LA admit a 'divide & conquer' implementation. This paper presents a first experiment in implementing such a typed linear algebra approach and testing its performance on a distributed data system. It presents solutions to some theoretical limitations and evaluates the overall performance.
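
    The core idea of the typed LA approach can be shown on a toy query: a qualitative column is encoded as a 0/1 projection matrix, and a SUM ... GROUP BY becomes a matrix-vector product, exactly the kind of operation that admits a 'divide & conquer' implementation. A minimal sketch, assuming a plain NumPy encoding rather than the paper's typed-matrix library:

        import numpy as np

        city  = ["lisbon", "porto", "lisbon", "braga", "porto"]  # dimension
        sales = np.array([10.0, 5.0, 7.0, 3.0, 8.0])             # measure

        labels = sorted(set(city))
        # P[k, i] == 1 iff row i has the k-th city; each column sums to 1.
        P = np.array([[1.0 if c == lab else 0.0 for c in city]
                      for lab in labels])

        # SELECT city, SUM(sales) GROUP BY city, as one matrix product.
        # Splitting the columns of P (and sales) across workers and summing
        # partial products gives the 'divide & conquer' parallel version.
        print(dict(zip(labels, P @ sales)))
        # {'braga': 3.0, 'lisbon': 17.0, 'porto': 13.0}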

    Similarity aware shuffling for the distributed execution of SQL window functions

    Window functions are extremely useful and have become increasingly popular, allowing ranking, cumulative sums and other analytic aggregations to be computed over a highly flexible and configurable sliding window. This powerful expressiveness naturally comes at the expense of heavy computational requirements which, so far, have been addressed through optimizations around centralized approaches, by works from both industry and academia. Distribution and parallelization have the potential to improve performance, but introduce several challenges associated with data distribution that may harm data locality. In this paper, we show how data similarity can be employed across partitions during the distributed execution of these operators to improve data co-locality between instances of a Distributed Query Engine and the associated data storage nodes. Our contribution attains network gains of 3 times on average and is expected to scale as the number of instances increases. In the scenario with 8 nodes, we were able to attain bandwidth and time savings of 7.3 times and 2.61 times, respectively. This work was funded by the European Regional Development Fund (ERDF) through NORTE 2020.
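
    One way to picture similarity-aware shuffling: instead of blindly hash-partitioning rows, the shuffle routes each PARTITION BY key to the worker that already holds most rows for that key, so the window operator finds its data locally. The placement statistics and tie-breaking below are illustrative assumptions, not the engine's actual policy:

        from collections import Counter, defaultdict

        def plan_shuffle(rows, placement, n_workers):
            """rows: (partition_key, payload) pairs.
            placement: key -> Counter mapping worker -> rows already held."""
            assignment = defaultdict(list)
            for key, payload in rows:
                held = placement.get(key)
                # Prefer the worker that already stores most of this key;
                # fall back to plain hash partitioning for unseen keys.
                worker = (held.most_common(1)[0][0] if held
                          else hash(key) % n_workers)
                assignment[worker].append((key, payload))
            return assignment

        placement = {"acct-1": Counter({0: 900, 1: 100}),
                     "acct-2": Counter({2: 800})}
        rows = [("acct-1", "r1"), ("acct-2", "r2"), ("acct-3", "r3")]
        # acct-1 goes to worker 0 and acct-2 to worker 2, avoiding transfer
        # of the rows those workers already hold.
        print(plan_shuffle(rows, placement, n_workers=3))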

    EpTO: An epidemic total order algorithm for large-scale distributed systems

    The ordering of events is a fundamental problem of distributed computing and has been extensively studied over several decades. Of all the available orderings, total ordering is of particular interest as it provides a powerful abstraction for building reliable distributed applications. Unfortunately, deterministic total order algorithms scale poorly and are therefore unfit for modern large-scale applications. The main contribution of this paper is EpTO, a total order algorithm with probabilistic agreement that scales both in the number of processes and in the number of events. EpTO provides deterministic safety and probabilistic liveness: integrity, total order and validity are always preserved, while agreement is achieved with arbitrarily high probability. We show that EpTO is well-suited for large-scale dynamic distributed systems: it requires neither a global clock nor synchronized processes, and it is highly robust even when the network suffers from large delays and significant churn and message loss.
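
    The ordering core of EpTO can be sketched compactly: each event carries a logical timestamp and a time-to-live that grows as it is relayed; once the TTL passes a stability threshold the event is assumed to have reached everyone, and stable events are delivered in timestamp order, but never ahead of any still-unstable event. The constants and the stability rule below are simplified relative to the paper:

        TTL_STABLE = 4  # rounds after which an event is assumed global (toy)

        class Orderer:
            def __init__(self):
                self.pending = {}  # event id -> [timestamp, ttl]

            def on_ball(self, ball):
                """Merge a gossip ball, keeping the highest TTL per event."""
                for eid, ts, ttl in ball:
                    if eid in self.pending:
                        self.pending[eid][1] = max(self.pending[eid][1], ttl)
                    else:
                        self.pending[eid] = [ts, ttl]

            def round_tick(self):
                """Age events; deliver stable ones in timestamp order, but
                never past an unstable event that could still sort earlier."""
                for rec in self.pending.values():
                    rec[1] += 1
                unstable = [ts for ts, ttl in self.pending.values()
                            if ttl < TTL_STABLE]
                horizon = min(unstable) if unstable else float("inf")
                ready = sorted((ts, eid)
                               for eid, (ts, ttl) in self.pending.items()
                               if ttl >= TTL_STABLE and ts < horizon)
                for _, eid in ready:
                    del self.pending[eid]
                return ready

        o = Orderer()
        o.on_ball([("e1", 7, 0), ("e2", 3, 0)])
        for r in range(5):
            delivered = o.round_tick()
            if delivered:
                print(r, delivered)  # e2 (ts 3) delivered before e1 (ts 7)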

    Practical evaluation of large scale applications

    Designing and implementing distributed systems is a hard endeavor, both at the abstract level when designing the system and at the concrete level when implementing, debugging and evaluating it. This stems not only from the inherent complexity of writing and reasoning about distributed software, but also from the lack of tools for testing and evaluating it under realistic conditions. Moreover, the gap between the protocol specifications found in research papers and their implementations in real code is huge, leading to inconsistencies that often result in the implementation no longer following the specification. As an example, the specification of the popular Chord DHT comprises a few dozen lines, while its Java implementation, OpenChord, is close to twenty thousand lines, excluding libraries. This makes it hard and error-prone to change the implementation to reflect changes in the specification, regardless of programmers' skill. Besides, critical behavior due to the unpredictable interleaving of operations and network uncertainty can only be observed in a realistic setting, limiting the usefulness of simulation tools. We believe that being able to write an algorithm implementation very close to its specification, and to evaluate it in a real environment, is a big step towards building better distributed systems. Our approach leverages the MINHA platform to offer a set of built-in primitives that allow one to program very close to pseudo-code. This high-level implementation can interact with off-the-shelf existing middleware and can be gradually replaced by a production-ready Java implementation. In this paper, we present the system design and showcase it using a well-known algorithm from the literature.
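
    The gap the platform targets is the distance between a one-line pseudo-code step such as "every round, push state to a random peer" and the sockets, threads and serialization needed to run it. The sketch below shows what programming against such high-level primitives could look like, using a toy in-process runtime; the members/send/deliver-style primitives are invented for illustration and are not MINHA's actual API:

        import random

        class Runtime:
            """Toy stand-in for a platform that hides networking details."""
            def __init__(self):
                self.nodes = {}

            def register(self, node):
                self.nodes[node.id] = node

            def members(self, exclude):
                return [n for n in self.nodes if n != exclude]

            def send(self, dest, sender, msg):
                self.nodes[dest].deliver(sender, msg)

            def round(self):
                for node in list(self.nodes.values()):
                    node.gossip()  # stands in for "every round do gossip()"

        class Node:
            # The protocol body stays as short as its specification.
            def __init__(self, rt, nid):
                self.rt, self.id, self.known = rt, nid, {nid}
                rt.register(self)

            def gossip(self):
                peer = random.choice(self.rt.members(exclude=self.id))
                self.rt.send(peer, self.id, ("PUSH", self.known))

            def deliver(self, sender, msg):
                kind, payload = msg
                if kind == "PUSH":
                    self.known = self.known | payload

        rt = Runtime()
        nodes = [Node(rt, i) for i in range(5)]
        for _ in range(6):
            rt.round()
        print(sorted(nodes[0].known))  # with high probability, all of 0..4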

    Performance trade-offs on a secure multi-party relational database

    The privacy of information is an increasing concern for users of software applications. This concern has been fueled by attacks on cloud services over the last few years that have leaked confidential information such as passwords, emails and even private pictures. Once the information is leaked, users and software applications are powerless to contain its spread and misuse. As the central component where applications store almost all of their data, databases are one of the most common targets of attacks. However, typical database deployments do not leverage security mechanisms to stop attacks and do not apply cryptographic schemes to protect data. This issue has been tackled by multiple secure databases that provide trade-offs between security, query capabilities and performance. Despite providing stronger security guarantees, the proposed solutions still entrust their data to a single entity that can be corrupted or hacked. Secret sharing can solve this problem by splitting data into multiple shares and storing each share at a different location. The division is done in such a way that if one location is hacked, no information is leaked. Depending on the protocols used to divide the data, functions can be computed over it through secure protocols that neither disclose information nor learn which values are being computed. We propose a SQL database prototype capable of offering a trade-off between security and query latency by using different secure protocols. An evaluation of the protocols is also performed, showing that our most relaxed protocol improves query latency by 5% over the original protocol. This work was partially funded by project "NORTE-01-0145-FEDER-000016", financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF).
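
    The secret-sharing idea at the heart of this design can be shown with additive sharing over a prime field: a value is split into n random-looking shares stored at different sites, any subset of fewer than n shares reveals nothing, and sums can be computed share-wise without ever reconstructing the inputs. A minimal sketch, with an illustrative field modulus and column layout:

        import secrets

        P = 2**61 - 1  # a Mersenne prime as the field modulus (illustrative)

        def share(value: int, n: int) -> list[int]:
            """Split value into n additive shares; any n-1 look random."""
            parts = [secrets.randbelow(P) for _ in range(n - 1)]
            return parts + [(value - sum(parts)) % P]

        def reconstruct(shares: list[int]) -> int:
            return sum(shares) % P

        # Each site sums its shares of a column locally; only the final
        # aggregate is ever opened, never an individual value.
        salaries = [1200, 3400, 2250]
        per_site = [share(s, 3) for s in salaries]            # row -> shares
        site_sums = [sum(col) % P for col in zip(*per_site)]  # one per site
        print(reconstruct(site_sums))  # 6850, with no site seeing a salary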

    Resource usage prediction in distributed key-value datastores

    In order to attain the promises of the Cloud Computing paradigm, systems need to be able to transparently adapt to environment changes. Such behavior benefits from the ability to predict those changes in order to handle them seamlessly. In this paper, we present a mechanism to accurately predict the resource usage of distributed key-value datastores. Our mechanism requires offline training but, in contrast with other approaches, it is sufficient to run it only once per hardware configuration and subsequently use it for online prediction of database performance under any circumstances. The mechanism accurately estimates the database resource usage for any request distribution with an average accuracy of 94%, knowing only two parameters: (i) the cache hit ratio; and (ii) the incoming throughput. Both input values can be observed in real time or synthesized for request allocation decisions. This novel approach is sufficiently simple and generic, while simultaneously being suitable for other practical applications.
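
    The prediction mechanism reduces to a small supervised model: fit, offline and once per hardware configuration, a mapping from (cache hit ratio, incoming throughput) to observed resource usage, then query it online. The sketch below uses a plain least-squares fit on synthetic numbers as a stand-in for whatever model is actually trained:

        import numpy as np

        # Offline phase: benchmark runs on this hardware profile.
        train = np.array([
            # hit ratio, ops/s, observed CPU % (synthetic numbers)
            [0.10, 1000, 62.0],
            [0.50, 1000, 41.0],
            [0.90, 1000, 23.0],
            [0.10, 2000, 88.0],
            [0.50, 2000, 60.0],
            [0.90, 2000, 35.0],
        ])
        X = np.column_stack([np.ones(len(train)), train[:, 0], train[:, 1]])
        coef, *_ = np.linalg.lstsq(X, train[:, 2], rcond=None)

        # Online phase: both inputs are observable in real time, or can be
        # synthesized when deciding where to allocate requests.
        def predict_cpu(hit_ratio: float, throughput: float) -> float:
            return float(coef @ [1.0, hit_ratio, throughput])

        print(round(predict_cpu(0.70, 1500), 1))  # estimate for an unseen mix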

    Characterisation of microbial attack on archaeological bone

    As part of an EU-funded project to investigate the factors influencing bone preservation in the archaeological record, more than 250 bones from 41 archaeological sites in five countries spanning four climatic regions were studied for diagenetic alteration. Sites were selected to cover a range of environmental conditions and archaeological contexts. Microscopic and physical (mercury intrusion porosimetry) analyses of these bones revealed that the majority (68%) had suffered microbial attack. Furthermore, significant differences were found between animal and human bone in both the state of preservation and the type of microbial attack present. These differences in preservation might result from differences in the early taphonomy of the bones.