
    The Replica Consistency Problem in Data Grids

    Fast and reliable data access is a crucial aspect of distributed computing and is often achieved using data replication techniques. In Grid architectures, data are replicated on many nodes of the Grid, and users usually access the "best" replica in terms of availability and network latency. When replicas are modifiable, a change made to one replica breaks consistency with the other replicas, which at that point become stale. Replica synchronisation protocols exist and are applied in several distributed architectures, for example distributed databases. Grid middleware solutions provide well-established support for replicating data. Nevertheless, replicas are still considered read-only, and no support is provided to the user for updating a replica while maintaining consistency with the other replicas. In this thesis, carried out in collaboration with the Italian National Institute of Nuclear Physics (INFN) and the European Organisation for Nuclear Research (CERN), we study the replica consistency problem in Grid computing and propose a service, called CONStanza, that is able to synchronise both files and heterogeneous (different-vendor) databases in a Grid environment. We analyse and implement a specific use case that arises in high-energy physics, where conditions databases are replicated using databases from different vendors. We provide detailed performance results and show how CONStanza can be used together with Oracle Streams to provide multi-tier replication of conditions databases using Oracle and MySQL databases.
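
    As a rough illustration of the kind of update propagation a replica synchronisation service must perform, the sketch below shows a minimal single-master, log-based scheme. It is not CONStanza's actual protocol; all class names, keys and replica names are hypothetical.

```python
# Minimal single-master replica synchronisation sketch (hypothetical, not
# CONStanza's protocol): updates applied at the master are logged and pushed
# asynchronously to secondary replicas, which are stale until they catch up.
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    data: dict = field(default_factory=dict)
    version: int = 0                              # last update sequence applied

@dataclass
class MasterReplica(Replica):
    log: list = field(default_factory=list)       # ordered update log

    def write(self, key, value):
        self.version += 1
        self.data[key] = value
        self.log.append((self.version, key, value))

def synchronise(master: MasterReplica, replica: Replica):
    """Apply every logged update the replica has not yet seen."""
    for seq, key, value in master.log:
        if seq > replica.version:
            replica.data[key] = value
            replica.version = seq

master = MasterReplica("oracle-tier0")            # hypothetical tier-0 copy
mirror = Replica("mysql-tier1")                   # hypothetical tier-1 copy
master.write("run_42/calibration", "v2")
synchronise(master, mirror)
assert mirror.data == master.data                 # consistency restored
```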

    Smart PIN: performance and cost-oriented context-aware personal information network

    The next generation of networks will involve the interconnection of heterogeneous individual networks such as WPANs, WLANs, WMANs and cellular networks, adopting IP as the common infrastructural protocol and providing a virtually always-connected network. Furthermore, many devices enable easy acquisition and storage of information such as pictures, movies and emails. The resulting information overload and the divergent characteristics of the content make it difficult for users to handle their data manually. Consequently, there is a need for personalised automatic services that enable data exchange across heterogeneous networks and devices. To support these personalised services, user-centric approaches for data delivery across the heterogeneous network are also required. In this context, this thesis proposes Smart PIN, a novel performance and cost-oriented context-aware Personal Information Network. Smart PIN's architecture is detailed, including its network, service and management components. Within the service component, two novel schemes for efficient delivery of context and content data are proposed: the Multimedia Data Replication Scheme (MDRS) and the Quality-oriented Algorithm for Multiple-source Multimedia Delivery (QAMMD). MDRS supports efficient data accessibility among distributed devices using data replication based on a utility function and a minimum data set. QAMMD employs a buffer underflow avoidance scheme for streaming, which achieves high multimedia quality without adapting the content to network conditions. Simulation models for MDRS and QAMMD were built based on various heterogeneous network scenarios. Additionally, multiple-source streaming based on QAMMD was implemented as a prototype and tested in an emulated network environment. Comparative tests show that MDRS and QAMMD perform significantly better than other approaches.
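
    A hypothetical sketch of a utility-driven replication decision in the spirit of MDRS follows; the thesis's actual utility function and minimum-data-set rules are not reproduced here, so the benefit/cost model, weights and example numbers are assumptions.

```python
# Utility-based replication decision sketch (assumed model, not MDRS itself):
# replicate an item onto a device only when the expected access benefit
# outweighs the transfer and storage cost.

def replication_utility(access_freq, latency_saving_ms, transfer_cost, storage_cost,
                        w_benefit=1.0, w_cost=1.0):
    """Positive utility means replicating the item onto the device is worthwhile."""
    benefit = access_freq * latency_saving_ms      # expected access-time savings
    cost = transfer_cost + storage_cost            # one-off and ongoing costs
    return w_benefit * benefit - w_cost * cost

# Example: a frequently watched video is worth replicating onto a home media
# server, but not onto a storage-constrained phone.
print(replication_utility(access_freq=20, latency_saving_ms=150,
                          transfer_cost=500, storage_cost=100))    # > 0: replicate
print(replication_utility(access_freq=20, latency_saving_ms=150,
                          transfer_cost=500, storage_cost=4000))   # < 0: do not
```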

    Asynchronous epidemic algorithms for consistency in large-scale systems

    Achieving and detecting a globally consistent state is essential to many services in large and extreme-scale distributed systems, especially when the desired consistent state is critical to service operation. Centralised and deterministic approaches to synchronisation and distributed consistency are neither scalable nor fault-tolerant. Alternatively, epidemic-based paradigms are decentralised computations based on randomised communications. They are scalable, resilient, fault-tolerant, and converge to the desired target in logarithmic time with respect to system size. Thus, many distributed services have adopted epidemic protocols to achieve consensus and a consistent state, mainly due to scalability concerns. The convergence of epidemic protocols is stochastically guaranteed; however, the detection of convergence is probabilistic and non-explicit. In a real-world environment, systems are unreliable, and epidemic protocols may fail to converge to the desired state. Thus, achieving convergence by itself does not ensure a system-wide consistent state under dynamic conditions. The research work presented in this thesis introduces the Phase Transition Algorithm (PTA) to achieve a distributed consistent state based on the explicit detection of convergence. Each phase in PTA is a decentralised decision-making process that implements epidemic data aggregation, in which the detection of convergence implies achieving a global agreement. The phases in PTA can be cascaded to achieve higher certainty as desired. Following the PTA, two epidemic protocols, namely PTP and ECP, are proposed to achieve consensus, i.e. consistency in data dissemination and data aggregation. The protocols are examined through simulations, and experimental results have validated the protocols' ability to achieve and explicitly detect consensus among system nodes. The research has also studied epidemic data aggregation under node churn and network failures, where the analysis identified three phases of the aggregation process. The investigations have shown a different impact of node churn on each phase. The phase that is critical for the aggregation process has been studied further, which led to proposing new robust data aggregation protocols, REAP and REAP+. Each protocol has a different decentralised replication method, and both implement distributed failure detection and instantaneous mass restoration mechanisms. Simulations have validated the protocols, and results have shown the protocols' ability to converge, detect convergence, and produce competitive accuracy under various levels of node churn. Furthermore, distributed consistency in continuous systems is addressed in the research. The work proposes a novel continuous epidemic protocol with an adaptive restart mechanism. The protocol restarts either upon the detection of system convergence or upon the detection of divergence. The protocol also introduces a seed selection method for peak data distribution in decentralised approaches, a challenge that previously required single-point initialisation and a leader-election step. The simulations validated the performance of the algorithm under static and dynamic conditions and confirmed that the convergence and divergence detection accuracy can be tuned as desired. Finally, the research work shows that combining and integrating the proposed protocols enables extreme-scale distributed systems to achieve and detect globally consistent states even under realistic and dynamic conditions.
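
    The underlying aggregation and convergence-detection idea can be illustrated with a standard gossip-averaging sketch. This is not PTA, PTP or ECP themselves: the synchronous round model and random peer choice are simplifying assumptions, and the global spread check used here is exactly the kind of centralised detection that the thesis's explicit, decentralised detection aims to replace.

```python
# Gossip (epidemic) averaging with a naive convergence check, for illustration
# only. Each node repeatedly averages its value with a random peer; the true
# mean is preserved and the spread of values shrinks roughly geometrically.
import random

def gossip_average(values, epsilon=1e-4, max_rounds=100):
    """Return (estimate, rounds) once max-min spread falls below epsilon."""
    values = list(values)
    n = len(values)
    for round_no in range(1, max_rounds + 1):
        for i in range(n):
            j = random.randrange(n)                 # pick a random gossip peer
            avg = (values[i] + values[j]) / 2.0
            values[i] = values[j] = avg             # push-pull exchange
        if max(values) - min(values) < epsilon:     # global check: illustration
            return values[0], round_no              # only; not decentralised
    return sum(values) / n, max_rounds

estimate, rounds = gossip_average([random.uniform(0, 100) for _ in range(1000)])
print(f"converged to {estimate:.3f} after {rounds} rounds")
```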

    Design and Implementation of a Distributed Mobility Management Entity (MME) on OpenStack

    Network Functions Virtualisation (NFV) involves the implementation of network functions, for example firewalls and routers, as software applications that can run on general-purpose servers. In present-day networks, each network function is typically implemented on dedicated and proprietary hardware. By utilising virtualisation technologies, NFV enables network functions to be deployed on cloud computing infrastructure in data centres. This thesis discusses the application of NFV to the Evolved Packet Core (EPC) in Long Term Evolution (LTE) networks; specifically to the Mobility Management Entity (MME), a control-plane entity in the EPC. With the convergence of cloud computing and mobile networks, conventional architectures of network elements need to be re-designed in order to fully harness benefits such as scalability and elasticity. To this end, we design and implement a distributed MME with a three-tier architecture common to web applications. We highlight design considerations for moving MME functionality to the cloud and compare our new distributed design to that of a standalone MME. We deploy and test the distributed MME on two separate OpenStack clouds. Our results indicate that the benefits of scalability and resilience can outweigh the marginal increase in latency for EPC procedures. We find that the latency depends on the actual placement of MME components within the data centre. We also believe that extensions to the OpenStack platform are required before it can meet the performance and availability requirements of telecommunication applications.
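
    A minimal sketch of the stateless-worker pattern that a three-tier MME design implies: session state lives in a shared store, so any worker instance can serve a given UE. The function names, message fields and dict-backed store below are hypothetical, not the 3GPP procedures or the thesis's implementation.

```python
# Stateless-worker sketch (hypothetical): tier-2 workers keep no UE context
# locally; they load it from and persist it to a shared back-end state store.

state_store = {}   # stands in for a replicated back-end database (tier 3)

def handle_attach(worker_id, ue_id, request):
    """Load UE context, process the (simplified) attach, write context back."""
    ctx = state_store.get(ue_id, {"attached": False})
    ctx.update({"attached": True, "tai": request["tai"], "served_by": worker_id})
    state_store[ue_id] = ctx            # persist before replying (crash safety)
    return {"ue_id": ue_id, "result": "attach-accept"}

# Two different workers can handle consecutive procedures for the same UE.
print(handle_attach("mme-worker-1", "imsi-001", {"tai": "tac-7"}))
print(handle_attach("mme-worker-2", "imsi-001", {"tai": "tac-7"}))
```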

    Resilience-Building Technologies: State of Knowledge -- ReSIST NoE Deliverable D12

    This document is the first product of work package WP2, "Resilience-building and -scaling technologies", in the programme of jointly executed research (JER) of the ReSIST Network of Excellence.

    Dynamic data placement and discovery in wide-area networks

    The workloads of online services and applications such as social networks, sensor data platforms and web search engines have become increasingly global and dynamic, setting new challenges for providing users with low-latency access to data. To achieve this, these services typically leverage a multi-site wide-area networked infrastructure. Data access latency in such an infrastructure depends on the network paths between users and data, which are determined by the data placement and discovery strategies. Current strategies are static: they offer low latencies upon deployment but worse performance under a dynamic workload. We propose dynamic data placement and discovery strategies for wide-area networked infrastructures, which adapt to the data access workload. We achieve this with data activity correlation (DAC), an application-agnostic approach for determining the correlations between data items based on access pattern similarities. By dynamically clustering data according to DAC, network traffic within clusters is kept local. We utilise DAC as a key component in reducing access latencies for two application scenarios, emphasising different aspects of the problem. The first scenario assumes the fixed placement of data at sites, and thus focusses on data discovery. This is the case for a global sensor discovery platform, which aims to provide low-latency discovery of sensor metadata. We present a self-organising hierarchical infrastructure consisting of multiple DAC clusters, maintained with an online and distributed split-and-merge algorithm. This reduces the number of sites visited, and thus the latency, during discovery for a variety of workloads. The second scenario focusses on data placement. This is the case for global online services that leverage a multi-data-centre deployment to provide users with low-latency access to data. We present a geo-dynamic partitioning middleware, which maintains DAC clusters with an online elastic partitioning algorithm. It supports the geo-aware placement of partitions across data centres according to the workload. This provides globally distributed users with low-latency access to data for static and dynamic workloads.
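
    The DAC idea of correlating items by access-pattern similarity can be sketched as follows. The concrete similarity metric and clustering algorithm used in the thesis are not given in the abstract, so cosine similarity over per-site access-count vectors and a simple threshold are assumptions made for illustration.

```python
# Access-pattern correlation sketch (assumed metric, not the thesis's DAC
# implementation): items whose access-count vectors point in similar directions
# are candidates for co-location in the same cluster/partition.
import math

def cosine_similarity(a, b):
    """Similarity of two access-count vectors (one entry per client site)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def correlated_pairs(access_vectors, threshold=0.8):
    """Return item pairs whose access patterns are similar enough to co-locate."""
    items = list(access_vectors)
    pairs = []
    for i, u in enumerate(items):
        for v in items[i + 1:]:
            if cosine_similarity(access_vectors[u], access_vectors[v]) >= threshold:
                pairs.append((u, v))
    return pairs

# Per-item access counts from three sites: items a and b are accessed together.
accesses = {"a": [9, 1, 0], "b": [8, 2, 0], "c": [0, 0, 7]}
print(correlated_pairs(accesses))   # -> [('a', 'b')]
```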

    In-Production Continuous Testing for Future Telco Cloud

    Software Defined Networking (SDN) is an emerging paradigm to design, build and operate networks. The driving motivation for SDN was the need for a major change in network technologies to support easier configuration, management, operation, reconfiguration and evolution than in current computer networks. In the SDN world, performance is not only related to the behaviour of the data plane. While the separation of control plane and data plane makes the latter significantly more agile, it offloads all the complex processing workload to the control plane. This is further exacerbated in distributed network controllers, where the control plane is additionally loaded with state synchronisation overhead. Furthermore, the introduction of SDN technologies has raised advanced challenges in achieving failure resilience, meant as the persistence of service delivery that can justifiably be trusted when facing changes, and fault tolerance, meant as the ability to avoid service failures in the presence of faults. Therefore, along with the "softwarization" of network services, it is an important goal in the engineering of such services, e.g. SDNs and NFVs, to be able to test and assess their proper functioning not only in emulated conditions before release and deployment, but also "in production", when the system is under real operating conditions. The goal of this thesis is to devise an approach to evaluate not only the performance, but also the effectiveness of the failure detection and mitigation mechanisms provided by SDN controllers, as well as the capability of SDNs to ultimately satisfy non-functional requirements, especially resiliency, availability, and reliability. The approach consists of exploiting benchmarking techniques, such as failure injection, to obtain continuous feedback on the performance and on the capability of SDN services to survive failures, which is of paramount importance for improving the effectiveness of the system's internal mechanisms in reacting to anomalous situations potentially occurring in operation, while its services are regularly updated or improved. Within this vision, this dissertation first presents SCP-CLUB (SDN Control Plane CLoUd-based Benchmarking), a benchmarking framework designed to automate the characterization of SDN control plane performance, resilience and fault tolerance in telco cloud deployments. The idea is to provide, for the testing of different configurations, the same level of automation available when deploying NFV functions, using idle cycles of the telco cloud infrastructure. The dissertation then proposes an extension of the framework with mechanisms to evaluate the runtime behaviour of a telco cloud SDN under (possibly unforeseen) failure conditions, by exploiting software failure injection.
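
    The overall shape of such a failure-injection campaign can be sketched as a simple loop over inject/observe/recover steps. The failure modes, hook functions and metrics below are placeholders chosen for illustration, not SCP-CLUB's real interface.

```python
# Failure-injection campaign loop sketch (hypothetical hooks, not SCP-CLUB's
# API): for each experiment, inject one failure, observe service metrics while
# detection/mitigation mechanisms react, then restore a clean baseline.
import random
import time

FAILURE_MODES = ["kill_controller_instance", "drop_control_channel", "corrupt_flow_rule"]

def run_campaign(inject, measure, recover, experiments=10, observation_s=30):
    """Run a series of failure-injection experiments and collect their metrics."""
    results = []
    for run in range(experiments):
        failure = random.choice(FAILURE_MODES)
        inject(failure)                       # perturb the system under test
        time.sleep(observation_s)             # observation window
        metrics = measure()                   # e.g. flow setup latency, loss
        recover(failure)                      # restore the baseline
        results.append({"run": run, "failure": failure, **metrics})
    return results

# Usage with stub hooks (a real campaign would drive the telco cloud testbed):
report = run_campaign(lambda f: print("inject:", f),
                      lambda: {"flow_setup_ms": random.uniform(5, 50)},
                      lambda f: print("recover:", f),
                      experiments=2, observation_s=0)
print(report)
```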

    Protecting applications using trusted execution environments

    While cloud computing has been broadly adopted, companies that deal with sensitive data are still reluctant to do so due to privacy concerns or legal restrictions. Vulnerabilities in complex cloud infrastructures, resource sharing among tenants, and malicious insiders pose a real threat to the confidentiality and integrity of sensitive customer data. In recent years trusted execution environments (TEEs), hardware-enforced isolated regions that can protect code and data from the rest of the system, have become available as part of commodity CPUs. However, designing applications for execution within TEEs requires careful consideration of the elevated threats that come with running in a fully untrusted environment. Interaction with the environment should be minimised, but some cooperation with the untrusted host is required, e.g. for disk and network I/O, via a host interface. Implementing this interface while maintaining the security of sensitive application code and data is a fundamental challenge. This thesis addresses this challenge and discusses how TEEs can be leveraged to secure existing applications efficiently and effectively in untrusted environments. We explore this in the context of three systems that deal with the protection of TEE applications and their host interfaces. SGX-LKL is a library operating system that can run full unmodified applications within TEEs with a minimal general-purpose host interface. By providing broad system support inside the TEE, the reliance on the untrusted host can be reduced to a minimal set of low-level operations that cannot be performed inside the enclave. SGX-LKL provides transparent protection of the host interface and of both disk and network I/O. Glamdring is a framework for the semi-automated partitioning of TEE applications into an untrusted and a trusted compartment. Based on source-level annotations, it uses either dynamic or static code analysis to identify sensitive parts of an application. Taking into account the objectives of a small TCB size and low host interface complexity, it defines an application-specific host interface and generates partitioned application code. EnclaveDB is a secure database using Intel SGX based on a partitioned in-memory database engine. The core of EnclaveDB is its logging and recovery protocol for transaction durability. For this, it relies on the database log managed and persisted by the untrusted database server. EnclaveDB protects against advanced host interface attacks and ensures the confidentiality, integrity, and freshness of sensitive data.
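
    As a purely conceptual illustration of protecting data that crosses a minimal host interface (not SGX-LKL's actual implementation), the sketch below encrypts and authenticates data on the "trusted" side before handing it to an untrusted write path. The toy keystream cipher and the hard-coded key are assumptions made to keep the example dependency-free; a real enclave would use an authenticated cipher such as AES-GCM with hardware-sealed keys.

```python
# Conceptual host-interface protection sketch (toy crypto, illustration only):
# the untrusted host performs the I/O but only ever sees protected bytes.
import hashlib
import hmac

KEY = b"enclave-sealed-key"      # in reality derived/sealed inside the TEE

def protect(plaintext: bytes) -> bytes:
    """Encrypt with a toy keystream and attach an HMAC integrity tag."""
    stream = hashlib.sha256(KEY).digest() * (len(plaintext) // 32 + 1)
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(KEY, ciphertext, hashlib.sha256).digest()
    return tag + ciphertext       # tag travels with the data for verification

def untrusted_host_write(path: str, blob: bytes):
    """Host-side call: ordinary file I/O on already-protected bytes."""
    with open(path, "wb") as f:
        f.write(blob)

untrusted_host_write("/tmp/enclave.blob", protect(b"sensitive customer record"))
```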

    State management in coreless mobile networks

    The number of mobile Internet users has skyrocketed, and with the advent of the Internet of Things we are reaching the limits of the current telecommunications standard, 4G. The improvements and goals set for the next standard, 5G, are not trivial, and research is in progress to reach them; improvements across all involved technology fields are needed. In this thesis we present a novel mobile network architecture, the coreless mobile network, and develop state management concepts for it, based on an analysis of the current 4G/LTE architecture. The coreless mobile network focuses on the redesign of state management in mobile networks, more precisely, moving state out of the 4G core network entities into an external ubiquitous data store. The architecture follows trends in current research, particularly network function virtualisation, software defined networking and mobile edge computing. The new network architecture requires a data storage solution that is capable of functioning as the state store in the mobile network environment. Thus, we present an overview of promising data stores and evaluate their suitability. Further in this thesis we present the results of benchmarking the Apache Geode data store, as an example of a state management solution that could be leveraged in realising the coreless mobile network architecture. We discovered that the Apache Geode data store is, depending on configuration, capable of delivering the data model, consistency, high availability, scaling, throughput and latency that are required in our proposed architecture.
    ACM Computing Classification System (CCS):
    - Information systems ~ Distributed storage
    - Information systems ~ Hierarchical storage management
    - Networks ~ Middle boxes / network appliances
    - Networks ~ Mobile networks
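
    A minimal sketch of the kind of put/get micro-benchmark used to assess a candidate state store follows. The StateStore class is an in-memory stand-in, not the Apache Geode client API, and the UE-context payload shape is assumed for illustration.

```python
# Put/get latency micro-benchmark sketch for a key-value state store
# (stand-in store, not Apache Geode's client): measures per-operation latency
# and reports 99th-percentile values in microseconds.
import statistics
import time

class StateStore:
    """In-memory stand-in; a real run would wrap the actual data store client."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

def benchmark(store, n_ops=10_000):
    put_lat, get_lat = [], []
    for i in range(n_ops):
        t0 = time.perf_counter()
        store.put(f"ue-{i}", {"bearer": i % 8, "tai": "tac-1"})   # assumed payload
        put_lat.append(time.perf_counter() - t0)
        t0 = time.perf_counter()
        store.get(f"ue-{i}")
        get_lat.append(time.perf_counter() - t0)
    return {"put_p99_us": statistics.quantiles(put_lat, n=100)[98] * 1e6,
            "get_p99_us": statistics.quantiles(get_lat, n=100)[98] * 1e6}

print(benchmark(StateStore()))
```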