
    Nu@ge: Towards a solidary and responsible cloud computing service

    Best Paper Award. The adoption of cloud computing is still limited by several legal concerns from companies. One of these concerns is data sovereignty: data can be physically hosted in sensitive locations, leaving companies without control over it. In this paper, we present the Nu@ge project, which aims to build a federation of container-sized datacenters on French territory. Nu@ge provides a software stack that enables companies to put independent datacenters into cooperation in a national mesh. Additionally, a prototype of a container-sized datacenter has been validated and patented.

    Optical fibre local area networks


    Simulation and design of storage area network

    Master's thesis (Master of Engineering)

    Bayesian Prognostic Framework for High-Availability Clusters

    Critical services from domains as diverse as finance, manufacturing and healthcare are often delivered by complex enterprise applications (EAs). High-availability clusters (HACs) are software-managed IT infrastructures that enable these EAs to operate with minimum downtime. To that end, HACs monitor the health of EA layers (e.g., application servers and databases) and resources (i.e., components), and attempt to reinitialise or restart failed resources swiftly. When this is unsuccessful, HACs try to fail over the resource group to which the failed resource belongs (i.e., relocate it to another server). If the resource group failover is also unsuccessful, or when a system-wide critical failure occurs, HACs initiate a complete system failover. Despite the availability of multiple commercial and open-source HAC solutions, these HACs (i) disregard important sources of historical and runtime information, and (ii) have limited reasoning capabilities. They may therefore conservatively perform unnecessary resource group or system failovers, or delay justified failovers for longer than necessary. This thesis introduces the first HAC taxonomy, uses it to carry out an extensive survey of current HAC solutions, and develops a novel Bayesian prognostic (BP) framework that addresses the significant HAC limitations mentioned above and identified by the survey. The BP framework comprises four modules. The first module is a technique for modelling high availability using a combination of established and new HAC characteristics. The second is a suite of methods for obtaining and maintaining the information required by the other modules. The third is a HAC-independent Bayesian decision network (BDN) that predicts whether resource failures can be managed locally (i.e., without failovers). The fourth is a method for constructing a HAC-specific Bayesian network for the fast prediction of resource group and system failures. Used together, these modules significantly reduce the downtime of HAC-protected EAs. The experiments presented in this thesis show that the BP framework can deliver downtimes between 5.5 and 7.9 times smaller than those obtained with an established open-source HAC.
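
    A minimal sketch of the kind of Bayesian reasoning such a framework could apply, assuming invented priors, likelihoods and evidence names (none of these numbers come from the thesis): given evidence about a failed resource, estimate the posterior probability that a local restart will succeed, and recommend a failover only when that probability is low.

    ```python
    # Hypothetical prior and likelihoods, for illustration only.
    PRIOR_LOCAL_RECOVERY = 0.85  # assumed prior: most failures restart cleanly

    LIKELIHOOD = {
        # evidence: (P(evidence | local recovery ok), P(evidence | local recovery fails))
        "repeated_restart_failures": (0.05, 0.60),
        "dependent_resource_down":   (0.10, 0.50),
        "high_cpu_on_server":        (0.30, 0.40),
    }

    def p_local_recovery(evidence: list[str]) -> float:
        """Posterior P(local recovery | evidence) via naive-Bayes combination."""
        p_ok, p_fail = PRIOR_LOCAL_RECOVERY, 1.0 - PRIOR_LOCAL_RECOVERY
        for e in evidence:
            l_ok, l_fail = LIKELIHOOD[e]
            p_ok, p_fail = p_ok * l_ok, p_fail * l_fail
        return p_ok / (p_ok + p_fail)

    def decide(evidence: list[str], threshold: float = 0.5) -> str:
        p = p_local_recovery(evidence)
        return f"restart locally (p={p:.2f})" if p >= threshold else f"fail over (p={p:.2f})"

    print(decide([]))                                  # healthy history -> restart locally
    print(decide(["repeated_restart_failures",
                  "dependent_resource_down"]))         # strong evidence -> fail over
    ```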

    Data center resilience assessment: storage, networking and security

    Data centers (DCs) are the core of the national cyber infrastructure. With the incredible growth of critical data volumes in financial institutions, government organizations, and global companies, data centers are becoming larger and more distributed, posing more challenges for operational continuity in the presence of experienced cyber attackers and occasional natural disasters. The main objective of this research work is to present a new methodology for data center resilience assessment. This methodology consists of:
    • defining data center resilience requirements;
    • devising a high-level metric for data center resilience;
    • designing and developing a tool to validate the metric.
    Since computer networks are an important component of the data center architecture, this research work was extended to investigate opportunities for enhancing computer network resilience in the areas of routing protocols, redundancy, and server load, so as to minimize network downtime and increase the period during which attacks can be resisted. Data center resilience assessment is a complex process, as it involves several aspects such as policies for emergencies, recovery plans, variation in data center operational roles, hosted/processed data types, and data center architectures. In this dissertation, however, storage, networking and security are emphasized. The need for resilience assessment emerged from the gap in existing reliability, availability, and serviceability (RAS) measures. Resilience as an evaluation metric leads to a better proactive perspective in system design and management. The proposed Data Center Resilience Assessment Portal (DC-RAP) is designed to easily integrate various operational scenarios. DC-RAP features a user-friendly interface to assess resilience in terms of performance analysis and speed of recovery by collecting the following information: time to detect attacks, time to resist, time to fail, and recovery time. Several sets of experiments were performed. Results from investigating the impact of routing protocols and server load-balancing algorithms on network resilience showed that using a particular routing protocol or server load-balancing algorithm can enhance the network resilience level by minimizing downtime and ensuring speedy recovery. Experimental results on using social network analysis (SNA) to identify important routers in a computer network showed that SNA was successful in identifying them; this list of important routers can be used to add redundancy for those routers and so ensure a high level of resilience. Finally, experimental results from testing and validating the data center resilience assessment methodology using DC-RAP showed the methodology's ability to quantify data center resilience in terms of providing steady performance, minimal recovery time and maximum attack-resistance time. The main contributions of this work can be summarized as follows:
    • A methodology for evaluating data center resilience was developed.
    • A Data Center Resilience Assessment Portal (DC-RAP) was implemented for resilience evaluations.
    • The use of social network analysis to improve computer network resilience was investigated.
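
    As an illustration of how the four timings collected by DC-RAP could be folded into a single score, here is a hedged sketch; the equal weighting, normalisation horizon and field names are assumptions for illustration, not the dissertation's actual metric.

    ```python
    from dataclasses import dataclass

    @dataclass
    class ResilienceObservation:
        time_to_detect: float  # seconds until the attack is detected
        time_to_resist: float  # seconds the system withstands the attack
        time_to_fail: float    # seconds until service degrades or fails
        recovery_time: float   # seconds until normal operation resumes

    def resilience_score(obs: ResilienceObservation, horizon: float = 3600.0) -> float:
        """Score in [0, 1]: longer resistance and delayed failure are good,
        slow detection and slow recovery are bad. Illustrative weighting only."""
        resist = min(obs.time_to_resist / horizon, 1.0)        # reward resistance
        survive = min(obs.time_to_fail / horizon, 1.0)         # reward delayed failure
        detect = 1.0 - min(obs.time_to_detect / horizon, 1.0)  # reward fast detection
        recover = 1.0 - min(obs.recovery_time / horizon, 1.0)  # reward fast recovery
        return round((resist + survive + detect + recover) / 4.0, 3)

    print(resilience_score(ResilienceObservation(30, 1800, 1830, 300)))  # ~0.73
    ```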

    Lossless Ethernet and its applications

    Ethernet is the most widely used transport network in access and data-center networks. Ethernet-based networks provide several advantages, such as i) low-cost equipment, ii) sharing of existing infrastructure, and iii) ease of Operations, Administration and Maintenance (OAM). However, Ethernet is a best-effort network, which raises significant issues regarding packet loss and throughput. In this research, we investigate the possibility of achieving lossless Ethernet while keeping network switches unchanged. We present three lossless Ethernet applications, namely i) switch fabrics for routers, ii) a lossless data center fabric, and iii) a zero-jitter fronthaul network for Common Public Radio Interface (CPRI) over Ethernet in 5th Generation Mobile Networks (5G). Switch fabrics in routers require stringent characteristics in terms of packet loss, fairness, absence of head-of-line blocking, and low latency. We propose a novel concept to control and prevent congestion in switch fabrics, achieving a scalable, flexible, and more cost-efficient router fabric while using commodity Ethernet switches. Data center applications likewise require strict characteristics regarding packet loss, fairness, head-of-line blocking, latency, and low processing overhead; we therefore present a congestion control scheme for data center networks. Our proposal is designed to achieve minimum queue length and latency while guaranteeing fairness between flows of different rates, packet sizes and round-trip times (RTTs). Furthermore, using Ethernet as a fronthaul transport network in 5G draws significant attention from both academia and industry, due to the same advantages of low equipment cost, infrastructure sharing, and ease of OAM. We therefore introduce a distributed scheduling algorithm to support CPRI traffic over Ethernet. The results obtained through testbed implementations and simulations show that lossless Ethernet is feasible and can achieve minimum queue length, latency, and jitter while preventing head-of-line (HOL) blocking.
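
    The thesis's own congestion-control designs are not reproduced here; as a generic illustration of the mechanism that makes Ethernet lossless, the following toy sketch shows pause-threshold flow control in the style of IEEE 802.1Qbb Priority Flow Control, where an ingress queue pauses its upstream sender before it can overflow. The thresholds and single-queue model are assumptions.

    ```python
    XOFF_THRESHOLD = 80  # pause upstream when queue reaches this depth (packets)
    XON_THRESHOLD = 40   # resume upstream when queue drains below this depth
    CAPACITY = 100       # physical buffer size; must never be exceeded

    class LosslessQueue:
        def __init__(self):
            self.depth = 0
            self.upstream_paused = False

        def enqueue(self) -> None:
            assert self.depth < CAPACITY, "pause must fire before overflow"
            self.depth += 1
            if self.depth >= XOFF_THRESHOLD and not self.upstream_paused:
                self.upstream_paused = True   # send PAUSE (XOFF) frame upstream

        def dequeue(self) -> None:
            if self.depth > 0:
                self.depth -= 1
            if self.depth <= XON_THRESHOLD and self.upstream_paused:
                self.upstream_paused = False  # send resume (XON) upstream

    q = LosslessQueue()
    for _ in range(90):               # a burst arrives faster than it drains
        if not q.upstream_paused:     # a paused sender stops transmitting
            q.enqueue()
    print(q.depth, q.upstream_paused) # queue capped at the XOFF threshold: 80 True
    ```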

    Equal cost multipath routing in IP networks

    Increasing efficiency and quality demands for services from IP network service providers and end users drive developers to offer ever more sophisticated traffic engineering methods for network optimization and control. Intermediate System to Intermediate System (IS-IS) and Open Shortest Path First (OSPF) are the standard routing solutions for intra-domain networks. An easy upgrade is Equal Cost Multipath (ECMP), one of the most general solutions for IP traffic engineering, which improves the load balancing and fast-protection performance of single-path interior gateway protocols. This thesis was written during the implementation of the ECMP feature for Tellabs 8600 series routers. The most important parts in the adoption of ECMP are the changes to the shortest-path-first algorithm and the routing table in the control plane, and the implementation of the load-balancing algorithm in the router's forwarding plane. The results of the thesis and the existing literature show that the load-balancing algorithm has the largest effect on the traffic distribution over equal-cost paths, and that selecting the correct algorithm is crucial. Hash-based algorithms, which keep each traffic flow on the same path, are the dominant solutions today. They are simple to implement and offer moderate performance; traffic is distributed evenly as long as the number of flows is large enough. ECMP provides a simple solution that is easy to configure and maintain. It outperforms single-path solutions and competes with more complex MPLS solutions. The only thing to take care of is adjusting the link weights of the network so that enough load-balancing paths are created.
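
    A minimal sketch of hash-based ECMP next-hop selection, the algorithm family the thesis identifies as dominant: the flow's 5-tuple is hashed and reduced modulo the number of equal-cost paths, so all packets of one flow stay on the same path (avoiding reordering) while distinct flows spread across paths. The field names are generic, not specific to the Tellabs 8600 implementation.

    ```python
    import hashlib

    def ecmp_next_hop(src_ip: str, dst_ip: str, proto: int,
                      src_port: int, dst_port: int, next_hops: list[str]) -> str:
        """Pick a next hop by hashing the flow's 5-tuple (flow-sticky)."""
        key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
        digest = hashlib.sha256(key).digest()
        index = int.from_bytes(digest[:4], "big") % len(next_hops)
        return next_hops[index]

    paths = ["10.0.1.1", "10.0.2.1", "10.0.3.1"]
    # Every packet of this flow maps to the same next hop:
    print(ecmp_next_hop("192.0.2.7", "198.51.100.9", 6, 49152, 443, paths))
    ```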

    Linear Scalability of Distributed Applications

    The explosion of social applications such as Facebook, LinkedIn and Twitter, of electronic commerce with companies like Amazon.com and eBay.com, and of Internet search has created the need for new technologies and appropriate systems to effectively manage a considerable amount of data and users. These applications must run continuously every day of the year and must be capable of surviving sudden and abrupt load increases as well as all kinds of software, hardware, human and organizational failures. Increasing (or decreasing) the allocated resources of a distributed application in an elastic and scalable manner, while satisfying requirements on availability and performance in a cost-effective way, is essential for commercial viability, but it poses great challenges in today's infrastructures. Indeed, cloud computing can provide resources on demand: it is now easy to start dozens of servers in parallel (computational resources) or to store a huge amount of data (storage resources), even for a very limited period, paying only for the resources consumed. However, these complex infrastructures consisting of heterogeneous and low-cost resources are failure-prone. Moreover, although cloud resources are deemed to be virtually unlimited, only adequate resource management and demand multiplexing can meet customer requirements and avoid performance deterioration. In this thesis, we deal with adaptive management of cloud resources under specific application requirements. First, in the intra-cloud environment, we address the problem of cloud storage resource management with availability guarantees and find the optimal resource allocation in a decentralized way by means of a virtual economy. Data replicas migrate, replicate or delete themselves according to their economic fitness. Our approach responds effectively to sudden load increases or failures and makes the best use of the geographical distance between nodes to improve application-specific data availability. We then propose a decentralized approach for adaptive management of computational resources for applications requiring high availability and performance guarantees under load spikes, sudden failures or cloud resource updates. Our approach involves a virtual economy among service components (similar to the one among data replicas) and an innovative cascading scheme for setting the performance goals of individual components so as to meet the overall application requirements. Our approach meets application requirements with the minimum resources, by allocating new ones or releasing redundant ones. Finally, as cloud storage vendors offer online services at different rates, which can vary widely due to second-degree price discrimination, we present an inter-cloud storage resource allocation method that aggregates resources from different storage vendors and provides the user with a system which guarantees the best rate to host and serve its data, while satisfying the user's requirements on availability, durability, latency, etc. Our system continuously optimizes the placement of data according to its type and usage pattern, and minimizes the cost of migration from one provider to another, thereby avoiding vendor lock-in.
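
    A hedged sketch of the virtual-economy idea for data replicas, with invented prices and thresholds: each replica earns virtual income from the requests it serves and pays virtual rent for the storage it occupies, and its running balance drives the replicate/keep/delete decision, so popular data multiplies and cold data removes itself.

    ```python
    RENT_PER_EPOCH = 10.0     # hypothetical storage cost per epoch
    INCOME_PER_REQUEST = 0.5  # hypothetical revenue per served request
    REPLICATE_ABOVE = 50.0    # rich replicas clone themselves
    DELETE_BELOW = -20.0      # poor replicas remove themselves

    def epoch_decision(balance: float, requests_served: int) -> tuple[float, str]:
        """One accounting epoch: collect income, pay rent, decide fate."""
        balance += requests_served * INCOME_PER_REQUEST - RENT_PER_EPOCH
        if balance >= REPLICATE_ABOVE:
            return balance / 2, "replicate"   # split wealth with the new replica
        if balance <= DELETE_BELOW:
            return balance, "delete"
        return balance, "keep"

    balance = 0.0
    for load in [200, 180, 150, 5, 0, 0, 0, 0, 0]:   # demand fades over time
        balance, action = epoch_decision(balance, load)
        print(f"balance={balance:7.1f}  action={action}")
        if action == "delete":
            break   # a deleted replica stops participating
    ```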

    Towards practicalization of blockchain-based decentralized applications

    Blockchain can be defined as an immutable ledger for recording transactions, maintained in a distributed network of mutually untrusting peers. Blockchain technology has been widely applied to fields far beyond its initial use for cryptocurrency. However, blockchain by itself is insufficient to meet all the desired security or efficiency requirements of diverse application scenarios. This dissertation focuses on two core functionalities that blockchain provides, i.e., robust storage and reliable computation. Three concrete application scenarios, namely the Internet of Things (IoT), cybersecurity management (CSM), and peer-to-peer (P2P) content delivery networks (CDNs), are used to elaborate general design principles for these two main functionalities. Among them, the IoT and CSM applications involve the design of blockchain-based robust storage and management, while the P2P CDN requires reliable computation. Such general design principles, derived from disparate application scenarios, have the potential to make many other blockchain-enabled decentralized applications practical. In the IoT application, blockchain-based decentralized data management is capable of handling faulty nodes, as designed in the cybersecurity application. An important issue, however, lies in the interaction between the external network and the blockchain network: external clients must rely on a relay node to communicate with the full nodes in the blockchain. Compromise of such relay nodes may result in a security breach and even in IoT sensors being blocked from the network. Therefore, a censorship-resistant blockchain-based decentralized IoT management system is proposed. Experimental results from a proof-of-concept implementation deployed in a real distributed environment show its feasibility and effectiveness in achieving censorship resistance. The CSM application incorporates blockchain to provide robust storage of historical cybersecurity data, so that with a certain level of cyber intelligence a defender can determine whether a network has been compromised and to what extent. The CSM functions can be categorized into three classes: network-centric (N-CSM), tools-centric (T-CSM) and application-centric (A-CSM). The cyber intelligence identifies new attackers, victims, or defense capabilities. Moreover, a decentralized storage network (DSN) is integrated to reduce on-chain storage costs without undermining robustness. Experiments with the prototype implementation and real-world cyber datasets show that the blockchain-based CSM solution is effective and efficient. The P2P CDN application explores and utilizes the reliable computation that blockchain empowers. In particular, P2P CDNs promise benefits including cost savings and scalable peak-demand handling compared with centralized CDNs. However, reliable P2P delivery requires proper enforcement of delivery fairness. Unfortunately, most existing studies of delivery fairness rest on non-cooperative game-theoretic assumptions that are arguably unrealistic in the ad-hoc P2P setting. To address this issue, an expressive security requirement for fair P2P content delivery is defined, and two efficient blockchain-based approaches, one for P2P downloading and one for P2P streaming, are proposed. The proposed system guarantees fairness for each party even when all others collude and misbehave arbitrarily, and achieves asymptotically optimal on-chain costs and optimal delivery communication.
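
    The dissertation's exact protocols are not reproduced here; as a sketch of one generic building block for fair P2P delivery, the following shows chunked content committed to by a Merkle root (which could be anchored on-chain), so the receiver can verify each delivered chunk against the root before acknowledging it, letting payment be tied to provably correct delivery.

    ```python
    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def merkle_root(leaves: list[bytes]) -> bytes:
        level = [h(x) for x in leaves]
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])          # duplicate last node on odd levels
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
        """Sibling hashes bottom-up; the bool marks a sibling on the right."""
        level, proof = [h(x) for x in leaves], []
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])
            sib = index + 1 if index % 2 == 0 else index - 1
            proof.append((level[sib], sib > index))
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
            index //= 2
        return proof

    def verify(chunk: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
        node = h(chunk)
        for sibling, right in proof:
            node = h(node + sibling) if right else h(sibling + node)
        return node == root

    chunks = [b"chunk-%d" % i for i in range(8)]
    root = merkle_root(chunks)                    # commitment, published once
    assert verify(chunks[3], merkle_proof(chunks, 3), root)
    assert not verify(b"tampered", merkle_proof(chunks, 3), root)
    ```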

    Asymmetric distributed lock management in cloud computing

    Cloud computing has become part of our daily lives. It offers a dynamic environment for customers to store and access their data at any time and in any location. The development of social networks has made it necessary to build solutions that are easily accessible and available when required. Cloud computing provides a solution that does not depend on location and can offer a wide range of services while aiming to remain free from failures and errors. Although the usage of cloud storage services is increasing, a significant number of aspects, such as sudden server failures, network partitioning and natural disasters, still need to be carefully addressed. Another point that is vital for a sustainable cloud is the implementation of an algorithm that coordinates and maintains concurrent access and keeps shared files free from errors. One of the main approaches to overcoming these problems is to provide a set of servers that act as a gateway between clients and storage nodes. In this thesis we propose a new approach which provides an alternative solution to the main problems related to cloud storage. The approach is based on multiple strategies for eliminating the problems of node failure and network partitioning while providing a completely distributed environment. In our approach, every server acts as a master server for its own requests and can serve its clients without interacting with other master servers. Concurrent access is maintained in an asymmetric way through our lock manager algorithm, with the least possible communication among master servers. Depending on the state of a specific file, a master server can execute any received request without communicating with other master servers; further communication occurs only when additional information is required. In our approach, network partitioning or the failure of one or more master servers has no effect on the rest of the cloud. To improve availability, we associate every master server with a failover server which takes over the master's duties when the master server fails or becomes unavailable. To measure the performance of our approach we performed various tests, and the results are presented in detailed graphs.
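
    A hedged sketch of the asymmetric idea, with illustrative class and method names (not from the thesis): each master server grants or denies lock requests using the file state it already holds, and contacts its peers only once, when it first encounters a file whose state it does not know. Failover servers, request queuing and stale-state handling are omitted from this sketch.

    ```python
    class MasterServer:
        def __init__(self, name: str):
            self.name = name
            self.peers: dict[str, "MasterServer"] = {}
            self.lock_table: dict[str, str | None] = {}  # file -> holder (local view)

        def acquire(self, client: str, path: str) -> bool:
            if path not in self.lock_table:                   # unknown file: ask peers
                self.lock_table[path] = self._query_peers(path)  # once, then own the state
            if self.lock_table[path] is None:                 # free: grant locally with
                self.lock_table[path] = client                # no further communication
                return True
            return False                                      # held elsewhere: deny

        def release(self, client: str, path: str) -> None:
            if self.lock_table.get(path) == client:
                self.lock_table[path] = None

        def _query_peers(self, path: str) -> str | None:
            for peer in self.peers.values():
                holder = peer.lock_table.get(path)
                if holder is not None:
                    return holder
            return None

    a, b = MasterServer("A"), MasterServer("B")
    a.peers["B"], b.peers["A"] = b, a
    assert a.acquire("client-1", "/data/report.txt")      # A grants from local state
    assert not b.acquire("client-2", "/data/report.txt")  # B asks peers, learns it is held
    ```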