
    Single system image servers on top of clusters of PCs


    Middleware-based Database Replication: The Gaps between Theory and Practice

    The need for high availability and performance in data management systems has fueled a long-running interest in database replication in both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. Over time this has created a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. In this way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other. Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 200

    Scalable, Data-intensive Network Computation

    To enable groups of collaborating researchers at different locations to effectively share large datasets and investigate their spontaneous hypotheses on the fly, we are interested in developing a distributed system that can be easily leveraged by a variety of data-intensive applications. The system is composed of (i) a number of best-effort logistical depots to enable large-scale data sharing and in-network data processing, (ii) a set of end-to-end tools to effectively aggregate, manage and schedule a large number of network computations with attendant data movements, and (iii) a Distributed Hash Table (DHT) on top of the generic depot services for scalable data management. The logistical depot is extended by following the end-to-end principles and is modeled with a closed queuing network model. Its performance characteristics are studied by solving the steady-state distributions of the model using local balance equations. The modeling results confirm that the wide-area network is the performance bottleneck and that running concurrent jobs can increase resource utilization and system throughput. As a novel contribution, techniques to effectively support resource-demanding data-intensive applications using the fine-grained depot services are developed. These techniques include instruction-level scheduling of operations, dynamic co-scheduling of computation and replication, and adaptive workload control. Experiments in volume visualization have shown the effectiveness of these techniques. Due to the unique characteristics of data-intensive applications and our co-scheduling algorithm, a DHT is implemented on top of the basic storage and computation services. It demonstrates the potential of the Logistical Networking infrastructure to serve as a service creation platform.
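    The kind of bottleneck analysis this abstract describes can be illustrated with a standard product-form solution of a closed queuing network. The sketch below uses Buzen's convolution algorithm, a common alternative to solving the local balance equations directly, and the service demands are made-up illustrative numbers rather than measurements from the thesis.

```python
# Sketch: throughput of a closed queuing network via Buzen's convolution
# algorithm (product-form solution). The service demands below are made-up
# illustrative numbers, not measurements from the thesis.

def buzen_g(demands, n_jobs):
    """Normalization constants G(0..n_jobs) for single-server queues."""
    g = [1.0] + [0.0] * n_jobs           # G for an empty network
    for d in demands:                     # fold in one queue at a time
        for n in range(1, n_jobs + 1):
            g[n] += d * g[n - 1]          # G_m(n) = G_{m-1}(n) + D_m * G_m(n-1)
    return g

# Hypothetical service demands (seconds per job): depot disk, depot CPU, WAN.
demands = {"disk": 0.010, "cpu": 0.005, "wan": 0.050}

N = 8                                     # concurrent jobs circulating in the network
g = buzen_g(list(demands.values()), N)
throughput = g[N - 1] / g[N]              # X(N) = G(N-1) / G(N)

print(f"system throughput with {N} jobs: {throughput:.2f} jobs/s")
for name, d in demands.items():
    print(f"utilization of {name}: {d * throughput:.2%}")  # U_i = D_i * X(N)
```

    With these made-up numbers the WAN demand dominates, so its utilization approaches 100% as the job population grows, mirroring the abstract's conclusion that the wide-area network is the bottleneck and that running concurrent jobs raises utilization and throughput.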

    Gollach: configuration of a cluster-based Linux virtual server

    Includes bibliographical references. This thesis describes the Gollach cluster. The Gollach is an eight-machine computing cluster intended as a general-purpose computing resource for research, including image processing and simulations. The main goal of this project is to create a cluster server that provides increased computational power and a unified system image (at several levels) without requiring users to learn specialised tricks. At the same time, the cluster must not be burdensome to administer.

    Reducing Internet Latency: A Survey of Techniques and their Merit

    Bob Briscoe, Anna Brunstrom, Andreas Petlund, David Hayes, David Ros, Ing-Jyh Tsang, Stein Gjessing, Gorry Fairhurst, Carsten Griwodz, Michael Welzl. Peer reviewed. Preprint

    Improving Cloud Middlebox Infrastructure for Online Services

    Middleboxes are an indispensable part of datacenter networks, providing high availability, scalability and performance to online services. Using the load balancer as an example, this thesis shows that prevalent scale-out middlebox designs built on commodity servers are plagued by three fundamental problems: (1) server-based layer-4 middleboxes are costly and inflate round-trip time by as much as 2x because they process packets in software; (2) middlebox instances cause traffic detouring en route from sources to destinations, which inflates network bandwidth usage by as much as 3.2x and can cause transient congestion; (3) existing cloud providers do not offer layer-7 middleboxes as a service, and third-party proxy-based layer-7 middlebox designs exhibit poor availability because TCP state stored locally on a middlebox instance is lost when that instance fails. This thesis examines the root causes of these problems and proposes new cloud-scale middlebox design principles that systematically address all three. First, to address the performance problem, we make the key observation that existing commodity switches have idle resources that can implement key layer-4 middlebox functionality such as load balancing, and, by processing packets in hardware, switches offer low latency and high capacity at no additional cost. Motivated by this observation, we propose the design principle of using idle switch resources to accelerate middlebox functionality. To demonstrate the principle, we developed a complete L4 load balancer design that uses commodity switches for low cost and high performance, and carefully fuses a few software load balancer instances to provide high availability. Second, to address the network overhead caused by traffic detouring through middlebox instances, we propose to exploit locality and flexibility in placing middlebox instances and servers, handling traffic closer to its sources and reducing overall traffic and link utilization in the network. Third, to provide high availability in layer-7 middleboxes, we propose the novel design principle of decoupling TCP state from middlebox instances and storing it in a persistent key-value store, so that any middlebox instance can seamlessly take over any TCP connection when another instance fails. We demonstrate the effectiveness of these cloud-scale middlebox design principles using load balancers as an example. Specifically, we have prototyped the three design principles in three cloud-scale load balancers: Duet, Rubik, and Yoda, respectively. Our evaluation using a datacenter testbed and large-scale simulations shows that Duet lowers cost by 12x and latency overhead by 1000x, Rubik further lowers datacenter network traffic overhead by 3x, and Yoda, an L7 load-balancer-as-a-service, is practical: decoupling TCP state from load balancer instances has negligible overhead.
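    As a rough illustration of the third principle, not the actual Yoda implementation, the sketch below keeps per-connection state in a shared key-value store keyed by the TCP 5-tuple, so that any middlebox instance can resume a connection after another instance fails. A plain dictionary stands in for the persistent store, and the class and field names are invented for the example.

```python
# Sketch of decoupling per-connection TCP state from middlebox instances.
# A plain dict stands in for the shared, persistent key-value store; the
# state fields and class names are hypothetical, for illustration only.

from dataclasses import dataclass, asdict

# Shared key-value store: connection 5-tuple -> serialized connection state.
KV_STORE: dict[tuple, dict] = {}

@dataclass
class TcpState:
    seq: int           # toy stand-in for sequence tracking
    ack: int           # toy stand-in for ack tracking
    backend: str       # server this connection was proxied to

class MiddleboxInstance:
    def __init__(self, name: str):
        self.name = name

    def handle_packet(self, five_tuple: tuple, payload: bytes) -> None:
        # Look up (or create) connection state in the shared store rather
        # than in instance-local memory, so a failover loses nothing.
        state = KV_STORE.get(five_tuple)
        if state is None:
            state = asdict(TcpState(seq=0, ack=0, backend="10.0.0.1:80"))
        state["seq"] += len(payload)            # toy state update
        KV_STORE[five_tuple] = state            # write back after each packet
        print(f"{self.name} forwarded {len(payload)}B to {state['backend']}")

# Any instance can pick up the flow because the state lives in the store.
conn = ("1.2.3.4", 12345, "5.6.7.8", 443, "tcp")
MiddleboxInstance("mbox-1").handle_packet(conn, b"hello")
MiddleboxInstance("mbox-2").handle_packet(conn, b"world")   # takes over after mbox-1 fails
```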

    A Clouded Future: Analysis of Microsoft Windows Azure As a Platform for Hosting E-Science Applications

    Microsoft Windows Azure is Microsoft's cloud-based platform for hosting .NET applications. Azure provides a simple, cost-effective method for outsourcing application hosting. Windows Azure has caught the eye of researchers in e-science who require parallel computing infrastructures to process mountains of data. Windows Azure offers the same benefits to e-science as it does to other industries. This paper examines the technology behind Azure and analyzes two case studies of e-science projects built on the Windows Azure platform.

    Design and implementation of the clusterf load balancer for Docker clusters

    Docker uses the Linux container namespace and cgroup primitives to provide isolated application runtime environments, allowing the distribution of self-contained application images that can be run in the form of Docker containers. Container platforms provide the infrastructure needed to deploy horizontally scalable container services into the Cloud, including orchestration, networking, service discovery and load balancing. This thesis studies container networking architectures and existing Cloud load balancer implementations to create a design for a scalable load balancer using the Linux IPVS network-level load balancer implementation. The clusterf load balancer architecture uses a two-level load balancing scheme that combines different packet forwarding methods for scalability and compatibility with existing Docker applications. A distributed load balancer control plane is implemented to provide automatic load balancing for Docker containers. The clusterf load balancer is evaluated in a testbed environment, measuring the performance of the network-level Linux IPVS implementation. The scalability of the clusterf load balancer is tested using Equal-Cost Multi-Path (ECMP) routing and IPVS connection synchronization. The result is that the network-level Linux IPVS load balancer performs significantly better than the application-level HAProxy load balancer in the same configuration. The clusterf design allows for horizontal scaling with connection failover between IPVS load balancers. The current clusterf implementation requires the use of asymmetric routing within a network, as provided by local Ethernet networks. Extending the clusterf design to support deployment onto existing Cloud infrastructure platforms with different networking implementations would qualify the clusterf load balancer for use in general container platforms.
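    The ECMP scaling mentioned above relies on per-flow hashing: every packet of a connection hashes to the same next hop, so adding IPVS load balancer instances spreads flows across them without breaking established connections. The sketch below illustrates the idea with a hypothetical hash over the connection 5-tuple; it is not the Linux kernel's ECMP or IPVS scheduling code.

```python
# Sketch of per-flow ECMP-style hashing: all packets of a TCP connection
# map to the same load balancer / backend. Hypothetical example, not the
# Linux kernel's ECMP or IPVS scheduling code.

import hashlib

def pick_next_hop(five_tuple: tuple, next_hops: list[str]) -> str:
    """Hash the connection 5-tuple onto one of the equal-cost next hops."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]

ipvs_balancers = ["lb-1", "lb-2", "lb-3"]          # IPVS tier scaled out via ECMP

flow_a = ("192.0.2.10", 40001, "203.0.113.5", 80, "tcp")
flow_b = ("192.0.2.11", 40002, "203.0.113.5", 80, "tcp")

# Every packet of flow_a takes the same path, so its IPVS connection state
# stays on one balancer (with connection synchronization as the fallback).
print(pick_next_hop(flow_a, ipvs_balancers))
print(pick_next_hop(flow_b, ipvs_balancers))
```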