4,252 research outputs found

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Full text link
    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research.Comment: 46 pages, 16 figures, Technical Repor

    The growing complexity of content delivery networks: Challenges and implications for the Internet ecosystem

    Get PDF
    Since the commercialization of the Internet, content and related applications, including video streaming, news, advertisements, and social interaction have moved online. It is broadly recognized that the rise of all of these different types of content (static and dynamic, and increasingly multimedia) has been one of the main forces behind the phenomenal growth of the Internet, and its emergence as essential infrastructure for how individuals across the globe gain access to the content sources they want. To accelerate the delivery of diverse content in the Internet and to provide commercial-grade performance for video delivery and the Web, Content Delivery Networks (CDNs) were introduced. This paper describes the current CDN ecosystem and the forces that have driven its evolution. We outline the different CDN architectures and consider their relative strengths and weaknesses. Our analysis highlights the role of location, the growing complexity of the CDN ecosystem, and their relationship to and implications for interconnection markets.EC/H2020/679158/EU/Resolving the Tussle in the Internet: Mapping, Architecture, and Policy Making/ResolutioNe

    Scalable and Reliable Middlebox Deployment

    Get PDF
    Middleboxes are pervasive in modern computer networks providing functionalities beyond mere packet forwarding. Load balancers, intrusion detection systems, and network address translators are typical examples of middleboxes. Despite their benefits, middleboxes come with several challenges with respect to their scalability and reliability. The goal of this thesis is to devise middlebox deployment solutions that are cost effective, scalable, and fault tolerant. The thesis includes three main contributions: First, distributed service function chaining with multiple instances of a middlebox deployed on different physical servers to optimize resource usage; Second, Constellation, a geo-distributed middlebox framework enabling a middlebox application to operate with high performance across wide area networks; Third, a fault tolerant service function chaining system

    Rethinking Distributed Caching Systems Design and Implementation

    Get PDF
    Distributed caching systems based on in-memory key-value stores have become a crucial aspect of fast and efficient content delivery in modern web-applications. However, due to the dynamic and skewed execution environments and workloads, under which such systems typically operate, several problems arise in the form of load imbalance. This thesis addresses the sources of load imbalance in caching systems, mainly: i) data placement, which relates to distribution of data items across servers and ii) data item access frequency, which describes amount of requests each server has to process, and how each server is able to cope with it. Thus, providing several strategies to overcome the sources of imbalance in isolation. As a use case, we analyse Memcached, its variants, and propose a novel solution for distributed caching systems. Our solution revolves around increasing parallelism through load segregation, and solutions to overcome the load discrepancies when reaching high saturation scenarios, mostly through access re-arrangement, and internal replication.Os sistemas de cache distribuídos baseados em armazenamento de pares chave-valor em RAM, tornaram-se um aspecto crucial em aplicações web modernas para o fornecimento rápido e eficiente de conteúdo. No entanto, estes sistemas normalmente estão sujeitos a ambientes muito dinâmicos e irregulares. Este tipo de ambientes e irregularidades, causa vários problemas, que emergem sob a forma de desequilíbrios de carga. Esta tese aborda as diferentes origens de desequilíbrio de carga em sistemas de caching distribuído, principalmente: i) colocação de dados, que se relaciona com a distribuição dos dados pelos servidores e a ii) frequência de acesso aos dados, que reflete a quantidade de pedidos que cada servidor deve processar e como cada servidor lida com a sua carga. Desta forma, demonstramos várias estratégias para reduzir o impacto proveniente das fontes de desequilíbrio, quando analizadas em isolamento. Como caso de uso, analisamos o sistema Memcached, as suas variantes, e propomos uma nova solução para sistemas de caching distribuídos. A nossa solução gira em torno de aumento de paralelismo atraves de segregação de carga e em como superar superar as discrepâncias de carga a quando de sistema entra em grande saturação, principalmente atraves de reorganização de acesso e de replicação intern

    Design of Overlay Networks for Internet Multicast - Doctoral Dissertation, August 2002

    Get PDF
    Multicast is an efficient transmission scheme for supporting group communication in networks. Contrasted with unicast, where multiple point-to-point connections must be used to support communications among a group of users, multicast is more efficient because each data packet is replicated in the network – at the branching points leading to distinguished destinations, thus reducing the transmission load on the data sources and traffic load on the network links. To implement multicast, networks need to incorporate new routing and forwarding mechanisms in addition to the existing are not adequately supported in the current networks. The IP multicast are not adequately supported in the current networks. The IP multicast solution has serious scaling and deployment limitations, and cannot be easily extended to provide more enhanced data services. Furthermore, and perhaps most importantly, IP multicast has ignored the economic nature of the problem, lacking incentives for service providers to deploy the service in wide area networks. Overlay multicast holds promise for the realization of large scale Internet multicast services. An overlay network is a virtual topology constructed on top of the Internet infrastructure. The concept of overlay networks enables multicast to be deployed as a service network rather than a network primitive mechanism, allowing deployment over heterogeneous networks without the need of universal network support. This dissertation addresses the network design aspects of overlay networks to provide scalable multicast services in the Internet. The resources and the network cost in the context of overlay networks are different from that in conventional networks, presenting new challenges and new problems to solve. Our design goal are the maximization of network utility and improved service quality. As the overall network design problem is extremely complex, we divide the problem into three components: the efficient management of session traffic (multicast routing), the provisioning of overlay network resources (bandwidth dimensioning) and overlay topology optimization (service placement). The combined solution provides a comprehensive procedure for planning and managing an overlay multicast network. We also consider a complementary form of overlay multicast called application-level multicast (ALMI). ALMI allows end systems to directly create an overlay multicast session among themselves. This gives applications the flexibility to communicate without relying on service provides. The tradeoff is that users do not have direct control on the topology and data paths taken by the session flows and will typically get lower quality of service due to the best effort nature of the Internet environment. ALMI is therefore suitable for sessions of small size or sessions where all members are well connected to the network. Furthermore, the ALMI framework allows us to experiment with application specific components such as data reliability, in order to identify a useful set of communication semantic for enhanced data services

    Anycast services and its applications

    Full text link
    Anycast in next generation Internet Protocol is a hot topic in the research of computer networks. It has promising potentials and also many challenges, such as architecture, routing, Quality-of-Service, anycast in ad hoc networks, application-layer anycast, etc. In this thesis, we tackle some important topics among them. The thesis at first presents an introduction about anycast, followed by the related work. Then, as our major contributions, a number of challenging issues are addressed in the following chapters. We tackled the anycast routing problem by proposing a requirement based probing algorithm at application layer for anycast routing. Compared with the existing periodical based probing routing algorithm, the proposed routing algorithm improves the performance in terms of delay. We addressed the reliable service problem by the design of a twin server model for the anycast servers, providing a transparent and reliable service for all anycast queries. We addressed the load balance problem of anycast servers by proposing new job deviation strategies, to provide a similar Quality-of-Service to all clients of anycast servers. We applied the mesh routing methodology in the anycast routing in ad hoc networking environment, which provides a reliable routing service and uses much less network resources. We combined the anycast protocol and the multicast protocol to provide a bidirectional service, and applied the service to Web-based database applications, achieving a better query efficiency and data synchronization. Finally, we proposed a new Internet based service, minicast, as the combination of the anycast and multicast protocols. Such a service has potential applications in information retrieval, parallel computing, cache queries, etc. We show that the minicast service consumes less network resources while providing the same services. The last chapter of the thesis presents the conclusions and discusses the future work

    Systems and Methods for Measuring and Improving End-User Application Performance on Mobile Devices

    Full text link
    In today's rapidly growing smartphone society, the time users are spending on their smartphones is continuing to grow and mobile applications are becoming the primary medium for providing services and content to users. With such fast paced growth in smart-phone usage, cellular carriers and internet service providers continuously upgrade their infrastructure to the latest technologies and expand their capacities to improve the performance and reliability of their network and to satisfy exploding user demand for mobile data. On the other side of the spectrum, content providers and e-commerce companies adopt the latest protocols and techniques to provide smooth and feature-rich user experiences on their applications. To ensure a good quality of experience, monitoring how applications perform on users' devices is necessary. Often, network and content providers lack such visibility into the end-user application performance. In this dissertation, we demonstrate that having visibility into the end-user perceived performance, through system design for efficient and coordinated active and passive measurements of end-user application and network performance, is crucial for detecting, diagnosing, and addressing performance problems on mobile devices. My dissertation consists of three projects to support this statement. First, to provide such continuous monitoring on smartphones with constrained resources that operate in such a highly dynamic mobile environment, we devise efficient, adaptive, and coordinated systems, as a platform, for active and passive measurements of end-user performance. Second, using this platform and other passive data collection techniques, we conduct an in-depth user trial of mobile multipath to understand how Multipath TCP (MPTCP) performs in practice. Our measurement study reveals several limitations of MPTCP. Based on the insights gained from our measurement study, we propose two different schemes to address the identified limitations of MPTCP. Last, we show how to provide visibility into the end- user application performance for internet providers and in particular home WiFi routers by passively monitoring users' traffic and utilizing per-app models mapping various network quality of service (QoS) metrics to the application performance.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/146014/1/ashnik_1.pd

    Performance analysis of a database caching system in a grid environment

    Get PDF
    Tese de mestrado. Engenharia Informática. Faculdade de Engenharia. Universidade do Porto. 200

    SCADA and related technologies for irrigation district modernization

    Get PDF
    Presented at SCADA and related technologies for irrigation district modernization: a USCID water management conference on October 26-29, 2005 in Vancouver, Washington.Includes bibliographical references.Overview of Supervisory Control and Data Acquisition (SCADA) -- Total Channel Control™ - The value of automation in irrigation distribution systems -- Design and implementation of an irrigation canal SCADA -- All American Canal Monitoring Project -- Taking closed piping flowmeters to the next level - new technologies support trends in data logging and SCADA systems -- Real-time model-based dam automation: a case study of the Piute Dam -- Effective implementation of algorithm theory into PLCs -- Optimal fuzzy control for canal control structures -- SCADA over Zigbee™ -- Synchronous radio modem technology for affordable irrigation SCADA systems -- A suggested criteria for the selection of RTUs and sensors -- Irrigation canals in Spain: the integral process of modernization -- Ten years of SCADA data quality control and utilization for system management and planning modernization -- Moderately priced SCADA implementation -- Increasing peak power generation using SCADA and automation: a case study of the Kaweah River Power Authority -- Eastern Irrigation District canal automation and Supervisory Control and Data Acquisition (SCADA) -- Case study on design and construction of a regulating reservoir pumping station -- Saving water with Total Channel Control® in the Macalister Irrigation District, Australia -- Leveraging SCADA to modernize operations in the Klamath Irrigation Project -- A 2005 update on the installation of a VFD/SCADA system at Sutter Mutual Water Company -- Truckee Carson Irrigation District Turnout Water Measurement Program -- The myth of a "Turnkey" SCADA system and other lessons learned -- Canal modernization in Central California Irrigation District - case study -- Remote monitoring and operation at the Colorado River Irrigation District -- Web-based GIS decision support system for irrigation districts -- Using RiverWare as a real time river systems management tool -- Submerged venturi flume -- Ochoco Irrigation District telemetry case study -- Uinta Basin Replacement Project: a SCADA case study in managing multiple interests and adapting to loss of storage -- Training SCADA operators with real-time simulation -- Demonstration of gate control with SCADA system in Lower Rio Grande Valley, in Texas -- Incorporating sharp-crested weirs into irrigation SCADA systems

    Future of networking is the future of Big Data, The

    Get PDF
    2019 Summer.Includes bibliographical references.Scientific domains such as Climate Science, High Energy Particle Physics (HEP), Genomics, Biology, and many others are increasingly moving towards data-oriented workflows where each of these communities generates, stores and uses massive datasets that reach into terabytes and petabytes, and projected soon to reach exabytes. These communities are also increasingly moving towards a global collaborative model where scientists routinely exchange a significant amount of data. The sheer volume of data and associated complexities associated with maintaining, transferring, and using them, continue to push the limits of the current technologies in multiple dimensions - storage, analysis, networking, and security. This thesis tackles the networking aspect of big-data science. Networking is the glue that binds all the components of modern scientific workflows, and these communities are becoming increasingly dependent on high-speed, highly reliable networks. The network, as the common layer across big-science communities, provides an ideal place for implementing common services. Big-science applications also need to work closely with the network to ensure optimal usage of resources, intelligent routing of requests, and data. Finally, as more communities move towards data-intensive, connected workflows - adopting a service model where the network provides some of the common services reduces not only application complexity but also the necessity of duplicate implementations. Named Data Networking (NDN) is a new network architecture whose service model aligns better with the needs of these data-oriented applications. NDN's name based paradigm makes it easier to provide intelligent features at the network layer rather than at the application layer. This thesis shows that NDN can push several standard features to the network. This work is the first attempt to apply NDN in the context of large scientific data; in the process, this thesis touches upon scientific data naming, name discovery, real-world deployment of NDN for scientific data, feasibility studies, and the designs of in-network protocols for big-data science
    corecore