
    TECHNIQUES FOR FILE SYNCHRONISATION: A SURVEY

    Abstract: We are living in the information age. Data on the order of terabytes, exabytes and zettabytes of content is transported from one device to another. The bad news is that the network bandwidth provided by internet providers is limited, so moving all files between different computer systems would take a long time. The immediate solution is to synchronize files within the clusters of a computer system in a better way, one that consumes less bandwidth and gives a faster file synchronization operation. The main problem of file synchronization is to transfer media files in a bandwidth-efficient manner. Apart from this, files need to be transferred in a short time, so that the client does not have to wait long for the transfer to complete. There is very wide research on this topic in Rsync [1
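
    The delta-transfer idea that rsync popularized can be illustrated in a few lines. The Python sketch below is a minimal illustration under assumed simplifications: fixed-size blocks, MD5 block signatures, and a brute-force byte-at-a-time slide instead of a rolling checksum; make_signatures, make_delta and apply_delta are invented names, not rsync's API.

        # Minimal sketch of rsync-style delta transfer (assumptions above).
        import hashlib

        BLOCK = 4  # tiny block size so the example is easy to trace

        def make_signatures(old):
            """Receiver side: signature of every block of its old copy."""
            return {hashlib.md5(old[i:i + BLOCK]).hexdigest(): i
                    for i in range(0, len(old), BLOCK)}

        def make_delta(new, sigs):
            """Sender side: ('copy', offset) for blocks the receiver already
            has, ('data', one literal byte) where nothing aligns."""
            delta, i = [], 0
            while i < len(new):
                key = hashlib.md5(new[i:i + BLOCK]).hexdigest()
                if key in sigs:
                    delta.append(('copy', sigs[key]))
                    i += BLOCK
                else:
                    delta.append(('data', new[i:i + 1]))
                    i += 1     # real rsync slides with a rolling weak checksum
            return delta

        def apply_delta(old, delta):
            """Receiver side: rebuild the new file from old blocks + literals."""
            out = bytearray()
            for kind, val in delta:
                out += old[val:val + BLOCK] if kind == 'copy' else val
            return bytes(out)

        old = b"the quick brown fox jumps"
        new = b"the quick brown cat jumps"
        assert apply_delta(old, make_delta(new, make_signatures(old))) == new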

    Adaptive data synchronization algorithm for IoT-oriented low-power wide-area networks

    The Internet of Things (IoT) is by now very close to being realized, leading the world towards a new technological era in which people’s lives and habits will be permanently changed. Furthermore, the incoming 5G technology promises significant enhancements to the Quality of Service (QoS) in mobile communications. Having billions of devices simultaneously connected opens new challenges in network management and data exchange rules, which need to be tailored to the characteristics of the scenario at hand. A large part of the IoT market is turning to Low-Power Wide-Area Networks (LPWANs) as the infrastructure for applications that treat energy saving as a mandatory goal alongside other aspects of QoS. In this context, we propose a low-power, IoT-oriented file synchronization protocol that, by dynamically optimizing the amount of data to be transferred, limits the device's interaction with the network and therefore extends battery life. The protocol can be adopted with different Layer 2 technologies and provides energy savings at the IoT device level that can be exploited by different applications.
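
    The abstract does not spell out the optimization, but the trade-off it describes can be sketched: keep the radio off unless a transfer pays for itself, and send a block-level delta instead of the full file when the delta is smaller. The Python below is a hedged illustration, not the paper's protocol; BLOCK, ENERGY_PER_BYTE_UJ and both function names are invented for the example.

        # Hedged illustration of adaptive sync for an energy-constrained device.
        import hashlib

        BLOCK = 64                    # bytes per block (assumed)
        ENERGY_PER_BYTE_UJ = 1.2      # assumed radio cost, microjoules per byte

        def changed_blocks(local, remote_digests):
            """Indices of local blocks whose hash differs from the server copy."""
            dirty = []
            for j, i in enumerate(range(0, len(local), BLOCK)):
                digest = hashlib.sha256(local[i:i + BLOCK]).hexdigest()
                if j >= len(remote_digests) or remote_digests[j] != digest:
                    dirty.append(j)
            return dirty

        def plan_sync(local, remote_digests):
            """Pick the cheaper transfer plan and estimate its energy cost."""
            dirty = changed_blocks(local, remote_digests)
            if not dirty:
                return ('skip', 0.0)          # nothing changed: radio stays off
            delta_bytes = len(dirty) * BLOCK
            if delta_bytes < len(local):      # delta wins only if it is smaller
                return ('delta', delta_bytes * ENERGY_PER_BYTE_UJ)
            return ('full', len(local) * ENERGY_PER_BYTE_UJ)

        # Only the middle of three 64-byte blocks changed, so the delta wins.
        local = bytes(192)
        remote = [hashlib.sha256(local[i:i + BLOCK]).hexdigest()
                  for i in range(0, 192, BLOCK)]
        local = local[:64] + b"\x01" * 64 + local[128:]
        print(plan_sync(local, remote))       # ('delta', 76.8)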

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research.
    Comment: 46 pages, 16 figures, Technical Report

    The future of networking is the future of Big Data

    Scientific domains such as Climate Science, High Energy Particle Physics (HEP), Genomics, Biology, and many others are increasingly moving towards data-oriented workflows, where each of these communities generates, stores and uses massive datasets that reach into terabytes and petabytes and are projected soon to reach exabytes. These communities are also increasingly moving towards a global collaborative model in which scientists routinely exchange significant amounts of data. The sheer volume of data, and the complexities of maintaining, transferring, and using it, continue to push the limits of current technologies in multiple dimensions: storage, analysis, networking, and security. This thesis tackles the networking aspect of big-data science. Networking is the glue that binds all the components of modern scientific workflows, and these communities are becoming increasingly dependent on high-speed, highly reliable networks. The network, as the common layer across big-science communities, provides an ideal place for implementing common services. Big-science applications also need to work closely with the network to ensure optimal usage of resources and intelligent routing of requests and data. Finally, as more communities move towards data-intensive, connected workflows, adopting a service model in which the network provides some of the common services reduces not only application complexity but also the need for duplicate implementations. Named Data Networking (NDN) is a new network architecture whose service model aligns better with the needs of these data-oriented applications. NDN's name-based paradigm makes it easier to provide intelligent features at the network layer rather than at the application layer. This thesis shows that NDN can push several standard features into the network. This work is the first attempt to apply NDN in the context of large scientific data; in the process, the thesis touches upon scientific data naming, name discovery, real-world deployment of NDN for scientific data, feasibility studies, and the design of in-network protocols for big-data science.
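
    As a concrete illustration of the name-based paradigm: NDN consumers request data by hierarchical name rather than by host address, so any router or cache holding the named segment can satisfy the request. The Python toy below invents a climate-style naming scheme; it is not the thesis's actual schema, and a plain dictionary stands in for a router's content store.

        # Toy sketch of NDN-style named data retrieval (names are invented).
        content_store = {}   # a router's cache, keyed by full hierarchical name

        def ndn_name(domain, variable, year, segment):
            """Build a hierarchical name for one segment of a dataset."""
            return "/{}/{}/{}/seg={}".format(domain, variable, year, segment)

        def publish(name, data):
            content_store[name] = data

        def express_interest(name):
            """Exact-match lookup stands in for NDN's real forwarding logic."""
            return content_store.get(name)

        publish(ndn_name("climate", "surface-temp", 2019, 0), b"...payload...")
        assert express_interest("/climate/surface-temp/2019/seg=0") == b"...payload..."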

    Prioritized data synchronization with applications

    We are interested in the problem of synchronizing data on two distinct devices with differing priorities using minimum communication. A variety of distributed systems require communication-efficient, prioritized synchronization, for example where bandwidth is limited or certain information is more time-sensitive than other information. Our particular approach, P-CPI, involving the interactive synchronization of prioritized data, is efficient in both communication and computation. The protocol offers some desirable features, including (i) communication and computational complexity primarily tied to the number of differences between the hosts rather than to the overall amount of data, and (ii) a memoryless fast restart after interruption. We provide a novel analysis of this protocol, with a proven high-probability performance bound and fast restart in logarithmic time. We also provide an empirical model for predicting the probability of complete synchronization as a function of time and symmetric differences. We then consider two applications of our core algorithm. The first is a string reconciliation protocol, for which we propose a novel algorithm with online time complexity that is linear in the size of the string. Our experimental results show that our string reconciliation protocol can potentially outperform existing synchronization tools such as rsync in some cases. We also look into the benefit our algorithm brings to delay-tolerant networks (DTNs). We propose an optimized DTN routing protocol with P-CPI implemented as middleware. As a proof of concept, we demonstrate improved delivery rate, reduced metadata overhead, and reduced average delay.
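
    For intuition about what prioritized synchronization buys, the Python sketch below ships the items one host lacks in descending priority order. It is a naive full-digest stand-in invented for illustration, not P-CPI: the paper's protocol is interactive and its communication scales with the number of differences rather than with the data size.

        # Naive prioritized push: most urgent missing items transfer first.
        import heapq

        def sync_prioritized(a, b):
            """a and b map item -> priority (higher = more urgent); push the
            items b lacks, most urgent first."""
            heap = [(-pri, item) for item, pri in a.items() if item not in b]
            heapq.heapify(heap)
            while heap:
                neg_pri, item = heapq.heappop(heap)
                b[item] = -neg_pri       # "transmit" the item with its priority

        a = {"alert": 9, "log-17": 1, "reading-42": 5}
        b = {"log-17": 1}
        sync_prioritized(a, b)           # sends alert, then reading-42
        assert b == a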

    Interprocess communication in highly distributed systems

    Issued as final technical report, Project no. G-36-632. The final technical report has the title: Interprocess communication in highly distributed systems.

    Group-oriented coordination models for distributed client-server computing

    This paper describes group-oriented control models for distributed client-server interactions. These models transparently coordinate requests for services that involve multiple servers, such as queries across distributed databases. Specific capabilities include: decomposing and replicating client requests; dispatching request subtasks or copies to independent, networked servers; and combining server results into a single response for the client. The control models were implemented by combining request broker and process group technologies with an object-oriented communication middleware tool. The models are illustrated in the context of a distributed operations support application for space-based systems.
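
    The decompose/dispatch/combine cycle described above is essentially a scatter-gather pattern. The Python below is a minimal, hedged illustration that uses a thread pool in place of a request broker and process groups; query_server and coordinated_query are invented names, not the paper's API.

        # Minimal scatter-gather sketch: split a request, fan out, merge results.
        from concurrent.futures import ThreadPoolExecutor

        def query_server(server, subquery):
            """Stand-in for a networked call to one database server."""
            return ["{}:{}".format(server, subquery)]

        def coordinated_query(servers, query):
            # Decompose: one subtask per server; dispatch them concurrently.
            with ThreadPoolExecutor(max_workers=len(servers)) as pool:
                parts = list(pool.map(lambda s: query_server(s, query), servers))
            # Combine: fold the partial results into a single client response.
            merged = []
            for part in parts:
                merged.extend(part)
            return merged

        print(coordinated_query(["db1", "db2", "db3"], "SELECT *"))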