280,210 research outputs found

    Systematic review of distributed file system

    Get PDF
    Recently, with the rapid development of cloud and distributed systems (DS), the demand for file systems that work in a physically distributed environment has grown. A user may wish to make his actions contingent upon information from a remote site, or may wish to update remote information. Normally, the Universal Naming Convention (UNC) is used to share server-based resources, with a syntax like \\servername\sharedname, and most frequently such resources are spread across the organization. Sometimes the physical movement of a user requires his data to be accessible elsewhere, which may result in denial of file access. To avoid this problem, the data would have to be updated everywhere, but this is an operational nightmare. The power of distributed computing can clearly be seen in some of the most ubiquitous of modern applications: Internet search engines, which use massive amounts of distributed computing to discover and index as much of the Web as possible. There have been many projects focused on network computing that have designed and implemented distributed file systems (DFS) with different architectures and functionalities. The DFS is one of the most important and widely used forms of shared permanent storage. Its purpose is to allow users of physically distributed computers to share data and storage resources through a common file system that nevertheless runs as a single system. In this paper, a systematic literature review was carried out to develop a comprehensive nomenclature for describing distributed file system architectures and to use this nomenclature to review existing distributed file system implementations in very large-scale network computing systems such as Grids, search engines, etc. Based on this nomenclature, the features, advantages, and disadvantages of each DFS are outlined, enabling readers to select an appropriate one according to their needs. Regardless of the specific technical direction taken by distributed file systems in the next decade, there is little doubt that it will be an area of considerable ferment in industry and academia.
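    As an aside on the UNC convention mentioned above, a minimal Python sketch (standard library only; the server and share names are hypothetical) shows how a share-bound path ties data to a single physical location:

        from pathlib import PureWindowsPath

        # Hypothetical UNC path of the form \\servername\sharedname discussed in the review.
        share = PureWindowsPath(r"\\fileserver01\projects")
        document = share / "reports" / "q3.docx"

        print(document)        # \\fileserver01\projects\reports\q3.docx
        print(document.drive)  # \\fileserver01\projects -- the share acts as the "drive"

        # If the user moves somewhere \\fileserver01 is unreachable, every such path
        # breaks, which is the access problem a distributed file system removes.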

    Extending JXTA for P2P File Sharing Systems

    Get PDF
    File sharing is among the most important features of today’s Internet-based applications. Most such applications are server-based approaches and thus inherit the disadvantages of centralized systems. Advances in P2P systems make it possible to share huge quantities of data and files in a distributed way. In this paper, we present extensions of the JXTA protocols to support file sharing in P2P systems, with the aim of overcoming the limitations of server-mediated approaches. Our proposal is validated in practice by deploying a P2P file sharing system in a real P2P network. The empirical study revealed the benefits and drawbacks of using the JXTA protocols for P2P file sharing systems.
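    The extension builds on JXTA's advertisement and discovery mechanisms. The sketch below is a conceptual Python rendering of that publish/discover flow, not the real JXTA API (which is Java-based); all names here are invented for illustration:

        from dataclasses import dataclass

        @dataclass
        class ContentAdvertisement:
            """Hypothetical analogue of a JXTA content advertisement."""
            file_name: str
            file_hash: str
            peer_id: str

        class RendezvousIndex:
            """Stands in for the overlay's rendezvous/discovery service."""
            def __init__(self):
                self._ads = []

            def publish(self, ad: ContentAdvertisement) -> None:
                self._ads.append(ad)

            def discover(self, file_name: str) -> list:
                return [ad for ad in self._ads if ad.file_name == file_name]

        # A peer shares a file by publishing an advertisement rather than uploading it
        # to a central server; other peers discover the advertisement and fetch directly.
        index = RendezvousIndex()
        index.publish(ContentAdvertisement("dataset.csv", "sha256:ab12...", "peer-42"))
        print(index.discover("dataset.csv"))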

    Graffiti Networks: A Subversive, Internet-Scale File Sharing Model

    Full text link
    The proliferation of peer-to-peer (P2P) file sharing protocols is due to their efficient and scalable methods for data dissemination to numerous users. But many of these networks have no provisions to provide users with long-term access to files after the initial interest has diminished, nor are they able to guarantee protection for users from malicious clients that wish to implicate them in incriminating activities. As such, users may turn to supplementary measures for storing and transferring data in P2P systems. We present a new file sharing paradigm, called a Graffiti Network, which allows peers to harness the potentially unlimited storage of the Internet as a third-party intermediary. Our key contributions in this paper are (1) an overview of a distributed system based on this new threat model and (2) a measurement of its viability through a one-year deployment study using a popular web-publishing platform. The results of this experiment motivate a discussion about the challenges of mitigating this type of file sharing in a hostile network environment and how web site operators can protect their resources.
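    The storage mechanism described above can be sketched in a few lines of Python: split a file into chunks and store each chunk as a post on a third-party web-publishing platform. The endpoint, chunk size, and requests dependency are assumptions for illustration, not details from the paper:

        import base64
        import requests  # third-party HTTP client, assumed to be available

        CHUNK_SIZE = 64 * 1024                               # hypothetical chunk size
        POST_URL = "https://blog.example.org/api/new_post"   # hypothetical publishing endpoint

        def publish_file(path: str) -> None:
            """Store a file as a sequence of innocuous-looking posts."""
            with open(path, "rb") as f:
                seq = 0
                while True:
                    chunk = f.read(CHUNK_SIZE)
                    if not chunk:
                        break
                    body = base64.b64encode(chunk).decode("ascii")
                    # Each post carries a sequence number so the file can be reassembled later.
                    requests.post(POST_URL, data={"title": f"chunk-{seq}", "content": body})
                    seq += 1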

    Cloud Computing-Based Collaborative Network Security Management System Using Botnet

    Get PDF
    Nowadays, due to the increase in Internet users, balancing traffic in the network has become a serious issue. Internet security is another important factor, in terms of Internet worms, spam, phishing attacks, and botnets. A botnet is a set of Internet-connected programs that communicate with other similar programs to perform malicious activities, such as mounting a Distributed Denial of Service (DDoS) attack on a victim computer or network and making that machine or network resource unavailable to its users. The botmaster hides itself behind a server called the Command & Control (C&C) server, gives commands to the C&C server to attack victim hosts, and spreads bots in the network. In this paper, we design a cloud computing-based collaborative network security management system that balances the load in the network and checks every file transferred in the cloud for bots. If a file contains a bot, the folder in which that file is saved is deleted from that client. In this way, we design a system to protect the cloud from botnets and prevent botnet attacks.
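    A minimal Python sketch of the file-checking step (the signature list and the quarantine-by-deleting-the-folder policy follow the description above; the hashes are placeholders, and a real deployment would use a proper malware scanner):

        import hashlib
        import shutil
        from pathlib import Path

        # Hypothetical set of known bot binaries, identified by SHA-256 digest.
        KNOWN_BOT_HASHES = {"9f2b...", "a31c..."}

        def sha256_of(path: Path) -> str:
            digest = hashlib.sha256()
            with path.open("rb") as f:
                for block in iter(lambda: f.read(65536), b""):
                    digest.update(block)
            return digest.hexdigest()

        def check_transfer(path: Path) -> bool:
            """Return True if the transferred file is clean; otherwise delete its folder."""
            if sha256_of(path) in KNOWN_BOT_HASHES:
                shutil.rmtree(path.parent)   # the paper removes the containing folder
                return False
            return True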

    Cloud Storage Performance and Security Analysis with Hadoop and GridFTP

    Get PDF
    Even though cloud servers have been around for a few years, most web hosts today have not yet converted to the cloud. If the purpose of a cloud server is to distribute and store files on the Internet, FTP servers did this long before the cloud, and an FTP server is sufficient to distribute content on the Internet. Is it therefore worth shifting from an FTP server to a cloud server? Cloud storage providers declare high durability and availability for their users, and the ability to scale up to more storage space easily could save users a great deal of money. However, do they provide higher performance and better security features? Hadoop is a very popular platform for cloud computing. It is free software under the Apache License, is written in Java, and supports large-scale data processing in a distributed environment. Characteristics of Hadoop include partitioning of data, computing across thousands of hosts, and executing application computations in parallel. The Hadoop Distributed File System (HDFS) allows rapid data transfer at scales up to thousands of terabytes and is capable of operating even in the case of node failure. GridFTP supports high-speed data transfer for wide-area networks. It is based on FTP and features multiple data channels for parallel transfers. This report describes the technology behind HDFS and the enhancement of Hadoop's security features with Kerberos. Based on the data transfer performance and security features of HDFS and a GridFTP server, we can decide whether we should replace the GridFTP server with HDFS. According to our experimental results, we conclude that the GridFTP server provides better throughput than HDFS, and that Kerberos has minimal impact on HDFS performance. We propose a solution in which users authenticate with HDFS first and then transfer the file from the HDFS server to the client using GridFTP.
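    The proposed workflow can be sketched in Python by driving the standard command-line tools (this assumes a Kerberos-enabled HDFS client and the Globus globus-url-copy utility are installed; the principal, hosts, and paths are placeholders):

        import subprocess

        def fetch_via_hdfs_then_gridftp(principal, hdfs_path, staging_path, gridftp_dest):
            # 1. Authenticate with Kerberos so the HDFS client can reach the secured cluster.
            subprocess.run(["kinit", principal], check=True)

            # 2. Pull the file out of HDFS onto a staging location.
            subprocess.run(["hdfs", "dfs", "-copyToLocal", hdfs_path, staging_path], check=True)

            # 3. Ship it to the client over GridFTP, using parallel data channels (-p 4).
            subprocess.run(["globus-url-copy", "-p", "4",
                            f"file://{staging_path}", gridftp_dest], check=True)

        fetch_via_hdfs_then_gridftp("alice@EXAMPLE.ORG", "/data/report.bin",
                                    "/tmp/report.bin",
                                    "gsiftp://client.example.org/home/alice/report.bin")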

    Distributive Join Strategy Based on Tuple Inversion

    Get PDF
    In this paper, we propose a new direction for distributive join operations. We assume that there will be a scalable distributed computer system in which many computers (processors) are connected through a communication network that can be in a LAN or part of the Internet with sufficient bandwidth. A relational database is then distributed across this network of processors. However, in our approach, the distribution of the database is very fine-grained and is based on the Distributed Hash Table (DHT) concept. A tuple of a table is assigned to a specific processor by applying a fair hash function to its key value. For each joinable attribute, an inverted file list is further generated and distributed, again based on the DHT. This pre-distribution is done when the tuple enters the system and therefore does not require any distribution of data tuples on the fly when the join is executed. When a join operation request is broadcast, each processor performs a local join and the results are sent back to a query processor which, in turn, merges the join results and returns them to the user. Note that the distribution of the DHT of the inverted file lists can be either pre-processed or done on the fly. If the lists are pre-processed and distributed, they have to be maintained. We evaluate our approach by comparing it empirically to two other approaches: the naive join method and the fully distributed join method. The results show significantly higher performance for our method over a wide range of possible parameters.
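    A simplified Python sketch of the tuple-inversion idea (the number of processors and the table layouts are illustrative, and network communication is replaced by in-memory dictionaries):

        from collections import defaultdict

        NUM_PROCESSORS = 4

        def node_for(value):
            """DHT-style placement: a fair hash of the value picks the owning processor."""
            return hash(value) % NUM_PROCESSORS

        # Per-processor inverted lists for one joinable attribute:
        # join_value -> list of (table_name, tuple_key)
        inverted = [defaultdict(list) for _ in range(NUM_PROCESSORS)]

        def insert_tuple(table, key, join_value):
            # Done when the tuple enters the system, so no redistribution at join time.
            inverted[node_for(join_value)][join_value].append((table, key))

        def distributed_join(left, right):
            """Each processor joins locally; a query processor merges the partial results."""
            results = []
            for proc in inverted:                  # conceptually runs in parallel, one per node
                for value, postings in proc.items():
                    lkeys = [k for t, k in postings if t == left]
                    rkeys = [k for t, k in postings if t == right]
                    results.extend((lk, rk, value) for lk in lkeys for rk in rkeys)
            return results

        insert_tuple("orders", 1, "cust-7")
        insert_tuple("customers", "cust-7", "cust-7")
        print(distributed_join("orders", "customers"))   # [(1, 'cust-7', 'cust-7')]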

    A Survey on Internet Traffic Measurement and Analysis

    Get PDF
    As the number of Internet users grows rapidly, Internet traffic also increases. In computer networks, traffic measurement is the process of measuring the amount and type of traffic on a particular network. Internet traffic measurement and analysis are mostly used to characterize and analyse network usage and user behaviour, but they face a scalability problem under the explosive growth of Internet traffic and high-speed access. It is not easy to handle terabytes or petabytes of traffic data with a single server, because such a large dataset requires matching computing and storage resources. Multiple tools are available to analyse this traffic, but they do not perform well as the traffic data size increases, and as the data grow it becomes necessary to expand the infrastructure that processes them. A distributed file system can be used for this purpose, but it has limitations with respect to scalability, availability, and fault tolerance. Hadoop is a popular parallel processing framework that is widely used for working with large datasets; it is an open-source distributed computing platform with MapReduce for distributed processing and HDFS to store huge amounts of data. In future work we will present a Hadoop-based traffic monitoring system that performs multiple types of analysis on large amounts of Internet traffic in a scalable manner. Keywords: traffic monitoring, Hadoop, MapReduce, HDFS, NetFlow
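    As a concrete illustration of the Hadoop/MapReduce approach, the pair of Hadoop Streaming scripts below (shown here as two Python functions) sums bytes per source IP over flow records; the simple src_ip,dst_ip,bytes CSV input format is an assumption, not the paper's actual schema:

        import sys

        def mapper():
            """mapper.py: emit 'src_ip<TAB>bytes' for each flow record read from stdin."""
            for line in sys.stdin:
                try:
                    src_ip, _dst_ip, nbytes = line.strip().split(",")
                except ValueError:
                    continue                      # skip malformed records
                print(f"{src_ip}\t{nbytes}")

        def reducer():
            """reducer.py: sum bytes per source IP (Hadoop sorts mapper output by key)."""
            current, total = None, 0
            for line in sys.stdin:
                key, nbytes = line.rstrip("\n").split("\t")
                if key != current and current is not None:
                    print(f"{current}\t{total}")
                    total = 0
                current = key
                total += int(nbytes)
            if current is not None:
                print(f"{current}\t{total}")

    Submitted with the streaming jar shipped with Hadoop, along the lines of hadoop jar hadoop-streaming.jar -input flows/ -output bytes-per-ip -mapper mapper.py -reducer reducer.py (jar path and directories are placeholders), this spreads the per-IP aggregation across the cluster.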

    Enhanced Document Search and Sharing Tool in Organized P2p Frameworks

    Get PDF
    P2P file sharing systems generate a large share of Internet traffic. In these systems, file querying is an important functionality and a key indicator of the performance of the P2P system. To improve file query performance, peers with common interests can be clustered based on physical proximity. Existing methods are dedicated only to unstructured P2P systems and lack a strict policy for topology construction, which decreases file location efficiency. This project proposes a proximity-aware, interest-clustered P2P file sharing system implemented on a structured P2P system. It forms clusters based on node proximity and further groups nodes that share common interests into sub-clusters. A DHT-based lookup function and a file replication algorithm support efficient file lookup and access. However, file querying may become inefficient when a sub-interest supernode is overloaded or fails; thus, although sub-interest-based file querying improves querying efficiency, it is still not sufficiently scalable when there is a very large number of nodes in a sub-interest group. We therefore propose a distributed intra-sub-cluster file querying method to further reduce overhead and file searching delay and improve file querying efficiency.
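    A toy Python sketch of the two-level grouping described above, where proximity is approximated by round-trip times to fixed landmark nodes and interest by a hash of the node's dominant content category (both are simplifications of the paper's scheme, and the landmark names are invented):

        import hashlib

        LANDMARKS = ["lm-eu", "lm-us", "lm-asia"]      # hypothetical landmark nodes

        def proximity_cluster(rtts):
            """Assign a node to the cluster of its closest landmark."""
            return min(LANDMARKS, key=lambda lm: rtts[lm])

        def interest_subcluster(interest, buckets=16):
            """Consistent bucket for a common-interest group inside a proximity cluster."""
            digest = hashlib.sha1(interest.encode()).hexdigest()
            return int(digest, 16) % buckets

        node_rtts = {"lm-eu": 18.5, "lm-us": 95.0, "lm-asia": 210.0}   # milliseconds
        cluster = proximity_cluster(node_rtts)
        sub = interest_subcluster("movies")
        print(f"node joins cluster {cluster}, interest sub-cluster {sub}")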

    A Hybrid Web Caching Design Model for Internet-Content Delivery

    Get PDF
    The need to share and distribute online content (or resources) across large and sophisticated networks of users, the geographically dispersed locations of servers and their clients, and the time taken to fulfil client requests pose major challenges. The choice of a suitable architecture for Internet-based content delivery (ICD) technologies therefore readily comes to mind. To achieve this, the Akamai and Gnutella web technologies are extensively reviewed to identify their strengths and weaknesses, given their worldwide popularity for delivering content. The new design for Internet-based content distribution is called AkaGnu because of the extra layer (a Gnutella network) inserted into the Akamai architecture, which provides a greater Internet edge than either technology deployed independently. The paper presents a new ICD technology that performs better than the Akamai system as a result of new features and behaviours that reduce network traffic, improve client Internet connectivity, increase file sharing, improve the speed of content delivery, and enhance network security. Keywords/Index Terms: ICD, Akamai, Gnutella, peer-to-peer, AkaGnu, network traffic, security, architecture, technology
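    A conceptual Python sketch of the lookup order implied by the AkaGnu layering: try the Akamai-style edge cache first and fall back to Gnutella-style peer querying (the cache and peer interfaces are invented placeholders, not the paper's implementation):

        def fetch(content_id, edge_cache, peers):
            """Serve from the CDN edge cache when possible, otherwise ask the peers."""
            if content_id in edge_cache:              # CDN hit: lowest latency
                return edge_cache[content_id]
            for peer in peers:                        # simplified Gnutella-style query
                data = peer.get(content_id)
                if data is not None:
                    edge_cache[content_id] = data     # warm the edge for the next request
                    return data
            return None                               # not found anywhere

        # Peers can be anything exposing .get(content_id); plain dicts suffice for the sketch.
        peers = [{"video-1": b"..."}, {}]
        print(fetch("video-1", edge_cache={}, peers=peers))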

    Building global and scalable systems with atomic multicast

    Get PDF
    The rise of worldwide Internet-scale services demands large distributed systems. Indeed, when handling several million users, it is common to operate thousands of servers spread across the globe. Here, replication plays a central role, as it contributes to improving the user experience by hiding failures and by providing acceptable latency. In this thesis, we claim that atomic multicast, with strong and well-defined properties, is the appropriate abstraction to efficiently design and implement globally scalable distributed systems. Internet-scale services rely on data partitioning and replication to provide scalable performance and high availability. Moreover, to reduce user-perceived response times and tolerate disasters (i.e., the failure of a whole datacenter), services are increasingly becoming geographically distributed. Data partitioning and replication, combined with local and geographical distribution, introduce daunting challenges, including the need to carefully order requests among replicas and partitions. One way to tackle this problem is to use group communication primitives that encapsulate order requirements. While replication is a common technique used to design such reliable distributed systems, to cope with the requirements of modern cloud-based "always-on" applications, replication protocols must additionally allow for throughput scalability and dynamic reconfiguration, that is, on-demand replacement or provisioning of system resources. We propose a dynamic atomic multicast protocol which fulfills these requirements. It allows resources to be dynamically added to and removed from an online replicated state machine and crashed processes to be recovered. Major efforts have been spent in recent years to improve the performance, scalability and reliability of distributed systems. In order to hide the complexity of designing distributed applications, many proposals provide efficient high-level communication abstractions. Since the implementation of a production-ready system based on such an abstraction is still a major task, we further propose to expose our protocol to developers in the form of distributed data structures. B-trees, for example, are commonly used in different kinds of applications, including database indexes and file systems. Providing a distributed, fault-tolerant and scalable data structure would help developers to integrate their applications in a distribution-transparent manner. This work describes how to build reliable and scalable distributed systems based on atomic multicast and demonstrates their capabilities by an implementation of a distributed ordered map that supports dynamic re-partitioning and fast recovery. To substantiate our claim, we ported an existing SQL database atop our distributed lock-free data structure.
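    A minimal Python sketch of the core idea: replicas apply operations in the total order delivered by atomic multicast, so every replica of the ordered map converges to the same state. The multicast layer itself is stubbed out here as a shared, totally ordered log, and the data structure is a simple sorted list rather than the thesis's B-tree-style map:

        import bisect

        class OrderedMapReplica:
            """One replica of the ordered map; state changes only via delivered operations."""
            def __init__(self):
                self._keys, self._values = [], []

            def deliver(self, op, key, value=None):
                # Called by the atomic multicast layer, in the same total order at every replica.
                i = bisect.bisect_left(self._keys, key)
                if op == "put":
                    if i < len(self._keys) and self._keys[i] == key:
                        self._values[i] = value
                    else:
                        self._keys.insert(i, key)
                        self._values.insert(i, value)
                elif op == "get":
                    return self._values[i] if i < len(self._keys) and self._keys[i] == key else None

        # Stand-in for atomic multicast: one totally ordered log applied to every replica.
        log = [("put", "b", 2), ("put", "a", 1), ("put", "b", 3)]
        replicas = [OrderedMapReplica() for _ in range(3)]
        for msg in log:
            for r in replicas:
                r.deliver(*msg)
        assert all(r.deliver("get", "b") == 3 for r in replicas)   # all replicas agree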