
    Design of Overlay Networks for Internet Multicast - Doctoral Dissertation, August 2002

    Multicast is an efficient transmission scheme for supporting group communication in networks. In contrast to unicast, where multiple point-to-point connections must be used to support communication among a group of users, multicast is more efficient because each data packet is replicated in the network, at the branching points leading to distinct destinations, thus reducing the transmission load on the data sources and the traffic load on the network links. To implement multicast, networks need to incorporate new routing and forwarding mechanisms in addition to the existing ones, and these mechanisms are not adequately supported in current networks. The IP multicast solution has serious scaling and deployment limitations, and cannot be easily extended to provide enhanced data services. Furthermore, and perhaps most importantly, IP multicast has ignored the economic nature of the problem, lacking incentives for service providers to deploy the service in wide area networks. Overlay multicast holds promise for the realization of large-scale Internet multicast services. An overlay network is a virtual topology constructed on top of the Internet infrastructure. The concept of overlay networks enables multicast to be deployed as a service network rather than as a network primitive mechanism, allowing deployment over heterogeneous networks without the need for universal network support. This dissertation addresses the network design aspects of overlay networks to provide scalable multicast services in the Internet. The resources and the network cost in the context of overlay networks are different from those in conventional networks, presenting new challenges and new problems to solve. Our design goals are the maximization of network utility and improved service quality. 
As the overall network design problem is extremely complex, we divide the problem into three components: the efficient management of session traffic (multicast routing), the provisioning of overlay network resources (bandwidth dimensioning) and overlay topology optimization (service placement). The combined solution provides a comprehensive procedure for planning and managing an overlay multicast network. We also consider a complementary form of overlay multicast called application-level multicast (ALMI). ALMI allows end systems to directly create an overlay multicast session among themselves. This gives applications the flexibility to communicate without relying on service providers. The tradeoff is that users do not have direct control over the topology and data paths taken by the session flows and will typically get lower quality of service due to the best-effort nature of the Internet environment. ALMI is therefore suitable for sessions of small size or sessions where all members are well connected to the network. Furthermore, the ALMI framework allows us to experiment with application-specific components such as data reliability, in order to identify a useful set of communication semantics for enhanced data services.

    The Ultralight project: the network as an integrated and managed resource for data-intensive science

    Examines the UltraLight project, which treats the network interconnecting globally distributed data sets as a dynamic, configurable, and closely monitored resource in order to construct a next-generation system that can meet the high-energy physics community's data-processing, distribution, access, and analysis needs.

    Climbing Up Cloud Nine: Performance Enhancement Techniques for Cloud Computing Environments

    With the transformation of cloud computing technologies from an attractive trend to a business reality, the need is more pressing than ever for efficient cloud service management tools and techniques. As cloud technologies continue to mature, the service model, resource allocation methodologies, energy efficiency models and general service management schemes are not yet saturated. The burden of making this all work falls on cloud providers. Economies of scale and the ability to leverage existing infrastructure and a large workforce are advantages, but operation is far from straightforward from that point. Performance and service delivery will still depend on the providers' algorithms and policies, which affect all operational areas. With that in mind, this thesis tackles a set of the more critical challenges faced by cloud providers with the purpose of enhancing cloud service performance and reducing providers' costs. This is done by exploring innovative resource allocation techniques and developing novel tools and methodologies in the context of cloud resource management, power efficiency, high availability and solution evaluation. Optimal and suboptimal solutions to the resource allocation problem in cloud data centers from both the computational and the network sides are proposed. Next, a deep dive into the energy efficiency challenge in cloud data centers is presented. Consolidation-based and non-consolidation-based solutions containing a novel dynamic virtual machine idleness prediction technique are proposed and evaluated. An investigation of the problem of simulating cloud environments follows. Available simulation solutions are comprehensively evaluated and a novel design framework for cloud simulators covering multiple variations of the problem is presented. Moreover, the challenge of evaluating the performance of cloud resource management solutions in terms of high availability is addressed. 
An extensive framework is introduced to design high availability-aware cloud simulators, and a prominent cloud simulator (GreenCloud) is extended to implement it. Finally, the evaluation of real cloud application scenarios is demonstrated using the new tool. The primary argument made in this thesis is that the proposed resource allocation and simulation techniques can serve as a basis for effective solutions that mitigate the performance and cost challenges faced by cloud providers pertaining to resource utilization, energy efficiency, and client satisfaction.

    SDSF: social-networking trust based distributed data storage and co-operative information fusion

    As of 2014, about 2.5 quintillion bytes of data are created each day, and 90% of the data in the world was created in the last two years alone. This data can be stored on external hard drives, on unused space in peer-to-peer (P2P) networks, or, in the currently more popular approach, in the Cloud. When users store their data in the Cloud, the entire data is exposed to the administrators of the services, who can view and possibly misuse it. With the growing popularity and usage of Cloud storage services like Google Drive, Dropbox etc., concerns about privacy and security are increasing. Searching for content or documents in this distributed stored data, given the rate of data generation, is a big challenge. Information fusion is used to extract information based on the query of the user, combine the data, and learn useful information. This problem is challenging if the data sources are distributed and heterogeneous in nature and the trustworthiness of the documents varies. This thesis proposes two innovative solutions to resolve both of these problems. Firstly, to remedy the situation of security and privacy of stored data, we propose an innovative Social-based Distributed Data Storage and Trust based co-operative Information Fusion Framework (SDSF). The main objective is to create a framework that assists in providing a secure storage system while not overloading a single system, using a P2P-like approach. This framework allows users to share storage resources among friends and acquaintances without compromising security or privacy, while enjoying all the benefits that Cloud storage offers. The system fragments the data and encodes it to securely store it on the unused storage capacity of the data owner's friends' resources. The system thus gives the user centralized control over the selection of peers to store the data. 
Secondly, to retrieve the stored distributed data, the proposed system also performs the fusion from distributed sources. The technique uses several algorithms to ensure the correctness of the query that is used to retrieve and combine the data, improving the accuracy and efficiency of information fusion when combining heterogeneous, distributed and massive data on the Cloud for time-critical operations. We demonstrate that the retrieved documents are genuine when the trust scores are also used while retrieving the data sources. The thesis makes several research contributions. First, we implement Social Storage using erasure coding. Erasure coding fragments the data, encodes it, and, through the introduction of redundancy, resolves issues resulting from device failures. Second, we exploit the inherent concept of trust that is embedded in social networks to determine the nodes where the fragmented data should be stored and to build a secure network, since the social network consists of a network of friends, family and acquaintances. The trust between friends and the availability of the devices allow the user to make an informed choice about where the information should be stored using 'k' optimal paths. Thirdly, for the purpose of retrieving this distributed stored data, we propose information fusion on distributed data using a combination of Enhanced N-grams (to ensure correctness of the query), Semantic Machine Learning (to extract the documents based on the context and not just a bag of words, also considering the trust score) and Map Reduce (NSM) Algorithms. Lastly, we evaluate the performance of distributed storage of SDSF using erasure coding, identify the social storage providers based on trust, and evaluate their trustworthiness. We also evaluate the performance of our information fusion algorithms in distributed storage systems. 
Thus, the system using the SDSF framework implements the beneficial features of P2P networks and Cloud storage while avoiding the pitfalls of these systems. The multi-layered encryption ensures that no other users, including the system administrators, can decode the stored data. The application of the NSM algorithm improves the effectiveness of fusion, since a large number of genuine documents are retrieved for fusion.
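
The fragment, encode, and recover-from-failure cycle that the abstract attributes to erasure coding can be illustrated with a deliberately small sketch. It uses a single XOR parity fragment, which tolerates the loss of only one fragment; the abstract does not say which erasure code SDSF uses, so the scheme and the names `encode` and `recover` are assumptions made purely for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one fragment is missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single-parity code tolerates only one lost fragment")
    repaired = list(fragments)
    if missing:
        present = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity fragment and the zero padding.
    return b"".join(repaired[:-1])[:length]
```

In a social-storage setting each of the k + 1 fragments would go to a different friend's device; if any single device is unavailable, the remaining fragments are enough to rebuild the file. A production system would use a code that survives several simultaneous losses and would encrypt fragments before distribution.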

    Software-implemented attack tolerance for critical information retrieval

    The fast-growing reliance of our daily life upon online information services often demands an appropriate level of privacy protection as well as highly available service provision. However, most existing solutions have attempted to address these problems separately. This thesis investigates and presents a solution that provides both privacy protection and fault tolerance for online information retrieval. A new approach to Attack-Tolerant Information Retrieval (ATIR) is developed based on an extension of existing theoretical results for Private Information Retrieval (PIR). ATIR uses replicated services to protect a user's privacy and to ensure service availability. In particular, ATIR can tolerate any collusion of up to t servers for privacy violation and up to ƒ faulty (either crashed or malicious) servers in a system with k replicated servers, provided that k ≄ t + ƒ + 1, where t ≄ 1 and ƒ ≀ t. In contrast to other related approaches, ATIR relies on neither enforced trust assumptions, such as the use of tamper-resistant hardware and trusted third parties, nor an increased number of replicated servers. While the best solution known so far requires k (≄ 3t + 1) replicated servers to cope with t malicious servers and any collusion of up to t servers with an O(n^*^) communication complexity, ATIR uses fewer servers with a much improved communication cost, O(n^(1/2)) (where n is the size of a database managed by a server). The majority of current PIR research resides on a theoretical level. This thesis provides both theoretical schemes and their practical implementations with good performance results. In a LAN environment, it takes well under half a second to use an ATIR service for calculations over data sets with a size of up to 1MB. The performance of the ATIR systems remains at the same level even in the presence of server crashes and malicious attacks. 
Both analytical results and experimental evaluation show that ATIR offers an attractive and practical solution for ever-increasing online information applications.
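
The replication bounds quoted in the abstract can be compared with a few lines of arithmetic. The sketch below only encodes the two stated conditions, k ≄ t + ƒ + 1 for ATIR and k ≄ 3t + 1 for the earlier approach; the function names are hypothetical, and the lower bound ƒ ≄ 0 is an assumption that the abstract leaves implicit.

```python
def atir_min_servers(t: int, f: int) -> int:
    """Smallest k satisfying k >= t + f + 1, with t >= 1 and f <= t."""
    if t < 1:
        raise ValueError("t must be at least 1")
    if not 0 <= f <= t:  # f >= 0 is assumed, not stated in the abstract
        raise ValueError("f must satisfy 0 <= f <= t")
    return t + f + 1


def baseline_min_servers(t: int) -> int:
    """Smallest k satisfying k >= 3t + 1, the bound quoted for prior work."""
    if t < 1:
        raise ValueError("t must be at least 1")
    return 3 * t + 1


if __name__ == "__main__":
    for t in range(1, 5):
        # Worst case for ATIR is f = t, which gives 2t + 1 servers.
        print(t, atir_min_servers(t, t), baseline_min_servers(t))
```

Because ƒ ≀ t, the ATIR bound never exceeds 2t + 1, which is strictly below 3t + 1 for every t ≄ 1.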

    Entangled cloud storage

    Entangled cloud storage (Aspnes et al., ESORICS 2004) enables a set of clients to "entangle" their files into a single clew to be stored by a (potentially malicious) cloud provider. The entanglement makes it impossible to modify or delete a significant part of the clew without affecting all files encoded in the clew. A clew keeps the files in it private but still lets each client recover his own data by interacting with the cloud provider; no cooperation from other clients is needed. At the same time, the cloud provider is discouraged from altering or overwriting any significant part of the clew, as this will imply that none of the clients can recover their files. We put forward the first simulation-based security definition for entangled cloud storage, in the framework of universal composability (Canetti, 2001). We then construct a protocol satisfying our security definition, relying on an entangled encoding scheme based on privacy-preserving polynomial interpolation; entangled encodings were originally proposed by Aspnes et al. as useful tools for the purpose of data entanglement. As a contribution of independent interest, we revisit the security notions for entangled encodings, putting forward stronger definitions than previous work (which, for instance, did not consider collusion between clients and the cloud provider). Protocols for entangled cloud storage find application in the cloud setting, where clients store their files on a remote server and need assurance that the cloud provider will not modify or delete their data illegitimately. Current solutions, e.g., those based on Provable Data Possession and Proof of Retrievability, require the server to be challenged regularly to provide evidence that the clients' files are stored at a given time. 
Entangled cloud storage provides an alternative approach where any single client operates implicitly on behalf of all others, i.e., as long as one client's files are intact, the entire remote database continues to be safe and unblemished.

    Cloud-edge hybrid applications

    Many modern applications are designed to provide interactions among users, including multi-user games, social networks and collaborative tools. Users expect application response time to be in the order of milliseconds, to foster interaction and interactivity. The design of these applications typically adopts a client-server model, where all interactions are mediated by a centralized component. This approach introduces availability and fault-tolerance issues, which can be mitigated by replicating the server component, and even by relying on geo-replicated solutions in cloud computing infrastructures. Even in this case, the client-server communication model leads to unnecessary latency penalties for geographically close clients and high operational costs for the application provider. This dissertation proposes a cloud-edge hybrid model with secure and efficient propagation and consistency mechanisms. This model combines client-side replication and client-to-client propagation to provide low latency and minimize the dependency on the server infrastructure, fostering availability and fault tolerance. To realize this model, this work makes the following key contributions. First, the cloud-edge hybrid model is materialized by a system design where clients maintain replicas of the data and synchronize in a peer-to-peer fashion, and servers are used to assist clients' operation. We study how to bring most of the application logic to the client side, using the centralized service primarily for durability, access control, discovery, and overcoming internetwork limitations. Second, we define protocols for weakly consistent data replication, including a novel CRDT model (Δ-CRDTs). We provide a study on partial replication, exploring the challenges and fundamental limitations in providing causal consistency, and the difficulty in supporting client-side replicas due to their ephemeral nature. 
Third, we study how client misbehaviour can impact the guarantees of causal consistency. We propose new secure weak consistency models for insecure settings, and algorithms to enforce such consistency models. The experimental evaluation of our contributions has shown their specific benefits and limitations compared with the state of the art. In general, the cloud-edge hybrid model leads to faster application response times, lower client-to-client latency, higher system scalability (as fewer clients need to connect to servers at the same time), the possibility to work offline or disconnected from the server, and reduced server bandwidth usage. In summary, we propose a hybrid of cloud and edge which provides lower user-to-user latency, availability under server disconnections, and improved server scalability, while being efficient, reliable, and secure.
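
To make the idea of weakly consistent, client-side replication concrete, here is a minimal sketch of a classic state-based grow-only counter CRDT. It is not the Δ-CRDT model that the dissertation introduces, whose details the abstract does not give; the class and method names are illustrative assumptions.

```python
from typing import Dict


class GCounter:
    """State-based grow-only counter: one monotonically increasing slot per replica."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        """Apply a local update; only this replica's own slot changes."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        """The counter value is the sum over all replica slots."""
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        """Join two states by taking the per-replica maximum.

        The join is commutative, associative and idempotent, so replicas
        converge regardless of the order or number of exchanges.
        """
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)
```

Two clients can update their replicas while disconnected and later exchange state directly with each other, without a server in the path; after merging in both directions they hold the same value.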

    Deep Learning Techniques for Mobility Prediction and Management in Mobile Networks

    Trajectory prediction is an important research topic in modern mobile networks (e.g., 5G and beyond 5G) to enhance the network quality of service by accurately predicting the future locations of mobile users, such as pedestrians and vehicles, based on their past mobility patterns. A trajectory is defined as the sequence of locations the user visits over time. The primary objective of this thesis is to improve the modeling of mobility data and establish personalized, scalable, collective-intelligent, distributed, and strategic trajectory prediction techniques that can effectively adapt to the dynamics of urban environments in order to facilitate the optimal delivery of mobility-aware network services. Our proposed approaches aim to increase the accuracy of trajectory prediction while minimizing communication and computational costs, leading to more efficient mobile networks. The thesis begins by introducing a personalized trajectory prediction technique using deep learning and reinforcement learning. It adapts the neural network architecture to capture the distinct characteristics of mobile users' data. Furthermore, it introduces advanced anticipatory handover management and dynamic service migration techniques that optimize network management using our high-performance trajectory predictor. This approach ensures seamless connectivity and proactively migrates network services, enhancing the quality of service in dense wireless networks. The second contribution of the thesis introduces cluster-level prediction to extend the reinforcement learning-based trajectory prediction, addressing scalability challenges in large-scale networks. Cluster-level trajectory prediction leverages users' similarities within clusters to train only a few representatives. This enables efficient transfer learning of pre-trained mobility models and reduces computational overhead, enhancing the network scalability. 
The third contribution proposes a collaborative social-aware multi-agent trajectory prediction technique that accounts for the interactions between multiple intra-cluster agents in a dynamic urban environment, increasing the prediction accuracy while decreasing the algorithm complexity and computational resource usage. The fourth contribution proposes a federated learning-driven multi-agent trajectory prediction technique that leverages the collaborative power of multiple local data sources in a decentralized manner to enhance user privacy and improve the accuracy of trajectory prediction while jointly minimizing computational and communication costs. The fifth contribution proposes a game-theoretic non-cooperative multi-agent prediction technique that considers the strategic behaviors among competitive inter-cluster mobile users. The proposed approaches are evaluated on small-scale and large-scale location-based mobility datasets, where locations could be GPS coordinates or cellular base station IDs. Our experiments demonstrate that our proposed approaches outperform state-of-the-art trajectory prediction methods, making significant contributions to the field of mobile networks.

    Reducing the cumulative file download time and variance in a P2P overlay via proximity based peer selection

    The time it takes to download a file in a peer-to-peer (P2P) overlay network is dependent on several factors. These factors include the quality of the network between peers (e.g. packet loss, latency, and link failures), distance, peer selection technique, and packet loss due to Internet Service Providers (ISPs) engaging in traffic shaping. Recent research shows that P2P download time is adversely impacted by the presence of distant peers, particularly when traffic goes across an ISP that could be engaging in P2P traffic throttling activities. It has also been observed that additional delays are introduced when distant candidate nodes for exchanging data are included during the formation of a P2P network overlay. Researchers have shifted their attention to the mechanism for peer selection. They started questioning the random technique because it ignores the location of nodes in the topology of the underlying physical network. Therefore, selecting nodes for interaction in a distributed system based on their position in the network continues to be an active area of research. The goal of this work was to reduce the cumulative file download time and variance for the majority of participating peers in a P2P network by using a peer selection mechanism that favors nearby nodes. In this proposed proximity strategy, the Internet address space is separated by IP blocks that belong to different Autonomous Systems (AS). IP blocks are further broken up into subsets named zones. Each zone is given a landmark (a.k.a. beacon), for example routers or DNS servers, with a known geographical location. At the time peers joined the network, peers were grouped into zones based on their geographical distance to the selected beacons. Peers that end up in the same zone were put at the top of the list of available nodes for interactions during the formation of the overlay. 
Experiments were conducted to compare the proposed proximity based peer selection strategy to the random peer selection strategy. The results indicate that the proximity technique outperforms the random approach for peer selection in a network with low packet loss and latency and also in a more realistic network subject to packet loss, traffic shaping and long distances. However, this improved performance came at the cost of additional memory (230 megabytes) and to a lesser extent some additional CPU cycles to run the additional subroutines needed to group peers into zones. The framework and algorithms developed for this work made it possible to implement a fully functioning prototype that implements the proximity strategy. This prototype enabled high fidelity testing with a real client implementation in real networks including the Internet. This made it possible to test without having to rely exclusively on event-driven simulations to prove the hypothesis
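
The zone-based selection described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the prototype's code: it skips the partitioning of the address space by Autonomous System, assigns each peer to the zone of its geographically nearest beacon using great-circle distance, and uses hypothetical names (`assign_zone`, `rank_candidates`) and made-up coordinates.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def assign_zone(location: Coord, beacons: Dict[str, Coord]) -> str:
    """Return the zone whose beacon is geographically closest to the location."""
    return min(beacons, key=lambda zone: haversine_km(location, beacons[zone]))


def rank_candidates(peer: str, candidates: List[str],
                    locations: Dict[str, Coord],
                    beacons: Dict[str, Coord]) -> List[str]:
    """Order candidates so that peers in the requesting peer's zone come first.

    sorted() is stable, so the original order is kept within each group.
    """
    my_zone = assign_zone(locations[peer], beacons)
    return sorted(candidates,
                  key=lambda c: assign_zone(locations[c], beacons) != my_zone)


if __name__ == "__main__":
    beacons = {"eu": (52.5, 13.4), "us": (40.7, -74.0)}
    locations = {"a": (48.9, 2.4), "b": (41.9, -87.6),
                 "c": (51.5, -0.1), "d": (42.4, -71.1)}
    print(rank_candidates("a", ["b", "c", "d"], locations, beacons))
```

A random selection strategy would instead shuffle the candidate list, which is the baseline the experiments compare against.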
    • 
