Search CORE

102 research outputs found

Scalable discovery of networked data : Algorithms, Infrastructure, Applications

Author: Kotoulas S.
Publication venue: Amsterdam: Vrije Universiteit
Publication date: 01/01/2010
Field of study

Harmelen, F.A.H. van [Promotor]Siebes, R.M. [Copromotor

VU Research Portal

Towards a Framework for DHT Distributed Computing

Author: Rosen Andrew
Publication venue: ScholarWorks @ Georgia State University
Publication date: 12/08/2016
Field of study

Distributed Hash Tables (DHTs) are protocols and frameworks used by peer-to-peer (P2P) systems. They are used as the organizational backbone for many P2P file-sharing systems due to their scalability, fault-tolerance, and load-balancing properties. These same properties are highly desirable in a distributed computing environment, especially one that wants to use heterogeneous components. We show that DHTs can be used not only as the framework to build a P2P file-sharing service, but as a P2P distributed computing platform. We propose creating a P2P distributed computing framework using distributed hash tables, based on our prototype system ChordReduce. This framework would make it simple and efficient for developers to create their own distributed computing applications. Unlike Hadoop and similar MapReduce frameworks, our framework can be used both in both the context of a datacenter or as part of a P2P computing platform. This opens up new possibilities for building platforms to distributed computing problems. One advantage our system will have is an autonomous load-balancing mechanism. Nodes will be able to independently acquire work from other nodes in the network, rather than sitting idle. More powerful nodes in the network will be able use the mechanism to acquire more work, exploiting the heterogeneity of the network. By utilizing the load-balancing algorithm, a datacenter could easily leverage additional P2P resources at runtime on an as needed basis. Our framework will allow MapReduce-like or distributed machine learning platforms to be easily deployed in a greater variety of contexts

ScholarWorks @ Georgia State University

Recommended from our members

Global Data Plane: A Widely Distributed Storage and Communication Infrastructure

Author: Mor Nitesh
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

With the advancement of technology, richer computation devices are making their way into everyday life. However, such smarter devices merely act as a source and sink of information; the storage of information is highly centralized in data-centers in today’s world. Even though such data-centers allow for amortization of cost per bit of information, the density and distribution of such data-centers is not necessarily representative of human population density. This disparity of where the information is produced and consumed vs where it is stored only slightly affects the applications of today, but it will be the limiting factor for applications of tomorrow.The computation resources at the edge are more powerful than ever, and present an opportunity to address this disparity. We envision that a seamless combination of these edge-resources with the data-center resources is the way forward. However, the resulting issues of trust and data-security are not easy to solve in a world full of complexity. Toward this vision of a federated infrastructure composed of resources at the edge as well as those in data-centers, we describe the architecture and design of a widely distributed system for data storage and communication that attempts to alleviate some of these data security challenges; we call this system the Global Data Plane (GDP).The key abstraction in the GDP is a secure cohesive container of information called a DataCapsule, which provides a layer of uniformity on top of a heterogeneous infrastructure. A DataCapsule represents a secure history of transactions in a persistent form that can be used for building other applications on top. Existing applications can be refactored to use DataCapsules as the ground truth of persistent state; such a refactoring enables cleaner application design that allows for better security analysis of information flows. Not only cleaner design, the GDP also enables locality of access for performance and data privacy—an ever growing concern in the information age.The DataCapsules are enabled by an underlying routing fabric, called the GDP network, which provides secure routing for datagrams in a flat namespace. The GDP network is a core component of the GDP that enables various GDP components to interact with each other. In addition to the DataCapsules, this underlying network is available to applications for native communication as well. Flat namespace networks are known to provide a number of desirable properties, such as location independence, built-in multicast, etc. However, existing architectures for such networks suffer from routing security issues, typically because malicious entities can claim to possess arbitrary names and thus, receive traffic intended for arbitrary destinations. GDP network takes a different approach by defining an ownership of the name and the associated mechanisms for participants to delegate routing for such names to others. By directly integrating with GDP network, applications can enjoy the benefits of flat namespace networks without compromising routing security.The Global Data Plane and DataCapsules together represent our vision for secure ubiquitous storage. As opposed to the current approach of perimeter security for infrastructure, i.e. drawing a perimeter around parts of infrastructure and trusting everything inside it, our vision is to use cryptographic tools to enable intrinsic security for the information itself regardless of the context in which such information lives. In this dissertation, we show how to make this vision a reality, and how to adapt real world applications to reap the benefits of secure ubiquitous storage

eScholarship - University of California

Self-management for large-scale distributed systems

Author: Al-Shishtawy Ahmad
Publication venue: 'Breakthrough Institute, Rockefeller Philanthropy Advisors'
Publication date: 01/01/2012
Field of study

Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time consuming and error prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables the tradeoff between high availability and data consistency. Using majority allows avoiding potential drawbacks of a master-based consistency control, namely, a single-point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems with the focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a State-Space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using state-space model that enables to trade-off performance for cost. We describe the steps in designing an elasticity controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control

Publikationer från KTH

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

The 5th Conference of PhD Students in Computer Science

Author
Publication venue
Publication date: 01/01/2006
Field of study

University of Szeged

Modulhandbuch Master Research in Computer and Systems Engineering: Prüfungsordnungsversion: 2012

Author
Publication venue
Publication date: 04/05/2015
Field of study

Digitale Bibliothek Thüringen

Modulhandbuch Master Research in Computer and Systems Engineering: Prüfungsordnungsversion: 2009

Author
Publication venue
Publication date: 04/05/2015
Field of study

Digitale Bibliothek Thüringen

Big Data Processing with Peer-To-Peer Architectures

Author: GOH WEI XIANG
Publication venue
Publication date: 20/06/2014
Field of study

Ph.DDOCTOR OF PHILOSOPH

ScholarBank@NUS

Global connectivity architecture of mobile personal devices

Author: Ford Bryan, 1973-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2008
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 193-207).The Internet's architecture, designed in the days of large, stationary computers tended by technically savvy and accountable administrators, fails to meet the demands of the emerging ubiquitous computing era. Nontechnical users now routinely own multiple personal devices, many of them mobile, and need to share information securely among them using interactive, delay-sensitive applications.Unmanaged Internet Architecture (UIA) is a novel, incrementally deployable network architecture for modern personal devices, which reconsiders three architectural cornerstones: naming, routing, and transport. UIA augments the Internet's global name system with a personal name system, enabling users to build personal administrative groups easily and intuitively, to establish secure bindings between his devices and with other users' devices, and to name his devices and his friends much like using a cell phone's address book. To connect personal devices reliably, even while mobile, behind NATs or firewalls, or connected via isolated ad hoc networks, UIA gives each device a persistent, location-independent identity, and builds an overlay routing service atop IP to resolve and route among these identities. Finally, to support today's interactive applications built using concurrent transactions and delay-sensitive media streams, UIA introduces a new structured stream transport abstraction, which solves the efficiency and responsiveness problems of TCP streams and the functionality limitations of UDP datagrams. Preliminary protocol designs and implementations demonstrate UIA's features and benefits. A personal naming prototype supports easy and portable group management, allowing use of personal names alongside global names in unmodified Internet applications. A prototype overlay router leverages the naming layer's social network to provide efficient ad hoc connectivity in restricted but important common-case scenarios.(cont) Simulations of more general routing protocols--one inspired by distributed hash tables, one based on recent compact routing theory--explore promising generalizations to UIA's overlay routing. A library-based prototype of UIA's structured stream transport enables incremental deployment in either OS infrastructure or applications, and demonstrates the responsiveness benefits of the new transport abstraction via dynamic prioritization of interactive web downloads. Finally, an exposition and experimental evaluation of NAT traversal techniques provides insight into routing optimizations useful in UIA and elsewhere.by Bryan Alexander Ford.Ph.D

DSpace@MIT

D.1.3 – Protocols for emergent localities

Author: Frey Davide
Kermarrec Anne-Marie
Maddock Christopher
Mauthe Andreas
Mostefaoui Achour
Perrin Matthieu
Roman Pierre-Louis
Taïani François
Publication venue: HAL CCSD
Publication date: 01/07/2016
Field of study

GDD_HCERES2020This report presents two contributions that illustrate the potential of emerging-locality protocols in large-scale decentralized systems, in two areas of decentralized social computing: recommendation, and eventual consistency of mutable data structures. The first contribution consists of a framework supporting the development of dynamically adaptive decen-tralised recommendation systems. Decentralised recommenders have been proposed to deliver privacy-preserving, personalised and highly scalable on-line recommendations. Current implementations tend, however, to rely on a hard-wired similarity metric that cannot adapt. This constitutes a strong limitation in the face of evolving needs. Our framework address this through a decentralised form of adaptation, in which individual nodes can independently select, and update their own recommendation algorithm, while still collectively contributing to the overall system's mission. Our second contribution addresses the growing demand for differentiated consistency requirements in large-scale applications. A large number of today's applications rely on Eventual Consistency, a consistency model that emphasizes liveness over safety. Designers generally adopt this consistency model uniformly throughout a distributed system due to its ability to scale as the number of users or devices grows larger. But this clashes with the need for differentiated consistency requirements. In this contribution, we address this need by introducing UPS, a novel consistency mechanism that offers differentiated eventual consistency and delivery speed by working in pair with a two-phase epidemic broadcast protocol. We propose a closed-form analysis of our approach's delivery speed, and we evaluate our complete protocol experimentally on a simulated network of one million nodes. To measure the consistency trade-off, we formally define a novel and scalable consistency metric operating at runtime

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1