    Efficient Multi-site Data Movement Using Constraint Programming for Data Hungry Science

    For the past decade, HENP experiments have been heading towards a distributed computing model in an effort to concurrently process tasks over enormous data sets that have been increasing in size as a function of time. In order to optimize the use of all available (geographically spread) resources and minimize the processing time, it is also necessary to face the question of efficient data transfers and placements. A key question is whether the time penalty for moving the data to the computational resources is worth the presumed gain. As a step towards truly distributed task scheduling, we present a technique based on Constraint Programming (CP). The CP technique schedules data transfers from multiple resources, considering all available paths of diverse characteristics (capacity, sharing and storage), with minimum user waiting time as the objective. We introduce a model for planning data transfers to a single destination (data transfer) as well as its extension for an optimal data set spreading strategy (data placement). Several enhancements for a solver of the CP model will be shown, leading to a faster schedule computation time using symmetry breaking, branch cutting, well-studied principles from the job-shop scheduling field and several heuristics. Finally, we will present the design and implementation of a cornerstone application aimed at moving datasets according to the schedule. Results will include a comparison of performance and trade-offs between CP techniques and a Peer-2-Peer model from a simulation framework, as well as a real-case scenario taken from practical usage of a CP scheduler. Comment: To appear in proceedings of Computing in High Energy and Nuclear Physics 200

    A Content-Addressable Network for Similarity Search in Metric Spaces

    Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for content-based retrieval in huge and ever-growing collections of data produced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types rapidly moves towards databases used directly by people, computer systems must be able to model fundamental human reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recognition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity, and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying Peer-to-Peer communication paradigms, we build a structured network of computers able to process similarity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer-to-Peer system for similarity search in metric spaces called Metric Content-Addressable Network (MCAN), which is an extension of the well-known Content-Addressable Network (CAN) used for hash lookup. A prototype implementation of MCAN was tested on real-life datasets of image features, protein symbols, and text; the observed results are reported. We also compared the performance of MCAN with three other recently proposed distributed data structures for similarity search in metric spaces.

    Approximate information filtering in structured peer-to-peer networks

    Today's content providers are naturally distributed and produce large amounts of information every day, making peer-to-peer data management a promising approach offering scalability, adaptivity to dynamics, and failure resilience. In such systems, subscribing with a continuous query is as important as one-time querying, since it allows the user to cope with the high rate of information production and avoid the cognitive overload of repeated searches. In the information filtering setting, users specify continuous queries, thus subscribing to newly appearing documents that satisfy the query conditions. In contrast to existing approaches that provide exact information filtering functionality, this doctoral thesis introduces the concept of approximate information filtering, where users subscribe to only a few selected sources most likely to satisfy their information demand. In this way, efficiency and scalability are enhanced by trading a small reduction in recall for lower message traffic. This thesis contains the following contributions: (i) the first architecture to support approximate information filtering in structured peer-to-peer networks, (ii) novel strategies to select the most appropriate publishers by taking into account correlations among keywords, (iii) a prototype implementation for approximate information retrieval and filtering, and (iv) a digital library use case to demonstrate the integration of retrieval and filtering in a unified system.

    Contributions to High-Throughput Computing Based on the Peer-to-Peer Paradigm

    XII, 116 p. This dissertation focuses on High Throughput Computing (HTC) systems and how to build a working HTC system using Peer-to-Peer (P2P) technologies. Traditional HTC systems, designed to process the largest possible number of tasks per unit of time, revolve around a central node that implements a queue used to store and manage submitted tasks. This central node limits the scalability and fault tolerance of the HTC system. A usual solution involves the use of replicas of the master node that can replace it. This solution is, however, limited by the number of replicas used. In this thesis, we propose an alternative solution that follows the P2P philosophy: a completely distributed system in which all worker nodes participate in the scheduling tasks, with a physically distributed task queue implemented on top of a P2P storage system. The fault tolerance and scalability of this proposal are, therefore, limited only by the number of nodes in the system. The proper operation and scalability of our proposal have been validated through experimentation with a real system. The data availability provided by Cassandra, the P2P data management framework used in our proposal, is analysed by means of several stochastic models. These models can be used to make predictions about the availability of any Cassandra deployment, as well as to select the best possible configuration of any Cassandra system. In order to validate the proposed models, experiments with real Cassandra clusters are carried out, showing that our models are good descriptors of Cassandra's availability. Finally, we propose a set of scheduling policies that try to solve a common problem of HTC systems: re-execution of tasks due to a failure in the node where the task was running, without wasting additional resources. In order to reduce the number of re-executions, our proposals try to find good fits between the reliability of nodes and the estimated length of each task.
Extensive simulation-based experimentation shows that our policies are capable of reducing the number of re-executions, improving system performance and utilization of nodes.

    Scalable adaptive group communication on bi-directional shared prefix trees

    Efficient group communication within the Internet has been implemented by multicast. Unfortunately, its global deployment is missing. Nevertheless, emerging and increasingly popular applications, like IPTV or large-scale social video chats, require economical data distribution throughout the Internet. To overcome the limitations of multicast deployment, we introduce and analyze BIDIR-SAM, the first structured overlay multicast scheme based on bi-directional shared prefix trees. BIDIR-SAM admits predictable costs growing logarithmically with increasing group size. We also present a broadcast approach for DHT-enabled P2P networks. Both schemes are integrated in a standard-compliant hybrid group communication architecture, bridging the gap between overlay and underlay as well as between inter- and intra-domain multicast.

    Advanced methods for query routing in peer-to-peer information retrieval

    One of the most challenging problems in peer-to-peer networks is query routing: effectively and efficiently identifying peers that can return high-quality local results for a given query. Existing methods from the areas of distributed information retrieval and metasearch engines do not adequately address the peculiarities of a peer-to-peer network. The main contributions of this thesis are as follows: 1. Methods for query routing that take into account the mutual overlap of different peers' collections, 2. Methods for query routing that take into account the correlations between multiple terms, 3. Comparative evaluation of different query routing methods. Our experiments confirm the superiority of our novel query routing methods over the prior state of the art, in particular in the context of peer-to-peer Web search.

    Reliable massively parallel symbolic computing : fault tolerance for a distributed Haskell

    As the number of cores in manycore systems grows exponentially, the number of failures is also predicted to grow exponentially. Hence, massively parallel computations must be able to tolerate faults. Moreover, new approaches to language design and system architecture are needed to address the resilience of massively parallel heterogeneous architectures. Symbolic computation has underpinned key advances in Mathematics and Computer Science, for example in number theory, cryptography, and coding theory. Computer algebra software systems facilitate symbolic mathematics. Developing these at scale has its own distinctive set of challenges, as symbolic algorithms tend to employ complex irregular data and control structures. SymGridParII is a middleware for parallel symbolic computing on massively parallel High Performance Computing platforms. A key element of SymGridParII is a domain-specific language (DSL) called Haskell Distributed Parallel Haskell (HdpH). It is explicitly designed for scalable distributed-memory parallelism, and employs work stealing to load balance dynamically generated irregular task sizes. To investigate providing scalable fault tolerant symbolic computation, we design, implement and evaluate a reliable version of HdpH, HdpH-RS. Its reliable scheduler detects and handles faults, using task replication as a key recovery strategy. The scheduler supports load balancing with a fault tolerant work stealing protocol. The reliable scheduler is invoked with two fault tolerance primitives for implicit and explicit work placement, and 10 fault tolerant parallel skeletons that encapsulate common parallel programming patterns. The user is oblivious to many failures; they are instead handled by the scheduler. An operational semantics describes small-step reductions on states. A simple abstract machine for scheduling transitions and task evaluation is presented.
It defines the semantics of supervised futures, and the transition rules for recovering tasks in the presence of failure. The transition rules are demonstrated with a fault-free execution, and three executions that recover from faults. The fault tolerant work stealing has been abstracted into a Promela model. The SPIN model checker is used to exhaustively search the intersection of states in this automaton to validate a key resiliency property of the protocol. It asserts that an initially empty supervised future on the supervisor node will eventually be full in the presence of all possible combinations of failures. The performance of HdpH-RS is measured using five benchmarks. Supervised scheduling achieves a speedup of 757 with explicit task placement and 340 with lazy work stealing when executing Summatory Liouville on up to 1400 cores of an HPC architecture. Moreover, supervision overheads are consistently low when scaling up to 1400 cores. Low recovery overheads are observed in the presence of frequent failure when lazy on-demand work stealing is used. A Chaos Monkey mechanism has been developed for stress testing resiliency with random failure combinations. All unit tests pass in the presence of random failure, terminating with the expected results.

    Distributed data structures and the power of topological self-stabilization

    This thesis considers problems located in the fields of distributed systems and local algorithms.
In particular, we consider systems given by specific topologies of interconnected nodes and examine whether these topologies can be rebuilt when the network is (massively) changed by failing or joining nodes or edges. For this case we search for local distributed algorithms, i.e. algorithms that are executed on every single node and only use local information stored at the nodes, such as their neighborhood of nodes. We show that by executing these algorithms the desired goal topologies can be reached from any weakly connected start topology. We call this property of an algorithm topological self-stabilization and motivate it by the increasing usage of peer-to-peer (P2P) systems and of cloud computing. In both cases the user or owner of the data and executed algorithms cannot control the resources and their connectivity. In order to analyze topologically self-stabilizing algorithms or protocols, we introduce suitable models. For some specific topologies we then present and analyze topologically self-stabilizing protocols. We start with a sorted list of nodes, which we use as a simple introductory example. We then proceed with more complex topologies such as a specific small-world network and a clique. We then show that the concept of topological self-stabilization can be used for distributed hash tables. In particular, we show that a topologically self-stabilizing protocol can be found for existing distributed hash tables. We also construct a new overlay network that builds a distributed hash table supporting heterogeneous capacities, together with a corresponding topologically self-stabilizing protocol. Finally, we leave the concept of topological self-stabilization behind and instead show how to extend the usage of distributed hash tables in order to answer more than only exact queries. Date of defense: 21.05.2015. Paderborn, Univ., Diss., 201

    A P2P middleware design for digital access nodes in marginalised rural areas

    This thesis addresses software design within the field of Information and Communications Technology for Development (ICTD). Specifically, it makes a case for the design and development of software which is custom-made for the context of marginalised rural areas (MRAs). One of the main aims of any ICTD project is sustainability and such sustainability is particularly difficult in MRAs because of the high costs of projects located there. Most literature on ICTD projects focuses on other factors, such as management, regulations, social and community issues when discussing this issue. Technical matters are often down-played or ignored entirely. This thesis argues that MRAs exhibit unique technical characteristics and that by understanding these characteristics, one can possibly design more cost-effective software. One specific characteristic is described and addressed in this thesis – a characteristic we describe here for the first time and call a network island. Further analysis of the literature generates a picture of a distributed network of access nodes (DANs) within such network islands, which are connected by high speed networks and are able to share resources and stimulate usage of technology by offering a wide range of services. This thesis attempts to design a fitting middleware platform for such a context, which would achieve the following aims: i) allow software developers to create solutions for the context more efficiently (correctly, rapidly); ii) stimulate product managers and business owners to create innovative software products more easily (cost-effectively). A given in the context of this thesis is that the software should use free/libre open source software (FLOSS) – good arguments do also exist for the use of FLOSS. A review of useful FLOSS frameworks is undertaken and several of these are examined in an applied part of the thesis, to see how useful they may be. 
They form the basis for a walking skeleton implementation of the proposed middleware. The Spring framework is the basis for experiments, along with Spring-Webservices, JMX and PHP 5's web service capabilities. This thesis builds on three years of work at the Siyakhula Living Lab (SLL), an experimental testbed in an MRA in the Mbashe district of the Eastern Cape of South Africa. Several existing products are deployed at the SLL in the fields of eCommerce, eGovernment and eLearning. Requirements specifications are engineered from a variety of sources, including interviews, mailing lists, the author's experience as a supervisor at the SLL, and a review of the existing SLL products. Future products are also investigated, as the thesis considers current trends in ICTD. Use cases are also derived and listed. Most of the use cases are concerned with management functions of DANs that can be automated, so that operators of DANs can focus on their core business and not on technology. Using the UML Components methodology, the thesis then proceeds to design a middleware component architecture that is derived from the requirements specification. The process proceeds step-by-step, so that the reader can follow how business rules, operations and interfaces are derived from the use cases. Ultimately, the business rules, interfaces and operations are related to business logic, system interfaces and operations that are situated in specific components. The components in turn are derived from the business information model, which is derived from the business concepts that were initially used to describe the context for the requirements engineering. In this way, a logical method for software design is applied to the problem domain to methodically derive a software design for a middleware solution. The thesis tests the design by considering possible weaknesses in the design. The network aspect is tested by interpolating from formal assumptions about the nature of the context.
The data access layer is also identified as a possible bottleneck. We suggest the use of fast indexing methods instead of relational databases to maintain flexibility and efficiency of the data layer. Lessons learned from the exercise are discussed, within the context of the author's experience in software development teams, as well as in ICTD projects. This synthesis of information leads to warnings about the psychology of middleware development. We note that the ICTD domain is a particularly difficult one with regard to software development, as business requirements are not usually clearly formulated and developers do not have the requisite domain knowledge. In conclusion, the core arguments of the thesis are recounted in bullet form, to lay bare the reasoning behind this work. Novel aspects of the work are also highlighted. They include the description of a network island, and aspects of the DAN middleware requirements engineering and design. Future steps for work based on this thesis are mapped out and open problems relating to this research are touched upon.

    Analysis and evaluation of distributed algorithms for the construction of the Delaunay Triangulation

    Delaunay triangulations are very useful because of their mathematical properties, which are exploited in several distributed applications, from peer-to-peer networks to sensor and geographical networks. For these reasons, several distributed algorithms for the construction of Delaunay-based overlays have recently been proposed. This thesis presents a survey of the main distributed algorithms for the construction of the Delaunay Triangulation presented in recent years, and of their applications, with particular focus on the innovative techniques. The analysis has led to the definition of NewACE, a new distributed algorithm, which has been compared with two state-of-the-art approaches. The thesis presents a set of experimental results showing the pros and cons of these algorithms.