546 research outputs found

    Allocation Strategies for Data-Oriented Architectures

    Get PDF
    Data orientation is a common design principle in distributed data management systems. In contrast to process-oriented or transaction-oriented system designs, data-oriented architectures are based on data locality and function shipping. The tight coupling of data and processing thereon is implemented in different systems in a variety of application scenarios such as data analysis, database-as-a-service, and data management on multiprocessor systems. Data-oriented systems, i.e., systems that implement a data-oriented architecture, bundle data and operations together in tasks which are processed locally on the nodes of the distributed system. Allocation strategies, i.e., methods that decide the mapping from tasks to nodes, are core components in data-oriented systems. Good allocation strategies can lead to balanced systems while bad allocation strategies cause skew in the load and therefore suboptimal application performance and infrastructure utilization. Optimal allocation strategies are hard to find given the complexity of the systems, the complicated interactions of tasks, and the huge solution space. To ensure the scalability of data-oriented systems and to keep them manageable with hundreds of thousands of tasks, thousands of nodes, and dynamic workloads, fast and reliable allocation strategies are mandatory. In this thesis, we develop novel allocation strategies for data-oriented systems based on graph partitioning algorithms. Therefore, we show that systems from different application scenarios with different abstraction levels can be generalized to generic infrastructure and workload descriptions. We use weighted graph representations to model infrastructures with bounded and unbounded, i.e., overcommited, resources and possibly non-linear performance characteristics. Based on our generalized infrastructure and workload model, we formalize the allocation problem, which seeks valid and balanced allocations that minimize communication. Our allocation strategies partition the workload graph using solution heuristics that work with single and multiple vertex weights. Novel extensions to these solution heuristics can be used to balance penalized and secondary graph partition weights. These extensions enable the allocation strategies to handle infrastructures with non-linear performance behavior. On top of the basic algorithms, we propose methods to incorporate heterogeneous infrastructures and to react to changing workloads and infrastructures by incrementally updating the partitioning. We evaluate all components of our allocation strategy algorithms and show their applicability and scalability with synthetic workload graphs. In end-to-end--performance experiments in two actual data-oriented systems, a database-as-a-service system and a database management system for multiprocessor systems, we prove that our allocation strategies outperform alternative state-of-the-art methods

    Doctor of Philosophy

    Get PDF
    dissertationNetwork emulation has become an indispensable tool for the conduct of research in networking and distributed systems. It offers more realism than simulation and more control and repeatability than experimentation on a live network. However, emulation testbeds face a number of challenges, most prominently realism and scale. Because emulation allows the creation of arbitrary networks exhibiting a wide range of conditions, there is no guarantee that emulated topologies reflect real networks; the burden of selecting parameters to create a realistic environment is on the experimenter. While there are a number of techniques for measuring the end-to-end properties of real networks, directly importing such properties into an emulation has been a challenge. Similarly, while there exist numerous models for creating realistic network topologies, the lack of addresses on these generated topologies has been a barrier to using them in emulators. Once an experimenter obtains a suitable topology, that topology must be mapped onto the physical resources of the testbed so that it can be instantiated. A number of restrictions make this an interesting problem: testbeds typically have heterogeneous hardware, scarce resources which must be conserved, and bottlenecks that must not be overused. User requests for particular types of nodes or links must also be met. In light of these constraints, the network testbed mapping problem is NP-hard. Though the complexity of the problem increases rapidly with the size of the experimenter's topology and the size of the physical network, the runtime of the mapper must not; long mapping times can hinder the usability of the testbed. This dissertation makes three contributions towards improving realism and scale in emulation testbeds. First, it meets the need for realistic network conditions by creating Flexlab, a hybrid environment that couples an emulation testbed with a live-network testbed, inheriting strengths from each. Second, it attends to the need for realistic topologies by presenting a set of algorithms for automatically annotating generated topologies with realistic IP addresses. Third, it presents a mapper, assign, that is capable of assigning experimenters' requested topologies to testbeds' physical resources in a manner that scales well enough to handle large environments

    Enabling 5G Edge Native Applications

    Get PDF

    Grid Analysis of Radiological Data

    Get PDF
    IGI-Global Medical Information Science Discoveries Research Award 2009International audienceGrid technologies and infrastructures can contribute to harnessing the full power of computer-aided image analysis into clinical research and practice. Given the volume of data, the sensitivity of medical information, and the joint complexity of medical datasets and computations expected in clinical practice, the challenge is to fill the gap between the grid middleware and the requirements of clinical applications. This chapter reports on the goals, achievements and lessons learned from the AGIR (Grid Analysis of Radiological Data) project. AGIR addresses this challenge through a combined approach. On one hand, leveraging the grid middleware through core grid medical services (data management, responsiveness, compression, and workflows) targets the requirements of medical data processing applications. On the other hand, grid-enabling a panel of applications ranging from algorithmic research to clinical use cases both exploits and drives the development of the services

    Synthesis, Interdiction, and Protection of Layered Networks

    Get PDF
    This research developed the foundation, theory, and framework for a set of analysis techniques to assist decision makers in analyzing questions regarding the synthesis, interdiction, and protection of infrastructure networks. This includes extension of traditional network interdiction to directly model nodal interdiction; new techniques to identify potential targets in social networks based on extensions of shortest path network interdiction; extension of traditional network interdiction to include layered network formulations; and develops models/techniques to design robust layered networks while considering trade-offs with cost. These approaches identify the maximum protection/disruption possible across layered networks with limited resources, find the most robust layered network design possible given the budget limitations while ensuring that the demands are met, include traditional social network analysis, and incorporate new techniques to model the interdiction of nodes and edges throughout the formulations. In addition, the importance and effects of multiple optimal solutions for these (and similar) models is investigated. All the models developed are demonstrated on notional examples and were tested on a range of sample problem sets
    corecore