162 research outputs found

    Topics in access, storage, and sensor networks

    In the first part of this dissertation, Data Over Cable Service Interface Specification (DOCSIS) and IEEE 802.3ah Ethernet Passive Optical Network (EPON), two access networking standards, are studied. We study the impact of two parameters of the DOCSIS protocol and derive the probability of message collision in the 802.3ah device discovery scheme. We survey existing bandwidth allocation schemes for EPONs, derive the average grant size in one such scheme, and study the performance of the shortest-job-first heuristic. In the second part of this dissertation, we study networks of mobile sensors. We make progress towards an architecture for disconnected collections of mobile sensors. We propose a new design abstraction called tours, which facilitates the combination of mobility and communication into a single design primitive and enables the system of sensors to reorganize into desirable topologies after failures. We also initiate a study of computation in mobile sensor networks. We study the relationship between two distributed computational models of mobile sensor networks: population protocols and self-similar functions. We define the notion of a self-similar predicate and show when it is computable by a population protocol. Transition graphs of population protocols lead us to the consideration of graph powers. We consider the direct product of graphs and a new variant of it, which we call the lexicographic direct product (or the clique product). We show that invariants concerning transposable walks in direct graph powers and transposable independent sets in graph families generated by the lexicographic direct product are uncomputable. The last part of this dissertation makes contributions to the area of storage systems. We propose a sequential access detection and prefetching scheme and a dynamic cache sizing scheme for large storage systems. We evaluate the cache sizing scheme theoretically and through simulations. 
We compute the expected hit ratio of our scheme and of competing schemes, and bound the expected size of our dynamic cache sufficient to obtain an optimal hit ratio. We also develop a stand-alone simulator for studying our proposed scheme and integrate it with an empirically validated disk simulator.
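
    The abstract does not describe how sequential access is detected, so the following is only a minimal sketch of the general idea, not the dissertation's scheme. It assumes a trace of integer block addresses; the function name `prefetch_plan` and the parameters `run_threshold` and `prefetch_depth` are hypothetical.

```python
def prefetch_plan(accesses, run_threshold=3, prefetch_depth=4):
    """Return (trace index, blocks to prefetch) decisions for a block trace.

    Assumption: a run is "sequential" once `run_threshold` consecutive
    block addresses have been seen; we then prefetch the next
    `prefetch_depth` blocks after the current one.
    """
    plan = []
    run_length = 1
    for i in range(1, len(accesses)):
        if accesses[i] == accesses[i - 1] + 1:
            run_length += 1  # the sequential run continues
        else:
            run_length = 1   # the run is broken; start counting again
        if run_length >= run_threshold:
            start = accesses[i] + 1
            plan.append((i, list(range(start, start + prefetch_depth))))
    return plan


trace = [10, 11, 12, 13, 50, 51]
decisions = prefetch_plan(trace, run_threshold=3, prefetch_depth=2)
```

    With this trace, prefetching is triggered only while the run 10, 11, 12, 13 lasts; the jump to block 50 resets the detector.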

    Geometric partitioning algorithms for fair division of geographic resources

    University of Minnesota Ph.D. dissertation. July 2014. Major: Industrial and Systems Engineering. Advisor: John Gunnar Carlsson. 1 computer file (PDF): vi, 140 pages, appendices p. 129-140. This dissertation focuses on a fundamental but under-researched problem: how does one divide a piece of territory into smaller pieces in an efficient way? In particular, we are interested in the \emph{map segmentation problem} of partitioning a geographic region into smaller subregions for allocating resources or distributing a workload among multiple agents. This work would result in useful solutions for a variety of fundamental problems, ranging from congressional districting, facility location, and supply chain management to air traffic control and vehicle routing. In a typical map segmentation problem, we are given a geographic region $R$, a probability density function defined on $R$ (representing, say, population density, the distribution of a natural resource, or the locations of clients), and a set of points in $R$ (representing, say, service facilities or vehicle depots). We seek a \emph{partition} of $R$, that is, a collection of disjoint sub-regions $\{R_1, \ldots, R_n\}$ such that $\bigcup_i R_i = R$, that optimizes some objective function while satisfying a shape condition. As examples of shape conditions, we may require that all sub-regions be compact, convex, star convex, simply connected (not having holes), connected, or merely measurable. Such problems are difficult because the search space is infinite-dimensional (since we are designing boundaries between sub-regions) and because the shape conditions are generally difficult to enforce using standard optimization methods. There are also many interesting variants and extensions to this problem. It is often the case that the optimal partition for a problem changes over time as new information about the region is collected. 
In that case, we have an \emph{online} problem and we must re-draw the sub-region boundaries as time progresses. In addition, we often prefer to construct these sub-regions in a \emph{decentralized} fashion: that is, the sub-region assigned to agent $i$ should be computable using only information local to agent $i$ (such as nearby neighbors or information about its surroundings), and the optimal boundary between two sub-regions should be computable using only knowledge available to those two agents. This dissertation is an attempt to design geometric algorithms that solve the above-mentioned problems while keeping in view the various design constraints. We describe the drawbacks of the current approach to solving map segmentation problems: its ineffectiveness at imposing geometric shape conditions and its limited utility in solving the online version of the problem. Using an intrinsically interdisciplinary approach, combining elements from variational calculus, computational geometry, geometric probability theory, and vector space optimization, we present an approach in which we formulate the problems geometrically and then use a fast geometric algorithm to solve them. We demonstrate our success by solving problems having a particular choice of objective function and enforcing certain shape conditions. In fact, it turns out that such methods give useful insights into, and algorithms for, classical location problems such as the continuous $k$-medians problem, where the aim is to find optimal locations for facilities. We use a map segmentation technique to present a constant-factor approximation algorithm for the continuous $k$-medians problem in a convex polygon. We conclude this thesis by describing how we intend to build on this success and develop algorithms to solve larger classes of these problems.
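
    To make the ingredients of a map segmentation instance concrete (a region $R$, a density on $R$, and a set of facility points), here is a rough sketch that partitions the unit square by nearest facility and estimates the probability mass of each sub-region on a grid. The nearest-facility (Voronoi) rule, the unit-square region, and the name `nearest_facility_masses` are illustrative assumptions; this is not the dissertation's algorithm, and it does not optimize any objective.

```python
def nearest_facility_masses(facilities, density, grid=50):
    """Estimate the mass of each nearest-facility cell in the unit square.

    Assumptions: R is [0, 1] x [0, 1], `density(x, y)` integrates to 1
    over R, and each grid cell is represented by its centre point.
    """
    masses = [0.0] * len(facilities)
    cell_area = 1.0 / (grid * grid)
    for ix in range(grid):
        for iy in range(grid):
            x = (ix + 0.5) / grid
            y = (iy + 0.5) / grid
            # Index of the facility closest to this cell centre.
            nearest = min(
                range(len(facilities)),
                key=lambda k: (x - facilities[k][0]) ** 2
                + (y - facilities[k][1]) ** 2,
            )
            masses[nearest] += density(x, y) * cell_area
    return masses


uniform = lambda x, y: 1.0
masses = nearest_facility_masses([(0.25, 0.5), (0.75, 0.5)], uniform)
```

    For two facilities placed symmetrically under a uniform density, each sub-region receives about half of the total mass; a non-uniform density would generally make this simple partition unbalanced, which is where the optimization problem described above begins.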

    Methodology and Software for Interactive Decision Support

    These Proceedings report the scientific results of an International Workshop on "Methodology and Software for Interactive Decision Support" organized jointly by the System and Decision Sciences Program of IIASA and The National Committee for Applied Systems Analysis and Management in Bulgaria. Several other Bulgarian institutions sponsored the workshop -- The Committee for Science to the Council of Ministers, The State Committee for Research and Technology and The Bulgarian Industrial Association. The workshop was held in Albena, on the Black Sea Coast. In the first section, "Theory and Algorithms for Multiple Criteria Optimization," new theoretical developments in multiple criteria optimization are presented. In the second section, "Theory, Methodology and Software for Decision Support Systems," the principles of building decision support systems are presented, as well as software tools constituting the building components of such systems. Moreover, several papers are devoted to the general methodology of building such systems or present experimental designs of systems supporting a certain class of decision problems. The third section addresses issues of "Applications of Decision Support Systems and Computer Implementations of Decision Support Systems." Another part of this section has a special character. Besides theoretical and methodological papers, several practical implementations of software for decision support were presented during the workshop. These software packages varied from very experimental and illustrative implementations of some theoretical concept to well-developed and documented systems that are currently commercially distributed and used for solving practical problems.

    Elastic Dataflow Processing on the Cloud

    Clouds have become an attractive platform for the large-scale processing of modern applications on Big Data, especially due to the concept of elasticity, which characterizes them: resources can be leased on demand and used for as much time as needed, offering the ability to create virtual infrastructures that change dynamically over time. Such applications often require processing of complex queries that are expressed in a high-level language and are typically transformed into data processing flows (dataflows). A logical question that arises is whether elasticity affects dataflow execution and in which way. It seems reasonable that the execution is faster when more resources are used; however, the monetary cost is higher. 
This gives rise to the concept of eco-elasticity, an additional kind of elasticity that comes from economics and captures the trade-offs between the response time of the system and the amount of money we pay for it, as influenced by the use of different amounts of resources. In this thesis, we approach the elasticity of clouds in a unified way that combines both the traditional notion and eco-elasticity. This unified elasticity concept is essential for the development of auto-tuned systems in cloud environments. First, we demonstrate that eco-elasticity exists in several common tasks that appear in practice and that it can be discovered using a simple, yet highly scalable and efficient algorithm. Next, we present two cases of auto-tuned algorithms that use the unified model of elasticity in order to adapt to the query workload: 1) processing analytical queries in the form of tree execution plans in order to maximize profit and 2) automated index management taking into account compute and storage resources. Finally, we describe EXAREME, a system for elastic data processing on the cloud that has been used and extended in this work. The system offers declarative languages that are based on SQL with user-defined functions (UDFs) extended with parallelism primitives. EXAREME exploits both elasticities of clouds by dynamically allocating and deallocating compute resources in order to adapt to the query workload.
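
    The time/money trade-off described above can be illustrated with a toy model. The sketch below is not taken from the thesis: it assumes a perfectly parallel job, containers billed per whole time quantum, and hypothetical names (`time_cost_options`, `pareto_front`). It enumerates container counts and keeps only the options that are not beaten on both time and cost.

```python
import math


def time_cost_options(total_work_hours, max_containers,
                      price_per_quantum=1.0, quantum_hours=1.0):
    """List (containers, elapsed_hours, cost) for a perfectly parallel job.

    Assumption: every container is billed for each started quantum.
    """
    options = []
    for n in range(1, max_containers + 1):
        elapsed = total_work_hours / n
        quanta = math.ceil(elapsed / quantum_hours)
        options.append((n, elapsed, n * quanta * price_per_quantum))
    return options


def pareto_front(options):
    """Keep options for which no other option is at least as fast and as
    cheap, and strictly better in one of the two."""
    front = []
    for n, t, c in options:
        dominated = any(
            t2 <= t and c2 <= c and (t2 < t or c2 < c)
            for _, t2, c2 in options
        )
        if not dominated:
            front.append((n, t, c))
    return front


front = pareto_front(time_cost_options(3.0, 4))
```

    For three hours of work and up to four containers, only two choices survive: three containers (one hour, cost 3) and four containers (45 minutes, cost 4). The remaining choices are either slower at the same price or more expensive without being faster.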

    Datacenter Architectures for the Microservices Era

    Modern internet services are shifting away from single-binary, monolithic services into numerous loosely-coupled microservices that interact via Remote Procedure Calls (RPCs), to improve programmability, reliability, manageability, and scalability of cloud services. Computer system designers are faced with many new challenges with microservice-based architectures, as individual RPCs/tasks are only a few microseconds in most microservices. In this dissertation, I seek to address the most notable challenges that arise due to the dissimilarities of the modern microservice-based and classic monolithic cloud services, and design novel server architectures and runtime systems that enable efficient execution of µs-scale microservices on modern hardware. In the first part of my dissertation, I seek to address the problem of Killer Microseconds, which refers to µs-scale “holes” in CPU schedules caused by stalls to access fast I/O devices or brief idle times between requests in high throughput µs-scale microservices. Whereas modern computing platforms can efficiently hide ns-scale and ms-scale stalls through micro-architectural techniques and OS context switching, they lack efficient support to hide the latency of µs-scale stalls. In chapter II, I propose Duplexity, a heterogeneous server architecture that employs aggressive multithreading to hide the latency of killer microseconds, without sacrificing the Quality-of-Service (QoS) of latency-sensitive microservices. Duplexity is able to achieve 1.9× higher core utilization and 2.7× lower iso-throughput 99th-percentile tail latency over an SMT-based server design, on average. In chapters III-IV, I comprehensively investigate the problem of tail latency in the context of microservices and address multiple aspects of it. First, in chapter III, I characterize the tail latency behavior of microservices and provide general guidelines for optimizing computer systems from a queuing perspective to minimize tail latency. 
Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are enqueued behind rare, long ones, due to Head-of-Line (HoL) blocking. Next, in chapter IV, I introduce Q-Zilla, a scheduling framework to tackle tail latency from a queuing perspective, and CoreZilla, a microarchitectural instantiation of the framework. Q-Zilla is composed of the ServerQueue Decoupled Size-Interval Task Assignment (SQD-SITA) scheduling algorithm and the Express-lane Simultaneous Multithreading (ESMT) microarchitecture, which together seek to address HoL blocking by providing an “express-lane” for short tasks, protecting them from queuing behind rare, long ones. By combining the ESMT microarchitecture and the SQD-SITA scheduling algorithm, CoreZilla is able to improve tail latency over a conventional SMT core with 2, 4, and 8 contexts by 2.25×, 3.23×, and 4.38×, on average, respectively, and outperform a theoretical 32-core scale-up organization by 12%, on average, with 8 contexts. Finally, in chapters V-VI, I investigate the tail latency problem of microservices from a cluster, rather than server-level, perspective. Whereas Service Level Objectives (SLOs) define end-to-end latency targets for the entire service to ensure user satisfaction, with microservice-based applications, it is unclear how to scale individual microservices when end-to-end SLOs are violated or underutilized. I introduce Parslo as an analytical framework for partial SLO allocation in virtualized cloud microservices. Parslo takes a microservice graph as an input and employs a Gradient Descent-based approach to allocate “partial SLOs” to different microservice nodes, enabling independent auto-scaling of individual microservices. Parslo achieves the optimal solution, minimizing the total cost for the entire service deployment, and is applicable to general microservice graphs.
PHD. Computer Science & Engineering. University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167978/1/miramir_1.pd
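
    The head-of-line blocking argument above can be illustrated with plain size-interval task assignment (SITA), in which tasks are routed to lanes by size so that short tasks never wait behind long ones. The sketch below is a generic illustration and not SQD-SITA or ESMT as defined in the dissertation; the names `sita_assign` and `fifo_waits`, the size cutoff, and the assumption that all tasks arrive at time zero are hypothetical.

```python
from bisect import bisect_right


def sita_assign(task_sizes, cutoffs):
    """Map each task to a lane index by size.

    `cutoffs` must be sorted; sizes strictly below cutoffs[0] go to
    lane 0 (the "express lane"), and so on.
    """
    return [bisect_right(cutoffs, size) for size in task_sizes]


def fifo_waits(task_sizes):
    """Waiting time of each task in one FIFO lane, all arriving at t = 0."""
    waits, clock = [], 0.0
    for size in task_sizes:
        waits.append(clock)
        clock += size
    return waits


sizes = [1, 100, 1, 1]              # one rare, long task among short ones
single_queue = fifo_waits(sizes)    # short tasks queue behind the long one
lanes = sita_assign(sizes, cutoffs=[10])
express = fifo_waits([s for s, lane in zip(sizes, lanes) if lane == 0])
```

    In the single queue, the last two short tasks wait more than 100 time units behind the long task; in the express lane, the longest wait for a short task is 2 time units.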

    Structural issues and energy efficiency in data centers

    International Mention in the doctoral degree. With the rise of cloud computing, data centers have been called to play a main role in the Internet scenario nowadays. Despite this relevance, they are probably far from their zenith yet due to the ever increasing demand of contents to be stored in and distributed by the cloud, the need of computing power or the larger and larger amounts of data being analyzed by top companies such as Google, Microsoft or Amazon. However, everything is not always a bed of roses. Having a data center entails two major issues: they are terribly expensive to build, and they consume huge amounts of power being, therefore, terribly expensive to maintain. For this reason, cutting down the cost of building and increasing the energy efficiency (and hence reducing the carbon footprint) of data centers have been among the hottest research topics in recent years. In this thesis we propose different techniques that can have an impact on both the building and the maintenance costs of data centers of any size, from small scale to large flagship data centers. The first part of the thesis is devoted to structural issues. We start by analyzing the bisection (band)width of a topology, of product graphs in particular, a useful parameter to compare and choose among different data center topologies. In that same part we describe the problem of deploying the servers in a data center as a Multidimensional Arrangement Problem (MAP) and propose a heuristic to reduce the deployment and wiring costs. We target energy efficiency in data centers in the second part of the thesis. We first propose a method to reduce the energy consumption in the data center network: rate adaptation. Rate adaptation is based on the idea of energy proportionality and aims to consume power on network devices proportionally to the load on their links. 
Our analysis proves that just using rate adaptation we may achieve average energy savings on the order of 30-40%, and up to 60%, depending on the network topology. We continue by characterizing the power requirements of a data center server given that, in order to properly increase the energy efficiency of a data center, we first need to understand how energy is being consumed. We present an exhaustive empirical characterization of the power requirements of multiple components of data center servers, namely, the CPU, the disks, and the network card. To do so, we devise different experiments to stress these components, taking into account the multiple available frequencies as well as the fact that we are working with multicore servers. In these experiments, we measure their energy consumption and identify their optimal operational points. Our study proves that the curve that defines the minimal power consumption of the CPU, as a function of the load in Active Cycles Per Second (ACPS), is neither concave nor purely convex. Moreover, it definitively has a superlinear dependence on the load. We also validate the accuracy of the model derived from our characterization by running different Hadoop applications in diverse scenarios, obtaining an error below 4.1% on average. The last topic we study is the Virtual Machine Assignment problem (VMA), i.e., optimizing how virtual machines (VMs) are assigned to physical machines (PMs) in data centers. Our optimization target is to minimize the power consumed by all the PMs when considering that power consumption depends superlinearly on the load. We study four different VMA problems, depending on whether the number of PMs and their capacity are bounded or not. We study their complexity and perform an offline and online analysis of these problems. 
The online analysis is complemented with simulations that show that the online algorithms we propose consume substantially less power than other state-of-the-art assignment algorithms. Official Doctoral Program in Telematics Engineering. President: Joerg Widmer. Secretary: José Manuel Moya Fernández. Member: Shmuel Zak
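
    To show why a superlinear power curve matters for VM assignment, the sketch below places VMs one at a time on the physical machine whose power would increase least. It is an illustration only: the power model `load ** alpha` with no idle power, the unbounded PM capacity, and the name `greedy_assign` are assumptions, and this greedy rule is not claimed to be one of the algorithms analyzed in the thesis.

```python
def greedy_assign(vm_loads, num_pms, alpha=2.0):
    """Assign each VM, in arrival order, to the PM with the smallest
    marginal power increase under the assumed model power = load ** alpha.

    Returns (assignment, total_power), where assignment[i] is the PM
    index chosen for VM i.
    """
    loads = [0.0] * num_pms
    assignment = []
    for vm in vm_loads:
        best = min(
            range(num_pms),
            key=lambda p: (loads[p] + vm) ** alpha - loads[p] ** alpha,
        )
        loads[best] += vm
        assignment.append(best)
    total_power = sum(load ** alpha for load in loads)
    return assignment, total_power


assignment, balanced_power = greedy_assign([4, 3, 2, 1], num_pms=2)
_, consolidated_power = greedy_assign([4, 3, 2, 1], num_pms=1)
```

    With a quadratic curve, spreading a total load of 10 evenly over two machines costs 50 units, while placing all of it on one machine costs 100. A realistic model would also include idle power, which pulls in the opposite direction and makes the problem non-trivial.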

    Proceedings of the Fourth MIT/ONR Workshop on Distributed Information and Decision Systems Motivated by Command-Control-Communications (C3) Problems, June 15-June 26, 1981, San Diego, California

    "OSP number 85552"--Cover. Library has v. 2 only. Includes bibliographies. Workshop supported by the Office of Naval Research under contract ONR/N00014-77-C-0532. Edited by Michael Athans ... [et al.]. v.1. Surveillance and target tracking--v.2. Systems architecture and evaluation--v.3. Communication, data bases & decision support--v.4. C3 theory

    User-Oriented Methodology and Techniques of Decision Analysis and Support

    This volume contains 26 papers selected from Workshop presentations. The book is divided into two sections; the first is devoted to the methodology of decision analysis and support and related theoretical developments, and the second reports on the development of tools -- algorithms, software packages -- for decision support as well as on their applications. Several major contributions (on constructing user interfaces, on organizing intelligent DSS, and on modifying theory and tools in response to user needs) are included in this volume.

    A computer graphics approach to logistics strategy modelling

    This thesis describes the development and application of a decision support system for logistics strategy modelling. The decision support system enables the modelling of logistics systems at a strategic level for any country or area in the world. The model runs on IBM PC or compatible computers under DOS (disk operating system). The decision support system uses colour graphics to represent the different physical functions of a logistics system. The graphics of the system are machine independent. The model displays on the screen the map of the area or country being considered for logistics planning. The decision support system is hybrid in terms of its algorithms. It employs optimisation for allocation: customers are allocated by building a network path from each customer to the source points, taking into consideration all the production and throughput constraints on factories, distribution depots and transshipment points. The system uses visually interactive, graphics-based heuristics to find the best possible locations for distribution depots and transshipment points. In a one-depot system it gives the optimum solution, but where more than one depot is involved, the optimum solution is not guaranteed. The developed model is cost-driven. It represents all the logistics system costs in their proper form, and its solution depends heavily on the relationship between these costs. The locations of depots and transshipment points depend on the relationship between inbound and outbound transportation costs. The model has been validated on real-world problems, some of which are described here. The advantages of such a decision support system for the formulation of a problem are discussed. Also discussed is the contribution of such an approach at the validation and solution presentation stages.

    Scalable Automated Incrementalization for Real-Time Static Analyses

    This thesis proposes a framework for easy development of static analyses, whose results are incrementalized to provide instantaneous feedback in an integrated development environment (IDE). Today, IDEs feature many tools that have static analyses as their foundation to assess software quality and catch correctness problems. Yet, these tools often fail to provide instantaneous feedback and are thus restricted to nightly build processes. This precludes developers from fixing issues at their inception time, i.e., when the problem and the developed solution are both still fresh in mind. In order to provide instantaneous feedback, incrementalization is a well-known technique that utilizes the fact that developers make only small changes to the code and, hence, analysis results can be re-computed fast based on these changes. Yet, incrementalization requires carefully crafted static analyses. Thus, a manual approach to incrementalization is unattractive. Automated incrementalization can alleviate these problems and allows analysis writers to formulate their analyses as queries with the full data set in mind, without worrying about the semantics of incremental changes. Existing approaches to automated incrementalization utilize standard technologies, such as deductive databases, that provide declarative query languages, yet also require materializing the full dataset in main memory, i.e., the memory is permanently blocked by the data required for the analyses. Other standard technologies such as relational databases offer better scalability due to persistence, yet require large transaction times for data. Neither technology is a perfect match for integrating static analyses into an IDE, since the underlying data, i.e., the code base, is already persisted and managed by the IDE. Hence, transitioning the data into a database is redundant work. 
In this thesis a novel approach is proposed that provides a declarative query language and automated incrementalization, yet retains in memory only a necessary minimum of data, i.e., only the data that is required for the incrementalization. The approach allows static analyses to be declared as incrementally maintained views, where the underlying formalism for incrementalization is the relational algebra with extensions for object-orientation and recursion. The algebra makes it possible to deduce which data is the necessary minimum for incremental maintenance and indeed shows that many views are self-maintainable, i.e., do not require materializing any data in memory at all. In addition, an optimization for the algebra is proposed that widens the range of self-maintainable views, based on domain knowledge of the underlying data. The optimization works similarly to declaring primary keys for databases, i.e., the optimization is declared on the schema of the data, and defines which data is incrementally maintained in the same scope. The scope makes all analyses (views) that correlate only data within the boundaries of the scope self-maintainable. The approach is implemented as an embedded domain-specific language in a general-purpose programming language. The implementation can be understood as a database-like engine with an SQL-style query language and the execution semantics of the relational algebra. As such, the system is a general-purpose database-like query engine and can be used to incrementalize domains other than static analyses. To evaluate the approach, a large variety of static analyses were sampled from real-world tools and formulated as incrementally maintained views in the implemented engine.
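
    The notion of a self-maintainable view can be illustrated with a selection (filter) operator: each added or removed element can be processed on its own, so the operator does not need to store the base data. The sketch below is a generic Python illustration under that assumption; the class `SelectionView`, its method names, and the tuple-based "method" records are hypothetical and do not reflect the thesis's embedded language or engine.

```python
class SelectionView:
    """A filter view that forwards deltas without storing any base data."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.observers = []  # callbacks receiving (kind, item)

    def subscribe(self, callback):
        self.observers.append(callback)

    def added(self, item):
        # An addition to the base data changes the view only if it matches.
        if self.predicate(item):
            for notify in self.observers:
                notify("add", item)

    def removed(self, item):
        # A removal is handled the same way; no lookup of old state needed.
        if self.predicate(item):
            for notify in self.observers:
                notify("remove", item)


# Hypothetical analysis: report methods with more than three parameters.
events = []
too_many_params = SelectionView(lambda method: method[1] > 3)
too_many_params.subscribe(lambda kind, item: events.append((kind, item)))

too_many_params.added(("parse", 5))    # reported
too_many_params.added(("close", 0))    # filtered out
too_many_params.removed(("parse", 5))  # the report is retracted
```

    Operators such as joins generally need some state about previously seen elements, which is where the question of the "necessary minimum" of retained data discussed above becomes relevant.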