2,908 research outputs found

    When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Processing

    Carefully balancing load in distributed stream processing systems has a fundamental impact on execution latency and throughput. Load balancing is challenging because real-world workloads are skewed: some tuples in the stream are associated with keys that are significantly more frequent than others. Skew is remarkably more problematic in large deployments: more workers implies fewer keys per worker, so it becomes harder to "average out" the cost of hot keys with cold keys. We propose a novel load balancing technique that uses a heavy hitter algorithm to efficiently identify the hottest keys in the stream. These hot keys are assigned to d ≥ 2 choices to ensure a balanced load, where d is tuned automatically to minimize the memory and computation cost of operator replication. The technique works online and does not require the use of routing tables. Our extensive evaluation shows that our technique can balance real-world workloads on large deployments, and improve throughput and latency by 150% and 60% respectively over the previous state-of-the-art when deployed on Apache Storm. Comment: 12 pages, 14 Figures, this paper is accepted and will be published at ICDE 201
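
The routing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Misra-Gries counter stands in for whichever heavy hitter algorithm the paper uses, the fixed `d_hot`, `d_cold`, `threshold`, and `k` parameters are hypothetical (the paper tunes d automatically), and routing cold keys over two choices is an assumption.

```python
import hashlib


def candidates(key: str, d: int, n_workers: int) -> list[int]:
    """Return d candidate workers for a key, derived from d seeded hashes."""
    out = []
    for seed in range(d):
        digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
        out.append(int.from_bytes(digest[:8], "big") % n_workers)
    return out


class MisraGries:
    """Approximate frequent-item counter that keeps at most k - 1 counters."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.counts: dict[str, int] = {}
        self.total = 0

    def add(self, key: str) -> None:
        self.total += 1
        if key in self.counts:
            self.counts[key] += 1
        elif len(self.counts) < self.k - 1:
            self.counts[key] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for tracked in list(self.counts):
                self.counts[tracked] -= 1
                if self.counts[tracked] == 0:
                    del self.counts[tracked]

    def is_hot(self, key: str, threshold: float) -> bool:
        # Counts are underestimates, so this test is conservative.
        return self.counts.get(key, 0) >= threshold * self.total


def route(stream, n_workers, d_hot=4, d_cold=2, threshold=0.05, k=50):
    """Send each tuple to the least-loaded of its candidate workers.

    Hot keys get d_hot candidates, all other keys get d_cold (assumed values).
    Returns the number of tuples each worker received.
    """
    load = [0] * n_workers
    sketch = MisraGries(k)
    for key in stream:
        sketch.add(key)
        d = d_hot if sketch.is_hot(key, threshold) else d_cold
        worker = min(candidates(key, d, n_workers), key=lambda w: load[w])
        load[worker] += 1
    return load
```

Calling `route(["hot"] * 500 + [f"k{i}" for i in range(500)], n_workers=8)` returns the per-worker tuple counts for a stream in which one key carries half the traffic. Because candidates come from hashing alone, no routing table is kept, which matches the property the abstract describes.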

    Quaestor: Query web caching for database-as-a-service providers

    Today, web performance is primarily governed by round-trip latencies between end devices and cloud services. To improve performance, services need to minimize the delay of accessing data. In this paper, we propose a novel approach to low latency that relies on existing content delivery and web caching infrastructure. The main idea is to enable application-independent caching of query results and records with tunable consistency guarantees, in particular bounded staleness. Quaestor (Query Store) employs two key concepts to incorporate both expiration-based and invalidation-based web caches: (1) an Expiring Bloom Filter data structure to indicate potentially stale data, and (2) statistically derived cache expiration times to maximize cache hit rates. Through a distributed query invalidation pipeline, changes to cached query results are detected in real-time. The proposed caching algorithms offer a new means for data-centric cloud services to trade latency against staleness bounds, e.g. in a database-as-a-service. Quaestor is the core technology of the backend-as-a-service platform Baqend, a cloud service for low-latency websites. We provide empirical evidence for Quaestor's scalability and performance through both simulation and experiments. The results indicate that for read-heavy workloads, up to tenfold speed-ups can be achieved through Quaestor's caching.
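
One way to picture the Expiring Bloom Filter mentioned in this abstract is as a counting Bloom filter whose entries are removed once the cached copy they refer to has expired. The sketch below assumes that reading; the class name, method names, and parameters are hypothetical and are not taken from Quaestor itself.

```python
import hashlib
import heapq


class ExpiringBloomFilter:
    """Counting Bloom filter whose entries disappear after a deadline."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.counters = [0] * n_bits
        self.expirations = []  # min-heap of (expires_at, key)

    def _positions(self, key):
        # Derive n_hashes counter positions from seeded SHA-256 digests.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def mark_stale(self, key, expires_at):
        """Record that cached copies of key may be stale until expires_at."""
        for pos in self._positions(key):
            self.counters[pos] += 1
        heapq.heappush(self.expirations, (expires_at, key))

    def _expire(self, now):
        # Remove every entry whose cached copies have expired by now.
        while self.expirations and self.expirations[0][0] <= now:
            _, key = heapq.heappop(self.expirations)
            for pos in self._positions(key):
                self.counters[pos] -= 1

    def maybe_stale(self, key, now):
        """Return True if key is possibly stale; False means certainly not."""
        self._expire(now)
        return all(self.counters[pos] > 0 for pos in self._positions(key))
```

A client holding this filter would bypass its cache for any key where `maybe_stale` returns True. False positives only cost an extra round trip, while a False answer is definitive, which is what makes a Bloom filter a reasonable fit for bounding staleness.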

    The Effects of Group Size on Student Learning, Student Contributions, Mental Effort, and Group Outcomes for Middle-Aged Adults Working in an Ill-Structured Problem-Solving Environment

    Group work has become increasingly important within adult education as educators strive to present students with problems and processes that they encounter in their professional lives. In many work environments, individuals are expected to function as part of a team to solve complex problems. Consequently, there has been a shift towards teaching students how to solve problems as part of a group rather than individually. An important question becomes: what group size maximizes student learning? This study compared student learning, student participation levels, and mental effort for middle-aged, professional students in large groups (six students) and small groups (three students) working in a collaborative, ill-structured problem-solving environment to determine whether group size affected student performance. The study found no significant difference in learning, participation, or mental effort between large and small groups. It also confirmed earlier research demonstrating that group product scores, even when adjusted for student participation, did not predict individual student learning. A multiple regression was used to determine whether group size, participation, mental effort, or group scores could be used to predict individual student learning. The study showed that for middle-aged professional students, group size, mental effort, participation, and group quality were not effective predictors of student learning.

    Improving healthcare supply chains and decision making in the management of pharmaceuticals

    The rising cost of quality healthcare is becoming an increasing concern. A significant part of healthcare cost is the pharmaceutical supply component. Improving healthcare supply chains is critical not only because of the financial magnitude but also because it impacts so many people. Efforts such as this project are essential in understanding the current operations of healthcare pharmacy systems and in offering decision support tools to managers struggling to make the best use of organizational resources. The purpose of this study is to address the objectives of a local hospital that exhibits typical problems in pharmacy supply chain management. We analyze the pharmacy supply network structure and the different, often conflicting goals in the decisions of the various stakeholders. We develop quantitative models useful in optimizing supply chain management and inventory management practices. We provide decision support tools that improve operational, tactical, and strategic decision making in the pharmacy supply chain and inventory management of pharmaceuticals. On one hand, advanced computerized technology that manages pharmaceutical dispensation and automates the ordering process offers considerable progress in supporting pharmacy product distribution. On the other hand, the available information is not used to help managers make appropriate decisions and control the supply chain. Quantitative methods are presented that provide simplified, practical solutions to pharmacy objectives and serve as decision support tools. For operational inventory decisions we provide the min and max par levels (reorder point and order-up-to level) that control the automated ordering system for pharmaceuticals. These parameters are based on two near-optimal allocation policies of cycle stock and safety stock under a storage space constraint. For the tactical decision we demonstrate the influence of varying inventory holding cost rates on setting the optimal reorder point and order quantity for items. We present a strategic decision support tool to analyze the tradeoffs among the refill workload, the emergency workload, and the variety of drugs offered. We reveal the relationship of these tradeoffs to the three key performance indicators at a local care unit: the expected number of daily refills, the service level, and the storage space utilization.
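
The min and max par levels described in this abstract can be illustrated with the textbook reorder-point calculation. This is a sketch under stated assumptions, not the study's allocation policies: it assumes normally distributed daily demand, a fixed lead time, and no storage space constraint, and the function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def par_levels(mean_daily_demand, sd_daily_demand, lead_time_days,
               service_level, order_quantity):
    """Return (min_par, max_par) for one item.

    min_par is the reorder point: expected lead-time demand plus safety stock.
    max_par is the order-up-to level: reorder point plus the cycle stock.
    """
    # z-score for the target probability of not stocking out during lead time.
    z = NormalDist().inv_cdf(service_level)
    safety_stock = z * sd_daily_demand * sqrt(lead_time_days)
    min_par = mean_daily_demand * lead_time_days + safety_stock
    max_par = min_par + order_quantity
    return min_par, max_par
```

For example, `par_levels(10, 3, 2, 0.95, 40)` gives the two levels for an item with a mean demand of 10 units per day, a two-day lead time, and a 95% service-level target. Raising the service level increases the safety stock and therefore both par levels, which is the kind of tradeoff against storage space that the abstract refers to.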