    S-Store: Streaming Meets Transaction Processing

    Stream processing addresses the needs of real-time applications. Transaction processing addresses the coordination and safety of short atomic computations. Heretofore, these two modes of operation existed in separate, stove-piped systems. In this work, we attempt to fuse the two computational paradigms in a single system called S-Store. In this way, S-Store can simultaneously accommodate OLTP and streaming applications. We present a simple transaction model for streams that integrates seamlessly with a traditional OLTP system. We chose to build S-Store as an extension of H-Store, an open-source, in-memory, distributed OLTP database system. By implementing S-Store in this way, we can make use of the transaction processing facilities that H-Store already supports, and we can concentrate on the additional implementation features that are needed to support streaming. Similar implementations could be done using other main-memory OLTP platforms. We show that we can actually achieve higher throughput for streaming workloads in S-Store than in an equivalent deployment in H-Store alone. We also show how this can be achieved within H-Store with the addition of a modest amount of new functionality. Furthermore, we compare S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm, and show how S-Store matches and sometimes exceeds their performance while providing stronger transactional guarantees.

    Parameterized complexity of machine scheduling: 15 open problems

    Machine scheduling problems are a long-time key domain of algorithms and complexity research. Fixed-parameter algorithms are a novel approach to machine scheduling problems. To stimulate this thriving research direction, we propose 15 open questions in this area whose resolution we expect to lead to the discovery of new approaches and techniques in both scheduling and parameterized complexity theory. Comment: Version accepted to Computers & Operations Research.

    Discovering Job Preemptions in the Open Science Grid

    The Open Science Grid (OSG) is a world-wide computing system which facilitates distributed computing for scientific research. It can distribute a computationally intensive job to geo-distributed clusters and process the job's tasks in parallel. For compute clusters on the OSG, physical resources may be shared between OSG jobs and the cluster's local user-submitted jobs, with local jobs preempting OSG-based ones. As a result, job preemptions occur frequently in OSG, sometimes significantly delaying job completion time. We have collected job data from OSG over a period of more than 80 days. We present an analysis of the data, characterizing the preemption patterns and different types of jobs. Based on these observations, we have grouped OSG jobs into 5 categories and analyzed the runtime statistics for each category. We further choose different statistical distributions to estimate the probability density function of job runtime for the different classes. Comment: 8 pages.

    Scheduling of data-intensive workloads in a brokered virtualized environment

    Providing performance predictability guarantees is increasingly important in cloud platforms, especially for data-intensive applications, for which performance depends greatly on the available rates of data transfer between the various computing/storage hosts underlying the virtualized resources assigned to the application. With the increased prevalence of brokerage services in cloud platforms, there is a need for resource management solutions that consider the brokered nature of these workloads, as well as the special demands of their intra-dependent components. In this paper, we present an offline mechanism for scheduling batches of brokered data-intensive workloads, which can be extended to an online setting. The objective of the mechanism is to decide on a packing of the workloads in a batch that minimizes the broker's incurred costs. Moreover, considering the brokered nature of such workloads, we define a payment model that provides incentives to these workloads to be scheduled as part of a batch, which we analyze theoretically. Finally, we evaluate the proposed scheduling algorithm, and exemplify the fairness of the payment model in practical settings via trace-based experiments.

    Competitive two-agent scheduling with deteriorating jobs on a single parallel-batching machine

    We consider a scheduling problem in which the jobs are generated by two agents and have time-dependent proportional-linear deteriorating processing times. The two agents compete for a common single batching machine to process their jobs, and each agent has its own criterion to optimize. The jobs may have identical or different release dates. The batching machine can process several jobs simultaneously as a batch and the processing time of a batch is equal to the longest of the job processing times in the batch. The problem is to determine a schedule for processing the jobs such that the objective of one agent is minimized, while the objective of the other agent is maintained under a fixed value. For the unbounded model, we consider various combinations of regular objectives on the basis of the compatibility of the two agents. For the bounded model, we consider two different objectives for incompatible and compatible agents: minimizing the makespan of one agent subject to an upper bound on the makespan of the other agent and minimizing the number of tardy jobs of one agent subject to an upper bound on the number of tardy jobs of the other agent. We analyze the computational complexity of various problems by either demonstrating that the problem is intractable or providing an efficient exact algorithm for the problem. Moreover, for certain problems that are shown to be intractable, we provide efficient algorithms for certain special cases.

    On-line scheduling with delivery time on a single batch machine

    We consider a single batch machine on-line scheduling problem with jobs arriving over time. A batch processing machine can handle up to B jobs simultaneously as a batch, and the processing time for a batch is equal to the longest processing time among the jobs in it. Each job becomes available at its arrival time, which is not known in advance, and its characteristics, such as processing time and delivery time, become known at its arrival. Once the processing of a job is completed we deliver it to the destination. The objective is to minimize the time by which all jobs have been delivered. In this paper, we deal with two variants: the unbounded model where B is sufficiently large and the bounded model where B is finite. We provide on-line algorithms with competitive ratio 2 for the unbounded model and with competitive ratio 3 for the bounded model. For the case where each job has the same processing time, we provide on-line algorithms with competitive ratio (√5+1)/2, and these results are the best possible.
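    The batch machine model in the abstract above can be illustrated with a minimal sketch. It only evaluates the objective (the time by which all jobs are delivered) for a batch sequence that is already given; it is not the on-line algorithm from the paper. The names `Job`, `delivery_completion_time`, and `capacity` are hypothetical, and the sketch assumes each batch starts as soon as the machine is free and all of its jobs have arrived.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    arrival: float     # time at which the job becomes available
    processing: float  # processing time on the batch machine
    delivery: float    # delivery time after processing completes


def delivery_completion_time(
    batches: Sequence[Sequence[Job]], capacity: Optional[int] = None
) -> float:
    """Time by which all jobs are delivered when batches run in the given order.

    capacity is the batch size limit B; None models the unbounded case.
    """
    clock = 0.0            # time at which the machine next becomes free
    latest_delivery = 0.0  # objective value accumulated so far
    for batch in batches:
        if not batch:
            continue
        if capacity is not None and len(batch) > capacity:
            raise ValueError("batch exceeds machine capacity B")
        # A batch cannot start before the machine is free and all its jobs have arrived.
        start = max(clock, max(job.arrival for job in batch))
        # The batch takes as long as the longest job in it.
        clock = start + max(job.processing for job in batch)
        # All jobs in the batch complete at `clock`; each is then delivered.
        latest_delivery = max(
            latest_delivery, max(clock + job.delivery for job in batch)
        )
    return latest_delivery
```

    For example, with jobs (arrival, processing, delivery) = (0, 2, 3), (1, 1, 1), and (4, 2, 0), B = 2, and batches [[first, second], [third]], the first batch runs from time 1 to 3 and the second from 4 to 6, so all jobs are delivered by time 6.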

    The unbounded single machine parallel batch scheduling problem with family jobs and release dates to minimize makespan

    Author names used in this publication: Z. H. Liu; T. C. E. Cheng. 2003-2004 > Academic research: refereed > Publication in refereed journal. Accepted Manuscript. Published.