34 research outputs found

    Brief Announcement: Deadline-Aware Scheduling

    ABSTRACT This paper presents a novel algorithm for scheduling big data jobs on large compute clusters. In our model, each job is represented by a DAG consisting of several stages linked by precedence constraints. The resource allocation per stage is malleable, in the sense that the processing time of a stage depends on the resources allocated to it (the dependency can be arbitrary in general). The goal of the scheduler is to maximize the total value of completed jobs, where the value of each job depends on its completion time. We design an algorithm for the problem that guarantees an expected constant approximation factor when the cluster capacity is sufficiently high. To the best of our knowledge, this is the first constant-factor approximation algorithm for the problem. The algorithm formulates the problem as a linear program and then uses randomized rounding to convert an optimal (fractional) solution into a feasible (integral) schedule.
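The rounding step mentioned in the abstract can be illustrated with a minimal sketch. This is generic randomized rounding, not the paper's algorithm: it assumes a single scalar capacity (no time dimension, no DAG stages), and the names `round_schedule`, `fractional`, `demand`, and `capacity` are hypothetical.

```python
import random


def round_schedule(fractional, demand, capacity, seed=0):
    """Round a fractional assignment into an integral one (illustrative only).

    fractional[job] maps each candidate option of that job to a weight in
    [0, 1]; the weights of one job sum to at most 1 (hypothetical LP output).
    demand[option] is the amount of the single shared resource the option uses.
    """
    rng = random.Random(seed)
    chosen = {}
    used = 0.0
    for job, options in fractional.items():
        r = rng.random()
        acc = 0.0
        for option, weight in options.items():
            acc += weight
            if r < acc:
                # Option sampled with probability equal to its LP weight.
                # Keep it only if the capacity still holds; otherwise drop the job.
                if used + demand[option] <= capacity:
                    chosen[job] = option
                    used += demand[option]
                break
        # If r >= total weight, the job is left unscheduled.
    return chosen


if __name__ == "__main__":
    frac = {"a": {"a_fast": 0.7, "a_slow": 0.3}, "b": {"b_slow": 0.5}}
    need = {"a_fast": 4, "a_slow": 2, "b_slow": 2}
    print(round_schedule(frac, need, capacity=5))
```

Each job picks at most one option with probability equal to its fractional weight, and a job whose sampled option would exceed the capacity is discarded. That discard rule is a simplification made for this sketch.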

    Performance Regression Detection in DevOps

    Performance is an important aspect of software quality. Performance goals are typically defined by setting upper and lower bounds for the response time and throughput of a system and for physical-level measurements such as CPU, memory, and I/O. To meet such goals, several performance-related activities are needed in development (Dev) and operations (Ops). Large software system failures are often due to performance issues rather than functional bugs. One of the most important performance issues is performance regression. Although not all performance regressions are bugs, they often have a direct impact on users' experience of the system. Detecting performance regressions in development and operations faces two challenges. First, detection is conducted after the fact, i.e., after the system is built and deployed in the field or in dedicated performance testing environments. Large amounts of resources are required to detect, locate, understand, and fix performance regressions at such a late stage in the development cycle. Second, even if we can detect a performance regression, it is extremely hard to fix because other changes are applied to the system after the regression is introduced. These challenges call for further in-depth analysis of performance regressions. In this thesis, to prevent performance regressions from slipping into operation, we first perform an exploratory study of the source code changes that introduce performance regressions in order to understand the root causes of performance regressions at the source code level. Second, we propose an approach that automatically predicts whether a test would manifest performance regressions in a code commit. Most performance issues are related to configurations. Therefore, third, we propose an approach that predicts whether a configuration option manifests a performance variation issue. To assist practitioners in analyzing system performance with operational data, we propose an approach to recovering a field-representative workload that can be used to detect performance regressions.
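As a rough illustration of what flagging a performance regression after the fact can look like, the sketch below compares response-time samples from two builds. It is not one of the approaches proposed in the thesis; the function name `is_regression` and the 10% tolerance are assumptions made for the example.

```python
from statistics import median


def is_regression(baseline_ms, candidate_ms, tolerance=0.10):
    """Return True if the candidate build looks slower than the baseline.

    Both arguments are lists of response times in milliseconds. The candidate
    is flagged when its median exceeds the baseline median by more than
    `tolerance` (a hypothetical threshold, expressed as a fraction).
    """
    return median(candidate_ms) > median(baseline_ms) * (1 + tolerance)


if __name__ == "__main__":
    old_build = [100, 102, 98, 101]
    new_build = [120, 125, 118, 122]
    print(is_regression(old_build, new_build))
```

A median comparison is used here only because it is robust to a few outliers. A real pipeline would more likely rely on a statistical test and repeated runs.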

    Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters

    Horizontally-scalable Internet services on clusters of commodity computers appear to be a great fit for automatic control: there is a target output (service-level agreement), observed output (actual latency), and gain controller (adjusting the number of servers). Yet few datacenters are automated this way in practice, due in part to well-founded skepticism about whether the simple models often used in the research literature can capture complex real-life workload/performance relationships and keep up with changing conditions that might invalidate the models. We argue that these shortcomings can be fixed by importing modeling, control, and analysis techniques from statistics and machine learning. In particular, we apply rich statistical models of the application’s performance, simulation-based methods for finding an optimal control policy, and change-point methods to find abrupt changes in performance. Preliminary results running a Web 2.0 benchmark application driven by real workload traces on Amazon’s EC2 cloud show that our method can effectively control the number of servers, even in the face of performance anomalies.
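To make the idea of finding an abrupt change in performance concrete, here is a minimal sketch of a single mean-shift change-point search. It is a textbook least-squares split, not the method used by the authors, and the names `find_change_point` and `min_segment` are hypothetical.

```python
def _sse(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def find_change_point(series, min_segment=2):
    """Return the index k that best splits `series` into two constant-mean parts.

    The split minimizes the total squared error of series[:k] and series[k:].
    Returns None if the series is too short to hold two segments of
    `min_segment` points each.
    """
    best_k, best_cost = None, float("inf")
    for k in range(min_segment, len(series) - min_segment + 1):
        cost = _sse(series[:k]) + _sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


if __name__ == "__main__":
    latency_ms = [10, 11, 10, 11, 30, 31, 30, 31]
    print(find_change_point(latency_ms))
```

This search always returns the best split when the series is long enough, even if no real change occurred, so a practical detector would also test whether the improvement in fit is significant.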

    Increasing the effectivity of the antimicrobial surface of carbon quantum dots-based nanocomposite by atmospheric pressure plasma

    Preventing nosocomial infections is one of the most significant challenges in modern medicine. The disinfection of medical facilities and medical devices is crucial in order to prevent the uncontrolled spread of bacteria and viruses. Cost-effective, eco-friendly and fast-acting antibacterial coatings are being developed to prevent the multiplication of bacteria and viruses on various surfaces. One way to create such antimicrobial coatings is to rely on a photoactive material that produces singlet oxygen. However, remote production of singlet oxygen and disinfection of the desired surface is a time-consuming process. Hence, a coating material that autonomously produces singlet oxygen using ambient light would significantly shorten the disinfection time, leading to an increased number of patients that can be cured in one facility. In this work, an ultra-fast and eco-friendly method for decreasing the disinfection time of the photoactive surface is presented. Atmospheric pressure plasma surface treatment is applied to the hydrophobic carbon quantum dots-polydimethylsiloxane nanocomposite. The plasma-treated samples exhibited improved antibacterial properties compared to non-plasma-treated samples, with the best results obtained after only 30 seconds of plasma treatment. The short duration and the scalability potential of the method described here open new possibilities for improving existing antibacterial coatings. © 2020 Elsevier GmbH. Funding: Research & Innovation Operational Programme - ERDF; Czech Science Foundation (Grant Agency of the Czech Republic) [19-16861S]; project Building-up Centre for Advanced Materials Application of the Slovak Academy of Sciences [313021T081]; [VEGA 2/0051/20]; [APVV-15-0641

    Language Change in Multi-generational Community

    Steels [4] claims that both a flux of agents (agents being replaced during an experiment) and stochasticity in agent communication are necessary for spontaneous language change. This paper argues that a flux of agents alone can be responsible for spontaneous language change. This hypothesis is demonstrated by modeling language use through language games played in a population of evolving agents.

    Advanced Tools for Operators of Internet Services

    Web applications suffer from software and configuration faults that lower their availability. Recovery from failure is dominated by the time interval between when these faults appear and when they are detected and fixed by site operators. We introduce a set of tools that augment the ability of operators to perceive the presence of failure: the first tool uses an automatic anomaly detector to scour HTTP access logs for changes in user behavior that are indicative of site failures, and a visualizer helps operators rapidly detect and diagnose problems. Visualization addresses a key question of autonomic computing: how to win operators' confidence so that new tools will be embraced. An evaluation using HTTP logs from Ebates.com demonstrates that these tools can enhance the detection of failure as well as shorten detection time. Our approach is application-generic and can be applied to any Web application without the need for instrumentation. The other two tools are based on our experience with operators and resolvers at Amazon.com. The first tool lets the operators explore the health of system components and dependencies between them; the other monitors the actions of operators and automatically suggests solution