34 research outputs found

Brief Announcement: Deadline-Aware Scheduling

Author: Ishai Menache
Jonathan Yaniv
Joseph (Seffi) Naor
Peter Bodík
Publication date: 01/01/2014

This paper presents a novel algorithm for scheduling big data jobs on large compute clusters. In our model, each job is represented by a DAG consisting of several stages linked by precedence constraints. The resource allocation per stage is malleable, in the sense that the processing time of a stage depends on the resources allocated to it (the dependency can be arbitrary in general). The goal of the scheduler is to maximize the total value of completed jobs, where the value for each job depends on its completion time. We design an algorithm for the problem which guarantees an expected constant approximation factor when the cluster capacity is sufficiently high. To the best of our knowledge, this is the first constant-factor approximation algorithm for the problem. The algorithm is based on formulating the problem as a linear program and then rounding an optimal (fractional) solution into a feasible (integral) schedule using randomized rounding.

CiteSeerX

Performance Regression Detection in DevOps

Author: Bodík Peter
Chen Jinfu
Foo King Chun
Malik Haroon
Tan Jiaqi
Publication date: 02/10/2020

Performance is an important aspect of software quality. Performance goals are typically defined by setting upper and lower bounds on a system's response time and throughput, and on physical-level measurements such as CPU, memory, and I/O. To meet such performance goals, several performance-related activities are needed in development (Dev) and operations (Ops). Large software system failures are often due to performance issues rather than functional bugs. One of the most important performance issues is performance regression. Although performance regressions are not all bugs, they often have a direct impact on users' experience of the system. Detecting performance regressions in development and operations faces two challenges. First, the detection of performance regression is conducted after the fact, i.e., after the system is built and deployed in the field or in dedicated performance testing environments. Large amounts of resources are required to detect, locate, understand, and fix performance regressions at such a late stage in the development cycle. Second, even if we can detect a performance regression, it is extremely hard to fix because other changes are applied to the system after the regression is introduced. These challenges call for further in-depth analyses of performance regression. In this thesis, to prevent performance regressions from slipping into operation, we first perform an exploratory study of the source code changes that introduce performance regressions, in order to understand the root causes of performance regression at the source code level. Second, we propose an approach that automatically predicts whether a test would manifest performance regressions in a code commit. Most performance issues are related to configurations. Therefore, third, we propose an approach that predicts whether a configuration option manifests a performance variation issue.
To help practitioners analyze system performance with operational data, we propose an approach for recovering field-representative workloads that can be used to detect performance regressions.

Crossref

Concordia University Research Repository

Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters

Author: Bodík Peter
Fox Armando
Griffith Rean
Jordan Michael
Patterson David
Sutton Charles
Publication date: 01/01/2009

Horizontally-scalable Internet services on clusters of commodity computers appear to be a great fit for automatic control: there is a target output (service-level agreement), observed output (actual latency), and gain controller (adjusting the number of servers). Yet few datacenters are automated this way in practice, due in part to well-founded skepticism about whether the simple models often used in the research literature can capture complex real-life workload/performance relationships and keep up with changing conditions that might invalidate the models. We argue that these shortcomings can be fixed by importing modeling, control, and analysis techniques from statistics and machine learning. In particular, we apply rich statistical models of the application’s performance, simulation-based methods for finding an optimal control policy, and change-point methods to find abrupt changes in performance. Preliminary results running a Web 2.0 benchmark application driven by real workload traces on Amazon’s EC2 cloud show that our method can effectively control the number of servers, even in the face of performance anomalies.

CiteSeerX

Edinburgh Research Explorer

Increasing the effectivity of the antimicrobial surface of carbon quantum dots-based nanocomposite by atmospheric pressure plasma

Author: Bodík Michal
Humpolíček Petr
Kováčová Mária
Mičušík Matej
Šiffalovič Peter
Špitálsky Zdenko
Publication venue: Elsevier GmbH
Publication date: 22/12/2020

Preventing nosocomial infections is one of the most significant challenges in modern medicine. The disinfection of medical facilities and medical devices is crucial in order to prevent the uncontrolled spread of bacteria and viruses. Cost-effective, eco-friendly and fast-acting antibacterial coatings are being developed to prevent the multiplication of bacteria and viruses on various surfaces. One way to create such antimicrobial coatings is to rely on a photoactive material that produces singlet oxygen. However, remote production of the singlet oxygen and disinfection of the desired surface is a time-consuming process. Hence, a coating material that autonomously produces singlet oxygen using ambient light would significantly shorten the disinfection time, leading to an increased number of patients that can be cured in one facility. In this work, an ultra-fast and eco-friendly method for decreasing the disinfection time of the photoactive surface is presented. Atmospheric pressure plasma surface treatment is applied to the hydrophobic carbon quantum dots-polydimethylsiloxane nanocomposite. The plasma-treated samples exhibited improved antibacterial properties compared to non-plasma treated samples, with the best results obtained after only 30 seconds of plasma treatment. The short duration and the scalability potential of the method described here open new possibilities for improving existing antibacterial coatings. © 2020 Elsevier GmbH. Research & Innovation Operational Programme - ERDF; Czech Science Foundation; Grant Agency of the Czech Republic [19-16861S]; project Building-up Centre for Advanced Materials Application of the Slovak Academy of Sciences [313021T081]; [VEGA 2/0051/20]; [APVV-15-0641

Institutional repository of Tomas Bata University Library

Performance Anomaly Detection and Bottleneck Identification

Author: Alpaydin E.
Barham Paul
Berkhin Pavel
Bodík Peter
Brey Jack
Burke Shaun
Chung Hsin
Cohen Ira
Dean Daniel J.
Fodor Imola K.
Frank
Fu Song
Gregg Brendan
Guan Qiang
Gunther Neil J.
Huang Su-Yun
Igor
Jeffrey
John
Kang Hui
Kelly Terence
Kotsiantis S. B.
Lee Han Bok
Lee Wenke
Lilja David J.
Malkowski Simon
McHugh Andrew
Oakland John S.
Panourgias Iakovos
Reiss Charles
Reynolds Douglas
Sambasivan Raja R.
Shallahamer Craig A.
Shende Sameer
Tan Yongmin
Tarby Jean-Claude
Trubin Igor
Wang Chengwei
Wang Haichuan
Wang Tao
Wilder John
Yu Minlan
Zhang Qi
Publication venue: Association for Computing Machinery (ACM)

Crossref

Language Change in Multi-generational Community

Author: Peter Bodík
Publication venue: Erlbaum

Steels [4] claims that both a flux of agents (changing of agents in an experiment) and stochasticity in the communication of agents are necessary for a spontaneous change in language. This paper argues that a flux of agents alone could be responsible for a spontaneous change in language. This hypothesis is demonstrated by modeling language use through language games played in a population of evolving agents.

CiteSeerX

Advanced Tools for Operators of Internet Services

Author: Peter Bodík

Web applications suffer from software and configuration faults that lower their availability. Recovering from failure is dominated by the time interval between when these faults appear and when they are detected and fixed by site operators. We introduce a set of tools that augment the ability of operators to perceive the presence of failure: the first tool uses an automatic anomaly detector to scour HTTP access logs to find changes in user behavior that are indicative of site failures, and a visualizer helps operators rapidly detect and diagnose problems. Visualization addresses a key question in autonomic computing: how to win operators' confidence so that new tools will be embraced. Evaluation performed using HTTP logs from Ebates.com demonstrates that these tools can enhance the detection of failure as well as shorten detection time. Our approach is application-generic and can be applied to any Web application without the need for instrumentation. The other two tools are based on our experience with operators and resolvers at Amazon.com. The first tool lets the operators explore the health of system components and dependencies between them; the other monitors the actions of operators and automatically suggests solution

CiteSeerX

Formation of a Common Spatial Lexicon and its Change in a Community of Moving Agents

Author: Martin Takáč
Peter Bodík

CiteSeerX