10 research outputs found

    Fewer Cores, More Hertz: Leveraging High-Frequency Cores in the OS Scheduler for Improved Application Performance

    In modern server CPUs, individual cores can run at different frequencies, which allows for fine-grained control of the performance/energy tradeoff. Adjusting the frequency, however, incurs a high latency. We find that this can lead to a problem of frequency inversion, whereby the Linux scheduler places a newly active thread on an idle core that takes dozens to hundreds of milliseconds to reach a high frequency, just before another core already running at a high frequency becomes idle. In this paper, we first illustrate the significant performance overhead of repeated frequency inversion through a case study of scheduler behavior during the compilation of the Linux kernel on an 80-core Intel® Xeon-based machine. Following this, we propose two strategies to reduce the likelihood of frequency inversion in the Linux scheduler. When benchmarked over 60 diverse applications on the Intel® Xeon, the better-performing strategy, Smove, improves performance by more than 5% (at most 56%, with no energy overhead) for 23 applications, and worsens performance by more than 5% (at most 8%) for only 3 applications. On a 4-core AMD Ryzen we obtain performance improvements up to 56%.
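
    As a rough illustration of the placement decision at stake, the Python sketch below contrasts a frequency-inversion-prone wakeup policy with one that prefers a core already running at a high frequency. The Core class, the threshold, and the heuristic are hypothetical simplifications for exposition, not the paper's actual Smove implementation.

        HIGH_FREQ_MHZ = 2600  # assumed threshold for "already running fast" (hypothetical)

        class Core:
            def __init__(self, cid, freq_mhz, busy):
                self.cid = cid
                self.freq_mhz = freq_mhz  # current operating frequency
                self.busy = busy          # is a thread currently running here?

        def pick_core_for_wakeup(waker_core, idle_cores):
            """Choose where to place a newly woken thread.

            A frequency-inversion-prone policy always takes an idle core, even one
            stuck at a low frequency that needs tens to hundreds of milliseconds
            to ramp up. This sketch instead reuses the waker's core when it is
            already fast (and likely to become idle soon, as in fork/wait patterns
            during a parallel build), falling back to the fastest idle core.
            """
            if waker_core.freq_mhz >= HIGH_FREQ_MHZ:
                return waker_core
            if idle_cores:
                return max(idle_cores, key=lambda c: c.freq_mhz)
            return waker_core

        # Example: the waker runs at 3.0 GHz, the only idle core idles at 1.2 GHz.
        waker = Core(0, 3000, busy=True)
        idle = [Core(5, 1200, busy=False)]
        print(pick_core_for_wakeup(waker, idle).cid)  # -> 0, avoiding the slow idle core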

    Doctor of Philosophy

    A modern software system is a composition of parts that are themselves highly complex: operating systems, middleware, libraries, servers, and so on. In principle, compositionality of interfaces means that we can understand any given module independently of the internal workings of other parts. In practice, however, abstractions are leaky, and with every generation, modern software systems grow in complexity. Traditional ways of understanding failures, explaining anomalous executions, and analyzing performance are reaching their limits in the face of emergent behavior, unrepeatability, cross-component execution, software aging, and adversarial changes to the system at run time. Deterministic systems analysis has the potential to change the way we analyze and debug software systems. Recorded once, the execution of the system becomes an independent artifact, which can be analyzed offline. The availability of the complete system state, the guaranteed behavior of re-execution, and the absence of limitations on the run-time complexity of analysis collectively enable the deep, iterative, and automatic exploration of the dynamic properties of the system. This work creates a foundation for making deterministic replay a ubiquitous system analysis tool. It defines design and engineering principles for building fast and practical replay machines capable of capturing the complete execution of the entire operating system with an overhead of several percent, on a realistic workload, and with minimal installation costs. To provide an intuitive interface for constructing replay analysis tools, this work implements a powerful virtual machine introspection layer that enables an analysis algorithm to be programmed against the state of the recorded system through familiar terms of source-level variable and type names. To support performance analysis, the replay engine provides a faithful performance model of the original execution during replay.
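
    As one illustration of programming an analysis against a recorded execution through source-level names, the following sketch uses an invented ReplaySession interface; it is not the dissertation's actual introspection API, only a hint of the style of analysis such a layer enables.

        class ReplaySession:
            """Stand-in for a recorded execution that can be re-run and queried offline.

            The events are (instruction_count, variable_name, value) triples; a real
            replay engine would reconstruct them from the recording rather than take
            them as a constructor argument.
            """
            def __init__(self, events):
                self._events = events

            def watch(self, var_name):
                """Yield every recorded value of a source-level variable by name."""
                for tick, name, value in self._events:
                    if name == var_name:
                        yield tick, value

        # Analysis written against source-level names rather than raw memory:
        # flag the first point where a (hypothetical) free-page counter goes negative.
        recording = ReplaySession([
            (100, "nr_free_pages", 4096),
            (250, "nr_free_pages", 12),
            (310, "nr_free_pages", -3),
        ])
        for tick, value in recording.watch("nr_free_pages"):
            if value < 0:
                print(f"anomaly at instruction {tick}: nr_free_pages = {value}")
                break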

    Anomaly Detection and Fault Localization Using Runtime State Models

    Software systems are impacting every aspect of our daily lives, making software failures expensive, even life-endangering. Despite rigorous testing, software bugs inevitably exist, especially in complex systems. Existing tools to aid debugging, such as tracing, profiling, and logging facilities, reveal the behavior of a program's execution; however, they require the developers to manually correlate the data to diagnose faults. This work is the first to introduce the Runtime State Model, a summarization of a program's behavior, for software anomaly detection and fault localization. A Runtime State Model is constructed from variables' value-change events of an execution. It consists of a set of states and state transitions, where a state is a set of variables with their current values, and a state transition is induced by a variable's value change. Comparisons between states from different executions can be conducted to detect software anomalies. Deviations from the healthy states also help explain and locate faults in the source code. To automate this process, we implement Xtract, a facility that automatically extracts runtime traces from the Java Virtual Machine and constructs Runtime State Models for multiple simultaneous Java applications. Our evaluation provides evidence that Runtime State Models might be effective in detecting and locating faults injected into a RUBiS server with Xtract.
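
    The definition above is concrete enough to sketch in code. The following minimal Python rendering (an illustration of the definition, not Xtract itself) builds states and transitions from a stream of value-change events and compares the states of two executions.

        def build_runtime_state_model(events):
            """Build (states, transitions) from (variable_name, new_value) events.

            A state is a frozen snapshot of all variables with their current values;
            each value-change event induces one state transition, as in the abstract.
            """
            current = {}       # variable -> current value
            states = set()
            transitions = set()
            prev = frozenset(current.items())
            states.add(prev)
            for var, value in events:
                current[var] = value                  # the value change
                nxt = frozenset(current.items())
                states.add(nxt)
                transitions.add((prev, var, nxt))     # transition induced by the change
                prev = nxt
            return states, transitions

        # States from a healthy run can be compared against another execution:
        healthy, _ = build_runtime_state_model([("queue_len", 0), ("queue_len", 1)])
        faulty, _ = build_runtime_state_model([("queue_len", 0), ("queue_len", -7)])
        print(faulty - healthy)   # states never observed in the healthy execution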

    Quantitative performance modeling of scientific computations and creating locality in numerical algorithms

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 141-150) and index. By Sivan Avraham Toledo.

    Automated Extraction of Behaviour Model of Applications

    Highly replicated cloud applications are deployed only when they are deemed to be functional. That is, they generally perform their task and their failure rate is relatively low. However, even though failure is rare, it does occur and is very difficult to diagnose. We devise a tool for failure diagnosis which learns the normal behaviour of an application in terms of the statistical properties of variables used throughout its execution, and then monitors it for deviation from these statistical properties. Our study reveals that many variables have unique statistical characteristics that amount to an invariant of the program. Therefore, any significant deviation from these characteristics reflects an abnormal behaviour of the application which may be caused by a program error. It is difficult to derive such an invariant from static analysis of the application's code alone. For example, the name of a person usually does not include a semicolon; however, an intruder may attempt a SQL injection (which will include a semicolon) through the 'name' field while entering his information, and succeed if there is no checking for this case. This scenario can only be captured at runtime and may not be tested by the application developer. The character range of the 'name' variable is one of its statistical properties; by learning this range from the execution of the application it is possible to detect the abnormal input described above. Hence, monitoring the statistics of values taken by the different variables of an application is an effective way to detect anomalies that can help to diagnose the failure of the application. We build a tool that collects frequent snapshots of the application's heap and builds a statistical model solely from the extensional knowledge of the application. The extensional knowledge is obtainable only from runtime data of the application, without any description or explanation of the application's execution flow. The model characterizes the application's normal behaviour. Collecting snapshots in the form of memory dumps and determining the application's behaviour model from them without code instrumentation makes our tool applicable in cases where instrumentation is computationally expensive. Our approach allows a behaviour model to be built automatically and efficiently using the monitoring data alone. We evaluate the utility of our approach by applying it to an e-commerce application and an online bidding system, and then derive different statistical properties of variables from their runtime-exhibited values. Our experimental results demonstrate 96% accuracy in the generated statistical model with at most 1% performance overhead. This accuracy is measured on the basis of generating fewer false-positive alerts when the application is running without any anomaly. The high accuracy and low performance overhead indicate that our tool can successfully determine the application's normal behaviour without affecting the performance of the application and can be used to monitor it in production. Moreover, our tool also correctly detected two anomalous conditions while monitoring the application with a small amount of injected fault. In addition to anomaly detection, our tool logs all the variables of the application that violate the learned model. The log file can help to diagnose any failure caused by these variables and gives our tool source-code granularity in fault localization.
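
    The 'name' example above can be made concrete with a small sketch. The Python below learns the character range a variable exhibits during normal executions and flags values containing characters never seen before; it is a simplified illustration only, since the actual tool builds its model from heap snapshots without instrumenting the application.

        class CharRangeInvariant:
            """Learn which characters a string variable exhibits in normal runs."""
            def __init__(self):
                self.seen = set()

            def learn(self, value):
                self.seen.update(value)         # characters observed in normal traffic

            def check(self, value):
                return set(value) - self.seen   # characters never seen while learning

        invariant = CharRangeInvariant()
        for name in ["Alice", "Bob O'Hara", "Marie-Claire"]:   # normal executions
            invariant.learn(name)

        suspicious = invariant.check("x'; DROP TABLE users; --")
        print(suspicious)   # e.g. {';', 'D', ...} -> deviation from the learned range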

    Performance assertion checking

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1992. Includes bibliographical references (p. 99-103). By Sharon Esther Perl.

    Performance Assertion Checking

    Performance assertion checking is an approach to describing and monitoring the performance of complex software systems. The idea is simple: system implementors write assertions that capture their expectations for performance, the system is instrumented to collect performance data, and then the assertions are checked automatically against the data to detect violations signifying potential performance bugs. Because performance assertions provide a means of filtering data based on expectations, they form a good basis for tools. Data indicating that a system is performing as expected can be discarded automatically, while data indicating potential problems can be brought to the attention of a person
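
    The workflow described above (state an expectation, collect measurements, check them automatically) can be sketched in a few lines of Python. The assertion form and metric names below are invented for illustration and are not the paper's specification language.

        performance_assertions = [
            # (metric name, predicate over observed values, human-readable expectation)
            ("rpc_latency_ms", lambda v: v <= 50,  "each RPC completes within 50 ms"),
            ("cache_hit_rate", lambda v: v >= 0.9, "cache hit rate stays above 90%"),
        ]

        def check_assertions(measurements):
            """measurements: dict mapping metric name -> list of observed values.

            Data that meets expectations is discarded automatically; only
            violations, i.e. potential performance bugs, are surfaced.
            """
            violations = []
            for metric, predicate, expectation in performance_assertions:
                for value in measurements.get(metric, []):
                    if not predicate(value):
                        violations.append((metric, value, expectation))
            return violations

        observed = {"rpc_latency_ms": [12, 48, 230], "cache_hit_rate": [0.97]}
        for metric, value, expectation in check_assertions(observed):
            print(f"potential performance bug: {metric}={value} violates '{expectation}'")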
