131 research outputs found

    Low-Impact System Performance Analysis Using Hardware Assisted Tracing Techniques

    Modern applications are becoming hard to diagnose using traditional debugging and profiling tools. For production systems, the first priority is to have minimal disturbance on the target application. To analyze the performance of such systems, tracing tools are imperative, since events can be logged and analyzed post-execution. One of the key requirements for tracing solutions, however, is low overhead. A generic solution is to select only a few events to trace, but at the cost of trace granularity. In this thesis, we present our research work that deals with the lack of high granularity in traces while maintaining a low overhead on target applications. We present new techniques and algorithms that approach the problem initially through software filtering and co-operative tracing, and then explore more advanced hardware tracing mechanisms.
    We have proposed an efficient kernel and userspace conditional tracing approach, with an enhanced natively compiled filtering mechanism. Continuing towards our goal of obtaining a detailed trace of a system, we further discuss how modern processors contain new hardware tracing blocks that have not yet been fully explored and exploited in the tracing domain. We characterize their performance and analyze the trace packets, their relation to software execution, and opportunities to utilize them for a detailed trace. We therefore propose low-overhead hardware-assisted techniques that allow fine-grained, instruction-based interrupt and system call latency detection. We also present a new algorithm that shows how such low-level trace packets, coming directly from the processor, can be effectively utilized to analyze even the processes and resources consumed inside a VM. We have also identified and improved upon issues related to hardware tracing itself, using software assistance from the operating system, thus laying the groundwork for further research in hardware-software co-operative tracing approaches. As our techniques are focused on the requirements of high-speed tracing in embedded or production systems catering to high-frequency transactions, we have found that our advancements in the hardware-software domain have proved invaluable in detecting resource contention and latency in systems.
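    As a rough illustration of the conditional tracing idea, the C sketch below records an event only when a natively compiled filter predicate accepts it; the event layout, threshold, and ring buffer are assumptions for the example, not the thesis's implementation.

```c
/* A minimal sketch of conditional tracing with a natively compiled
 * filter predicate. The event layout, latency threshold, and ring
 * buffer are hypothetical, standing in for the thesis's mechanism. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct event {            /* hypothetical trace event payload */
    uint64_t timestamp_ns;
    uint32_t syscall_nr;
    uint64_t duration_ns;
};

/* The filter is ordinary compiled C rather than an interpreted
 * expression, so evaluating it costs only a few instructions. */
static bool filter(const struct event *e)
{
    return e->duration_ns > 100000;   /* record only slow syscalls */
}

static struct event ring[1024];       /* toy ring buffer */
static size_t head;

static void trace_syscall(uint64_t ts, uint32_t nr, uint64_t dur)
{
    struct event e = { ts, nr, dur };
    if (!filter(&e))                  /* discarded events cost almost nothing */
        return;
    ring[head++ % 1024] = e;
}

int main(void)
{
    trace_syscall(1000, 1, 50000);    /* filtered out */
    trace_syscall(2000, 0, 250000);   /* recorded */
    printf("%zu event(s) recorded\n", head);
    return 0;
}
```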

    Rethinking Software Network Data Planes in the Era of Microservices

    The abstract is in the attachment.

    Diagnosing performance variations by comparing multi-level execution traces

    Tracing allows the analysis of task interactions with each other and with the operating system. Locating performance problems in a trace is not trivial because of its large size. Furthermore, deep knowledge of all components of the observed system is required to decide whether the observed behavior is normal. We introduce TraceCompare, a framework that automatically identifies differences between groups of executions of the same task at the user space and kernel levels. Many performance problems manifest themselves as variations that are easily identified by our framework. Our comparison algorithm takes into account all threads that affect the completion time of the analyzed executions. Differences are correlated with application code to facilitate the correction of identified problems. Performance characteristics of task executions are represented by a new data structure called the enhanced calling context tree (ECCT). We demonstrate the efficiency of our approach by presenting four case studies in which TraceCompare was used to uncover serious performance problems in enterprise and open source applications, without any prior knowledge of their codebase. We also show that the overhead of our tracing solution is between 0.2 and 9 percent, depending on the type of application.
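    To make the ECCT idea concrete, here is a minimal C sketch of what such a node might look like; the fields and the fixed fan-out are simplifying assumptions, not the structure defined in the paper.

```c
/* A minimal sketch of a calling-context-tree node enhanced with
 * aggregated timing, in the spirit of the ECCT described above.
 * Fields and fan-out limit are assumptions for illustration. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 8

struct ecct_node {
    const char *frame;                 /* function or kernel symbol */
    uint64_t total_ns;                 /* time attributed to this context */
    uint64_t count;                    /* number of executions observed */
    struct ecct_node *child[MAX_CHILDREN];
    size_t nchild;
};

/* Find or create the child representing `frame`, so repeated
 * executions of the same call path share a single node. */
static struct ecct_node *ecct_child(struct ecct_node *n, const char *frame)
{
    for (size_t i = 0; i < n->nchild; i++)
        if (strcmp(n->child[i]->frame, frame) == 0)
            return n->child[i];
    struct ecct_node *c = calloc(1, sizeof *c);
    c->frame = frame;
    if (n->nchild < MAX_CHILDREN)
        n->child[n->nchild++] = c;
    return c;
}

int main(void)
{
    struct ecct_node root = { .frame = "root" };
    struct ecct_node *n = ecct_child(&root, "handle_request");
    n->total_ns += 420000;
    n->count++;
    printf("%s: %lu ns over %lu call(s)\n", n->frame,
           (unsigned long)n->total_ns, (unsigned long)n->count);
    return 0;
}
```

    Comparing two groups of executions then reduces to walking two such trees in lockstep and reporting contexts whose aggregated times diverge.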

    Survey and analysis of kernel and userspace tracers on Linux : design, implementation, and overhead

    As applications and operating systems become more complex, the last decade has seen the rise of many tracing tools all across the software stack. This article presents a hands-on comparison of modern tracers on Linux systems, both in user space and kernel space. The authors implement microbenchmarks that not only quantify the overhead of different tracers, but also sample fine-grained metrics that unveil insights into the tracers' internals and show the cause of each tracer's overhead. Internal design choices and implementation particularities are discussed, which helps us understand the challenges of developing tracers. Furthermore, this analysis aims to help users choose and configure their tracers based on their specific requirements, to reduce the overhead and get the most out of them.
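    The general shape of such an overhead microbenchmark can be sketched as follows; the record_event() stand-in is hypothetical, and a real measurement would invoke an actual tracepoint (for example an LTTng tracepoint) in its place.

```c
/* A minimal sketch of a tracer-overhead microbenchmark: time a hot
 * loop with and without an event-recording call and report the
 * per-event difference. record_event() is a stand-in, not a tracer. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define N 1000000

static volatile uint64_t sink;                      /* defeats optimization */

static void record_event(uint64_t v) { sink = v; }  /* stand-in tracepoint */

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main(void)
{
    uint64_t t0 = now_ns();
    for (uint64_t i = 0; i < N; i++) sink = i;          /* baseline loop */
    uint64_t t1 = now_ns();
    for (uint64_t i = 0; i < N; i++) record_event(i);   /* traced loop */
    uint64_t t2 = now_ns();
    printf("per-event overhead: %.1f ns\n",
           ((double)(t2 - t1) - (double)(t1 - t0)) / N);
    return 0;
}
```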

    Towards A Verified Complex Protocol Stack in a Production Kernel: Methodology and Demonstration

    Any useful computer system performs communication, and any communication must be parsed before it is computed upon. Given their importance, one might expect parsers to receive a significant share of attention from the security community. This is, however, not the case: bugs in parsers continue to account for a surprising portion of reported and exploited vulnerabilities. In this thesis, I propose a methodology for supporting the development of software that depends on parsers---such as anything connected to the Internet---to safely support any reasonably designed protocol: data structures to describe protocol messages; validation routines that check that data received from the wire conforms to the rules of the protocol; systems that allow a defender to inject arbitrary, crafted input so as to explore the effectiveness of the parser; and systems that allow for the observation of the parser code while it is being explored. Then, I describe a principled method of producing parsers that automatically generates the myriad parser-related software from a description of the protocol. This has many significant benefits: it makes implementing parsers simpler, easier, and faster; it reduces the trusted computing base to the description of the protocol and the program that compiles the description to runnable code; and it allows for easier formal verification of the generated code. I demonstrate the merits of the proposed methodology by creating a description of the USB protocol using a domain-specific language (DSL) embedded in Haskell and integrating it with the FreeBSD operating system. Using the industry-standard umap test-suite, I measure the performance and efficacy of the generated parser. I show that it is stable, that it is effective at protecting a system from both accidentally and maliciously malformed input, and that it does not incur unreasonable overhead.
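    As an illustration of what a generated validation routine does, the sketch below parses a toy two-field message and rejects anything that violates the (hypothetical) protocol's rules before any field is trusted; it is not output of the thesis's USB DSL.

```c
/* A minimal sketch of the kind of validation routine such a
 * methodology generates. The two-field message format is
 * hypothetical; the point is that every field is checked against
 * the wire before it is trusted. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct msg {
    uint8_t type;            /* must be 1 or 2 in this toy protocol */
    uint8_t len;             /* length of payload that follows */
    const uint8_t *payload;
};

/* Returns true only if `buf` is a well-formed message; never reads
 * past `buflen`, which is what makes generated parsers safe. */
static bool parse_msg(const uint8_t *buf, size_t buflen, struct msg *out)
{
    if (buflen < 2) return false;                   /* header must fit */
    if (buf[0] != 1 && buf[0] != 2) return false;   /* rule: known type */
    if ((size_t)buf[1] + 2 > buflen) return false;  /* declared length fits */
    out->type = buf[0];
    out->len = buf[1];
    out->payload = buf + 2;
    return true;
}

int main(void)
{
    const uint8_t wire[] = { 1, 3, 'a', 'b', 'c' };
    struct msg m;
    printf("valid: %d\n", parse_msg(wire, sizeof wire, &m));
    return 0;
}
```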

    Consistent and efficient output-streams management in optimistic simulation platforms

    Optimistic synchronization is considered an effective means for supporting Parallel Discrete Event Simulations. It relies on a speculative approach, where concurrent processes execute simulation events regardless of their safety, and consistency is ensured via proper rollback mechanisms upon the a-posteriori detection of causal inconsistencies along the events' execution path. Interactions with the outside world (e.g. generation of output streams) are a well-known problem for rollback-based systems, since the outside world may have no notion of rollback. In this context, approaches for allowing the simulation modeler to generate consistent output rely either on ad-hoc APIs (which must be provided by the underlying simulation kernel) or on temporary suspension of processing activities in order to wait for the final outcome (commit/rollback) associated with a speculatively-produced output. In this paper we present design indications and a reference implementation for an output-stream management subsystem which allows the simulation-model writer to rely on standard output-generation libraries (e.g. stdio) within code blocks associated with event processing. Further, the subsystem ensures that the produced output is consistent, namely associated with events that are eventually committed, and ordered system-wide along the simulation time axis. These features jointly provide the illusion of a classical, simple-to-deal-with sequential programming model, which spares the developer from being aware that the simulation program is run concurrently and speculatively. We also show, via an experimental study, how the design and development optimizations we present lead to limited overhead, so that the simulation run is carried out with near-zero or substantially reduced output management cost. At the same time, the delay for materializing the output stream (making it available for any type of audit activity) is shown to be fairly limited and constant, especially for good mixtures of I/O-bound vs CPU-bound behaviors at the application level. Further, the whole output-stream management subsystem has been designed to provide scalability for I/O management on clusters. © 2013 ACM
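    The commit-time output discipline can be sketched in a few lines of C; the function names here (spec_printf, on_commit, on_rollback) are hypothetical placeholders, not the paper's API.

```c
/* A minimal sketch of speculative output buffering: output produced
 * while an event executes is held in a buffer, flushed only if the
 * event commits, and discarded if it rolls back. */
#include <stdio.h>
#include <string.h>

static char buf[4096];
static size_t used;

/* Called from event handlers in place of printf(). */
static void spec_printf(const char *s)
{
    size_t n = strlen(s);
    if (used + n < sizeof buf) {
        memcpy(buf + used, s, n);
        used += n;
    }
}

static void on_commit(void)   { fwrite(buf, 1, used, stdout); used = 0; }
static void on_rollback(void) { used = 0; }   /* output never escapes */

int main(void)
{
    spec_printf("event 1 output\n");
    on_commit();                    /* safe: event was committed */
    spec_printf("event 2 output\n");
    on_rollback();                  /* causal inconsistency detected */
    return 0;
}
```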

    Using VProbes for intrusion detection

    Thesis: M.Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 89-90). Many current intrusion detection systems (IDSes) are vulnerable to intruders because they run under the same operating system (OS) as a potential attacker. Since an attacker will often be attempting to co-opt the OS, this leaves the IDS vulnerable to subversion by the attacker. While some systems escape this threat, they typically do so by running the OS inside a modified hypervisor. This risks adding new bugs that reduce the correctness or security of the hypervisor, and may make it harder to incorporate upstream improvements. VMware has a technology called VProbes that allows setting breakpoints, examining machine state, and inspecting memory from a VM host. This thesis introduces VProbe Instrumentation for VM Intrusion Detection (VIVID), which makes subverting the instrumentation much harder while still allowing the use of an off-the-shelf hypervisor. By Alexander Worthington Dehnert, M.Eng.

    Flexible Monitoring of Storage I/O

    For any computer system, monitoring its performance is vital to understanding and fixing problems and performance bottlenecks. In this work we present the architecture and implementation of a system for monitoring storage devices that serve virtual machines. In contrast to existing approaches, our system is more flexible because it employs a query language that can capture both specific and detailed information on I/O transfers. Our monitoring solution therefore provides users with enough statistics to find and solve problems, without overwhelming them with too much information. Our system monitors I/O activity in virtual machines and supports basic distributed query processing. Experiments show the performance overhead of the prototype implementation to be acceptable in many realistic settings.
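    To illustrate the query-driven approach, the sketch below evaluates one hypothetical query, "mean latency of reads larger than 64 KiB", over a few in-memory I/O events; the event fields and the query itself are assumptions, not the paper's query language.

```c
/* A minimal sketch of query-driven I/O monitoring: a filter plus an
 * aggregation over a stream of I/O events, standing in for what a
 * compiled query might do. Event layout is hypothetical. */
#include <stdint.h>
#include <stdio.h>

struct io_event {
    uint8_t  is_write;     /* 0 = read, 1 = write */
    uint32_t bytes;        /* transfer size */
    uint64_t latency_us;   /* completion latency */
};

int main(void)
{
    const struct io_event events[] = {
        { 0, 128 * 1024,  900 },
        { 1,   4 * 1024,  150 },
        { 0, 256 * 1024, 1200 },
    };
    uint64_t sum = 0, n = 0;
    for (size_t i = 0; i < sizeof events / sizeof events[0]; i++) {
        const struct io_event *e = &events[i];
        if (!e->is_write && e->bytes > 64 * 1024) {  /* the "query" */
            sum += e->latency_us;
            n++;
        }
    }
    if (n)
        printf("mean read latency: %lu us over %lu ops\n",
               (unsigned long)(sum / n), (unsigned long)n);
    return 0;
}
```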

    Software Tracing Comparison Using Data Mining Techniques

    Performance has become a crucial matter in software development, testing and maintenance. To address this concern, developers and testers use several tools to improve performance or track performance-related bugs. The use of comparative methodologies such as Flame Graphs provides a formal way to verify the causes of regressions and performance issues. The comparison tool provides information for analysis that can be used with a deep profiling mechanism, usually comparing normal with abnormal profiling data.
    Tracing, on the other hand, is a popular mechanism for recording events in the system while keeping the overhead associated with its use low. The recorded information can be used to supply developers with data for performance analysis. However, the amount of data provided, and the knowledge required to understand it, may present a challenge for current analysis methods and tools. Combining both methodologies, a comparative profiling mechanism and a low-overhead trace system, enables easier evaluation of issues and their underlying causes, while also meeting stringent performance requirements. The next step is to use this data to develop methods for root cause analysis and bottleneck identification. The objective of this research project is to automate the process of trace analysis and automatically identify differences among groups of executions. The presented solution highlights differences between the groups and presents a possible cause for each difference; the user can then build on this finding to improve the executions. We present a series of automated techniques that can be used to find the root causes of performance variations while requiring little or no human intervention. The main approach is capable of identifying the cause of a performance difference using a comparative grouping methodology on the executions, and was applied to real use cases. The proposed solution was implemented in an analysis framework to help developers with similar problems, together with a differential flame graph tool. To our knowledge, this is the first attempt to correlate automatic grouping mechanisms with root cause analysis using tracing data. In this project, most of the data used for evaluations and experiments came from the Linux operating system and was collected using the Linux Trace Toolkit Next Generation (LTTng), a very flexible tool with low overhead.
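    As a simplified illustration of the comparative grouping idea (not the paper's algorithm), the sketch below summarizes two groups of executions by per-metric means and flags the metric with the largest relative difference as the candidate root cause; the metrics and data are illustrative.

```c
/* A minimal sketch of comparing two groups of executions: each group
 * is summarized by a per-metric mean, and the metric with the largest
 * relative growth is reported as the candidate root cause. */
#include <stdio.h>

#define NMETRICS 3

static const char *metric[NMETRICS] = { "cpu_ns", "disk_ns", "net_ns" };

static void compare(const double fast[NMETRICS], const double slow[NMETRICS])
{
    int worst = 0;
    double worst_ratio = 0;
    for (int i = 0; i < NMETRICS; i++) {
        double ratio = slow[i] / (fast[i] > 0 ? fast[i] : 1);
        if (ratio > worst_ratio) { worst_ratio = ratio; worst = i; }
    }
    printf("largest difference: %s (%.1fx)\n", metric[worst], worst_ratio);
}

int main(void)
{
    double fast[NMETRICS] = { 1e6,   2e5, 3e5 };  /* mean per group */
    double slow[NMETRICS] = { 1.1e6, 4e6, 3e5 };  /* disk time blew up */
    compare(fast, slow);
    return 0;
}
```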