412 research outputs found

    Improving quality of service in application clusters

    Get PDF
    Quality of service (QoS) requirements, which include availability, integrity, performance and responsiveness are increasingly needed by science and engineering applications. Rising computational demands and data mining present a new challenge in the IT world. As our needs for more processing, research and analysis increase, performance and reliability degrade exponentially. In this paper we present a software system that manages quality of service for Unix based distributed application clusters. Our approach is synthetic and involves intelligent agents that make use of static and dynamic ontologies to monitor, diagnose and correct faults at run time, over a private network. Finally, we provide experimental results from our pilot implementation in a production environment

    Finding Faulty Functions From the Traces of Field Failures

    Get PDF
    Corrective maintenance, which rectifies field faults, consumes 30-60% time of software maintenance. Literature indicates that 50% to 90% of the field failures are rediscoveries of previous faults, and that 20% of the code is responsible for 80% of the faults. Despite this, identification of the location of the field failures in system code remains challenging and consumes substantial (30-40%) time of corrective maintenance. Prior fault discovery techniques for field traces require many pass-fail traces, discover only crashing failures, or identify faulty coarse grain code such as files as the source of faults. This thesis (which is in the integrated article format) first describes a novel technique (F007) that focuses on identifying finer grain faulty code (faulty functions) from only the failing traces of deployed software. F007 works by training the decision trees on the function-call level failed traces of previous faults of a program. When a new failed trace arrives, F007 then predicts a ranked list of faulty functions based on the probability of fault proneness obtained via the decision trees. Second, this thesis describes a novel strategy, F007-plus, that trains F007 on the failed traces of mutants (artificial faults) and previous faults. F007-plus facilitates F007 in discovering new faulty functions that could not be discovered because they were not faulty in the traces of previously known actual faults. F007 (including F007-plus) was evaluated on the Siemens suite, Space program, four UNIX utilities, and a large commercial application of size approximately 20 millions LOC. F007 (including the use of F007-plus) was able to identify faulty functions in approximately 90% of the failed traces by reviewing approximately less than 10% of the code (i.e., by reviewing only the first few functions in the ranked list). These results, in fact, lead to an emerging theory that a faulty function can be identified by using prior traces of at least one fault in that function. Thus, F007 and F007-plus can correctly identify faulty functions in the failed traces of the majority (80%-90%) of the field failures by using the knowledge of faults in a small percentage (20%) of functions

    Sandboxed, Online Debugging of Production Bugs for SOA Systems

    Get PDF
    Short time-to-bug localization is extremely important for any 24x7 service-oriented application. To this end, we introduce a new debugging paradigm called live debugging. There are two goals that any live debugging infrastructure must meet: Firstly, it must offer real-time insight for bug diagnosis and localization, which is paramount when errors happen in user-facing applications. Secondly, live debugging should not impact user-facing performance for normal events. In large distributed applications, bugs which impact only a small percentage of users are common. In such scenarios, debugging a small part of the application should not impact the entire system. With the above-stated goals in mind, this thesis presents a framework called Parikshan, which leverages user-space containers (OpenVZ) to launch application instances for the express purpose of live debugging. Parikshan is driven by a live-cloning process, which generates a replica (called debug container) of production services, cloned from a production container which continues to provide the real output to the user. The debug container provides a sandbox environment, for safe execution of monitoring/debugging done by the users without any perturbation to the execution environment. As a part of this framework, we have designed customized-network proxies, which replicate inputs from clients to both the production and test-container, as well safely discard all outputs. Together the network duplicator, and the debug container ensure both compute and network isolation of the debugging environment. We believe that this piece of work provides the first of its kind practical real-time debugging of large multi-tier and cloud applications, without requiring any application downtime, and minimal performance impact

    Testing Feedforward Neural Networks Training Programs

    Full text link
    Nowadays, we are witnessing an increasing effort to improve the performance and trustworthiness of Deep Neural Networks (DNNs), with the aim to enable their adoption in safety critical systems such as self-driving cars. Multiple testing techniques are proposed to generate test cases that can expose inconsistencies in the behavior of DNN models. These techniques assume implicitly that the training program is bug-free and appropriately configured. However, satisfying this assumption for a novel problem requires significant engineering work to prepare the data, design the DNN, implement the training program, and tune the hyperparameters in order to produce the model for which current automated test data generators search for corner-case behaviors. All these model training steps can be error-prone. Therefore, it is crucial to detect and correct errors throughout all the engineering steps of DNN-based software systems and not only on the resulting DNN model. In this paper, we gather a catalog of training issues and based on their symptoms and their effects on the behavior of the training program, we propose practical verification routines to detect the aforementioned issues, automatically, by continuously validating that some important properties of the learning dynamics hold during the training. Then, we design, TheDeepChecker, an end-to-end property-based debugging approach for DNN training programs. We assess the effectiveness of TheDeepChecker on synthetic and real-world buggy DL programs and compare it with Amazon SageMaker Debugger (SMD). Results show that TheDeepChecker's on-execution validation of DNN-based program's properties succeeds in revealing several coding bugs and system misconfigurations, early on and at a low cost. Moreover, TheDeepChecker outperforms the SMD's offline rules verification on training logs in terms of detection accuracy and DL bugs coverage

    Evidence-driven testing and debugging of software systems

    Get PDF
    Program debugging is the process of testing, exposing, reproducing, diagnosing and fixing software bugs. Many techniques have been proposed to aid developers during software testing and debugging. However, researchers have found that developers hardly use or adopt the proposed techniques in software practice. Evidently, this is because there is a gap between proposed methods and the state of software practice. Most methods fail to address the actual needs of software developers. In this dissertation, we pose the following scientific question: How can we bridge the gap between software practice and the state-of-the-art automated testing and debugging techniques? To address this challenge, we put forward the following thesis: Software testing and debugging should be driven by empirical evidence collected from software practice. In particular, we posit that the feedback from software practice should shape and guide (the automation) of testing and debugging activities. In this thesis, we focus on gathering evidence from software practice by conducting several empirical studies on software testing and debugging activities in the real-world. We then build tools and methods that are well-grounded and driven by the empirical evidence obtained from these experiments. Firstly, we conduct an empirical study on the state of debugging in practice using a survey and a human study. In this study, we ask developers about their debugging needs and observe the tools and strategies employed by developers while testing, diagnosing and repairing real bugs. Secondly, we evaluate the effectiveness of the state-of-the-art automated fault localization (AFL) methods on real bugs and programs. Thirdly, we conducted an experiment to evaluate the causes of invalid inputs in software practice. Lastly, we study how to learn input distributions from real-world sample inputs, using probabilistic grammars. To bridge the gap between software practice and the state of the art in software testing and debugging, we proffer the following empirical results and techniques: (1) We collect evidence on the state of practice in program debugging and indeed, we found that there is a chasm between (available) debugging tools and developer needs. We elicit the actual needs and concerns of developers when testing and diagnosing real faults and provide a benchmark (called DBGBench) to aid the automated evaluation of debugging and repair tools. (2) We provide empirical evidence on the effectiveness of several state-of-the-art AFL techniques (such as statistical debugging formulas and dynamic slicing). Building on the obtained empirical evidence, we provide a hybrid approach that outperforms the state-of-the-art AFL techniques. (3) We evaluate the prevalence and causes of invalid inputs in software practice, and we build on the lessons learned from this experiment to build a general-purpose algorithm (called ddmax) that automatically diagnoses and repairs real-world invalid inputs. (4) We provide a method to learn the distribution of input elements in software practice using probabilistic grammars and we further employ the learned distribution to drive the test generation of inputs that are similar (or dissimilar) to sample inputs found in the wild. In summary, we propose an evidence-driven approach to software testing and debugging, which is based on collecting empirical evidence from software practice to guide and direct software testing and debugging. In our evaluation, we found that our approach is effective in improving the effectiveness of several debugging activities in practice. In particular, using our evidence-driven approach, we elicit the actual debugging needs of developers, improve the effectiveness of several automated fault localization techniques, effectively debug and repair invalid inputs, and generate test inputs that are (dis)similar to real-world inputs. Our proposed methods are built on empirical evidence and they improve over the state-of-the-art techniques in testing and debugging.Software-Debugging bezeichnet das Testen, Aufspüren, Reproduzieren, Diagnostizieren und das Beheben von Fehlern in Programmen. Es wurden bereits viele Debugging-Techniken vorgestellt, die Softwareentwicklern beim Testen und Debuggen unterstützen. Dennoch hat sich in der Forschung gezeigt, dass Entwickler diese Techniken in der Praxis kaum anwenden oder adaptieren. Das könnte daran liegen, dass es einen großen Abstand zwischen den vorgestellten und in der Praxis tatsächlich genutzten Techniken gibt. Die meisten Techniken genügen den Anforderungen der Entwickler nicht. In dieser Dissertation stellen wir die folgende wissenschaftliche Frage: Wie können wir die Kluft zwischen Software-Praxis und den aktuellen wissenschaftlichen Techniken für automatisiertes Testen und Debugging schließen? Um diese Herausforderung anzugehen, stellen wir die folgende These auf: Das Testen und Debuggen von Software sollte von empirischen Daten, die in der Software-Praxis gesammelt wurden, vorangetrieben werden. Genauer gesagt postulieren wir, dass das Feedback aus der Software-Praxis die Automation des Testens und Debuggens formen und bestimmen sollte. In dieser Arbeit fokussieren wir uns auf das Sammeln von Daten aus der Software-Praxis, indem wir einige empirische Studien über das Testen und Debuggen von Software in der echten Welt durchführen. Auf Basis der gesammelten Daten entwickeln wir dann Werkzeuge, die sich auf die Daten der durchgeführten Experimente stützen. Als erstes führen wir eine empirische Studie über den Stand des Debuggens in der Praxis durch, wobei wir eine Umfrage und eine Humanstudie nutzen. In dieser Studie befragen wir Entwickler zu ihren Bedürfnissen, die sie beim Debuggen haben und beobachten die Werkzeuge und Strategien, die sie beim Diagnostizieren, Testen und Aufspüren echter Fehler einsetzen. Als nächstes bewerten wir die Effektivität der aktuellen Automated Fault Localization (AFL)- Methoden zum automatischen Aufspüren von echten Fehlern in echten Programmen. Unser dritter Schritt ist ein Experiment, um die Ursachen von defekten Eingaben in der Software-Praxis zu ermitteln. Zuletzt erforschen wir, wie Häufigkeitsverteilungen von Teileingaben mithilfe einer Grammatik von echten Beispiel-Eingaben aus der Praxis gelernt werden können. Um die Lücke zwischen Software-Praxis und der aktuellen Forschung über Testen und Debuggen von Software zu schließen, bieten wir die folgenden empirischen Ergebnisse und Techniken: (1) Wir sammeln aktuelle Forschungsergebnisse zum Stand des Software-Debuggens und finden in der Tat eine Diskrepanz zwischen (vorhandenen) Debugging-Werkzeugen und dem, was der Entwickler tatsächlich benötigt. Wir sammeln die tatsächlichen Bedürfnisse von Entwicklern beim Testen und Debuggen von Fehlern aus der echten Welt und entwickeln einen Benchmark (DbgBench), um das automatische Evaluieren von Debugging-Werkzeugen zu erleichtern. (2) Wir stellen empirische Daten zur Effektivität einiger aktueller AFL-Techniken vor (z.B. Statistical Debugging-Formeln und Dynamic Slicing). Auf diese Daten aufbauend, stellen wir einen hybriden Algorithmus vor, der die Leistung der aktuellen AFL-Techniken übertrifft. (3) Wir evaluieren die Häufigkeit und Ursachen von ungültigen Eingaben in der Softwarepraxis und stellen einen auf diesen Daten aufbauenden universell einsetzbaren Algorithmus (ddmax) vor, der automatisch defekte Eingaben diagnostiziert und behebt. (4) Wir stellen eine Methode vor, die Verteilung von Schnipseln von Eingaben in der Software-Praxis zu lernen, indem wir Grammatiken mit Wahrscheinlichkeiten nutzen. Die gelernten Verteilungen benutzen wir dann, um den Beispiel-Eingaben ähnliche (oder verschiedene) Eingaben zu erzeugen. Zusammenfassend stellen wir einen auf der Praxis beruhenden Ansatz zum Testen und Debuggen von Software vor, welcher auf empirischen Daten aus der Software-Praxis basiert, um das Testen und Debuggen zu unterstützen. In unserer Evaluierung haben wir festgestellt, dass unser Ansatz effektiv viele Debugging-Disziplinen in der Praxis verbessert. Genauer gesagt finden wir mit unserem Ansatz die genauen Bedürfnisse von Entwicklern, verbessern die Effektivität vieler AFL-Techniken, debuggen und beheben effektiv fehlerhafte Eingaben und generieren Test-Eingaben, die (un)ähnlich zu Eingaben aus der echten Welt sind. Unsere vorgestellten Methoden basieren auf empirischen Daten und verbessern die aktuellen Techniken des Testens und Debuggens

    RAFT under fire

    Get PDF
    corecore