1,048 research outputs found

    Isolation of malicious external inputs in a security focused adaptive execution environment

    Get PDF
    pre-printReliable isolation of malicious application inputs is necessary for preventing the future success of an observed novel attack after the initial incident. In this paper we describe, measure and analyze, Input-Reduction, a technique that can quickly isolate malicious external inputs that embody unforeseen and potentially novel attacks, from other benign application inputs. The Input-Reduction technique is integrated into an advanced, security-focused, and adaptive execution environment that automates diagnosis and repair. In experiments we show that Input-Reduction is highly accurate and efficient in isolating attack inputs and determining casual relations between inputs. We also measure and show that the cost incurred by key services that support reliable reproduction and fast attack isolation is reasonable in the adaptive execution environment

    Applying and Extending the Delta Debugging Algorithm for Elevator Dispatching Algorithms (Experience Paper)

    Full text link
    Elevator systems are one kind of Cyber-Physical Systems (CPSs), and as such, test cases are usually complex and long in time. This is mainly because realistic test scenarios are employed (e.g., for testing elevator dispatching algorithms, typically a full day of passengers traveling through a system of elevators is used). However, in such a context, when needing to reproduce a failure, it is of high benefit to provide the minimal test input to the software developers. This way, analyzing and trying to localize the root-cause of the failure is easier and more agile. Delta debugging has been found to be an efficient technique to reduce failure-inducing test inputs. In this paper, we enhance this technique by first monitoring the environment at which the CPS operates as well as its physical states. With the monitored information, we search for stable states of the CPS during the execution of the simulation. In a second step, we use such identified stable states to help the delta debugging algorithm isolate the failure-inducing test inputs more efficiently. We report our experience of applying our approach into an industrial elevator dispatching algorithm. An empirical evaluation carried out with real operational data from a real installation of elevators suggests that the proposed environment-wise delta debugging algorithm is between 1.3 to 1.8 times faster than the traditional delta debugging, while producing a larger reduction in the failure-inducing test inputs. The results provided by the different implemented delta debugging algorithm versions are qualitatively assessed with domain experts. This assessment provides new insights and lessons learned, such as, potential applications of the delta debugging algorithm beyond debugging

    Enabling Program Analysis Through Deterministic Replay and Optimistic Hybrid Analysis

    Full text link
    As software continues to evolve, software systems increase in complexity. With software systems composed of many distinct but interacting components, today’s system programmers, users, and administrators find themselves requiring automated ways to find, understand, and handle system mis-behavior. Recent information breaches such as the Equifax breach of 2017, and the Heartbleed vulnerability of 2014 show the need to understand and debug prior states of computer systems. In this thesis I focus on enabling practical entire-system retroactive analysis, allowing programmers, users, and system administrators to diagnose and understand the impact of these devastating mishaps. I focus primarly on two techniques. First, I discuss a novel deterministic record and replay system which enables fast, practical recollection of entire systems of computer state. Second, I discuss optimistic hybrid analysis, a novel optimization method capable of dramatically accelerating retroactive program analysis. Record and replay systems greatly aid in solving a variety of problems, such as fault tolerance, forensic analysis, and information providence. These solutions, however, assume ubiquitous recording of any application which may have a problem. Current record and replay systems are forced to trade-off between disk space and replay speed. This trade-off has historically made it impractical to both record and replay large histories of system level computation. I present Arnold, a novel record and replay system which efficiently records years of computation on a commodity hard-drive, and can efficiently replay any recorded information. Arnold combines caching with a unique process-group granularity of recording to produce both small, and quickly recalled recordings. My experiments show that under a desktop workload, Arnold could store 4 years of computation on a commodity 4TB hard drive. Dynamic analysis is used to retroactively identify and address many forms of system mis-behaviors including: programming errors, data-races, private information leakage, and memory errors. Unfortunately, the runtime overhead of dynamic analysis has precluded its adoption in many instances. I present a new dynamic analysis methodology called optimistic hybrid analysis (OHA). OHA uses knowledge of the past to predict program behaviors in the future. These predictions, or likely invariants are speculatively assumed true by a static analysis. This creates a static analysis which can be far more accurate than its traditional counterpart. Once this predicated static analysis is created, it is speculatively used to optimize a final dynamic analysis, creating a far more efficient dynamic analysis than otherwise possible. I demonstrate the effectiveness of OHA by creating an optimistic hybrid backward slicer, OptSlice, and optimistic data-race detector OptFT. OptSlice and OptFT are just as accurate as their traditional hybrid counterparts, but run on average 8.3x and 1.6x faster respectively. In this thesis I demonstrate that Arnold’s ability to record and replay entire computer systems, combined with optimistic hybrid analysis’s ability to quickly analyze prior computation, enable a practical and useful entire system retroactive analysis that has been previously unrealized.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/144052/1/ddevec_1.pd

    Locating Faults with Program Slicing: An Empirical Analysis

    Get PDF
    Statistical fault localization is an easily deployed technique for quickly determining candidates for faulty code locations. If a human programmer has to search the fault beyond the top candidate locations, though, more traditional techniques of following dependencies along dynamic slices may be better suited. In a large study of 457 bugs (369 single faults and 88 multiple faults) in 46 open source C programs, we compare the effectiveness of statistical fault localization against dynamic slicing. For single faults, we find that dynamic slicing was eight percentage points more effective than the best performing statistical debugging formula; for 66% of the bugs, dynamic slicing finds the fault earlier than the best performing statistical debugging formula. In our evaluation, dynamic slicing is more effective for programs with single fault, but statistical debugging performs better on multiple faults. Best results, however, are obtained by a hybrid approach: If programmers first examine at most the top five most suspicious locations from statistical debugging, and then switch to dynamic slices, on average, they will need to examine 15% (30 lines) of the code. These findings hold for 18 most effective statistical debugging formulas and our results are independent of the number of faults (i.e. single or multiple faults) and error type (i.e. artificial or real errors)

    Evidence-driven testing and debugging of software systems

    Get PDF
    Program debugging is the process of testing, exposing, reproducing, diagnosing and fixing software bugs. Many techniques have been proposed to aid developers during software testing and debugging. However, researchers have found that developers hardly use or adopt the proposed techniques in software practice. Evidently, this is because there is a gap between proposed methods and the state of software practice. Most methods fail to address the actual needs of software developers. In this dissertation, we pose the following scientific question: How can we bridge the gap between software practice and the state-of-the-art automated testing and debugging techniques? To address this challenge, we put forward the following thesis: Software testing and debugging should be driven by empirical evidence collected from software practice. In particular, we posit that the feedback from software practice should shape and guide (the automation) of testing and debugging activities. In this thesis, we focus on gathering evidence from software practice by conducting several empirical studies on software testing and debugging activities in the real-world. We then build tools and methods that are well-grounded and driven by the empirical evidence obtained from these experiments. Firstly, we conduct an empirical study on the state of debugging in practice using a survey and a human study. In this study, we ask developers about their debugging needs and observe the tools and strategies employed by developers while testing, diagnosing and repairing real bugs. Secondly, we evaluate the effectiveness of the state-of-the-art automated fault localization (AFL) methods on real bugs and programs. Thirdly, we conducted an experiment to evaluate the causes of invalid inputs in software practice. Lastly, we study how to learn input distributions from real-world sample inputs, using probabilistic grammars. To bridge the gap between software practice and the state of the art in software testing and debugging, we proffer the following empirical results and techniques: (1) We collect evidence on the state of practice in program debugging and indeed, we found that there is a chasm between (available) debugging tools and developer needs. We elicit the actual needs and concerns of developers when testing and diagnosing real faults and provide a benchmark (called DBGBench) to aid the automated evaluation of debugging and repair tools. (2) We provide empirical evidence on the effectiveness of several state-of-the-art AFL techniques (such as statistical debugging formulas and dynamic slicing). Building on the obtained empirical evidence, we provide a hybrid approach that outperforms the state-of-the-art AFL techniques. (3) We evaluate the prevalence and causes of invalid inputs in software practice, and we build on the lessons learned from this experiment to build a general-purpose algorithm (called ddmax) that automatically diagnoses and repairs real-world invalid inputs. (4) We provide a method to learn the distribution of input elements in software practice using probabilistic grammars and we further employ the learned distribution to drive the test generation of inputs that are similar (or dissimilar) to sample inputs found in the wild. In summary, we propose an evidence-driven approach to software testing and debugging, which is based on collecting empirical evidence from software practice to guide and direct software testing and debugging. In our evaluation, we found that our approach is effective in improving the effectiveness of several debugging activities in practice. In particular, using our evidence-driven approach, we elicit the actual debugging needs of developers, improve the effectiveness of several automated fault localization techniques, effectively debug and repair invalid inputs, and generate test inputs that are (dis)similar to real-world inputs. Our proposed methods are built on empirical evidence and they improve over the state-of-the-art techniques in testing and debugging.Software-Debugging bezeichnet das Testen, Aufspüren, Reproduzieren, Diagnostizieren und das Beheben von Fehlern in Programmen. Es wurden bereits viele Debugging-Techniken vorgestellt, die Softwareentwicklern beim Testen und Debuggen unterstützen. Dennoch hat sich in der Forschung gezeigt, dass Entwickler diese Techniken in der Praxis kaum anwenden oder adaptieren. Das könnte daran liegen, dass es einen großen Abstand zwischen den vorgestellten und in der Praxis tatsächlich genutzten Techniken gibt. Die meisten Techniken genügen den Anforderungen der Entwickler nicht. In dieser Dissertation stellen wir die folgende wissenschaftliche Frage: Wie können wir die Kluft zwischen Software-Praxis und den aktuellen wissenschaftlichen Techniken für automatisiertes Testen und Debugging schließen? Um diese Herausforderung anzugehen, stellen wir die folgende These auf: Das Testen und Debuggen von Software sollte von empirischen Daten, die in der Software-Praxis gesammelt wurden, vorangetrieben werden. Genauer gesagt postulieren wir, dass das Feedback aus der Software-Praxis die Automation des Testens und Debuggens formen und bestimmen sollte. In dieser Arbeit fokussieren wir uns auf das Sammeln von Daten aus der Software-Praxis, indem wir einige empirische Studien über das Testen und Debuggen von Software in der echten Welt durchführen. Auf Basis der gesammelten Daten entwickeln wir dann Werkzeuge, die sich auf die Daten der durchgeführten Experimente stützen. Als erstes führen wir eine empirische Studie über den Stand des Debuggens in der Praxis durch, wobei wir eine Umfrage und eine Humanstudie nutzen. In dieser Studie befragen wir Entwickler zu ihren Bedürfnissen, die sie beim Debuggen haben und beobachten die Werkzeuge und Strategien, die sie beim Diagnostizieren, Testen und Aufspüren echter Fehler einsetzen. Als nächstes bewerten wir die Effektivität der aktuellen Automated Fault Localization (AFL)- Methoden zum automatischen Aufspüren von echten Fehlern in echten Programmen. Unser dritter Schritt ist ein Experiment, um die Ursachen von defekten Eingaben in der Software-Praxis zu ermitteln. Zuletzt erforschen wir, wie Häufigkeitsverteilungen von Teileingaben mithilfe einer Grammatik von echten Beispiel-Eingaben aus der Praxis gelernt werden können. Um die Lücke zwischen Software-Praxis und der aktuellen Forschung über Testen und Debuggen von Software zu schließen, bieten wir die folgenden empirischen Ergebnisse und Techniken: (1) Wir sammeln aktuelle Forschungsergebnisse zum Stand des Software-Debuggens und finden in der Tat eine Diskrepanz zwischen (vorhandenen) Debugging-Werkzeugen und dem, was der Entwickler tatsächlich benötigt. Wir sammeln die tatsächlichen Bedürfnisse von Entwicklern beim Testen und Debuggen von Fehlern aus der echten Welt und entwickeln einen Benchmark (DbgBench), um das automatische Evaluieren von Debugging-Werkzeugen zu erleichtern. (2) Wir stellen empirische Daten zur Effektivität einiger aktueller AFL-Techniken vor (z.B. Statistical Debugging-Formeln und Dynamic Slicing). Auf diese Daten aufbauend, stellen wir einen hybriden Algorithmus vor, der die Leistung der aktuellen AFL-Techniken übertrifft. (3) Wir evaluieren die Häufigkeit und Ursachen von ungültigen Eingaben in der Softwarepraxis und stellen einen auf diesen Daten aufbauenden universell einsetzbaren Algorithmus (ddmax) vor, der automatisch defekte Eingaben diagnostiziert und behebt. (4) Wir stellen eine Methode vor, die Verteilung von Schnipseln von Eingaben in der Software-Praxis zu lernen, indem wir Grammatiken mit Wahrscheinlichkeiten nutzen. Die gelernten Verteilungen benutzen wir dann, um den Beispiel-Eingaben ähnliche (oder verschiedene) Eingaben zu erzeugen. Zusammenfassend stellen wir einen auf der Praxis beruhenden Ansatz zum Testen und Debuggen von Software vor, welcher auf empirischen Daten aus der Software-Praxis basiert, um das Testen und Debuggen zu unterstützen. In unserer Evaluierung haben wir festgestellt, dass unser Ansatz effektiv viele Debugging-Disziplinen in der Praxis verbessert. Genauer gesagt finden wir mit unserem Ansatz die genauen Bedürfnisse von Entwicklern, verbessern die Effektivität vieler AFL-Techniken, debuggen und beheben effektiv fehlerhafte Eingaben und generieren Test-Eingaben, die (un)ähnlich zu Eingaben aus der echten Welt sind. Unsere vorgestellten Methoden basieren auf empirischen Daten und verbessern die aktuellen Techniken des Testens und Debuggens

    How does machine learning change software development practices?

    Get PDF

    Using App Inventor to Explore Low-Achieving Students\u27 Understanding of Fractions

    Get PDF
    A student\u27s understanding of fraction magnitude impacts his/her understanding of algebra (e.g., Booth & Newton, 2012; Siegler et al., 2012), which then influences his/her likelihood of graduating high school (Orihuela, 2006) or succeeding in higher education (Adelman & United States., 2006; Trusty & Niles, 2004). Literature suggests that students gain this understanding when they create and work with various representations of fractions (e.g., Ainsworth, Bibby, & Wood, 2002; Panaoura et al., 2009; Siegler, Fazio, Bailey, & Zhou, 2013), which can occur when students engage in constructivist activities such as developing games (Kafai, 1996, Apr). This study examines an intervention where low-achieving eighth-grade students develop games about fraction magnitude using App Inventor, a novice programming environment, to determine what representations students create in their games, how their understanding of fraction magnitude develops when making their games, and what challenges they experience other than challenges concerning fractions. It uses a holistic case study with embedded units to understand the major themes for each research question while considering the influences of individual backgrounds and the various kinds of games each developed. Kolb\u27s (1984) experiential learning theory, which states that ideas are formed by experiences and which occurs when one programs or codes a computer (Robins, Rountree, & Rountree, 2003), grounds the data analysis. The findings of this study indicate that students primarily use numeric representations and area models to represent fraction magnitude, which are also the most common representations found in textbooks (Zhang, 2012). They developed their understanding by working with area models, talking about area models, or by developing code to compare two fractions. The way they constructed and critiqued these representations map to the experiential learning cycle, showing that they engaged in concrete experiences with fractions, reflected on the experience, conceptualized their new learning, and experimented with that learning to develop their understanding of fraction magnitude. The challenges they experienced ranged from coding difficulties, such as decomposing their designs into components to code, to non-coding challenges, such as collaborating. Limitations of this study are discussed and implications for practice and future research are delineated
    corecore