47 research outputs found

    Locating Faults with Program Slicing: An Empirical Analysis

    Statistical fault localization is an easily deployed technique for quickly determining candidates for faulty code locations. If a human programmer has to search for the fault beyond the top candidate locations, though, more traditional techniques of following dependencies along dynamic slices may be better suited. In a large study of 457 bugs (369 single faults and 88 multiple faults) in 46 open-source C programs, we compare the effectiveness of statistical fault localization against dynamic slicing. For single faults, we find that dynamic slicing was eight percentage points more effective than the best-performing statistical debugging formula; for 66% of the bugs, dynamic slicing finds the fault earlier than the best-performing statistical debugging formula. In our evaluation, dynamic slicing is more effective for programs with a single fault, but statistical debugging performs better on multiple faults. Best results, however, are obtained by a hybrid approach: if programmers first examine at most the top five most suspicious locations from statistical debugging, and then switch to dynamic slices, on average, they will need to examine 15% (30 lines) of the code. These findings hold for the 18 most effective statistical debugging formulas, and our results are independent of the number of faults (i.e., single or multiple faults) and error type (i.e., artificial or real errors).
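
The hybrid strategy described in this abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not name the formulas it evaluates, so the well-known Ochiai formula stands in for "a statistical debugging formula", the function names are hypothetical, and the dynamic slice is taken as a precomputed list of lines rather than computed from the program.

```python
import math

def ochiai(ef, ep, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep)).

    ef / ep: failing / passing tests that execute the line.
    """
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0

def statistical_ranking(coverage, outcomes):
    """Rank lines from most to least suspicious.

    coverage: test name -> set of executed line numbers
    outcomes: test name -> True if the test passed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    lines = set().union(*coverage.values())
    scores = {}
    for line in lines:
        ef = sum(1 for t, cov in coverage.items() if line in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if line in cov and outcomes[t])
        scores[line] = ochiai(ef, ep, total_failed)
    # Ties are broken by line number so the ranking is deterministic.
    return sorted(lines, key=lambda l: (-scores[l], l))

def hybrid_order(ranking, slice_order, top_n=5):
    """Inspect the top_n statistical candidates, then follow the slice.

    slice_order is assumed to list the lines of a dynamic slice in the
    order a programmer would follow dependencies back from the failure.
    """
    head = ranking[:top_n]
    seen = set(head)
    return head + [line for line in slice_order if line not in seen]

# Toy data (hypothetical): three tests, one of which fails.
coverage = {"t1": {1, 2, 3}, "t2": {1, 2, 4}, "t3": {1, 3, 5}}
outcomes = {"t1": True, "t2": True, "t3": False}
ranking = statistical_ranking(coverage, outcomes)
inspection = hybrid_order(ranking, slice_order=[3, 1], top_n=1)
```

With `top_n=5` the function mirrors the "at most the top five" cut-off from the abstract; the toy call uses `top_n=1` only because the example program has five lines.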

    Evidence-driven testing and debugging of software systems

    Program debugging is the process of testing, exposing, reproducing, diagnosing and fixing software bugs. Many techniques have been proposed to aid developers during software testing and debugging. However, researchers have found that developers hardly use or adopt the proposed techniques in software practice. Evidently, this is because there is a gap between proposed methods and the state of software practice. Most methods fail to address the actual needs of software developers. In this dissertation, we pose the following scientific question: How can we bridge the gap between software practice and the state-of-the-art automated testing and debugging techniques? To address this challenge, we put forward the following thesis: Software testing and debugging should be driven by empirical evidence collected from software practice. In particular, we posit that the feedback from software practice should shape and guide (the automation of) testing and debugging activities. In this thesis, we focus on gathering evidence from software practice by conducting several empirical studies on software testing and debugging activities in the real world. We then build tools and methods that are well-grounded and driven by the empirical evidence obtained from these experiments. Firstly, we conduct an empirical study on the state of debugging in practice using a survey and a human study. In this study, we ask developers about their debugging needs and observe the tools and strategies employed by developers while testing, diagnosing and repairing real bugs. Secondly, we evaluate the effectiveness of the state-of-the-art automated fault localization (AFL) methods on real bugs and programs. Thirdly, we conduct an experiment to evaluate the causes of invalid inputs in software practice. Lastly, we study how to learn input distributions from real-world sample inputs, using probabilistic grammars. 
To bridge the gap between software practice and the state of the art in software testing and debugging, we proffer the following empirical results and techniques: (1) We collect evidence on the state of practice in program debugging and indeed, we found that there is a chasm between (available) debugging tools and developer needs. We elicit the actual needs and concerns of developers when testing and diagnosing real faults and provide a benchmark (called DBGBench) to aid the automated evaluation of debugging and repair tools. (2) We provide empirical evidence on the effectiveness of several state-of-the-art AFL techniques (such as statistical debugging formulas and dynamic slicing). Building on the obtained empirical evidence, we provide a hybrid approach that outperforms the state-of-the-art AFL techniques. (3) We evaluate the prevalence and causes of invalid inputs in software practice, and we build on the lessons learned from this experiment to build a general-purpose algorithm (called ddmax) that automatically diagnoses and repairs real-world invalid inputs. (4) We provide a method to learn the distribution of input elements in software practice using probabilistic grammars and we further employ the learned distribution to drive the test generation of inputs that are similar (or dissimilar) to sample inputs found in the wild. In summary, we propose an evidence-driven approach to software testing and debugging, which is based on collecting empirical evidence from software practice to guide and direct software testing and debugging. In our evaluation, we found that our approach is effective in improving the effectiveness of several debugging activities in practice. In particular, using our evidence-driven approach, we elicit the actual debugging needs of developers, improve the effectiveness of several automated fault localization techniques, effectively debug and repair invalid inputs, and generate test inputs that are (dis)similar to real-world inputs. 
Our proposed methods are built on empirical evidence and they improve over the state-of-the-art techniques in testing and debugging.
Software debugging refers to testing, detecting, reproducing, diagnosing, and fixing faults in programs. Many debugging techniques have already been proposed to support software developers in testing and debugging. Nevertheless, research has shown that developers rarely apply or adopt these techniques in practice. This may be because there is a large gap between the proposed techniques and those actually used in practice. Most techniques do not meet developers' requirements. In this dissertation, we pose the following scientific question: How can we close the gap between software practice and the current scientific techniques for automated testing and debugging? To address this challenge, we put forward the following thesis: The testing and debugging of software should be driven by empirical data collected in software practice. More precisely, we postulate that feedback from software practice should shape and determine the automation of testing and debugging. In this work, we focus on collecting data from software practice by conducting several empirical studies on testing and debugging software in the real world. Based on the collected data, we then develop tools that are grounded in the data from the experiments conducted. First, we conduct an empirical study on the state of debugging in practice, using a survey and a human study. In this study, we ask developers about their debugging needs and observe the tools and strategies they employ when diagnosing, testing, and locating real faults. 
Next, we evaluate the effectiveness of current automated fault localization (AFL) methods for automatically locating real faults in real programs. Our third step is an experiment to determine the causes of invalid inputs in software practice. Finally, we investigate how frequency distributions of input fragments can be learned from real sample inputs found in practice, using a grammar. To close the gap between software practice and current research on software testing and debugging, we offer the following empirical results and techniques: (1) We collect current research findings on the state of software debugging and indeed find a discrepancy between (available) debugging tools and what developers actually need. We gather the actual needs of developers when testing and debugging real-world faults and develop a benchmark (DbgBench) to facilitate the automated evaluation of debugging tools. (2) We present empirical data on the effectiveness of several current AFL techniques (e.g., statistical debugging formulas and dynamic slicing). Building on these data, we present a hybrid algorithm that outperforms current AFL techniques. (3) We evaluate the frequency and causes of invalid inputs in software practice and, building on these data, present a general-purpose algorithm (ddmax) that automatically diagnoses and repairs faulty inputs. (4) We present a method for learning the distribution of input fragments in software practice using probabilistic grammars. We then use the learned distributions to generate inputs that are similar (or dissimilar) to the sample inputs. 
In summary, we present a practice-based approach to software testing and debugging that relies on empirical data from software practice to support testing and debugging. In our evaluation, we found that our approach effectively improves many debugging activities in practice. More precisely, with our approach we identify the precise needs of developers, improve the effectiveness of many AFL techniques, effectively debug and repair faulty inputs, and generate test inputs that are (dis)similar to real-world inputs. Our proposed methods are based on empirical data and improve on current techniques in testing and debugging.
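
Item (4) of this abstract, learning the distribution of input elements with a probabilistic grammar, can be pictured with a minimal sketch. It is not the dissertation's implementation. It assumes that each sample input has already been parsed, and that a parse is available as a list of (nonterminal, chosen expansion) pairs; the function name and the toy grammar are hypothetical. The probability of an expansion is then simply its relative frequency among all expansions of the same nonterminal.

```python
from collections import Counter, defaultdict

def learn_probabilities(derivations):
    """Estimate expansion probabilities from parsed sample inputs.

    derivations: one list per sample input, each holding the
    (nonterminal, expansion) pairs used to derive that input.
    Returns: nonterminal -> {expansion: relative frequency}.
    """
    counts = defaultdict(Counter)
    for derivation in derivations:
        for nonterminal, expansion in derivation:
            counts[nonterminal][expansion] += 1
    return {
        nonterminal: {
            expansion: n / sum(counter.values())
            for expansion, n in counter.items()
        }
        for nonterminal, counter in counts.items()
    }

# Hypothetical parses of the sample inputs "12" and "1" under a toy grammar
#   <number> ::= <digit><number> | <digit>
#   <digit>  ::= "1" | "2"
samples = [
    [("<number>", "<digit><number>"), ("<digit>", "1"),
     ("<number>", "<digit>"), ("<digit>", "2")],
    [("<number>", "<digit>"), ("<digit>", "1")],
]
probabilities = learn_probabilities(samples)
```

A generator that picks expansions according to `probabilities` would tend to produce inputs resembling the samples; how the dissertation derives dissimilar inputs from the learned distribution is not stated in the abstract, so it is left out here.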

    Extending the Reach of Fault Localization to Assist in Automated Debugging

    Software debugging is one of the most time-consuming tasks in modern software maintenance. To assist developers with debugging, researchers have proposed fault localization techniques. These techniques aim to automate the process of locating faults in software, which can greatly reduce debugging time and assist developers in understanding the faults. Effective fault localization is also crucial for automated program repair techniques, as it helps identify potential faulty locations for patching. Despite recent efforts to advance fault localization techniques, their effectiveness is still limited. With the increasing complexity of modern software, fault localization may not always provide direct identification of the root causes of faults. Further, there is a lack of studies on their application in modern software development. Most prior studies have evaluated these techniques in traditional software development settings, where only a single snapshot of the system is considered. However, modern software development often involves continuous and fine-grained changes to the system. This dissertation proposes a series of approaches to explore new automated debugging solutions that can enhance software quality assurance and reliability practices, with a specific focus on extending the reach of fault localization in modern software development. The dissertation begins with an empirical study on user-reported logs in bug reports, revealing that re-constructed execution paths from these logs provide valuable debugging hints. To further assist developers in debugging, we propose using static analysis techniques for information-retrieval and path-guided fault localization. By leveraging execution paths from logs in bug reports, we can improve the effectiveness of fault localization techniques. Second, we investigate the characteristics of operational data in continuous integration that can help capture faults early in the testing phase. 
As there is currently no available continuous integration benchmark that incorporates continuous test execution and failure, we present T-Evos, a dataset that comprises various operational data in continuous integration settings. We propose automated fault localization techniques that integrate change information from continuous integration settings, and demonstrate that leveraging such fine-grained change information can significantly improve their effectiveness. Finally, the dissertation investigates data cleanness in fault localization by examining developers' knowledge in fault-triggering tests. The study reveals a significant degradation in the performance of fault localization techniques when evaluated on faults without developer knowledge. Through case studies and experiments, the proposed techniques in this dissertation significantly improve the effectiveness of fault localization and facilitate its adoption in modern software development. Additionally, this dissertation provides valuable insights into new debugging solutions for future research.

    Security Threats to 5G Networks for Social Robots in Public Spaces: A Survey

    This paper surveys security threats to 5G-enabled wireless access networks for social robots in public spaces (SRPS). The use of social robots (SR) in public areas requires specific Quality of Service (QoS) planning to meet its unique requirements. Its 5G threat landscape entails more than cybersecurity threats that most previous studies focus on. This study examines the 5G wireless RAN for SRPS from three perspectives: SR and wireless access points, the ad hoc network link between SR and user devices, and threats to SR and users’ communication equipment. The paper analyses the security threats to confidentiality, integrity, availability, authentication, authorisation, and privacy from the SRPS security objectives perspective. We begin with an overview of SRPS use cases and access network requirements, followed by 5G security standards, requirements, and the need for a more representative threat landscape for SRPS. The findings confirm that the RAN of SRPS is most vulnerable to physical, side-channel, intrusion, injection, manipulation, and natural and malicious threats. The paper presents existing mitigation to the identified attacks and recommends including physical level security (PLS) and post-quantum cryptography in the early design of SRPS. The insights from this survey will provide valuable risk assessment and management input to researchers, industrial practitioners, policymakers, and other stakeholders of SRPS.

    Scaling Causality Analysis for Production Systems.

    Causality analysis reveals how program values influence each other. It is important for debugging, optimizing, and understanding the execution of programs. This thesis scales causality analysis to production systems consisting of desktop and server applications as well as large-scale Internet services. This enables developers to employ causality analysis to debug and optimize complex, modern software systems. This thesis shows that it is possible to scale causality analysis to both fine-grained instruction level analysis and analysis of Internet scale distributed systems with thousands of discrete software components by developing and employing automated methods to observe and reason about causality. First, we observe causality at a fine-grained instruction level by developing the first taint tracking framework to support tracking millions of input sources. We also introduce flexible taint tracking to allow for scoping different queries and dynamic filtering of inputs, outputs, and relationships. Next, we introduce the Mystery Machine, which uses a "big data" approach to discover causal relationships between software components in a large-scale Internet service. We leverage the fact that large-scale Internet services receive a large number of requests in order to observe counterexamples to hypothesized causal relationships. Using discovered causal relationships, we identify the critical path for request execution and use the critical path analysis to explore potential scheduling optimizations. Finally, we explore using causality to make data-quality tradeoffs in Internet services. A data-quality tradeoff is an explicit decision by a software component to return lower-fidelity data in order to improve response time or minimize resource usage. We perform a study of data-quality tradeoffs in a large-scale Internet service to show the pervasiveness of these tradeoffs. 
We develop DQBarge, a system that enables better data-quality tradeoffs by propagating critical information along the causal path of request processing. Our evaluation shows that DQBarge helps Internet services mitigate load spikes, improve utilization of spare resources, and implement dynamic capacity planning. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/135888/1/mcchow_1.pd
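
The idea of observing counterexamples to hypothesized causal relationships, mentioned in this abstract for the Mystery Machine, can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's system: it considers only one kind of relationship ("a happens before b"), it assumes each request trace is a mapping from a component event name to a timestamp, and the function and event names are hypothetical.

```python
from itertools import permutations

def infer_happens_before(traces):
    """Keep every ordering hypothesis that no trace contradicts.

    traces: list of dicts mapping event name -> timestamp for one request.
    Returns the set of (a, b) pairs such that a and b were seen together
    at least once and a preceded b in every trace containing both.
    """
    # Start by hypothesizing both orderings for every co-observed pair.
    hypotheses = set()
    for trace in traces:
        hypotheses.update(permutations(trace, 2))
    # A single trace in which b is not strictly after a refutes (a, b).
    for trace in traces:
        for a, b in list(hypotheses):
            if a in trace and b in trace and trace[a] >= trace[b]:
                hypotheses.discard((a, b))
    return hypotheses

# Two hypothetical request traces: "db" and "render" swap order between
# them, so neither ordering of that pair survives.
traces = [
    {"recv": 0, "db": 5, "render": 9, "send": 12},
    {"recv": 0, "render": 3, "db": 4, "send": 10},
]
relations = infer_happens_before(traces)
```

The more traces are available, the more spurious hypotheses are refuted, which is why a large request volume helps. The surviving pairs form a dependency graph on which a critical path could then be computed.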

    Blockchain for the metaverse: A Review

    Since Facebook officially changed its name to Meta in Oct. 2021, the metaverse has become a new norm of social networks and three-dimensional (3D) virtual worlds. The metaverse aims to bring 3D immersive and personalized experiences to users by leveraging many pertinent technologies. Despite great attention and benefits, a natural question in the metaverse is how to secure its users’ digital content and data. In this regard, blockchain is a promising solution owing to its distinct features of decentralization, immutability, and transparency. To better understand the role of blockchain in the metaverse, we aim to provide an extensive survey on the applications of blockchain for the metaverse. We first present preliminaries on blockchain and the metaverse and highlight the motivations behind the use of blockchain for the metaverse. Next, we extensively discuss blockchain-based methods for the metaverse from technical perspectives, such as data acquisition, data storage, data sharing, data interoperability, and data privacy preservation. For each perspective, we first discuss the technical challenges of the metaverse and then highlight how blockchain can help. Moreover, we investigate the impact of blockchain on key-enabling technologies in the metaverse, including Internet-of-Things, digital twins, multi-sensory and immersive applications, artificial intelligence, and big data. We also present some major projects to showcase the role of blockchain in metaverse applications and services. Finally, we present some promising directions to drive further research innovations and developments toward the use of blockchain in the metaverse.

    The Cloud-to-Thing Continuum

    The Internet of Things offers massive societal and economic opportunities while at the same time posing significant challenges, not least the delivery and management of the technical infrastructure underpinning it, the deluge of data generated from it, ensuring privacy and security, and capturing value from it. This Open Access Pivot explores these challenges, presenting the state of the art and future directions for research but also frameworks for making sense of this complex area. This book provides a variety of perspectives on how technology innovations such as fog, edge and dew computing, 5G networks, and distributed intelligence are making us rethink conventional cloud computing to support the Internet of Things. Much of this book focuses on technical aspects of the Internet of Things; however, clear methodologies for mapping the business value of the Internet of Things are still missing. We provide a value mapping framework for the Internet of Things to address this gap. While there is much hype about the Internet of Things, we have yet to reach the tipping point. As such, this book provides a timely entrée for higher education educators, researchers and students, industry and policy makers on the technologies that promise to reshape how society interacts and operates.

    Governance of Cloud-hosted Web Applications

    Cloud computing has revolutionized the way developers implement and deploy applications. By running applications on large-scale compute infrastructures and programming platforms that are remotely accessible as utility services, cloud computing provides scalability, high availability, and increased user productivity. Despite the advantages inherent to the cloud computing model, it has also given rise to several software management and maintenance issues. Specifically, cloud platforms do not enforce developer best practices and other administrative requirements when deploying applications. Cloud platforms also do not facilitate establishing service level objectives (SLOs) on application performance, which are necessary to ensure reliable and consistent operation of applications. Moreover, cloud platforms do not provide adequate support to monitor the performance of deployed applications and conduct root cause analysis when an application exhibits a performance anomaly. We employ governance as a methodology to address the above-mentioned issues prevalent in cloud platforms. We devise novel governance solutions that achieve administrative conformance, developer best practices, and performance SLOs in the cloud via policy enforcement, SLO prediction, performance anomaly detection and root cause analysis. The proposed solutions are fully automated and built into the cloud platforms as cloud-native features, thereby precluding the application developers from having to implement similar features by themselves. We evaluate our methodology using real-world cloud platforms, and show that our solutions are highly effective and efficient.