
    Evaluation Methodologies in Software Protection Research

    Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs and try to break the confidentiality or integrity of assets embedded in that software. Both companies and malware authors want to prevent such attacks. This has driven an arms race between attackers and defenders, resulting in a plethora of protection and analysis methods. However, it remains difficult to measure the strength of protections, because MATE attackers can reach their goals in many different ways and no universally accepted evaluation methodology exists. This survey systematically reviews the evaluation methodologies of papers on obfuscation, a major class of protections against MATE attacks. For 572 papers, we collected 113 aspects of their evaluation methodologies, ranging from sample set types and sizes, through sample treatment, to the measurements performed. We provide detailed insights into how the academic state of the art evaluates both the protections and the analyses applied to them. In summary, there is a clear need for better evaluation methodologies. We identify nine challenges for software protection evaluations, which represent threats to the validity, reproducibility, and interpretation of research results in the context of MATE attacks.

    Explainable Automated Debugging via Large Language Model-driven Scientific Debugging

    Automated debugging techniques have the potential to reduce developer effort in debugging and have matured enough to be adopted by industry. However, one critical issue with existing techniques is that, while developers want rationales for the automatic debugging results they are given, existing techniques are ill-suited to provide them, as their deduction process differs significantly from that of human developers. Inspired by the way developers interact with code when debugging, we propose Automated Scientific Debugging (AutoSD), a technique that, given buggy code and a bug-revealing test, prompts large language models to automatically generate hypotheses, uses debuggers to actively interact with the buggy code, and thus automatically reaches conclusions prior to patch generation. By aligning the reasoning of automated debugging more closely with that of human developers, we aim to produce intelligible explanations of how a specific patch was generated, in the hope that these explanations will lead to more efficient and accurate developer decisions. Our empirical analysis on three program repair benchmarks shows that AutoSD performs competitively with other program repair baselines and that it can indicate when it is confident in its results. Furthermore, we performed a human study with 20 participants, including six professional developers, to evaluate the utility of the explanations AutoSD produces. Participants with access to explanations judged patch correctness in roughly the same time as those without, but their accuracy improved for five out of six real-world bugs studied. Of the participants, 70% answered that they wanted explanations when using repair tools, and 55% answered that they were satisfied with the Scientific Debugging presentation.
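
    To make the hypothesize-experiment-conclude loop concrete, the following is a minimal Python sketch of such a workflow. It is not AutoSD's implementation: query_llm, extract_command, run_experiment, the prompt wording, and the stopping rule are all hypothetical placeholders.

        import subprocess

        def query_llm(prompt: str) -> str:
            """Hypothetical stand-in for an LLM API call."""
            raise NotImplementedError("plug in a real LLM client here")

        def extract_command(answer: str) -> str:
            """Naively take the debugger command from the last line of the
            LLM's answer (a simplifying assumption for this sketch)."""
            return answer.strip().splitlines()[-1]

        def run_experiment(command: str) -> str:
            """Run the proposed debugger/shell command against the buggy
            program and capture its output as the 'observation'."""
            result = subprocess.run(command, shell=True, capture_output=True, text=True)
            return result.stdout + result.stderr

        def scientific_debugging(buggy_code: str, failing_test: str, max_rounds: int = 5) -> str:
            """Iterate hypothesis -> experiment -> observation -> conclusion,
            then ask for a patch grounded in the accumulated transcript."""
            transcript = f"Buggy code:\n{buggy_code}\n\nFailing test:\n{failing_test}\n"
            for _ in range(max_rounds):
                answer = query_llm(transcript + "\nState a hypothesis about the bug and, "
                                   "on the last line, one debugger command to test it.")
                observation = run_experiment(extract_command(answer))
                transcript += f"\nHypothesis/experiment: {answer}\nObservation: {observation}\n"
                verdict = query_llm(transcript + "\nIs the hypothesis SUPPORTED, REJECTED, "
                                    "or in need of more experiments (CONTINUE)?")
                transcript += f"Conclusion: {verdict}\n"
                if "SUPPORTED" in verdict:
                    break
            # The transcript doubles as a human-readable explanation of the patch.
            return query_llm(transcript + "\nPropose a minimal patch for the bug.")

    The key design point the paper describes is that the transcript of hypotheses, experiments, and conclusions is itself the explanation shown to developers, rather than a post-hoc justification.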

    Data-centric AI: Perspectives and Challenges

    The role of data in building AI systems has recently been significantly magnified by the emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model advancements to ensuring data quality and reliability. Although our community has continuously invested effort in enhancing data in different aspects, these are often isolated initiatives on specific tasks. To facilitate a collective initiative in our community and push DCAI forward, we draw a big picture and bring together three general missions: training data development, inference data development, and data maintenance. We provide a top-level discussion of representative DCAI tasks and share our perspectives. Finally, we list open challenges. More resources are summarized at https://github.com/daochenzha/data-centric-AI. Comment: Accepted by SDM 2023 Blue Sky Track.

    A User Study for Evaluation of Formal Verification Results and their Explanation at Bosch

    Context: Ensuring safety for any sophisticated system is becoming more complex due to the rising number of features and functionalities. This calls for formal methods to instill confidence in such systems. Nevertheless, using formal methods in industry is demanding because of their lack of usability and the difficulty of understanding verification results. Objective: We evaluate the acceptance of formal methods by Bosch automotive engineers, particularly whether the difficulty of understanding verification results can be reduced. Method: We perform two exploratory studies. First, we conduct a user survey to explore the challenges Bosch automotive engineers face in identifying inconsistent specifications and using formal methods. Second, we perform a one-group pretest-posttest experiment to collect impressions from Bosch engineers familiar with formal methods, to evaluate whether our counterexample explanation approach simplifies the understanding of verification results. Results: The results of the user survey indicate that identifying refinement inconsistencies, understanding formal notations, and interpreting verification results are challenging. Nevertheless, engineers are still interested in using formal methods in real-world development processes because they could reduce the manual effort for verification. They also believe formal methods could make systems safer. Furthermore, the results of the one-group pretest-posttest experiment indicate that engineers are more comfortable understanding the counterexample explanation than the raw model checker output. Limitations: The main limitation of this study is its generalizability beyond the target group of Bosch automotive engineers. Comment: This manuscript is under review with the Empirical Software Engineering journal.

    CodeBase Relationship Visualizer: Visualizing Relationships Between Source Code Files

    Understanding the relationships between files and their directory structure is a fundamental part of the software development process. However, it can be hard to grasp these relationships without a convenient way to visualize how files are connected and how they fit into the directory structure of the codebase. In this paper we describe CodeBase Relationship Visualizer (CBRV), a Visual Studio Code extension that interactively visualizes the relationships between files. CBRV displays the relationships between files as arrows superimposed over a diagram of the codebase's directory structure. CBRV comes bundled with visualizations of the stack trace path, a dependency graph for Python codebases, and a hyperlink graph for HTML and Markdown. CBRV also exposes an API that can be used to create visualizations of other kinds of relationships. CBRV is a convenient and easy-to-use tool that offers a big-picture perspective on the relationships within a codebase.
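
    As an illustration of the kind of relationship CBRV's bundled Python dependency-graph visualization displays, here is a minimal Python sketch (not CBRV's actual implementation) that extracts file-level import edges from a codebase with the standard ast module; the function name and the stem-based module resolution are simplifying assumptions.

        import ast
        from pathlib import Path

        def dependency_graph(root: str) -> dict[str, set[str]]:
            """Map each .py file under `root` to the set of sibling files it
            imports. Modules are resolved by file stem only, so packages and
            stem collisions are ignored in this sketch."""
            files = {p.stem: p for p in Path(root).rglob("*.py")}
            graph = {str(p): set() for p in files.values()}
            for path in files.values():
                tree = ast.parse(path.read_text(encoding="utf-8"))
                for node in ast.walk(tree):
                    if isinstance(node, ast.Import):
                        names = [alias.name.split(".")[0] for alias in node.names]
                    elif isinstance(node, ast.ImportFrom) and node.module:
                        names = [node.module.split(".")[0]]
                    else:
                        continue
                    for name in names:
                        if name in files:  # keep only intra-codebase edges
                            graph[str(path)].add(str(files[name]))
            return graph

        if __name__ == "__main__":
            for src, targets in dependency_graph(".").items():
                for dst in sorted(targets):
                    print(f"{src} -> {dst}")

    Each edge in the resulting graph corresponds to one of the file-to-file arrows CBRV draws over the directory diagram.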

    Competency Matrix Design and Evaluation of Crisis Informatics Solutions for Transportation Authorities

    The development of technologies such as AI and ML has contributed to the growth of interdisciplinary collaboration to address significant social and engineering challenges. The rise of crisis informatics and the use of social media data sources has permitted the development of models, methods, and theories around crisis communication. The motivation behind crisis informatics is to protect society with tools that improve emergency response during times of crisis. Crisis informatics can be applied at a large scale, to events such as infrastructure collapses, earthquakes, fires, and hurricanes, but it can also be targeted at specific networks, such as the road network of a transportation authority. Solutions for this type of event have been developed in industry and academia with different focuses and capabilities. These solutions can reach the public sector through public procurement of IT software. In this thesis, a competency matrix was designed from a study of the state of the art in crisis informatics and the status of public procurement for IT software. The competency matrix was used to evaluate the capabilities of the studied solutions. The three proposed solutions showed different capabilities, and each brought positive aspects to tackling the problem. However, it is the differences among them and their alignment with the client's needs and goals that will determine the optimal solution.

    Next generation forensic taphonomy: Automation for experimental, field-based research.

    Determining the post-mortem interval (PMI) is often a critical goal in forensic casework. Consequently, the discipline of forensic taphonomy has directed considerable research effort towards this goal, with substantial strides made in the past 40 years. Importantly, quantification of decompositional data (and of the models derived from them) and standardisation of experimental protocols are increasingly recognised as key components of this drive. However, despite the discipline's best efforts, significant challenges remain. Still lacking are standardisation of many core components of experimental design, forensic realism in experimental design, true quantitative measures of the progression of decay, and high-resolution data. Without these critical elements, the large-scale, synthesised, multi-biogeographically representative datasets necessary for building comprehensive models of decay to precisely estimate PMI remain elusive. To address these limitations, we propose the automation of taphonomic data collection. We present the world's first reported fully automated, remotely operable forensic taphonomic data collection system, including its technical design details. In laboratory testing and field deployments, the apparatus substantially reduced the cost of actualistic (field-based) forensic taphonomic data collection, improved data resolution, and enabled more forensically realistic experimental deployments and simultaneous multi-biogeographic experiments. We argue that this device represents a quantum leap in experimental methodology in this field, paving the way for the next generation of forensic taphonomic research and, we hope, the attainment of the elusive goal of precise PMI estimation. [Abstract copyright: Copyright © 2023 The Authors. Published by Elsevier B.V. All rights reserved.]

    A dynamic Bayesian optimized active recommender system for curiosity-driven Human-in-the-loop automated experiments

    Optimization of experimental materials synthesis and characterization through active learning methods has grown over the last decade, with examples ranging from synchrotron diffraction measurements on combinatorial alloys to searches through chemical space with automated synthesis robots for perovskites. In virtually all cases, the target property of interest for optimization is defined a priori, with limited human feedback during operation. In contrast, here we present the development of a new type of human-in-the-loop experimental workflow, a Bayesian optimized active recommender system (BOARS), that shapes targets on the fly using human feedback. We showcase this framework on pre-acquired piezoresponse force spectroscopy data from a ferroelectric thin film, and then implement it in real time on an atomic force microscope, where the optimization proceeds to find symmetric piezoresponse amplitude hysteresis loops. We find that such features are more affected by subsurface defects than by the local domain structure. This work shows the utility of human-augmented machine learning approaches for curiosity-driven exploration of systems across experimental domains. The analysis reported here is summarized in a Colab notebook for tutorial purposes and application to other data: https://github.com/arpanbiswas52/varTBO. Comment: 7 figures in main text, 3 figures in Supplementary Material.
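
    The core recommender loop can be sketched in a few lines of Python. This is a generic human-in-the-loop Bayesian optimization sketch in the spirit of BOARS, not the released varTBO code: measure, human_score, the RBF kernel, and the upper-confidence-bound acquisition are illustrative assumptions.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        def measure(x: np.ndarray) -> np.ndarray:
            """Placeholder for an instrument measurement at location x,
            e.g. a piezoresponse hysteresis loop from the microscope."""
            return np.sin(3 * x) + 0.1 * np.random.randn(*x.shape)

        def human_score(response: np.ndarray) -> float:
            """Placeholder for human feedback that shapes the target on the
            fly, e.g. a rating of how symmetric the measured loop looks."""
            return -abs(float(response.mean()))

        rng = np.random.default_rng(0)
        X = rng.uniform(0, 1, size=(3, 1))                  # initial probe locations
        y = np.array([human_score(measure(x)) for x in X])  # human-shaped rewards

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2))
        candidates = np.linspace(0, 1, 200).reshape(-1, 1)

        for _ in range(10):
            gp.fit(X, y)
            mu, sigma = gp.predict(candidates, return_std=True)
            ucb = mu + 2.0 * sigma                          # acquisition: exploit + explore
            x_next = candidates[np.argmax(ucb)].reshape(1, -1)
            y_next = human_score(measure(x_next))           # measure, then ask the human
            X = np.vstack([X, x_next])
            y = np.append(y, y_next)

        print("best location:", X[np.argmax(y)], "score:", y.max())

    Because the reward comes from human_score rather than a fixed property, the optimization target can change as the experimenter's curiosity evolves, which is the behaviour the abstract describes.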